Exploiting easy data in online optimization
Document type :
Conference paper with proceedings
Title :
Exploiting easy data in online optimization
Author(s) :
Sani, Amir [Author]
Sequential Learning [SEQUEL]
Neu, Gergely [Author]
Sequential Learning [SEQUEL]
Lazaric, Alessandro [Author]
Sequential Learning [SEQUEL]
Conference title :
Advances in Neural Information Processing Systems 27 (NIPS 2014)
City :
Montreal
Country :
Canada
Start date of the conference :
2014-12-08
HAL domain(s) :
Computer Science [cs]/Machine Learning [cs.LG]
English abstract : [en]
We consider the problem of online optimization, where a learner chooses a decision from a given decision set and suffers some loss associated with the decision and the state of the environment. The learner's objective is to minimize its cumulative regret against the best fixed decision in hindsight. Over the past few decades numerous variants have been considered, with many algorithms designed to achieve sub-linear regret in the worst case. However, this level of robustness comes at a cost. Proposed algorithms are often over-conservative, failing to adapt to the actual complexity of the loss sequence, which is often far from the worst case. In this paper we introduce a general algorithm that, provided with a "safe" learning algorithm and an opportunistic "benchmark", can effectively combine good worst-case guarantees with much improved performance on "easy" data. We derive general theoretical bounds on the regret of the proposed algorithm and discuss its implementation in a wide range of applications, notably in the problem of learning with shifting experts (a recent COLT open problem). Finally, we provide numerical simulations in the setting of prediction with expert advice with comparisons to the state of the art.
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Collections :
Source :
Files
- https://hal.archives-ouvertes.fr/hal-01079428/document (Open access)
- SNL14.pdf (Open access)