Machine Learning in the Bandit Setting: Algorithms, Evaluation, and Case Studies

Friday, February 10, 2012 - 11:15am - 12:15pm

Computer Science - Seminar Event

Location: Torgerson 2150

Our speaker is Dr. Lihong Li from Yahoo! Research. Here is the abstract of his talk:

Much of machine-learning research is about discovering patterns: building intelligent agents that learn to predict the future accurately from historical data. While this paradigm has been extremely successful in numerous applications, complex real-world problems such as content recommendation on the Internet often require agents to learn to act optimally through autonomous interaction with the world they live in, a problem known as reinforcement learning. Using a news recommendation module on Yahoo!'s front page as a running example, the majority of the talk focuses on the special case of contextual bandits, which have recently gained substantial interest due to their broad applications. We will highlight a fundamental challenge known as the exploration/exploitation tradeoff, present a few newly developed algorithms with strong theoretical guarantees, and demonstrate their empirical effectiveness for personalizing content recommendation at Yahoo!. At the end of the talk, we will also briefly summarize our earlier work on provably data-efficient algorithms for more general reinforcement-learning problems modeled as Markov decision processes.

Contact: T. M. Murali
Email: murali@cs.vt.edu
Phone: 540-231-8534


