BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//ical.marudot.com//iCal Event Maker
X-WR-CALNAME:CISPA DLS/ Nathan Kallus: Smooth Contextual Bandits
NAME:CISPA DLS/ Nathan Kallus: Smooth Contextual Bandits
CALSCALE:GREGORIAN
BEGIN:VTIMEZONE
TZID:Europe/Berlin
LAST-MODIFIED:20201011T015911Z
TZURL:http://tzurl.org/zoneinfo-outlook/Europe/Berlin
X-LIC-LOCATION:Europe/Berlin
BEGIN:DAYLIGHT
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230628T095829Z
UID:1687946179674-67932@ical.marudot.com
DTSTART;TZID=Europe/Berlin:20230704T100000
DTEND;TZID=Europe/Berlin:20230704T120000
SUMMARY:CISPA DLS/ Nathan Kallus: Smooth Contextual Bandits
URL:https://cispa-de.zoom.us/j/61118095073?pwd=UmM4bnNuamVjeVQwRy9qTDhIbHNyZz09
DESCRIPTION:Contextual bandit problems model the inherent cost of learning in personalized decision-making in new environments\, whether in marketing\, healthcare\, or revenue management. Specifically\, the cost is characterized by the optimal growth rate of the regret in cumulative rewards compared to an optimal policy given full prior knowledge of the environment. Naturally\, the optimal rate should depend on how complex the underlying supervised learning problem is\, namely how much observing rewards in one context can tell us about mean rewards in another context. Curiously\, this obvious-seeming relationship is obscured in current theory that separately studies the easy\, fully-extrapolatable case and the hard\, super-local case. To characterize the relationship more precisely\, I study a nonparametric contextual bandit problem where expected reward functions are β-smooth (roughly meaning β-times differentiable). I will show how this interpolates between the two extremes previously studied in isolation: non-differentiable-response bandits (β ≤ 1)\, where rate-optimal regret is achieved by decomposing the problem into non-contextual bandits\, and parametric-response bandits (β = ∞)\, where rate-optimal regret is often achievable without any exploration at all. We develop a novel algorithm that works for any given smoothness setting by operating neither fully locally nor fully globally. We prove its regret is rate-optimal\, thereby characterizing the optimal regret rate and revealing a fuller picture of the crucial interplay between complexity and regret in dynamic decision-making.
 \nTime permitting\, I will also discuss how to construct valid confidence intervals from data collected by contextual bandits\, a crucial challenge in the enterprise to replace randomized trials with adaptive experiments in applied fields from biostatistics to development economics.\n\nThis talk is based on the following papers:\n(1) Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes (https://pubsonline.informs.org/doi/abs/10.1287/opre.2021.2237)\n(2) Post-Contextual-Bandit Inference (https://papers.nips.cc/paper/2021/hash/eff3058117fd4cf4d4c3af12e273a40f-Abstract.html)\n\n\nShort Bio:\nNathan Kallus is an Associate Professor in the School of Operations Research and Information Engineering and Cornell Tech at Cornell University. He also holds a Research Director position for Product Machine Learning Research at Netflix. Nathan's research interests include personalization\; optimization\, especially under uncertainty\; causal inference\; sequential decision-making\; credible and robust inference\; and algorithmic fairness. He holds a PhD in Operations Research from MIT as well as a BA in Mathematics and a BS in Computer Science\, both from UC Berkeley. Before coming to Cornell\, Nathan was a Visiting Scholar at USC's Department of Data Sciences and Operations and a Postdoctoral Associate at MIT's Operations Research and Statistics group.\n\nThe talk will take place in a hybrid mode with a physical presence in the Bernd Therre lecture hall at CISPA and via Zoom:\n\nhttps://cispa-de.zoom.us/j/61118095073?pwd=UmM4bnNuamVjeVQwRy9qTDhIbHNyZz09\nID: 611 1809 5073\nPasscode: F8F7%*
LOCATION:CISPA 0 - Stuhlsatzenhaus 5 - Bernd Therre lecture hall 0.05
END:VEVENT
END:VCALENDAR