Learning and repeated decision-making in mobility

Learning and repeated decision-making

One of the problems I am interested in studying is the learning dynamics of human behavior: can we learn how people make decisions (i.e., how people learn), and can we use the corresponding models to predict their future decisions? To answer these questions, I use the framework of sequential decision-making to model how humans learn and decide. Players making repeated decisions, learning over time to optimize their choices, can be efficiently modeled by a sequential process in which they optimize a payoff function at each step, linked to the “reward” they experience. The techniques developed leverage known models such as the replicator dynamics and the hedge algorithm.

To understand whether these decision-making models converge to an equilibrium, we use optimization methods common in machine learning, such as mirror descent and stochastic gradient descent. Depending on the assumptions made on the learning process humans use to make their decisions, we prove convergence of these processes to a set of equilibria (in some cases, Nash equilibria of an underlying game). The more specific the assumptions on the learning process, the stronger the convergence guarantees (e.g., on average, no-regret, almost sure), and potentially the convergence rates as well.

For illustration purposes, we have implemented an online gaming framework on Mechanical Turk (Amazon’s crowdsourcing service for parallelizable tasks). In this online game (see the video to gain a better understanding of the experiments), distributed online players use Mechanical Turk to play against each other, while we watch the convergence rate of their game and use our algorithms to predict the decisions they will make, based on what we observed them do in the past.
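To make the hedge-style learning concrete, here is a minimal sketch (not the experimental code) of the hedge algorithm applied to a population of players splitting between two parallel routes. The latency functions, step size, and initial split are illustrative assumptions, not values from the experiments.

```python
import math

def hedge_route_choice(latencies, init=None, steps=500, eta=0.1):
    """Hedge (exponential weights) for a population splitting its flow
    among parallel routes.  latencies[i] maps route i's load share to
    its delay; returns the final distribution over routes."""
    n = len(latencies)
    weights = list(init) if init is not None else [1.0] * n
    for _ in range(steps):
        total = sum(weights)
        x = [w / total for w in weights]              # current route split
        losses = [latencies[i](x[i]) for i in range(n)]
        # Multiplicative update: discount each route by its current loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights]

# Two routes with latencies l1(x) = 2x and l2(x) = x + 0.5 (illustrative).
# Equalizing delays gives the equilibrium split: 2a = (1 - a) + 0.5, a = 0.5.
split = hedge_route_choice([lambda x: 2 * x, lambda x: x + 0.5],
                           init=[0.9, 0.1])
```

Even starting from a heavily biased split, the exponential-weights update drives the population toward the split at which both routes have equal delay, the equilibrium predicted by the theory.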
The results are very exciting: humans indeed converge to the equilibria predicted by our models, and the learning rates we infer represent their behavior well enough to predict their future actions a few steps ahead in time. This type of work has numerous applications, in particular to systems in which players (for example, companies such as Waze/Google, INRIX, and Apple Maps routing motorists running their apps) compete selfishly for a given resource (for example, road capacity) and improve their decision-making process over time by learning the dynamics of their agents (the motorists).

The Routing Game Experiment with Walid Krichene

Learning and modeling behavioral changes in transportation networks

I am interested in using distributed networks to model large-scale mobility patterns in urban environments. Several scales of the problem are challenging and interesting.

At very large scale, we have studied the integration of cell tower records (mainly CDR data) into user equilibrium models, performing user equilibrium inference with a new approach called cellpath. At this scale, I am also interested in understanding the effect of massive adoption of new services (for example, routing services like Waze/Google, INRIX, Apple Maps, or Mobility as a Service (MaaS) apps) on congestion. In particular, I am interested in understanding how selfish routing algorithms redistribute traffic into previously uncongested areas, and what impact they have on the overall optimality of traffic.

At smaller scales, we have used filtering and estimation techniques to integrate GPS and mobile data into traffic flow models, following work started with the Mobile Millennium project [URL], which is still ongoing and the focus of great interest. Using the same types of models (networks of hyperbolic PDEs discretized with the Godunov scheme), we have created adjoint-based optimal control schemes to produce “on demand” congestion patterns, showing that with proper design of cost functions, one can create near-arbitrary patterns in time-space diagrams. We illustrated this by creating “Cal” logo-shaped time-space diagrams. Go Bears!

More recently, my work has focused on modeling Mobility as a Service (MaaS) companies such as Lyft and Uber using Jackson networks, to study the problems of fleet rebalancing and surge pricing. Using convex optimization through block-coordinate descent, I have created new modeling frameworks to characterize the impact of cyberattacks on MaaS companies (for example, through fake reservation requests that capture a competitor’s rides).
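As an illustration of the discretization mentioned above, here is a minimal Godunov-scheme sketch for a single link of the LWR traffic model. The Greenshields flux and the boundary conditions (zero inflow upstream, free outflow downstream) are illustrative assumptions, not the networked setup used in the actual work.

```python
def godunov_step(rho, dt, dx):
    """One Godunov update for the LWR model with Greenshields flux
    f(r) = r * (1 - r), critical density 0.5, densities normalized
    to [0, 1].  CFL condition: dt / dx <= 1 (max wave speed is 1)."""
    f = lambda r: r * (1.0 - r)
    demand = lambda r: f(min(r, 0.5))   # max flow the upstream cell can send
    supply = lambda r: f(max(r, 0.5))   # max flow the downstream cell can take
    n = len(rho)
    # Godunov flux at each internal interface: min(demand, supply).
    flux = [min(demand(rho[i]), supply(rho[i + 1])) for i in range(n - 1)]
    new = rho[:]
    for i in range(n):
        fin = flux[i - 1] if i > 0 else 0.0          # no inflow upstream
        fout = flux[i] if i < n - 1 else demand(rho[i])  # free outflow
        new[i] = rho[i] + dt / dx * (fin - fout)
    return new

# A jam of density 0.8 discharging into an empty downstream segment.
rho = [0.8] * 10 + [0.0] * 10
for _ in range(100):
    rho = godunov_step(rho, 0.05, 0.1)
```

Stepping this cell-transmission-style update forward propagates the rarefaction wave of the discharging jam downstream while keeping densities bounded, which is the building block the adjoint-based control schemes differentiate through.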