Bicausal Optimal Transport for Markov Chains via Dynamic Programming
Vrettos Moulos, UC Berkeley, United States
D4-S3-T3: Reinforcement Learning
Thursday, 15 July, 22:40 - 23:00
Thursday, 15 July, 23:00 - 23:20
In this paper we study the bicausal optimal transport problem for Markov chains, an optimal transport formulation suited to stochastic processes that accounts for the accumulation of information over time. Our analysis rests on a relation between the transport problem and the theory of Markov decision processes. Through this relation we derive necessary and sufficient conditions for optimality in the transport problem, as well as an iterative algorithm, value iteration, for computing the transportation cost. We also draw the connection with the classical theory of couplings for Markov chains, and in particular with the notion of faithful couplings.
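To illustrate how value iteration might compute a bicausal transportation cost, here is a minimal sketch and not the paper's implementation. It makes several assumptions that the abstract does not state: the cost is accumulated with a discount factor `beta`, both chains live on the two states {0, 1}, and the function names `couplings_2pt` and `bicausal_value_iteration` are hypothetical. Under these assumptions the state of the decision process is a pair (x, y), the action is a coupling of the two transition rows P[x] and Q[y], and the Bellman update is V(x, y) = c(x, y) + beta * min over couplings of the expected next value. For two-point distributions the couplings form a one-parameter family, so the inner minimum is attained at one of two endpoints and no LP solver is needed.

```python
def couplings_2pt(p, q):
    """Extreme couplings of the distributions (p, 1-p) and (q, 1-q) on {0, 1}.

    A coupling pi is determined by t = pi[0][0], which must satisfy
    max(0, p + q - 1) <= t <= min(p, q). A linear objective over this
    interval is minimized at one of the two endpoints.
    """
    endpoints = (max(0.0, p + q - 1.0), min(p, q))
    return [[[t, p - t], [q - t, 1.0 - p - q + t]] for t in endpoints]


def bicausal_value_iteration(P, Q, cost, beta=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration for an assumed discounted bicausal transport cost.

    P, Q  : 2x2 row-stochastic transition matrices (nested lists).
    cost  : 2x2 one-step cost c(x, y).
    beta  : discount factor in (0, 1) (an assumption of this sketch).
    Returns the 2x2 value table V, where V[x][y] approximates the
    discounted cost when the chains start at x and y.
    """
    V = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(max_iter):
        V_new = [[0.0, 0.0], [0.0, 0.0]]
        for x in range(2):
            for y in range(2):
                # Inner optimal transport step: couple P[x] with Q[y] so that
                # the expected continuation value is as small as possible.
                best = min(
                    sum(pi[a][b] * V[a][b] for a in range(2) for b in range(2))
                    for pi in couplings_2pt(P[x][0], Q[y][0])
                )
                V_new[x][y] = cost[x][y] + beta * best
        diff = max(abs(V_new[x][y] - V[x][y]) for x in range(2) for y in range(2))
        V = V_new
        if diff < tol:
            break
    return V


# Usage: compare a chain with itself under the cost 1{x != y}.
P = [[0.9, 0.1], [0.2, 0.8]]
mismatch_cost = [[0.0, 1.0], [1.0, 0.0]]
V = bicausal_value_iteration(P, P, mismatch_cost, beta=0.9)
print(V)
```

When both chains share the same transition matrix and start in the same state, the diagonal coupling keeps them together forever, so the diagonal entries of `V` stay at zero. For larger state spaces the inner minimum would be a genuine optimal transport linear program, which this sketch deliberately avoids.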