2022-05495 - Continuous Time Reinforcement Learning. From Theory to Practice.
Level of qualifications required : Graduate degree or equivalent
Function : Internship Research
About the research centre or Inria department
The Inria University of Lille centre, created in 2008, employs 360 people including 305 scientists in 15 research teams. Recognised for its strong involvement in the socio-economic development of the Hauts-De-France region, the Inria University of Lille centre pursues a close relationship with large companies and SMEs. By promoting synergies between researchers and industrialists, Inria participates in the transfer of skills and expertise in digital technologies and provides access to the best European and international research for the benefit of innovation and companies, particularly in the region.
For more than 10 years, the Inria University of Lille centre has been located at the heart of Lille's university and scientific ecosystem, as well as at the heart of Frenchtech, with a technology showroom based on Avenue de Bretagne in Lille, on the EuraTechnologies site of economic excellence dedicated to information and communication technologies (ICT).
Context
Reinforcement Learning (RL) has attracted a lot of attention in recent years. Deep RL has matched or even surpassed expert human performance on tasks such as Atari games and Go. Unlike classical machine learning, RL trains an agent that takes decisions based on the state of the environment it is in. This capability leads to applications in numerous domains.
One of the many applications of RL is control. The simplest control tasks, such as CartPole or Pendulum, are common testbeds for new RL algorithms, while more complex ones are of interest for robot learning. These tasks are usually considered in a discrete-time setting, which on the one hand simplifies the problem so that state-of-the-art RL algorithms are applicable, and on the other hand leads to suboptimal control because decisions are only taken at regular intervals. However, some problems require taking decisions at arbitrary moments in time or at high frequency, e.g. high-frequency stock trading, autonomous driving and snowboard riding.
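As a rough illustration of decisions taken at arbitrary moments, the sketch below steps a pendulum for a holding time chosen by the caller instead of a fixed tick. It is only a minimal sketch under stated assumptions: the class name, the physical parameters, the reward rate and the use of a Runge-Kutta integrator are all hypothetical choices made for this example, not part of the internship description.

```python
import math


class ContinuousTimePendulum:
    """Hypothetical pendulum whose step() takes an arbitrary holding time dt.

    State is (theta, omega), with theta = 0 hanging down and theta = pi upright.
    Assumed dynamics: theta' = omega, omega' = -(g/l) sin(theta) + u / (m l^2).
    """

    def __init__(self, theta=0.1, omega=0.0, g=9.81, length=1.0, mass=1.0,
                 max_substep=0.01):
        self.state = (theta, omega)
        self.t = 0.0
        self.g, self.length, self.mass = g, length, mass
        self.max_substep = max_substep  # upper bound on the internal integration step

    def _deriv(self, state, u):
        theta, omega = state
        accel = -(self.g / self.length) * math.sin(theta)
        accel += u / (self.mass * self.length ** 2)
        return (omega, accel)

    def _rk4(self, state, u, h):
        # One classical 4th-order Runge-Kutta step with the torque u held constant.
        def shift(s, k, c):
            return tuple(si + c * ki for si, ki in zip(s, k))

        k1 = self._deriv(state, u)
        k2 = self._deriv(shift(state, k1, 0.5 * h), u)
        k3 = self._deriv(shift(state, k2, 0.5 * h), u)
        k4 = self._deriv(shift(state, k3, h), u)
        return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                     for s, a, b, c, d in zip(state, k1, k2, k3, k4))

    def reward_rate(self, state, u):
        # Assumed reward per unit time: 0 when upright, negative elsewhere,
        # with a small penalty on the torque.
        theta, _ = state
        return -(1.0 + math.cos(theta)) - 1e-3 * u * u

    def energy(self):
        theta, omega = self.state
        kinetic = 0.5 * self.mass * self.length ** 2 * omega ** 2
        potential = -self.mass * self.g * self.length * math.cos(theta)
        return kinetic + potential

    def step(self, u, dt):
        """Hold torque u for a duration dt; return (state, reward accumulated over dt)."""
        n = max(1, math.ceil(dt / self.max_substep))
        h = dt / n
        reward = 0.0
        for _ in range(n):
            reward += h * self.reward_rate(self.state, u)  # left Riemann sum
            self.state = self._rk4(self.state, u, h)
        self.t += dt
        return self.state, reward


# The agent is free to pick a different holding time at every decision.
env = ContinuousTimePendulum()
for dt in (0.05, 0.2, 0.013):
    state, reward = env.step(u=0.5, dt=dt)
    print(f"t={env.t:.3f} state={state} reward={reward:.4f}")
```

Separating the holding time `dt` from the internal integration step means the simulated trajectory does not depend on how often the agent decides, only on which torque it applies and for how long.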
Continuous Time Reinforcement Learning (CTRL), in contrast to Discrete Time Reinforcement Learning (DTRL), deals directly with the continuity of the problem. In this context, the dynamics of the system are expressed as a PDE (Partial Differential Equation) for deterministic environments and an SDE (Stochastic Differential Equation) for stochastic environments. The value function (a measure of the quality of a policy) can be obtained from the Hamilton-Jacobi-Bellman equation, which replaces the Bellman equation of discrete time. Despite promising performance on simple use cases 1,2,3,4,5, CTRL methods do not match the performance of DTRL algorithms in the general case. There are several challenges that prevent CTRL from further scaling:
But the emerging trend of SciML, which intends to combine neural networks with PDEs/SDEs, as in physics-informed neural networks 6,7 or Neural ODEs 8, is bringing new tools to address CTRL.
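For reference, the relation between the discrete-time Bellman equation and the Hamilton-Jacobi-Bellman equation can be sketched as follows. The notation is an assumption made here and does not come from the posting: a finite-dimensional state x with dynamics dx/dt = f(x, u), a reward rate r(x, u), a discount rate rho, an infinite horizon, and, in the stochastic case, a diffusion term sigma(x, u) driven by a Wiener process W.

```latex
\begin{align*}
% Discrete time with step \delta t and discount factor e^{-\rho\,\delta t}
V(x) &= \max_{u} \Big[ r(x,u)\,\delta t
        + e^{-\rho\,\delta t}\, V\big(x + f(x,u)\,\delta t\big) \Big] + o(\delta t) \\
% Limit \delta t \to 0, deterministic dynamics \dot{x} = f(x,u)
\rho\, V(x) &= \max_{u} \Big[ r(x,u) + \nabla V(x)^{\top} f(x,u) \Big] \\
% Stochastic dynamics dx = f(x,u)\,dt + \sigma(x,u)\,dW
\rho\, V(x) &= \max_{u} \Big[ r(x,u) + \nabla V(x)^{\top} f(x,u)
        + \tfrac{1}{2}\operatorname{tr}\!\big(\sigma(x,u)\,\sigma(x,u)^{\top}\,\nabla^{2} V(x)\big) \Big]
\end{align*}
```

The second line follows from the first by a first-order Taylor expansion of V and of the discount factor, dividing by the step and letting it go to zero. The resulting equation is a PDE in the state variable, which is where tools such as physics-informed neural networks could plausibly apply.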
The objective of this internship is to develop a basic environment for CTRL with a few classical use cases (CartPole, Pendulum, Acrobot, Swimmer), test some promising strategies for CTRL and test some possible improvements.
Work:
about 590€ gross per month (internship allowance)
General Information
Theme/Domain : Optimization, machine learning and statistical methods
Scientific computing (BAP E)
Town/city : Lille
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.
Instruction to apply
CV + cover letter
Defence Security : This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy : As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.