ENBIS-18 in Nancy

2 – 25 September 2018; Ecoles des Mines, Nancy (France)
Abstract submission: 20 December 2017 – 4 June 2018

Direct Policy Search: An Introduction

4 September 2018, 09:20 – 09:40


Submitted by
Jérôme Collet (EDF R&D)
Usually, stochastic optimization of a system consists of the following steps: modelling the stochastic process driving the system, estimating its parameters, and searching for the policy. It is also possible to bypass the modelling and estimation steps: a policy is a function of the past variables, chosen to optimize a given criterion. So, if we assume the policy belongs to a suitable set of parametric functions, one “just” has to find the optimal parameter to choose a policy.
This bypass can serve several goals: reducing the computational burden (memory use or processing time), optimizing in a multi-objective setting (since some stochastic optimization methods are inherently single-objective), or avoiding the modelling of a poorly known stochastic process.
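
As an illustration of searching directly over policy parameters, here is a minimal sketch and not the method presented in the talk. It assumes a hypothetical toy storage problem: the capacity, the alternating prices, the synthetic demand scenarios (standing in for historical data), and the plain random search are all assumptions made for the example. The policy is a linear decision rule in the storage level and the price, and its three parameters are tuned on the scenarios without fitting any demand model.

```python
import random

CAPACITY = 5.0  # hypothetical storage size (assumption for this sketch)

def run_policy(theta, prices, demands):
    """Cost of one trajectory under the linear rule u = a + b*level + c*price."""
    a, b, c = theta
    level, cost = 0.0, 0.0
    for price, demand in zip(prices, demands):
        u = a + b * level + c * price  # desired charge (>0) or discharge (<0)
        # Keep the decision feasible: cannot discharge more than the level or
        # the demand, cannot charge beyond the remaining capacity.
        u = max(-min(level, demand), min(CAPACITY - level, u))
        level += u
        cost += price * (demand + u)  # energy bought from the grid
    return cost

def mean_cost(theta, scenarios):
    """Criterion to optimize: average cost over the available scenarios."""
    return sum(run_policy(theta, p, d) for p, d in scenarios) / len(scenarios)

def direct_policy_search(scenarios, n_iter=500, step=0.5, seed=0):
    """Plain random search: perturb the best parameter, keep improvements."""
    rng = random.Random(seed)
    best = (0.0, 0.0, 0.0)  # this parameter never uses the storage
    best_cost = mean_cost(best, scenarios)
    for _ in range(n_iter):
        cand = tuple(x + rng.gauss(0.0, step) for x in best)
        cand_cost = mean_cost(cand, scenarios)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost

def make_scenarios(n=20, horizon=24, seed=1):
    """Synthetic stand-in for observed price and demand trajectories."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n):
        prices = [1.0 if t % 2 == 0 else 3.0 for t in range(horizon)]
        demands = [rng.uniform(0.5, 2.0) for _ in range(horizon)]
        scenarios.append((prices, demands))
    return scenarios

scenarios = make_scenarios()
baseline = mean_cost((0.0, 0.0, 0.0), scenarios)
theta, cost = direct_policy_search(scenarios)
print(f"no storage: {baseline:.2f}, tuned rule: {cost:.2f}, theta: {theta}")
```

Because the search starts from the parameter that leaves the storage unused and only accepts improvements, the tuned rule can never cost more than that baseline on the scenarios used for tuning. Any derivative-free optimizer could replace the random search.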

EDF is now facing new energy management problems involving small energy storage systems (home energy storage, electric vehicles) with poorly known demand processes, so our goal is to bypass the modelling of these processes.

Research on “Direct Policy Search” has been increasingly active since the 2000s, so we survey recent papers. We then focus on two policy sets: Linear Decision Rules and Binary Decision Trees. We show their use in detail on small examples, both theoretical and real, and compare the results we obtained.
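
To make the second policy set concrete, the sketch below shows one possible way to encode a binary decision tree policy. The abstract does not describe the parameterization used in the talk, so the tuple encoding, the feature names `price` and `level`, and the numeric thresholds and actions are all hypothetical. The thresholds and leaf actions are the parameters a direct policy search would tune.

```python
def tree_policy(node, obs):
    """Walk a binary tree down to a leaf and return its action.

    Internal node: (feature, threshold, left, right); leaf: a float action.
    """
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if obs[feature] <= threshold else right
    return node

# Hypothetical depth-2 tree: charge when energy is cheap and the storage is
# not nearly full, do nothing when cheap and full, discharge when expensive.
tree = ("price", 2.0,
        ("level", 4.0, 1.0, 0.0),
        -1.0)

print(tree_policy(tree, {"price": 1.0, "level": 0.0}))  # 1.0 (charge)
print(tree_policy(tree, {"price": 1.0, "level": 5.0}))  # 0.0 (idle)
print(tree_policy(tree, {"price": 3.0, "level": 2.0}))  # -1.0 (discharge)
```

Unlike a linear decision rule, such a policy is piecewise constant in the observed variables, which makes it easy to read but harder to optimize with gradient-based methods.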