This webpage offers a series of benchmark problems for testing ADP/RL algorithms. In each we have found the *optimal* policy by creating and solving a discrete version of the problem.
We have found that popular algorithms built on various machine learning methods can work surprisingly poorly on classical inventory/storage problems. See
Daniel Jiang, Thuy Pham, Warren B. Powell, Daniel Salas, Warren Scott, “A Comparison of Approximate Dynamic Programming Techniques on Benchmark Energy Storage Problems: Does Anything Work?,” IEEE Symposium Series on Computational Intelligence, Workshop on Approximate Dynamic Programming and Reinforcement Learning, Orlando, FL, December, 2014.
We have had our best success with methods that can be described as “lookup table with structure.” Below, we list problems that exploit convexity and monotonicity.
Convex problems – The value function is convex in a single resource dimension
- Energy storage datasets I – prepared by Daniel Salas
Monotone problems – The value function is monotone in each dimension of the state variable
- Energy storage datasets II – prepared by Daniel Jiang (Created June 3 2015)
- Optimal stopping problems – prepared by Daniel Jiang (Created June 3 2015)
Energy storage datasets prepared by Warren Scott
This is a series of benchmarked energy storage datasets prepared by Warren Scott for the paper:
Energy storage datasets I – prepared by Daniel Salas
The datasets reflect a relatively simple energy storage problem that consists (in its full form) of one battery, a variable (but free) stochastic source (wind or solar), and a limitless source at a random price (the grid), which together serve a fairly predictable but time-varying load. We visualize the problem using
The Princeton energy storage benchmark datasets are a series of finite horizon problems that consist of four components:
- A renewable source of energy (free, but variable and usually stochastic).
- The power grid – an infinite supply of energy (and a market) at a random price.
- A load – usually time dependent, usually stochastic.
- A single storage device used to smooth out flows.
Most of these problems use time-dependent processes. These might reflect a daily cycle for energy storage, or they are simply randomly generated from a time-dependent process.
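To make the interaction of the four components concrete, here is a minimal sketch of a single time period. It is not the benchmark's model: the function name `step`, the fixed "renewable first, then storage, then grid" rule, and the absence of charge/discharge losses are all assumptions made for illustration.

```python
def step(storage, wind, load, price, capacity, grid_to_storage=0.0):
    """One period of a toy energy balance (hypothetical, not the benchmark model).

    Returns the new storage level and the cost of energy bought from the grid.
    """
    # Serve the load from the free renewable source first.
    wind_to_load = min(wind, load)
    unmet = load - wind_to_load
    surplus = wind - wind_to_load

    # Then discharge the storage device; buy whatever is still missing.
    storage_to_load = min(storage, unmet)
    grid_to_load = unmet - storage_to_load
    storage -= storage_to_load

    # Store surplus renewable energy and any optional grid purchase,
    # never exceeding the device capacity.
    room = capacity - storage
    wind_to_storage = min(surplus, room)
    grid_to_storage = min(grid_to_storage, room - wind_to_storage)
    storage += wind_to_storage + grid_to_storage

    cost = price * (grid_to_load + grid_to_storage)
    return storage, cost
```

In the benchmark problems the flows between components are decisions to be optimized; the fixed rule above only illustrates the bookkeeping that any policy has to respect.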
The problems are described in the paper
Daniel Salas, W. B. Powell, “Benchmarking a Scalable Approximate Dynamic Programming Algorithm for Stochastic Control of Multidimensional Energy Storage Problems,”
The problems below include both deterministic and stochastic settings. The optimal benchmark for the deterministic problems was computed by solving the full problem as a linear program. The stochastic problems were solved as discrete Markov decision processes. A description of how to use the datasets is contained in
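As an illustration of what solving a stochastic problem as a discrete Markov decision process involves, the sketch below applies backward induction to a toy storage problem. The model is an assumption made for this example and is much simpler than the benchmark problems: the storage level is an integer index, the price is drawn independently each period from a finite list, and the only actions are to sell one unit, hold, or buy one unit.

```python
def backward_induction(horizon, num_levels, prices, probs):
    """Exact solution of a toy finite-horizon storage MDP (hypothetical model).

    V[t][r] is the expected optimal profit from period t onward when the
    storage level is r and the period-t price has not yet been observed.
    policy[t][r][k] is the optimal action (-1 sell, 0 hold, +1 buy) when
    the observed price is prices[k].
    """
    V = [[0.0] * num_levels for _ in range(horizon + 1)]
    policy = [[[0] * len(prices) for _ in range(num_levels)]
              for _ in range(horizon)]

    for t in range(horizon - 1, -1, -1):          # step backward in time
        for r in range(num_levels):
            expected = 0.0
            for k, (p, q) in enumerate(zip(prices, probs)):
                best_value, best_action = float("-inf"), 0
                for a in (-1, 0, 1):
                    r_next = r + a
                    if not 0 <= r_next < num_levels:
                        continue                   # infeasible storage level
                    # Buying costs p, selling earns p.
                    value = -p * a + V[t + 1][r_next]
                    if value > best_value:
                        best_value, best_action = value, a
                policy[t][r][k] = best_action
                expected += q * best_value
            V[t][r] = expected
    return V, policy


V, policy = backward_induction(horizon=2, num_levels=2,
                               prices=[1.0, 3.0], probs=[0.5, 0.5])
```

Because every state, price, and action is enumerated, the resulting `V` and `policy` are exact for the toy model, which is what makes this kind of solution usable as a benchmark for approximate methods.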
The datasets include MATLAB code for generating the scenarios. For those who do not use MATLAB, the scenarios are also contained in a text file so that you can compare against exact benchmarks.
Deterministic datasets (10 problems)
Stochastic datasets (21 problems)
We have been undertaking a body of research where we exploit monotonicity in the value function. The monotone-ADP algorithm, and descriptions of the datasets, are given in
Daniel Jiang, W. B. Powell, “An Approximate Dynamic Programming Algorithm for Monotone Value Functions,” (under review)
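The idea of exploiting monotonicity can be sketched as a projection step on a lookup table: after the estimate at one state is updated, any other estimate that now violates monotonicity is moved to the new value. The code below is a minimal illustration in that spirit, not the authors' implementation; the function name, the dictionary representation, and the assumption that the value function is nondecreasing in every dimension are choices made for this example.

```python
def monotone_projection(values, state, new_value):
    """Update one lookup-table entry and restore monotonicity.

    values: dict mapping state tuples to value estimates, assumed to be
            nondecreasing in every dimension of the state.
    state:  the tuple whose estimate is being replaced by new_value.
    """
    def dominates(a, b):
        # True when a >= b componentwise.
        return all(x >= y for x, y in zip(a, b))

    out = dict(values)
    out[state] = new_value
    for s, v in values.items():
        if s == state:
            continue
        if dominates(s, state) and v < new_value:
            out[s] = new_value      # larger state must not have a smaller value
        elif dominates(state, s) and v > new_value:
            out[s] = new_value      # smaller state must not have a larger value
    return out


# A 3 x 3 table with values i + j, then a large update at state (1, 1).
table = {(i, j): float(i + j) for i in range(3) for j in range(3)}
updated = monotone_projection(table, (1, 1), 5.0)
```

A single observation therefore updates many states at once, which is the reason a structured lookup table can learn faster than an unstructured one.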
Energy storage datasets II – prepared by Daniel Jiang
These datasets are based on the Salas storage datasets (above), but include stochastic demands and use a more compact way of representing the optimal policy. The datasets, with complete software and documentation, can be downloaded from:
Optimal stopping problems – prepared by Daniel Jiang
The optimal stopping datasets, with complete software and documentation, can be downloaded from