This webpage offers a series of benchmark problems for testing ADP/RL algorithms. For each problem, we have found the *optimal* policy by creating and solving a discrete version of the problem.

We have found that popular algorithms built on various machine learning methods can work surprisingly poorly on classical inventory/storage problems. See:

Daniel Jiang, Thuy Pham, Warren B. Powell, Daniel Salas, Warren Scott, “A Comparison of Approximate Dynamic Programming Techniques on Benchmark Energy Storage Problems: Does Anything Work?,” IEEE Symposium Series on Computational Intelligence, Workshop on Approximate Dynamic Programming and Reinforcement Learning, Orlando, FL, December, 2014.

We have had our best success with methods that can be described as “lookup table with structure.” Below, we list problems that exploit convexity and monotonicity.

Convex problems – The value function is convex in a single resource dimension

Monotone problems – The value function is monotone in each dimension of the state variable

Energy storage datasets prepared by Warren Scott 

This is a series of benchmarked energy storage datasets prepared by Warren Scott for the paper:

Click here for the datasets.

Energy storage datasets I – prepared by Daniel Salas

The datasets reflect a relatively simple energy storage problem consisting (in its full form) of one battery, a variable (but free) stochastic source (wind or solar), and a limitless source at a random price (the grid), which together serve a fairly predictable but time-varying load. We visualize the problem using the figure below.

[Figure: Princeton energy storage]

The Princeton energy storage benchmark datasets are a series of finite-horizon problems that consist of four components:

    • A renewable source of energy (free, but variable and usually stochastic).
    • The power grid – an infinite supply of energy (and a market) at a random price.
    • A load – usually time dependent, usually stochastic.
    • A single storage device used to smooth out flows.

Most of these problems use time-dependent processes. These might reflect a daily cycle for energy storage, or they are simply randomly generated from a time-dependent process.

The problems are described in the paper

Daniel Salas, W. B. Powell, “Benchmarking a Scalable Approximate Dynamic Programming Algorithm for Stochastic Control of Multidimensional Energy Storage Problems,”

The problems below include both deterministic and stochastic settings. The optimal benchmark for the deterministic problems was computed by solving the full problem as a linear program. The stochastic problems were solved as discrete Markov decision processes. A description of how to use the datasets is contained in

Readme file

The datasets include MATLAB code for generating the scenarios. For those who do not use MATLAB, the scenarios are also provided in a text file so that you can compare against the exact benchmarks.
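To illustrate what solving a stochastic problem as a discrete Markov decision process involves, here is a minimal sketch of backward induction on a toy storage problem. It is not the code shipped with the datasets: the function name `solve_storage_mdp`, the one-unit charge/discharge decisions, the i.i.d. price model, and the zero terminal value are all assumptions made for illustration.

```python
def solve_storage_mdp(T, r_max, prices, probs):
    """Backward induction for a toy finite-horizon storage problem.

    Hypothetical model (not taken from the datasets):
      state    = (storage level r in 0..r_max, index i of the current price)
      decision = x in {-1, 0, +1}: sell one unit, hold, or buy one unit
      reward   = -prices[i] * x   (buying costs money, selling earns it)
      prices are i.i.d. across time with distribution `probs`.
    Returns V[t][r][i] (optimal value) and policy[t][r][i] (optimal x).
    """
    n_p = len(prices)
    # V[T] is the terminal value: leftover energy is assumed worthless.
    V = [[[0.0] * n_p for _ in range(r_max + 1)] for _ in range(T + 1)]
    policy = [[[0] * n_p for _ in range(r_max + 1)] for _ in range(T)]

    for t in range(T - 1, -1, -1):
        # Expected value of arriving at level r at time t + 1,
        # averaged over the next (independent) price.
        EV = [sum(q * V[t + 1][r][j] for j, q in enumerate(probs))
              for r in range(r_max + 1)]
        for r in range(r_max + 1):
            for i, p in enumerate(prices):
                best_val, best_x = float("-inf"), 0
                for x in (-1, 0, 1):
                    r_next = r + x
                    if not 0 <= r_next <= r_max:
                        continue  # infeasible: storage bounds violated
                    val = -p * x + EV[r_next]
                    if val > best_val:
                        best_val, best_x = val, x
                V[t][r][i] = best_val
                policy[t][r][i] = best_x
    return V, policy


# Example: two periods, one unit of capacity, price is 1 or 3 with equal odds.
V, policy = solve_storage_mdp(T=2, r_max=1, prices=[1.0, 3.0], probs=[0.5, 0.5])
```

Because every state is enumerated, the resulting policy is exactly optimal for the discretized model, which is what makes it usable as a benchmark. The cost is a table that grows with the product of all state dimensions.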

Deterministic datasets (10 problems)

Stochastic datasets (21 problems)

Monotone problems

We have been undertaking a body of research where we exploit monotonicity in the value function. The monotone-ADP algorithm, and descriptions of the datasets, are given in

Daniel Jiang, W. B. Powell, “An Approximate Dynamic Programming Algorithm for Monotone Value Functions,” (under review)
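As a rough illustration of how monotonicity can be exploited in a lookup table, the sketch below updates one state with a new observation and then adjusts any other entries that would violate monotonicity. It is only a sketch of the general idea, not the algorithm as specified in the paper: the function name `monotone_projection`, the dictionary representation, and the assumption that the value function is nondecreasing in every dimension are illustrative choices.

```python
def monotone_projection(V, s, z):
    """Set V[s] = z, then restore monotonicity across the lookup table.

    Assumes (for illustration) that the value function should be
    nondecreasing in each dimension of the state.
    V : dict mapping state tuples to value estimates
    s : the state tuple that was just observed
    z : the new value estimate for s
    Returns a new dict; the input is left unchanged.
    """
    out = {}
    for sp, v in V.items():
        if sp == s:
            out[sp] = z
        elif all(a >= b for a, b in zip(sp, s)):
            # sp dominates s, so its value must be at least z.
            out[sp] = max(v, z)
        elif all(a <= b for a, b in zip(sp, s)):
            # sp is dominated by s, so its value must be at most z.
            out[sp] = min(v, z)
        else:
            # sp and s are not comparable; leave the estimate alone.
            out[sp] = v
    return out


# Example: a 3 x 2 grid with initial estimates V(a, b) = a + b.
V0 = {(a, b): float(a + b) for a in range(3) for b in range(2)}
V1 = monotone_projection(V0, s=(1, 0), z=5.0)
```

A single observation at one state thus updates every comparable state, which is how structure lets a lookup table learn faster than updating one entry at a time.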

Energy storage datasets II – prepared by Daniel Jiang

These datasets are based on the Salas storage datasets (above), but include stochastic demands and use a more compact way of representing the optimal policy. The datasets, with complete software and documentation, can be downloaded from:

Energy storage datasets II

Optimal stopping problems – prepared by Daniel Jiang

The optimal stopping datasets, with complete software and documentation, can be downloaded from

Optimal stopping datasets