What’s new?

Below we provide a list of working papers (under review at various journals), papers that have been accepted but have not yet appeared, and papers that have recently appeared.

Recent working papers (in various stages of review): Updated November, 2016.

Lina al-Kanj, W.B. Powell, B. Bouzaiene-Ayari, “The Information-Collecting Vehicle Routing Problem: Stochastic Optimization for Emergency Storm Response,”

B. Cheng, W.B. Powell, “Co-optimizing Battery Storage for Frequency Regulation and Energy Shifting using Multiscale Dynamic Programming,”

Tsvetan Asamov, W. B. Powell, “Regularized Decomposition of High-Dimensional Multistage Stochastic Programs with Markov Uncertainty,”

Xinyu He, Warren B. Powell, “Optimal Learning for Stochastic Optimization with Nonlinear Parametric Belief Models,”

Ilya Ryzhov, Martijn Mes, W.B. Powell, Gerald van den Berg, “Bayesian Exploration for Approximate Dynamic Programming,”

Ricardo Collado, Jae Ho Kim, W.B. Powell, “Quantile Optimization in Electricity Trading in the Presence of Storage with Heavy-Tailed Prices,”

Tsvetan Asamov, W.B. Powell, “SDDP vs. ADP: The Effect of Dimensionality in Multistage Stochastic Optimization for Grid Level Energy Storage,”

Wang, Y. W. B. Powell, R. Schapire, “Finite-time analysis for the knowledge-gradient policy,”

Daniel Jiang, W. B. Powell, “Optimal Policies for Risk-Adverse Electric Vehicle Charging with Spot Purchases,”

Yingfei Wang, Warren B. Powell, “The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks with an Application in Personalized Health Care”

Si Chen, Kristofer-Roy G. Reyes, Xinyu He, Maneesh K. Gupta, Jesse P. Goodman, Michael C. McAlpine, Warren B. Powell, “Optimal Learning for Release Profile Matching with a Noncentral Chi-Squared Noise Distribution”

Vazquez-Anderson, Jorge; Baldridge, Kevin; Reyes, Kristofer; Powell, Warren; Contreras, Lydia N, “Understanding the role of structural accessibility in RNA targeting via biophysical modeling of in vivo anti-sense hybridization,”

Ilya Ryzhov, Martijn Mes, W.B. Powell, Gerald van den Berg, “Bayesian Exploration for Approximate Dynamic Programming,”

Yan Li, Han Liu, W.B. Powell, “The Knowledge Gradient Policy using a Sparse Additive Belief Model,”

J. Khazaei, W. B. Powell, “SMART-Invest -A stochastic, dynamic planning for optimizing investments in wind, solar, and storage in the presence of fossil fuels: The Case of the PJM Electricity Market,”

Javad Khazaei, Michael Coulon, W.B. Powell, “ADAPT: A Price Stabilizing Compliance Policy for Renewable Energy Certificates: The Case of SREC Markets,” Click here for copy.

W. R. Scott, W. B. Powell, S. Moazeni,  “Least Squares Policy Iteration with Instrumental Variables vs. Direct Policy Search: Comparison Against Optimal Benchmarks Using Energy Storage,”

Daniel Jiang, W. B. Powell, “An Approximate Dynamic Programming Algorithm for Monotone Value Functions,”

R. Collado, J. Kim, W. B. Powell, , “Quantile Optimization using Asymmetric Signum Functions”

Donghun Lee, W. B. Powell, “Bias-Corrected Q-Learning to Control Max Operator Bias”


Papers which are to appear:

S. Moazeni, W.B. Powell, B. Defourny, B. Bouzaiene-Ayari, “Parallel Non-Stationary Direct Policy Search,” Informs J. on Computing.

C. Archer, Hugo P. Simao, W. Kempton, W.B. Powell, M. Dvorak, “The challenge of integrating offshore wind power in the US electric grid. Part I: Wind Forecast Error,” Renewable Energy (to appear)

Hugo P. Simao, W.B. Powell, C. Archer, W. Kempton, “The challenge of integrating offshore wind power in the US electric grid. Part II: Simulation of electricity market operations,” Renewable Energy (to appear)

Warren B Powell, A Unified Framework for Optimization under Uncertainty, Informs TutORials, November, 2016. (invited, lightly reviewed).

This tutorial article shows how a wide range of problems, spanning classical stochastic search, dynamic programming, stochastic programming, optimal stopping, stochastic control, along with learning problems such as the multiarmed bandit problem, can all be cast in a common framework.

Daniel Salas, W. B. Powell, “Benchmarking a Scalable Approximation Dynamic Programming Algorithm for Stochastic Control of Multidimensional Energy Storage Problems” Informs J. on Computing (to appear)

The article describes a scalable algorithm based on approximate dynamic programming that optimizes battery storage across a grid. The paper includes careful benchmarking that shows that the method provides very near optimal solutions on realistic, time-dependent problems.

Arta Jamshidi and W. B. Powell, “A Recursive Local Polynomial Approximation Method using Dirichlet Clouds and Radial Basis Functions” SIAM J. on Scientific Computation (to appear)

The article describes a nonparametric method for local approximation of smooth functions that can be easily updated in recursive applications.

Lina al-Kanj, Belgacem Bouzaiene-Ayari, W. B. Powell, “A Probability Model for Grid Failures Using Incomplete Outage Information,” IEEE Transactions on the Smart Grid (to appear)

Utilities have to restore the grid from storm outages, without knowing precisely where the outages have occurred. Their primary source of information is “lights out” calls from customers who have lost their power. But an outage can trigger a recloser that cuts power to a large number of houses that are no-where close to the actual outage. We develop a probability model that uses Bayes theorem to calculate the probability of outages based on the history of lights out calls, as well as a failure to call, which is also a form of information. We present a rigorous probability model, describe the application of Bayes theorem, and then describe a strategy for overcoming the familiar curse-of-dimensionality that typically arises when using Bayes theorem.

Somayeh Moazeni, I. Arciniegas Rueda, M. Coulon, B. Song, W.B. Powell, “A Non-Parametric Structural Hybrid Modeling Approach for Electricity Prices,” J. of Quantitative Finance (to appear)

Utilities face the problem of signing electricity contracts for up to 10 years into the future. Forward contracts do not exist for electricity contracts this far into the future, but fuel contracts are available. We use a nonparametric structural model of the supply stack and a load forecast, combined with stochastic forecasts of fuel prices, to create a stochastic model of 10-year electricity prices.

Harvey Cheng, W. B. Powell, Arta Jamshidi, “Optimal Learning with a Local Parametric Belief Model,” J. Global Optimization

Introduces the knowledge gradient using a hierarchical, local parametric belief model. The paper includes a convergence proof and demonstrates the effectiveness on different problems, including the surprisingly difficult newsvendor problem.

(Click here to download paper)

Boris Defourny, Ilya O. Ryzhov, W. B. Powell, “Optimal Information Blending with Measurements in the L2 Sphere,” Mathematics of Operations Research (to appear)

This paper introduces risk in the context of optimal learning using a knowledge gradient-type policy.

(Click here to download paper)


Moazeni, Somayeh, W.B. Powell, A. H. Hajimiragha, “Mean-Conditional Value-at-Risk Optimal Energy Storage Operation in the Presence of Transaction Costs,” IEEE Transactions on Power Systems, Accepted July, 2014 (to appear).

This paper addresses the formulation and solution of an optimal energy storage management problem under risk consideration and transaction costs of trading energy with the power grid. The price evolves as a stochastic process, capable of correctly explaining the seasonality effects as well as the tail fatness and spikiness in its distribution. Transaction costs capture the price impact of the storage operation on the electricity spot price. A risk analysis of an optimal risk neutral deterministic policy as well as the simple myopic policy indicates that the realized operational cost may notably differ from the expected cost by a considerable probability. This difference suggests that we need to consider risk. Using the downside risk measure of Conditional Value-at-Risk, an optimal risk averse conversion and transmission strategy, among the grid, the renewable power generation source, and an energy storage is proposed to fully satisfy the electricity demand and minimize the expected operational cost as well as the risk. Our numerical study using data from NYISO demonstrates the impacts of risk consideration and the transaction cost parameters.

(Click here to download paper)


Belgacem Bouzaiene-Ayari, C. Cheng, S. Das, R. Fiorillo, W.B. Powell, “From Single Commodity to Multiattribute Models for Locomotive Optimization: A Comparison of Integer Programming and Approximate Dynamic Programming,” Transportation Science,  Appeared online, August 4, 2014.  http://dx.doi.org/10.1287/trsc.2014.0536

The paper compares three optimization models for the locomotive planning problem for Norfolk Southern railroad: an aggregate model that is used in production to provide high level guidance on locomotive flows, a more detailed multicommodity flow formulation (which could be solved only over a three day horizon), and a very detailed model based on the modeling and algorithmic framework of approximate dynamic programming. The ADP-based formulation, called PLASMA, has been used by NS for a number of years for strategic fleet planning, and for daily operational planning. NS (as of this writing) only uses the deterministic version of the code, but this paper describes experiments using stochastic travel times (a major issue at North American freight railroads) and demonstrates that ADP produces much more robust solutions. Click here for more details.

(Click here to download paper)

W. B. Powell, “Perspectives of Approximate Dynamic Programming,” Annals of Operations Research, Annals of Operations Research (2012), DOI 10.1007/s10479-012-1077-6 (c) Springer.

This is an overview of ADP, covering some of the historical evolution, how to model a sequential decision process, and a summary of the four fundamental classes of policies that are used in most of the stochastic optimization literature, spanning reinforcement learning, dynamic programming, stochastic programming, and variations of lookahead policies.

(Click here to download paper)

Papers which have recently appeared:

W. B. Powell and S. Meisel, “Tutorial on Stochastic Optimization in Energy Part I: Models and Policies,” IEEE Trans. on Power Systems, Vol. 31, No. 2, pp. 1459-1467, 2016.

Reviews the different canonical models used by different communities, and introduces our own canonical modeling framework and introduces the four classes of policies.

(Click here to download paper)

W. B. Powell, Stephan Meisel, “Tutorial on Stochastic Optimization in Energy II: An energy storage illustration”, IEEE Trans. on Power Systems, Vol. 31, No. 2, pp. 1468-1475, 2016

Illustrates the process of modeling a stochastic, dynamic system using an energy storage application, and shows that each of the four classes of policies works best on a particular variant of the problem. A fifth problem shows that in some cases a hybrid policy is needed.

(Click here to download paper)

Si Chen, K-R G. Reyes, M. Gupta, M. C. McAlpine, W. B. Powell, “Optimal Learning in Experimental Design Using the Knowledge Gradient Policy with Application to Characterizing Nanoemulsion Stability,” SIAM J. Uncertainty Quantification, Vol. 3, pp. 320-345, 2015. DOI 10.1137/140971129.

Calculating the E max of the knowledge gradient is computationally difficult when the underlying belief model is nonlinear in the parameters. This paper proposes and tests the idea of using a knowledge gradient policy based on a sampled belief model. The policy is demonstrated in the context of an experimental setting designing the parameters for a water-oil-water mixture for drug delivery.

(Click here to download paper)

Yingfei Wang, K. G. Reyes, K. A. Brown, C. A. Mirkin, W. B. Powell, “Nested Batch Mode Learning and Stochastic Optimization with an Application to Sequential Multi-Stage Testing in Materials Science,” SIAM J. Scientific Computing, Vol. 37, No. 3, pp. B361-B381, DOI: 10.1137/140971117, 2015.

This paper addresses a complex optimal learning problem arising in the laboratory sciences, which involves guiding the process of making nested, batch decisions. The first decision involves creating a design of a nano particle. Then it is possible to test the performance (the reflectivity of the surface of a material) by testing different densities, which can be done in a batch experiment.

(Click here to download paper)

Michael Coulon, Javad Khazaei, W. B. Powell, “SMART-SREC: A stochastic model of the New Jersey solar renewable energy certificate market,” Journal of Environmental Economics and Management, Vol. 73, pp. 13-31, 2015.

New Jersey uses targets to encourage the use of solar energy by setting up a market for solar renewable energy certificates. This paper develops a dynamic programming model of these markets and calibrates the model against historical data for New Jersey.

(Click here to download paper)

D. Jiang, W. B. Powell, “Optimal Hour-ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming” Informs J. on Computing, Vol. 27, No. 3, pp. 525-543 (2015).

The hour-ahead bidding problem requires setting a low (buy) and high (sell) bid that determines how a battery is managed one hour in the future, which means that the post-decision state has to include both the bids we just determined, along with other variables such as energy in storage, prices and exogenous energy sources. The dynamic program features a value function that is monotone in all the state variables, which is exploited in the design of an optimal algorithm.

(Click here to download paper)

Yingfei Wang, K. G. Reyes, K. A. Brown, C. A. Mirkin, W. B. Powell, “Nested Batch Mode Learning and Stochastic Optimization with an Application to Sequential Multi-Stage Testing in Materials Science,” SIAM J. Scientific Computing, Vol. 37, No. 3, pp. B361-B381, DOI: 10.1137/140971117, 2015.

The paper proposes a method for computing the knowledge gradient when the belief model is nonlinear in the parameters, using a discretized sample of the unknown parameters. The method is demonstrated in the context of a complex experimental problem involving the design of nanoemulsions..

(Click here to download paper)

Yingfei Wang, K. G. Reyes, K. A. Brown, C. A. Mirkin, W. B. Powell, “Nested Batch Mode Learning and Stochastic Optimization with an Application to Sequential Multi-Stage Testing in Materials Science,” SIAM J. Scientific Computing, Vol. 37, No. 3, pp. B361-B381, DOI: 10.1137/140971117, 2015.

The paper develops a knowledge gradient policy for guiding an initial design decision (e.g. the size and shape of nanoparticles) followed by batch learning of a secondary tunable parameter (e.g. the density of particles) to maximize a metric (reflexivity of a surface).

(Click here to download paper.)

Daniel Jiang, Thuy Pham, Warren B. Powell, Daniel Salas, Warren Scott, “A Comparison of Approximate Dynamic Programming Techniques on Benchmark Energy Storage Problems: Does Anything Work?,” IEEE Symposium Series on Computational Intelligence, Workshop on Approximate Dynamic Programming and Reinforcement Learning, Orlando, FL, December, 2014.

This paper uses two variations on energy storage problems to investigate a variety of algorithmic strategies from the ADP/RL literature. All of these methods are tested on benchmark problems that are solved optimally, so that we get an accurate estimate of the quality of the policies being produced. Somewhat surprisingly, generic machine learning algorithms for approximating value functions did not work particularly well. What did work well is best described as “lookup table with structure.” The structure we exploit is convexity and monotonicity. Test datasets are available at http://www.castle.princeton.edu/datasets.htm.

(Click here to download paper)

Ryzhov, I., P. I. Frazier and W. B. Powell, “Stepsize Selection for Approximate Value Iteration and a New Optimal Stepsize Rule,” IEEE Transactions on Automatic Control, Vol. 60, No. 3, pp. 743-758, 2015.

This paper studies stepsizes in the context of bootstrapping-based algorithms (value iteration, Q-learning), which introduces the problem of designing a stepsize that balances the need to add costs/rewards over time (which requires larger stepsizes), while smoothing noise (which requires smaller stepsizes). We provide tight upper and lower bounds that proves that 1/n converges to the optimal solution, but converges so slowly it should never be used. We then use a single state, single-action problem to derive an optimal stepsize for both steady state and finite horizon problems.

(Click here to download paper)

W.B. Powell, B. Bouzaiene-Ayari, Clark Cheng, Sourav Das, Ricardo Fiorillo, Coleman Lawrence, “Locomotive Planning at Norfolk Southern: An Optimizing Simulator Using Approximate Dynamic Programming,” Interfaces, Vol. 44, No. 6, pp. 567-578., 2014.

This paper describes the implementation of PLASMA at the Norfolk Southern Railroad.Click here for more details.

(Click here to download paper)

Warren B. Powell. “Clearing the Jungle of Stochastic Optimization.” INFORMS Tutorials in Operations Research: Bridging Data and Decisions, pp. 109-137, November, 2014, http://dx.doi.org/10.1287/educ.2014.0128

This invited tutorial unifies different communities working on sequential decision problems. First, it provides a simple, five-part canonical form for modeling stochastic dynamic programs (drawing off established notation from the controls community), with a thorough discussion of state variables. It then summarizes four fundamental classes of policies called policy function approximations (PFAs), policies based on cost function approximations (CFAs), policies based on value function approximations (VFAs), and lookahead policies. There is a detailed discussion of stochastic lookahead policies (familiar to stochastic programming). It closes with a summary of results using approximate value functions in an energy storage problem.

(Click here to download paper)

Lauren Hannah, W.B. Powell, D. Dunson, “Semi-Convex Regression for Metamodeling-Based Optimization,” SIAM J. on Optimization, Vol. 24, No. 2, pp. 573-597, (2014).

Stochastic search involves finding a set of controllable parameters that minimizes an unknown objective function using a set of noisy observations. We consider the case when the unknown function is convex and a metamodel is used as a surrogate objective function. Often the data are non-i.i.d. and include a observable state variable, such as applicant information in a loan rate decision problem. State information is difficult to incorporate into convex models. We propose a new semi-convex regression method that is used to produce a convex metamodel in the presence of a state variable. We show consistency for this method. We demonstrate its effectiveness for metamodeling on a set of synthetic inventory management problems and a large, real-life auto loan dataset.

(Click here to download paper)

E. Barut and W. B. Powell, “Optimal Learning for Sequential Sampling with Non-Parametric Beliefs,” Journal of Global Optimization, Vol. 58, pp. 517-543 (2014). DOI 10.1007/s10898-013-0050-5.

This paper derives the knowledge gradient when the belief model is captured using kernel regression, a class of nonparametric statistical models.

(Click here to download paper)

Shi, N., Song, H., & Powell, W. B. The dynamic fleet management problem with uncertain demand and customer chosen service level. International Journal of Production Economics, 148, pp. 110–121 (2014). doi:10.1016/j.ijpe.2013.09.010

(Click here to download paper)