
Sequential Decision Analytics: A Better Framework

Why single-shot optimization isn't enough and how to think about decisions that unfold over time.


The Problem with “Solve Once, Execute Forever”

Most optimization in practice works like this: gather data, build a model, solve it, get a plan, execute the plan. It’s clean, it’s familiar, and it’s taught in every operations research textbook.

It’s also a terrible fit for most real-world problems.

Why? Because the world doesn’t stand still while you execute your plan. New information arrives. Demand shifts. A truck breaks down. A shipment is delayed. A customer cancels a massive order. Every one of these events changes what the “optimal” action should be — but your single-shot plan has no mechanism to adapt.

Sequential decision analytics offers a fundamentally different way to think about these problems. Instead of asking “what’s the optimal plan?”, it asks: “what’s the best policy for making decisions over time as new information arrives?”

This is the framework pioneered by Warren Powell, and it’s the backbone of how we approach decision problems at Bit Bros.

Decisions Unfold Over Time

Consider a distribution center managing inventory for 5,000 SKUs. Every day, new decisions need to be made:

  • Which items to replenish and in what quantity
  • How to allocate scarce warehouse capacity
  • Which orders to prioritize for shipping
  • Whether to accept a spot buy opportunity

Each of these decisions is made under uncertainty (you don’t know tomorrow’s demand), and each decision affects future states (ordering too much today means less warehouse space tomorrow). This is not a single optimization problem — it’s a sequence of decisions linked through time, state, and information.

Sequential decision problems have three defining characteristics:

  1. Decisions are made at multiple points in time, not all at once
  2. New information arrives between decisions, changing what you know
  3. Earlier decisions constrain later ones through resource states, inventory levels, commitments, etc.
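These three characteristics can be captured in a tiny simulation loop. The sketch below is illustrative only: the inventory levels, costs, and the placeholder decision rule are all hypothetical, but the shape of the loop — decide, then observe new information, then transition the state — is the essence of a sequential decision problem.

```python
def policy(state):
    # Placeholder decision rule: when stock runs low, order back up to 100.
    # (Any of the four policy classes discussed below could replace this.)
    return 100 - state["inventory"] if state["inventory"] < 40 else 0

demands = [30, 55, 20, 45, 35]       # information revealed AFTER each decision
state = {"inventory": 60, "total_cost": 0.0}

for demand in demands:
    order = policy(state)                            # decide with what you know now
    state["total_cost"] += 2.0 * order               # ordering cost (hypothetical)
    state["inventory"] = max(0, state["inventory"] + order - demand)
    state["total_cost"] += 1.0 * state["inventory"]  # holding cost on leftovers
```

Note that the decision at each step depends on the inventory left behind by earlier decisions — that coupling through state is exactly what single-shot optimization ignores.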

The Four Meta-Classes of Policies

Here’s where it gets powerful. Warren Powell’s framework identifies four fundamental approaches to designing policies for sequential decision problems. Every method you’ve ever seen — from simple rules to deep reinforcement learning — falls into one of these four meta-classes.

1. Policy Function Approximations (PFAs)

What it is: A direct mapping from the current state to an action. Think of it as a decision rule or lookup table.

Supply chain example: “When inventory for SKU X drops below 200 units, order 500 units.” That’s a PFA. The parameters (200 and 500) can be tuned, but the structure of the rule is fixed.

When it works: When the decision space is manageable and you can parameterize a sensible rule. PFAs are simple, fast, interpretable, and often surprisingly effective. Don’t underestimate a well-tuned (s, S) inventory policy.
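The rule from the example above is small enough to write in one function. The parameter values below are the ones from the text and purely illustrative; in practice they would be tuned by simulation:

```python
def pfa_order_quantity(inventory, reorder_point=200, order_qty=500):
    # PFA: the structure of the rule is fixed; only the two
    # parameters (reorder_point, order_qty) are tunable.
    return order_qty if inventory < reorder_point else 0
```

The entire "policy" is this mapping from state to action — no optimization is solved at decision time.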

2. Cost Function Approximations (CFAs)

What it is: Modify the objective function of a deterministic optimization to implicitly account for uncertainty, then solve the modified problem.

Supply chain example: You have a vehicle routing model that minimizes total distance. But you know travel times are uncertain, so you add a term that penalizes routes with high travel-time variance. You’re still solving a “one-shot” optimization, but the modified cost function steers you toward robust solutions.

When it works: When you already have a working deterministic optimization model and want to make it uncertainty-aware without rebuilding it from scratch. CFAs are a pragmatic bridge between deterministic and stochastic approaches.
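A toy version of the routing example makes the idea concrete. The route data and the penalty weight `theta` below are hypothetical; the point is that the deterministic objective (distance) is augmented with a tunable uncertainty term, and the rest of the optimization machinery is unchanged:

```python
def route_cost(distance, travel_time_var, theta=0.5):
    # CFA: deterministic objective plus a tuned penalty on uncertainty.
    return distance + theta * travel_time_var

def pick_route(routes, theta=0.5):
    # routes: list of (name, distance, travel_time_variance) tuples
    return min(routes, key=lambda r: route_cost(r[1], r[2], theta))[0]

routes = [
    ("short_but_risky", 100, 80),   # shorter, but high travel-time variance
    ("longer_but_stable", 120, 5),  # longer, but predictable
]
```

With `theta = 0` this collapses back to the pure deterministic model and picks the short route; with `theta = 0.5` the penalty tips the choice to the stable one. Tuning `theta` is where the stochastic thinking lives.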

3. Value Function Approximations (VFAs)

What it is: Estimate the future value (or cost) of being in a particular state, then make decisions that optimize immediate reward plus estimated future value.

Supply chain example: In a warehouse allocation problem, a VFA might estimate the value of having open capacity in each zone. When deciding whether to accept a new storage request, you weigh the immediate revenue against the estimated future value of keeping that space available for higher-priority items.

When it works: When the state space is manageable enough to approximate and the value of future states significantly impacts current decisions. This is the territory of reinforcement learning and approximate dynamic programming.
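The warehouse-allocation example reduces to a simple comparison once the value function is in hand. In the sketch below, `value_per_unit_space` stands in for a learned estimate (which in a real VFA would come from approximate dynamic programming or reinforcement learning); the numbers are hypothetical:

```python
def accept_request(revenue, space_needed, value_per_unit_space):
    # VFA: immediate reward vs. the approximated value of the future state.
    # Accepting consumes capacity whose estimated future worth is the
    # opportunity cost of saying yes.
    opportunity_cost = space_needed * value_per_unit_space
    return revenue > opportunity_cost
```

All the hard work is hidden in estimating `value_per_unit_space`; the decision itself becomes trivial once the value function approximation exists.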

4. Direct Lookahead Approximations (DLAs)

What it is: At each decision point, solve an approximate model of the future to determine what to do now. Only the first decision is executed; then new information arrives and you solve again.

Supply chain example: Rolling horizon planning — every week, you solve a multi-week optimization model, execute only the first week’s decisions, then re-solve next week with updated data. Model Predictive Control (MPC) in process industries works exactly this way.

When it works: When you can build a reasonable model of the near future and solve it fast enough to be practical. DLAs are powerful but computationally expensive — the quality of your lookahead model matters enormously.
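A stripped-down rolling-horizon loop shows the mechanics. The greedy "cover the forecast" planner below is a stand-in for a real optimization model, and the demands, forecasts, and order cap are all made up — what matters is the pattern: solve over the remaining horizon, execute only the first decision, then re-solve with the updated state:

```python
def lookahead_plan(inventory, forecast, order_cap=100):
    # Tiny deterministic lookahead: plan orders to cover forecast demand
    # over the horizon, respecting a per-period order cap.
    plan, inv = [], inventory
    for d in forecast:
        order = min(order_cap, max(0, d - inv))
        plan.append(order)
        inv = inv + order - d
    return plan

inventory = 50
demands = [60, 40, 90, 30]      # actual demand, revealed one period at a time
forecast = [55, 45, 80, 35]     # imperfect forecast used by the lookahead

executed = []
for t in range(len(demands)):
    plan = lookahead_plan(inventory, forecast[t:])  # re-solve from current state
    order = plan[0]                                 # execute ONLY the first decision
    executed.append(order)
    inventory = max(0, inventory + order - demands[t])
```

Even though every solve produces a multi-period plan, only its first step ever touches reality — the rest exists solely to make that first step forward-looking.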

There Is No One-Size-Fits-All

This is the critical insight: the right policy class depends on the problem. A simple PFA might crush a complex VFA on a problem with a manageable decision space. A CFA might be the right move when you have an existing optimization pipeline and can’t justify a rewrite. A DLA might be necessary when the future interactions between decisions are too complex for simpler approaches.

The framework gives you a language for design. Instead of jumping straight to a technique (reinforcement learning! linear programming! heuristics!), you first ask:

  • What are my decisions?
  • What information do I have when I make them?
  • What uncertainty am I facing?
  • How do my decisions interact over time?

Then you pick the policy class — or combination of classes — that fits.

Why This Matters

Most organizations are stuck in one of two extremes: either they use rigid deterministic plans (a degenerate DLA with no uncertainty), or they use simple rules of thumb (untuned PFAs). The sequential decision analytics framework reveals the entire design space between these extremes.

Better decisions don’t always require more data or fancier algorithms. Sometimes they require a better framework for thinking about the problem in the first place. That’s what sequential decision analytics provides — and it’s why we consider it essential to everything we do.