P50, P85, P95 — Reading Probabilistic Delivery Forecasts

Communication · Published April 2026

The output of a Monte Carlo forecast looks like this:

Confidence   Date
P50          May 14
P85          July 10
P95          August 14

Three numbers. Three commitments. The stakeholder asks, "When will it be done?" — what do you give them?

This article walks through how to read each percentile and which one to commit to depending on context.

The intuition

"P85 = July 10" means: 85% of the simulated futures finished by July 10. Or equivalently: there's a 15% chance the team is still working on it past that date.

Each percentile is a probability statement, not a date. The dates are derived from the probability — given how the team has performed historically, this is the date by which X% of similar scenarios completed.
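To make that reading concrete, here is a minimal sketch of how the percentile dates fall out of the simulated futures. Everything in it is illustrative: a gaussian stands in for real simulation output, and the helper name is made up.

```python
import random

random.seed(7)
# Hypothetical Monte Carlo output: one simulated completion time
# (in days from today) per trial; a gaussian stands in for real output.
simulated_days = sorted(random.gauss(60, 15) for _ in range(10_000))

def percentile(samples_sorted, p):
    """Nearest-rank percentile: the value below which p% of the
    simulated futures finished."""
    k = max(0, min(len(samples_sorted) - 1,
                   round(p / 100 * len(samples_sorted)) - 1))
    return samples_sorted[k]

p50 = percentile(simulated_days, 50)
p85 = percentile(simulated_days, 85)
p95 = percentile(simulated_days, 95)

# Slip risk at the P85 date: the share of trials still running past it.
# By construction this is about 15%.
slip_risk = sum(d > p85 for d in simulated_days) / len(simulated_days)
```

The percentile is just a cut point in the sorted simulations, which is why "P85 = July 10" and "15% chance of slipping past July 10" are the same statement.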

P50 — Best case

Reading: "Half the simulations finished by this date. Half didn't."

P50 is the median. It's the date that has 50/50 odds.

When to use:

- Stretch goals and OKRs, where ambition matters more than certainty.
- Framing the optimistic end of the range alongside the recommended commit.

When NOT to use:

- Any date a stakeholder will plan around. A 50% chance of slipping is a coin flip, not a commitment.

Translation: "Best case, we're done by May 14, but we recommend planning around the more likely range."

P85 — Target / Recommended commit

Reading: "85% of simulations finished by this date. 15% slipped."

P85 is the standard commit point for mature agile teams. It's confident enough that you'll hit it most of the time, but not so conservative that every forecast carries needless padding.

When to use:

- Internal sprint and quarterly roadmap commitments.
- Cross-team dependency dates.
- Any commitment you'll be held to but where an occasional slip is survivable.

Why 85% specifically?

It's the sweet spot between confidence and ambition. P50 is too aggressive (a 50% chance of slipping). P95 is too conservative (only a 5% slip risk, but with substantial schedule padding baked in). P85 forces honest planning while keeping accountability sharp.

Translation: "We're committing to July 10. There's an 85% chance we'll deliver by that date based on the team's recent throughput."

P95 — Conservative / Buffer

Reading: "95% of simulations finished by this date. 5% slipped."

P95 is the buffer. It's the date you give to stakeholders who absolutely cannot tolerate a missed commitment — typically external customers under contract, or downstream teams whose work depends on completion.

When to use:

- Contractual or customer-facing release dates.
- Compliance and regulatory deadlines.
- Dates that downstream teams hard-block on.

When NOT to use:

- Routine internal planning. Committing everything at P95 pads every schedule and trains the team to sandbag.

Translation: "If the schedule absolutely cannot move, we plan around August 14 — that's the 95% confidence date."

The three-tier conversation

Use all three percentiles in conversation, not just one:

"Best case, May 14. Recommended commit, July 10 — that's our 85% confidence target. Buffer for the contractual deadline, August 14 — 95% confidence. We'd suggest committing to July 10 internally and flagging August 14 to the customer success team for any external SLAs."

This frames the conversation around calibrated confidence, not arbitrary dates. The team commits to P85, the customer-facing comms reference P95, and P50 stays internal as a stretch goal.

Common stakeholder objections

"Why isn't this just one date?"

Because the team doesn't deliver work at a single fixed rate. It delivers work at a rate that varies based on scope, team size, and risk. A range reflects reality. A single date is a fiction.

"P85 sounds like an excuse to slip"

The opposite. P85 is harder to hit than the average — most teams' "estimates" are P50 or worse. Committing to P85 means the team is being more conservative, not less.

"We need a date, not a range"

Then commit to P85 and call it the date. Just understand: that date is a probability statement. If conditions change (scope grows, team shrinks, dependencies surface), the probability associated with that date changes too.

"Where do these numbers come from?"

From the team's actual throughput history. The simulation samples real past performance to project forward. It's not a guess — it's an inference. See Monte Carlo Forecasting for Azure DevOps for the methodology.
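The sampling step can be sketched in a few lines. Everything here is an illustrative assumption, not the product's actual implementation: the throughput history, the item count, and the function name are all made up for the example.

```python
import random

# Hypothetical inputs: items closed per week over the last 12 weeks,
# and the number of backlog items left in the scope being forecast.
weekly_throughput = [4, 6, 3, 5, 7, 2, 5, 6, 4, 5, 3, 6]
remaining_items = 40

def simulate_weeks(history, remaining, trials=10_000, seed=42):
    """For each trial, replay randomly sampled historical weeks
    until the backlog is exhausted; return weeks-to-done per trial."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        done, weeks = 0, 0
        while done < remaining:
            done += rng.choice(history)   # sample a real past week
            weeks += 1
        results.append(weeks)
    return sorted(results)

weeks = simulate_weeks(weekly_throughput, remaining_items)
n = len(weeks)
p50, p85, p95 = (weeks[int(n * q) - 1] for q in (0.50, 0.85, 0.95))
```

Because every simulated week is drawn from weeks the team actually had, the resulting percentiles inherit the team's real variability rather than an assumed distribution.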

Tracking confidence drift

The most useful operational practice: pin a target date and track how confidence drifts week over week.

Suppose you committed to "85% confidence by July 10" in early April. Two weeks later, you re-run the simulation. The output shows that July 10 is now only 62% confident — confidence dropped 23 points.

That's a leading indicator. Something changed: scope grew, the team had a slow week, dependencies surfaced. The drop is data — investigate, talk to the team, decide whether to re-baseline or accept the risk.

Without confidence tracking, you'd discover the slip in mid-July. With it, you spot the trend in mid-April and have time to act.
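The drift check itself is a one-liner once you keep the simulated finish times from each run. This is a sketch with made-up sample arrays and a hypothetical target; only the mechanism (fraction of futures finishing by the pinned date) is the point.

```python
def confidence_at(simulated_finish_days, target_day):
    """Fraction of simulated futures that finish by the pinned date."""
    hits = sum(d <= target_day for d in simulated_finish_days)
    return hits / len(simulated_finish_days)

# Illustrative samples: an early-April run vs. a re-run two weeks later.
april_run = [55, 60, 62, 64, 66, 68, 70, 72, 80, 95]
later_run = [60, 66, 70, 74, 78, 82, 86, 90, 96, 110]

target = 82  # hypothetical days remaining until the pinned commit date

drift = confidence_at(later_run, target) - confidence_at(april_run, target)
if drift <= -0.10:
    print(f"confidence dropped {abs(drift):.0%}, investigate now")
    # prints: confidence dropped 30%, investigate now
```

Alerting on the week-over-week delta, rather than the absolute confidence, is what turns the forecast into a leading indicator.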

Default thresholds for different contexts

Context                            Recommended commit percentile
Internal sprint planning           P85
Quarterly roadmap to PM            P85
Customer-facing release date       P95
Compliance / regulatory deadline   P95 + 10% additional buffer
Cross-team dependency              P85 (with P95 published as worst-case)
Stretch goal / OKR                 P50

Adjust based on your organization's tolerance for missed dates. Teams under frequent stakeholder pressure tend to over-commit (habitually promising P50 dates); teams that have lost stakeholder trust tend to under-commit (promising only P95 dates). Both are signs of broken calibration; P85 is the rebalance point.
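If you want these defaults in code, a plain lookup is enough. The context keys below are invented for illustration; the percentile values come from the table above.

```python
# Default commit percentile per planning context (keys are illustrative).
COMMIT_PERCENTILE = {
    "internal_sprint": 85,
    "quarterly_roadmap": 85,
    "customer_release": 95,
    "compliance_deadline": 95,    # plus extra buffer on top
    "cross_team_dependency": 85,  # publish P95 as worst-case too
    "stretch_goal": 50,
}

def commit_percentile(context: str) -> int:
    # P85 as the fallback: the rebalance point for unlisted contexts.
    return COMMIT_PERCENTILE.get(context, 85)
```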

Get P50/P85/P95 forecasts on your real Azure DevOps data

Nexus Hub Pro runs Monte Carlo simulations and produces all three percentiles in seconds. Pin a target date, track confidence drift week over week.

Install from Marketplace →