The output of a Monte Carlo forecast looks like this:
| Confidence | Date |
|---|---|
| P50 | May 14 |
| P85 | July 10 |
| P95 | August 14 |
Three numbers. Three commitments. The stakeholder asks, "When will it be done?" — what do you give them?
This article walks through how to read each percentile and which one to commit to depending on context.
## The intuition
"P85 = July 10" means: 85% of the simulated futures finished by July 10. Or equivalently: there's a 15% chance the team is still working on it past that date.
Each percentile is a probability statement, not a date. The dates are derived from the probability — given how the team has performed historically, this is the date by which X% of simulated scenarios completed.
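If it helps to see the arithmetic, here is a minimal sketch in Python (the simulated outcomes are invented for illustration): each P-number is just a cut point in the sorted list of outcomes.

```python
import numpy as np

# Invented for illustration: each entry is the number of days one
# simulated future needed to finish the remaining scope.
simulated_days = np.array([38, 41, 45, 47, 52, 55, 58, 63, 70, 88])

# "P85" is simply the duration that 85% of simulated futures finished
# within; add it to today's date and you get the P85 calendar date.
for p in (50, 85, 95):
    print(f"P{p}: done within {np.percentile(simulated_days, p):.0f} days")
```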
## P50 — Optimistic / Median
**Reading:** "Half the simulations finished by this date. Half didn't."
P50 is the median. It's the date that has 50/50 odds.
**When to use:**
- Internal planning ("if we finish by P50 we have buffer for the next thing")
- Optimistic scenarios in stakeholder conversations ("optimistically, May 14")
- Stretch goals and OKRs, where 50/50 odds are the point
**When NOT to use:**
- External commitments. P50 is a coin flip — half the time you'll miss.
- Anything tied to revenue, contracts, or downstream dependencies.
**Translation:** "Optimistically, we're done by May 14, but we recommend planning around the more likely range."
## P85 — Target / Recommended commit
**Reading:** "85% of simulations finished by this date. 15% slipped."
P85 is the standard commit point for mature agile teams. It's conservative enough that you'll hit it most of the time, but not so cautious that every plan carries idle buffer.
**When to use:**
- Roadmap commitments to product / executive stakeholders
- Public release date communications
- Cross-team dependency planning
**Why 85% specifically?**
It's the sweet spot between confidence and ambition. P50 is too aggressive (a 50% chance of slipping). P95 is too conservative (it covers all but the worst 5% of outcomes, so most plans sit on dead schedule time). P85 forces honest planning while keeping accountability sharp.
**Translation:** "We're committing to July 10. There's an 85% chance we'll deliver by that date based on the team's recent throughput."
## P95 — Conservative / Buffer
**Reading:** "95% of simulations finished by this date. 5% slipped."
P95 is the buffer. It's the date you give to stakeholders who absolutely cannot tolerate a missed commitment — typically external customers under contract, or downstream teams whose work depends on completion.
**When to use:**
- Customer-facing SLA commitments
- Contract delivery dates
- Compliance deadlines (where missing has legal/financial consequences)
- Buffer planning for downstream releases
**When NOT to use:**
- Default planning. If P95 is the only date stakeholders see, your team will pad estimates and feedback loops will degrade.
- Internal sprint commits. Use P85.
**Translation:** "If the schedule absolutely cannot move, we plan around August 14 — that's the 95% confidence date."
## The three-tier conversation
Use all three percentiles in conversation, not just one:
"Best case, May 14. Recommended commit, July 10 — that's our 85% confidence target. Buffer for the contractual deadline, August 14 — 95% confidence. We'd suggest committing to July 10 internally and flagging August 14 to the customer success team for any external SLAs."
This frames the conversation around calibrated confidence, not arbitrary dates. The team commits to P85, the customer-facing comms reference P95, and P50 stays internal as a stretch goal.
## Common stakeholder objections
### "Why isn't this just one date?"
Because the team doesn't deliver work at a single fixed rate. It delivers work at a rate that varies based on scope, team size, and risk. A range reflects reality. A single date is a fiction.
"P85 sounds like an excuse to slip"
The opposite. P85 is harder to hit than the average — most teams' "estimates" are P50 or worse. Committing to P85 means the team is being more conservative, not less.
"We need a date, not a range"
Then commit to P85 and call it the date. Just understand: that date is a probability statement. If conditions change (scope grows, team shrinks, dependencies surface), the probability associated with that date changes too.
"Where do these numbers come from?"
From the team's actual throughput history. The simulation samples real past performance to project forward. It's not a guess — it's an inference. See Monte Carlo Forecasting for Azure DevOps for the methodology.
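The reference above covers the methodology in depth, but the core loop is compact enough to sketch here. This is the generic technique, not Nexus Hub Pro's implementation, and the throughput history is invented:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Invented history: items closed per week, as pulled from a real board.
weekly_throughput_history = [4, 7, 3, 6, 5, 0, 8, 5]
remaining_items = 60
n_simulations = 10_000

weeks_to_finish = []
for _ in range(n_simulations):
    done, weeks = 0, 0
    while done < remaining_items:
        # The Monte Carlo step: replay a randomly sampled past week.
        done += rng.choice(weekly_throughput_history)
        weeks += 1
    weeks_to_finish.append(weeks)

p50, p85, p95 = np.percentile(weeks_to_finish, [50, 85, 95])
print(f"P50: {p50:.0f} weeks, P85: {p85:.0f} weeks, P95: {p95:.0f} weeks")
```

Because the simulation replays weeks the team actually had, slow weeks and zero-throughput weeks show up in the output at the frequency they actually occurred.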
## Tracking confidence drift
The most useful operational practice: pin a target date and track how confidence drifts week over week.
Suppose you committed to "85% confidence by July 10" in early April. Two weeks later, you re-run the simulation. The output shows that July 10 is now only 62% confident — confidence dropped 23 points.
That's a leading indicator. Something changed: scope grew, the team had a slow week, dependencies surfaced. The drop is data — investigate, talk to the team, decide whether to re-baseline or accept the risk.
Without confidence tracking, you'd discover the slip in mid-July. With it, you spot the trend in mid-April and have time to act.
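Mechanically, the drift check is the forecast run in reverse: instead of asking which date hits 85%, you fix the date and ask what fraction of simulated futures land on or before it. A sketch under stated assumptions, with invented distributions tuned to roughly reproduce the 85%-to-62% example above:

```python
import numpy as np
from datetime import date

def confidence_at(target: date, as_of: date, simulated_weeks: np.ndarray) -> float:
    """Fraction of simulated futures finishing on or before the target date."""
    weeks_available = (target - as_of).days / 7
    return float((simulated_weeks <= weeks_available).mean())

# Invented stand-ins for two weekly simulation runs (year assumed for the
# example). Real runs would resample fresh throughput and backlog data.
rng = np.random.default_rng(1)
run_early_april = rng.normal(loc=11.4, scale=2.0, size=10_000)  # weeks to finish
run_mid_april = rng.normal(loc=10.8, scale=2.0, size=10_000)    # two weeks later: barely moved

pinned = date(2025, 7, 10)
print(f"early April: {confidence_at(pinned, date(2025, 4, 7), run_early_april):.0%}")
print(f"mid April:   {confidence_at(pinned, date(2025, 4, 21), run_mid_april):.0%}")
```

Note what the second run shows: two calendar weeks elapsed, but the simulated weeks-to-finish barely dropped. That gap is exactly the drift the practice is designed to surface.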
## Default thresholds for different contexts
| Context | Recommended commit percentile |
|---|---|
| Internal sprint planning | P85 |
| Quarterly roadmap to PM | P85 |
| Customer-facing release date | P95 |
| Compliance / regulatory deadline | P95 + 10% additional buffer |
| Cross-team dependency | P85 (with P95 published as worst-case) |
| Stretch goal / OKR | P50 |
Adjust based on your organization's tolerance for missed dates. Teams that default to P50 over-commit and generate constant stakeholder conflict; teams that default to P95 under-commit and erode trust with padded dates. Both are signs of broken calibration — P85 is the rebalance point.
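If you want these defaults enforced rather than remembered, the table is trivial to encode as a planning convention. A hypothetical sketch; the context names and forecast shape below are invented, not part of any tool:

```python
# Hypothetical encoding of the thresholds table as a team default.
COMMIT_PERCENTILE = {
    "internal_sprint": 85,
    "quarterly_roadmap": 85,
    "customer_release": 95,
    "compliance_deadline": 95,    # plus extra schedule buffer on top
    "cross_team_dependency": 85,  # publish P95 alongside as the worst case
    "stretch_goal": 50,
}

def commit_date(context: str, forecast: dict[int, str]) -> str:
    """Pick the commit date from a forecast like {50: "May 14", 85: "July 10"}."""
    return forecast[COMMIT_PERCENTILE[context]]

print(commit_date("customer_release",
                  {50: "May 14", 85: "July 10", 95: "August 14"}))  # August 14
```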
## Get P50/P85/P95 forecasts on your real Azure DevOps data
Nexus Hub Pro runs Monte Carlo simulations and produces all three percentiles in seconds. Pin a target date, track confidence drift week over week.
Install from Marketplace →