Years ago, a friend of mine walked into an upscale bar in Chicago. He ordered a gin and tonic. “Twenty dollars,” the bartender told him. “Twenty!” my friend said, going into cardiac arrest. (This was a decade before craft cocktails and mixologists and $35 drinks.) “Why not forty? Why not sixty?”

It was the right response, even if it didn’t save him any money. This bar had decided that an ounce of gin, a splash of tonic, and a lime wedge were worth twenty dollars. Never mind that a bar two blocks over might have said the same things were worth five dollars. When an assigned value has no basis other than what the assigner feels is best, the number can be anything. Twenty makes as much sense as forty. Or sixty. Or a hundred. 

We do something similar when we assign story points. What is a “point” and how many should be assigned? Good question. It depends entirely on the team. This is by design. Points were created to obscure the time aspect of engineering work. With points, we (engineers) meant to give ourselves the ability to size effort, without giving others (sponsors, business stakeholders) a timeframe that could be questioned, debated, pushed, or used to compare teams. 

All good intentions, yes. But there are a few reasons it’s time to move on from story points.

The business can’t understand it

Business stakeholders can understand features. They can understand bugs. These things, manifested as issues, are the common currency between engineering and the business. Points only obscure things (again: by design). A burndown chart doesn’t inspire understanding or confidence. It fails to answer the basic questions in language anyone can understand: how much is done, how much remains, and when will it be finished?

Holding stubbornly to a thing that stakeholders can’t follow is a recipe for confusion at best, and distrust at worst. Imagine a sales rep declining to say how far along he was in dollars against quota, and instead using points velocity. Would you leave the room feeling inspired about the quarter?

We struggle to explain it

For engineers, obscuring the time aspect may once have sounded like a feature: Now management can’t bother us with their clueless assumptions around person-days! In fact, it’s a bug. A pretty big one. Adhering to story points puts us in the rather absurd role of estimating work size with units that have no common standard, and then trying to explain to others what we’ve come up with, and why. (I know of one case where the team devolved into simply holding up the number of fingers they felt correlated with the effort, with the manager taking the average of the number of fingers shown.)

Again, the rationale for points makes sense. Estimating isn’t an imperfect science; it’s not a science at all. There are too many variables in play—the experience of the team, the amount of concurrent work, the type and priority of the new work, the number of cross-team dependencies, even the time of year, to name a few—for people to be able to estimate effectively. So the desire to stop framing estimates in a specific number of days, which suggests a precision none of us have, and instead speak more impressionistically in “points,” is entirely understandable.

It just doesn’t actually help.

Make machines do the work

Truthfully, I suspect story points have persisted not because we think they’re an excellent system, but for lack of anything better. We’ve settled, if not for “good enough,” then at least for “what everyone else seems to be doing.” (We’ve also gotten religious about it: building up methods and best practices, writing books, hiring coaches… all of which is a surefire way to get anything to stick around.)

What would a better way look like? It would start with using language the business can follow: not points, but issues. Issues are the best way to show what engineering is doing, and how well it’s going.

Of course, this still leaves us with the bigger problem of estimation. Deciding that a certain program of work is roughly equivalent to 50 issues (across epics, stories, tasks, etc.) is considerably easier than estimating how much time (or points) those 50 issues are likely to take. Going back to what we said above, there are just too many variables for people to be able to estimate effectively.

But what if we replaced that one word, people, with a different one: machines?

A machine—specifically, machine learning (ML)—is optimized for exactly the kind of complexity that goes into estimating. How does this work? With Pinpoint, we scan the entire work history of an organization as reflected in issue management tools like Jira, examining the metadata associated with all the work items (including parent items, children, grandchildren, etc.). From this, we derive the historic Cycle Time and Throughput at the organization, team and individual levels. We refine the model further by analyzing type and priority of the work. Armed with this analysis, we can apply it to any new work and automatically forecast the time required.
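To make the idea concrete, here is a minimal sketch of the two historical measures named above, Cycle Time and Throughput, and of how they might feed a forecast. It is not Pinpoint’s model: it swaps the ML for plain medians and averages, and the Issue record, its fields, and every function name are hypothetical, chosen only for illustration. It also assumes each issue carries a start date and a finish date, which real issue-tracker exports may or may not provide in that form.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Issue:
    """Hypothetical record for one completed work item."""
    kind: str       # e.g. "bug" or "story"
    priority: str   # e.g. "high" or "low"
    started: date
    finished: date


def cycle_time_days(issue):
    """Days between starting and finishing a single issue."""
    return (issue.finished - issue.started).days


def median_cycle_times(history):
    """Median cycle time in days for each (kind, priority) pair."""
    buckets = defaultdict(list)
    for issue in history:
        buckets[(issue.kind, issue.priority)].append(cycle_time_days(issue))
    return {key: median(days) for key, days in buckets.items()}


def weekly_throughput(history):
    """Average issues finished per week, measured from the first
    finish date to the last. Spans under one week count as one week."""
    first = min(issue.finished for issue in history)
    last = max(issue.finished for issue in history)
    weeks = max((last - first).days / 7, 1)
    return len(history) / weeks


def forecast_weeks(new_issue_count, history):
    """Naive forecast: new issues divided by historical throughput."""
    return new_issue_count / weekly_throughput(history)
```

Given a list of completed Issue records, forecast_weeks(50, history) returns a rough number of weeks for a 50-issue program of work, and median_cycle_times(history) shows how type and priority shift the picture. A real model would weigh many more of the variables listed earlier; the sketch only shows that the inputs are issues and dates, not points.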

ML changes the game

Ron Jeffries, the originator of story points, recently took on their thorny history in a thoughtful post. Among his observations:

  • I think using [story points] to predict “when we’ll be done” is at best a weak idea.
  • I think tracking how actuals compare with estimates is at best wasteful.

Agreed on both fronts. Regarding his second point, I recall once upon a time asking teams to do exactly this. We would baseline our estimates (ha), track our actuals (haha), and, when the project was finished, we would go through the analysis of comparing actuals to estimates, using what we learned to make better estimates for future work (rofl).

As an idea, it sounded wonderful (to me). As a practical matter, it proved impossible. Too complex, too laborious, and all in service of something we’d forget about the minute another landslide of work hit. If you’d told me there would come a day when the right technology could automate most of this away, I would’ve begged for a time machine.

That day has come.

