Pinpoint Engineering

Understanding coding efficiency: the measures that matter

We’ve discussed the value of looking at your software development process as a pipeline—doing so allows us to borrow methodology from manufacturing to measure things like speed, throughput and quality. To continue that manufacturing analogy, most finished products are made up of components, and each of those parts has to be manufactured. In software, code is plainly the critical part, so understanding the health of the code pipeline is critical to understanding overall engineering performance.

If there were such a thing as GAAP for code stages, it would probably look something like this:

  1. Create branch
  2. Write code
  3. Create pull request
  4. Review pull request
  5. Modify code (sometimes)
  6. Merge code

The aim, of course, is to move as much code through these stages as quickly as possible, without loss of quality. This is our pipeline, and as with any pipeline, we want to know how much gets through, how fast, and how good is the result.

We use signals to illuminate engineering performance. In the case of the code pipeline, we derive five key signals to tell us how well the pipeline is performing.

1. Merge Time

Merge Time reflects the end-to-end time for coding work, measured in days. It bears a strong resemblance to Cycle Time, which measures the time from opening to completing an issue. Both signals evaluate how long it takes for work to move through their respective pipelines.

We also derive the distinct components of Merge Time. These are:

  • Time from branch creation to pull request creation
  • Time spent in review
  • Time from completed review to merge

This granularity makes it possible to identify bottlenecks.

2. Review Rate

Review Rate measures the percentage of merged code changes that went through a pull request review. The pull request and review process exists for a reason: to vet code quality and functionality before updating master. Review Rate shows the rigor with which people and teams are abiding by the PR process.

This isn’t to say code can never be merged without review. But understanding the rate at which this may be happening is important, especially in larger organizations, or for teams whose Defect Ratio (closed defects divided by open defects) is trending in the wrong direction.
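The computation itself is simple; the work is in gathering the data. A sketch, assuming each merged change carries a boolean `reviewed` flag (a hypothetical field — map it to whatever your Git host records):

```python
def review_rate(merged_changes):
    """Percent of merged changes that went through at least one review.

    merged_changes: list of dicts with a boolean 'reviewed' key
    (assumed shape for illustration).
    """
    if not merged_changes:
        return 0.0
    reviewed = sum(1 for c in merged_changes if c["reviewed"])
    return 100.0 * reviewed / len(merged_changes)

changes = [
    {"reviewed": True},
    {"reviewed": True},
    {"reviewed": False},  # e.g. a direct push to master
    {"reviewed": True},
]
print(review_rate(changes))  # 75.0
```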

3. Review Rework Rate

Review Rework Rate provides insight into how much merged code was modified as a result of a review. A high Review Rework Rate may signal that a developer is contending with code that’s higher in complexity, or whose function is poorly defined, or both. It may also be a symptom of speed over thoroughness. In any case, a high Review Rework Rate suggests a person in need of help, maybe in the form of further training or coaching.
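One way to sketch the calculation: treat lines changed in commits pushed after the first review as rework, and compare that to the total lines merged. The field names here are assumptions for illustration, not an exact specification of the signal.

```python
def review_rework_rate(merged_prs):
    """Share (percent) of merged code that was modified after review feedback.

    Each PR dict carries 'lines_merged' (total lines in the merged change)
    and 'lines_reworked' (lines changed in post-review commits).
    Both field names are assumed for this sketch.
    """
    total = sum(pr["lines_merged"] for pr in merged_prs)
    reworked = sum(pr["lines_reworked"] for pr in merged_prs)
    return 100.0 * reworked / total if total else 0.0

prs = [
    {"lines_merged": 300, "lines_reworked": 60},   # review drove changes
    {"lines_merged": 100, "lines_reworked": 0},    # approved as-is
]
print(review_rework_rate(prs))  # 15.0
```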

4. Code Throughput

Code Throughput looks at total code changes (additions and deletions) merged per person per month. Since code is a key output of any software team, Code Throughput is an important signal of team effectiveness and process maturity.
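Given a stream of merged commits, the per-person monthly figure could be derived roughly as follows. The `(month, additions, deletions)` tuple shape is an assumption; real data would come from commit stats on your Git host.

```python
from collections import defaultdict

def code_throughput(merged_commits, team_size):
    """Total code changes (additions + deletions) per person per month.

    merged_commits: iterable of (month, additions, deletions) tuples
    (assumed shape). Returns {month: changes_per_person}.
    """
    per_month = defaultdict(int)
    for month, additions, deletions in merged_commits:
        per_month[month] += additions + deletions
    return {m: total / team_size for m, total in per_month.items()}

commits = [("2019-05", 400, 100), ("2019-05", 300, 200)]
print(code_throughput(commits, team_size=5))  # {'2019-05': 200.0}
```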

5. Contribution Balance

Contribution Balance looks at how evenly coding work is balanced across a team. We’ve written about using the 80/20 rule to understand work balance. In the case of Contribution Balance, we ask: what percent of the team makes 80 percent of its coding changes?


Signals tell the story

The real power is in the natural (not to mention quantifiable) relationship among signals. As an example, consider a team with low Review Rework Rates, a comparatively slow Merge Time, good Contribution Balance, and low Code Throughput. Taken together, the signals suggest an overly methodical culture.

By illuminating the code pipeline, it becomes possible to measure the health of your delivery process, and to uncover areas where teams or people need help. Guesses and feelings (it feels like it takes that team forever to merge) are replaced with hard data. You can see not only the present reality, but what the trend has been, and whether your code pipeline is getting more or less efficient over time.
