How to Identify Software Development Bottlenecks
Resolving Bottlenecks in your Software Engineering Team
Step 1: Get a Baseline
First, we measure our North Star metrics. These give us a picture of the overall performance of the engineering process. The measurement window should capture enough data to be statistically significant.
Establish a baseline Cycle Time by looking at the team's 3-6 month average. This tells us the average time from first commit to Pull Request merge, and will help us diagnose issues that appear between these two points.
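As a rough illustration of what the baseline represents, the sketch below averages the time from first commit to merge over a trailing window. The record layout, the `baseline_cycle_time_hours` helper and the sample timestamps are hypothetical and are not taken from Haystack; Haystack computes this figure for you.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical pull request records: (first_commit_at, merged_at).
pull_requests = [
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 3, 9, 0)),    # 24 hours
    (datetime(2024, 2, 10, 9, 0), datetime(2024, 2, 12, 9, 0)),  # 48 hours
    (datetime(2024, 3, 5, 9, 0), datetime(2024, 3, 8, 9, 0)),    # 72 hours
    (datetime(2023, 6, 1, 9, 0), datetime(2023, 6, 2, 9, 0)),    # outside the window
]


def baseline_cycle_time_hours(prs, window_end, window_days=180):
    """Average hours from first commit to merge for PRs merged in the window."""
    window_start = window_end - timedelta(days=window_days)
    durations = [
        (merged - first_commit).total_seconds() / 3600
        for first_commit, merged in prs
        if window_start <= merged <= window_end
    ]
    if not durations:
        raise ValueError("no merged pull requests in the baseline window")
    return mean(durations)


# Assumed 180-day (roughly 6 month) window ending on 1 April 2024.
baseline = baseline_cycle_time_hours(pull_requests, window_end=datetime(2024, 4, 1))
print(f"Baseline Cycle Time: {baseline:.1f} hours")
```

A longer window smooths out one-off spikes, which is why the baseline uses several months rather than a single sprint.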
Use the Haystack filters to identify your team or repository if you'd like a more localised picture.
Step 2: Drill Down to Identify Improvement Areas
Next, we need to understand where the constraints are in the software engineering process. The video at the top of this article shows this process interactively, but we've also described it here.
We can see that this team's Cycle Time is predominantly spent with code in Review Time. By clicking "Review Time Drill-In", we can examine this phase in more detail.
The Review Time report then shows us that most of this time is spent with the code in Rework Time.
By selecting Rework Time in the data table below, we can then see in the distribution table that some outlier Pull Requests take over 72 hours to complete.
We can then identify the common risk factor associated with these Pull Requests: they are too large.
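The drill-down above can be sketched in a few lines: find the phase that consumes the most time, pull out the Pull Requests that exceed 72 hours in that phase, then check whether those outliers are unusually large. The phase names, the `lines_changed` field and the sample figures are hypothetical and only stand in for the data Haystack shows in its reports.

```python
from statistics import median

# Hypothetical records: hours spent per phase and lines changed for each PR.
pull_requests = [
    {"id": 1, "development": 10, "first_response": 4, "rework": 6, "lines_changed": 120},
    {"id": 2, "development": 8, "first_response": 3, "rework": 90, "lines_changed": 1400},
    {"id": 3, "development": 12, "first_response": 5, "rework": 8, "lines_changed": 200},
    {"id": 4, "development": 9, "first_response": 2, "rework": 80, "lines_changed": 950},
]
PHASES = ("development", "first_response", "rework")  # assumed phase names
OUTLIER_HOURS = 72  # threshold used in the walkthrough above


def dominant_phase(prs, phases=PHASES):
    """Return the phase with the largest total hours across all PRs."""
    totals = {phase: sum(pr[phase] for pr in prs) for phase in phases}
    return max(totals, key=totals.get)


def outliers(prs, phase, threshold_hours=OUTLIER_HOURS):
    """Return the PRs that spent more than threshold_hours in the given phase."""
    return [pr for pr in prs if pr[phase] > threshold_hours]


constraint = dominant_phase(pull_requests)
slow = outliers(pull_requests, constraint)

# Compare the outliers' size with the median PR size to spot a common risk factor.
typical_size = median(pr["lines_changed"] for pr in pull_requests)
oversized = [pr["id"] for pr in slow if pr["lines_changed"] > typical_size]

print(f"Constraint: {constraint}")
print(f"Outlier PRs: {[pr['id'] for pr in slow]}")
print(f"Outliers larger than the median PR: {oversized}")
```

Comparing against the median is only one possible size check; a fixed line-count limit agreed by the team would work just as well.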
Step 3: Understand the Root Cause and Propose Fix
After identifying the constraint, you can set about finding out how to fix it. You may find there are various technical, process and human factors at play that you need to remedy.
Some problems, like slow code reviews, can be fixed by simply enabling Haystack Slack or email notifications to alert developers of common risks so they can take action in time. Others are deeper issues that require further analysis.
We discuss this in the next article in this series: How to Fix Software Development Bottlenecks
Step 4: Repeat Frequently
Once you've got a fix in place, it's tempting to just keep measuring the Leading Metric associated with it. For example, if your constraint was that your Pull Requests were too big, you could monitor the Pull Request Size metric as it keeps going further and further down. However, there comes a point at which it is no longer the constraint.
At that point the constraint is broken and the area for optimisation moves elsewhere. Instead of focussing on local metrics, look at the global North Star metrics. When a constraint is broken, begin the process afresh to identify and break the new constraint.
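One simple way to check whether the constraint has moved is to compare the phase breakdown before and after the fix, alongside the change in overall Cycle Time. The sketch below assumes you have average hours per phase for two periods; the phase names, the `review_constraint` helper and the numbers are hypothetical.

```python
def largest_phase(phase_hours):
    """Return the phase that accounts for the most hours."""
    return max(phase_hours, key=phase_hours.get)


def review_constraint(before, after):
    """Compare two periods and report whether the constraint has moved."""
    previous = largest_phase(before)
    current = largest_phase(after)
    return {
        # Global view: change in total Cycle Time across all phases.
        "cycle_time_change_hours": sum(after.values()) - sum(before.values()),
        "previous_constraint": previous,
        "current_constraint": current,
        "constraint_moved": previous != current,
    }


# Hypothetical average hours per phase before and after the fix.
before = {"development": 20, "review": 50, "deploy": 10}
after = {"development": 20, "review": 12, "deploy": 10}

result = review_constraint(before, after)
print(result)
```

In this example review no longer dominates, so further effort on review would bring little benefit and the next drill-down should start from the new largest phase.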
Be careful not to let inertia lead you to keep optimising something that is no longer the constraint.