“I just got this report from this monitoring service and the new page takes 20 seconds to load!”
Almost all of us in software have had someone complain that a web page or part of a program takes too long. The information thrown at us is not always clear, and someone wants it fixed. And they want it fixed now. Resist the temptation to create a knee-jerk solution and remember to tackle this problem systematically.
Verify The Current State
First, get your hands on the performance report they’re talking about. It may have useful information that could save you the next few steps. If that’s not available, try to figure out where the claim is coming from. You will need this initial information to eventually prove you sped it up.
Duplicate It
Try to load the page or run the program in your development environment so that you can see it happening in a place you can mess with. Make sure you’re seeing the same type of performance that the report mentions. If not, find out why, then come back. I once fixed an issue just by discovering that a configuration setting that should be off, except in super-special circumstances, was on.
Use a stopwatch or the developer tools in your browser to get times. Then do it again. And again. Get to the point where you have a pretty good idea of how long it takes, then record at least 10 runs to get a reasonably reliable average.
I just used the word statistics, and someone might think that it would take at least 100 runs to get a valid number. But let’s be pragmatic. If I have to load that 20 second page 100 times, with breaks for notes, that’s at least 35 minutes of really boring, tedious work. Unless there is really high variability, I’ll have good numbers in 10 to 20 runs. If there is high variability, fix that first, then come back here.
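If the slow thing can be triggered from a script, the repeated runs don't have to be done by hand. Here is a minimal sketch in Python, assuming you can wrap the page load or program run in a function. The names time_runs and summarize are mine, made up for illustration. The ratio of standard deviation to mean is one rough way to judge whether variability is high; what counts as "high" is your call.

```python
import statistics
import time


def time_runs(action, runs=10):
    """Call action() `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the mean, the sample standard deviation, and their ratio."""
    mean = statistics.mean(durations)
    stdev = statistics.stdev(durations)  # needs at least two runs
    return {
        "runs": len(durations),
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean,  # spread relative to the mean
    }
```

Keep the raw list of durations in your notes along with the summary. You will want both when you compare against later runs.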
Pinpoint It
Use logging statements as an unobtrusive way to get timings on the code in question. Start big and work to smaller pieces. Be sure to get timings around the important steps and then tackle the ones that are slow.
For example, you might be surprised that your initialization code, which started out not doing much, has bloated to a 5 second operation. Now you drill into your initialization code and repeat with new logging statements. Do this again and again until you find a problem you can fix, like making the same call to the database several times.
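As a rough sketch of what "start big and work to smaller pieces" can look like, here is a small timing helper built on Python's standard logging module. The helper name timed and the functions initialize, run_query, and handle_request are placeholders I made up; swap in the real steps of your own code.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("timing")


@contextmanager
def timed(label):
    """Log how long the enclosed block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3f s", label, elapsed)


def initialize():
    time.sleep(0.01)  # placeholder for real start-up work


def run_query():
    time.sleep(0.01)  # placeholder for a real database call


def handle_request():
    # Start big: time the whole request, then the important steps inside it.
    with timed("whole request"):
        with timed("initialization"):
            initialize()
        with timed("database query"):
            run_query()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)  # make INFO messages visible
    handle_request()
```

Once one of the inner blocks stands out as slow, move the timed blocks down into that function and run it again.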
I’ll squeeze in one small plug for one of my open source libraries here. HiPerfMetrics makes it super easy to pinpoint the troublesome code by putting Start and Stop time blocks around the code you want to monitor, with nice reporting. Check it out at HiPerfMetrics.
Fix It
Control yourself here. Only fix one thing. Once you have fixed the one thing, get new timings and record them so you can compare them to your original timings. Again, 10 to 20 runs should get you a solid number. Try to use the same number of runs you used originally.
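The comparison itself is simple arithmetic. Here is a minimal sketch, assuming you kept the original timings and the new ones as two lists of seconds. The function name compare is just for illustration.

```python
import statistics


def compare(before, after):
    """Compare mean timings (in seconds) from before and after one fix."""
    before_mean = statistics.mean(before)
    after_mean = statistics.mean(after)
    saved = before_mean - after_mean
    return {
        "before": before_mean,
        "after": after_mean,
        "saved": saved,  # negative means the change made things slower
        "percent_faster": 100 * saved / before_mean,
    }
```

A negative saved value means the change made things worse, which is worth knowing before you stack another fix on top of it.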
Keep Going
Hopefully, your last fix made a difference. Now repeat the Duplicate-Pinpoint-Fix cycle until you have improved it enough that you can brag about it.
Brag About It
Using your notes from the Fix It steps, show how the individual changes improved performance and how much performance improved overall. Create a spreadsheet with bar charts demonstrating the improvement. Also, make absolutely sure that the report mentioned in the beginning looks better now. It’s really embarrassing to say you’ve fixed it, but then have the report disagree with you.
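If your notes are already in a script, one way to get them into a spreadsheet is to write a CSV file and chart from that. This is only a sketch: the function name write_summary and the column names are my own invention, and it assumes the first entry in the list is the baseline measured before any fix.

```python
import csv


def write_summary(steps, stream):
    """Write one row per step; steps is a list of (label, mean_seconds).

    The first entry is treated as the baseline taken before any fix.
    """
    writer = csv.writer(stream)
    writer.writerow(["step", "mean_seconds", "saved_vs_baseline"])
    baseline = steps[0][1]
    for label, mean_seconds in steps:
        saved = baseline - mean_seconds
        writer.writerow([label, f"{mean_seconds:.2f}", f"{saved:.2f}"])
```

Pass it an open file, for example one created with open("timings.csv", "w", newline=""), then open the result in your spreadsheet program and build the bar chart from the mean_seconds column.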