
dead code and butterflies

Yesterday, Windows Update decided I won the IE 10 lottery and earned myself a new browser. Then I remembered the IE SunSpider performance scandal (turns out this was for IE 9, my how time flies!) and my own thoughts on the difficulty and fragility of dead code optimizations. For background, the Lies, damned lies, and benchmarks article at Ars is a decent read.

At Coverity, one of the things we checked for was dead code. Not the intentional simple if (0) { deadcode++; } stuff, but more complex cases which we felt were likely to result from programmer error. If the second half of a function tests for a particular combination of local variable values, but the top half of the function cannot set that combination, maybe it’s a bug. Unfortunately, said checker was notoriously fragile. Customers would send us code samples and ask why we failed to detect the dead code. You’d take a look at the function and the first answer was always the same. No clue.
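
To make that concrete, here’s a small sketch of the pattern in C. It’s my own invented example, not customer code or anything from the checker’s internals; the point is that neither flag is a constant, yet no path through the top of the function can set both, so the test at the bottom is dead.

    #include <stdio.h>

    /*
     * Invented example of the kind of dead code the checker aimed at.
     * Neither flag is constant, but the top of the function can never
     * set both on the same path, so the final test is unreachable.
     */
    static void
    process(int request, int force)
    {
        int have_lock = 0;
        int did_write = 0;

        if (request > 0)
            have_lock = 1;          /* set only when request > 0 */
        else if (force)
            did_write = 1;          /* set only when request <= 0 */

        /* dead: no path above sets both have_lock and did_write */
        if (have_lock && did_write)
            printf("flushing %d\n", request);
    }

    int
    main(void)
    {
        process(1, 0);
        process(0, 1);
        return 0;
    }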

Then we’d analyze it with debug output. Peer deep into the tea leaves (ahem, log file) and ponder the results. This never went anywhere, so you’d make exactly the kinds of tweaks mentioned in a Hacker News thread, and then perform differential cryptanalysis of the tea leaves. Eventually, you’d uncover some change that could be made to the analysis. Run the testsuite, hope it doesn’t explode. Analyze some larger open source projects, hope it doesn’t explode. If everything checked out, you had a change that could be committed. Usually you didn’t. The small, tiny, absolutely inconsequential tweak needed for the analysis to handle this case inevitably resulted in massive variations in the results for some other code base. A butterfly flapping its wings.

I’m sympathetic to whichever Microsoft engineer worked on their optimizer. The exact problem I was tackling was slightly different from building an optimizer, but getting a dead code detector to handle a new case without accidentally killing off too much code is tricky business. If your change is successful, practically by definition it’s very narrowly targeted and won’t detect minor variations. Is it cheating? To the extent a benchmark is supposed to reflect real code, it makes a good test for an optimizer. Considering that the optimization in question didn’t have much impact on the overall SunSpider score, I’m not convinced it was even intentional. Perhaps somebody added a new dead code optimization and it just happened to render one benchmark obsolete.
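
To illustrate how narrow such a win can be, here’s a hypothetical pair of functions in C; this isn’t the SunSpider code and I have no idea what IE’s optimizer actually keys on. A pass that deletes a loop feeding a local nobody ever reads handles the first function. Let the value escape through a pointer the caller happens to ignore, and that same pass has nothing left to match.

    #include <stdio.h>

    /*
     * Hypothetical pattern a narrow dead code pass might catch: the loop
     * only feeds a local that is never read again, so the whole loop can
     * be deleted.
     */
    double
    bench_dead(int n)
    {
        double acc = 0.0;
        int i;

        for (i = 0; i < n; i++)
            acc += i * 0.5;
        return 0.0;                 /* acc is discarded */
    }

    /*
     * A trivial variation: same work, but the result escapes through a
     * pointer the caller ignores. Just as dead in practice, yet a pass
     * keyed to "local never read afterward" no longer fires.
     */
    double
    bench_alive(int n, double *out)
    {
        double acc = 0.0;
        int i;

        for (i = 0; i < n; i++)
            acc += i * 0.5;
        *out = acc;
        return 0.0;
    }

    int
    main(void)
    {
        double out;

        printf("%f %f\n", bench_dead(1000), bench_alive(1000, &out));
        return 0;
    }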

As you move past simple optimizations, the circumstances in which they can be applied grow fewer and fewer. It’s not a bad idea to just call it a day and quit while you’re ahead. But if you want to get really far ahead, you keep piling on more and more micro optimizations in an effort to cover more real world code. I don’t know how the IE team handled this or where IE 10 stands today; at Coverity we would have incrementally improved the checker until it handled more variations. One butterfly at a time.

Posted 13 Mar 2013 16:40 by tedu Updated: 14 Mar 2013 14:00
Tagged: programming thoughts