books chapter five
A few different perspectives this week.
All the previous interviewees were generally kind of down on Java, but now we’re talking to Joshua Bloch, Java Architect at Google, who’s pretty obviously down with Java. He also likes the Design Patterns book, in stark contrast to some previous subjects.
As far as programming goes, in any language, he spends a lot of time just getting the names of identifiers right. Always keep a dictionary handy. Everybody knows one letter names are bad, but this is the first I’ve read about this level of attention. That kind of sounds excessive at first, but after some reflection I think he’s on to something. I can recall a few times when a function was roughly named correctly, but there was enough wiggle room for misunderstanding. Java is infamous for preposterously long names, but I suspect it’s possible to conjure up some precise short names as well.
We spend some time talking about the coming age of concurrency and how Java is best suited to tackle it, and the rising use of Java within Google to replace some C++ code, and the burden of adding generics to Java. It reads not just a little, but a lot, like an interview today with someone from the Go team. His advocacy for Java concurrency seems pretty weak. They’ve got low-level primitives, but also ConcurrentHashMap, which to be honest doesn’t seem all that high level. You look around at Go or Rust today, and they’re taking a different approach. His comments on generics read exactly like what one would read today about Go. You didn’t really need them to write code, then they added them and the language became far more complicated than anybody expected. The more things change.
One of the hardest problems he worked on turned out to be what we now call stack clash. A debug library would occasionally print the wrong thread ID in its messages. This was ignored as inconsequential. Finally, other problems arose, and after some debugging they found that an enormous stack variable was causing the stack pointer to jump way past the red zone and into another thread’s stack. A valuable lesson in not ignoring inconsequential bugs without understanding the cause.
A colleague apparently mentioned that problems like this are why it’s important to understand how things work all the way down. He counters that it’s a reason to use safe languages (I believe the bug happened in C code), but that sounds like exactly the reason people need to study what’s happening beneath the abstraction. A “safe” language that doesn’t do stack probing will have the exact same issue.
Tim Brady was the first employee at Yahoo. Yahoo got its start as a simple catalog of links that Jerry Yang and David Filo were collecting, mostly for their EE research. Then people sent them links, and the catalog grew and grew. I liked this as a slight variation of the typical advice to build a product that solves a problem you have. They built something, a very little something, that solved their problem but then they let other people use it.
For a long time, Yahoo ignored full text search. They’d search their directory, and if that failed, use this partner or that partner to search the rest of the web. This worked well enough when full text search was kinda bad, and they could swap out one partner for another. Then Google came along and made search work, far better than before, and Yahoo was done. There’s a tipping point where the lesser product gets so much better that it becomes the better product.
Mike Lazaridis founded Research In Motion, foresaw the rising importance of wireless data and mobile email, and made the BlackBerry, which dominates the market. Maybe revise that last part for the second printing.
Very early in the company’s history, they had an opportunity to pursue a contract for the Canadian Space Agency, which had always been a dream of Mike’s. After the prototypes were approved, there would be a mass production order for... six. He turned that down to focus on wireless, which he was sure would eventually become much more popular. And indeed, it has. BlackBerry was even used by NASA, so in the long run he fulfilled his space dream anyway.
He attributes a great deal of BlackBerry’s success to consistency, avoiding the fads, keeping the product simple. And I think that’s true up to a point. You learned how to use a BlackBerry and you were set. But then other fruits came along, and kind of like Yahoo, they found themselves obsolete before they had much of a chance to pivot.
Everybody has heard the tale of the second system. All the stuff which was reasonably cut from the first system gets unreasonably piled into the second system. The very worst second systems are those designed by a whole team of architects who have each designed one previous system, which seems to be the natural tendency. You’ve built X, Y, and Z, and now you want to build the successor system, XYZ, which combines and unites these disparate systems. But everybody has their favorite features which got cut, and are essential to add to the next version. Brooks proposes self discipline as the cure, but I also wonder if this isn’t an argument that we should let separate products evolve a little longer on their own. Build X2, Y2, Z2, and so forth, and then see what features truly are essential. Only combine finished products, not works in progress.
Which gets into the second half of the second system effect, when obsolete functionality is refined beyond its economic benefit. He gives as an example the linkage editor, which I believe we’d call ld, and its support for overlays. Apparently the OS/360 overlay system was the greatest ever built, but the technique was somewhat obsolete by the time it shipped, in favor of dynamic linking. Alas, the overlay support itself, fine as it is, dramatically slows down the linker. Recognize when to let go and focus effort on building other features.
On specifications, several important traits are identified. It must be precise. It must be consistent. It must be complete. One strategy is to have only one or two people write the final specification. They can work from the notes of many others, but the final product should be in their own voice.
Many specifications have a formal and prose version. It’s essential that one specify which has precedence. Beware the dangers of specification by implementation. It’s often easier to change the manual than the code, but that doesn’t result in the best product. A good solution is to build multiple implementations in parallel. This makes it easier to detect discrepancies and also encourages making the right fix. He includes a few examples of systems where unspecified behavior eventually leaked into a de facto spec because there was only one implementation, such as the contents of a register after a particular operation. I’ve seen this plague just about every language, where building a second implementation eventually resorts to running tests against the first implementation instead of reading the manual.
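The idea of catching discrepancies by running parallel implementations can be sketched in a few lines. This is only an illustration, not anything from the book: the two divide functions and the find_discrepancies helper are made-up names, and the "spec" is an imaginary manual that says "integer division" without saying which way to round.

```python
def divide_a(n, d):
    # Hypothetical implementation A: rounds toward negative infinity.
    return n // d


def divide_b(n, d):
    # Hypothetical implementation B: truncates toward zero, as C does.
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def find_discrepancies(impl_a, impl_b, inputs):
    """Return every input on which the two implementations disagree."""
    return [(n, d) for n, d in inputs if impl_a(n, d) != impl_b(n, d)]


# Small numerators, positive and negative divisors.
inputs = [(n, d) for n in range(-3, 4) for d in (-2, 2)]
discrepancies = find_discrepancies(divide_a, divide_b, inputs)
print(discrepancies)
```

Every disagreement involves operands of opposite sign that don't divide evenly, which is exactly the case the imaginary manual forgot to pin down. With only one implementation, whatever it happened to do would quietly become the spec.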
A chapter on tools. Craftsmen love their tools and always use the best.
14. The power of plain text is that it’s easy to work with. Source code, obviously, but also configuration files, data files, etc. Its lowest common denominator nature means you’ll always have the tools to work with it, even when things have gone terribly wrong.
15. Learn to use the shell, instead of clicking in the GUI. I don’t disagree, but there’s some cherry picking of examples here.
16. Learn to use a powerful text editor and use it for everything. Code, email, administration, etc. I happen to use vim for a great many things, but not everything. And I think this advice might get pretty awkward for someone using gmail and an IDE. Not actually advice I would follow unquestioningly. Veering a little bit into myth here.
17. Use source control. Can’t disagree with this, but the few remaining cavemen who don’t yet use source control probably aren’t going to start anytime soon.
We’re going to learn about some bits, bit by bit. Paul Revere used two lanterns, or bits, to convey information about the British invasion. 00 for no action, 01 for land, 11 for sea. But what about 10? That’s also for land. It can be hard to see an unlit lantern at a distance at night, so we’ve added some redundancy.
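Here's the lantern code as a lookup table, a minimal sketch using the bit assignments from the paragraph above. The names SIGNALS, decode, and decode_by_count are mine, purely for illustration.

```python
# Two lanterns, two bits. 1 means lit, 0 means unlit.
SIGNALS = {
    (0, 0): "no action",
    (0, 1): "land",
    (1, 0): "land",  # redundant code: either single lit lantern means land
    (1, 1): "sea",
}


def decode(left, right):
    """Look up the message for a pair of lanterns."""
    return SIGNALS[(left, right)]


def decode_by_count(left, right):
    """Same code, viewed as 'how many lanterns are lit?'."""
    return ("no action", "land", "sea")[left + right]


for bits in SIGNALS:
    print(bits, decode(*bits))
```

Because 01 and 10 mean the same thing, the observer only has to count lit lanterns and never has to work out which position the dark one is in.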
We can also use bits to encode movie ratings, from A+ all the way down to Pauly Shore. Or film speed. Or UPCs. UPC is a good case study, because there’s a lot of redundancy. The first few bits always follow the same pattern, as do the last bits. This lets you tell if you’re reading the code forwards or backwards. And there’s some check bits as well for parity.
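For the checking part, here's a rough sketch of the UPC-A check digit, which works on the decimal digits with modulo-10 arithmetic and not on a single parity bit. The function names are my own, and the sample code 036000291452 is just a commonly cited example, not one from the book.

```python
def upc_check_digit(first_eleven):
    """Compute the UPC-A check digit for the first 11 digits (a string)."""
    if len(first_eleven) != 11 or not first_eleven.isdigit():
        raise ValueError("expected exactly 11 digits")
    digits = [int(c) for c in first_eleven]
    odd_sum = sum(digits[0::2])   # 1st, 3rd, 5th, ... digits
    even_sum = sum(digits[1::2])  # 2nd, 4th, 6th, ... digits
    total = 3 * odd_sum + even_sum
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_upc(code):
    """True if a 12-digit string ends in the correct check digit."""
    return (
        len(code) == 12
        and code.isdigit()
        and upc_check_digit(code[:11]) == int(code[11])
    )


print(upc_check_digit("03600029145"))
print(is_valid_upc("036000291452"))
```

Change any single digit and the weighted sum stops being a multiple of 10, so a misread gets rejected, not rung up as the wrong product.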
Stopping here, since the next two chapters go together.
From Yahoo to RIM to OS/360, a lot of companies invested in obsolete technology. Classic disruption, where something that didn’t work gets better and eventually replaces the dominant technique. Do we include Java in the group as well?