Tuesday, 10 December 2013

Multithreading is easy? If only...

There’s a well-argued, thoughtful essay on blog.smartbear.com claiming that multithreaded programming is easy:

Perhaps you’ve accepted the common fallacy that “Multithreading is hard.” It’s not. If a multithreaded program is unreliable it’s most likely due to the same reasons that single-threaded programs fail: The programmer didn’t follow basic, well known development practices. Multithreaded programs seem harder or more complex to write because two or more concurrent threads working incorrectly make a much bigger mess a whole lot faster than a single thread can.

Oh, I wish that “multithreading is hard” were a fallacy.

Writing a background thread that performs a long-running calculation? Easy. Especially the examples from the multithreading books, which are almost invariably Fibonacci calculations. That is a somewhat irrelevant example, because Fibonacci numbers can be calculated quickly without any multithreading.

Writing a background thread that can be cancelled immediately, that gives useful progress information, that doesn't leave other threads indefinitely blocked if cancelled, and that cleans up after itself if cancelled? Hard.
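To make those four requirements concrete, here is a minimal sketch built on the JDK's ExecutorService and Future.cancel(true). The class names CancellableWork and CancellableWorkDemo, the 10 ms sleep standing in for real work, and the step count are all assumptions made up for illustration; this is not code from any real product.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task: reports progress, stops on interruption, always cleans up.
final class CancellableWork implements Runnable {
    private final int totalSteps;
    private final AtomicInteger completedSteps = new AtomicInteger();
    private final AtomicBoolean cleanedUp = new AtomicBoolean();
    private final CountDownLatch finished = new CountDownLatch(1);

    CancellableWork(int totalSteps) {
        this.totalSteps = totalSteps;
    }

    @Override
    public void run() {
        try {
            for (int i = 0; i < totalSteps; i++) {
                if (Thread.currentThread().isInterrupted()) {
                    return; // cancelled between steps: stop promptly
                }
                Thread.sleep(10); // stand-in for one unit of real work
                completedSteps.incrementAndGet(); // progress other threads can read
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt status
        } finally {
            cleanedUp.set(true);  // release resources here
            finished.countDown(); // unblock anyone waiting on this task
        }
    }

    int completedSteps() { return completedSteps.get(); }

    boolean isCleanedUp() { return cleanedUp.get(); }

    // Bounded wait, so a waiting thread is never blocked indefinitely.
    boolean awaitFinished(long timeout, TimeUnit unit) throws InterruptedException {
        return finished.await(timeout, unit);
    }
}

final class CancellableWorkDemo {
    // Starts the task, cancels it after the first step, and checks that it wound down.
    static boolean runAndCancel() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CancellableWork work = new CancellableWork(1_000);
            Future<?> future = executor.submit(work);
            while (work.completedSteps() == 0) {
                Thread.sleep(5); // wait until the task has visibly started
            }
            future.cancel(true); // true = interrupt the worker thread
            boolean stopped = work.awaitFinished(5, TimeUnit.SECONDS);
            return stopped && work.isCleanedUp() && future.isCancelled()
                    && work.completedSteps() < 1_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled cleanly: " + runAndCancel());
    }
}
```

Even this toy only works because the task cooperates: it checks the interrupt flag and blocks only in Thread.sleep, which responds to interruption. Some blocking calls, such as classic java.io stream reads, do not, and that is where "cancelled immediately" gets hard.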

For my single-threaded code I have great static analysis tools, IDE magic, and testing frameworks, all of which help me write clean code. For my multi-threaded code, I'm on my own. Let one, just one, mutable variable escape from a thread, and suddenly I have non-deterministic error possibilities, delightfully difficult to reproduce, to find, and to fix. Perhaps I’m coding while tired, and I inadvertently make a mutable list escape from its thread. My programmer’s toolkit is not smart enough to detect this subtle error. Now I’ve got an error that may not occur today. It may not occur within the next month. It may not occur ever. Or just possibly I now have a new source of really subtle errors.
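A small sketch of how innocently a mutable list escapes. LeakyResults and EscapeDemo are hypothetical names invented for this illustration, and the demo is deliberately single-threaded so that its behaviour is deterministic; the compiler accepts it without a murmur.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration: it hands out its internal list.
final class LeakyResults {
    private final List<String> results = new ArrayList<>();

    void add(String result) {
        results.add(result);
    }

    // The escape: callers receive the live internal list, not a copy.
    List<String> results() {
        return results;
    }

    int size() {
        return results.size();
    }
}

final class EscapeDemo {
    // Returns the owner's size after an outside caller clears the leaked list.
    static int sizeAfterOutsideClear() {
        LeakyResults owner = new LeakyResults();
        owner.add("first");
        owner.add("second");
        List<String> leaked = owner.results(); // same object as the private field
        leaked.clear();                        // mutates the owner's state from outside
        return owner.size();
    }

    public static void main(String[] args) {
        System.out.println("size after outside clear: " + sizeAfterOutsideClear());
    }
}
```

Here the damage is at least repeatable. Hand the same leaked reference to a second thread and the unsynchronised ArrayList becomes a data race, which is the kind of error that may or may not show up on any given day.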

That reminds me of a war story. I joined a software team for version 2 of their project. My main task was making some slow things go much faster. I did that. And then really strange things started happening. I eventually found a mutable data structure shared between threads. When the system was slow, this bug was hidden and never had a chance to occur. Once I had fixed the other problems, this bug, which had been in the system for a year without causing a known problem, was suddenly causing crashes daily.

My main product, Poker Copilot, is heavily multi-threaded. To minimise the pain of multi-threaded bugs in Java, I follow these guidelines I created just now:

  • Use Java’s own excellent Executor framework. It was introduced in Java 5; before that we had to use horrible low-level multi-threading techniques.
  • Read, re-read, and re-re-read the canonical Java Concurrency in Practice. Readable, thorough, and written by Java deities.
  • Use immutable variables and classes fanatically. Returning a list? Don’t. Instead, return an immutable copy of the list. Creating a new class? Make it an immutable, final class. By default go for immutable options. Google’s Guava Java libraries have a comprehensive set of immutable data structures that make this easy.
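The first and third guidelines can be sketched together using only the JDK. The names Scores and ImmutableSharingDemo are hypothetical, chosen for this example. The defensive copy uses Collections.unmodifiableList around a fresh ArrayList; Guava's ImmutableList.copyOf would be one alternative, left out here to keep the sketch free of external dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical immutable, final class: safe to hand to any thread.
final class Scores {
    private final List<Integer> values;

    Scores(List<Integer> values) {
        // Copy first, then wrap: later changes to the caller's list cannot leak in.
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    // Returns a read-only view; attempts to modify it throw UnsupportedOperationException.
    List<Integer> values() {
        return values;
    }

    int total() {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }
}

final class ImmutableSharingDemo {
    // Sums a snapshot of the input on a worker thread while the caller mutates its own list.
    static int totalOnWorkerThread(List<Integer> input) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            Scores snapshot = new Scores(input);
            Callable<Integer> task = snapshot::total;
            Future<Integer> result = executor.submit(task);
            input.clear(); // harmless: the worker only ever sees the snapshot
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>(List.of(1, 2, 3));
        System.out.println("total: " + totalOnWorkerThread(input));
    }
}
```

The design choice is that nothing mutable ever crosses the thread boundary: the worker gets a snapshot, so the caller is free to keep changing its own list without any locking.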
