Tuesday, 8 April 2014

Gmail 10 years on

It is 10 years since Gmail changed what email could be. It is good to recall just how much Gmail altered what we expected from an email client. Here’s a good reminder.

In the end, Gmail ended up running on three hundred old Pentium III computers nobody else at Google wanted. That was sufficient for the limited beta rollout the company planned, which involved giving accounts to a thousand outsiders, allowing them to invite a couple of friends apiece, and growing slowly from there.

I recall how excited I was when I got an invite to join Gmail.

Saturday, 29 March 2014

How to Write a Spelling Corrector...

…in 21 lines of code.

In the past week, two friends (Dean and Bill) independently told me they were amazed at how Google does spelling correction so well and quickly. Type in a search like [speling] and Google comes back in 0.1 seconds or so with Did you mean: spelling. (Yahoo and Microsoft are similar.) What surprised me is that I thought Dean and Bill, being highly accomplished engineers and mathematicians, would have good intuitions about statistical language processing problems such as spelling correction. But they didn't, and come to think of it, there's no reason they should: it was my expectations that were faulty, not their knowledge.

I figured they and many others could benefit from an explanation. The full details of an industrial-strength spell corrector are quite complex (you can read a little about it here or here). What I wanted to do here is to develop, in less than a page of code, a toy spelling corrector that achieves 80 or 90% accuracy at a processing speed of at least 10 words per second.

Read it here.

Tuesday, 25 March 2014

Why I like using Amazon Web Services

Amazon Web Services (AWS) are fast, cheap, and reliable. Usually you have to pick two out of three, but with AWS I get all three.

One of my uses for AWS is a MySQL database for tracking customer sign-ups and log-ins in Poker Copilot. It is a simple way to monitor usage patterns.

I’ve been moving around South America for the last ten weeks, running my business from hotel Internet connections. Every time I try to access this database, I find my access blocked, because access has to be granted to an IP address (or IP subnet) on a case-by-case basis. Each time I access the database from a different IP address I need to go into the AWS web-based console, and add my IP address (or to be precise, my CIDR/IP). Then, and only then, can I access the database.

It’s annoying. But it’s annoying because it is secure. Well, part of a secure configuration. And I like it. If I set up a MySQL instance myself on a rented virtual server, I’d need to set up this stuff myself. And I’d do it wrong, because setting up and maintaining a database server is not what I usually do. It’d be an afterthought.
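
To make the CIDR rule concrete, here is a minimal sketch in C of the kind of check an IP allow-list performs. It is not AWS's implementation: the function name ip_in_cidr is made up for illustration, the addresses in the note below are documentation-only examples, and the code assumes IPv4 and a POSIX system that provides inet_pton.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the IPv4 address `ip` falls inside
   `cidr` (for example "203.0.113.0/24"), 0 if it does not, and -1 if
   either string cannot be parsed. */
int ip_in_cidr(const char *ip, const char *cidr) {
    char net[INET_ADDRSTRLEN];
    const char *slash = strchr(cidr, '/');
    if (slash == NULL || (size_t)(slash - cidr) >= sizeof net)
        return -1;

    /* Copy the network part (everything before the slash). */
    memcpy(net, cidr, (size_t)(slash - cidr));
    net[slash - cidr] = '\0';

    /* Parse the prefix length (everything after the slash). */
    char *end;
    long prefix = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != '\0' || prefix < 0 || prefix > 32)
        return -1;

    struct in_addr a, n;
    if (inet_pton(AF_INET, ip, &a) != 1 || inet_pton(AF_INET, net, &n) != 1)
        return -1;

    /* /0 is special-cased: shifting a 32-bit value by 32 bits is undefined. */
    uint32_t mask = prefix == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix);

    /* Compare only the bits covered by the prefix. */
    return (ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask);
}
```

With this sketch, ip_in_cidr("203.0.113.57", "203.0.113.0/24") is true while 203.0.114.1 is rejected by the same rule. A single hotel address can be allowed with a /32 entry, which matches only that one address.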


Saturday, 22 March 2014

Learning by Teaching

Java 8 was released this week. I’ve been using an early access version of Java 8 for some months. Indeed, I wrote SeeingStars in Java 8.

Java 8 includes many new features, APIs, and additional syntax. Best of all it includes lambda expressions. This is possibly the biggest update to Java ever.

I’m finding it tough to learn and remember all the new stuff in Java 8. Then I recalled reading that when you teach a concept you become very knowledgeable on it. That is, teaching something is a good way to learn it very well yourself. So in light of this, I’ve restarted the Java Newsletter. For a while each week I’ll be covering a new feature in Java 8. If you use Java, I recommend signing up here.

Saturday, 1 March 2014

Lessons from Apple's SSL Bug

There’s a summary here of Apple’s recent SSL bug in iOS.

This sort of subtle bug deep in the code is a nightmare. I believe that it's just a mistake and I feel very bad for whoever might have slipped in an editor and created it.

Here's a stripped-down version of that code with the same issue:

extern int f();
int g() {
  int ret = 1;
  goto out;
  ret = f();  /* unreachable: the goto above always jumps past it */
out:
  return ret;
}

If I compile with -Wall (enable all warnings), neither GCC 4.8.2 nor Clang 3.3 from Xcode makes a peep about the dead code. That's surprising to me. A better warning could have stopped this but perhaps the false positive rate is too high over real codebases?

I fired up AppCode, the world’s best Objective-C IDE, which happens to also support C. I added the code snippet above, and it instantly and correctly highlighted the line “ret = f();” as unreachable code.
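
For comparison, the bug as widely reported was a duplicated goto fail; line after an unbraced if. The sketch below reproduces that shape in C next to a braced version. The function names and the stubbed checks are hypothetical stand-ins chosen for illustration, not Apple's code.

```c
/* Hypothetical stand-ins for real verification steps; 0 means success. */
static int check_hash(void)      { return 0; }
static int check_signature(void) { return -1; }  /* pretend the signature is bad */

/* Buggy shape: only the first goto is guarded by the if. The duplicated
   goto always runs, so check_signature() is never reached and err is
   still 0 (success) when the function returns. */
int verify_buggy(void) {
    int err;
    if ((err = check_hash()) != 0)
        goto fail;
        goto fail;  /* duplicated line: not part of the if */
    if ((err = check_signature()) != 0)
        goto fail;
fail:
    return err;
}

/* Same logic with braces and no duplicate: the bad signature is reported. */
int verify_fixed(void) {
    int err;
    if ((err = check_hash()) != 0) {
        goto fail;
    }
    if ((err = check_signature()) != 0) {
        goto fail;
    }
fail:
    return err;
}
```

Here verify_buggy() reports success even though the signature check would have failed, while verify_fixed() returns the error. Clang's -Wunreachable-code flag, which is not part of -Wall, is one compiler option aimed at this kind of dead code.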

Lessons I take from this:

  • Use a state-of-the-art IDE that has excellent real-time code analysis tools. Don’t ignore the warnings it gives unless you have a really good reason for doing so. And even then use an error suppression technique.
  • Don’t ignore compiler warnings. Oh right, I already wrote that. It’s important, you see. Start ignoring warnings, and then when a really important one appears you won’t notice because it will be just one of dozens of warnings that you conditioned yourself to ignore.
  • Before committing code, run static analysis tools on it. Fix the issues detected.
  • If possible, have a formal code review on any code you’ve changed that is dealing with security, memory management, or threading.

Plenty of data and research shows that these techniques lead to much higher-quality software.

Tuesday, 25 February 2014

Doing A/B Testing? Read This

This article claims that “MOST WINNING A/B TEST RESULTS ARE ILLUSORY”. I agree.

Please ignore the well-meaning advice that is often given on the internet about A/B testing and sample size.
For instance, a recent article recommended stopping a test after only 500 conversions. I’ve even seen tests
run on only 150 people or after only 100 conversions [5]. This will not work. The truth is that nearer 6000
conversion events (not necessarily purchase events) are needed.


Almost two-thirds of winning tests will be completely bogus. Don’t be surprised if revenues stay flat or even go down after
implementing a few tests like these.

Read it all here (PDF).
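
To get a feel for why the required numbers are so large, here is a rough sketch in C of a textbook sample-size estimate for comparing two conversion rates (normal approximation). This is not necessarily the calculation used in the article. The function names are made up, and the settings are assumptions: a two-sided 5% significance level and 80% power, with the matching z-scores hardcoded.

```c
/* Assumed settings, hardcoded to avoid needing an inverse-normal routine:
   1.959964 for a two-sided 5% significance level, 0.841621 for 80% power. */
#define Z_ALPHA 1.959964
#define Z_BETA  0.841621

/* Approximate visitors needed per variant to detect a change in
   conversion rate from p1 to p2 (normal approximation, two proportions). */
double visitors_per_variant(double p1, double p2) {
    double z = Z_ALPHA + Z_BETA;
    double variance = p1 * (1.0 - p1) + p2 * (1.0 - p2);
    double delta = p2 - p1;
    return z * z * variance / (delta * delta);
}

/* Expected number of conversions per variant at the baseline rate p1. */
double conversions_per_variant(double p1, double p2) {
    return visitors_per_variant(p1, p2) * p1;
}
```

As an illustration only, assume a 5% baseline conversion rate and a 5% relative uplift (0.05 versus 0.0525). Under those assumptions the estimate comes out at roughly 122,000 visitors and a little over 6,000 conversions per variant, which is the same order of magnitude as the figure quoted above and far beyond a few hundred conversions.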

Friday, 21 February 2014

From Dental Technician to Board Game Designer

Here’s a profile of Klaus Teuber, who left his unhappy life as a dental technician after creating the most famous and successful board game of recent decades.

Pete Fenlon, the C.E.O. of Mayfair Games, said, “Our volume of sales will be such that, over time, [Settlers of] Catan could, in terms of gross revenue, be the biggest game brand in the world.”

I’m a big fan of the game. I’ve bought it in 3 languages and I’ve also bought the iPad version.