S Anand

Statistically improbable phrases on Google AppEngine

I read about Google AppEngine early this morning, and applied for an invite. Google’s issuing beta invites to the first 10,000 users. I was pretty convinced I wasn’t among those, but turns out I was lucky.

AppEngine lets you write web apps that Google hosts. People have been highlighting that it gives you access to the Google File System and BigTable for the first time. But to me, that isn’t a big deal. (I’m not too worried about reliability, and MySQL / flat files work perfectly well for me as a data store.)

What’s more interesting is that, unlike Amazon’s EC2 and S3, this is free up to a certain quota. And you get a fair bit of processing power and bandwidth for free. One of the reasons I’ve held back on creating some apps was simply that they would take away too much bandwidth / CPU cycles from my site. (I’ve had this problem before.) Google’s quota is 10 GB of bandwidth per day (which is about 30 times what my site uses). And this is on Google’s incredibly fast servers. It also offers 200 million megacycles a day. That’s like a dedicated 2.3 GHz processor (200 million megacycles = 200,000 GHz x 1 second ~ 2.3 GHz x 86,400 seconds/day). It’s better, in fact, because this is the average capacity, not peak capacity. The only restriction that really worries me is that only 3 apps are allowed per developer.
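The megacycle arithmetic is easy to check. Here is a minimal sketch in JavaScript; the function and constant names are made up for illustration, and the only input is the 200 million megacycles per day quoted above.

```javascript
// Convert a daily CPU quota into the clock speed of a processor
// that runs flat out for the whole day.
const MEGACYCLES_PER_DAY = 200e6;      // quota quoted in the post
const SECONDS_PER_DAY = 24 * 60 * 60;  // 86,400

function equivalentGHz(megacyclesPerDay) {
  const cyclesPerDay = megacyclesPerDay * 1e6;             // mega = 10^6
  const cyclesPerSecond = cyclesPerDay / SECONDS_PER_DAY;  // average rate
  return cyclesPerSecond / 1e9;                            // giga = 10^9
}

console.log(equivalentGHz(MEGACYCLES_PER_DAY).toFixed(2) + " GHz"); // 2.31 GHz
```

This is an average over the day, which is why the comparison with a dedicated processor is, if anything, conservative.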

So I decided to take a shot at publishing some code I’d kept in reserve for a long time. You may remember my statistical analysis of Calvin & Hobbes. For this, I’d created a script in Perl that could generate Statistically Improbable Phrases (SIPs) for any text. This is based on a (somewhat limited) 23MB corpus of ebooks that I had. I’d wanted to put that up on my website, but …

AppEngine only uses Python. So the first task was to get Python, and then to learn Python. The only saving grace was that I was just cutting-and-pasting most of the time. Google wasn’t helping:

Google AppEngine Over Quota Error

Anyway, the site is up. You can view it at sip.s-anand.net for now. Just type a URL, and it’ll tell you the improbable words in that site.

Visit sip.s-anand.net

Technical notes

I realise that these are statistically improbable words, not phrases. I’ll get to the phrases in a while.

The logic is simple:

  • Get the frequency of words in a corpus. I pre-generated this file. It has over 100,000 words.
  • Get the URL as text. Rather than muck around with Python, I decided to use the W3 html2txt service.
  • Convert the text to words. Splitting text into words is tricky. For now, I’m simply assuming that any group of letters is a word, and anything that’s not a letter is a word delimiter.
  • Find the relative frequency (improbability) of words. This is the frequency in the URL divided by the frequency in the corpus, normalised (i.e. scale it so that the maximum value is 1.0).
  • Create a tag cloud. I use the word frequency as the size and the improbability as the colour. You need a bit of mathematical jugglery to get the pattern right. Right now, I’m taking the 6th root of the improbability and the logarithm of the frequency to get a reasonably smooth tag cloud.
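The steps above can be sketched roughly as follows. This is not the code running on the site (that is in Python); it is a JavaScript illustration under some assumptions: the corpus frequencies arrive as a Map of word to relative frequency, words missing from the corpus get a hypothetical fallback frequency (unseenFreq), and the mapping from improbability to a grey shade is invented for the example. All function names are made up.

```javascript
// Any run of letters is a word; anything else is a delimiter.
// (ASCII letters only, to keep the sketch short.)
function toWords(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Count how often each word occurs.
function countWords(words) {
  const counts = new Map();
  for (const w of words) counts.set(w, (counts.get(w) || 0) + 1);
  return counts;
}

// corpusFreq: Map of word -> relative frequency in the corpus (assumed pre-generated).
// unseenFreq: assumed fallback frequency for words the corpus has never seen.
function improbability(text, corpusFreq, unseenFreq) {
  const words = toWords(text);
  const scores = [];
  let max = 0;
  for (const [word, count] of countWords(words)) {
    const freq = count / words.length;            // frequency in the page
    const base = corpusFreq.get(word) || unseenFreq;
    const ratio = freq / base;                    // page frequency / corpus frequency
    if (ratio > max) max = ratio;
    scores.push({ word: word, count: count, ratio: ratio });
  }
  // Normalise so that the most improbable word scores 1.0.
  return scores.map(s => Object.assign({}, s, {
    improbability: max > 0 ? s.ratio / max : 0
  }));
}

// Tag cloud styling: logarithm of the frequency for size,
// 6th root of the improbability for colour (darker = more improbable).
function tagStyle(entry, minPx = 10, stepPx = 6) {
  const size = minPx + stepPx * Math.log(entry.count);   // count >= 1, so log >= 0
  const shade = Math.round(255 * (1 - Math.pow(entry.improbability, 1 / 6)));
  return { word: entry.word, fontSizePx: size, color: `rgb(${shade},${shade},${shade})` };
}
```

The 6th root stretches out small improbability values, so words that are only mildly unusual still get a visible colour instead of everything but the top word fading to white.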

The source code is at statistically-improbable-phrases.googlecode.com.

Update: 12-Apr-2008. I’ve added some interactivity. You can play with the contrast and font size, and filter out common or infrequent words.

Update: 22-Apr-2008. Added concordance. You can click on a word and see the context in which it appears.

Firefox 3 Beta 5 crashes

I just upgraded from Firefox 3 Beta 4 to Beta 5. It’s amazing how unstable Beta 5 is compared to the earlier version. Gmail crashes. Google maps crashes. Almost every other site I visit crashes. And looks like I’m not alone: doing a Google search for “Firefox 3 beta x crash” shows a consistently increasing number of results.

Number of Google search results for Firefox 3 Beta crashes, by Beta version

Update (8/Apr/08): As the comments rightly point out, this could simply be because more people use Beta 5. Here’s the number of Google hits for “Firefox 3 Beta x” — and it shows a clear increasing trend.

Number of Google search results for Firefox 3 Beta, by Beta version

So, adjusting for this, here’s the relative crash frequency:

% of Firefox 3 Beta crash mentions on Google, by Beta version

Beta 5 still stands out.
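The adjustment is just a ratio: hits that mention a crash, divided by all hits for that beta. A minimal sketch follows; the function name is made up, and the counts are placeholders invented purely to show the shape of the calculation. They are NOT the actual search results.

```javascript
// Share of the pages mentioning a beta that also mention a crash.
function crashShare(crashHits, totalHits) {
  return totalHits > 0 ? crashHits / totalHits : 0;
}

// Placeholder counts for illustration only, not real search data.
const hits = [
  { beta: 4, crash: 100, total: 10000 },
  { beta: 5, crash: 300, total: 20000 }
];

for (const h of hits) {
  const percent = (100 * crashShare(h.crash, h.total)).toFixed(1);
  console.log("Beta " + h.beta + ": " + percent + "% of mentions include a crash");
}
```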

Maybe Google search results are not a good proxy. Maybe the mention of “crash” doesn’t indicate the software itself crashing. But it sure crashes a lot more for me.

Time management

Some years ago, a friend asked me to write about how I manage my time. It seemed to him I was doing a good job of it, given that I had time to pursue my interests.

It’s something I tried to do consciously. Every few years, I used to go down the route of “time management”. I’d read stuff and try it out.

But over time, I’ve come to believe that “time” is not really “manageable”. Think about it: are most of your actions planned? Me, I just react out of habit, no matter how well planned I try to be. What I do is largely driven by what I’m in the habit of doing.

Not that time management advice is useless, but you’ll end up not following most of it. You act on a fraction of what you read. A fraction of that turns into a habit. That’s still useful. But the point is, rather than pick up 10 tips on time management, it’s more useful to pick one or two pieces of advice that you like, and are likely to act on. (You won’t do things you don’t like anyway.)

So time management is about acquiring habits that save time (and is not about reading tips that are tough to habitualise).

That begs an obvious question and a subtle one. The obvious one is what habits save time? The subtle one is why save time?

Why save time?

You’ve probably heard the phrase “time is money”. For a while, I took that statement literally. I tried to act by assigning monetary value to my time, and by doing the most profitable thing.

I was making Rs 10,000 a month at that time. That’s about Rs 50 an hour. So I figured I wouldn’t do anything that earned me less than Rs 50 an hour outside of work. I mean, if I’m making Rs 50 an hour at work, why should I make any less outside?

One small hitch. I wasn’t making any money outside of work. In fact, I was spending money. So unless I took up a night job, or started freelancing, that rule of thumb was useless. (Besides, I didn’t want to spend time outside of work working. I wanted to have fun. Watch movies, for instance.)

So I needed a different way of handling this. If I spend 3 hours at a movie for Rs 60, that could be a benchmark. If something’s more expensive than Rs 20/hour, I’d rather watch a movie. If it’s less expensive, I’d do that. Take books, for instance. A typical novel would cost Rs 180 and I’d finish it in 12 hours. At Rs 15 / hour it’s a more economical way of spending time.

Except that it doesn’t quite work that way. How much fun I had, had nothing to do with how much I paid for it.

Frankly, in daily life, I don’t think you can treat the phrase “time is money” literally. Time has nothing to do with money.

Time is like money in a different way, though. By itself, it isn’t worth much. Think about it. What can you do with money? Buy stuff you like. And if you can’t, it’s useless.

Obelix: How silly! Fancy throwing out good onion soup to make room for sestertii! Asterix: But Obelix, with sestertii, you can buy onion soup! Obelix: That's the point! Why throw out the onion soup when it was in the cauldron already?

If all you need is onion soup, why throw it out for sestertii?

Time’s like that. What can you do with time? Do stuff you like. And if you can’t, it’s useless.

There are usually two reasons people want to manage time. One is where they don’t enjoy something, and would rather spend as little time at it as possible. But look, if you don’t enjoy that stuff, time management isn’t your problem. You need to get out of your job or whatever. Managing time more efficiently is simply going to let you efficiently waste your time. (Though in the short run, that’s probably the best you can do — efficiently get rid of nuisances. I’ll talk about that shortly.)

The other reason is where they have too many (enjoyable) things to do, and can’t do all of them. But hey, if you have too much enjoyable stuff, you don’t have a problem! In a way, this is like wanting to buy many things and not having enough money. With money, you can earn more or wish for less. With time, you just have to wish for less. (Living longer may not be a practical option.) Just pick anything you like to do. Don’t regret the stuff you can’t. You only have 24 hours, and you’re among the lucky few who can fill it with things you enjoy.

So, I’m effectively saying, there’s no point trying to do things more efficiently in the long run. Picking what you do is more important than doing it efficiently. (That roughly correlates to the third habit in Stephen Covey’s Seven Habits: Put First Things First. It’s the key to time management.)


So, how do you pick what to do? You’d probably want to pick something that you like, or something that’s good for you.

But it’s tricky to predict what you like.

  • We don’t know what we want. Sometimes, it’s as simple as that — we just don’t know what we’d like to do.
  • Too much of anything… I love watching movies, but I’ve never managed to watch more than 4 a day. I’ve tried breaking that record many times. Just doesn’t work. At the end of the 4th movie, I’m sick and my bum is sore. Do I prefer movies to cleaning up? Usually. But by the end of the 4th, I’d rather clean up.
  • Preferences are not consistent. I prefer a 7 megapixel camera to a 2 megapixel one. I prefer a cheaper camera to a more expensive one. So between a $100 2MP camera and a $200 7MP camera, I’m just making a wild guess.
  • Preferences are not static. If I’m tired, I’d rather watch a movie I’ve seen before. If not, I’ll experiment with an art film. There’s no telling beforehand what my mood is going to be at any point.

It’s just as tricky to figure out what’s good for us. We have no clue what will happen tomorrow. We have no clue what consequences our actions will have. (Read The Black Swan to get a flavour of that.) So we’re really guessing and groping — though sometimes with a lot of confidence.

On the whole, it’s difficult to figure out what to pick. So what do you do?

This is completely outside the realm of time management. This is about choice. I have a few (bad) habits that guide me.

  1. Follow your moods
  2. Work less
  3. Procrastinate

Those are my principles. (But like Groucho Marx, I do have others.)

Follow your moods

There are times when people do certain things better. I’ve heard some people study best early in the morning. Others study best late at night. I don’t know if there’s any physiological benefit one way or the other, but even if it’s psychological, it makes a huge difference to study when you think you’ll learn better.

Sometimes I’m in a mood to write articles. When I do, the article usually writes itself. If not, I could spend days at it without any progress.

If there’s any reality to this, then the best thing to do is to do what you feel like doing. You’ll naturally accomplish this faster. That’s typically what I do when I’m given any work. I usually wait until I just feel like it. Then it’s usually a matter of a few hours before the job is done. Sometimes the mood doesn’t quite arrive before the deadline, in which case there’s always inspiration.

Calvin & Hobbes: Do you have an idea for your story yet? No, I'm waiting for inspiration. You can't just turn on creativity like a faucet. You have to be in the right mood. What mood is that? Last-minute panic.

Seriously: do what you feel like doing the most at the moment. That’s a great way of becoming more efficient.

In fact, I would go as far as saying, mood management is more important than time management. Moods are more precious than time. If you’re in a mood to call people, pick up the phone and talk to folks you’ve been out of touch with. That mood is rarer than the time to make calls. (At least for me, the reason I am not in touch is because I’m not in a mood — not because I don’t have time.)

Optimise that mood. Do what you’re in a mood for. And when your mood changes, go with the flow. Do a lot more of what you feel like doing. You’ll do more (which is probably good), and of what you like (which is certainly good).

Work less

I’ve talked about this in Less is more. At the end of the day, 90% of the stuff you do is useless. So why do it? Just focus on the 10%.

Procrastinate

I can’t put this better than Paul Graham’s article on procrastination.

Good procrastination is avoiding errands to do real work.

You won’t know what the important 10% is until much later, so you may as well wait to find out if something’s important, and then do it.


So what am I saying?

  • Time management is about habits, not tips
  • Picking what you do is more important than doing it efficiently
  • But it’s difficult to figure out what to pick
  • So avoid doing stuff until you know it’s worth doing
  • Work when you’re in the mood — it’s faster that way

Think about it.

Reading books on a laptop

I have the habit of reading books on the screen. It’s something that started in the early 90s, when I got a copy of The MIT Guide to Lockpicking. Since I didn’t have access to a printer, I spent hours poring over the document on the screen. And then I discovered Project Gutenberg.

I’ve heard many people ask if I have a problem with this. Personally, no. I’ve been staring at screens from the age of 12, and I’m quite used to it. My job requires me to stare at a screen for most of the day anyway. (I’m not saying there’s no strain on the eye. My eyes are red at the end of the day. I don’t know if they would be less red if I’d been staring at paper instead of a screen. But my glasses have remained roughly the same power over ~15 years, so it’s probably not ruining my eyesight much.)

To me, the main advantage of a book is that a book is a lot easier to handle.

  • You can fit a book into your bag, sometimes into your pocket.
  • You can hold it in your hand comfortably — it’s easy to grip, and light.
  • You can open it instantly (no need to boot up).
  • You can bookmark it (or even just remember the last page number) and quickly flip to that.

None of these is possible on a computer.

Or is it?

On a desktop, I agree — it’s impossible to read for long. Your back would kill you. I’ve done it for many years, and it’s not worth the pain. With a laptop, however, you can lie down on the bed or sofa and read. It’s a huge advantage. (For just this one reason alone, I’d suggest that everyone buy a laptop.)

As for carrying books, I carry my laptop to work every day, so there’s no incremental burden. But if you weren’t doing that, it’s probably not a great idea. When I travel on weekends, I’d much rather take a physical book than a laptop. This is probably the single biggest problem with a laptop: it doesn’t travel as easily as a book.

That’s probably offset by the advantage that a laptop isn’t really a book — it’s a library. I don’t need to decide which book to read. I can bring them all along, pick what I like, and when I’m done, move on to the next. And I’m not restricted to books. I have a fairly good collection of movie scripts and comics. Depending on how long I have on the train, and my mood, I can pick between these.

One thing that makes a laptop a lot easier to use is to rotate it.

Laptop in landscape mode

Laptop in portrait mode (rotated)

If you hold the laptop this way, it’s surprisingly easy to handle. I find that I can read this way even when standing on a crowded train — which is as much as I can expect from any book. (Strangely enough, it doesn’t seem to attract too much attention on the train either.)

If you have a decent graphics card, you can rotate your screen using the graphics properties. (I’m sure there are hotkeys to do this. My two-year old daughter somehow knows them, and manages to turn the screen upside down in a fraction of a second, while I spend the next 5 minutes struggling to restore an upside-down screen.)

If not, you can just use a PDF reader (like FoxIt, which is better than Acrobat Reader) to rotate the page by 90°.

A laptop takes care of the problems of bookmarking and load time as well. I usually leave mine on hibernate, and it takes about 10 seconds to open up to where I left off. Sometimes I just leave the laptop on in the bag — for example if I’m changing trains.

The other solution, of course, is to try an ebook reader. Given my laptop, I haven’t tried one. But other than the ease of holding it, there’s no big advantage I see.


The other question is, how do you find ebooks? Other than buying them, I find that the easiest option is to search on Google. A surprisingly large number of them are indexed.

Here’s a custom search engine for ebooks.

Chaining functions in Javascript

One of the coolest features of jQuery is the ability to chain functions. The output of a function is the calling object. So instead of writing:

var a = $("<div></div>");
a.appendTo($("#id"));
a.hide();

… I can instead write:

$("<div></div>").appendTo($("#id")).hide();

A reasonable number of predefined Javascript functions can be used this way. I make extensive use of it with the String.replace function.
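Here is a small illustration of chaining String.replace. Strictly speaking, replace returns a new string rather than the calling object, but the effect is the same: each call can be followed directly by the next. The slugify name is made up for this example.

```javascript
// Turn a title into a URL-friendly slug by chaining string methods.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")   // runs of other characters become one hyphen
    .replace(/^-+|-+$/g, "");      // trim hyphens at either end
}

console.log(slugify("  Chaining functions in Javascript! "));
// chaining-functions-in-javascript
```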

But where this feature is not available, you can create it in a fairly unobtrusive way. Just add this code to your script:

Function.prototype.chain = function() {
    var that = this;
    return function() {
        // New function runs the old function
        var retVal = that.apply(this, arguments);
        // Returns "this" if old function returned nothing
        if (typeof retVal == "undefined") { return this; }
        // else returns old value
        else { return retVal; }
    };
};
var chain = function(obj) {
    for (var fn in obj) {
        if (typeof obj[fn] == "function") {
            obj[fn] = obj[fn].chain();
        }
    }
    return obj;
};

Now, chain(object) returns the same object, with all its functions replaced with chainable versions.
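To see it working without any external API, here is a self-contained sketch. It repeats the chain code in compact form and applies it to a made-up counter object (hypothetical, purely for illustration) whose setters return nothing.

```javascript
// Compact restatement of the chain helper described above.
Function.prototype.chain = function () {
  var that = this;
  return function () {
    var retVal = that.apply(this, arguments);
    // Hand back the calling object when the original returned nothing
    return typeof retVal == "undefined" ? this : retVal;
  };
};

function chain(obj) {
  for (var fn in obj) {
    if (typeof obj[fn] == "function") { obj[fn] = obj[fn].chain(); }
  }
  return obj;
}

// Hypothetical object: add() and reset() return undefined, value() returns a number.
var counter = {
  total: 0,
  add: function (n) { this.total += n; },
  reset: function () { this.total = 0; },
  value: function () { return this.total; }
};

var result = chain(counter).add(2).add(3).value();
console.log(result); // 5
```

Two caveats are worth keeping in mind. Methods that return a real value, like value() here, end the chain. And for...in only visits enumerable properties, so methods defined with class syntax would not get wrapped.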

What’s the use? Well, take the Google AJAX search API. Normally, to search for the top 8 “Harry Potter” PDFs on esnips.com, I’d have to do:

    var searcher = new google.search.WebSearch();
    searcher.setQueryAddition("filetype:PDF");
    searcher.setResultSetSize(google.search.Search.LARGE_RESULTSET);
    searcher.setSiteRestriction("esnips.com");
    searcher.setSearchCompleteCallback(onSearch);
    searcher.execute("Harry Potter");

Instead, I can now do this:

    chain(new google.search.WebSearch())
        .setQueryAddition("filetype:PDF")
        .setResultSetSize(google.search.Search.LARGE_RESULTSET)
        .setSiteRestriction("esnips.com")
        .setSearchCompleteCallback(onSearch)
        .execute("Harry Potter");

(On the whole, it’s probably not worth the effort. Somehow, I just like code that looks like this.)

Less is more

The hours in consulting are pretty long. 65 hours a week used to be my norm, and that’s ignoring the travel time to and from work. So there wasn’t too much life outside of work. (I’ve come to realise, though, that what you do outside of work doesn’t change that much with more free time. What does change is that you just enjoy it more — both in and out of work.)

We have a day, once every month or two, where you take time off from whatever project and head back to the office. One such featured a session with the managers telling the consultants how to succeed. Pretty good advice, actually… but that’s not what I’m going to talk about. It’s something about the nature of that advice.

The advice had a lot of TO-DOs and suggestions. Do this. Do that. Focus more on this. Focus a lot on that. Great. Now we know what to do more of.

My question, towards the middle of the session, was: OK, so what do we do less of, then?

You can’t do more of something unless you do less of something else. In most places, it’s easy to answer this with: “Oh, you need to be more efficient.” or “Cut the idle gossip”. For us, none of these were applicable.

The question pretty much remained unanswered. And with good reason. It’s a tough question.

Later, I got involved with a proposal. I wrote a few bits of it. (One page, actually.) Others wrote a few bits of it. And then some standard appendices were added to it. Finally, it ended up as a 180-page document.

The interesting thing is, I can bet no human ever read those 180 pages end-to-end.

I know no one at our end did, because we turned it around in 1 week, and I was the last to assemble the document before sending it out.

I’m guessing no one at the client end did, because they’d have gotten 5 such documents, and had a week to shortlist down to 3.

So if we didn’t read it and they didn’t read it, why did we put it in?

I think I know why. In my IBM days, I had to make a presentation to the management on productivity. I knew nothing of management or productivity. So I put in a report that had a lot of high-sounding words (you know… value-add, leverage, etc.) that looked reasonably impressive and had no basis in fact.

I did that mostly because I was scared. Of seeming to know less. Of being wrong. You know.

(Funnily enough, the presentation was pretty well received. I don’t know if it was because they were polite or had become numb to bullshit.)

This fear is pretty common. I know how that 180-page document ended up as a 180-page document, and I’m sure you’ve seen this happening before. First, here’s a sample conversation at the client end, when they’re writing up a request for information.

Martin: So, what do I put in the RFI?

Clive: Here’s a template we used. You can use some of that. Ask Nick for the one he used last month, and Natalie for hers. Maybe you should get something from our procurement team and information security group to be on the safe side.

Martin: And how do I make the RFI out of this? (BTW, this is a “bold” question that’s rarely asked.)

Clive: Well, make sure you cover everything from all of these documents.

So the RFI asks:

  • if any of your 80,000 employees are a member of any one of the following 340 organisations that are considered disruptive,
  • how many employees you have in each geography, function and vertical, where the break-down provided is as per their definitions (we cook up numbers which, if you add them up, total over 200,000)
  • how much you spent on paper-clips last fortnight, and other such intimate corporate P&L secrets

And we answer these. The answers to the above 3 questions were “No”, a table of numbers, and “We are not at liberty to divulge this information…”

Now, looking at the answers above, it still doesn’t add up to 180 pages. It’s hardly half a page. But you’ve got to take the following conversation at our end into account.

Steve: You know, we’ve got to put in some details about our methodologies in this section.

Me: I have.

Steve: Yeah, but maybe we should add more, you know, like supply chain methodologies and change management.

Me: But they’re irrelevant!

Steve: Well, can’t say that. Change management is always relevant. SCM… well, no harm putting it in. They can skip it if they don’t want to read about it.

That’s it, isn’t it? There’s no harm in doing more. I’ll just toss it in. If you don’t want to read it, skip it. I’ll just ask you to do more of these. If you can’t, skip the useless stuff.

An innocuous sounding statement: do more. I tremble whenever anyone suggests it. There’s no defence.

There’s a fundamental belief at work here. That more is better.

This is fueled by a lack of confidence. Put in high-sounding words. They look impressive. What’s missed is that experts use jargon because they understand what it means, and it conveys a lot in few words. Others follow a cargo cult science.

What we lose, though, is subtle.

Firstly, it wastes time. It wastes my time. It wastes your time. But hey, time is not all that important. (I’m not saying this sarcastically. I believe that wasting time is quite OK, really, and it’s not such a big deal.)

What’s more important is that it destroys focus. Some things in the document are important. Most others are not. In a 180-page document, I can’t find the important stuff! It actually does harm to put it in if it’s irrelevant.

That’s the tough tradeoff, really. A tangible incremental value against an intangible loss of focus. The value looks attractive when you’re less confident. The document seems completely unfocused anyway.

So what the heck, put it in.

Do more of this. And that too.


So what can you do? Quite a bit, surprisingly.

Firstly, you’ve got to believe that less is more. The response to “What’s the harm in adding…?” is “It dilutes the message”. There’s two things here. Believing it. And having the courage to say it. Trust me, you really believe it only when you say it.

Next, you’ve got to understand — really understand — before you write or speak. That requires not fooling yourself. And it requires a lot of practice. I’ve had nearly 20 years of training in fooling myself, so it’s an uphill task. Many people are worse off, never having tasted true understanding.

Third, you’ve got to be brave enough to shut up, or say “I don’t know”. Initially, this was tough for me, but I learnt from a friend. I always thought him not-so-smart, but honest. He’d ask, “But why?” and when I’d explain, he’d say, “I don’t understand it.” After two hours of trying to get him to understand, I’d realise that I was the one who never got it in the first place. After a while, I got into the habit of being very prepared before I explained anything to him.

Saying “I don’t know” doesn’t make people think less of you, I’ve found. I know a lot of people disagree with me. One of the most consistent feedbacks I’ve received in the first half of any project or firm I’ve been in is, “He should speak up.” Dammit, I don’t have anything to say! If I know something, I’ll say it. If not, I’ll shut up. Now, despite this feedback, no one’s quite objected to me. And in the second half, they’re always amazed at how much I’ve improved based on the feedback.

The feedback had nothing to do with it, of course. I just happen to know more in the second half of a project.

There’s a reason why your boss wants you to talk. It makes you appear knowledgeable. In the short term, that’s good. You talk about “value” and “leverage” and people nod wisely.

In the long term, it makes you less able to say “I don’t know.” (What? This brilliant chap who knew all about value and leverage doesn’t understand our way of calculating ROI?)

It makes you less likely to ask questions.

It makes you learn less.

It makes you dumb.

On the other hand, I’ve learnt to plead ignorance up front. “Do you understand ROI?” “No.” Not even an excuse for it. Frankly, it saves time.

Sometimes, a meeting’s running late, I’m hungry, and I just nod at whatever’s said, and I lose the window of opportunity to ask. Except, I’ve learnt, there’s no such thing as a window of opportunity. If you don’t get it, ask. If they’ve said it thrice, and you still don’t get it, ask. More likely they’re not clear about it.


Postscript: This morning, I had to convert a document into a standard template. My document was 3 pages long. The template (just the headings) was 14 pages long.

Why? Because someone wants all documents in that format. Does it help them? Maybe not. But it has to be done. Standards.

Sometimes, it’s easier to give up. The smart thing is to minimise the effort on pointless work. I took 15 minutes. Beyond a point, I protect myself rather than the poor reader.

Lazy bargain hunting

I’m thinking of buying a digital keyboard with touch sensitive keys and MIDI support. (The one other thing that I thought of, a pitch bend, puts the keyboards out of my budget.)

I’d like a good deal. (Who doesn’t?) But I don’t like to spend time searching for one. (Who does?)

So here’s the plan.

Firstly, I’ll restrict my search to Amazon.co.uk. For electronics items, I haven’t found anyone consistently cheaper. Tesco has some pretty low prices, but not the range. eBuyer is pretty good, but not often enough. Google Products is the only other one that gets me consistently lower prices, but I’ve had my credit card identity stolen once before while shopping online, so I’d rather not pick any random seller listed on Google.

Amazon has a secret discount. You can search for electronics items with 30% off or more. And then you can narrow it down to Sound & Vision > Musical Instruments > MIDI Keyboards. Further, add a 100 – 200 GBP price restriction. That leaves us with one product:

MIDI keyboard on Amazon

While that matches my criteria, I’m in no hurry and can wait for more offers to come up. But I don’t want to keep checking this page every day. So, RSS to the rescue. You probably think I can’t get enough of RSS feeds. And you’d be right. The thing is, as an attention mechanism, it is incredibly powerful, and I never cease to be amazed at the things it lets me do.

Using my XPath checker and a bit of trial and error, I figured all product links link to “amazon.co.uk/dp/…” with a <span> inside. So this XPath gets all the links:

//a[contains(@href,'/dp/')][span]

And I made an RSS feed out of that using my XPath server and subscribed to it on Google Reader.
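The XPath server itself isn’t shown here, so the following is only a sketch of the last step: once a query like the one above has produced a list of links, wrapping them in a minimal RSS 2.0 feed takes a few lines of JavaScript. The function names and the example URLs are hypothetical, and the items array is assumed to be whatever the XPath query returned.

```javascript
// Escape the characters that are special in XML text.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// items: [{ title, link }], assumed to come from the XPath query.
function toRss(feedTitle, feedLink, items) {
  const entries = items.map(item =>
    "    <item>\n" +
    "      <title>" + escapeXml(item.title) + "</title>\n" +
    "      <link>" + escapeXml(item.link) + "</link>\n" +
    "      <guid>" + escapeXml(item.link) + "</guid>\n" +
    "    </item>\n"
  ).join("");
  return '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<rss version="2.0">\n' +
    "  <channel>\n" +
    "    <title>" + escapeXml(feedTitle) + "</title>\n" +
    "    <link>" + escapeXml(feedLink) + "</link>\n" +
    "    <description>" + escapeXml(feedTitle) + "</description>\n" +
    entries +
    "  </channel>\n" +
    "</rss>\n";
}

// Hypothetical example: a single matched product link.
const feed = toRss("MIDI keyboards, 30% off", "https://www.amazon.co.uk/", [
  { title: "Example keyboard", link: "https://www.amazon.co.uk/dp/EXAMPLE" }
]);
console.log(feed);
```

Using the product link as the guid means a feed reader treats each product as one entry, so an item shows up as new only the first time it matches the search.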

Combining a bunch of such searches, I have a shopping folder on Google Reader that has all the items I’m searching for. Now that’s lazy bargain hunting.


Which is all very fine. But given that I’m buying a car in a hurry right now, and I’m not doing any bargain hunting, it’s a classic case of being penny-wise and pound-foolish. Sigh…

Implicit information

From what I’ve seen, puzzles and exam questions share two un-real-worldly characteristics. Firstly, you are guaranteed that a solution exists. Secondly, you are given that all the information provided to you is relevant. (Well, not always. Some case studies I’ve seen have had their share of contrived irrelevance. But that’s often what it is, I think. People fill in the relevant stuff, and then try and distract by adding irrelevant material in the hope of making it more real-world-like. But that’s just a guess).

These are very powerful constraints. I know of nothing that has given me as much confidence in solving puzzles as the assurance that a solution exists (and that someone thinks me capable of getting it).

But it’s more than just a confidence builder. The guarantee that a solution exists (and invariably it’s a unique one) is a very powerful one. An extreme case is an objective type question, which explicitly provides three guarantees:

  1. There is a solution
  2. There is only ONE solution
  3. It is among the choices listed below

(Some papers try and take away the first guarantee by having an (E) None of the above category. But that’s still leaving behind the other two more powerful guarantees.)

Marking answers randomly, or marking (A) for every question would still get you 25% in an exam with 4 choices. (Marking (C) would prove just as good, unless you had a kind professor like this.) That’s better than any real-world scenario I’ve seen. (Real-world strategies aren’t much better, though.)

Using guarantee 2, you can eliminate choices easily. If (A) and (B) do not satisfy some property of the solution, they CANNOT be the answer. There’s only one solution, and these are not it.

Using guarantee 3, you can pick the last remaining choice without having to check it. The solution is definitely among the choices listed. So you don’t need to solve an objective type question. You just need to pick the right answer, which is completely different.

The principle applies even outside of objective type questions, especially in mathematically-oriented problems, or puzzles. And you can solve it by trial and error. For example, try this one from Martin Gardner‘s Mathematical Magic Show:

Two brothers own n sheep, each of which is sold for n dollars. Thus they have n² dollars in all. This is in the form of 10 dollar notes and 1 dollar coins, the number of 1 dollar coins being less than 10. The elder brother divides the money as follows: he takes a note for himself, gives one to his younger brother, takes a note for himself and so on. At the end, the younger brother complains that the elder took the first note as well as the last. So the elder gives the younger all the one dollar coins. The younger brother complains that the elder still has more. So the elder brother writes the younger a cheque to equalize their share. What was the cheque for?

Now, this is a weird problem. Think about it. You’re told almost NOTHING. And you have to guess what the amount is. (Note: you don’t have to guess what ‘n’ is. That’s impossible.)

Here’s how I solved the problem. I said, let me find even one case where the elder brother gets the first and last note. Let’s see what the answer is. Whatever the answer is for that case, it has to be the answer for all other cases — because otherwise, the problem does not have a unique solution.

So I tried n=1. n=2. n=3. For n=4, the amount is 16. That’s 1 $10 note and 6 $1 coins. The elder brother would get the first and the last $10 note. The younger would get $6. So the elder would have $4 more than the younger, and would write out a cheque for $2. (It’s amazing how many people get as far as the $4, but forget to divide by two.)

You can try it for any other value that has an odd number of $10 notes. It has to be for n ending with 4 or 6. That means n² ends in 6, and the cheque has to be for $2.

Notice that you didn’t need number theory to get the answer. The assurance that there is a unique answer is enough.
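If you don’t trust the uniqueness argument, brute force confirms it. This sketch (function name made up) tries every n up to 1,000, keeps the cases where the story applies, meaning an odd number of $10 notes so that the elder takes both the first and the last, and collects the cheque amounts.

```javascript
// Cheque amount for n sheep, or null if the story does not apply to this n.
function cheque(n) {
  const total = n * n;
  const notes = Math.floor(total / 10);  // number of $10 notes
  const coins = total % 10;              // number of $1 coins (always < 10)
  if (notes % 2 === 0) return null;      // elder gets first AND last note only if odd
  const elder = 10 * ((notes + 1) / 2);
  const younger = 10 * ((notes - 1) / 2) + coins;
  if (elder <= younger) return null;     // the story needs the elder to still be ahead
  return (elder - younger) / 2;          // hand over half the difference
}

const amounts = new Set();
for (let n = 1; n <= 1000; n++) {
  const c = cheque(n);
  if (c !== null) amounts.add(c);
}
console.log([...amounts]); // [ 2 ]
```

Every qualifying n gives the same cheque, which is what the uniqueness guarantee promised before a single case was worked out.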


There’s another kind of implicit information usually available: the amount of information there is. For example, take the following question:

Which city has a higher population: San Antonio or San Diego?

Children in the US apparently had difficulty answering it. Children in Germany had less trouble. The reason? The German kids had heard of San Diego, but not San Antonio. They figured the one they’d heard of was more likely bigger. Knowing less may be better.

It’s the same principle you use to check spellings. Run a Google search on two spellings. The one that returns a higher number of results is the correct spelling. (Of course, Google has a spelling correction mechanism that works well, but I use it for Tamil words. I can never tell if I should use ர or ற.)

Of course, the fundamental assumption here is: MORE INFORMATION = MORE CORRECT, which is not always the case. But the point I’m driving at is this:

You’re always given additional information. Even if you’re not given any information, that’s informative.