S Anand

The Non-Designer’s Design Book

I’ve been thumbing through books on visual design for a while, and recently, picked up a copy of The Non-Designer’s Design Book by Robin Williams.

If there’s one book that I’d suggest to a newbie on visual design, it’s this one. It’s rare among design books in that it offers 4 design principles that are easy to remember, easy to spot when violated, and easy to fix. Over 90% of the slides that I have reviewed violate at least one of these principles (often all), so I guess there’s a 90% chance this book will improve your design.

The four principles are (in the order of how often I see them violated):

  1. Alignment. Every edge of every element should be aligned with an edge of another element.
    Get that? Every edge of every element. No exceptions.
  2. Repetition. Use the same styles right through the presentation: fonts, size, colours, shapes.
    If you ever change a font, you must have a reason. Same for colour, size, shape, etc.
  3. Contrast. If you do change something, change all the way. Change the font, size, colour, everything.
    If two elements are not the same, then make them very different.
    If there were just 3 words on this slide, what should they be? Make those stand out.
  4. Proximity. Related items should be close together and grouped. Unrelated items should be far away.
    After designing your slide, list the elements, group them, and redesign to keep the groups together.

Contrast and proximity are important for the message. Proximity groups information into messages, and contrast highlights the key message. Alignment and repetition are more important for design. They make for more appealing reading.

Williams orders these in a different way to create a memorable acronym. (I’ll never forget it.)

  1. Contrast
  2. Repetition
  3. Alignment
  4. Proximity

I’ll let you read the book and absorb it better. At less than 200 pages, it’s a very readable book.

Caching pages on Apache

I don’t use any blogging software for my site. I just hand-wired it some years ago. When doing this, one of the biggest problems was caching.

Consider each blog entry page. Each page has the same template, but different content. Both the template and content could be changed. So ideally, blog pages should be served dynamically. That is, every time someone requests the page, I should look up the content, look up the template, and put them together.

I did that, and within a few days outgrew my hosting service’s CPU usage limit. Running such a program for every page hit is too heavy on the CPU.

One way around this is to create the pages beforehand and serve them as regular HTML. But every time the template changes, you need to re-generate every single page. I had over 2,000 pages. That would kill the CPU usage if I changed the template often.

At that point, I did a piece of analysis. Do I really need to regenerate all 2000 blog entries? Wouldn’t the 80-20 rule apply? The Apache log confirmed that 20% of the URLs were accounting for 76% of the hits. So I’d be wasting my time regenerating all the pages every time I changed the template.

Graph: 20% of URLs account for 76% of hits
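
This kind of check is easy to reproduce. Below is a minimal sketch in Python 3, assuming the log is in Apache's common or combined format (where the request line sits between the first pair of double quotes). The function names urls_from_log and top_share are made up for illustration; this is not the script used for the analysis above.

```python
from collections import Counter

def urls_from_log(lines):
    """Yield the requested path from Apache common/combined log lines."""
    for line in lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue                      # no quoted request line: skip
        request = parts[1].split()        # e.g. ['GET', '/a.html', 'HTTP/1.1']
        if len(request) >= 2:
            yield request[1]

def top_share(urls, fraction=0.2):
    """Return the share of hits going to the most-requested `fraction` of URLs."""
    counts = Counter(urls)
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)
```

Feeding it an open log file, as in top_share(urls_from_log(open("access.log"))), gives a number between 0 and 1. The file name access.log is an assumption.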

So based on this, I decided to dynamically cache the pages. When a page is requested for the first time, I create the page and save it in a cache. The next time, I’d just serve it from the cache. If the template changes, I just need to delete the cache. This way, I only generate pages that are requested, and they’re only generated once.
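
The idea itself does not depend on Apache or Perl. Here is a minimal Python 3 sketch of "generate on first request, serve from the cache afterwards, wipe the cache when the template changes". The names render, serve and clear_cache, and the {{content}} placeholder, are hypothetical and only there to make the example self-contained.

```python
import os

def render(template, content):
    """Hypothetical renderer: drop the content into the template."""
    return template.replace("{{content}}", content)

def serve(name, cache_dir, template, load_content):
    """Return the page for `name`, generating it only on a cache miss."""
    path = os.path.join(cache_dir, name + ".html")
    if os.path.isfile(path):                  # cache hit: no rendering work
        with open(path, encoding="utf-8") as f:
            return f.read()
    page = render(template, load_content(name))
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(page)                         # next request is served from here
    return page

def clear_cache(cache_dir):
    """Call after a template change: remove every cached page."""
    if not os.path.isdir(cache_dir):
        return
    for entry in os.listdir(cache_dir):
        if entry.endswith(".html"):
            os.remove(os.path.join(cache_dir, entry))
```

Only pages that someone actually asks for are ever rendered, and each one is rendered once per template version.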

OK, so that’s the background. Now let me get to how I did it.

I wrote a Perl script, blog.pl, that would generate a page in the html folder whenever it is called. Next, I changed Apache’s .htaccess to run this program only if the page did not exist in the html folder.

# Redirect to cache first
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]*)\.html$       html/$1.html

# If not found, run program to create page
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^html/([^/]*)\.html$  blog.pl?t=$1

The first block redirects Apache to the cache. The second block checks if the file exists in the cache. If it doesn’t, Apache redirects to the program. The program creates the page in the cache and displays it. Thereafter, Apache will just serve the file from the cache.


This Apache trick can be used in another way. I keep files organised in different folders to simplify my work. But to visitors of this site, that organisation is irrelevant. So I effectively merge these folders into one. For example, I have a folder called a in which I keep my static content. I also have this piece of code:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+)$   a/$1

If any file is not found in the main folder, just check in the a/ folder. So I can access the file /a/hindholam.midi as /hindholam.midi as well.

This can be extended to a series of folders: either as a cascade of caches, or to merge many folders into one.
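
The lookup order of such a cascade can be sketched in Python 3 as below. This only illustrates the "first folder that has the file wins" logic; it is not how Apache evaluates rewrite rules internally, and the function name resolve is made up.

```python
import os

def resolve(name, folders):
    """Return the first existing file called `name` in `folders`, else None."""
    for folder in folders:
        candidate = os.path.join(folder, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With folders set to the main folder followed by a/, resolve("hindholam.midi", folders) would find the copy in a/ whenever the main folder does not have one.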

How often to write

If you look at the number of entries I’ve written every month since 2005, there has been a clear decline. While I was averaging almost an entry a day in 2005 and 2006, that dropped to 2-3 entries a month since mid-2007.

Number of entries per month declining

This doesn’t bother me. I’ve been lucky to never have lost sight of the purpose of this website. This website is meant for me. Not for you, the reader. For me, the author.

Writing helps me clarify my thoughts. It forces me to learn. It gives me input from a broad audience. It preserves my thoughts. It kills boredom. But nowhere in that list is the need to entertain or enlighten you.

Not that I care less about you, but rather that I care more about me. If I start writing because I need to keep up the pace of output, the quality declines and I stop enjoying it. (This contradicts what I said earlier about Quantity Always Trumps Quality. Well, let me take back the “quality declines” part. If I stop enjoying it, it’s not worth doing.)

So I’ve been taking micro-sabbaticals. Just 3 posts between July – November 2007. No posts in June – July 2008. Whenever I have something to write, or feel like writing, I just go ahead.

It’s very relaxing. I don’t feel the obligation to keep up the readership. In fact, I don’t keep track of the readership, so that helps.

But in fact, while the number of posts has dropped, the average volume of writing hasn’t changed all that much. If you look at the size of writing (I write about 25KB worth a month), except for a blip near end-2007, it hasn’t changed that much. Those blips in the middle were me copying and pasting articles on Classical Ilayaraja, so they don’t really count.

Size of entries per month has not changed much

In other words, I spend about as much time as before writing. I write about the same stuff as before. Except that I’m putting in a bit more work into each piece, and it takes longer.

It’s just a different way of doing things. I’m getting more out of building larger pieces than blogging fragmented threads, so I’ve moved that way. And in doing so, I need to take a break every now and then, because you just can’t get some stuff done at a stretch.

That’s fine by me, and I hope you don’t mind. In fact, as Asimov put it, “I’m not too proud to ask a favour. Please don’t mind.”


I’m writing this for two reasons. One is to tell you why you don’t see stuff regularly from me, and to tell you not to expect any regularity. Just subscribe to the RSS feed and we’re all better off.

The other is because I see bloggers abandoning some great blogs. (You know who you are.) I think it’s sort of like earthquakes and forest fires. The pressure to take a break from blogging keeps building up, and unless indulged in, bloggers quit. Something like Guru’s sabbatical is a great idea. It provides the option for the return, and reduces the cost of taking a break.

A new home page

I have a new home page design. (If you’re reading the RSS feed, check the home page.)

One reason is that the old home page’s design sucked. Almost everyone told me that it was drab in black and white. Personally, I think the new home page sucks in terms of colours as well. There are too many. I suck at picking colours. The only good thing about these colours is that I left the choice to the judgement of experts. These are the colours in PowerPoint 2007’s “Concourse” theme, and I’ve just lifted them.

So, no, it wasn’t the colours that drove the redesign. My last redesign was over a year ago. I changed the structure from a list of links to two lists: one where I was just linking to interesting sites (bookmarking, really) and the other where I was writing content. The purpose behind that was to allow me to focus on writing stuff rather than just bookmarking.

And that worked pretty well for me.

In the last several months, I find myself writing more code than articles. I don’t quite have a way of sharing that. The new home page has a section dedicated to the sites I’m creating, and hopefully, it’ll let me share what I’m doing in a clearer way.

Another problem I have is that in attempting to write articles, I’ve cut myself off from writing the frivolous. Sometimes, I just need to share something small, like “I bought an Acer Aspire 5715Z” without going into the details of it. That’s not a bookmark. That’s not an article. I need a space in-between.

And that’s exactly the space micro-blogging captures.

I created a Twitter account last month. With the huge number of problems that twitter has, such as downtime and the lack of IM support, I hadn’t written a single tweet. Day-before, I created an account at identi.ca and it works just fine. Given that I now have 4 mobile devices, I should be able to do some decent microblogging.


This is actually my third or fourth attempt at redesigning my earlier home page. Every time, I’d start with a redesign, struggle with it, try to get things just right, and then eventually abandon the effort after a few weeks. This time, I succeeded — within a matter of three hours on my flight from Washington DC to London.

Two reasons. Yesterday, I found this CSS framework: 960.gs. It’s a grid system. And grids are absolutely the best way to get layouts for the web.

The other is an article from Coding Horror titled Quantity Always Trumps Quality. If you try to do stuff quickly, you end up doing better stuff than if you tried to do better stuff. To hell with perfection. Just get it out of the door.

Launching applications

Opening programs from the Start – All Programs menu is painful. For many years, I relied on the quick launch bar.

QuickLaunch

But it’s space constrained. There are only so many applications you can place there. I want space enough for frequently used documents as well. Recently, I decided that I need all the space on the screen. So my task bar is on auto hide, and that makes the quick launch bar a little tougher to use as well. And finally, I can’t use the quick launch bar with the keyboard. That’s important.

So I switched to the pinned menus on the Start Menu.

StartMenu

This works better with the keyboard. To access Word, I just type Ctrl-Esc, W. Excel: Ctrl-Esc, E. But I run short of letters soon. I have trouble between PowerPoint and Processing, for instance. And I can’t store documents.

I tried Enso Launcher and Launchy, both of which are great products, but I just can’t stand the thought of them hogging up all the memory that they do. Launchy in particular.

Given that I almost always have one or two command prompts open, I’ve written my own little tool to do the job now. It’s a command line launcher written in Perl. I call it “o”. On the first run, it indexes my hard disk. (Well, not all of it. I’ve picked what I need.) Now, if I want to read Harry Potter and the Deathly Hallows, I just type:

> o harry potter hallows

If I wanted to pick a Harry Potter book, I could:

> o harry potter
    0: D:/Entertainment/Books/Hugo Awards/2001 - J K Rowling - Harry Potter and the Goblet Of Fire.rar
    1: D:/Entertainment/Books/J K Rowling.1.Harry Potter and The Sorcerer's Stone.pdf
    2: D:/Entertainment/Books/J K Rowling.2.Harry Potter and The Chamber of Secrets.pdf
    3: D:/Entertainment/Books/J K Rowling.3.Harry Potter and The Prisoner of Azkaban.pdf
    4: D:/Entertainment/Books/J K Rowling.4.Harry Potter and The Goblet of Fire.doc
    5: D:/Entertainment/Books/J K Rowling.5.Harry Potter and the Order of the Phoenix.pdf
    6: D:/Entertainment/Books/J K Rowling.6.Harry Potter and the Half-Blood Prince.pdf
    7: D:/Entertainment/Books/J K Rowling.7.Harry Potter and the Deathly Hallows.pdf
    8: D:/Entertainment/Books/J K Rowling.The Harry Potter Encyclopedia.doc
    9: D:/My Pictures/2005-06 London/2005-07-16 06 Waterstones Oxford Street Harry Potter release.JPG
    ... more
> (0-9, q, any word): prince
D:/Entertainment/Books/J K Rowling.6.Harry Potter and the Half-Blood Prince.pdf

The program lists the files matching the words I typed, and lets me filter within that.
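
The matching logic is simple enough to sketch. The version below is in Python 3 rather than Perl, and it is an assumption about how such a launcher could work, not the actual source of "o". The function names build_index and match are hypothetical.

```python
import os

def build_index(roots):
    """Walk the chosen folders once and return every file path found."""
    paths = []
    for root in roots:
        for folder, _dirs, files in os.walk(root):
            for name in files:
                # Normalise separators so paths look like D:/Books/...
                paths.append(os.path.join(folder, name).replace(os.sep, "/"))
    return paths

def match(paths, words):
    """Keep the paths that contain every word, ignoring case."""
    words = [w.lower() for w in words]
    return [p for p in paths if all(w in p.lower() for w in words)]
```

Filtering within a result is just another call: match(match(paths, ["harry", "potter"]), ["prince"]). On Windows, a single remaining match could then be opened with os.startfile.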

I just wrote this yesterday, and already, I’ve used it dozens of times. Here’s the source.

PS: While I was at it, I downloaded a Flickr uploader for Perl. So I can now upload images with the command line. This easily saves me at least 5 minutes per article.

Illegally in Germany

In October 1997, Ram, my manager at IBM, strolled over to my desk and asked if I would like to visit the US. I’d never been there before. The impulse was to say “Yes”. But…

I’d written the CAT exam once before. Didn’t get through. Applied once again. But thanks to my diligence, I’d given the wrong residence address, and never got my admission card, and didn’t bother following it up. This would be my third “attempt”. And I didn’t want to goof it up again. (I didn’t get through that one either, as it turned out.)

“Ram, I need to be back on Dec 11th.”

“Mmm… I think we should be back by then.”

“NO MATTER WHAT!”

He smiled, and said “OK. We’ll be back by Dec 11th NO MATTER WHAT.” He thought I was going to get married or something.


It was quite warm in Bangalore, so I set out with a T-shirt and formal trousers. As I was leaving, my landlord and landlady (very nice people, and in retrospect, very far-sighted) pulled me in and said, “Have some snacks. You’ll feel hungry on the way.”

I tried my protests. They’ll feed me on the plane. I’m already carrying some food. I have cash to buy stuff. I’m fat and dieting. Didn’t matter. I still ended up carrying a fairly hefty package. “And this is for Kallol.” Another package. A colleague travelling with me was an ex-roommate as well. I just hoped I wouldn’t exceed 27 kgs.

It was a KLM flight that would halt at Amsterdam. We were to land early morning in Amsterdam, take a connecting flight to Boston, and then over to Charlotte. We’d reach Charlotte by night, in time for the class next day.

The flight itself was uneventful, except for my first non-vegetarian bite.

And then the fun began.

Breakfast was done by around 5:00am local time. The captain announced that we were near Amsterdam, fasten your seatbelts.

5:30am. No landing.

6:00am. No landing. When I pulled the shutters up, we were still flying over clouds.

7:00am. No landing.

7:30am. The captain announces that due to bad weather at Amsterdam, we would not be able to land there. We were being diverted to Cologne.

Not having been on any long-haul flights before, I wasn’t even worried. It was a KLM connecting flight. KLM would do something. But for feeling a bit hungry, things were fine.

A little before 8:00am, the plane began its descent. We were amidst clouds, though. For quite a while… and the plane kept descending…

Until, all of a sudden, I could see the ground about 20 feet from the plane! The fog was dense enough to be indistinguishable from clouds. (Or at least, I couldn’t tell the difference.) Lucky the pilot managed to land, and I’m surprised he even tried.

8:00am. We’re still in the plane, waiting.

9:00am. Hungry. No one has told us anything yet.

9:30am. We’re all asked to get down. Delighted, we all got off, ready to board the next plane…

… only to be herded off into a glass building on the terminal, where our luggage was waiting for us. No problem. Pick up luggage. Wait.

10:00am. All the flight staff had cleared the terminal. And, looking out of the glass walls, we could see our plane taking off! There was a fair bit of confusion (and mild panic) in the room, but being the suave software engineers that we were, we stayed put and relaxed.

11:00am. Still in the glass building. No flight has landed or taken off. Worse, no human in sight. I mean it: not a single human in sight other than us KLM passengers in this deserted terminal. We’re still hungry.

12:00noon. My snacks finally come out. We all have a bite. That turned out to be our lunch.

12:30pm. Some official enters the building and is mobbed. The closest we could get to him (or her?) was about 50m behind many hundreds of raised heads.

12:45pm. Official vanishes. We ask around if anyone knows more than we do. No one seems to.

1:30pm. Another official enters. Vanishes after a few minutes.

2:00pm. Finally, word gets around that we’ll be travelling via bus to Amsterdam. Clearly we’d missed our connecting flight. We’d be put in to the same flight the next day.

2:10pm. We hear a lot of activity. People start streaming out of the building. We try to join in the rush.

2:20pm. Ahead of us, we see a guy checking passports. Now, none of us had a German visa. Presumably it was OK, but in any case, we were entering Germany without a valid visa. The official stamped my passport without question.

2:30pm. We exit the airport. The temperature was 0 degrees C. I was still in my T-shirt. My warm clothes were packed. That day, I learnt two lessons. One, never keep all your warm clothes inaccessibly in the check-in baggage. (I had my check-in baggage. But it was packed, and if I opened it, I couldn’t have put everything back in. Besides, we were being herded into a bus: not much chance of hanging around to open a suitcase.) Two, it’s actually possible to get a headache from the cold. For 15 freezing minutes, we stood on the road waiting for the bus, and enjoying the pleasures of our first day on European soil.

2:45pm. Bus arrives. Mob tries to enter bus. Half of our group manages to get through. I am left behind. Fortunately, next bus is only 5 minutes behind.

7:00pm. Bus finally arrives at Schiphol airport. We’re herded out to the KLM counter. By now, it’s been well over 24 hours since my last full meal.

7:30pm. We’re told we’ll get a hotel to stay in, and our flight is confirmed for the next day. At this point, we’re famished. So we exchanged some currency, and decided to buy some food. I picked a green apple. This happened to be my first green apple. No one had told me that apples could taste sour. (While on that topic, I must mention apple pies. I love apple pies in India. I hate apple pies in London. I suspect it’s the red versus green apples.)

7:31pm. I take one bite. Another bite. Have a funny feeling in my stomach. Burning sensation. And at that point, I collapsed. Physically. Just dropped on the floor and had to be pulled up.

8:00pm. Finally reach the hotel. Not entirely sure how. I’m too tired for anything but milk, so I get a glassful and go to sleep.


PS: We finally reached Charlotte a day late. Fortunately, we didn’t miss much.

Apparently, most passengers on the flight complained to KLM and received gifts / free miles of a substantial magnitude. We didn’t know of that till much later.

This remains my only trip to Germany till date. My passport still holds an entry stamp without a visa.

We did get the bonus of spending half a day in Amsterdam, which is a rather nice place. Again, without a visa.

In search of a good editor

It’s amazing how hard it is to get a good programming editor. I’ve played around with more editors/IDEs than I care to remember: e, Notepad++, NoteTab, SciTE, Crimson Editor, Komodo, Eclipse, Aptana.

There are four features that are critical to me.

  • Syntax highlighting. Over time, I’ve found this to increase readability dramatically. Look at this piece of code with and without syntax highlighting:
    Syntax Highlighting
    Doesn’t the structure of the document just jump out with syntax highlighting? Anyway, I’ve gotten used to that.
  • Column editing. I want to be able to do this:
    Column Editing
    Being able to type across rows is incredibly useful. I use it both for programming as well as to complement data-processing on Excel.
  • Unicode support. I often work with non-ASCII files, particularly in Tamil. Unicode support comes in handy when debugging pages for my songs site.
  • Auto-completion. This is 10 times more productive than having to look up the manual for each function.
    AutoCompletion

(Oh, and it’s got to be free too. Except for e Text Editor, all the others qualify.)

The problem is, none of the editors that I’ve looked at support all of these features.

Editor | Syntax highlighting | Column editing | Unicode support | Auto-completion
e Text Editor | Yes | Yes | No | Yes
Crimson Editor | Yes | Yes | No | No
Notepad++ | Yes | No | Yes | No
NoteTab-Lite | No | No | No | No
SciTE | Yes | No | Yes | Yes
TextPad | Yes | No | Yes | No
UltraEdit | Yes | No | No | ?
Aptana | Yes | No | Yes | Yes
Eclipse | Yes | No | Yes | Yes
Komodo | Yes | No | Yes | Yes

Wikipedia has a more in-depth comparison of text editors.

Actually, there’s another parameter that’s pretty important: responsiveness. When I type something, I want to see it on the screen. Right that millisecond. With some of the features added by these editors, there’s so much bloat that it often takes up to one second between the keypress and the refresh. That’s just not OK.

I’ve settled on Crimson Editor as my default editor these days, simply because it’s quick and has column editing. (Column editing on e Text Editor is a bit harder to use.) When I am writing Unicode, I switch over to Notepad++. For large programs, I’m leaning towards Komodo right now, largely because Eclipse is bloated and Aptana was slow. (Komodo is slow too. Maybe I’ll switch back.)

There are many other things on my “would love to have” list, like regular-expression search and replace, line sorting, code folding, brace matching, word wrapping, etc. Most of those, though, are either not too important, or most editors already have them.

Well, there’s the sad thing. I’ve been hunting for a good text editor for over 10 years now. May someone write a lightweight IDE with column editing.

JPath – XPath for Javascript

XPath is a neat way of navigating deep XML structures. It’s like using a directory structure. /table//td gets all the TDs somewhere below TABLE.

Usually, you don’t need this sort of a thing for data structures, particularly in JavaScript. Something like table.td would already work. But sometimes, it does help to have something like XPath even for data structures, so I built a simple XPath-like processor for Javascript called JPath.

Here are some examples of how it would work:

jpath(context, "para") returns context.para
jpath(context, "*") returns all values of context (for both arrays and objects)
jpath(context, "para[0]") returns context.para[0]
jpath(context, "para[last()]") returns context.para[context.para.length - 1]
jpath(context, "*/para") returns context[all children].para
jpath(context, "/doc/chapter[5]/section[2]") returns context.doc.chapter[5].section[2]
jpath(context, "chapter//para") returns all para elements inside context.chapter
jpath(context, "//para") returns all para elements inside context
jpath(context, "//olist/item") returns all olist.item elements inside context
jpath(context, ".") returns the context
jpath(context, ".//para") same as //para
jpath(context, "//para/..") returns the parent of all para elements inside context

Some caveats:

  • This is an implementation of the abbreviated syntax of XPath. You can’t use axis::nodetest
  • No functions are supported other than last()
  • Only node name tests are allowed, no nodetype tests. So you can’t do text() and node()
  • Indices are zero-based, not 1-based
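
To make the "//name" idea concrete, here is a rough Python 3 analogue that collects every value stored under a given key at any depth of nested dicts and lists. It is a sketch of the descendant lookup only, under the assumption that list indices are matched by their string form. It is not the JPath implementation, and the function name descendants is made up.

```python
def descendants(context, name):
    """Collect every value stored under key `name` at any depth."""
    if isinstance(context, dict):
        children = list(context.items())
    elif isinstance(context, list):
        children = list(enumerate(context))   # indices act as keys
    else:
        return []                             # scalars have no children
    found = []
    for key, value in children:
        if str(key) == name:
            found.append(value)
        found.extend(descendants(value, name))
    return found
```

Results come back in document order: a match is listed before anything nested inside it.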

There are a couple of reasons why this sort of thing is useful.

  • Extracting attributes deep down. Suppose you had an array of arrays, and you wanted the first element of each array.
    Column Selection
    You could do this the long way:
    for (var list=[], i=0; i < data.length; i++) {
        list.push(data[i][0]);
    }
    

    ... or the short way:

    $.map(data, function(v) {
        return v[0];
    })

    But the best would be something like:

    jpath(data, "//0")
    
  • Ragged data structures. Take for example the results from Google's AJAX feed API.
    {"responseData": {
     "feed": {
      "title": "Digg",
      "link": "http://digg.com/",
      "author": "",
      "description": "Digg",
      "type": "rss20",
      "entries": [
       {
        "title": "The Pirate Bay Moves Servers to Egypt Due to Copyright Laws",
        "link": "http://digg.com/tech_news/The_Pirate_Bay_Moves_Servers_to_Egypt_Due_to_Copyright_Laws",
        "author": "",
        "publishedDate": "Mon, 31 Mar 2008 23:13:33 -0700",
        "contentSnippet": "Due to the new copyright legislation that are going ...",
        "content": "Due to the new copyright legislation that are going to take...",
        "categories": [
        ]
       },
       {
        "title": "Millions Dead/Dying in Recent Mass-Rick-Rolling by YouTube.",
        "link": "http://digg.com/comedy/Millions_Dead_Dying_in_Recent_Mass_Rick_Rolling_by_YouTube",
        "author": "",
        "publishedDate": "Mon, 31 Mar 2008 22:53:30 -0700",
        "contentSnippet": "Click on any \u0022Featured Videos\u0022. When will the insanity stop?",
        "content": "Click on any \u0022Featured Videos\u0022. When will the insanity stop?",
        "categories": [
        ]
       },
       ...
      ]
     }
    }
    , "responseDetails": null, "responseStatus": 200}
    

    If you wanted all the title entries, including the feed title, the choice is between:

    var titles = [ result.feed.title ];
    for (var i=0, l=result.feed.entries.length; i<l; i++) {
        titles.push(result.feed.entries[i].title);
    }
    

    ... versus...

    titles = jpath(result, '//title');
    

    If, further, you wanted the list of all categories at one shot, you could use:

    jpath(result, "//categories/*")
    

Automating Internet Explorer with jQuery

Most of my screen-scraping so far has been through Perl (typically WWW::Mechanize). The big problem is that it doesn’t support Javascript, which can often be an issue:

  • The content may be Javascript-based. For example, Amazon.com shows the bestseller book list only if you have Javascript enabled. So if you’re scraping the Amazon main page for the books bestseller list, you won’t get it from the static HTML.
  • The navigation may require Javascript. Instead of links or buttons in forms, you might have Javascript functions. Many pages use these, and not all of them degrade gracefully into HTML. (Try using Google Video without Javascript.)
  • The login page uses Javascript. It creates some crazy session ID, and you need Javascript to reproduce what it does.
  • You might be testing a Javascript-based web-page. This was my main problem: how do I automate testing my pages, given that I make a lot of mistakes?

There are many approaches to overcoming this. The easiest is to use Win32::IE::Mechanize, which uses Internet Explorer in the background to actually load the page and do the scraping. It’s a bit slower than scraping just the HTML, but it’ll get the job done.

Another is to use Rhino. John Resig has written env.js that mimics the browser environment, and on most simple pages, it handles the Javascript quite well.

I would rather have a hybrid of both approaches. I don’t like the WWW::Mechanize interface. I’ve gotten used to jQuery’s rather powerful selectors and chainability. So I’ll tell you a way of using jQuery to screen-scrape offline using Python. (It doesn’t have to be Python. Perl, Ruby, Javascript… any scripting language that can use COM on Windows will work.)

Let’s take Google Video. Currently, it relies almost entirely on Javascript. The video marked in red below appears only if you have Javascript.

The left box showing the top video uses Javascript

I’d like an automated way of checking what video is on top on Google Video every hour, and save the details. Clearly a task for automation, and clearly not one for pure HTML-scraping.

I know the video’s details are stored in elements with the following IDs (thanks to XPath checker):

ID | What’s there
hs_title_link | Link to the video
hs_duration_date | Duration and date
hs_ratings | Ratings. The stars indicate the rating and the span.Votes element inside it has the number of people who rated it.
hs_site | The site that hosts the video
hs_description | Short description

So I could do the following on Win32::IE::Mechanize.

use Win32::IE::Mechanize;
my $ie = Win32::IE::Mechanize->new( visible => 1 );
$ie->get("http://video.google.com/");
my @links = $ie->links;
# ... then what?

I could go through each link to extract the hs_title_link, but there’s no way to get the other stuff.

Instead, we could take advantage of a couple of facts:

  • Internet Explorer exposes a COM interface. That’s what Win32::IE::Mechanize uses. You can use it in any scripting language (Perl, Ruby, Javascript, …) on Windows to control IE.
  • You can load jQuery on to any page. Just add a <script> tag pointing to jQuery. Then, you can call jQuery from the scripting language!

Let’s take this step by step. This Python program opens IE, loads Google Video and prints the text.

# Start Internet Explorer
import win32com.client
ie = win32com.client.Dispatch("InternetExplorer.Application")
 
# Display IE, so you'll know what's happening
ie.visible = 1
 
# Go to Google Video
ie.navigate("http://video.google.com/")
 
# Wait till the page is loaded
from time import sleep
while ie.Busy: sleep(0.2)
 
# Print the contents
# Watch out for Unicode
print ie.document.body.innertext.encode("utf-8")
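
One thing to watch: the wait loop above spins forever if the page never finishes loading. A possible variation, sketched in Python 3, is a small polling helper with a timeout. The name wait_until and its default values are assumptions for illustration, not part of win32com.

```python
import time

def wait_until(ready, timeout=30.0, interval=0.2):
    """Poll `ready()` until it returns a truthy value or `timeout` seconds pass.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while not ready():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

With the ie object from the program above, the wait would become wait_until(lambda: not ie.Busy), and the script can bail out if that returns False.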

The next step is to add jQuery to the Google Video page.

# Add the jQuery script to the browser
def addJQuery(browser,
    url="http://jqueryjs.googlecode.com/files/jquery-1.2.4.js"):
 
    document = browser.document
    window = document.parentWindow
    head = document.getElementsByTagName("head")[0]
    script = document.createElement("script")
    script.type = "text/javascript"
    script.src = url
    head.appendChild(script)
    while not window.jQuery: sleep(0.1)
    return window.jQuery
 
jQuery = addJQuery(ie)

Now the variable jQuery contains the Javascript jQuery object. From here on, you can hardly tell if you’re working in Javascript or Python. Below are the expressions (in Python!) to get the video’s details.

# Video title: "McCain's YouTube Problem ..."
jQuery("#hs_title_link").text()
 
# Title link: '/videoplay?docid=1750591377151076231'
jQuery("#hs_title_link").attr("href")
 
# Duration and date: '3 min - May 18, 2008 - '
jQuery("#hs_duration_date").text()
 
# Rating: 5.0
jQuery("#hs_ratings img").length
 
# Number of ratings '(8,288 Ratings) '
jQuery("#hs_ratings span.Votes").text()
 
# Site: 'Watch this video on youtube.com'
jQuery("#hs_site").text()
 
# Video description
jQuery("#hs_description").text()
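
The strings that come back still need a little cleaning before they can be saved. Here is a minimal Python 3 sketch, assuming the text always looks like the samples in the comments above ('(8,288 Ratings) ' and '3 min - May 18, 2008 - '). The helper names parse_votes and split_duration_date are hypothetical.

```python
import re

def parse_votes(text):
    """'(8,288 Ratings) ' -> 8288. Returns None if no number is found."""
    m = re.search(r"\d[\d,]*", text)
    return int(m.group().replace(",", "")) if m else None

def split_duration_date(text):
    """'3 min - May 18, 2008 - ' -> ['3 min', 'May 18, 2008']."""
    return [part.strip() for part in text.split(" - ") if part.strip()]
```

If the page layout changes, both helpers will quietly return None or a shorter list, so it is worth checking the result before storing it.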

This wouldn’t have worked out as neatly in Perl, simply because you’d need to use -> instead of . (dot). With Python (and with Ruby and Javascript on cscript), you can almost cut-and-paste jQuery code.

If you want to click on the top video link, use:

jQuery("#hs_title_link").get(0).click()

In addition, you can use the keyboard as well. If you want to type username TAB password, use this:

shell = win32com.client.Dispatch("WScript.Shell")
shell.sendkeys("username{TAB}password")

You can use any of the arrow keys, control keys, etc. Refer to the SendKeys Method on MSDN.