Formatting tables
Formatting tables in Excel is a fairly common task, but there are a number of ways to improve on the way it’s done most of the time.
Here are a few tips. Fairly basic stuff, but hopefully useful.
Formatting tables in Excel is a fairly common task, but there are a number of ways to improve on the way it’s done most of the time.
Here are a few tips. Fairly basic stuff, but hopefully useful.
An hour of jogging is worth about one Cheese Whopper. Now, are you going to really spend an hour on the road every day just to burn off that extra burger? You don't exercise to lose weight (although it certainly helps). You exercise because you'll live longer and you'll feel better.I’m afraid I’ll live too long anyway, so I won't bother exercising just yet. It's down to eating less. Sadly, I like food. So to make my “diet” work, I need foods that add less calories per gram. Usually, when browsing stores, I check these manually. But being a geek, I figured there’s an easier way. Below is a graph of some foods (the kind I particularly need to avoid, but still end up eating). The ones on the top add a lot of calories (per 100g), and better to avoid. The ones at the right cost a lot more. Now, I’m no longer at the point where I need to worry about food expenses, but still, I can’t quite kick the habit, also you might want to check out this Rootine's comparison of B12 methylcobalamin and cyanocobalamin that will help you in your diet. Hover over the foods to see what they are, and click on them to visit the product. (If you’re using an RSS reader and this doesn’t work, read on my site.)
Does it matter which month you’re born in?
Based on the results of the 20 lakh students taking the Class XII exams at Tamil Nadu over the last 3 years (via Reportbee), it appears that the month you were born in can make a difference of as much as 120 marks out of 1,200 – or 10%!
Most students who took the Class XII exams in 2011 were born between March 1991 and June 1992. The average marks of each student (out of 1200) is shown in the graph below.
Students born in June 1991 scored the lowest – around 720/1200. This suddenly shoots up in July, then in August, and the students born in September score as much as 840/1200 on average. From there on, it’s downhill.
This result is consistent across years. In 2009 and 2010, you see a similar pattern.
Why could this be?
Malcolm Gladwell’s book Outliers offers a clue.
Outliers opens, for example, by examining why a hugely disproportionate number of professional hockey and soccer players are born in January, February and March.
The answer turns out to be completely unrelated to numerology or astrology.
It’s simply that in Canada the eligibility cutoff for age-class hockey is January 1. A boy who turns ten on January 2, then, could be playing alongside someone who doesn’t turn ten until the end of the year—and at that age, in preadolescence, a twelve-month gap in age represents an enormous difference in physical maturity.
In Tamil Nadu, students must be 5 years old before entering Class 1. Schools open mid-June. So students born in June 1994 would barely make it in June 1999 – making them the youngest students in the class. July and August students would be missed – but since many schools implement this policy leniently, they sometimes make it in as well. September borns are often consistently the eldest students in a class.
This pattern reflected in the marks. The eldest – the September 1993 borns – score the highest. The next eldest, the October 1993 borns, score a bit less. And so on. (There are older students who take the exam – the ones born before September 1993 – but many of these are failed students from the previous year, introducing a bias in the results.)
Perhaps this initial advantage that the elder students have over their classmates continues through the years? Whatever the reason, it’s clear that if your child is born in September, he or she already has a 100 mark advantage! If you’re overwhelmed from being a parent, you can always take a breather and play games like 토토사이트.
The IMDb Top 250, as a source of movies, dries out quickly. In my case, I’ve seen about 175/250. Not sure how much I want to see the rest.
When chatting with Col Needham (who’s working his way through every movie with over 40,000 votes), I came up with this as a useful way of finding what movies to watch next.
Each box is one or more movies. Darker boxes mean more movies. Those on the right have more votes. Those on top have a better rating. The ones I’ve seen are green, the rest are red. (I’ve seen more movies than that – just haven’t marked them green yet 🙂
I think people like to watch the movies on the top right – that popularity compensates (at least partly) for rating, and the number of votes is an indication of popularity.
For example, my movie pattern tells me that I ought to see Cidade de Deus, Inglourious Basterds and Heat – which I knew from the IMDb Top 250, but also that I ought to cover Kick-Ass, The Hangover and Juno.
It’s easy to pick movies in a specific genre as well.
Clearly, there are many more Comedy movies in the list than any other type – though Romance and Action are doing fine too. And I seem the have a strong preference for the Fantasy genre, in stark contrast to Horror.
(Incidentally, I’ve given up trying to see The Shining after three attempts. Stephen King’s scary enough. The novel kept me awake checking under my bed for a week at night. Then there’s Stanley Kubrick’s style. A Clockwork Orange was disturbing enough, but Haley Joel Osment in the first part of A.I. was downright scary. Finally, there’s Jack Nicholson. Sorry, but I won’t risk that combination on a bright sunny day with the doors open.)
You can track your list at http://250.s-anand.net/visual.
For those who want to play with the code, it’s at http://code.google.com/p/two-fifty/source/browse/trunk/visual.html.
Sometimes, school marks are moderated. That is, the actual marks are adjusted to better reflect students’ performances. For example, if an exam is very easy compared to another, you may want to scale down the marks on the easy exam to make it comparable.
I was testing out the impact of moderation. In this video, I’ll try and walk through the impact, visually, of using a simple scaling formula.
BTW, this set of videos is intended for a very specific audience. You are not expected to understand this.
Rough transcript
First, let me show you how to generate marks randomly. Let’s say we want marks with a mean of 50 and a standard deviation of 20. That means that two-thirds of the marks will be between 50 plus/minus 20. I use the NORMINV formula in Excel to generate the numbers. The formula =NORMINV(RAND(), Mean, SD) will generate a random mark that fits this distribution. Let’s say we create 225 students’ marks in this way.
Now, I’ll plot it as a scatterplot. We want the X-axis to range from 0 to 225. We want the Y-axis to range from 0 to 100. We can remove the title, axes and the gridlines. Now, we can shrink the graph and position it in a single column. It’s a good idea to change the marker style to something smaller as well. Now, that’s a quick visual representation of students’ marks in one exam.
Let’s say our exam has a mean of 70 and a standard deviation of 10. The students have done fairly well here. If I want to compare the scores in this exam with another exam with a mean of 50 and standard deviation of 20, it’s possible to scale that in a very simple way.
We reduce the mean from the marks. We divide by the standard deviation. Then multiply by the new standard deviation. And add back the new mean.
Let me plot this. I’ll copy the original plot, position it, and change the data.
Now, you can see that the mean has gone down a bit — it’s down from 70 to 50, and the spread has gone up as well — from 10 to 20.
Let’s try and understand what this means.
If the first column has the marks in a school internal exam, and the second in a public exam, we can scale the internal scores to be in line with the public exam scores for them to be comparable.
The internal exam has a higher average, which means that it was easier, and a lower spread, which means that most of the students answered similarly. When scaling it to the public exam, students who performed well in the interal exam would continue to perform well after scaling. But students with an average performance would have their scores pulled down.
This is because the internal exam is an easy one, and in order to make it comparable, we’re stretching their marks to the same range. As a result, the good performers would continue getting a top score. But poor performers who’ve gotten a better score than they would have in a public exam lose out.
Yesterday, I wrote about node.js being fast. Here are some numbers. I ran Apache Benchmark on the simplest Hello World program possible, testing 10,000 requests with 100 concurrent connections (ab -n 10000 -c 100
). These are on my Dell E5400, with lots of application running, so take them with a pinch of salt.
PHP5 on Apache 2.2.6<?php echo “Hello world” ?> |
1,550/sec | Base case. But this isn’t too bad |
Tornado/Python See Tornadoweb example |
1,900/sec | Over 20% faster |
Static HTML on Apache 2.2.6Hello world |
2,250/sec | Another 20% faster |
Static HTML on nginx 0.9.0Hello world |
2,400/sec | 6% faster |
node.js 0.4.1 See nodejs.org example |
2,500/sec | Faster than a static file on nginx! |
I was definitely NOT expecting this result… but it looks like serving a static file with node.js could be faster than nginx. This might explain why Markup.io is exposing node.js directly, without an nginx or varnish proxy.
I’ve moved from Python to Javascript on the server side – specifically, Tornado to Node.js.
Three years ago, I moved from Perl to Python because I got free hosting at AppEngine. Python’s a cleaner language, but that was not enough to make me move. Free hosting was.
Initially, my apps were on AppEngine, but that wouldn’t work for corporate apps, so I tried Django. IMHO, Django’s too bulky, has too much “magic”, and templates are restrictive. Then I tried Tornado: small; independent modules; easy to learn. I used it for almost 2 years.
The unexpected bonus with Tornado was it’s event-based model: it wouldn’t wait for file or HTTP requests to be complete before serving the next request. I ended up getting a fair bit of performance from a single server.
Trouble is, Python’s a rare skill. I tried selling Python in corporates a couple of times, and barring RBS (which used it before I came in, and made it really easy for me to build an IRR calculator), I’ve failed every time. Apart from general fear, uncertainty and doubt, getting people is tougher.
Javascript’s a good choice. It has many of Python’s benefits. It’s easy to recruit people. Corporates aren’t terrified of it. Rhino was good enough a server. All it lacked was the “cool” factor, which node.js has now brought it. And besides,
Bye, Python.
I haven’t found an open or reliable database providing the geo-location of Indian PIN codes. That’s a bother if you’re creating geographic mash-ups. The closest were commercial sources:
Crowd-sourcing this might help. Here’s a site where you can map the location of any PIN code you know:
For example, if you knew the exact location of the PIN code 110083 (which happens to be Mongolpuri in New Delhi), just go to http://pincode.datameet.org/IN/110083 and move the marker to where it should be.
I’ve initially populated the data from GeoNames. Arun has offered OpenStreetMap data. If you know of any sources that we could use, please let me know. And if you want to use the data, feel free. It’s CC licensed. You can check out the source on github too.
Time for the annual update on software I use. This time, I’ve got Wakoopa to help me with the relative usage as well. Here’s the top 100 software / web apps I’ve used recently, and how long I spent on them.
HTML 4 & 5: The Complete Reference is an iPhone / iPad app that does exactly what it says: a reference for HTML 4 and 5.
It has a list of all tags, clearly demarcated as HTML4, HTML5 or both. The application is fairly easy to scroll through to find the tag or attribute you want. Clicking on a tag, you get:
You can also scroll through the list of attributes and see which tags they’re valid for.
The part that quite interested me was the “characters” or HTML entities. Quite often, I’d want the pound (£) or right angle quotes (»), but wouldn’t know the character or entity reference. So far, I’ve been using this HTML entity reference to search for characters, where I can just type in the word (e.g. pound or quote) and it filters the list to show what I want. I was really hoping to see that on the app, but was disappointed. It lets you search, but it’s not search as you type. And the result points you to a section that contains the character – not directly to the character. (It’s a bit difficult to find the character in the longer sections).
There’s also a section where you can see elements by “task” – e.g. Forms, Link-related, Document Setup, Interaction, etc. This is a pretty useful break-up if you’re looking for the right element for the job, or browsing for interesting new elements to discover in HTML5. (I found the <menu>
and <command>
tags this way.
You do have the option of just downloading the PDF version of the HTML5 spec and reading it in iBooks, of course. So while I find the book useful, without a search-as-you-type feature, I suspect it won’t do much for my speed of looking up things, so I’ll just stick to the spec for the moment.
Disclosure: I’m writing this post as part of O’Reilly’s blogger review program. While I’m not getting paid to review the app, I did get it for free.