S Anand, Author at S Anand

Why I’m blogging less

April 26, 2014 December 3, 2021 / How I do things / 4 Comments

My blog’s been through a number of phases. Between 1996 and 1999, it was just a website with a few facts about me and some of my juvenile ramblings. Inspired by robotwisdom.com, I converted it into a blog – except that I didn’t know what blogging was and just called it “updating my site every day.” It was mostly a link blog.

In 2006, around the time when I moved from Mumbai to London, I reduced my link-blogging and started writing longer articles talking about my experiences. This was a fairly productive phase, and I was churning out a few dozen articles every year until 2012.

In the UK, I didn’t know many people, and wasn’t comfortable going out of my way to interact. My blog was the primary means of sharing my thoughts and work.

In 2012, when I moved back to India, that changed. I started speaking at various events. (Some of my talks are recorded.) I’ve been speaking at one or two events every month, which is roughly the volume of blogging I had been doing since 2006.

So, effectively, my output medium has changed. Instead of writing, I speak. Correspondingly, my blogging has come down.

How does it feel? Well, on the one hand, there’s a lot more direct feedback when you’re speaking to an audience. You can interact with them, ask questions, play games – all of which I can do on a blog as well, but this is real-time. When my audience laughs, I steer my talk more towards funny insights. When my audience claps, I steer it towards more impressive techniques. When my audience reacts like dead fish, I switch to Q&A. When my audience is lost in their own conversation, I terminate the talk early.

Effectively, my content is often shaped in real time. And that is usually an exhilarating experience. I used to worry that the talks didn’t have the permanence of blog posts. But like I said, many of them are recorded. I also worried that the audience response would not be permanent, like blog comments. But Twitter fills that void.

For example, yesterday, I was speaking at the Great Indian Developer Summit. Here are the tweets going out as I was speaking.

Kashinath Pai. P.: #GIDS visualising data by Anand »

Naresha: Looking forward for a cool ‘Visualizing Big Data’ presentation from @sanand0 #gids »

ARAVIND CHEKKALURE: #gids watg for a solution in visualizating bigdata..woww »

Raj: Analysis of big data and visualization of big data is very different #gids »

Sundarraj Kaushik: Now to visualize data with Anand at #gids »

Raj: Anand’s session on visualization of big data surly interesting talk of day. I attended previously #gids »

UK Gupta: Another Excitin &Interestin Session “Leveraging #Cloud Services2Build & Integrate Analytics in Ur #IoT Solutions” by @Ragural #IntelDZ #gids »

Siva Narayanan: Doordarshi party is the worst loser in Indian politics #gids »

Siva Narayanan: Very cool viz about Indian Elections from gramener #gids »

ARAVIND CHEKKALURE: Examples for visualiztion that anad took is rally imprazv #gids »

Siva Narayanan: Margin of victory for winner isn’t affected by number of candidates. Affects runner up. #gids »

Goutham G: Enjoyed every bit of information on xls by vinod #GIDS »

Sanaulla: Very interesting facts and presentation by s anand in visualizing big data #gids »

THIYAGARAJAN.R: #gids Hi everyone, Anand session on Data visualization is interesting… happening on Main Hall »

Sanaulla: Visualization helps grasp huge amount of data quite easily #gids »

Vignesh Rajendran: Sanand might be called @NateSilver538 of Indian politics analysis #gids »

45hw1nk5: By far the beat talk so far, visualising data by S Anand from Gramener. #gids »

Rahul Sharma: On the Everest of knowledge with S Anand.. Courtesy – ‘Big Data’ 😀 #gids »

Sundarraj Kaushik: A very pertinent subject visualization of election statistics at #gids. »

Siva Narayanan: UdayKumar has 1600 cases against him #gids »

Rishi Raj Srivastav: Great Indian politics (data) visualization by Anand. #GIDS »

Siva Narayanan: Singh has been most popular last name in Indian elections every time #gids »

ARAVIND CHEKKALURE: Big data visualization is this much easy..like anand speks #gids »

Sundarraj Kaushik: A very colourful presentation without actual mention of big data or visualization. Wonderful presentation at #gids. »

Govind Kanshi: Gujarat, Maharashtra have longest names fighting in elections #gids powerful story as usual by @sanand0 »

Venkat ramanan v: Data visualization at it’s best. #gids »

Kamlesh ®: RT @govindk: Gujarat, Maharashtra have longest names fighting in elections #gids powerful story as usual by @sanand0 »

brntbeer: Talk about last names and regions of India. I’m definitely an outsider! #gids »

Sundarraj Kaushik: Is the dropout of girls the cause of better results of girls. Anand at #gids »

Sanaulla: Best session of the day: visualizing big data #gids »

ujwala: Visualizing big data session is very very interesting. Nicely done. #gids »

Siva Narayanan: Sun sign has a big impact on exam performance #gids »

Venkat ramanan v: Intriguing session on visualization #gids http://t.co/sezeRP48BM »

Siva Narayanan: Almost nobody is born in august in India! #gids People are fudging birth dates. »

Raj: Thanks #saltmartch for invite such a good speaker. #gids »

ujwala: RT @K2_181: Almost nobody is born in august in India! #gids People are fudging birth dates. »

Vinod Srinivas: @greatindiandev #gids #Anand was at his best in his session on #Visualisation »

Kiran Bhat: Lets get people to SEE data #Gramener #gids »

Sonali Patnaik: #GIDS “lets get people to see data” good session @sanand »

Amol Khanapurkar: Easily the best session at #gids for me by S Anand from http://t.co/1lVuBMpPlW »

isha jain: Amazing facts and awesome session on data visualization by Anand… #gids »

Kashinath Pai. P.: Absolutely mind blowing presentation by s anand #GIDS »

Sundarraj Kaushik: Thanks to Anand S for a marvellous and pertinent presentation at #gids 2014 »

Naresha: @sanand0 Those were amazing visualizations of data. One of the best sessions of #gids. »

Amol Khanapurkar: Data visualization can provide insights that no amount of analytic processing can hope to provide. #gids »

Vijay Singh: Session on big data visualisation was a joy ride #gids »

Mrugen Deshmukh: @Gramener Most entertaining talk yet. by S. Anand #gids »

Kashinath Pai. P.: Wonderful work @greatindiandev . inviting @sanand0 was absolutely amazing. #GIDS »

Sachin: That was an really awesome session on big data visualisation.. Had fun… #gids »

ARAVIND CHEKKALURE: Its reLy gd session by anand on visualizatg Bigdata..but never touch any tools and technologis. DisAptD #gids »

Harpreet Singh: Great session by anand Add visualisation to data to make it information #gids »

Raja Guru T: #gids thing of awesomeness visualization of large data. Lovely session by Anand. Way to go Saltmarch. Loving it. »

Raja Guru T: RT @ujwala: Visualizing big data session is very very interesting. Nicely done. #gids »

Prashanth: #gids data visualization session was amazing »

Subhashish Dutta: At #gids today, awesome visualization of some big data in the Indian context by Anand of Gramener. »

Japesh Thyagarajan: An impressive and fun session from Anand on Visualising Big Data, amazing illustration of Election and Education system , Hats off #GIDS »

Raj: I must say visualization of big data best session of #gids »

Apart from being able to preserve comments, I get to hear of this feedback a lot quicker than on a blog.

What I miss, though, is discoverability. When I blog, search engines index the content for anyone to find. I still get relevant comments on 15-year-old blog posts. That, I suspect, will not be the case even for recorded talks.

But in any case, I’m afraid I will continue blogging less and speaking more over the course of the next few years. Please bear with me until then!

A utilitarian’s apology

April 9, 2014 April 9, 2014 / How I do things / 3 Comments

A couple of years ago, my HTC Explorer’s screen died. I bought a Micromax A50. This triggered a series of reactions prompting this post.

I have many defects. Like most men, I can’t tell colours apart – like the difference between pink and purple – and am constantly corrected by my six-year-old. I can’t hear two people at the same time – or even in-between each other. I can’t find things outside of my narrow field of vision. I can’t recognise faces, and need at least three one-on-one interactions before I place people. (If you ask me “Do you recognise me?” and I say “Yes, of course!”, I’m usually lying.) I can’t place voices on the phone. My memory is terrible – my wife’s learnt to make me write errands on my laptop. I cannot identify cars – in fact, I couldn’t drive until recently.

I also lack a fashion sense, despite being a keen student of design. I can understand rules of thumb, like how large line heights should be, or why high saturation colours are jarring. I can even give passable judgement on the quality of clothing.

The trouble is, I don’t see much value in it. I’m a utilitarian.

This post is an apology from all utilitarians like me. We’re sorry – we just don’t see the point of a Mont Blanc pen or a Cartier watch. Our Bic pens and digital watches work just as well. We’re not saying you shouldn’t buy them. It’s just that we don’t understand why you would.

This is not an argument against expensive items. I bought the iPod and loved it. Same with the iPhone 4. I have two iPads. I’m fairly picky about the earphones I buy. The thing is, I buy these because their value matches the price. Where I don’t see the value, that’s just throwing money away.

So that’s why I travel in buses or autos. I can work on my laptop while someone else drives. That’s why I walk or climb stairs. I get to lose weight without wasting time at a gym. That’s why I don’t wear a watch and don’t subscribe to newspapers or TV.

For my non-utilitarian friends out there, this is from us utilitarians. Please forgive us. We don’t see the value.

Weight lines, again

January 14, 2014 January 14, 2014 / How I do things / 7 Comments

A few years ago, I ended up losing weight, mostly by dieting. That worked out rather well up to a point: I lost about 20 kg rapidly. But I ended up putting it back on almost as rapidly.

What I learnt from this was that dieting made me more short-tempered. It also reduced my metabolic rate. My body would adjust to the hunger and enter a “starvation-mode”, using the limited food ridiculously efficiently. So I’d have to eat even less to continue losing weight.

This time, I’m going to try it the slow way.

Firstly, my targets are moderate. I plan to lose about 1kg every month. (It’ll take me a few years to achieve my target. That’s good – it postpones the time when I’ll say “Ah, I’m thin. I can eat now” and become fat again.)

Secondly, I’m not going to keep myself hungry. I’m just going to try to stop eating when I’m not hungry.

I eat when I’m not hungry for two reasons. One’s because I usually watch a movie when I eat, and keep eating until the movie’s done. The other’s because when I’m between activities, I raid the kitchen. It’s not realistic to pretend that I can curb these tendencies. But it’s possible to be a bit more aware of them.

Stocking the house with healthier foods helps. I also find that some fruits in particular keep my stomach full for longer. They’re low value for money energy-wise, but great for dieting and health.

Thirdly, I’m going to start exercising, but in my own, slow, way. I’m not good at running or going to the gym, but weirdly, I rather like climbing stairs. So the next step is to abandon lifts and only use the stairs.

None of these is a big step. But I’m not in a hurry, and these are more like habits I’d like to get into for the rest of my life rather than short-term measures.

Here’s to a lighter 2014!

Motorbike science lab

October 24, 2013 October 24, 2013 / Links / 1 Comment

My cousin’s working on an interesting project at the Agastya Foundation. A group of scientifically inclined volunteers go around on a bike to schools, taking with them a science lab kit, and show children in rural schools a variety of experiments.

Google will award this and 3 other projects (out of 10) Rs 3 crores based on public votes. You can vote for it and read more at https://impactchallenge.withgoogle.com/india2013#/agastya|vote

Courtesy

September 4, 2013 September 4, 2013 / How I do things / Leave a Comment

We are often subject to body searches, baggage inspections, and identity verifications. At malls. At airports. At offices.

These are to ensure that no one carries ammunition inside, or goods or secrets outside. In other words, to deter terrorists and thieves.

It’s nothing personal, of course. When someone does not know me, I can choose to accept that (or not; the choice is mine).

When I’m invited somewhere, however, I assume that I am not deemed a security threat. Therefore, I expect that:

I and my belongings will not be searched or scanned
I need not leave behind my personal belongings
I need not carry an identity card

Please afford me this courtesy if you are inviting me.

For some months now, I’ve visited many corporate offices. The reception typically consists of security guards, a metal detector and a register. I’m given a tag and an escort.

I’m not fussy. I’m not worried about being greeted, for example. I’m quite happy to plug into a power socket and work on my laptop until logistics are sorted out. But when that happens at the security outpost with no sitting space, or outside the gate in the rain, it inconveniences me.

A few weeks ago, I was in Singapore, and visited a client’s office in slippers. One of them complimented my choice of footwear, and remarked that he had not yet risen high enough on the corporate ladder to afford this luxury. (There’s a series of stories behind my footwear that I’ll get to later.)

That told me something. After a long time, I now can afford this luxury. Especially if someone knows me well enough to invite me to their office.

I hope to point them to this blog post and request that security be arranged so that I can be afforded this small courtesy; be treated with trust rather than as a terrorist or a thief.

(If their organisation’s practice does not permit this, I’m happy to meet outside. Besides, our office is happy to extend warm hospitality.)

Open source in corporates

June 4, 2013 June 3, 2013 / How I do things / 7 Comments

[This is a post that I’d published internally in InfyBlogs in Dec 2009. Time to share it.]

Last month, my first application went live.

I’ve been writing code for 20 years. Not one line of my code has been officially deployed in a corporate. (Loser…)

It’s a happy feeling. Someone defined happiness as the intersection of pleasure and meaning. Writing code is pleasurable. Others using it is meaningful.

But this post isn’t quite about that. It’s about the hoops I’ve had to jump through to make this happen.

I’ve been living in a nightmare since March 2009. That was when I decided that I’d try and get corporates to use open source.

March 2009

It began with a pitch to a VC firm. They were looking to build a content management system (CMS). Normally we’d pull together slides that say we’ll deliver the moon. This time, we put together a demo based on WordPress’ CMS plugins.

The meeting went fabulously well. We said, “Here’s a demo we’ve built for you. Do you like it?” The business lead (Stuart) was drooling and declared that that’s exactly what they wanted. The IT lead (another Stuart) was happy too, but warned the business users: “Just remember: this isn’t how we do development, so don’t get your hopes up that we can deliver stuff like this :-)”

Time to make my point. I asked, “What’s your policy on open source software?”

The business lead went quiet. “I don’t know,” he finally said. Fair enough.

I turned to the IT lead. “Well, we don’t use it as a matter of policy… there are security concerns…” he said.

“Which web server do you use?”

“Oh, OK. I see what you mean. We use Apache. So on a case to case basis, we have exceptions. But generally we have security concerns.”

“Why? Do you believe open source software is more insecure than commercial software?”

He thought about it for a while. “Well… maybe. I don’t know.” We debated this a bit. Then we found the real issue: “It’s just that we don’t have control over the process. We don’t know enough about it to decide.”

A couple of weeks later, I tried pitching to a newspaper. This time, it was our sales team that raised the same question. “But… isn’t open source insecure?”

I didn’t even bother pitching any open source stuff to them. But I’d learnt my lessons:

Demo the application. Don’t talk about it.
Show it to the business first, and then tackle IT.

Aside: June 2009

In June, I got another chance at a client where we were building their new website. The very first thing I did was ask to see the Javascript. Total mess, and filled with browser-incompatible DOM requests. So I went over to their web development team.

“Look, why don’t you guys use a Javascript library? It’ll get you cross browser compatibility and compact maintainable code at the same time.”

And, to their credit, they said, “Sure. Which library?”

I showed them this and we agreed on jQuery. So, if nothing else, I’ve managed to get one open source library into a corporate.

July 2009

I was also looking at payments on the website, and our client was looking to replace their chargeback application. Since I had a week off, I built a working PCI compliant prototype on Django. (I must clarify what I mean by PCI compliant. You see, any application that stores credit card information must pass through a stringent security clearance process. I bypassed the problem by not storing the card information. I’ve realised that I’ve been building PCI compliant applications all my life – and it’s a huge benefit to let people know that.)

This time, I applied the lessons I’d learned, and demo-ed it to the business, who were thrilled. Time to tackle IT.

I started with the architecture team. Matt on the architecture team was the most approachable. So I went over, demo-ed it, and said, “Matt, this took a week to put together. It’s based on some new technologies. Are you game to try these out?”

He was. And quite enthused about it too. So we put together a proposal for the architecture review board, proposing a new technology stack: Django / Python and MySQL. As before, I showed the demo before I talked technology. I had prepared answers to all security related questions upfront (and practically memorised section 3 of the PCI guidelines.) The clincher, though, was the business case. To build it on Java, it would cost ~1,000 person days. On Django, I’d mostly done it in 5. There was no way of justifying 1,000 person days for an application that could save, at best £100,000 a year.

So they said “Go ahead, we’re fine if operations and infrastructure are fine.”

It was time to find a Django developer in Infy. I hunted for a couple of weeks but none was available. (Only 2 people that I knew knew Django in the first place.) So that effort got canned, and we were back to the 1,000 person day solution. (Which got canned too, later.)
But in the process, I’d learned my third lesson.

If you’re trying new technologies, plan on delivering it yourself.

October 2009

Another application popped up that looked like a prime candidate for introducing open source. They were using an Excel application to fraud screen orders, and wanted to make a web app out of it.

I followed the same route as before. Demo it. Show it to business first, then IT. Build it myself. I skipped Architecture, since they’d already approved the technology stack, and took it straight to Infrastructure.

“This application uses Apache as the web server, MySQL as the database, and uses PHP and Javascript for the application logic. Could we get a Linux server to host it?”

Our entire conversation lasted 30 seconds. He said, “No. We use Windows servers” (I was fine)

“… and you’ll need to change Apache to IIS” (fine again)

“… and we don’t support PHP, so it’ll have to be Java or .NET” (I don’t know .NET or Java… but fine)

“… and we don’t support MySQL, it’ll have to be SQL Server” (fine, I guess)

“… and we don’t have DBAs available until January, so you’ll have to wait.” (definitely not good.)

So back to the drawing board on the technology stack. I needed something in Java (I know very little Java, but nothing at all in .NET) and to avoid the DBA headache, it would have to bundle in a database. I first explored key-value stores like CouchDB, Redis, etc. None of them worked on Java. The only one I found that did was Persevere, and it was a JSON data store, which fit perfectly with my plans.

By this time, I’d also learnt my fourth and most important lesson.

Don’t try to promote open source. Just deliver the application.

I said, “This is a custom-built application that runs on Java. Could we get a Windows server to host it?”

The answer was “Yes”, and we had it live the next day.

PS: December 2009

The application’s deployed and running. It has about 10,000 orders fraud screened by now.
And the lessons are well learnt. So when someone came over asking if there was any image resizing solution I knew of, I said: “Sure, who’s your business sponsor?” Then I went over and said, “Let me show you this ~~open source~~ application called ImageMagick. It handles aspect ratios correctly, and can crop too. Doesn’t this look professional?” Then I went over to IT and said, “~~It’s open source, so you can change it.~~ It has Java bindings, so you can integrate it into your environment. It can handle 8 3000×2400 images a second on my puny laptop. It’s used by your competitors. And I can build it for you if you like.”

I might just have my second open source entry into a corporate this year.

The scary Internet

June 3, 2013 June 3, 2013 / How I do things / 4 Comments

I’m not that difficult to scare, and this log message certainly didn’t help:

ip223.hichina.com [223.4.183.127] failed - POSSIBLE BREAK-IN ATTEMPT!

That’s the message I saw – one thousand five hundred and seventy times yesterday in /var/log/auth.log on one of my Amazon EC2 instances.

Someone, presumably from China, has been patiently trying out a variety of SSH keys to log into this system.

These came in batches. There were exactly 314 attempts at 8am yesterday, then 314 at 12 noon, then 314 at 4pm, then 314 at 8pm, then 232 at 3am today. (All times are in UTC – that is, UK time without daylight saving.) Every burst took 9 minutes to run through all 314 attempts.
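A tally like this can be produced with a few lines of Python. The following is a minimal sketch, not the method used for the numbers above: it assumes syslog-style timestamps at the start of each line, and the sample lines, the host name `host` and the function name `attempts_per_hour` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape: "Jun  2 08:00:01 host sshd[123]: ... POSSIBLE BREAK-IN ATTEMPT!"
LINE = re.compile(
    r"^(\w{3})\s+(\d{1,2}) (\d{2}):\d{2}:\d{2} .*POSSIBLE BREAK-IN ATTEMPT"
)


def attempts_per_hour(lines):
    """Count break-in warnings, keyed by (month, day, hour)."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            month, day, hour = match.groups()
            counts[(month, int(day), int(hour))] += 1
    return counts


# Hypothetical sample lines, for illustration only
WARNING = (
    "reverse mapping checking getaddrinfo for ip223.hichina.com "
    "[223.4.183.127] failed - POSSIBLE BREAK-IN ATTEMPT!"
)
sample = [
    "Jun  2 08:00:01 host sshd[101]: " + WARNING,
    "Jun  2 08:00:03 host sshd[102]: " + WARNING,
    "Jun  2 08:05:00 host sshd[103]: Accepted publickey for ubuntu",
    "Jun  2 12:00:02 host sshd[104]: " + WARNING,
]

for key, count in sorted(attempts_per_hour(sample).items()):
    print(key, count)
```

To run it against a real log, pass an open file such as `open("/var/log/auth.log")` instead of `sample`; the function only needs something that yields lines.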

The worst part was, when I tried using SSH this morning, I wasn’t able to log in. (It turned out that I had made a configuration error, but this is the sort of thing that gets me quite worried.)

Perhaps I shouldn’t be complaining. I’ve written enough scrapers to make most webmasters cringe at their logs. I remember a few years ago, when I was working on a project at Tesco, and was scraping bestsellers lists from most sites. (Here’s a blog post about it.) We were putting together a prototype to see how real-time competitive pricing could help.

The scraper was a pretty mild one. It would visit a hundred links, roughly at the pace of one a second. No images were loaded, of course, just the HTML.
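A scraper of this kind can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the original Tesco script: the names `fetch_html` and `crawl` are hypothetical, and the one-second pause is taken from the description above.

```python
import time
from urllib.request import urlopen


def fetch_html(url):
    """Download only the HTML of a page; images and scripts are never requested."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def crawl(urls, fetch=fetch_html, delay=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing `delay` seconds between requests."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # roughly one request a second
        pages[url] = fetch(url)
    return pages
```

Passing `fetch` and `sleep` as parameters is a design choice that lets the loop be tried out without touching the network or waiting in real time.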

One fine day, a few weeks after this had started, I got a call from Andy.

“Hi Anand, are you running any scrapers on our books website?”

“Yes, why?”

“Oh! The site’s very slow. Could you shut it down immediately?”

Turns out that not a single page on the site loaded, and it had almost crawled to a halt. Now, obviously, my little 100-page script could hardly cause damage, but it’s easy to understand their reactions. No unauthorised scraping! After a few days of trying to figure out what the problem was, they increased the memory and things went back to normal. Not a bad solution, actually – throw hardware at the problem, and if it vanishes, it’s probably the cheapest solution.

But anyway, I’m sure it’s some nice chap who’s just curious to know what I’ve got on my servers. I’d be happy to share some of it. And even if it’s not so nice a chap, there’s little that I can do, is there?

Update (1pm India, 3rd June): Actually, I now realise that this has been happening every four hours since May 29th, as regular as clockwork. Wish I knew enough UNIX programming to pull a prank…

Hosting options

June 1, 2013 June 1, 2013 / How I do things / 5 Comments

I’ve been trying out a number of options for hosting recently, and have settled on Amazon spot instances.

Here were my options:

Application hosting, like Google AppEngine. I used this a lot until 2 years ago. Then they changed their pricing, and I realised what “lock-in” means. I can’t just take that code and move it to another server. Besides, I’m a bit wary of Google pulling the plug. Heroku? Same problem. I just want to take the code elsewhere and run it.
Shared hosting, like Hostgator. This blog is run on Hostgator and I’m extremely happy with them. But the trouble is, with shared hosting, I don’t get to run long-running processes on any ports I like.
Run your own servers. The problem here is quite simple: power cuts in India.
Dedicated hosting, like Amazon EC2, Azure, GCE, etc. This remains pretty much the main hosting option.

I’m a price optimisation freak. So I ran the numbers for a year’s worth of usage. I was looking at the CPU cost of a large machine with 7-8GB RAM. Bandwidth and storage are negligible. The cost per hour worked out to:

Amazon: $0.32 / hr in Singapore, $0.24 in Virginia
Google: $0.29 / hr in Europe
Microsoft: $0.32 / hr in US

The price is not all that different, but I need low latency, so Singapore is what it’ll have to be.

EC2 location	Latency (ms)
Singapore	139
Oregon, US	334
Japan	517
Ireland	618
Australia	620
California, US	677
Virginia, US	710

Now comes the choice of the right model. At $0.32 per hour, that’s $230 a month.

Amazon offers some ways of getting this down. Instead of on-demand instances, I could go for reserved instances. For a year of usage, that’d get the price down to about $131 a month, nearly halving it. ($739 upfront for a heavy utilisation large reserved instance, with $0.095 * 24 * 365.25 for the year.)

In this case, I know I’ll need the servers for a year. Probably more, but then, I might want to switch later. So this isn’t a bad move. But we can do better. Amazon also offers spot instances. Spot instances might get shut down any time – but in reality, so can on-demand instances. I need to plan for it anyway. I’m not going to host anything that’s so sensitive that if it’s down for a few hours, I’ll have a problem.

But what’s attractive is the pricing. Typically, it’s $0.04 per hour, making it about $29 per month. Even if it shoots up to twice that, at $58, it’s about a fourth of the on-demand price and less than half the reserved instance price.
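The monthly figures above can be checked with a short calculation. This is a minimal sketch using only the rates quoted in this post; the helper name `monthly_cost` is made up for illustration. It averages over a 365.25-day year, so the on-demand figure comes out slightly higher than the round $230, which corresponds to a 30-day month.

```python
HOURS_PER_YEAR = 24 * 365.25


def monthly_cost(hourly_rate, upfront=0.0):
    """Average monthly cost over one year, spreading any upfront fee across it."""
    return (upfront + hourly_rate * HOURS_PER_YEAR) / 12


on_demand = monthly_cost(0.32)              # about $234
reserved = monthly_cost(0.095, upfront=739)  # about $131
spot = monthly_cost(0.04)                    # about $29

print(f"On-demand: ${on_demand:.2f} / month")
print(f"Reserved:  ${reserved:.2f} / month")
print(f"Spot:      ${spot:.2f} / month")
print(f"Spot at twice the price vs on-demand: {2 * spot / on_demand:.2f}")
print(f"Spot at twice the price vs reserved:  {2 * spot / reserved:.2f}")
```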

I’ve managed to script the entire setup sequence as shell scripts, and it takes less than an hour to get a new server up and running the software I need. I need to work out a decent backup mechanism. Plus, I could use more reliable storage like Amazon’s EBS to preserve the data. But on the whole, the pricing is far too attractive and makes the risks worthwhile.

Visualising networks

May 11, 2013 May 11, 2013 / How I do things / Leave a Comment

Some slides from my talks on visualising networks. (These are part of a series of talks I’m giving at a number of forums; the one at The Fifth Elephant is open to the public.)

Geocoding in Excel

May 8, 2013 May 8, 2013 / Coding / 2 Comments

It’s easy to convert addresses into latitudes and longitudes in Excel. Here’s the Github project with a downloadable Excel file.

This is via Visual Basic code for a GoogleGeocode function that geocodes addresses.

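' Requires a reference to a Microsoft XML library (Tools > References in the VBA editor)
Function GoogleGeocode(address As String) As String
    Dim xDoc As New MSXML2.DOMDocument
    Dim lat As String, lng As String
    ' Load synchronously so the response is ready on the next line
    xDoc.async = False
    ' Note: the address is passed as-is, without URL encoding
    xDoc.Load ("http://maps.googleapis.com/maps/api/geocode/" & _
        "xml?address=" & address & "&sensor=false")
    If xDoc.parseError.ErrorCode <> 0 Then
        ' Return the parser's error message instead of coordinates
        GoogleGeocode = xDoc.parseError.reason
    Else
        ' Take the first <lat> and <lng> elements in the response
        xDoc.setProperty "SelectionLanguage", "XPath"
        lat = xDoc.SelectSingleNode("//lat").Text
        lng = xDoc.SelectSingleNode("//lng").Text
        GoogleGeocode = lat & "," & lng
    End If
End Function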
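The same two steps, building the request URL and picking the first lat and lng out of the XML, can be sketched in Python so they can be tried outside Excel. This is an illustrative sketch only: the function names `geocode_url` and `first_lat_lng` are made up, the sample response and its coordinates are hypothetical, and no network request is made. Unlike the function above, the sketch URL-encodes the address.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Same endpoint as the VBA function above
BASE_URL = "http://maps.googleapis.com/maps/api/geocode/xml"


def geocode_url(address):
    """Build the request URL, URL-encoding the address."""
    return BASE_URL + "?" + urlencode({"address": address, "sensor": "false"})


def first_lat_lng(xml_text):
    """Return "lat,lng" from the first <lat> and <lng> elements, or None."""
    root = ET.fromstring(xml_text)
    lat = root.find(".//lat")
    lng = root.find(".//lng")
    if lat is None or lng is None:
        return None
    return f"{lat.text},{lng.text}"


# Hypothetical, trimmed-down response used only to exercise the parser
SAMPLE_RESPONSE = """
<GeocodeResponse>
  <status>OK</status>
  <result>
    <geometry>
      <location><lat>12.9756</lat><lng>77.6050</lng></location>
    </geometry>
  </result>
</GeocodeResponse>
"""

print(geocode_url("MG Road, Bangalore"))
print(first_lat_lng(SAMPLE_RESPONSE))
```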
