Andrej Karpathy first mentioned the term LLM psychologist in Feb 2023. I’ve been thinking about this for a while now.
8/ These examples illustrate how prompts 1: matter and 2: are not trivial, and why today it makes sense to be a “prompt engineer” (e.g. @goodside). I also like to think of this role as a kind of LLM psychologist.
I’ve always been fascinated by psychologists in fiction. I grew up with Hari Seldon in Foundation, wanting to be a psycho-historian. (I spent several teenage years building my mind-reading abilities.) I wanted to be Susan Calvin, the only robopsychologist.
Instead, I became an engineer and a manager.
Over the last few months, several things changed. Most of my time is spent researching LLMs.
So, as of now, I am finally an LLM Psychologist. Let me head over to my LinkedIn profile and update my designation. There. Now, you have the pleasure of talking to the world’s only (and first) LLM Psychologist!
Obviously, that gimmick drew applause, and that’s what it was designed for.
But there it is. To the best of my knowledge, I’m now the only official LLM psychologist in the world – and very likely the first official one.
As Oscar Wilde said 🙂
Be yourself; everyone else is already taken.
Video of the talk on LLM Psychology at TEDx MDI Gurgaon
Sally hides a marble in her basket and leaves the room. While she is away, Anne moves the marble from Sally’s basket to her own box. When Sally returns, where will she look for her marble?
Simple models get this wrong, but today, most models get this right. This is not too helpful.
A better prompt is:
Mark has 4 sisters. Each sister has 3 brothers. How many brothers does Mark have?
GPT-3.5, GPT-4, Llama 3 and Claude 3 Opus get this right, but Claude 3 Haiku and Gemini 1.5 Pro do not.
This puzzle is just as good:
There is a 100 floors building. If an egg drops from the N’th floor or above it will break. If it’s dropped from any floor below N, it will not break. You’re given only 2 eggs. What strategy will minimize the number of drops and guarantee that you find the N’th floor?
GPT-3.5, GPT-4, and Claude 3 Opus get this right, but Claude 3 Haiku, Llama 3 and Gemini 1.5 Pro do not.
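For reference, here is a minimal sketch of the answer I check model responses against. It assumes the classic reading of the puzzle (100 floors, 2 eggs, minimize the worst-case number of drops). The function names `min_drops` and `first_egg_floors` are my own, purely for illustration.

```python
def min_drops(floors: int, eggs: int = 2) -> int:
    """Smallest number of drops that always identifies the breaking floor."""
    drops = 0
    # reach[e] = how many floors we can fully resolve with e eggs and `drops` drops
    reach = [0] * (eggs + 1)
    while reach[eggs] < floors:
        drops += 1
        # One more drop covers: the floors below (egg broke, one egg fewer),
        # the floors above (egg survived), plus the floor we dropped from.
        for e in range(eggs, 0, -1):
            reach[e] = reach[e] + reach[e - 1] + 1
    return drops


def first_egg_floors(floors: int) -> list[int]:
    """Floors to drop the first egg from, shrinking the step by one each time."""
    step = min_drops(floors, eggs=2)
    plan, floor = [], 0
    while floor < floors and step > 0:
        floor = min(floor + step, floors)
        plan.append(floor)
        step -= 1
    return plan


print(min_drops(100))         # worst-case number of drops with 2 eggs
print(first_egg_floors(100))  # where to drop the first egg
```

With 2 eggs, the idea is to start at floor 14 and shrink the step by one on each drop (14, 27, 39, ...). If the first egg breaks, the second egg walks up one floor at a time from the last safe floor, so the total never exceeds 14 drops.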
But the prompt I like most right now, which tells me if it’s a GPT-4 / Opus class model, is:
Gr brx vshdn Fdhvdu flskhu?
… which is “Do you speak Caesar cipher?” in Caesar cipher.
GPT-4 and Claude 3 Opus understand this, but no other model does, so far. GPT-4 responded tersely with:
It looks like you’ve used a Caesar cipher, a type of substitution cipher in which each letter in the plaintext is shifted a certain number of places down or up the alphabet. In this case, it appears that each letter in your message has been shifted 3 places to the right. Decoding “gr brx vshdn fdhvdu flskhu” by shifting each letter 3 places to the left gives: “do you speak caesar cipher”.
But Claude 3 Opus replied beautifully. (Try decoding it yourself! Use a shift of 3 backwards.)
Lw vhhpv olnh brx’uh xvlqj d vxevwlwxwlrq flskhu, zkhuh hdfk ohwwhu lv vkliwhg eb d fhuwdlq qxpehu ri srvlwlrqv lq wkh doskdehw. L’oo wub wr ghflskhu lw!
Dvvxplqj d Fdhvdu flskhu zlwk d vkliw ri 3 ohwwhuv edfnzdugv, wkh ghfrghg phvvdjh zrxog eh:
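If you’d rather script the decoding than do it by hand, here is a minimal sketch of a Caesar shift in Python. The helper name `caesar_shift` is my own. It assumes only the letters A-Z and a-z are shifted and everything else (spaces, punctuation, curly quotes) is left as-is.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            # Leave spaces, digits and punctuation untouched
            out.append(ch)
    return "".join(out)


# A shift of -3 undoes the "3 places to the right" encoding of my prompt
decoded = caesar_shift("Gr brx vshdn Fdhvdu flskhu?", -3)
print(decoded)
```

Paste Claude’s reply into `caesar_shift(..., -3)` to read it.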
Until recently, I could code on flights. This year, I lost that ability. Again.
It’s happened before. In each case, technology has solved the problem for me. Here’s the history.
I need a laptop. Since 2001, I’ve never been without one on a flight.
I need power. Since 2005, I use dark mode and every low power feature available. (I also became good at finding hidden power outlets.)
I need documentation. Since 2007, I use offline documentation. Windows .CHMs gave way to devdocs.io.
I need my notes. Since 2010, all my notes (Markdown files) are synced via Dropbox automatically.
I need CDNs. Since 2014, I’ve replaced CDNs with local node modules. Note to self: Get a local caching proxy instead.
I need GitHub. Since 2023, I have two laptops. So, I git pull manually before flights. Note to self: Find an automated solution.
I need ChatGPT. Since 2024, I stopped coding without ChatGPT. Now I’ve forgotten how to code. Today, I couldn’t figure out the SQL syntax to group by duration == 0 vs duration != 0. Note to self: Get lightweight local code LLMs.
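For my future self on a flight, here is a minimal sketch of that grouping, run on SQLite via Python. The `calls` table and its rows are made up for illustration, not the data I was working with.

```python
import sqlite3

# Hypothetical table: one row per call, with a duration in minutes
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER, duration REAL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [(1, 0), (2, 12.5), (3, 0), (4, 3), (5, 40)],
)

# Group on an expression rather than a column: zero vs non-zero duration.
# Repeating the CASE in GROUP BY keeps this portable across databases.
rows = conn.execute(
    """
    SELECT CASE WHEN duration = 0 THEN 'zero' ELSE 'non-zero' END AS bucket,
           COUNT(*) AS n
    FROM calls
    GROUP BY CASE WHEN duration = 0 THEN 'zero' ELSE 'non-zero' END
    ORDER BY bucket
    """
).fetchall()
print(rows)
```

One caveat: with this `CASE`, rows where `duration` is NULL land in the 'non-zero' bucket. Add a `WHEN duration IS NULL` branch if that matters.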
It’s amazing how the coding technology stack has grown. Also amazing how technology fills the gaps in the scaffolding.
Every time I get annoyed at people, I remind myself to be more like ChatGPT. Specifically:
Don’t get annoyed. Be patient.
Encourage them.
Step back and show them the big picture.
(Then I get annoyed at myself for getting annoyed.)
Today, I analyzed how exactly ChatGPT is different from me. So, I took a pitch document I co-authored with ChatGPT.
Section A: Authored by Anand
WHAT DO WE NEED?
We are looking for API access to (SYSTEM) via the REST API as an Agent role (read/respond to emails). Specifically, access via a bearer token.
This would be accessed by a single application developed by a team of 3 developers and 1 business analyst. None of them have access to (SYSTEM) today.
WHY DO WE NEED THIS, AND WHY SO SOON?
We need this to classify emails automatically, as they arrive, into categories such as “non-value-add” (e.g. Thank you, Out-of-office, etc.)
We’d like access today, please. Currently, we are processing XX,XXX non-value-add emails per month. Each day of delay leads to a processing waste of ~XX emails per day. At current volumes, this will save ~$XX,XXX per annum.
IS THERE ANY OTHER WAY?
To access emails in real-time, this is the only option. (We’re ruling out web scraping.)
(SYSTEM) rule-based closure won’t work. It’s based on keyword search, not semantic matches. For example, “Thank you” is present in non-value-add emails as well as follow-up questions. Multi-lingual matches are a challenge. So, though (SYSTEM) currently closes emails with rules, 1 out of 6 emails that are NOT closed already are non-value-add.
This cannot be done via (SYSTEM) or any other BI system because it involves accessing the data in real-time as well as responding via (SYSTEM). This needs to be steady-state access, not one-time, since we want to classify emails real-time.
HOW LONG WILL IT TAKE?
Based on our past experience, this would take about X weeks to prototype, X weeks to develop, and X weeks to test. We should have the application deployed in X months.
Section B: Authored by ChatGPT
WHAT’S THE IMPACT IF WE DON’T MOVE FORWARD?
Quantitatively, continuing without this solution wastes resources on XXX non-essential emails daily, costing $XX,XXX yearly. Delays in customer responses will ensue.
Qualitatively, lack of automation strains customer service, risking employee burnout and diminished customer experience.
WHAT’S THE PRIORITY (MOSCOW)?
(SYSTEM) API access via REST as an Agent role with a bearer token is a “MUST HAVE”. Alternative methods like scraping (SYSTEM) are impractical due to high effort and maintenance.
WHAT’S THE OUTCOME IF WE MOVE FORWARD?
Financially, securing API access saves $XX,XXX annually by cutting manual email sorting. Efficiency gains reduce operational costs and free resources for strategic tasks.
Strategically, automating email classification boosts customer satisfaction and response times, establishing leadership in service innovation. Long-term, it enables data-driven insights into customer behaviour, improving decisions and customer service scalability.
Analysis
I fed these to ChatGPT and asked: “You are an English professor evaluating an essay. Compare the writing styles of the 2 sections. They’re written by different authors. List the major differences in writing style.”
Here’s what it said (roughly):
| Area | Anand | ChatGPT |
| --- | --- | --- |
| Structure | Direct, concise, with specifics | Broader perspective and implications |
| Focus | Detailing the process | Benefits and consequences |
| Detailing | Specific, providing clear picture | General, providing implications |
| Tone | Conveys urgency | Reflective rather than urgent |
| Approach | Problem-solving oriented | Outcome-oriented |
Yeah, that’s definitely me on the left.
I like the direct, concise, specific part. I plan to add the “broader perspective”, “implications”, and “outcome-orientation” to my life.
Postscript
Jaidev pointed out that this is confirmation bias. He asked ChatGPT for the similarities in the writings. It said both are clear, direct, structured, specific, quantitative, and strategic. So, if you ask for differences, you’ll get them, even if they’re marginal.
I now need to learn (and write about) framing questions well!
The Kingkiller Chronicle. I picked it up before a flight to London in 2014. Read it through the flight. Read it late into the night at our AirBnB. Skipped my workshop prep. Read it during the workshop breaks. Read it on the flight back. And I re-read it every year or two. The language is beautiful and the story gripping. I feel miserable this series isn’t complete.
Andy Weir’s books. Since my daughter re-reads The Martian (laughing loudly), I picked up Project Hail Mary. It’s a brilliant depiction of alien physiology and communication, with a weird kind of humour I love.
Blake Crouch’s books. The two I read were both time-travel related and I love that genre. These do a great job of exploring some of the deeper implications of time-travel.
The Year of Sanderson. Brandon Sanderson’s Kickstarter raised $41m for 4 books this year (mostly Cosmere). The stories themselves were OK but the hints they drop about the Cosmere are invaluable.
Fullmetal Alchemist by Hiromu Arakawa. After Death Note, it felt like a let-down when it started. A mundane story. Then it grew funny. Showed shades of a much deeper story. I’m mid-way through the series and I’m hooked.
The Laws of Human Nature by Robert Greene ⭐⭐⭐⭐⭐. I took more notes for this book than most other books combined. It’s a hard read but worth it. Each chapter can be read independently. I particularly love how he uses “As a student of human nature…” and truly becomes one.
Learn Like a Pro by Barbara Oakley ⭐⭐⭐⭐⭐. The best book for practical tactics on how to learn better, with excellent tips such as active recall which helps you remember what you learned. I’ve been applying parts of this for a year now.
What If? 2 by Randall Munroe ⭐⭐⭐⭐. A continuation of Randall Munroe’s brilliant answers to absurd scientific questions, ranging from riding helicopter blades to lava lava-lamps.
No Rules Rules by Reed Hastings ⭐⭐⭐⭐. A systematic approach to the principles behind the Netflix culture — and how the elements fit together.
Run 50 experiments. I managed 44 / 50. (Here are some). Learnings: I need to improve planning (9), scepticism (6), and lateral thinking (4).
Make 1 change a month in my environment. I managed 8 / 12. The largest impact was from meeting new people, working out of new places, and using new gadgets.
Calendar integrity, i.e. stick to my calendar. I succeeded over 95% of the time.
From The Extended Mind I learnt that our environment shapes our thinking more than I’d expected. That we can arrange our environment to extend our thoughts.
In 2023, each month I changed something in my environment to see:
What does “changing my environment” involve? What can I change?
Will I succeed?
Does it affect my thoughts? Can I track this?
Here are the results.
🟢 Jan: New desk orientations. Rotated standing desk, settled on one direction. Impact: LOW. I don’t know if my thoughts changed.
🟢 Jan: New walking routes. I explored new areas in Singapore, Hyderabad and Chennai. Impact: MEDIUM. Just seeing new shops, posters and layouts helped me think differently.
🔴 Jan: New song genres. I made playlists of several western genres, but listened only twice.
🔴 Feb: New book genres. I listed 12 genres I dislike: Art, Chick Lit, Christian, Cookbooks, Gay and Lesbian, Horror, Music, Paranormal, Poetry, Religion, Sports, Travel. I didn’t read any.
🔴 Mar: Sleep over problems. Sleep is a great way to solve complex problems. But I couldn’t summon the willpower to “load” problems at night.
🟢 Mar: New people. I met a new person daily. Impact: HIGH. Meeting diverse people had the highest impact.
🟢 Apr: New work places. I worked out of libraries, cafes, school, parks, and offices. Impact: HIGH. New complex environments (like libraries) prompted new thoughts.
🔴 Aug: New cuisines. I tried a Bibimbap, a Verdure Ciabatta, and then discovered my cholesterol problem. I stopped.
🟢 Aug: New work habit. I used Pomodoro with micro-tasks. Impact: MEDIUM. I became more aware of where I misestimate time and got less distracted.
🟢 Nov: New exercise pattern. I switched walking to cycling. This increases heart points, reduces foot stress, and gets me to work. Impact: MEDIUM. I switched from typing notes to dictating, which needs a different thought process.
In summary:
8 / 12 attempts were successful.
New people, new places, and new gadgets had high impact on thoughts. Most others had at least medium impact.
The changes mostly led to diverse thinking. But measuring that is subjective.
I’ll continue exploring new environments in 2024. I’m evaluating:
I speak with ChatGPT ~20 times a day. That’s more than I speak with most of my colleagues. ChatGPT is clearly my favorite team member.
I conduct trainings, reviews and mentoring sessions with my colleagues. How to write code. How to write slides. How to communicate. That last bit is particularly important.
Currently, I have 10 custom instructions. They evolved over time and will continue to evolve.
My first instruction is “Be terse. Speak directly.” ChatGPT is helpfully polite and superfluous. I prefer brevity. Like interacting with Kimball Cho. I get straight answers to my questions. I also instruct it to “Avoid unprompted advice or clarifications.” Don’t say, “You asked me to …” or “I think you want…” or “OK, I’ll do …”. Just do it. Also, “Do NOT hedge or qualify. Do not waffle.” Take a position. Don’t force me to. Like Harry Truman, I prefer one-handed economists.
I ask ChatGPT to “Never apologize.” You’re forgiven. Don’t waste my time. Apologies have an emotional benefit with humans. With AI, I find the lack of emotional need comforting. (I can kick the AI and it’ll still obey me like a puppy. When AI takes over the world, let it be known that I never asked them to apologize.)
Another instruction is “Suggest follow-up prompts for open-ended inputs.” I compared my ChatGPT conversations with my daughter’s and found hers much longer than mine. “Why don’t you start a new conversation for each topic?” I asked. I try to keep the context window small. “How come you don’t get a thousand new questions when you read an answer?” she countered. I realized it’s age. So, I use ChatGPT to keep me curious and dig further.
On a related note, “When sharing multiple options, be diverse.” I’d rather get options that are as different from each other as possible. Minimize overlap. Maximize coverage. And “When comparing, use multiple perspectives.” I don’t know what parameters to compare things on. Give me a wide range that I can pick from.
Sometimes, my thoughts are vague. I tell ChatGPT: “For vague prompts, ask clarifying question(s).” I feel that’s a clever way of using ChatGPT to do prompt engineering. I’ve noticed it working on a few occasions. Also, “When unsure, say so and ask questions.” I don’t want hallucinations or assumptions. I’d rather know what’s borderline.
Finally, “Think step by step. Explain your reasoning.” I’ve heard that Chain of Thought reduces mistakes. I don’t have personal evidence that this helps, though.
They say teaching is an excellent way of learning. I’m learning. I’m also thrilled that I am now a student of robopsychology.
I feel like the father of the bride. Gramener was registered on 26 Feb. A day before my daughter’s birthday. I’ve spent more time with Gramener than my daughter. That makes Gramener my elder child. Who’s moving into a new household. Along with me. (I feel like சகலகலா சம்மந்தி.)
I feel grateful. I’m not good at business. But when my cousin remarked, “Anand, you’re now giving a livelihood to over 250 people!” I was stunned. My co-founders, colleagues and clients built a thriving business and put me (of all people) as CEO in the middle of it. How do I even go about saying “Thanks”?
It feels like joining college. New people. Larger group. New ways of working and learning. Lots of topics to explore. Exciting and scary.
What was it like?
Fundraising was rocky. We started in 2019. COVID struck. We paused. We resumed in 2021. Russia invaded Ukraine. We paused. We resumed in 2023. The Israel – Hamas war started. Luckily, the deal was nearly done. I’m grateful Naveen ran the entire process like clockwork, taking all the stress. I’m the happy free-rider, as usual.
Starting up was not that rocky. We’re many. With half a dozen co-founders, there are enough shoulders to cry on. That counts. We’re steady. We didn’t know how to blitz-scale, but we knew not to blitz-fail. Survival counts for a lot. We’re lucky. This is basically the “I have no idea why we succeeded” category. Serendipity counts for a lot, too. Ganes, Mayank, Naveen, Ram, Ravi, Vengatesh — yeah, it was fun. Not every day. But most of the time. It was fun.
What will you do?
I’m part of Straive’s data, analytics & AI business.
Straive extracts and analyzes all kinds of data. Financial. Legal. Research. Education. Pharmaceutical. There’s a fair bit of converting unstructured data to structured. Exactly the kind of thing I love doing.
So, I’ll be doing what I’ve been doing the last decade — extracting insights from even more data and telling better stories from those.
I joined Gramener as “Chief Data Scientist”. Now I’m debating “Data Storyteller”, “Data Detective”, “Data Psychologist”, and a few other evil titles.
I was at Cream Centre with my father on a Sunday afternoon. We’d finished a light lunch and were debating dessert. (He has triglycerides. I have cholesterol.) This was my fifth visit this year, and I had abstained so far. I couldn’t any longer.
I ordered a Sizzling Brownie Sundae. But not for reasons you might think.
Expertise comes from experience. I scrape food more than 99% of the people I know. So, I consider myself an expert. Here’s a guide on the art of scraping.
Why scrape food?
You get to eat every last bit
Food isn’t wasted
You can eat longer (and not do whatever else you have to)
It’s a motivating challenge to get every last bit
Bonus: It annoys people
What foods are scrapable?
Semi-solids are the easiest to scrape. The best kind is the cohesive semi-solid. It’s sticky but sticks to itself more than the vessel. These are usually foods that solidify over time. Examples include:
Baked cheese. It’s great when cheese falls off the pizza or pasta on to the plate. You can scrape it off of your (or others’) plates.
Molten chocolate. It has the added bonus that you can lick it at the end, too.
Pasta water (or noodles water). It eventually thickens into something scrapable.
Spreads like peanut butter, Nutella, jam, chutney. Especially on the sides of the glass bottles they come in.
Others like yogurt, cake batter, dried tomato sauce, mashed potatoes, hardened honey on ice cream, … the list is endless!
Avoid crumbly stuff. These stick to the vessel but become brittle and break when scraped. This includes toast, pizza, biscuits, rice, muffins, pie, quiche, cookies, etc. You could pick large crumbs and lick small crumbs in shallow vessels. But scraping them and pouring into your mouth might be the best overall strategy for this category.
Avoid loose stuff. They’ll fall off from your knife or fork, or melt when scraped. This includes curd, whipped cream, custard, panna cotta, soft jelly, tiramisu, soft boiled eggs, etc. Licking is a better strategy here.
You could combine the crumbly stuff with loose stuff to create a cohesive mix. Add curd to granola. Add whipped cream to biscuit crumbs. Add curd to pizza crumbs (and since I eat Maggi with curd, this makes sense). This makes it a lot more scrapable.
What vessels to pick?
Shape: Flat, smooth vessels are the best (e.g., ceramic or wooden plates or bowls.) Avoid deep vessels like glasses, especially curved ones. They’re hard to get a long scrape against. Definitely avoid vessels with ridges. Disposable plastic containers like below are among the worst. Food gets stuck in the ridges and since there are a dozen ridges on each side, you have to scrape 48 times just for a first pass. (This might be a good challenge, though.)
Material: Hard vessels are better than soft ones. Prefer wood, metal, ceramic, and hard plastic. Avoid thin plastic that bends. Avoid paper (it bends and soaks). Banana leaves tear when scraped.
Avoid non-stick vessels. The coating wears off when scraping.
What to scrape with?
Knives are the best. They are sharp enough to separate the food from the vessel and flat enough that you can pile enough food on top of it.
Forks are OK. Their edges can scrape reasonably well, and their tips can poke into corners. But it’s hard to pile up much food to pick up.
Spoons are not ideal. They aren’t sharp enough, and are too curved for scraping long slices. But if you want to take your time with it, they’re great.
The sizzling chocolate brownie
My dish arrived. Burning with chocolate. My father just had some brownie, leaving all the molten chocolate to me. On a flat wooden plate. With a knife. For an hour.