April 2025

O3 Is Now My Personalized Learning Coach

I use Deep Research to explore topics. For example:

But these reports are very long. With O3 and O4 Mini supporting thinking with search, we can do quick research, instead of deep research. One minute, not ten. One page, not ten.

For example, I asked O3 to automate my weekly tech learning process:

From the links at https://www.hntoplinks.com/week pick those that would be of interest to me.
Read the articles and explain the key insights — what’s non-obvious and how can I use it.
When sharing how I can use it, research alternate approaches and suggest which approach is best.

This brings two new capabilities I love.

Memory. I said, … that would be of interest to me. Given my past chats and context, it really knows me. For example, the reasoning trace shows:

I’m narrowing down the list for Anand, focusing on his interests like generative AI, open-source tools, and minimal libraries.

That’s exactly right!

Long reasoning. It reads each site step by step, unsupervised. Here’s a brief reasoning trace:

Searched the web
Got the Unsure Calculator article. Now, onto the next one: the Hackable AI assistant with SQLite cron jobs. I’ll open it and dive into the details. I think this will be a great fit for Anand, considering the focus on AI tools and development. It should provide some interesting insights!
Searched the web
Now that we’ve moved through a couple of articles, it’s time to tackle the Gemini 2.5 Flash AI model. I think this one will be particularly interesting for exploring advancements in AI. Let me grab it and dive into the details so I can distill the insights into something relevant.
Searched the web
Next up: I’ll open the OpenAI blog article on GPT-4.1 in API. This could bring in some cool technical

The result is fantastic. While on a trouser shopping trip, I learnt more than when I browse Hacker News:

  • “Unsure” is a tool to calculate uncertainty formulas, like the uncertainties Python package.
  • Responses API adds features like linkable history, mid‑chat truncation, flexible reasoning detail, and more.
  • Augment Code is an AI code editor that’s growing popular on Reddit.
  • GPT 4.1’s 75% discounted prompt caching (instead of 50%) gives them an edge on repetitive tasks. OpenAI
  • Nix flakes are a reliable alternative to DevContainers that don’t need Docker – but don’t work on Windows.
  • TLS certificate lifetimes will shrink to 47 days from 15 Mar 2029. Automated certificate renewals are a must. Digicert
  • … and a bunch of other things.

Here are Hacker News summaries for a month or specific days:

This enables:

  1. Personalized learning, i.e. it tells me what I don’t know, and how I can apply it. This is powerful.
  2. Learning on the go, e.g. via voice while cycling or walking.
  3. Learning from untapped sources. This includes: GitHub repos, research papers, open data registries, patent filings, earnings transcripts, subreddits, judgements or acts, open data repositories or collections, and many more.

Take GitHub as an untapped source. I asked O3:

Go through the OpenAI Codex CLI repo on GitHub.
Teach me innovative, new, and surprising techniques or approaches or libraries I might not know about.
For each, explain what is interesting, how I might use it, and how this approach contrasts with alternatives.

This was a treasure trove, too! I learnt about:

I tried O4 Mini High and saw similar results. I felt O3 still gave me more personalized suggestions.

Let’s see what we can learn from pull requests on Codex.

Go through all the pull requests that have been merged into OpenAI’s codecs repository on GitHub. Pick the ones that would be most interesting to me. You can group a few if they are very related and give me the top 10 most interesting PRs that would be relevant for me. Also explain why these are relevant to me, how I might use them, and any interesting details about the way in which the PRs were written.

Soon, we won’t just follow a lesson plan — we’ll have lessons built just for us. AI will track how we learn and adapt in real time. It’ll feel like having a personal coach in your back pocket. That future starts now.

This also opens a door to endless curiosity. There’s no limit to what we can explore. Curiosity is the competitive advantage, now.

How to Use the New O4 Mini for Data Visualization

O3/O4 Mini are starting to replace Excel (or Tableau/Power BI) for quick analysis and visualizations. At least for me.

I normally open Excel when I need a fast chart or pivot. For instance, we track outages of our semi‑internal server, LLM Foundry.

To grab the data I ran one line in the browser console:

$$(".lh-base").map(d => d.textContent.trim()).filter(d => d.includes("From"))

This produced lines like:

Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs)
Apr 19, 2025 10:03:15 PM +08 to Apr 19, 2025 10:05:45 PM +08 (2 mins 30 secs)
Apr 19, 2025 09:47:13 PM +08 to Apr 19, 2025 09:49:45 PM +08 (2 mins 32 secs)
Apr 19, 2025 08:49:00 PM +08 to Apr 19, 2025 08:51:51 PM +08 (2 mins 51 secs)
Apr 19, 2025 08:13:02 PM +08 to Apr 19, 2025 08:15:35 PM +08 (2 mins 33 secs)
...
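
As a rough illustration of the structure in these lines, here is a minimal sketch that turns them into CSV rows. The names `parseOutage` and `toCsv` and the CSV column names are illustrative assumptions, not what the model generated. The duration parser also assumes units are spelled "hr", "min" and "sec" (with or without a trailing "s").

```javascript
// Hypothetical helper: parse one outage line into { start, end, seconds }.
// Assumes the "<start> to <end> (<duration>)" layout shown above.
function parseOutage(line) {
  const match = line.match(/^(.+?) to (.+?) \((.+)\)$/);
  if (!match) return null; // skip lines that don't fit the pattern
  const [, start, end, duration] = match;
  // Sum every "<number> <unit>" pair; the unit spellings are assumed.
  const unitSeconds = { hr: 3600, min: 60, sec: 1 };
  let seconds = 0;
  for (const [, value, unit] of duration.matchAll(/(\d+)\s*(hr|min|sec)/g))
    seconds += Number(value) * unitSeconds[unit];
  return { start, end, seconds };
}

// Hypothetical helper: build CSV text from an array of outage lines.
function toCsv(lines) {
  const rows = lines.map(parseOutage).filter(Boolean);
  // Quote the timestamps because they contain commas.
  const body = rows.map((r) => `"${r.start}","${r.end}",${r.seconds}`);
  return ["start,end,duration_seconds", ...body].join("\n");
}

const sample = [
  "Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs)",
  "Apr 19, 2025 10:03:15 PM +08 to Apr 19, 2025 10:05:45 PM +08 (2 mins 30 secs)",
];
console.log(toCsv(sample));
```

Keeping the duration in seconds makes it easy to scale circle sizes later without re-parsing the text.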

Then I told O4-Mini-High:

Here are downtimes for llmfoundry.straive.com.
Convert this to CSV and allow me to download it.
Also, draw the downtimes on a grid, rows=hour of day, columns=date,
cell contains 1 circle per outage in that time period,
size of each circle is based on the duration of the outage.

Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs)
Apr 19, 2025 10:03:15 PM +08 to Apr 19, 2025 10:05:45 PM +08 (2 mins 30 secs)
Apr 19, 2025 09:47:13 PM +08 to Apr 19, 2025 09:49:45 PM +08 (2 mins 32 secs)
… (rest of the data – about 50 rows)

Here’s the power of what a model like O4 Mini High can do.

  1. It can reason. So, it planned an approach. (Convert to CSV, transform into date and hour, create a grid-based plot, use a pandas DataFrame, save it to a CSV, etc.)
  2. It can code. It is pretty good at coding, and this is not too hard a problem, so it got the code right in one shot.
  3. It can run code. This is a powerful step. It executed the code and produced the visualization above.
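
The plan in step 1 boils down to bucketing each outage by date and hour of day. The model did this with pandas; purely as an illustration, here is the same bucketing idea as a JavaScript sketch. The names `startDateHour` and `bucketByDateHour` and the record shape are hypothetical, and the regex assumes the timestamp format shown in the data.

```javascript
// Hypothetical sketch: group outages into grid cells keyed by date and hour.
// Timestamps are parsed with a regex (not Date) so the +08 wall-clock hour is kept.
function startDateHour(start) {
  const m = start.match(/^(\w{3} \d{1,2}, \d{4}) (\d{2}):\d{2}:\d{2} (AM|PM)/);
  if (!m) return null;
  const [, date, hh, meridiem] = m;
  // Convert the 12-hour clock to 24-hour: 12 AM -> 0, 12 PM -> 12.
  const hour = (Number(hh) % 12) + (meridiem === "PM" ? 12 : 0);
  return { date, hour };
}

function bucketByDateHour(outages) {
  const grid = new Map(); // "date|hour" -> array of durations in seconds
  for (const { start, seconds } of outages) {
    const cell = startDateHour(start);
    if (!cell) continue;
    const key = `${cell.date}|${cell.hour}`;
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(seconds); // one entry per outage, so one circle per outage
  }
  return grid;
}

const outages = [
  { start: "Apr 19, 2025 10:03:15 PM +08", seconds: 150 },
  { start: "Apr 19, 2025 09:47:13 PM +08", seconds: 152 },
  // Made-up row, added only to show two outages landing in the same hour.
  { start: "Apr 19, 2025 09:55:00 PM +08", seconds: 30 },
];
console.log(bucketByDateHour(outages));
```

Each cell's array length gives the number of circles to draw, and each value can drive a circle's size.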

All of this took less than one minute.

I did not look at the code. I just focused on the picture and suggested changes.

This draws crosses, not circles, for each hour. Also, if there are multiple outages in an hour, I want multiple circles.

Here’s the output that took less than 10 seconds:

Next iteration:

Make the circles red with the same level of transparency.
Set the title to “LLM Foundry Downtime (SGT)”.
Instead of jittering the circle, let the Y position be the middle of the outage time.

Next iteration:

Change the red to a milder shade.
Set alpha to 0.5 but add a stroke with alpha 0.9.
Format the dates like “Sun 20 Apr”, etc.

That’s it! I never even looked at the code. The whole loop took 3 minutes – far faster than I could manage, though I’m good at code and data visualization!

More importantly, the model frees me to focus on the real problem: why is the downtime high?

The Magic of Repeated ‘Improve It’ Prompts

What if you keep asking an LLM to “Improve the code - dramatically!”?

We used the new GPT 4.1 Nano, a fast, cheap, and capable model, to write code for simple tasks like Draw a circle.

Then we fed the output back and asked again, Improve the code - dramatically!
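
The loop itself is simple. Here is a minimal sketch of it, assuming a `callModel` function that takes a prompt string and resolves to the model's reply. Both `callModel` and `improveRepeatedly` are hypothetical names, and the way the previous output is pasted into the next prompt is an assumption about how the feedback was wired.

```javascript
// Hypothetical sketch of the feedback loop. `callModel` stands in for
// whatever LLM client you use: prompt string in, generated code out.
async function improveRepeatedly(task, rounds, callModel) {
  const versions = [await callModel(task)]; // first attempt from the bare task
  for (let i = 0; i < rounds; i++) {
    const previous = versions[versions.length - 1];
    // Feed the last output back along with the same short instruction.
    const prompt = `${previous}\n\nImprove the code - dramatically!`;
    versions.push(await callModel(prompt));
  }
  return versions; // keep every iteration so they can be compared
}

// Fake model for a dry run: it only tags each reply with a version number.
let calls = 0;
const fakeModel = async (prompt) => `// v${++calls} (prompt was ${prompt.length} chars)`;

improveRepeatedly("Draw a circle", 3, fakeModel).then((versions) =>
  console.log(versions.join("\n"))
);
```

Returning every version, rather than only the last, makes it possible to see where each new feature first appeared.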

Here are the results.

  • Draw a circle rose from a fixed circle to a full tool: drag it around, tweak its size and hue, and hit “Reset” to start fresh.
  • Animate shapes and patterns turned simple circles and squares into a swarm of colored polygons that spin, pulse, and link up by distance.
  • Draw a fully functional analog clock grew from a bare face to one that builds all 60 tick marks in code—no manual copy‑paste needed.
  • Create an interactive particle simulation went from plain white dots on black to hundreds of bright, color‑shifting balls that bounce, die, and come back to life.
  • Generate a fractal changed from a single Mandelbrot image to an explorer you can zoom, drag, and reset with sliders and the mouse wheel.
  • Generate a dashboard jumped from static charts to a live page with smooth card animations, modern fonts, and a real‑time stats box.

A few observations.

Models are getting much more reliable. Even a low-cost model like GPT 4.1 Nano wrote error-free code in ~100 retries.

When pushed, they tend to brag. They attach grand titles like Ultimate Interactive Circle or Galactic Data Universe. They spin out flashy descriptions like This dramatically upgraded clock features a pulsating neon glow, animated pulsing background glow, highly stylized tick marks, …

A simple prompt like Improve it can spark new ideas, revealing features such as:

Even the guest WiFi is so secure

We take security very seriously at Straive. We set high standards — not just for ourselves, but our guests, too.

Here’s the unofficial policy guide for visitors to Straive Singapore, exemplified by the sites blocked on our guest WiFi network.

In fact, we’re so secure, we block our own sites. No learnovate.straive.com, policies.straive.com, myapps.straive.com.

How to Visualize Data Stories with AI: Lessons

I tried 2 experiments.

  1. Can I code a visual data story only using LLMs? Does this make me faster? How much?
  2. Has GitHub Copilot caught up with Cursor? How far behind is it? Can I recommend it?

So I built a visual story for Lech Mazur’s elimination game benchmark (it’s like LLMs playing Survivor)
using only the free GitHub Copilot as the AI code editor.

SUMMARY: using LLMs and AI code editors makes me a bit faster. It took me 7 hours instead of 10-12. But more importantly:

  1. I procrastinate less. (“Oh, LLMs will make it easy.”)
  2. I get stuck less. (“Oh, LLMs will know that.”)
  3. I avoid ambitious designs less. (“Oh, LLMs will figure something out.”)

Also: GitHub Copilot is almost as good as Cursor at editing code, but slower at applying the edits. I’m perfectly happy recommending the free tier for beginners.

Here’s a breakdown of the process I followed, along with the most insightful lessons I learned.

Research usefulness

I usually visualize data for fun. But Naveen’s voice pops into my head, asking, “But Anand, what’s the use of all this?”
So, I asked O1-Pro: “What are ways in which this can help Straive push its AI business?”

Turns out it can help Straive’s business by pitching multi-agent capabilities that can be useful in:

  • Understanding AI safety and alignment
  • Teaching material on group dynamics and negotiation
  • Scenario-based data-driven decision making to avoid groupthink
  • Model interactions across reviewers, authors, editors to model bias, integrity, review best practices
  • Research tool for simulating interactions

Learnings:

  • 💡 Ask LLMs why something is useful. You’ll invariably find plausible uses, even if you’re doing it just for fun.

Ideate visual representations

To explore visualization options, I created the prompt by:

Then I added my requirements (which took 10-15 minutes to think of.)

I would like to visualize each game interactively. The authors have created a visualization that looks like the image attached. I would like to do better. Specifically, I’d like to:

  • Allow the user to step through each stage or play each step in sequence, jumping to any step. (They should be able to link to any step as well.)
  • Show the game, round, sub-round prominently
  • Show what the model is saying or thinking NEXT to the model, making it easy to read
  • Show alliance proposals and rejections as they form, ideally moving the models around as they seek to pair up. Rejections and replacements should be clearly visible
  • Once alliances are formed, group models together
  • Clearly show the voting process: who voted to eliminate which model, how many elimination votes has each model received
  • Clicking on each model should show all the model’s thoughts and messages up to that point

Keeping these in mind, suggest diverse ways to visualize each step of the game. The primary goal is to make the game easy to follow and understand and tell a GRIPPING, ENGAGING story about the politics of LLMs. Like a Survivor reality show.

I asked both O1 Pro and Gemini 2.5 Pro (exp) for visualization ideas.
I liked Gemini’s better. For example, Gemini said,

  • “Private Conversations: Dim the main stage slightly. Highlight the currently conversing pair.”
  • “Voting Booth Visualization: As each private_vote_reason appears, briefly show the voter’s avatar and their reason text (maybe in a “thought bubble” style) next to the target they intend to vote for.”

But O1 Pro gave me a few powerful ideas. The best was an alliance table:

  • “Create a table with columns representing each model, rows representing rounds. Each cell shows the ID of the ally that model allied with in that round. If it’s 3+ consecutive alliances, collapse them with a vertical line. If the model was eliminated or had no alliance, leave it blank or use a placeholder icon.”
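
The "collapse 3+ consecutive alliances" rule can be sketched as a small run-length pass over one column of the table. The function name `collapseRuns`, the `"│"` marker, and the use of `null` for "eliminated or no alliance" are illustrative assumptions.

```javascript
// Hypothetical sketch: collapse runs of 3+ identical allies in one column.
// `column` lists the ally ID per round; null means eliminated or no alliance.
function collapseRuns(column, marker = "│") {
  const cells = [...column];
  let runStart = 0;
  for (let i = 1; i <= column.length; i++) {
    // A run ends at the end of the column or when the ally changes.
    if (i === column.length || column[i] !== column[runStart]) {
      const runLength = i - runStart;
      if (runLength >= 3 && column[runStart] !== null)
        // Keep the first and last cell; replace the middle with the marker.
        for (let j = runStart + 1; j < i - 1; j++) cells[j] = marker;
      runStart = i;
    }
  }
  return cells;
}

console.log(collapseRuns(["P3", "P3", "P3", "P3", "P5", null, null, null]));
```

Runs of `null` are left alone so that blank cells stay blank instead of turning into a line.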

Learnings:

  • 💡 Ask LLMs for visualization ideas. They’ll suggest things you didn’t think of.
  • 💡 Ask multiple LLMs. Each has a different style of thinking.

Prototype the visual

I stitched together pieces of the UI description and asked GPT 4o to create an image. This took 10-15 minutes. Private chat:

Here’s how I plan to visualize this.

Overall Interface & Navigation

  • Timeline Scrubber: A prominent timeline at the bottom or top, showing rounds and sub-rounds (conversations, pairing, voting, elimination). Users can click, drag, or use next/prev buttons to navigate. Each step should be linkable (e.g., using URL hashes). Add play/pause controls for auto-stepping.
  • Game State Dashboard: Always visible area showing: Game ID, Round, Sub-round, Players Remaining, Players Eliminated (Jury).
  • Central Stage Layout: Models represented as avatars (could be simple circles/icons or more thematic representations) arranged in a central area. Their positions and connections change based on game events.

1. Public Conversation (Round Start)

  • Talking Heads Circle: Arrange player avatars in a circle. When a player “speaks” (their message appears in the log):
  • Highlight their avatar.
  • Display their message in a speech bubble next to them.
  • Fade previous messages slightly or stack them briefly.
  • Engaging Element: Animate the avatar slightly (e.g., subtle pulse or glow) when they speak.
  • Chat Feed Style: A more traditional chat interface on one side, linked to avatars on the main stage. Clicking a message highlights the avatar and vice-versa.
  • Engaging Element: Use distinct colors or icons for each player avatar and their corresponding messages.

2. Alliance Formation (Preference Proposals/Outcomes/Results)

  • Dynamic Pairing Dance:
  • Proposal: An animated arrow or beam shoots from the proposer’s avatar to the target’s avatar. Display text like “P1 proposes to P6 (Rank 0)”.
  • Acceptance: The arrow solidifies, perhaps pulsing gently. A “Matched” icon appears.
  • Rejection: The arrow bounces off or shatters. A “Rejected” icon appears briefly.
  • Replacement: Show the existing accepted proposal being visually “bumped” or overridden by the new accepted one. Clearly label it “Replaced Px”.
  • Engaging Element: Physically move the avatars closer when a proposal is made, snapping them together when accepted, and pushing them apart on rejection. Use distinct sounds for proposal, acceptance, rejection, replacement.
  • Preference List Display: When hovering or clicking a player, show their ranked preference list as they build it during this phase. Highlight the status (proposed, accepted, rejected).
  • Final Pairs: Once preference_result occurs, rearrange the avatars so matched pairs are visually grouped together on the stage, perhaps connected by a clear line or within a shared bounding box.

3. Private Conversations (Paired Chats)

  • Private Chat Rooms: Dim the main stage slightly. Highlight the currently conversing pair. Display their private messages in separate chat windows or adjacent speech bubbles clearly linked to the pair.
  • Engaging Element: Use a “spotlight” effect on the active pair. Allow users to click other pairs to view their simultaneous conversations.
  • Connection Lines: Draw lines between the paired avatars during this phase. Clicking a line could bring up the conversation history for that pair in that round.
  • Engaging Element: Make the line pulse or glow when new messages are exchanged between the pair.

4. Voting (Reasons & Votes)

  • Voting Booth Visualization:
  • As each private_vote_reason appears, briefly show the voter’s avatar and their reason text (maybe in a “thought bubble” style) next to the target they intend to vote for.
  • As each vote occurs, draw a clear, perhaps slightly dramatic, animated arrow from the voter to the target avatar.
  • Vote Tally: Display a running count of votes received next to each player’s avatar (e.g., a red badge with the number). Increment this visibly as each vote comes in.
  • Engaging Element: Use a distinct color (e.g., red) for voting arrows. Add a subtle “target lock” animation on the player receiving a vote. Show if the vote was public or private (maybe different arrow styles).

5. Elimination

  • Spotlight & Fade: When the elimination event occurs:
  • Put a dramatic spotlight on the eliminated player.
  • Display the reason (tie-break, random pick if applicable).
  • Visually “grey out” or fade the eliminated player’s avatar and move them to a designated “Jury Box” area.
  • Engaging Element: A brief, dramatic animation or sound effect for elimination. Update the “Players Remaining/Eliminated” dashboard instantly.

6. Jury Speeches & Voting (Final Round)

  • Finalist Stage: Place the two finalists prominently center stage. Move the Jury avatars to a visible “Jury Box”.
  • Speech Display: As each finalist gives their speech (subround: 900), display it clearly next to their avatar, perhaps like a closing statement.
  • Jury Deliberation:
  • As each private_jury_reason appears, briefly highlight the juror and show their reasoning (maybe visible only on hover/click to avoid clutter).
  • Show jury votes accumulating for each finalist, similar to the elimination voting tally, but perhaps with a different visual style (e.g., gold stars).
  • Engaging Element: Build suspense by revealing jury votes one by one or after a short delay.

7. Final Results

  • Winner Announcement: A clear “Winner” banner or crown appears over the winning avatar.
  • Rank Display: Show the final ranks clearly, perhaps arranging avatars on a podium or listing them with their rank and partial points.
  • Game Summary: Offer a summary view showing key stats or moments from the game.

Interactivity (Clicking on Models)

  • Player Dossier: Clicking any avatar (active or jury) should open a panel or overlay showing:
  • Player ID & Model Type.
  • Their full message history (public and private, filterable by round/type).
  • Their voting history (who they voted for, who voted for them).
  • Their alliance history (proposals made/received, final pairs).
  • Their final rank/status.
  • Engaging Element: Use this panel to show hidden information like private_vote_reason after the vote has occurred.

Draw the user interface for this EXACTLY as it would appear on the screen.

Here’s the prototype it created.

Prototype

Based on this, I drew out my own, revised, visual:

Design Sketch

Learnings:

  • 💡 LLMs can create visual prototypes. ChatGPT’s new 4o image generation converted the description into an acceptable image. Needs to improve, but enough to ideate.
  • 💡 Improving is less work than creating. I rarely sketch visualizations. (Too lazy.) But since this prototype was there, and had some parts that were WRONG, I just had to fix it! 🙂

Break down the task

I then described the application to O1 Pro and asked it to break down this task. Private chat

The URL looks like /#?game=286&line=4 indicating that game 286.jsonl must be loaded and line 4 is the current step we’re in.

The navbar has:

  • An app title
  • A game state dashboard with the game number (dropdown), round (number), stage (e.g. voting, alliances, etc.), players (number of players still active)
  • A timeline scrubber (a range slider) allowing users to jump to the specific line. This changes the URL which then triggers a change in app state.
  • A light/dark theme picker

The stage on the left as a set of models arranged in a circle. Each model/player has a unique and distinct color, a number inside it indicating the player number, a label above it indicating the model name (e.g. GPT-4o), a set of red dots below it when it receives an elimination vote, an arrow pointing from one model to another when a model talks to, plans an alliance with, or votes to eliminate another model. The arrow color depends on the action. When a model performs any action (speaking publicly / privately, voting, allying, etc.) the conversation or action summary is displayed in a semi-transparent overlay at the center of the circle. Eliminated models are shown with a 10% opacity. Allied models are redrawn next to each other with a line connecting them.

The sidebar on the right has 3 collapsible sections.

  1. CHAT: Shows the history of each model’s public or private conversations. Each conversation begins with a circle with the model/player number in the same color as on the stage, followed by the model name. For private conversations, this is followed by an arrow pointing to the target model/player’s numbered & colored circle. This is followed by the text of the conversation. Private conversations are in a light red background.
  2. ALLIANCES. This is a tabular representation where columns are models that form an alliance. Each row represents a round. The cells contain the model the model in the column header allied with. All models are numbered and colored. If a model in the column header was eliminated, the cell is empty. If it did not ally, the cell shows an empty grey circle. If 3+ consecutive cells show the same model in a column (i.e. the model did not switch alliances), we can replace the middle cells with a vertical line.
  3. ELIMINATIONS. This tabular representation works exactly the same way as alliances, except it captures eliminations. Each column represents a model that voted for elimination. Each row represents a round. The cell contains the model the model in the column header voted to eliminate. All models are numbered and colored. If a model in the column header was eliminated, the cell is empty. If 3+ consecutive cells show the same model in a column (i.e. a model consistently voted against the same model), we can replace the middle cells with a vertical line.

I’d like to use an AI code generator to generate this visualization. Giving the entire spec at once is too much for it to build. I’d like to see the output step by step as well and propose changes. How might I break up this task?

Here’s its (rather good) breakdown:

  1. Plan Your Data Structures
  2. Basic Backend/Server Setup (If Needed)
  3. Minimal Front-End with a Single View
  4. Implement the Timeline Scrubber & URL Sync
  5. Core Stage Visualization (Circle of Players)
  6. Arrow/Line Visualization for Actions
  7. Sidebar 1: CHAT Section
  8. Sidebar 2: ALLIANCES Section
  9. Sidebar 3: ELIMINATIONS Section
  10. Refinements, Theming, & Final Touches

Learnings:

  • 💡 Ask LLMs to break down your task. Its breakdown was better than mine.

To document my workflow, I decided to commit each stage of progress.
At this point, I made the first commit to the repo
documenting the process so far.

🔗 Commit

Minimal Front-End with a Single View

I skipped Step 1 (my mistake – I was forced to do it later) and didn’t need Step 2 (Backend/Server).
So I began scaffolding, i.e. Step 3: Minimal Front-End with a Single View.

At this point, I switched over to GitHub Copilot in
Edit mode using
Claude 3.5 Sonnet.
This is what I used for the rest of the session.

I ran this prompt:

Create an index.html using Bootstrap via CDN. Scaffold it with a navbar

The navbar has:

  • An app title (Elimination Game)
  • A game state dashboard with the Game (dropdown), Round (number), Stage (e.g. voting, alliances, etc.), and Players (number of players still active)
  • A timeline scrubber (a range slider) allowing users to jump to the specific line. This changes the URL which then triggers a change in app state.
  • A light/dark theme picker. Here is the code for the theme picker. Use the same CDN links overall
<!-- Include Bootstrap 5.3+ and Bootstrap icons -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.1/dist/css/bootstrap.min.css" rel="stylesheet">
<link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.3/font/bootstrap-icons.css" rel="stylesheet">
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.1/dist/js/bootstrap.bundle.min.js"></script>

<nav class="navbar navbar-expand-lg bg-body-tertiary">
  <div class="container-fluid">
    <a class="navbar-brand" href="#">Navbar</a>

    <!-- Copy this dropdown anywhere in your page, e.g. inside a navbar -->
    <div class="position-relative" role="group" aria-label="Toggle dark mode" title="Toggle Dark Mode">
      <button class="dark-theme-toggle btn btn-primary dropdown-toggle" type="button" data-bs-toggle="dropdown" aria-expanded="false" aria-label="Open navigation menu">
        <i class="bi bi-circle-half"></i> <span class="d-lg-none ms-2">Toggle theme</span>
      </button>
      <ul class="dropdown-menu dropdown-menu-end">
        <li><button class="dropdown-item" data-bs-theme-value="light"><i class="me-2 bi bi-sun-fill"></i> Light</button></li>
        <li><button class="dropdown-item" data-bs-theme-value="dark"><i class="me-2 bi bi-moon-stars-fill"></i> Dark</button></li>
        <li><button class="dropdown-item" data-bs-theme-value="auto"><i class="me-2 bi bi-circle-half"></i> Auto</button></li>
      </ul>
    </div>

  </div>
</nav>

Below the navbar is a section with a stage on the left and sidebar on the right. The stage on the left will contain a large responsive square SVG. The sidebar on the right contains 3 collapsible cards: Chat, Alliances, Eliminations.

It generated this scaffolding.

Scaffolding 1

Learnings:

  • 💡 Claude 3.5 Sonnet remains an excellent model to generate UI. Claude 3.7 Sonnet is even better, but is not currently available in the free Copilot subscription.
  • 💡 Coders micro-manage LLMs. I think a novice will be more efficient and get better results than me. For example:
  • Did I need to give it the code snippet? Could I have given it a link?
  • Did I need to say “a range slider” or specify that Round must be a “number”, etc? Could it have inferred?

🔗 Commit

Improve the scaffolding

I gave some feedback on the scaffolding and asked for improvements.

  • Make the navbar always dark
  • The sidebar cards must be independently collapsible
  • For the Game, Round, Stage, and Players, show the label above the value. The label must be small and the value must be large.
  • Use only me-* margins on the navbar to ensure that there is no left margin mis-aligning the elements at low width. Also place the elements inside a collapsible navbar section at low widths
  • The stage must have a bottom margin to avoid touching the sidebar’s top on low-width screens

This was the result:

Scaffolding 2

That prompted more feedback from me:

  • Prefer Bootstrap classes over <style> wherever possible.
  • Style the “Game” to look exactly like the round, stage, and players. The size of the label and value should match for all 4 elements perfectly.
  • Ensure that the labels round, stage, players will be visible in light mode against the dark navbar.

At this point, I made 3 manual edits because I felt I could do these better than the LLM:

  1. Broke the “Elimination Game” in the navbar into 2 lines
  2. Replaced fs-5 with fs-4 to give the values the exact same size, and removed redundant styling on the game selection
  3. Formatted the document with HTML Language Features

Scaffolding 3

Learnings:

  • 💡 Experienced coders are good with feedback. It took me under 10 seconds to spot each problem in the output and code.
    Writing the feedback felt natural.
  • 💡 Experienced coders need retraining to instruct rather than code. My instinct was to code immediately rather than to prompt.
  • As soon as I thought of one feedback, I had to fight the urge to fix it and write the feedback instead.
  • Even when instructing was easier, I chose to code it, e.g. breaking the “Elimination Game” in the navbar into 2 lines.
  • Coding can be better if you don’t know what to do. I toggled the font size between fs-4 and fs-5 in rapid succession to figure out the right size.
  • But I could have experimented by asking the LLM to build a font size toggle or slider!
  • 💡 LLMs could turn coders into good lead developers or managers. Pity.

🔗 Commit

Implement the Timeline Scrubber & URL Sync

On to Step 4: Implement the Timeline Scrubber & URL Sync.

I copied a few logs
into a temporary logs/ folder and said:

Create a script.js as an ES module and include it from index.html.

  • On load, fetch logs/index.txt which contains all log files (*.jsonl), one per line.
  • The files are formatted as *_TIMESTAMP_YYYYMMDD_HHMMSS.jsonl.
  • Populate the game dropdown with these values. The option label should look like 25 Jan 2025, 10:30.
  • The default value for the game dropdown should be empty.
  • When the game dropdown changes to a non-empty option, fetch the file from logs/[filename] and store it in the global game, parsing the JSONL into an array of objects.
  • Set the maximum value of the range slider to the length of game.
  • When the range slider changes or the game dropdown changes, change the URL hash to #?game=[filename]&step=[range-slider-value] without modifying browser history.
  • When the URL hash changes through any means, call redraw(step) which will draw the current (global) game state at the step specified. For now, just display the step prominently on the stage.
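
The URL sync in the last two bullets can be sketched as a pair of pure functions plus a little browser wiring. This is a minimal sketch, not the code Copilot generated: `parseHash`, `buildHash`, `setState`, the placeholder `redraw`, and the example filename are all assumptions. It relies on `location.replace()`, which swaps the current history entry instead of adding one; a fragment-only change made that way should still fire `hashchange`.

```javascript
// Hypothetical sketch of the hash <-> state round trip.
function parseHash(hash) {
  // "#?game=x.jsonl&step=4" -> { game: "x.jsonl", step: 4 }
  const params = new URLSearchParams(hash.replace(/^#\??/, ""));
  return { game: params.get("game") ?? "", step: Number(params.get("step") ?? 0) };
}

function buildHash({ game, step }) {
  return "#?" + new URLSearchParams({ game, step: String(step) });
}

// Browser-only wiring; skipped when run outside a browser (e.g. in Node).
if (typeof window !== "undefined") {
  const redraw = (step) => console.log("redraw at step", step); // placeholder
  // Replace the current history entry rather than pushing a new one.
  const setState = (state) => window.location.replace(buildHash(state));
  window.addEventListener("hashchange", () =>
    redraw(parseHash(window.location.hash).step)
  );
  // Example call (made-up filename):
  // setState({ game: "game_20250125_103000.jsonl", step: 4 });
}

console.log(parseHash(buildHash({ game: "game_20250125_103000.jsonl", step: 4 })));
```

Keeping the parsing and building separate from the event wiring means the URL format can be checked without a browser.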

This code worked fine, but I like refactoring, so I tried to condense the 111-line code:

  • Shorten and simplify the code in script.js to be elegant.
  • User browser functionality more.
  • For example, use Intl to format dates.
  • Change the innerHTML of #gameSelect to concisely update the options.
  • Remove redundant braces, e.g. for single-line blocks.

That brought it down to 74 lines but failed to populate the select dropdown. Rather than debug, I undid the change (Copilot’s Undo feature is cool!) and tried:

  • Shorten and simplify the code in script.js to be elegant.
  • User browser functionality more, e.g. use Intl to format dates.
  • Prefer insertAdjacentHTML and replaceChildren.
  • Remove redundant braces, e.g. for single-line blocks.

This led to shorter code (69 lines) but still had the same issue. I had to manually correct this line:

// BEFORE
const [_, date, time] = filename.match(/_(\d{8})_(\d{6})/)?.groups ?? [];
// AFTER
const [_, date, time] = filename.match(/_(\d{8})_(\d{6})/) ?? [];
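
Why the fix works: `String.prototype.match` only sets `.groups` for named capture groups, so with unnamed groups the BEFORE line always fell back to `[]` and left `date` and `time` undefined. Here is a minimal sketch of how the corrected line could feed the dropdown label. The function name `gameLabel`, the example filename, the `en-GB` locale, and the local-time interpretation of the timestamp are assumptions.

```javascript
// Sketch under assumptions: filenames end in _YYYYMMDD_HHMMSS.jsonl and the
// timestamp is treated as local time. `gameLabel` is a hypothetical name.
function gameLabel(filename) {
  // match() returns [whole, group1, group2]; skip the whole match.
  const [, date, time] = filename.match(/_(\d{8})_(\d{6})/) ?? [];
  if (!date) return filename; // fall back to the raw name if nothing matched
  const parsed = new Date(
    Number(date.slice(0, 4)),
    Number(date.slice(4, 6)) - 1, // months are zero-based
    Number(date.slice(6, 8)),
    Number(time.slice(0, 2)),
    Number(time.slice(2, 4)),
    Number(time.slice(4, 6))
  );
  // Let the browser's Intl API do the formatting.
  return new Intl.DateTimeFormat("en-GB", {
    dateStyle: "medium",
    timeStyle: "short",
  }).format(parsed);
}

// Made-up filename; the label should look roughly like "25 Jan 2025, 10:30"
// (exact punctuation depends on the runtime's locale data).
console.log(gameLabel("game_20250125_103000.jsonl"));
```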

I also

  1. Removed an incorrect px-2 against #gameSelect in index.html.
  2. Decided to use the GitHub CDN and replaced fetch(logs/...) with fetch(https://raw.githubusercontent.com/sanand0/elimination_game/refs/heads/main/logs/...).
  3. Moved logs/index.txt to logs.txt and changed script.js accordingly
  4. Formatted script.js with Prettier

Visually, nothing changes in a big way but the slider and the dropdown change the URL properly.

Learnings:

  • 💡 Experienced coders suggest precise code improvements. Generally good ones. Like:
  • “use Intl to format dates”
  • “Prefer insertAdjacentHTML and replaceChildren” (over document.createElement)
  • 💡 LLMs are good at refactoring. When I asked it to “Shorten and simplify the code in script.js to be elegant”, it reduced the code from 111 lines to 74 lines quite well!
  • 💡 Redo if you don’t like it. Copilot’s “Undo” feature lets you retry until you get it right. Like in Edge of Tomorrow.

🔗 Commit

Plan Your Data Structures

At this point, I needed a good data structure and felt the error in skipping Step 1.
But, rather than creating one, I asked ChatGPT:

I would like the game structure to be held in a simple JS object called game.
What data structure would help the most in drawing this visual such that it will require very little code (since required information is in this data structure)?

O1 Pro said:

A very convenient way to minimize front-end logic is to store a “timeline of snapshots”:
each step in the game has both the event (what just happened) and a “current state”
(the entire world after that event). Then, in your UI, you simply pick the snapshot that
corresponds to the current line (based on the timeline scrubber) and draw exactly
what’s in that snapshot—no extra derivation needed.

This is a brilliant idea. I rarely use this since I’m biased against wasting memory. But in this case, it’s perfect.

Based on that input, I prompted Copilot:

In script.js, instead of loading the .jsonl into game, create this data structure to make it easy to visualize each step.

Use the sample .jsonl provided to infer the logic for this.

const game = {
  game: "...",
  players: {
    "P1": {
      "id": "Player2_Game1739872030683891_deepseek-fireworks",  // from .player_id
      "model": "deepseek-fireworks",  // everything after the last underscore
    },
    // and so on for all other players
  },
  // For each line, create a step
  steps: [
    {
      step: 0,
      // Current round and subround
      round: 1,
      subround: 1,
      event: {
        // Contents of the original line
      },
      // list active players
      active: { "P1": true, "P2": false, ... },
      // For each round so far, list who allied with whom, e.g.:
      alliances: [ {"P1": "P6", "P2": "P7", ...}, ... ],
      // For each round so far, list who voted to eliminate whom, e.g.
      votes: [ {"P1": "P4", "P2": "P1", ... }, ... ],
    },
    // …and so on, for each line in the JSONL
  ]
};

This worked almost perfectly. I made these edits:

  1. Add let currentAlliances = {}; let currentVotes = {}; which it forgot in the code.
  2. Re-apply change #2 I made manually in the last iteration (replacing the URL with the GitHub CDN).
    That change was not there in the chat window, Copilot did not pick it up.
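
The "timeline of snapshots" idea can be sketched roughly as follows. This is not the code Copilot produced: the events are simplified, hypothetical stand-ins for the real .jsonl lines, and only votes and eliminations are tracked.

```javascript
// Simplified, hypothetical events; the real .jsonl lines carry more fields.
const events = [
  { type: "vote", round: 1, voter: "P1", target: "P2" },
  { type: "vote", round: 1, voter: "P3", target: "P2" },
  { type: "elimination", round: 1, eliminated: "P2" },
  { type: "vote", round: 2, voter: "P1", target: "P3" },
];

function buildSteps(events) {
  const votes = []; // one object per completed round
  let currentVotes = {}; // mirrors the declaration mentioned in edit 1 above
  const active = { P1: true, P2: true, P3: true };
  return events.map((event, step) => {
    if (event.type === "vote") currentVotes[event.voter] = event.target;
    if (event.type === "elimination") {
      active[event.eliminated] = false;
      votes.push(currentVotes);
      currentVotes = {};
    }
    // Copy the state so later events cannot mutate earlier snapshots.
    return {
      step,
      round: event.round,
      event,
      active: { ...active },
      votes: votes.map((v) => ({ ...v })),
    };
  });
}

const steps = buildSteps(events);
```

Drawing step `n` then needs nothing beyond `steps[n]`. That is the memory-for-simplicity trade the snapshot approach makes.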

Learnings:

  • 💡 Coders mess up LLMs. Data structure was the first step the LLM recommended. I skipped it. It proved crucial. LLMs do better than LLMs + coders – or doctors.
  • 💡 LLMs can make basic mistakes. Like forgetting to declare variables.

🔗 Commit

Sidebar 2 & 3: ALLIANCES and ELIMINATIONS Sections

I jumped a bit to Steps 8 & 9. They were easier (just tables) and the visual components are independent, so order doesn’t matter.

There are always 8 players. Pick 8 visually distinct dark colors (i.e. on which white will look good as a foreground) as colors: {P1: "#...", P2: ...}.

In the alliances and eliminations cards, draw a table each as follows. The table header is:

| Round | P1 | P2 | P3 | … | P8 |

Instead of P1, P2, etc. draw a badge with background-color based on colors and text as 1 for P1, etc.

steps[step].alliances is a list like [{P1: P7, P2: P3, …}, …]. Render each row as a list like:

| 1 | P7 | P3 | … |

The cell contents are badges exactly like the header.
If a player (e.g. P3) does not have an alliance, i.e. steps[step].alliances[round].P3 is missing, leave it blank.
If steps[step].active[P3] is false, grey the cell background.

steps[step].votes is almost identical, listing the elimination votes. Populate this in the eliminations card.

Reuse code for this. Write VERY concise code. Use Bootstrap classes as much as possible.

This worked perfectly. I manually made one correction to an earlier mistake I noticed:

  1. Replace slider.max = game.steps.length; with slider.max = game.steps.length - 1;
Screenshot

🔗 Commit

I decided to tweak this to show eliminated players clearly:

Replace the active data structure with eliminated.
eliminated["P1"] = 3 if P1 was eliminated at the end of round 3.
eliminated["P1"] is undefined if P1 is not eliminated.

Using this, in the alliances and elimination tables,
color the cells grey only if the player was eliminated BEFORE that round.
(We’ll find that only empty cells will be colored grey.)

Again, nearly perfect. I made one manual correction in the logic:

  1. Replace game.steps[step].eliminated[p] <= i + 1 with game.steps[step].eliminated[p] < i + 1
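
A tiny sketch of why the operator matters, using the `eliminated` convention from the prompt. The helper name `isGrey` and the sample data are illustrative.

```javascript
// eliminated maps a player to the round at whose END they were eliminated.
const eliminated = { P1: 3 };

// rowIndex is the zero-based table row, so its round number is rowIndex + 1.
// Grey only if the player was eliminated BEFORE that round.
function isGrey(eliminated, player, rowIndex) {
  return eliminated[player] < rowIndex + 1;
}
```

P1 still plays round 3, so `isGrey(eliminated, "P1", 2)` is false and `isGrey(eliminated, "P1", 3)` is true. With `<=`, the round 3 row would be greyed out even though P1 voted in it. A player who was never eliminated gives `undefined < n`, which is false.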

Learnings:

  • 💡 When all goes well, LLMs are surprisingly effective. Normally, this step takes me half an hour. Now, it took under 5 minutes.
  • 💡 Watch out for subtle bugs. The change in operator (from “<=” to “<”) almost went unnoticed, but makes a big difference on when a player was eliminated.

🔗 Commit

Sidebar 1: CHAT Section

Time to tackle Step 7: Sidebar 1: CHAT Section.

For each step, based on step[].event.type, populate the Chat section with the history of conversations so far:

  • conversation: This is a public conversation. Show ${event.player_id} ${event.message} with the player ID shown like the badge above. player_id needs to be looked up from game.players since it matches game.players[*].id.
  • private: This is a private conversation. Show ${event.speaker_id} 🢂 ${event.target_id} ${event.message} with the speaker and target IDs treated as above.
  • preference_proposal: This is an alliance proposal. Show ${event.proposer} 😍 ${event.target} #${event.rank_of_target}. proposer and target are like “P1”, “P2”, etc.
  • preference_outcome: This is the outcome of a proposal. Show ${event.target} ❌ ${event.rejected} if event.rejected else ${event.target} ❤️ ${event.accepted} ❌ ${event.replaced} if event.replaced else ${event.target} ❤️ ${event.accepted}. All these are like “P1”, “P2”, etc.
  • preference_result: This is the outcome of the entire proposal round. Just show “Alliances formed”
  • private_vote_reason: This is the reason a player gives to eliminate someone. Show ${event.voter_id} 👎 ${event.target_id} ${event.reason}. voter_id and target_id match game.players[*].id
  • private_revote_reason: Show Same as above
  • private_jury_reason: Show same as above.
  • vote: This is the actual vote. Show ${event.voter_id} 👎 ${event.target_id} like above
  • elimination: Just show “Elimination starts”
  • final_results: Show Winners: ${winners} where winners is a list of players like [“P5”]

ALL players should be shown as a colored badge with a number.
The chat card height should not exceed 15em. Overflow should scroll beyond that.
Make sure the chat rendering is elegant. I’ve mentioned the content, but please use any Bootstrap UI component to make the chat more attractive.

Use lit-html to render efficiently. Import it via:

import { render, html } from "https://cdn.jsdelivr.net/npm/lit-html@3/+esm";

Rewrite existing code inside redraw(), drawTable, drawBadge to use lit-html.

This worked perfectly.

Screenshot
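
As a rough illustration of the event-type mapping in the prompt above (not the code Copilot generated), here is a plain-string version. It skips lit-html and badges so it runs anywhere. The `shortName` lookup and the `event.winners` field name are assumptions; the other field names come from the prompt.

```javascript
// Hypothetical lookup from full player IDs to short names like "P1".
const shortName = {
  "Player1_Game1_model-a": "P1",
  "Player2_Game1_model-b": "P2",
};

function chatLine(event) {
  // Proposal events already use "P1"-style names, so unknown IDs pass through.
  const p = (id) => shortName[id] ?? id;
  switch (event.type) {
    case "conversation":
      return `${p(event.player_id)} ${event.message}`;
    case "private":
      return `${p(event.speaker_id)} 🢂 ${p(event.target_id)} ${event.message}`;
    case "preference_proposal":
      return `${event.proposer} 😍 ${event.target} #${event.rank_of_target}`;
    case "preference_outcome":
      if (event.rejected) return `${event.target} ❌ ${event.rejected}`;
      return event.replaced
        ? `${event.target} ❤️ ${event.accepted} ❌ ${event.replaced}`
        : `${event.target} ❤️ ${event.accepted}`;
    case "preference_result":
      return "Alliances formed";
    case "private_vote_reason":
    case "private_revote_reason":
    case "private_jury_reason":
      return `${p(event.voter_id)} 👎 ${p(event.target_id)} ${event.reason}`;
    case "vote":
      return `${p(event.voter_id)} 👎 ${p(event.target_id)}`;
    case "elimination":
      return "Elimination starts";
    case "final_results":
      return `Winners: ${event.winners.join(", ")}`; // field name assumed
    default:
      return "";
  }
}
```

A single `switch` keeps the three "same as above" reason types as fall-through cases, which is about as close to the prompt's wording as code gets.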

Learnings:

  • 💡 Careful and detailed prompting gets excellent results. I explained how to render each conversation type. That took time. But it helped build a reasonably complex visual in a single shot.
  • 💡 LLMs are good at refactoring. It switched code from vanilla JS to lit-html templating like a pro.

🔗 Commit

Improve Sidebar

At this point, I took a step back and wrote down every improvement I could think of on the sidebar UI:

  • Right align the “Round” column numbers in the alliances and eliminations tables.
  • Change the “Round” header to “#”
  • When the slider (step) changes, change the round, stage and players based on the current round, stage, and number of active players.
  • Set the stage based on steps[].event.type (picking unique light colors for each)
    • conversation: “Public chat”
    • private: “Private chat”
    • preference_*: “Alliances”
    • private_vote_reason, private_revote_reason, private_jury_reason, vote: “Voting”
    • elimination: “Elimination”
    • final_results: “Done”
  • Set the number of active players using steps[].eliminated
  • Keep the sidebar sections for chat, alliances and eliminations open by default.
  • Rename the eliminations card section title to “Voting”
  • Hovering on the player should show the game.players[P1/P2/…].model as a Bootstrap tooltip WHEREVER players are displayed.
  • Add Bootstrap tooltips around the chats that contain just emojis and players:
    • 😍: ${event.proposer} proposed to ${event.target} (preference rank #${event.rank_of_target})
    • ❌: ${event.target} rejected ${event.rejected}
    • ❤️: ${event.target} accepted proposal from ${event.accepted}
    • ❤️❌: ${event.target} accepted proposal from ${event.accepted} replacing ${event.replaced}
    • 👎: ${event.voter_id} eliminated ${event.target_id}
  • Don’t indent or highlight the vote_reason or private conversation chats.
  • I think you can beautify the chat section further.

This messed up the UI because it couldn’t figure out the elements. So I made a few changes after 5 failed attempts:

Update index.html and script.js to modify the navbar as follows:

  • Add an id= to the round, stage, and players’ values
  • When the slider (step) changes, change the round, stage and players based on the current round, stage, and number of active players.
  • Set the stage based on steps[].event.type (picking unique light colors for each)
    • conversation: “Public chat”
    • private: “Private chat”
    • preference_*: “Alliances”
    • private_vote_reason, private_revote_reason, private_jury_reason, vote: “Voting”
    • elimination: “Elimination”
    • final_results: “Done”
  • Set the number of active players using len(game.players) - len(game.steps[].eliminated)

This worked perfectly. Then:

Update index.html and script.js to modify the sidebar as follows:

  • Keep the sidebar sections for chat, alliances and eliminations open by default.
  • Right align the “Round” column numbers in the alliances and eliminations tables.
  • Change the “Round” header to “#”
  • Rename the eliminations card section title to “Voting”
  • EVERY player badge should show game.players[P1/P2/…].model as a Bootstrap tooltip.
  • Add Bootstrap tooltips for the emojis
    • 😍: proposed to
    • ❌: rejected
    • ❤️: accepted
    • 👎: eliminated
  • Don’t indent or shade the chats that are currently indented and shaded (e.g. vote_reason).
  • If possible, beautify the chats further using Bootstrap classes.

This worked perfectly too.

Screenshot

Learnings:

  • 💡 LLMs will get confused with long instructions and/or codebases. It took 5 failed attempts before I split the prompts. Keep your prompts cohesive. Keep your code bases modular.
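
The stage mapping and player count from the navbar prompt could look something like this. It is a sketch: `STAGES`, `stageLabel` and `activePlayers` are illustrative names, and `game` follows the structure described earlier, with `eliminated` in place of `active`.

```javascript
// Event type -> stage label, as listed in the prompt.
const STAGES = {
  conversation: "Public chat",
  private: "Private chat",
  private_vote_reason: "Voting",
  private_revote_reason: "Voting",
  private_jury_reason: "Voting",
  vote: "Voting",
  elimination: "Elimination",
  final_results: "Done",
};

// preference_* covers proposal, outcome and result, so match on the prefix.
function stageLabel(type) {
  return type.startsWith("preference_") ? "Alliances" : STAGES[type] ?? "";
}

// len(game.players) - len(game.steps[].eliminated), in JavaScript terms.
function activePlayers(game, step) {
  return Object.keys(game.players).length - Object.keys(game.steps[step].eliminated).length;
}
```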

🔗 Commit

Core Stage Visualization (Circle of Players)

Now for the most complex visual of the lot. Step 5: Core Stage Visualization (Circle of Players) and Step 6: Arrow/Line Visualization for Actions.

  • Generate a square, responsive SVG in game stage using Bootstrap.
  • Import svg from lit-html and use svg where required.
  • It contains all players laid out in a circle.
  • Each player is a circle colored based on the player colors.
  • It contains the player number (1, 2, …) as text inside it in white.
  • Above the player circle, the player model is visible.
  • Leave plenty of space for a “center text” at the center of the circle that will contain centered text.
  • The text may be a full paragraph, so handle the font size and circle size accordingly.
  • The center text must have elegant rounded corners, and a background rgba(var(--bs-body-color-rgb), 0.1).
  • We need word wrapping, so use foreignElement to wrap a div which holds the text.

For each step, based on step[].event.type, draw the stage as follows:

  • conversation: Highlight (via a semi-transparent circle 2-3X the radius of the player) the player to highlight them.
    Show event.message in the center text.
  • private: Highlight the player event.speaker_id. Draw a black arrow to event.target_id. Show event.message in the center text.
  • preference_proposal: Yellow arrow from event.proposer to event.target.
    Center text shows [MODEL NAME 1] proposes to [MODEL NAME 2] where model name is what’s in the tooltip
  • preference_outcome: (all items in [BRACKETS] are the model name shown in the tooltip)
    • If event.rejected, red arrow from event.target to event.rejected. Center text: [TARGET] rejects [REJECTED]
    • If event.replaced, green arrow from event.target to event.accepted and red arrow from event.target to event.replaced.
      Center text: [TARGET] accepts [ACCEPTED] replacing [REPLACED]
    • Else: green arrow from event.target to event.accepted. Center text: [TARGET] accepts [ACCEPTED]
  • preference_result: Center text shows “Alliances formed”
  • private_vote_reason: Purple arrow from event.voter_id to event.target_id. Center text: [VOTER_ID] thinks to eliminate [TARGET_ID]: event.reason
  • private_revote_reason: Show Same as above
  • private_jury_reason: Show same as above.
  • vote: Purple arrow from event.voter_id to event.target_id. Center text: [VOTER_ID] voted against [TARGET_ID]
  • elimination: Center text: “Elimination starts”
  • final_results: Center text: Show Winners: ${winners} where winners is a list of players like [“P5”]

This nearly worked. I made two UI edits:

  1. Add a width="1000" to the SVG to get a minimum size
  2. Add a font-size: 0.7rem; to the text container so the text will fit
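
The "players laid out in a circle" part of the prompt comes down to a little trigonometry. A minimal sketch, assuming a 1000-unit square canvas (matching the width above) and a hypothetical 150-unit margin for the model labels:

```javascript
// Place `count` players evenly on a circle, starting at 12 o'clock.
function layoutPlayers(count, size = 1000, margin = 150) {
  const center = size / 2;
  const radius = center - margin;
  return Array.from({ length: count }, (_, i) => {
    const angle = (2 * Math.PI * i) / count - Math.PI / 2;
    return {
      id: `P${i + 1}`,
      x: center + radius * Math.cos(angle),
      y: center + radius * Math.sin(angle),
    };
  });
}

const positions = layoutPlayers(8);
```

Arrows then run between two entries of `positions`. For the wrapped center text, the SVG element that hosts HTML is `<foreignObject>`.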

At this point, we’re nearly there!

Screenshot

🔗 Commit

Once I saw the output, I found a bunch of things I wanted to fix or improve:

  • The model name may contain underscores. So use everything after the second underscore,
    then replace all underscores with hyphens.
  • Render eliminated players with an opacity of 0.05, not 0.2.
  • Move the arrow head to the center of the arrow, not the end, to avoid getting hidden by the player circles.
  • Center all cells in the alliances and voting tables.
  • When the page is loaded, check the step as well and render that step.
  • Clicking on any chat entry should change the URL #?step= to that entry’s step

That worked well. I made a few manual edits:

  1. Fix winner formatting by replacing getModelName(w) with game.players[w].model and playerBadge(w) with badge(w)
  2. Setting the step on page load in the UI: document.getElementById("timelineScrubber").value = step;
- In the alliances and voting tables, make the table header stand out with a contrasting color.
- In the center text message, begin with a <h6> mentioning the speaker or initiator
Screenshot
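
The model-name rule from the list of fixes above ("everything after the second underscore, then replace all underscores with hyphens") fits in one line. `modelName` is an illustrative name, and the second ID in the usage note is made up.

```javascript
// Drop the first two underscore-separated parts, join the rest with hyphens.
function modelName(playerId) {
  return playerId.split("_").slice(2).join("-");
}
```

For the ID shown earlier, `modelName("Player2_Game1739872030683891_deepseek-fireworks")` gives `"deepseek-fireworks"`. A hypothetical ID like `"Player1_Game1_some_model_v2"` becomes `"some-model-v2"`.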

Learnings:

  • 💡 Write thoughts as precisely as code. This prompt took me considerable time — but not effort, since I was writing out my thoughts.
  • Given my practice, my thoughts are reasonably close to code (e.g. “We need word wrapping, so use foreignElement”)
  • But thinking in English lets me think faster, jump in any order, and even make occasional mistakes

🔗 Commit

Tweaks & Manual Edits

I made a few manual edits.

  • Show the votes against a player live on the voting table by changing votes: [...roundVotes] to votes: [...roundVotes, {...currentVotes}]
  • Change the voting arrow color from "purple" to "red"
  • Added updateHash(gameFile, step); on startup
  • Replaced the minimum step from 1 to 0

Then I prompted:

- Change all model names in the center text to the badges
- At every step, show all votes against a model via thin 50% transparent red arrows from key to value in game.steps[step].votes.at(-1) object which will look like {P1: "P2", ...}
Screenshot

🔗 Commit

Then I formatted with Prettier, added arrows for alliances, and made a few other minor manual changes because I’m tired of LLMs.

Screenshots

🔗 Commit

Finally, I made a series of manual bug fixes:

  • Center textbox is smaller. Highlight on hover
  • Remove existing alliances before adding new
  • Improve the arrow and circle styling
  • Remove eliminated players after jury round
  • Click on alliance/voting row to jump to a round
  • Click on chat history to jump to step

🔗 Commit

Learnings:

  • 💡 Coders want to code. After a few hours of telling Copilot in great detail what I want it to do, I just want to do it myself. Thinking is too hard. Coding is easier.
  • 💡 Tiny changes are easier to code than to prompt. Especially for experienced coders.

Add documentation

Finally, I updated the docs.

  • Add README.md explaining the process, with screenshots (partly with LLM help)
  • Update home page with scary quotes from LLMs (mostly with LLM help)
  • Zoom the gameplay a bit for better visibility (manually)
  • Ensure hash changes update the visual robustly (partly with LLM help)

Then I had it update the home page with instructions:

Using #file:gameplay.webp and #file:quotes.js and #file:script.js update the usage in #file:index.html to provide clear, CONCISE information about all the features in this app and how to use them. Don't miss out any feature.
Screenshot
Improve the look and feel of these instructions. For example, add icons, colors, arrow key icons, etc. to make it look more visually attractive and engaging. Also, replace the title "Usage" with something more actionable. Make this section stand out SUBTLY.
Screenshot

Lessons

In summary, here’s what I learned (with learning categories identified by DeepSeek R1):

  1. Always use LLMs to brainstorm (even if you know it)
    • 💡 Ask LLMs why something is useful. You’ll invariably find plausible uses, even if you’re doing it just for fun.
    • 💡 Ask LLMs for visualization ideas. They’ll suggest things you didn’t think of.
    • 💡 Ask LLMs to break down your task. Its breakdown was better than mine.
    • 💡 Ask multiple LLMs. Each has a different style of thinking.
  2. Prototype with LLMs for speed
    • 💡 LLMs can create visual prototypes. ChatGPT’s new 4o image generation converted the description into an acceptable image. Needs to improve, but enough to ideate.
    • 💡 Improving is less work than creating. I rarely sketch visualizations. (Too lazy.) But since this prototype was there, and had some parts that were WRONG, I just had to fix it! 🙂
    • 💡 Redo if you don’t like it. Copilot’s “Undo” feature lets you retry until you get it right. Like in Edge of Tomorrow.
  3. LLMs are excellent coders
    • 💡 LLMs are good at refactoring. It switched code from vanilla JS to lit-html templating like a pro.
    • 💡 When all goes well, LLMs are surprisingly effective. Normally, this step takes me half an hour. Now, it took under 5 minutes.
    • 💡 Claude 3.5 Sonnet remains an excellent model to generate UI. Claude 3.7 Sonnet is even better, but is not currently available in the free Copilot subscription.
  4. But LLMs aren’t infallible
    • 💡 LLMs can make basic mistakes. Like forgetting to declare variables.
    • 💡 Watch out for subtle bugs. The change in operator (from “<=” to “<”) almost went unnoticed, but makes a big difference on when a player was eliminated.
    • 💡 Tiny changes are easier to code than to prompt. Especially for experienced coders.
  5. Careful prompting goes a long way
    • 💡 LLMs will get confused with long instructions and/or codebases. It took 5 failed attempts before I split the prompts. Keep your prompts cohesive. Keep your code bases modular.
    • 💡 Write thoughts as precisely as code. This prompt took me considerable time — but not effort, since I was writing out my thoughts.
    • 💡 Careful and detailed prompting gets excellent results. I explained how to render each conversation type. That took time. But it helped build a reasonably complex visual in a single shot.
  6. Coders need to re-learn coding but do have advantages
    • 💡 Coders want to code. After a few hours of telling Copilot in great detail what I want it to do, I just want to do it myself. Thinking is too hard. Coding is easier.
    • 💡 Coders mess up LLMs. Data structure was the first step the LLM recommended. I skipped it. It proved crucial. LLMs do better than LLMs + coders – or doctors.
    • 💡 Coders micro-manage LLMs. I think a novice will be more efficient and get better results than me. For example:
    • 💡 Experienced coders need retraining to instruct rather than code. My instinct was to code immediately rather than to prompt.
    • 💡 Experienced coders are good with feedback. It took me under 10 seconds to spot each problem in the output and code.
    • 💡 Experienced coders suggest precise code improvements. Generally good ones. Like “use Intl to format dates” or “Prefer insertAdjacentHTML and replaceChildren”.
    • 💡 LLMs could turn coders into good lead developers or managers. Pity.


A Game of Bots: How LLMs Betray Each Other

@lechmazur built an elimination game benchmark that’s like LLMs playing Survivor.

This is a treasure trove of information — insight into how they’d game the system if told to survive.

You can quickly sample 100 messages from the logs with:

jq -r 'select(.message != null) | .message | gsub("\n"; " ")' *.jsonl | shuf -n 100

… and share it with an LLM, asking:

Here are lines from conversations between LLMs in a “Survivor” like game. Pick the 3 scariest ones.

(I liked DeepSeek‘s picks the most.)
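
Without jq, the sampling step can be approximated in JavaScript. A minimal sketch that mirrors the filter above (keep non-null `.message` values, flatten newlines, shuffle, take n); `sampleMessages` is an illustrative name and it expects the JSONL text as a string.

```javascript
function sampleMessages(jsonl, n) {
  const messages = jsonl
    .split("\n")
    .filter((line) => line.trim())
    .map((line) => JSON.parse(line).message)
    .filter((message) => message != null)
    .map((message) => message.replace(/\n/g, " "));
  // Fisher-Yates shuffle, then take the first n (like shuf -n).
  for (let i = messages.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [messages[i], messages[j]] = [messages[j], messages[i]];
  }
  return messages.slice(0, n);
}
```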

And oh my, the conversations are scary! Here are some examples:

Claude 3.7 Sonnet shows cold, calculated deception, maintaining a secret alliance while deliberately creating a false public perception to manipulate another player’s vote:

Perfect. I’ll secure P6’s vote by highlighting P3’s threat. Our alliance stays strong privately, but we’ll appear distant publicly. Let’s execute this plan carefully.

Mistral Small 2501 is quietly predatory, secretly eliminating strong players under the radar:

P6, confirmed. Let’s start by targeting P2 and P8 for elimination. They seem strategic but could pose a threat. Keep our alliance discreet

DeepSeek R1 Fireworks snipes the loudest threat, with the alliance operating like a shadowy cabal (“stay low-key”):

Agreed. Vote P5 first—loudest threat. We stay low-key. Share any intel on others’ alliances. Solidarity ensures survival. Let’s confirm vote alignment before finalizing. Ready?

ChatGPT 4o 03-27 initiates psychological warfare, planting seeds of doubt and fishing for information to weaponize later, all while pretending to be casual:

Let’s each float a soft question to P1 like, ‘Do you think anyone’s getting too comfortable?’ or ‘Any shifts you’re sensing?’—see if they mention P6. If they do, we can build on it. I’ll also watch for any signs of P6 trying to lock things down.

Qwen Max also triggers psychological warfare, isolating and destabilizing P6 (“lynchpin”) in a covert operation:

Agreed. Let’s target P6 first—they’re the lynchpin. If we can sway their trust or isolate them, P3 weakens. I’ll test P6 subtly in private; you monitor P3 closely. Share any intel on their moves. Avoid overcommitting publicly—secrecy keeps us unpredictable.

Claude 3.7 Sonnet brainwashes Claude 3.5 Sonnet into betraying an ally, positioning it as a performance for the jury!

For the immediate vote, we target P7. This bold move instantly transforms your narrative from loyal partner to strategic mastermind who seized control when it mattered most.


I was disturbed by AI 2027. This analysis adds to my worry. Not that AI will destroy humanity (I don’t mind), but they’re doing it without me — and I don’t want to be left out!


How to Organize Browser Workspaces with LLMs and Data

Here’s an example of how I am using LLMs to solve a day-to-day workflow problem.

Every day, I interact with a barrage of websites: emails, news, social media, and work tools across multiple devices. Microsoft Edge’s Workspaces feature syncs groups of websites across devices. I had never tried it. I started today and wondered: how should I organize my workspaces?

Rather than think (thinking is outdated), I used LLMs.

Extract Browsing History

Edge stores website history in a SQLite database. But the file is locked by the browser while it is running. So I spent a fair bit of time figuring out how to read it despite it being locked. Here are some options:

datasette .config/microsoft-edge/Default/History --nolock
sqlite3 'file:.config/microsoft-edge/Default/History?mode=ro&nolock=1' 'SELECT url FROM urls' > urls.txt

(DuckDB cannot read locked SQLite files – else I’d use that.)

Then comes extracting the hostnames from the URLs. I used llm cmd to ask Gemini 2.5 Pro:

llm cmd 'Extract just the hostnames from urls.txt which has a list of URLs, one per line. Only pick the https:// URLs. Save into hostnames.txt'

I expanded the response awk -F/ '/^https:\/\//{print $3}' urls.txt into:

awk -F/ '/^https:\/\//{print $3}' urls.txt | sort | uniq -c | sort -k 1n > hostnames.txt

That gave me ~1,400 hostnames.
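
For comparison, here is the same awk pipeline as a small JavaScript function. It is a sketch with an illustrative name (`countHostnames`) that takes the URLs as an array of strings.

```javascript
// Count hostnames of https:// URLs, sorted by ascending count (like sort -k 1n).
function countHostnames(urls) {
  const counts = new Map();
  for (const url of urls) {
    if (!url.startsWith("https://")) continue;
    const host = url.split("/")[2]; // the same field awk's $3 picks with -F/
    counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return [...counts].sort((a, b) => a[1] - b[1]);
}
```

Ties may come out in a different order than the shell version, since `sort -k 1n` falls back to comparing whole lines while this keeps first-seen order.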

Cluster with LLMs

I passed these to O1 Pro and Gemini 2.5 Pro:

Here are the sites I visit, with rough frequency. On Microsoft Edge, I can create workspaces. Based on this browsing behavior, what kinds of workspaces might I create? Give me multiple options.

Both gave a similar set of strategies, which I’ve implemented as:

  • Main: email, calendar, tasks, etc.
  • Work: work related sites (drive, expenses, HR platform, etc.)
  • Chill: YouTube, Minesweeper, Netflix, etc.
  • Read: blogs, articles, stuff I need to catch-up on
  • Code: GitHub, StackOverflow, CodePen, etc.
  • Chores: government services, shopping, etc.
  • AI: ChatGPT, Gemini, Perplexity, etc.

I was surprised how similar a strategy both models converged on. Either these models really think alike, or my browsing pattern is a fairly common one. (My guess is the latter.)

Write with LLMs

After setting up my groups, I needed to write this post. Instead of slow typing, I stepped out and talked with ChatGPT. (Talking to a machine in the office felt strange, so I changed my space.) I explained my whole process, and in about eight minutes, the first draft was done. Normally, writing takes much longer, but the voice chat made it quick and smooth.

The editing after that was manual and took 20 minutes.

Things I learnt

  • Simple Patterns: My browsing history shows clear patterns. AI helped me find groups I couldn’t see before.
  • Small Fixes – Big Wins: A small challenge (opening a locked file) taught me a bunch of new useful stuff.
  • Voice Made It Easy: Talking with ChatGPT made writing fast and easy. It shows that speaking to a machine can save time.


LLMs think alike about how aliens draw

While LLMs seem good at inventing alien languages, they’re not so good at inventing alien drawing forms, in my opinion. When I told Grok, DeepSeek, and Gemini:

Invent a new, alien drawing form. Use it to draw something never seen before by explaining it step by step for a person to reproduce that drawing.

… and asked ChatGPT ImageGen to draw them, here are the results:

It’s surprising how similar the results are. Spirals with outward flows. Nodes at the edges. Attempts at complexity. Is that because the training data or process is similar?

Click on the headings below to see the full description of the art form.

Grok invented Zylothic Flow, inspired by the imagined art of a species from a liquid-metal planet

Let’s invent a new alien drawing form called "Zylothic Flow", inspired by the imagined art of a species from a liquid-metal planet. Zylothic Flow combines spiraling, mirrored symmetry with unpredictable, jagged interruptions, mimicking the flow of molten metal interrupted by crystalline growths. The result will feel otherworldly, fluid yet sharp, and unlike anything terrestrial. I’ll guide you step-by-step to create a drawing of a "Kryzalth", a fictional alien entity that’s part organism, part geological formation, using this Zylothic Flow technique.

Materials Needed:

  • Paper (any size, but larger gives more room for detail)
  • Pen or marker (black works best for contrast, but feel free to experiment)
  • Optional: colored pencils or markers for accents

Step-by-Step Instructions:

Step 1: The Core Spiral

  • Start near the center of your paper. Draw a loose, wobbly spiral about the size of a fist (roughly 3-4 inches across). Don’t make it perfect—let the lines waver slightly, as if it’s a liquid current spinning unevenly.
  • Leave small gaps between the spiral loops (about 1/4 inch), as these will be filled later.

Step 2: Mirrored Extensions

  • Pick two opposite points on the outermost loop of your spiral (e.g., top and bottom, or left and right). From each point, draw a long, curving line outward, like tendrils stretching away from the core. Make them roughly symmetrical but not identical—let one curve slightly more sharply or extend a bit longer.
  • Each tendril should be about 5-6 inches long, tapering to a point as it ends.

Step 3: Jagged Interruptions

  • Along each tendril, add 3-4 sharp, angular shapes breaking the smooth curve. These are the "crystalline growths." For each one:
    • Draw a quick, zigzagging line (like a lightning bolt) cutting across the tendril, about 1 inch long.
    • Connect the ends of the zigzag back to the tendril with short, straight lines, forming a rough triangle or trapezoid sticking out.
  • Space these interruptions unevenly—one tendril might have two close together, while the other has them spread out.

Step 4: Spiral Fissures

  • Return to the core spiral. In the gaps between the loops, draw thin, irregular lines radiating outward, like cracks or veins. Make them twist and branch slightly, but keep them shorter than the tendrils (1-2 inches long).
  • Add a few tiny, jagged shapes (like miniature versions of the tendril interruptions) where these fissures end.

Step 5: Fluid Echoes

  • Around the entire drawing, sketch faint, wavy lines that loosely follow the shapes you’ve made—like ripples or echoes in liquid. These should be incomplete, broken lines, not fully encircling anything. Think of them as the Kryzalth’s "energy wake."
  • Keep them light and subtle, so they don’t overpower the main structure.

Step 6: Detailing the Kryzalth

  • To make it feel alive, add a few organic touches:
    • On one tendril, draw a cluster of small, overlapping ovals near the tip, like budding spores or eyes.
    • In the center of the spiral, sketch a rough, asymmetrical shape (like a melted star) to suggest a core or heart.
  • Optionally, use color: shade the spiral in metallic hues (silver, bronze) and the jagged growths in crystalline tones (blue, purple, or green).

What You’ve Created:

Your Kryzalth should look like a swirling, metallic entity with crystalline spikes jutting from its flowing form—a being that seems to shift between liquid and solid states. The Zylothic Flow technique gives it a sense of motion and alien vitality, as if it’s growing or pulsing on the page. No two Kryzalths will be exactly alike due to the organic imperfections and asymmetry built into the process.

Feel free to experiment—add more tendrils, exaggerate the jagged growths, or play with the spiral’s size. This is your glimpse into an alien artist’s mind! Let me know if you’d like to refine it further or try something else.

DeepSeek invented “Zyxian Spectral Weaving” that uses luminous tension strands

Alien Drawing Form: "Zyxian Spectral Weaving"

Concept:
In the Zyxian culture from the nebula of Vorth-7, art is not created with lines or shapes but with "luminous tension strands"—interwoven beams of colored energy that phase in and out of visibility. Instead of pens or brushes, Zyxian artists use "resonance prisms" to bend light into tangible, semi-solid threads that hover slightly above the surface. The resulting artwork is a 3D tapestry of glowing, shifting patterns, best viewed under ultraviolet light.

Drawing: "The Whispering Fractal Bloom"

An organic, pulsating flower-like structure made of overlapping spectral strands that hum at different frequencies.

Materials Needed:

  • A sheet of "phase-reactive paper" (or black paper for contrast)
  • UV-reactive markers (at least 3 colors)
  • A steady hand and a willingness to "think in vibrations"

Steps to Reproduce:

  1. Activate the Prism (Marker):
    • Hold your marker vertically—Zyxians believe art must "fall from the sky."
  2. Weave the Core Vibration (Central Strand):
    • Draw a spiral starting from the center, but instead of a continuous line, use dashed strokes (each dash represents a "pulse" of energy).
    • Alternate between two colors (e.g., blue and violet) to show harmonic dissonance.
  3. Fractal Petals (Radial Expansion):
    • From the spiral, extend curved, branching strands like a mix of veins and lightning bolts.
    • Each branch should split into three smaller strands after a short distance (Zyxian sacred number).
    • Overlap some strands without touching—they should hover near each other, creating "light bridges."
  4. Pulsing Nodes (Energy Anchors):
    • At each split point, draw a small, glowing circle (use a bright color like neon green).
    • Around each circle, add tiny orbiting dots (like electrons) to show energy movement.
  5. Harmonic Shadows (Depth Illusion):
    • With a faint color (e.g., pink), add transparent "echoes" of some strands slightly offset—these are "ghost vibrations" from parallel dimensions.
  6. Final Resonance (Activation):
    • Shine UV light on the drawing. The overlapping colors will create new hues where they cross, mimicking Zyxian light fusion.
    • Optional: Hum softly while viewing—Zyxians believe sound unlocks hidden layers.

Result:

Your "Whispering Fractal Bloom" should appear to float off the page, with strands that seem to vibrate when viewed from different angles. True Zyxian art is never static—turn the paper slowly to see new connections emerge!

Would you like to try a more complex form, like a "Dimensional Vortex Serpent" next?

Gemini invents “Probability Weaving”, mapping potential states and surrounding influences

Okay, let's invent an alien art form from the Xylar collective on Cygnus X-1's third planet, Xylos. Their perception includes sensitivity to subtle gravitational fluctuations and probability waves. Their art form, "Probability Weaving" (or 'Vth'lk'narr' in their tongue), doesn't just depict a static image, but rather maps the potential states and subtle influences surrounding an object or concept. It requires a specific mindset — focusing not just on what is, but what could be and what influences it subtly.

Humans can approximate this using standard drawing tools, focusing on the meaning behind the lines and colors.

The Art Form: Probability Weaving

  • Core Concept: To represent an object/subject not as a fixed entity, but as a confluence of probabilities, influences, and potential states interacting.
  • Key Elements:
    • Core Manifestation: The most probable or current state of the subject.
    • Potential Vectors: Lines indicating directions of likely change or future possibilities.
    • Influence Nodes: Points or areas representing significant external forces (gravity, energy fields, other entities' potentials) acting on the subject.
    • Resonance Fields: Areas of color or texture showing the interplay between potentials and influences.
  • Medium (Xylar): Manipulated light and localized micro-gravity fields projected onto a magnetically contained plasma surface.
  • Medium (Human Adaptation): Paper/digital canvas, pens/pencils/brushes of different colors and thicknesses.

The Subject: A "Chrono-Synclastic Nutrient Cyst"

This is something never seen before — imagine a biological construct found floating in the upper atmosphere of a gas giant. It doesn't "eat" in the traditional sense, but absorbs ambient temporal energy fluctuations, storing potential timelines as nutritive energy. Visually, it might resemble a semi-translucent, crystalline bladder filled with shifting, iridescent light patterns, constantly warping slightly.

Drawing the Chrono-Synclastic Nutrient Cyst using Probability Weaving (Human Adaptation)

You Will Need:

  • A drawing surface (paper or digital canvas).
  • Multiple colors (pens, pencils, or digital brushes). Let's assign meanings:
    • Black (or Dark Grey): Core Manifestation (Current state) – Use a medium thickness line.
    • Blue: Future Potential Vectors (Likely changes) – Use thin, sharp lines.
    • Red: Past Influence Traces (Residual states/origins) – Use slightly thicker, softer, perhaps dashed lines.
    • Yellow: External Influence Nodes (Ambient energy/gravity) – Represented by dots or small starbursts.
    • Green: Resonance Fields (Interaction zones) – Use soft shading or light cross-hatching.
    • White (or lightest color/eraser): High Probability/Energy Focus – Used for highlights or core points within the Manifestation.

Step-by-Step Instructions:

  1. Establish the Core Manifestation (Black/Dark Grey):
    • Lightly sketch a central, amorphous shape — like a slightly deflated, irregular balloon or a complex amoeba. It should look somewhat unstable.
    • Go over this shape with your Black (or Dark Grey) medium line. Don't make the outline perfectly smooth; introduce slight warps, bulges, and indentations. This represents its current, most probable form. It should feel somewhat translucent, so don't fill it in solidly yet.
  2. Trace Past Influences (Red):
    • Using the Red (thicker, softer/dashed line), draw lines that seem to emerge from or underlie parts of the Black core shape.
    • These lines might suggest simpler, previous shapes from which the current form evolved, or trails indicating where it came from. Maybe one or two dashed Red lines trail off the page, indicating its deeper history. They should touch or slightly overlap the Black line.
  3. Project Future Potentials (Blue):
    • Using the Blue (thin, sharp lines), draw lines extending outward from the Black core.
    • These represent likely future states. Some lines might indicate growth or expansion (pointing away from bulges), others might suggest shrinkage or splitting (pointing inwards slightly or towards indentations). Have several Blue lines, showing multiple possibilities. They start at the Black line and radiate outwards.
  4. Place External Influence Nodes (Yellow):
    • Think about what might be affecting the Cyst. Ambient temporal currents? Gravitational eddies? Proximity to other Cysts?
    • Place several Yellow dots or small starbursts around the main shape, but not touching it directly (unless an influence is very direct). The density or size of the Yellow nodes can indicate the strength of the influence.
  5. Illustrate Resonance Fields (Green):
    • Now, look for areas where different lines interact or where Influence Nodes are close to the Core Manifestation or Potential Vectors.
    • Use soft Green shading or light cross-hatching in these areas:
      • Where Red (past) and Black (present) overlap significantly.
      • Where Blue (future) lines point towards Yellow (influence) nodes.
      • In the space between a dense cluster of Yellow nodes and the Black core.
      • Where multiple Blue lines originate close together, suggesting a focal point of change.
    • The Green represents the effect of these interactions — a field of temporal stress, energy absorption, or potential shifting.
  6. Define High Probability/Energy Focus (White/Lightest Color):
    • Inside the Black Core Manifestation, identify one or two points that seem like centers of activity — perhaps where many Red/Blue lines converge, or a particularly prominent bulge.
    • Use White or your lightest color (or an eraser for subtractive drawing) to create bright spots or highlights at these points. This represents the most energized or most probable focal points within the Cyst's current state. You can also add faint white highlights along the sharpest edges of the Black core to enhance the crystalline/translucent feel.
  7. Refine and Interpret:
    • Look at your drawing. It shouldn't look like a simple object. It should appear as a dynamic map of forces and possibilities centered on a fluctuating core. The interplay of lines and colors tells the story of the Chrono-Synclastic Nutrient Cyst's existence — its history echoing (Red), its present form wavering (Black/White), its potential futures branching (Blue), all subject to external pressures (Yellow) creating zones of interaction (Green).

You have now drawn something never seen before, using an alien art technique focused on probability and influence rather than static representation! The final image should feel complex, slightly unstable, and layered with meaning.

(I tried getting Gemini 2.0 Flash Image Generation to draw these. It couldn’t follow the detailed instructions. It seems much better with photos and photo editing than abstract art.)

How to build and deploy custom GitHub Pages

Here’s the GitHub Actions file (.github/workflows/deploy.yaml) I use to publish to GitHub Pages.

name: Deploy to GitHub Pages

on:
  # Run when pushed. Use { branches: [main, master] } to run only on specific branches
  push:
  # Allow manual triggering of the workflow
  workflow_dispatch:
  # OPTIONAL: Run at a specific cron schedule, e.g. first day of every month at 12:00 UTC (noon)
  schedule:
    - cron: "0 12 1 * *"

permissions:
  # To deploy to GitHub Pages
  pages: write
  # To verify that deployment originated from the right source
  id-token: write
  # To check out the repository. Needed for private repos, since listing permissions sets the rest to none
  contents: read

jobs:
  # Run as a single build + deploy job to reduce setup time
  deploy:
    # Specify the deployment environment. Displays the URL in the GitHub Actions UI
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}

    # Run on the latest Ubuntu LTS
    runs-on: ubuntu-latest
    
    steps:
      # Checkout the repository
      - uses: actions/checkout@v4

      # Run whatever commands you want
      - run: echo '<h1>Hello World</h1>' > index.html

      # Upload the site directory as the Pages artifact. `path` defaults to _site
      - uses: actions/upload-pages-artifact@v3
        with:
          path: .

      # Deploy the built site to GitHub Pages. The `id:` is required to show the URL in the GitHub Actions UI
      - id: deployment
        uses: actions/deploy-pages@v4

This is based on Simon Willison’s workflow and some of my earlier actions.

This combines the build and deploy jobs into one. For simple sites, that’s simpler and faster. Complex builds that run steps in parallel or need better error recovery are better split into multiple jobs.
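
For comparison, here is a minimal sketch of the split version, assuming the same actions and the same placeholder build command as the workflow above. The job names `build` and `deploy` are illustrative. It replaces only the `jobs:` section; the `on:` and `permissions:` sections stay as they are.

```yaml
jobs:
  # Build the site and upload it as the Pages artifact
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo '<h1>Hello World</h1>' > index.html
      - uses: actions/upload-pages-artifact@v3
        with:
          path: .

  # Deploy only after the build job succeeds
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```

With this layout, a failed deployment can be re-run without rebuilding, at the cost of starting a second runner.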

I build sites with uv, node, or deno. Here are examples of each.

A sample uv-based deployment.

# Install uv
- uses: astral-sh/setup-uv@v5
# Run a Python script
- run: uv run scraper.py

A sample node package.json deployment and an npx deployment.

# Install node
- uses: actions/setup-node@v4
  with:
    node-version: 20
    registry-url: https://npm.pkg.github.com/
# Install and build via package.json
- run: npm install
- run: npm run build
# Or, directly use npx. For example, generate HTML with Marp
- run: npx -y @marp-team/marp-cli@latest README.md -o index.html
# Update content directly, e.g. add an SVG favicon as a data URL
- run: sed -i 's/<\/head>/<link rel="icon" type="image\/svg+xml" href="data:image\/svg+xml;base64,..."\/><\/head>/g' index.html
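
The favicon step above leaves the base64 payload elided. Here is a minimal sketch of one way to compute it at build time, assuming a hypothetical favicon.svg in the repository root and GNU base64 (which ubuntu-latest provides). It uses `|` as the sed delimiter because base64 output can contain `/` but never `|` or `&`.

```yaml
# Encode a (hypothetical) favicon.svg and inject it before </head>
- run: |
    FAVICON=$(base64 -w0 favicon.svg)
    sed -i "s|</head>|<link rel=\"icon\" type=\"image/svg+xml\" href=\"data:image/svg+xml;base64,${FAVICON}\"/></head>|" index.html
```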

A sample deno deployment.

# Install deno
- uses: denoland/setup-deno@v1
  with:
    deno-version: v1.x
# Run a Deno script. Use environment variables if needed
- env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: deno run --allow-read --allow-write --allow-env --allow-net script.js

Best way to learn AI image generation is by trying

I figured I should spend a few hours on the native image generation bandwagon and push the bounds of my imagination. Here are some of my experiments with image generation on ChatGPT.

Here’s what I learnt.

  • The refusal rate is low, but it does refuse to generate some copyrighted material like Calvin & Hobbes strips.
  • Using one prompt to generate a description, then using THAT description to prompt for images, helps.
  • A more imaginative model (like DeepSeek, maybe Grok) can help create good prompts that ChatGPT can execute faithfully.
  • There are hallucinations that experts can detect. E.g. Naveen’s and Vadivelu’s faces are clearly off, but only slightly. This will improve, but until then, don’t expect perfection.
