verma7 · 2026-03-01T06:00:57 1772344857

I wrote a C++ translation of it: https://github.com/verma7/microgpt/blob/main/microgpt.cc

2x the number of lines of code (~400L), 10x the speed

The hard part was figuring out how to represent the Value class in C++ (ended up using shared_ptrs).

WithinReason · 2026-03-01T09:21:35 1772356895

I made an explicit reverse pass (no autodiff), it was 8x faster in Python

hu3 · 2026-03-02T04:56:42 1772427402

I made an explicit double-reverse pass (no code!), it was 80x faster in my head!

spopejoy · 2026-03-02T14:20:21 1772461221

"I've got an ipod -- In My Mind"

https://theonion.com/i-have-an-ipod-in-my-mind-1819584018/

WithinReason · 2026-03-02T21:26:22 1772486782

code here, it's just not interesting to look at:

https://news.ycombinator.com/item?id=47220542

bear3r · 2026-03-02T05:13:02 1772428382

tradeoff worth naming: you avoid the autodiff graph overhead (hence the speedup), but any architecture change means rewriting every gradient by hand. fine for a pedagogical project, but that's exactly why autodiff exists.
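
To make the tradeoff concrete, here is a minimal sketch (not the code from either link in this thread): a tiny micrograd-style `Value` class next to a hand-derived gradient for the same expression. The function `tanh(w*x + b)` and every name below are assumptions chosen for illustration only.

```python
import math

class Value:
    """Minimal scalar autodiff node (illustrative sketch, not the thread's code)."""
    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def tanh(self):
        t = math.tanh(self.data)
        return Value(t, (self,), (1.0 - t * t,))

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v._parents, v._local_grads):
                p.grad += g * v.grad

def autodiff_grads(w, x, b):
    # The graph is built and walked at runtime; changing the expression needs no new math.
    W, X, B = Value(w), Value(x), Value(b)
    out = (W * X + B).tanh()
    out.backward()
    return W.grad, B.grad

def manual_grads(w, x, b):
    # Hand-derived: d/dw tanh(w*x + b) = (1 - tanh^2) * x and d/db = (1 - tanh^2).
    # No graph objects are allocated, but this must be rewritten if the expression changes.
    t = math.tanh(w * x + b)
    d = 1.0 - t * t
    return d * x, d
```

Both functions should return the same gradients for the same inputs; the manual version just skips building and sorting a graph, which is plausibly where most of the reported speedup comes from.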

love2read · 2026-03-01T23:05:08 1772406308

Can you share a link?

WithinReason · 2026-03-02T16:54:14 1772470454

https://www.ideone.com/VAz4Nn

Doesn't run inside IDEone due to the external download link, but you can copy&paste the code over

freakynit · 2026-03-05T09:27:45 1772702865

24x speedup (over 10x already) and similar loss profile (for c++ version, optimized by claude): https://gist.github.com/freakynit/3982eab8413a89941bd0018e63......

verma7 · 2026-03-08T05:25:53 1772947553

This is amazing! Thanks for optimizing the code using Claude!

verma7 · on Jan 13, 2021

My guess is Craig Silverstein, as he mentioned that Larry and Sergey might have worked with him at Stanford.

dilyevsky · on Jan 13, 2021

Yeah from description def sounds like him. He was on the way out when i joined but he was probably more active when this was written

verma7 · on Jan 1, 2021

I am planning to learn how to write better. I am starting by keeping a stream-of-consciousness personal daily log, and I also get more practice by commenting on Hacker News :)

verma7 · on Dec 31, 2020

The analogy of the question "What is life?" with "What is a computer?" brings up interesting parallels. The author defines a computer as any device that has transistors, RAM, etc., that is, the computing substrate. But there were devices that didn't use transistors and instead used vacuum tubes or mechanical gears (like Babbage's Analytical Engine), which I think are still computers. We have some good theoretical models of computers, like Turing machines. One possible definition of a computer is any device that is Turing complete.

I wonder what such a theoretical model of life would look like.

frongpik · on Dec 31, 2020

The analogy goes deeper. Most of us are familiar with the spirit-soul-body concept. Descriptions of the three are very similar to concepts in CS: spirit is the algorithm or an idea - in this sense the spirit is immutable and the same for all devices, soul is the software - it's mutable, implemented in a particular language or framework and generally strives to be a perfect reflection of the spirit, and finally the CPU itself with all the machinery running electrons is the body - it's run by that almost immaterial soul, although no part of that machinery, not even a single electron, is a part of the soul. The connection between the three can be easily understood by CS folks, but is a great mystery for the uninitiated (i.e. those who can't code).

There's even a striking analogy of good and evil. "The evil is affirmation of disorder" (Eliphas Levi) and that is bugs and poor chaotic design: software that embraces this disorder, gives up liberty and reason (aka the good) and becomes evil, for evil has neither liberty nor reason.

I'm sure The Matrix used this very analogy.

verma7 · on Aug 4, 2020

How would you get the distribution from the sum of numbers individually?

chrisdirkis · on Aug 4, 2020

I believe the parent meant breaking it down as "if you earn $100k, add 10 slips that each have $10k on them". The number of slips will be more than the number of people in the room, but that's fine. It's less practical than the "seed + pass round" method that someone else recommended imo (due to handwriting notability), but still works in a more technical sense.

jhanschoo · on Aug 4, 2020

You're right, you can't, I was looking at the problem posed by grandparent comment of getting only the average.

jessaustin · on Aug 4, 2020

You don't count the slips of paper in the hat. Instead you count the number of people who put in slips of paper.

nitrogen · on Aug 4, 2020

That will give an average, but it won't give the distribution (min/max/etc)
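
A rough sketch of the slips-in-a-hat idea discussed above, assuming fixed-size slips of $10k (the helper names `slips_for` and `average_from_hat` are hypothetical). It shows why dividing by the number of people recovers the average, and why the hat alone cannot recover min/max: two rooms with different salaries can produce identical hats.

```python
import random

def slips_for(salary, unit=10_000):
    """Split one salary into anonymous slips of `unit` each, plus a remainder slip if needed."""
    full, rest = divmod(salary, unit)
    return [unit] * full + ([rest] if rest else [])

def average_from_hat(salaries, unit=10_000):
    """Return (average, hat contents). Only the headcount and the hat are used for the average."""
    hat = [slip for salary in salaries for slip in slips_for(salary, unit)]
    random.shuffle(hat)  # slips carry no owner information
    # Divide by the number of participants, not by the number of slips.
    return sum(hat) / len(salaries), hat

# Two rooms with the same average but different distributions.
avg_a, hat_a = average_from_hat([100_000, 100_000])
avg_b, hat_b = average_from_hat([50_000, 150_000])
print(avg_a, avg_b)                    # both averages are equal
print(sorted(hat_a) == sorted(hat_b))  # the hats are indistinguishable
```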

verma7 · on June 9, 2020

|Q-Z| is countably infinite, but |Q| = |Z|. There are countably infinite rational numbers that are not counting numbers, yet there are exactly as many rational numbers as counting numbers.
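
The same phenomenon shows up one level down, between the integers and the naturals, where an explicit bijection is easy to write. This is only an illustration of the idea, assuming $\mathbb{N}$ includes 0 and that `amsmath` is available for `cases`:

```latex
\[
f : \mathbb{Z} \to \mathbb{N}, \qquad
f(n) =
\begin{cases}
2n, & n \ge 0,\\
-2n - 1, & n < 0,
\end{cases}
\]
% f sends 0, 1, 2, ... to the even naturals and -1, -2, ... to the odd naturals,
% so |Z| = |N| even though Z \ N (the negative integers) is countably infinite.
% The claim in the comment is the analogous statement one level up:
\[
|\mathbb{Q} \setminus \mathbb{Z}| = \aleph_0
\quad\text{and yet}\quad
|\mathbb{Q}| = |\mathbb{Z}| = \aleph_0 .
\]
```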

verma7 · on Feb 15, 2020

Change List: internal Google lingo for what is more commonly known as a patch, diff, or pull request in the open source world.

verma7 · on Oct 11, 2019

You might find the last section "Sleep spindles and intelligence" in https://www.tuck.com/sleep-spindles/ interesting.

verma7 · on Sept 8, 2019

It's easy to transition to the manager ladder by becoming a TLM (Tech Lead Manager) once you are L5 or higher.

verma7 · on July 1, 2019

The Headspace app has worked well for me too. I have been using it for more than 3 months, meditating for 20 mins every day, and I feel calmer throughout the day.