Nothing to add other than that I agree, and this is an easy-to-understand, correct way of framing the problem. But the framing probably won't convince exuberant AI folks. These tools are incredibly powerful, but they can only take you faster in the direction you were already going. Using them responsibly requires discipline, and not everyone has it, and we're about to undertake a grand social experiment in seeing how it plays out.
Unnecessary complexity was already a big problem in software prior to LLMs, and I think we're about to see it become, unbelievably, a much much bigger problem than before. My personal belief is that there are a lot of exponential harms that stack up as complexity increases:
1) Difficulty of understanding a codebase rises exponentially with code size. A 100kLOC codebase is more than 10x harder to understand than a 10kLOC codebase.
2) Large codebases get larger faster, classic snowball effect. Implementing a given feature against a larger codebase requires a larger change, even if it's the ideal, perfect change. Plus, past shortcuts and bad abstractions make future shortcuts and bad abstractions much more likely.
3) Team size grows with codebase size, and larger teams trigger Brooks's law: the number of communication paths grows quadratically with team size.
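The quadratic growth in (3) is easy to make concrete. A minimal sketch (the function name is mine, not anything from the thread):

```python
def communication_paths(team_size: int) -> int:
    """Pairwise communication paths in a team of n people:
    n * (n - 1) / 2, the combinatorial heart of Brooks's law."""
    return team_size * (team_size - 1) // 2

# Doubling the team roughly quadruples the coordination overhead.
for n in (5, 10, 20, 40):
    print(n, communication_paths(n))
```

A 5-person team has 10 paths; a 40-person team has 780, which is why the overhead compounds rather than scaling linearly with headcount.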
Prior to LLMs, the world already had large codebases that had essentially become emergent organisms: so complex that the organizations that created them completely lost the ability to work in them effectively. Visual Studio and MSBuild, or Microsoft Teams, come to mind. But historically, it was very expensive to get to this point. Now it's going to be much easier.
The section coding.pdf has their code style guidelines, colloquially known as Cutler Normal Form, CNF for short. I'm conflicted on it. Definitely overly verbose, but you can't argue with the results of the NT team. Such a rigid style guide almost feels like the technical version of a dress code. And there's an idea called "enclothed cognition" which is like, if you wear a business suit to work, it exerts a subconscious influence that results in you taking the work more seriously, focusing your attention, etc: https://en.wikipedia.org/wiki/Enclothed_cognition
It's also important to remember that a ton of things we take for granted now simply didn't exist (source code control was in its infancy, merging was shit, syntax highlighting was minimal at best, compiling took time, etc).
I agree that in general people are influenced by their perception of themselves. For example I always pretend I'm a kernel programmer when hacking on small kernels. This did lend me a lot of patience to debug and figure things out, which I do not have for my work.
I imagine it would be frustrating to be the Windows shell dev who has to investigate the torrent of bizarre memory corruption bugs that inevitably occur on Windhawk users' machines after major OS updates. There's really no avoiding it when you detour unstable "implementation detail" sort of functions across the taskbar/systray/start/etc., especially now that C++ coroutines are in widespread usage therein.
But to be fair, I understand the demand for products like this, because of several painful feature takebacks between 10 -> 11. It would be nice if cleaner approaches like wholesale shell replacement were still as straightforward as they were prior to Windows 8. The "immersive shell" infra running in explorer + the opaque nature of enumerating installed UWPs + a bunch of other things make that almost impossible today.
> I imagine it would be frustrating to be the windows shell dev who has to investigate the torrent of bizarre memory corruption bugs that inevitably occur on Windhawk users’ machines after major OS updates.
What makes you think they will be bothered to investigate other people's bugs, when they don't even investigate their own?
From what I've learned: stuff like this makes up a not insignificant portion of the crash reports that come through. This results in crash dumps that are useless at best because they just look like memory corruption or badly written malware. In my discussions with folks about this, an annoying number of people who run this sort of software either a) do not care that it makes developing Windows harder for the devs or b) actively want the usable signal for the Windows development teams to be low.
Using the UIA tree as the currency for LLMs to reason over always made more sense to me than computer vision, screenshot-based approaches. It's true that not all software exposes itself correctly via UIA, but almost all the important stuff does. VS Code is one notable exception (but you can turn on accessibility support in the settings).
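As a toy illustration of the idea (not how UIA or any real agent actually works): flattening an accessibility-style tree into compact, indented text gives an LLM something far denser and less ambiguous than pixels. The node shape below is invented for the sketch.

```python
from typing import Any

def serialize_tree(node: dict[str, Any], depth: int = 0) -> str:
    """Render a simplified accessibility-style tree as indented text,
    keeping only role, name, and actionable state."""
    line = "  " * depth + f'{node["role"]} "{node.get("name", "")}"'
    if node.get("enabled") is False:
        line += " (disabled)"
    lines = [line]
    for child in node.get("children", []):
        lines.append(serialize_tree(child, depth + 1))
    return "\n".join(lines)

# Hypothetical snapshot of a small window's accessibility tree.
window = {
    "role": "Window", "name": "Settings",
    "children": [
        {"role": "Button", "name": "OK"},
        {"role": "Button", "name": "Apply", "enabled": False},
    ],
}
print(serialize_tree(window))
```

A few hundred tokens of this can describe a whole window unambiguously, where a screenshot forces the model to re-derive the same structure from pixels on every turn.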
I’ve been working hard on our new component implementation (Vue/TS) to include accessibility for components that aren’t just native reskins, like combo and list boxes, and keyboard interactivity is a real pain. One of my engineers had it half-working on her dropdown and threw in the towel for MVP because there are a lot of little state edge cases to watch out for.
Thankfully the spec as provided by MDN for minimal functionality is well spelled out and our company values meeting accessibility requirements, so we will revisit and flesh out what we’re missing.
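Those "little state edge cases" are real: even a stripped-down model of the combobox keyboard pattern branches on whether the popup is open before it can interpret a key at all. A hedged sketch of my own simplification (in Python rather than our actual Vue/TS component, and covering only a handful of the keys the spec requires):

```python
class ComboState:
    """Minimal model of combobox keyboard handling: which option is
    highlighted, and whether the popup listbox is open."""

    def __init__(self, options: list[str]):
        self.options = options
        self.open = False
        self.active = 0  # index of the highlighted option

    def key(self, name: str) -> None:
        if not self.open:
            # On a closed combobox, the arrow keys open the popup
            # without moving the highlight.
            if name in ("ArrowDown", "ArrowUp"):
                self.open = True
            return
        if name == "ArrowDown":
            self.active = min(self.active + 1, len(self.options) - 1)
        elif name == "ArrowUp":
            self.active = max(self.active - 1, 0)
        elif name == "Home":
            self.active = 0
        elif name == "End":
            self.active = len(self.options) - 1
        elif name in ("Enter", "Escape"):
            # Enter commits, Escape cancels; both close the popup.
            self.open = False

c = ComboState(["red", "green", "blue"])
for k in ("ArrowDown", "ArrowDown", "ArrowDown"):
    c.key(k)
# First ArrowDown opens the popup; the next two move the highlight.
```

Even this omits type-ahead, aria-activedescendant management, focus handling, and wrapping behavior, which is roughly where the real edge cases live.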
Also I wanna give props (ha) to the Storybook team for bringing accessibility testing into their ecosystem as it really does help to have something checking against our implementations.
Agreed. I've noticed that ChatGPT, when parsing screenshots, writes out some Python code to parse them, and at least in the tests I've done (with things like "what is the RGB value of the bullet points in the list" or similar), it ends up writing and rewriting the script five or so times and then gives up. I haven't tried others, so I don't know if this approach is unique or not, but it definitely feels really fragile and slow to me.
I noticed something similar. I asked it to extract a GUID from an image and it wrote a Python script to run OCR against it... and got it wrong. Prompting a bit more seemed to finally trigger it to use its native image analysis, but I'm not sure what the trick was.
I recently tried using Qwen VL or Moondream to see if off-the-shelf they would be able to accurately detect most of the interesting UI elements on the screen, either in the browser or your average desktop app.
It was a somewhat naive attempt, but it didn't look like they performed well without perhaps much additional work. I wonder if there are models that do much better, maybe whatever OpenAI uses internally for Operator, but I'm not clear how bulletproof that one is either.
These models weren't trained specifically for UI object detection and grounding, so, it's plausible that if they were trained on just UI long enough, they would actually be quite good. Curious if others have insight into this.