Hacker News

'k, I'll take your word for it, especially in relation to Maildir. I know very little about Windows development.

But... just, FWIW, this particular subject has come up on HN a lot over the years, with various explanations: https://news.ycombinator.com/item?id=18783525 and many, many others easily searchable on hn.algolia.com

My fav comment was by an MS dev: "NTFS code is a purple opium-fueled Victorian horror novel that uses global recursive locks and SEH for flow control."



I definitely understand it gets talked about a lot, endlessly. It's not an unearned reputation. I just think that, especially in light of comments like that last one you quoted, much of that reputation at this point is folklore more than benchmarks. People take "Windows is bad at lots of files in a single folder" on faith, like a verse from some bible of Operating Systems Allegories, rather than as something they've worked with directly or tested first-hand.
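For what it's worth, replacing folklore with a number doesn't take much. Here's a minimal sketch one could run on both OSes to measure the "lots of files in a single folder" case directly; the file count and the helper name are my own choices for illustration, not anything from the thread.

```python
import os
import tempfile
import time

def stat_many_files(n=1000):
    """Create n empty files in one directory, then time stat() over all of them."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(n):
            # Touch n empty files in a single flat directory.
            open(os.path.join(d, f"f{i:05d}"), "w").close()
        start = time.perf_counter()
        # The operation under test: one metadata query per file.
        sizes = [os.stat(os.path.join(d, name)).st_size for name in os.listdir(d)]
        elapsed = time.perf_counter() - start
        return len(sizes), elapsed

if __name__ == "__main__":
    count, secs = stat_many_files()
    print(f"stat'd {count} files in {secs * 1e3:.1f} ms")
```

Run it with the same n on Windows and Linux (ideally on the same hardware) and you have a data point instead of an allegory.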

Part of what certainly doesn't help is that most "lots of files in a single folder" applications make other POSIX-based assumptions as well (for example, that locking and consistency under concurrency are opt-in and eventually consistent by default, as in POSIX, rather than opt-out and aggressively consistent by default, as in Windows). If you carry POSIX-based assumptions onto Windows, it doesn't matter what you are doing, "lots of files in a single folder" included: you are going to have a bad time. I can easily presume that is what happened in most of your anecdotal counter-examples (Java Minecraft, Firefox, and Hedgewars will all have different, plausible POSIX biases), though I can't know for certain without benchmarks and performance data in front of me, and none of those are currently my job. Under that presumption, "lots of files in a single folder" is a symptom rather than the root cause. It's very easy to blame the symptom, especially when that sort of performance debugging/fixing is getting in the way of your real goals and the symptom sometimes has such an easy fix (use more folders, bundle more zips, what have you).
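One concrete example of such an assumption, as a sketch: POSIX lets you delete a file while another descriptor still has it open (the inode lives until the last close), whereas Windows by default refuses unless the file was opened with FILE_SHARE_DELETE. The helper name here is mine, purely for illustration.

```python
import os
import tempfile

def can_unlink_while_open():
    """Return True if the OS lets us delete a file while a descriptor is still open."""
    fd, path = tempfile.mkstemp()
    try:
        # POSIX: succeeds, the open fd keeps the inode alive.
        # Windows default share mode: raises PermissionError.
        os.unlink(path)
        return True
    except PermissionError:
        return False
    finally:
        os.close(fd)
        if os.path.exists(path):
            os.unlink(path)  # clean up on the Windows path

if __name__ == "__main__":
    print("unlink-while-open allowed:", can_unlink_while_open())
```

Code written around the POSIX answer (temp files, log rotation, Maildir-style rename/delete dances) hits exactly this kind of wall when ported naively.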

Again, I can't say that with much certainty without specific performance data; I just think people need to question the orthodoxy of "well, Windows is just bad at that" more than they sometimes do.


Hm. Did you read the comment by the Microsoft dev in the linked thread? He gives the following reasons:

"We've long since gotten all the low-hanging fruit and are left with what is essentially "death by a thousand cuts," with no single component responsible for our (lack of) performance, but with lots of different things contributing"

* Linux has a top-level directory entry cache that means that certain queries (most notably stat calls) can be serviced without calling into the file system at all once an item is in the cache. Windows has no such cache, and leaves much more up to the file systems... [snip]

* Windows's IO stack is extensible, allowing filter drivers to attach to volumes and intercept IO requests before the file system sees them. ... [snip] .. Even a clean install of Windows will have a number of filters present, particularly on the system volume (so if you have a D: drive or partition, I recommend using that instead, since it likely has fewer filters attached). Filters are involved in many IO operations, most notably creating/opening files.

* The NT file system API is designed around handles, not paths. Almost any operation requires opening the file first, which can be expensive. ... [snip]
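That last point is easy to feel even from a portable script. The sketch below times the same metadata query done by path (which on NT is effectively open + query + close each time) versus through a single descriptor held open; the iteration count and function name are my own, assumed for illustration.

```python
import os
import tempfile
import time

def time_metadata_queries(n=10000):
    """Time n metadata queries by path vs. through one already-open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.perf_counter()
        for _ in range(n):
            os.stat(path)   # by path: name resolved (and on NT, file opened) every call
        by_path = time.perf_counter() - t0

        t0 = time.perf_counter()
        for _ in range(n):
            os.fstat(fd)    # by handle: resolution/open cost paid once, up front
        by_handle = time.perf_counter() - t0
        return by_path, by_handle
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    p, h = time_metadata_queries()
    print(f"by path: {p * 1e3:.1f} ms, by handle: {h * 1e3:.1f} ms")
```

The gap between the two numbers tends to be much wider on Windows than on Linux, which is the dev's point about handles versus paths in miniature.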

"Whether we like it or not (and we don't), file operations in Windows are more expensive than in Linux, even more so for those operations that only touch file metadata (such as stat)."

I can say my personal experience under Windows has been that compiling the same project was twice as fast in a Linux VirtualBox VM running inside Windows as it was on the Windows host. :)



