Hacker News | nickelpro's comments

If we're going to be pedantic, mmap is a syscall. It happens that the C version is standardized by POSIX.

The underlying syscall doesn't use the C ABI, you need to wrap it to use it from C in the same way you need to wrap it to use it from any language, which is exactly what glibc and friends do.

The moral of the story is that mmap belongs to the platform, not the language.


it also appears in operating systems that aren't written in c. i see it as an operating system feature, categorically.

No, that's too far down the pedantry rabbit hole. "mmap()" is quite literally a C function in the 4.2BSD libc. It happens to wrap a system call of the same name, but to claim that they are different when they arrived in the same software and were written by the same author at the same time is straining the argument past the breaking point. You now have a "C Erasure Polemic" and not a clarifying comment.

If you take a kernel written in C and implement a VM system for it in C and expose a new API for it to be used by userspace processes written in C, it doesn't magically become "not C" just because there's a hardware trap in the middle somewhere.

mmap() is a C API. I mean, duh.


and if i directly do an mmap syscall on linux from a freestanding forth that doesn't go through libc for anything? sure, c unfortunately defines how i have to, say, pass a string, but that's effectively an arbitrary calling convention at that point; there's no c runtime on the calling side so it's not particularly useful to contend that what i'm using is a c api.

or perhaps mmap is incontrovertibly a c function on platforms where libc wrappers are the sole stable interface to the kernel but something else entirely on linux?


> and if i directly do an mmap syscall on linux from a freestanding forth

... mmap() remains a system call to a C kernel designed for use from the C library in C programs, and you're running what amounts to an emulator.

The fact that you can imagine[1] an environment where that might not be the case doesn't mean that it isn't the case in the real world.

Your argument appears to be one of Personal Liberty: de facto truths don't matter because you can just make your own. This is sort of a software variant of a Sovereign Citizen, I think.

[1] Can you even link a "freestanding forth" with an mmap() binding on any Unix that doesn't live above the libc implementation? I mean, absent everything else it would have to open code all the flag constants, whose values change between systems. This appears to be a completely fictitious runtime you've invented, which if anything sits as evidence in my favor and not yours.


?

i'm not so much imagining an environment per se¹ as describing one i've already written, so i'm not entirely sure where any of this is coming from. if you care to have some additional assurance this isn't somehow an elaborate rhetorical trap, a previous comment about forth tail call elimination with a bit of demonstrative assembly is presumably only a short scroll down my profile. ctrl-f for cmov if you want to find it quickly. as i recall, it came up for similar reasons then because people often make similar incorrect generalizations about lots of things that implicitly sit atop a c runtime in their minds. that said, you're the first one to call me a sovcit before asking any clarifying questions so at least there's some new pizzazz there.

i was clear that i was talking specifically about linux precisely because this isn't something one can do portably for exactly the reasons you're describing (which, yes, makes porting things built like this off of linux before the point you've built up enough to be able to go through libc annoying and ad hoc at the very least).

the fact remains that i can, right now, non-theoretically, on a well supported common unixlike os, and entirely unrelated to whatever weird crusade you seem to have invented to stand in for my side of this discussion, link a pile of assembly with -static -nolibc, fire up the repl, and mmap files into memory as i please with nary a bit of c on the userspace side.

as i originally said, i'm happy to consider linux a weird exception to the point you're making in a wider context since this isn't something you can do portably, but there still are entirely useful things one can do today with mmap that involve zero userspace c code on a widely supported platform.

edit: lol forgot to even get to this part. i'm also somewhat curious what you mean with this bit: "you're running what amounts to an emulator." perhaps i'm not firing on all cylinders today but i fail to see how it's useful to characterize performing bare syscalls from assembly (or something more high-level built out of assembly legos) as an emulator in any way, but i'm open to having missed some interesting nuance there.

¹ unless you mean trivially (seeing as this is code i imagined and then proceeded to write) in which case i suppose i agree


mmap is also relatively slow (compared to modern solutions like io_uring and friends), and immensely painful for error handling.

It's simple, I'll give it that.


Page faults are slower than being deliberate about your I/O, but mapped memory is no faster or slower than "normal" memory; it's the same mechanism.

Nah, usually can't have huge pages. Almost certainly can't have giant pages. Can't even fit all L3$ capacity into the L2 TLB if done via 4k pages...

I hadn't thought of that, but apparently Linux at least has had support for a while, according to the manpage? https://man7.org/linux/man-pages/man2/mmap.2.html

On BSD, read() was already implemented in the kernel by page-faulting in the desired pages of the file, to then be copied into the user-supplied buffer. So from the first time mmap was ever implemented, it was always the fastest input mechanism. (First deployed implementation was in SunOS btw, 4.2BSD specified and documented it but didn't implement it.) Anyway there's no magic to get data off a device into memory faster, io_uring just lets you hide the delay in some other thread's time.

mmap is slow because stalling on page faults is slow. Your process stalls and sits around doing nothing instead of processing data you've read already. You can google the benchmarks if you like. io_uring wasn't built just for kicks.

https://www.bitflux.ai/blog/memory-is-slow-part2/


> Why does C have the best file API

> Look inside

> Platform APIs

Ok.

I agree platform APIs are better than most generic language APIs at least. I disagree on mmap being the "best".


Dumb.

Compilers are non-deterministic in small ways: timestamps, paths encoded into debug information, etc. These are trivial: annoyances to reproducible-build people and little else.

You cannot take these trivial reproducibility issues and extrapolate out to "determinism doesn't matter, therefore LLMs are fine". You cannot throw a ball in the air, determine it is trivial to launch an object a few feet, and thus conclude a trip to the moon is similarly easy.

The magnitude matters, not merely the category. Handwaving away magnitude is a massive red flag that a speaker has no idea what they're talking about.


And the result of that magnitude is that the paradigm of operation is just completely different. Good programmers create inputs, check outputs, and build up a mental model of the system. When the input -> output mapping is not well defined, you can't use those same skills.


> a paid, invite-only social network where every person is verified human and there's no algorithm

This seems like an incredibly niche product that only a handful of people are interested in to begin with. It isn't a notable or surprising result that building it resulted in little interest from general audiences.


Kind of ironic - the moat is money...

At the same time, I see the appeal. I feel like 10% of the comments I read lately are "is this an AI response?" - would be nice to be free of that. Probably not possible tho.


It also strikes me as being in competition with, you know, a group chat.


i was just about to say, what they're describing is an imessage group chat with your friends.


Its original meaning was days since software release, without any security connotation attached. It came from the warez scene, where groups competed to crack software and make it available to the scene earlier and earlier. A week after general release, three days, same-day. The ultimate was 0-day software, software which was not yet available to the general public.

In a security context, it has come to mean days since a mitigation was released. Prior to disclosure or mitigation, all vulnerabilities are "0-day", which may be for weeks, months, or years.

It's not really an inflation of the term, just a shifting of context. "Days since software was released" -> "Days since a mitigation for a given vulnerability was released".


Wikipedia: "A zero-day (also known as a 0-day) is a vulnerability or security hole in a computer system unknown to its developers or anyone capable of mitigating it."

This seems logical, since by the etymology of zero-day it should apply to the release (= disclosure) of a vuln.


Properly manage PATH for the context you're in and this is a non-issue. This is the solution used by most programming environments these days, you don't carry around the entire npm or PyPI ecosystem all the time, only when you activate it.

Then again, I don't really believe in performing complex operations manually and directly from a shell, so I don't really understand the use-case for having many small utilities in PATH to begin with.


Macro hygiene, static initialization ordering, control over symbol export (no more detail namespaces), slightly higher ceiling for compile-time and optimization performance.

If these aren't compelling, there's no real reason.


We have lived with that for *decades*. For me this is not a daily problem. So yes, this is not compelling, unfortunately.


import std; is an order of magnitude faster than including the STL headers individually, if that's evidence enough for you. It's faster than #include <iostream> alone.

Chuanqi says "The data I have obtained from practice ranges from 25% to 45%, excluding the build time of third-party libraries, including the standard library."[1]

[1]: https://chuanqixu9.github.io/c++/2025/08/14/C++20-Modules.en...


Yeah, but now compare this to pre-compiled headers. Maybe we should be happy with getting a standard way to have pre-compiled std headers, but now my build has a "scanning" phase which takes up some time.


Modules are a lot like precompiled headers, but done properly and not as a hack.


The OP does this and measures ~1.2x improvement over PCH.


> even senior C++ developers are always going to be able to deduce the correct value category

Depends what "senior" means in this context. Someone with 20 years of domain experience in utility billing, who happened to be writing C++ for those 20 years? Probably not.

Someone who has been studying and teaching C++ for 20 years? Yes they are able to tell you the value category at a glance.

Language experience is not something you develop accidentally; you don't slip into it just because you're using the language. Such tacit experience quickly plateaus. If you make the language itself the object of study, you will quickly surpass "mere" practitioners.

This is true of most popular programming languages in my experience. I find very, very few Python programmers understand the language at an implementation level, can explain the iterator protocol or what `@coroutine` actually used to do, how `__slots__` works, etc.

C++ is not unique in this, although it is old and has had a lot more time to develop strange corners.

