Hacker News | jcgl's comments

In any scenario where you want to do traffic steering at a network level. Managing multiple network upstreams (e.g. for network failover or load balancing) is a common example that is served well by numerous off-the-shelf routers with IPv4. That's an important feature that IPv6 cannot offer without using NPTv6 or NAT66.

It's conceivable that OSes could support some sort of traffic-steering mechanism where the network distributes policy dynamically. But that also sounds fragile, and you (i.e. the network operator) still have to cope with the long tail of devices that will never support such a mechanism.


> Managing multiple network upstreams (e.g. for network failover or load balancing) is a common example ... that IPv6 cannot offer without using NPTv6 or NAT66.

I don't think that's true. I haven't had reason to do edge router failover, but I am familiar with the concepts and also with anycast/multihoming... so do make sure to cross-check what I'm saying here with known-good information.

My claim is that the scenario you describe is handled better in the non-NATted IPv6 world than in the NATted IPv4 world. First, consider the scenario in the IPv4-only world. Assume the typical setup: one global IP shared among a number of LAN hosts via IPv4 NAT. When one uplink dies, the following happens:

* You fail over to your backup link

* This changes your external IP address

* Because you're doing NAT, and DHCP generally has no way to talk back to hosts after the initial negotiation, you have no way to alert hosts of the change in external IP address

* Depending on your NAT box configuration, existing client connections either die a slow and painful death, or (ideally) get abruptly reset and the hosts reestablish them

Now consider the situation with IPv6. When one uplink dies:

* You fail over to your backup link

* This changes your external prefix

* Your router announces the change by advertising the new prefix, and also advertising the now-dead prefix with a valid lifetime of 0 seconds [0]

* Hosts react to the change by reconfiguring via SLAAC and/or DHCPv6, depending on the settings in the RA

* Existing client connections are still dead, [1] but the host gets to know that their global IP address has changed and has a chance to take action, rather than being entirely unaware

Assuming that I haven't screwed up any of the details, I think that's what happens. Of course, if you have provider-independent addresses [2] assigned to your site, then maybe none of that matters and you "just" fail over without much trouble?

[0] I think this is known as "deprecating" the prefix

[1] I think whether they die slow or fast depends on how the router is configured

[2] ...whether IPv4 or IPv6...
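For concreteness, footnote [0]'s deprecation is carried in an NDP Prefix Information option with zeroed lifetimes. A rough sketch of the encoding, based on my reading of RFC 4861 (the prefix is a documentation address, and I'm omitting the enclosing RA header):

```python
import ipaddress
import struct

def prefix_info_option(prefix: str, prefix_len: int,
                       valid: int, preferred: int) -> bytes:
    """Build an NDP Prefix Information option (RFC 4861, type 3).

    Advertising preferred lifetime 0 deprecates the prefix;
    valid lifetime 0 withdraws it outright.
    """
    flags = 0xC0  # L (on-link) + A (autonomous/SLAAC) bits
    return struct.pack(
        "!BBBBIII16s",
        3,            # option type: Prefix Information
        4,            # length in units of 8 octets (4 * 8 = 32 bytes)
        prefix_len,
        flags,
        valid,        # valid lifetime (seconds)
        preferred,    # preferred lifetime (seconds)
        0,            # reserved
        ipaddress.IPv6Network(f"{prefix}/{prefix_len}").network_address.packed,
    )

# The dying prefix, advertised with both lifetimes zeroed:
withdraw = prefix_info_option("2001:db8:1::", 64, 0, 0)
```

(One wrinkle I'm aware of: RFC 4862 says hosts won't trust an unauthenticated valid-lifetime reduction below two hours, so in practice it's the preferred lifetime doing the work.)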


> * Hosts react to the change by reconfiguring via SLAAC and/or DHCPv6, depending on the settings in the RA

This is the linchpin of the workflow you've outlined. Anecdotal experience in this area suggests it's not broadly effective enough in practice, not least because of this:

> * Existing client connections are still dead, [1] but the host gets to know that their global IP address has changed and has a chance to take action, rather than being entirely unaware

The old IP addresses (afaiu/ime) will not be removed before any dependent connections are removed. In other words, the application, not just the host/OS, is in the driver's seat. Imo, this is one of the core problems with the scenario: the OS APIs for this stuff just aren't descriptive enough to describe the network reconfiguration event. Because of that, things will ~always be leaky.

> [1] I think whether they die slow or fast depends on how the router is configured

Yeah, and that configuration will presumably be sensitive to what caused the failover. This could manifest differently based on whether upstream A simply has some bad packet loss or whether it went down altogether (e.g. a physical fault).

In any case, this vision of the world misses on at least two things, in my view:

1. Administrative load balancing (e.g. lightly utilizing upstream B even when upstream A is still up)

2. The long tail of devices that don't respond well to the flow you outlined. It's not enough to think of well-behaved servers that one has total control over; you also need to think of random devices with network stacks of...varying quality (e.g. IoT devices)


> The old IP addresses (afaiu/ime) will not be removed before any dependent connections are removed.

I have two reactions to this.

1) Duh? I'm discussing a failover situation where your router has unexpectedly lost its connection to the outside world. You'd hope that your existing connections would fail quickly. The existence of the deprecated IP shouldn't be relevant, because the OS isn't supposed to use it for any new connections.

2) If you're suggesting that network-management infrastructure running on the host will be unable to delete a deprecated address from an interface because existing connections haven't closed, that doesn't match my experience at all. I don't think you're suggesting this, but I'm bringing it up to be thorough.

> ...the OS APIs for this stuff just aren't descriptive enough to describe the network reconfiguration event.

I know that Linux has a system (netlink?) that's descriptive enough for daemons [0] to start and stop listening on newly added/removed addresses nearly instantaneously. I'd be a little surprised if you couldn't use that mechanism to subscribe to "an address has become deprecated" events. I'd also be somewhat surprised if no one had built a nice little library on top of whatever mechanism that is. IDK about other OSes, but I'd be surprised if there weren't equivalents in the BSDs, macOS, and Windows.

> In any case, this vision of the world misses on at least two things, in my view:

> 1. Administrative load balancing...

I deliberately didn't talk about load balancing. I expect that if you don't do that at a layer below IP, then you're either stuck with something obscenely complicated or you're doing something like using special IP stacks on both ends... regardless of what version of IP your clients are using.

> 2. The long tail of devices that don't respond well to the flow you outlined.

Do they respond worse than in the IPv4 NAT world? This and other commentary throughout indicates that you missed the point I was making. That point was that -unlike in the NATted world- the OS and the applications running in it have a way to plausibly be informed of the network addressing change. In the NAT case, they can only infer that shit went bad.

[0] ...like BIND and NTPd...


> 1) Duh? I'm discussing a failover situation where your router has unexpectedly lost its connection to the outside world. You'd hope that your existing connections would fail quickly. The existence of the deprecated IP shouldn't be relevant, because the OS isn't supposed to use it for any new connections.

Well, failover is an administrative decision that can result from unexpectedly losing a connection. But it can also result from more ambiguous packet loss, something that wouldn't necessarily manifest as broken connections, just degraded ones.

If upstream A is still passing traffic that simply gets lost further down the line, then there's no particular guarantee that the connection will fail quickly. If upstream A deliberately starts rejecting TCP traffic with RST, then sure, that'll be fine. But UDP and other traffic, no such luck. Whereas QUIC would fare just fine with NAT thanks to its roaming capabilities.

> I know that Linux has a system (netlink?) that's descriptive enough for daemons [0] to start and stop listening on newly added/removed addresses nearly instantaneously. I'd be a little surprised if you couldn't use that mechanism to subscribe to "an address has become deprecated" events. I'd also be somewhat surprised if no one had built a nice little library on top of whatever mechanism that is. IDK about other OSes, but I'd be surprised if there weren't equivalents in the BSDs, macOS, and Windows.

Idk, I'll have to take your word for it. Instinctively though, this feels like a situation where the lowest common denominator wins. In other words, average applications aren't going to do any legwork here. The best thing to hope for is for language standard libraries to make this as built-in as possible. But if that exists, I'm extremely unaware of it.

> I deliberately didn't talk about load balancing. I expect that if you don't do that at a layer below IP, then you're either stuck with something obscenely complicated or you're doing something like using special IP stacks on both ends... regardless of what version of IP your clients are using.

I presume you meant a layer above IP? But no, I don't see why this is challenging in a NAT world. At least, I've worked with routers that support this, and it always seemed to Just Work™. I'd naively assume that the router is just modding the hash of the layer 3 addresses or something though.
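Something like this, presumably; a toy sketch of weighted 5-tuple hashing (the uplink names and weights are made up):

```python
import zlib

UPLINKS = ["wan_a", "wan_b"]   # hypothetical upstreams
WEIGHTS = [3, 1]               # e.g. prefer A over B roughly 3:1

def pick_uplink(src: str, dst: str, proto: int, sport: int, dport: int) -> str:
    """Hash the 5-tuple and pick an uplink by weight.

    Same flow -> same hash -> same uplink, so an established connection
    keeps a stable external address as long as its uplink stays up.
    """
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    h = zlib.crc32(key) % sum(WEIGHTS)
    for uplink, weight in zip(UPLINKS, WEIGHTS):
        if h < weight:
            return uplink
        h -= weight
    return UPLINKS[-1]
```

The point being: nothing about this scheme cares whether the addresses being hashed are v4 or v6.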

> Do they respond worse than in the IPv4 NAT world?

I've basically only ever had good experiences in the IPv4 NAT world.

> That point was that -unlike in the NATted world- the OS and the applications running in it have a way to plausibly be informed of the network addressing change. In the NAT case, they can only infer that shit went bad.

I'm certainly sympathetic to this point. And, all things being equal, of course that seems better! If NAT66 were to not offer sufficient practical benefits, then I'd be convinced.

But please bear in mind that this was the original comment I responded to (not yours). Responding to this is where I'm coming from:

> Why would IPv6 ever need NAT?



And if that applies across the board, in 20 years' time that might have filtered through to mean IPv4 can be dropped in my company.

I'd rather see this at a lower level than network manager and bodging in with bpf, so it's just a default part of any device running a linux network stack, but I don't know enough about kernel development and choices to know how possible that is in practice.

This should have been supported in the kernel 25 years ago though if the goal was to help ipv6 migration


I agree. Someone was working on that, though the work seems quite stale now: https://codeberg.org/IPv6-Monostack/ipxlat-net-next

I had no idea that you could run Lineage on the Jelly Star! That sounds phenomenal. My dream phone is a Star running Graphene. But short of that, Lineage would be great.

Any notes on your experience?


Not an author, but I've been using Jelly Star with a stock Android for 2 years.

Actually, typing this comment right now with this phone.

1. Keyboard: MessagEase or ThumbKey + Jelly Star is a perfect match.

2. Bitwarden passkeys + Firefox doesn't work. From what I've researched, it's the same with LineageOS. Didn't check Chrome, though.

3. All apps work without issues. Banking, Google Wallet, taxi, etc. It's a regular Android.

4. Battery isn't great, but it charges fast and lasts through most of the day.

5. It's perfect for running or other outdoor activities.

6. 4G only. I sometimes use it as an external modem for the laptop, and I'd definitely appreciate 5G.

7. Android 13 and no updates :/

All in all, I'm happy, but if I could have foreseen this in advance, I'd have gone with the Jelly Max instead: fresher Android version, working Bitwarden + Firefox passkeys, and 5G support.

Unfortunately, the Jelly Max is a bit bigger than the Jelly Star, but still much smaller than other regular smartphones.


To my understanding, you need Google services for passkeys to work, at least for now. I wouldn't want them even sandboxed, and on Lineage they're not restricted in any way so this would be a big concession.

Did not know about Lineage either, now I'm interested too.

For the last 2 years I've been using a similar device from Unihertz's competitor, Cubot. Namely the King Kong Mini 3. No issues, very solid. Given how tiny it is, it gets lots of attention and marks me out as an eccentric (no objections). But stock Android, of course.


> I don't care about glibc or compatibility with /etc/nsswitch.conf.

So what do you do when you need to resolve system users? I sure hope you don't parse /etc/passwd, since plenty of users (me included) use other user databases (e.g. sssd or systemd-userdbd).


Most software doesn't need to resolve users. You can also always shell out to `id` if you need an occasional bit of metadata.

That's a fair point, and shelling out to id is probably a good solution.

I guess what bothers me is the software authors who don't think this through, leaving applications non-functional in these situations.

At least with Go, if you build with CGO_ENABLED=0 and use the stdlib functions to resolve user information, you end up parsing /etc/passwd instead of shelling out to id. The Go stdlib should maybe shell out to id instead, but it doesn't. And it's understandable that software developers use the stdlib functions without thinking all too much about it. But in the end, simply advocating for CGO_ENABLED=0 results in software that is broken around the edges.
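For what it's worth, the shelling-out approach is only a few lines (Python here for brevity; a Go version would look analogous):

```python
import subprocess

def uid_for(username: str) -> int:
    """Resolve a username to a UID by shelling out to id(1).

    Unlike parsing /etc/passwd directly, id(1) goes through the full
    NSS stack, so users from sssd, LDAP, systemd-userdbd, etc. resolve too.
    Raises CalledProcessError for unknown users.
    """
    out = subprocess.run(["id", "-u", username],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())
```

The obvious costs are a fork/exec per lookup and a dependency on `id` being on PATH, which is why stdlibs are reluctant to do this for you.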


On the other hand, the NSS modules are broken beyond fixing. So promoting ecosystems that don't use them might finally spur the development of alternatives.

Could be interesting. What do you see as the main problems with NSS? I've never needed to use it directly myself. It seems quite crusty, of course, but presumably there's more that you're referencing.

Moving from linking stuff in-process to IPC (such as systemd-userdbd is promoting) _seems_ to me like a natural thing to do, given the nastiness that can happen when you bring something complex into your own address space (via C semantics nonetheless). But I'm not very knowledgeable here and would be interested to hear your overall take.


NSS/PAM modules have to work inside arbitrary environments. And vice versa, environments have to be ready for arbitrary NSS modules.

For example, you technically can't sandbox any app with NSS/PAM modules, because a module might want to send an email (yes, I saw that in real life) or use a USB device.

NSS/PAM need to be replaced with IPC-based solutions. systemd is evolving a replacement for PAM.

And for NSS modules in particular, we even have a standard solution: NSCD. It's even supported by musl libc, but for some reason nobody even _knows_ that it exists. Porting the NSCD protocol to Go is like 20 minutes of work. I looked at doing that more than once, but got discouraged by the other 99% of complexity in getting something like this into the core Go code.
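For the curious, the request framing really is tiny. A rough sketch of the request side, based on my reading of glibc's nscd/nscd-client.h (details unverified; the socket path and constants are glibc's):

```python
import socket
import struct

NSCD_SOCKET = "/var/run/nscd/socket"
NSCD_VERSION = 2
GETPWBYNAME = 0   # request types; GETPWBYUID = 1, GETGRBYNAME = 2, ...

def getpwbyname_request(name: str) -> bytes:
    """Encode an NSCD GETPWBYNAME request.

    The wire format is three native-endian int32s -- protocol version,
    request type, key length -- followed by the NUL-terminated key
    (key length includes the NUL).
    """
    key = name.encode() + b"\0"
    return struct.pack("=iii", NSCD_VERSION, GETPWBYNAME, len(key)) + key

def query(name: str) -> bytes:
    """Send the request to a running nscd; returns the raw response."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(NSCD_SOCKET)
        s.sendall(getpwbyname_request(name))
        return s.recv(4096)
```

Decoding the pw_response_header on the way back is where the remaining fiddliness lives, but it's the same flavor of fixed-layout structs.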


> 3. use clevis to enable automatic unlocking of the root fs only when secure boot check passes;

Can also use systemd-cryptsetup/systemd-cryptenroll for this. I've not used clevis myself, but I'd imagine it requires somewhat more rolling-your-own compared to the systemd tools.

> The unified kernel image doesn't accept additional kernel parameters, so only parameters that are set during generation of the initram are used. The secure boot makes sure no one else has tampered with the boot chain. And TPM stores the disk key securely.

FYI, multi-profile UKIs are a thing. You can have one UKI with multiple different command lines, e.g. one for regular boot, one for emergency mode, etc.

https://uapi-group.org/specifications/specs/unified_kernel_i...


> I think agents will have much better luck with TUIs than browsers.

I’m very skeptical. Why would you think that? TUIs inherently don’t provide programmatically accessible affordances; if they have any affordances at all, they’re purely visual cues that are unstandardized and of varying quality.

Compare that to the DOM in a browser, where you’ve got numerous well-understood mechanisms to convey meaning and usability: semantic HTML and ARIA roles. These systematically simplify programmatic consumption.
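For example, a few lines of stdlib Python can pull the actionable elements and their ARIA roles straight out of markup; there's no TUI equivalent of this:

```python
from html.parser import HTMLParser

# Tags that carry machine-readable affordances by themselves.
INTERACTIVE = {"a", "button", "input", "select", "textarea", "form"}

class AffordanceScraper(HTMLParser):
    """Collect the elements an agent could act on, plus their ARIA metadata."""
    def __init__(self):
        super().__init__()
        self.affordances = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        role = attrs.get("role")
        if tag in INTERACTIVE or role:
            self.affordances.append({
                "tag": tag,
                "role": role,
                "label": attrs.get("aria-label") or attrs.get("name"),
            })

scraper = AffordanceScraper()
scraper.feed('<nav role="navigation"><a href="/">Home</a></nav>'
             '<button aria-label="Submit order">Buy</button>')
```

With a TUI you'd be OCRing box-drawing characters and guessing which colored text is a button.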


Capabilities are craaaazy coarse on Linux. Really only a small piece of the sandboxing puzzle. Flatpak, Bubblewrap, and Firejail each provide an overall fuller view of what sandboxing can be.

Well, 5% of a massive addressable market is itself quite a lot.

Sure, that’s an interface that’s better for many users and use-cases.

However, it seems better if you could, as much as is possible, move the AI stuff from runtime to “compile time.”

Instead of having the AI do everything all the time, have AI configure your Zapier (or whatever) on your behalf. That way you can (ideally) get the best of both worlds: the reliability and predictability of classical software, combined with the fuzzy interface of LLMs.


> Instead of having the AI do everything all the time, have AI configure your Zapier (or whatever) on your behalf.

That is what many use OpenClaw for! The AI assistant will happily recommend existing services and help you (or itself, if you let it), set it up.

(In theory. In practice, it often does a poor job).

The appeal of OpenClaw is I don't need to go research all these possible solutions for different problems. I just tell it my problem and it figures it out.

Yesterday I told it to monitor a page which lists classes offered, and have it ping me if any class with a begin date in March/April is listed. This is easily scriptable by me, but I don't want to spend time writing that script. And modifying it for each site I want to be notified for. I merely spoke (voice, not text) to the agent and it will check each day.

(Again, it's not that reliable. I'm under no illusion it will inform me - but this is the appeal).


But literally any decent agent can recommend existing services and help you set them up. And even help you help them set the services up for you. I do this with Claude all the time.

That's still too much work. Someone would have to make something like an OpenClaw wizard that proactively offers to set all that stuff up. So the potential OpenClaw user can then, on running it for the first time, be guided through the setup of whatever they'd like to get connected. And "setup" here means a short description of X and a "Connect? (y/n)" prompt. Anything more and you start losing people.

Yes. In a similar vein, we're seeing this get standardized in coding agents as "don't have the agent use tools directly, have the agent write code to call the tools."

> So, can we add two bools together? Adding booleans together is nonsense, but we've said these are a kind of integer so sure, I guess True + True = 2 ? And this cascades into nonsense like ~True being a valid operation in Python and its result is true...

The bitwise negation is indeed janky and inaccurate, but True + True = 2 is absolutely a valid thing to say in boolean algebra. Addition means "or," and multiplication means "and."


> True + True = 2 is absolutely a valid thing to say in boolean algebra

Nope. Boolean algebra has only two values, and it lacks the addition operation entirely.


I always remember learning that 2 was a legit enough way to represent the result of 1 + 1, but the internet mostly seems to agree with you. Though I contend that 1 + 1 = 2 is unambiguous, so it's fine.

But multiplication and addition do work just fine for boolean arithmetic: https://en.wikipedia.org/wiki/Two-element_Boolean_algebra


Huh, I learn something new, I was not aware of "Two element Boolean algebra" nor just how deep this particular rabbit hole goes.

It's fine that 1 + 1 = 2. That's just integer arithmetic. The problem is that the booleans are not "just integers" and so Python's choice to implement them as "just integers" while convenient for them has consequences that are... undesirable.
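Those consequences are easy to demonstrate directly:

```python
# bool is a subclass of int, so booleans silently participate in
# integer arithmetic -- sometimes usefully, sometimes not.
assert isinstance(True, int)
assert True + True == 2               # integer addition, not logical OR
assert sum([True, False, True]) == 2  # handy for counting matches
assert ~True == -2                    # bitwise NOT of the int 1, not logical negation
assert True | False is True           # | and & on two bools do return bools
```

The last line is the one saving grace: the bitwise operators are overridden to stay within bool when both operands are bool.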

