
Your MS-01 routes line-rate 25Gbps in software with VyOS w/o kernel bypass? That's very surprising to me. At what packet sizes?

Sorry for the OT response, I was curious about this comment[0] you made a while back. How did you measure memory transfer speed?

[0] https://news.ycombinator.com/item?id=38820893


I used «powermetrics» bundled with macOS with «bandwidth» as one of the samplers (--samplers / -s set to «cpu_power,gpu_power,thermal,bandwidth»).

Unfortunately, Apple has taken out the «bandwidth» sampler from «powermetrics», and it is no longer possible to measure the memory bandwidth as easily.


> "surely if I send request to 5 nodes some of that will land on disk in reasonably near future?"

That would be asynchronous replication. But IIUC the author is instead advocating for a distributed log with synchronous quorum writes.


But we know this is not actually robust, because storage and power failures tend to be correlated. The most recent Jepsen analysis again highlights why that line of thinking is flawed: https://jepsen.io/analyses/nats-2.12.1


The Aurora paper [0] goes into detail on correlated failures.

> In Aurora, we have chosen a design point of tolerating (a) losing an entire AZ and one additional node (AZ+1) without losing data, and (b) losing an entire AZ without impacting the ability to write data. [..] With such a model, we can (a) lose a single AZ and one additional node (a failure of 3 nodes) without losing read availability, and (b) lose any two nodes, including a single AZ failure and maintain write availability.
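
In the paper's numbers: each 10 GB segment is replicated six ways, two copies in each of three AZs, with a write quorum of 4/6 and a read quorum of 3/6, which is where the "AZ+1" tolerance above comes from.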

As for why this can be considered durable enough, section 2.2 gives an argument based on their MTTR (mean time to repair) of storage segments:

> We would need to see two such failures in the same 10 second window plus a failure of an AZ not containing either of these two independent failures to lose quorum. At our observed failure rates, that’s sufficiently unlikely, even for the number of databases we manage for our customers.

[0] https://pages.cs.wisc.edu/~yxy/cs764-f20/papers/aurora-sigmo...


I believe testing over paper claims


Is there more detail on the design of the distributed multi-AZ journal? That feels like the meat of the architecture.


As long as your target language has a strict define-before-use rule and no advanced inference is required, you will know the types of expressions and can perform type-based optimizations. You can also do constant folding and (very rudimentary) inlining. But the best optimizations are done on IRs, which you don't have access to in an old-school single-pass design. LICM, CSE, GVN, DCE, and all the countless loop opts are not available to you. You'll also spill to memory a lot, because you can't run a decent regalloc in a single pass.

I'm actually a big fan of function-by-function dual-pass compilation. You generate IR from the parser in one pass, and do codegen right after. Most intermediate state is thrown out (including the AST, for non-polymorphic functions) and you move on to the next function. This gives you an extremely fast data-oriented baseline compiler with reasonable codegen (much better than something like tcc).


I would argue that stateful services (databases, message queues, CDNs) all perfectly fit the unikernel model. The question is whether the additional engineering effort and system design is worth the performance gain.


Interesting. Are there any research papers on potential performance gains?


Another one is "jalr x0, imm(x0)", which turns an indirect branch into a direct jump to address "imm" in a single instruction w/o clobbering a register. Pretty neat.


> every unwrap in production code needs an INFALLIBILITY comment. clippy::unwrap_used can enforce this.

How about indexing into a slice/map/vec? Should every `foo[i]` have an infallibility comment? Because they're essentially `get(i).unwrap()`.


Yes? Funnily enough, I don't often use indexed access in Rust. Either I'm looping over elements of a data structure (in which case I use iterators), or I'm using an untrusted index value (in which case I explicitly handle the error case). In the rare case where I'm using an index value that I can guarantee is never invalid (e.g. graph traversal where the indices are never exposed outside the scope of the traversal), then I create a safe wrapper around the unsafe access and document the invariant.
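
A rough illustration of that graph-traversal case (hypothetical code, not the commenter's): the indices are produced from the graph itself and never leave the function.

    fn visit_reachable(adj: &[Vec<usize>], start: usize, mut visit: impl FnMut(usize)) {
        assert!(start < adj.len(), "start node must exist");
        let mut seen = vec![false; adj.len()];
        let mut stack = vec![start];
        while let Some(n) = stack.pop() {
            // INFALLIBILITY: `n` is either `start` (asserted above) or was read
            // out of `adj`, which by this module's invariant holds only valid ids.
            if seen[n] {
                continue;
            }
            seen[n] = true;
            visit(n);
            stack.extend_from_slice(&adj[n]);
        }
    }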


If that's the case then hats off. What you're describing is definitely not what I've seen in practice. In fact, I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access. Even security-critical cryptography crates that passed audits don't do that. Personally, I found it quite hard to avoid indexing for graph-heavy code, so I'm always on the lookout for interesting ways to enforce access safety. If you have some code to share that would be very interesting.


My rule of thumb is that unchecked access is okay in scenarios where both the array/map and the indices/keys are private implementation details of a function or struct, since an invariant is easy to manually verify when it is tightly scoped as such. I've seen it used in the scenarios below (the binary-search case is sketched after the list):

* Graph/tree traversal functions that take a visitor function as a parameter

* Binary search on sorted arrays

* Binary heap operations

* Probing buckets in open-addressed hash tables
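
For the binary-search case, a hand-rolled sketch of what that looks like (not taken from any particular crate):

    /// Position of `target` in a sorted slice, if present.
    fn binary_search(xs: &[u64], target: u64) -> Option<usize> {
        let (mut lo, mut hi) = (0, xs.len());
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            // Invariant: lo <= mid < hi <= xs.len(), so `mid` is in bounds.
            let x = unsafe { *xs.get_unchecked(mid) };
            if x < target {
                lo = mid + 1;
            } else if x > target {
                hi = mid;
            } else {
                return Some(mid);
            }
        }
        None
    }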


> I don't think I've ever seen a crate or production codebase that documents infallibility of every single slice access.

The smoltcp crate typically uses runtime checks to ensure slice accesses made by the library do not cause a panic. It's not exactly equivalent to GP's assertion, since it doesn't cover "every single slice access", but it at least covers slice accesses triggered by the library's public API. (i.e. none of the public API functions should cause a panic, assuming that the runtime validation after the most recent mutation succeeds).

Example: https://docs.rs/smoltcp/latest/src/smoltcp/wire/ipv4.rs.html...


I think this goes against Rust's goals in terms of performance. Good for safe code, of course, but usually Rust users prefer compile-time safety that makes runtime safety checks unnecessary.


> graph-heavy code

Could you share some more details, maybe one fully concrete scenario? There are lots of techniques, but there's no one-size-fits-all solution.


Sure, these days I'm mostly working on a few compilers. Let's say I want to make a fixed-size SSA IR. Each instruction has an opcode and two operands (which are essentially pointers to other instructions). The IR is populated in one phase, and then lowered in the next. During lowering I run a few peephole and code motion optimizations on the IR, and then do regalloc + asm codegen. During that pass the IR is mutated and indices are invalidated/updated. The important thing is that this phase is extremely performance-critical.
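
For concreteness, a minimal sketch of what such an IR can look like (names are hypothetical, not from my actual codebase):

    /// One SSA instruction; operands refer to other instructions by index.
    #[derive(Clone, Copy)]
    struct Inst {
        op: Opcode,
        args: [InstId; 2],
    }

    #[derive(Clone, Copy, PartialEq, Eq)]
    struct InstId(u32);

    #[derive(Clone, Copy)]
    enum Opcode { Const, Add, Mul, Load, Store, Branch }

    /// The IR for one function: a flat, cache-friendly array of instructions.
    struct Func {
        insts: Vec<Inst>,
    }

    impl Func {
        fn inst(&self, id: InstId) -> Inst {
            // Hot path during lowering; every such access is a `foo[i]`
            // in the sense discussed above.
            self.insts[id.0 as usize]
        }
    }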


And it's fine for a compiler to panic when it violates an assumption. Not so with the Cloudflare code under discussion.


Idiomatic Rust would have been to return a Result<> to the caller, not to surprise them with a panic.

The developer was lazy.

A lot of Rust developers are: https://github.com/search?q=unwrap%28%29+language%3ARust&typ...


One normal "trick" is phantom typing. You create a type representing indices and have a small, well-audited portion of unsafe code handling creation/unpacking, where the rest of the code is completely safe.

The details depend a lot on what you're doing and how you're doing it. Does the graph grow? Shrink? Do you have more than one? Do you care about programmer error types other than panic/UB?

Suppose, e.g., that your graph doesn't change sizes, you only have one, and you only care about panics/UB. Then you can get away with the following (sketched in code right after the steps):

1. A dedicated index type, unique to that graph (shadow / strong-typedef / wrap / whatever), corresponding to whichever index type you're natively using to index nodes.

2. Some mechanism for generating such indices. E.g., during the graph population phase you have a method which returns the next custom index, or None if none exist. You generated the IR with those custom indexes, so you know (assuming that one critical function is correct) that they're able to appropriately index anywhere in your graph.

3. You have some unsafe code somewhere which blindly trusts those indices when you start actually indexing into your array(s) of node information. However, since the very existence of such an index is proof that you're allowed to access the data, that access is safe.
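
A minimal sketch of steps 1-3, assuming a grow-only graph (all names hypothetical):

    struct NodeId(u32);               // step 1: dedicated index type

    struct Graph<T> { nodes: Vec<T> }

    impl<T> Graph<T> {
        // Step 2: the only place NodeIds are ever minted.
        fn push(&mut self, node: T) -> NodeId {
            let id = NodeId(self.nodes.len() as u32);
            self.nodes.push(node);
            id
        }

        // Step 3: blindly trusts the index. Sound only because a NodeId can
        // only come from `push`, nodes are never removed, and there is only
        // one graph for these ids to refer to.
        fn get(&self, id: NodeId) -> &T {
            unsafe { self.nodes.get_unchecked(id.0 as usize) }
        }
    }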

Techniques vary from language to language and depending on your exact goals. GhostCell [0] in Rust is one way of relegating literally all of the unsafe code to a well-vetted library, and it uses tagged types (via lifetimes), so you can also do away with the "only one graph" limitation. It's been a while since I've looked at it, but resizes might also be safe pretty trivially (or might not be).

The general principle though is to structure your problem in such a way that a very small amount of code (so that you can more easily prove it correct) can provide promises that are enforceable purely via the type system (so that if the critical code is correct then so is everything else).

That's trivial by itself (e.g., just rely on option-returning .get operators), so the rest of the trick is to find a cheap place in your code which can provide stronger guarantees. For many problems, initialization is the perfect place: you can bounds-check on init and then not worry about it again. If even bounds-checking on initialization is too slow, you can still use that opportunity to write out a proof of why some invariant holds and then blindly/unsafely assert it to be true, but you then immediately pack that hard-won information into a dedicated type so that the only place you ever have to think about it is on initialization.

[0] https://plv.mpi-sws.org/rustbelt/ghostcell/


I do use a combination of newtyped indices + singleton arenas for data structures that only grow (like the AST). But for the IR, being able to remove nodes from the graph is very important. So phantom typing wouldn't work in that case.


Usually you'd want to write almost all your slice or other container iterations with iterators, in a functional style.

For the 5% of cases that are too complex for standard iterators? I never bother justifying why my indexes are correct, but I don't see why not.

You very rarely need SAFETY comments in Rust because almost all the code you write is safe in the first place. The language also gives you the tools to avoid manual iteration (not just for safety, but because it lets the compiler eliminate bounds checks), so it would actually be quite viable to write these comments, since you only need them when you're doing something unusual.
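
A small illustration of the difference (hedged example, not from any real codebase):

    // Indexed version: each `xs[i]` is a bounds check you have to justify.
    fn sum_pairs_indexed(xs: &[i64]) -> i64 {
        let mut total = 0;
        for i in 0..xs.len().saturating_sub(1) {
            total += xs[i] + xs[i + 1];
        }
        total
    }

    // Iterator version: no indices to justify, and the compiler gets the
    // bounds information for free.
    fn sum_pairs_iter(xs: &[i64]) -> i64 {
        xs.iter().zip(xs.iter().skip(1)).map(|(a, b)| a + b).sum()
    }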


I didn't restate the context from the code we're discussing: it must not panic. If you don't care if the code panics, then go ahead and unwrap/expect/index, because that conforms to your chosen error handling scheme. This is fine for lots of things like CLI tools or isolated subprocesses, and makes review a lot easier.

So: first, identify code that cannot be allowed to panic. Within that code, yes, in the rare case that you use [i], you need to at least try to justify why you think it'll be in bounds. But it would be better not to.

There are a couple of attempts at getting the compiler to prove that code can't panic (e.g., the no-panic crate).


What about memory allocation - how will you stop that from panicking? `Vec::resize` can always panic in Rust (there is no fallible variant), and this is just one example out of thousands in the Rust stdlib.

Unless the language addresses no-panic in its governing design or allows try-catch, I'm not sure how you go about this.


That is slowly being addressed, but meanwhile it’s likely you have a reliable upper bound on how much heap your service needs, so it’s a much smaller worry. There are also techniques like up-front or static allocation if you want to be more certain.
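
For the cases where you do want the failure surfaced instead of a panic, std already has fallible entry points; a minimal sketch:

    use std::collections::TryReserveError;

    // Reserve up front and report allocation failure as a Result.
    fn preallocate(n: usize) -> Result<Vec<u8>, TryReserveError> {
        let mut buf = Vec::new();
        buf.try_reserve(n)?;   // Err instead of a panic/abort on failure
        buf.resize(n, 0);      // cannot reallocate: capacity is already there
        Ok(buf)
    }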


Yep and this postmortem details how their proxy modules use static allocation.


I'm far more worried about some dependency calling unwrap() or expect() now.

https://github.com/search?q=unwrap%28%29+language%3ARust&typ...

This is ridiculous. We're probably going to start seeing more of these. This was just the first big, highly visible instance.

We should have a name for this similar to "my code just NPE'd". I suggest "unwrapped", as in, "My Rust app just unwrapped a present."

I think we should start advocating for the deprecation and eventual removal of the unwrap/expect family of methods. There's no reason engineers shouldn't be handling Options and Results gracefully, either passing the state to the caller or turning to a success or fail path. Not doing this is just laziness.


In TFA they mentioned they preallocate all the memory up front


Indexing is comparatively rare given the existence of iterators, IMO. If your goal is to avoid any potential for panicking, I think you'd have a harder time with arithmetic overflow.
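
For the overflow case, the explicit methods are the usual way out (tiny illustration):

    // `a + b` panics on overflow in debug builds and wraps in release;
    // the checked/saturating forms make the intent explicit in both.
    fn add_checked(a: u32, b: u32) -> Option<u32> {
        a.checked_add(b)          // None on overflow
    }

    fn add_saturating(a: u32, b: u32) -> u32 {
        a.saturating_add(b)       // clamps at u32::MAX
    }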


Cargo needs to grow a label for crates that provably do not panic. (Never mind allocations and things outside our control flow.)

I want to ban crates that panic from my dependency chain.

The language could really use an extra set of static guarantees around this. I would opt in.


> I want to ban crates that panic from my dependency chain.

Which means banning anything that allocates memory and thousands of stdlib functions/methods.


See the immediately preceding sentence.

I'm fine with allocation failures. I don't want stupid unwrap()s, improper slice access, or other stupid and totally preventable behavior.

There are things inside the engineer's control. I want that to not panic.


Your pair of posts is very interesting to me. Can you share with me: What is your programming environment such that you are "fine with allocation failures"? I'm not doubting you, but for me, if I am doing systems programming with C or C++, my program is doomed if a malloc fails! When I saw your post, I immediately thought: Am I doing it wrong? If I get a NULL back from malloc(), I just terminate with an error message.


Not GP but I read "I'm fine with allocation failures" as "I'm OK with my program terminating if it can't allocate (but not for other errors)".


I mean, yeah, if I am using a library, as a user of this library I would like to be able to handle the error myself. Having the library decide to panic, for example, is the opposite of that.


If I can't allocate memory, I'm typically okay with the program terminating.

I don't want dependencies deciding to unwrap() or expect() some bullshit and that causing my entire program to crash because I didn't anticipate or handle the panic.

Code should be written, to the largest extent possible, to mitigate errors using Result<>. This is just laziness.

I want checks in the language to safeguard against lazy Rust developers. I don't want their code in my dependency tree, and I want static guarantees against this.

edit: I just searched unwrap() usage on Github, and I'm now kind of worried/angry:

https://github.com/search?q=unwrap%28%29+language%3ARust&typ...

A lot of this is just pure laziness.


Not to mention overcommit has become standard behavior on many systems, so you wouldn't even get a NULL unless you really tried.


I think I'd prefer a compile-time guarantee.

Something that allows me to annotate a function (or my whole crate) as "no panic", and get a compile error if the function or anything it calls has a reachable panic.

This will allow it to work with many unmodified crates, as long as constant propagation can prove that any panics are unreachable. This approach will also allow crates to provide panicking and non-panicking versions of their API (which many already do).


Yes, I want that. I also want to be able to (1) statically apply a badge on every crate that makes and meets these guarantees (including transitively with that crate's own dependencies) so I can search crates.io for stronger guarantees and (2) annotate my Cargo.toml to not import crates that violate this, so time isn't wasted compiling - we know it'll fail in advance.

On the subject of this, I want more ability to filter out crates in our Cargo.toml. Such as a max dependency depth. Or a frozen set of dependencies that is guaranteed not to change so audits are easier. (Obviously we could vendor the code in and be in charge of our own destiny, but this feels like something we can let crate authors police.)


I think the most common solution at the moment is dtolnay's no_panic [0]. That has a bunch of caveats, though, and the ergonomics leave something to be desired, so a first-party solution would probably be preferable.
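
For reference, usage is a single attribute (sketch assuming the no_panic crate as a dependency); the guarantee is enforced at link time, which is where most of the caveats come from:

    use no_panic::no_panic;

    // Fails to link if the compiler cannot prove this function never panics.
    #[no_panic]
    fn first_or_default(xs: &[u32]) -> u32 {
        xs.first().copied().unwrap_or(0)
    }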

[0]: https://github.com/dtolnay/no-panic


This sounds a little bit like Safe Haskell, which never really took off.


I would be fine just getting rid of unwrap(), expect(), etc. That's still a net win.

Look at how many lazy cases of this there are in Rust code [1].

Some of these are no doubt tested (albeit impossible to statically guarantee), but a lot of it looks like sloppiness or not leaning on the language's strong error handling features.

It's disappointing to see. We've had so much of this creep into the language that eventually it caused a major stop-the-world outage. This is unlikely to be the last time we see it.

[1] https://github.com/search?q=unwrap%28%29+language%3ARust&typ...


I don't write Rust so I don't really know, but from someone else's description here it sounds similar to `fromJust` in Haskell which is a common newbie footgun. I think you're right that this is a case of not using the language properly, though I know I was seduced into the idea that Haskell is safe by default when I was first learning, which isn't quite true — the safety features are opt-in.

A language DX feature I quite like is when dangerous things are labelled as such. IIRC, some examples of this are `accursedUnutterablePerformIO` in Haskell, and `DO_NOT_USE_OR_YOU_WILL_BE_FIRED_EXPERIMENTAL_CREATE_ROOT_CONTAINERS` in React.js.


I would be in favor of renaming unwrap() and its family to `unwrap_do_not_use_or_you_will_break_the_internet()`

I still think we should remove them outright or make production code fail to compile without a flag allowing them. And we also need tools to start cleaning up our dependency tree of this mess.


For iteration, yes. But there are other cases, like any time you have to deal with lots of linked data structures. If you need high performance, chances are that you'll have to use an index+arena strategy. They're also common in mathematical codebases.


I mean... yeah, in general. That's what iterators are for.


WASM traps on out-of-bounds accesses (including overflow). Masking addresses would hide that.


Ditto. Perfect hashing strings smaller than 8 bytes has been the fastest lookup method in my experience.


Problem is, there are a lot of RISC-V instructions way longer than that (like th.vslide1down.vx), so hashing is going to be slow.


You could copy the instruction to a 16 byte sized buffer and hash the one/two int64s. Looking at the code sample in the article, there wasn't a single instruction longer than 5 characters, and I suspect that in general instructions with short names are more common than those with long names.

This last fact might actually support the current model, as it grows linearly-ish in the size of the instruction, instead of being constant like a hash.
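
A rough sketch of that idea (hypothetical names, not the article's code):

    // Load a mnemonic of up to 16 bytes into two little-endian u64 words
    // and use those as the lookup key; longer names take a slow path.
    fn mnemonic_key(name: &[u8]) -> Option<(u64, u64)> {
        if name.len() > 16 {
            return None;
        }
        let mut buf = [0u8; 16];
        buf[..name.len()].copy_from_slice(name);
        // Infallible: both slices are exactly 8 bytes long.
        let lo = u64::from_le_bytes(buf[..8].try_into().unwrap());
        let hi = u64::from_le_bytes(buf[8..].try_into().unwrap());
        Some((lo, hi))
    }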


Note th.vslide1down.vx is a T-Head instruction, a vendor custom extension.

It is not part of RISC-V, nor supported by any CPUs outside of that vendor's own.


Is there a handy list of all RISC-V instructions?

