Hacker News | vfaronov's comments

> While that sounds like a strange implementation detail, the philosophy of the .Net team has always been "how do you reasonably recover from an stack overflow?"

Can you expand on this or link to any further reading? I just realized that this affects my platform (Go) as well, but I don't understand the reasoning. Why can't stack overflow be treated just like any other exception, unwinding the stack up to the nearest frame that has catch/recover in place (if any)?


> Why can't stack overflow be treated just like any other exception[...]?

Consider the following code:

    func overflows() {
        defer a()
        
        fmt.Println("hello") // <-- stack overflow occurs within
    }

    func a() {
        fmt.Println("hello")
    }

The answer lies in trying to figure out how Go could successfully unwind that stack. It can't: when it calls `a`, it will simply overflow again. Something that has been discussed is a "StackAboutToOverflowException", but that only kicks the can down the road (unwinding could still cause an overflow).

In truth, the problem exists because of implicit calls at the end of methods interacting with stack overflows, whether that's because of defer-like functionality, structured exception handling, or destructors.


But doesn’t this apply to “normal” panics as well? When unwinding the stack of a panicking goroutine, any deferred call might panic again, in which case Go keeps walking up the stack with the new panic. In a typical server situation, it will eventually reach some generic “log and don’t crash” function, which is unlikely to panic or overflow.
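To make the scenario concrete, here's a minimal sketch (function names are mine, not from the thread) of a deferred call that panics while the stack is already unwinding; a generic recovery layer at the top still catches it, now carrying the newer panic value:

```go
package main

import "fmt"

// inner panics, and its deferred function panics *again* while the stack
// is unwinding; Go replaces the in-flight panic with the new one.
func inner() {
	defer func() {
		panic("panic during unwind")
	}()
	panic("original panic")
}

// runAndRecover plays the role of the generic "log and don't crash" layer.
func runAndRecover() (recovered interface{}) {
	defer func() {
		recovered = recover()
	}()
	inner()
	return nil
}

func main() {
	fmt.Println("recovered:", runAndRecover()) // recovered: panic during unwind
}
```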

Perhaps one difference is that, while panics are always avoidable in a recovery function, stack overflows are not (if it happens to be deep enough already). Does the argument go “even a seemingly safe recovery function can’t be guaranteed to succeed, so prevent the illusion of safety”?

(To be clear: I’m not arguing, just trying to understand.)


I'm not actually sure what Go would do in a double-fault scenario (that's when a panic causes a panic), but assuming it can recover from that:

In the absolute worst case: stack unwinding is itself a piece of code[1]. In order to initiate the unwind, and deal with SEH/defer/dealloc, the Go runtime would need stack space to call that code. Someone might say, "freeze the stack and do the unwind on a different thread", but that operation is itself at least one stack frame and needs stack space to execute.

I just checked the Go source, and goroutine stacks live on the heap[2]. When a stack is about to overflow, the runtime allocates a bigger stack, copies the existing frames over, and continues there (very old versions of Go instead chained fixed-size segments together, like a linked list). This does have a very minor performance penalty. So you're safe from this edge case :) at least until the maximum stack size (1 GB by default), at which point the runtime still aborts the program.

[1]: https://www.nongnu.org/libunwind/

[2]: https://golang.org/src/runtime/stack.go
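The growth behavior is easy to observe. With a fixed-size stack, a million-deep recursion would overflow; Go instead detects the nearly full stack, allocates a larger one on the heap, copies the frames over, and keeps going (a sketch, not the runtime's own code):

```go
package main

import "fmt"

// depth recurses n times with no tail-call optimization, so each call
// really occupies a stack frame; Go grows the goroutine stack on demand.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + depth(n-1)
}

func main() {
	fmt.Println(depth(1_000_000)) // 1000000, no overflow
}
```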


The title of this HN post is completely out of context, and the discussion here is unrelated to the article. The author is not at all arguing against types or Haskell, nor is he bashing QuickCheck (he’s developed a QuickCheck-like framework for Python). He’s merely complaining about some details of QuickCheck’s design.


I agree with your observation, except for the part where he's not bashing QuickCheck (he is, but you have to click the first link in the article in order to find it).

> One of the big differences between Hypothesis and Haskell QuickCheck is how shrinking is handled.

> Specifically, the way shrinking is handled in Haskell QuickCheck is bad and the way it works in Hypothesis (and also in test.check and EQC) is good. If you’re implementing a property based testing system, you should use the good way. If you’re using a property based testing system and it doesn’t use the good way, you need to know about this failure mode.

> The big difference is whether shrinking is integrated into generation.

> Integrating shrinking into generation has two large benefits...

> The first is mostly important from a convenience point of view...

> But the second is *really* important, because the lack of it makes your test failures potentially extremely confusing.


How much of this applies to 1~100KB responses?


1k, not so much, since there is no aggregation that can happen there anyway.

100k is not that much different from 100 MB, except the TCP window will not be as far open, so TSO will not be as effective.

Note that I work on a CDN that serves large media files, so I'm biased towards that workload.


HTTP is an inherently complex protocol, which has over time accrued many idiosyncratic, non-orthogonal features to support various use cases of the growing Web. Just consider that there exists an entire class, entire design space of libraries known as “HTTP routers” which boil down to extracting arguments from the first line of an HTTP request.
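To illustrate how little of HTTP that design space actually touches, here's a deliberately minimal sketch (not any real router's API): the input to "routing" is essentially just the request line.

```go
package main

import (
	"fmt"
	"strings"
)

// route dispatches on nothing but the first line of an HTTP request,
// e.g. "GET /stars/98765 HTTP/1.1". Handler names are made up.
func route(requestLine string) (handler, arg string) {
	parts := strings.SplitN(requestLine, " ", 3)
	if len(parts) < 2 {
		return "badRequest", ""
	}
	method, path := parts[0], parts[1]
	if method == "GET" && strings.HasPrefix(path, "/stars/") {
		return "showStar", strings.TrimPrefix(path, "/stars/")
	}
	return "notFound", ""
}

func main() {
	fmt.Println(route("GET /stars/98765 HTTP/1.1")) // showStar 98765
}
```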

If you want simplicity, and you fully control both sides, and you don’t care about the systemic advantages that REST purports to provide, then your best bet is to avoid HTTP altogether (which in practical terms may of course mean tunneling through it) and stick to a simple, modern RPC protocol.


> Just consider that there exists an entire class, entire design space of libraries known as “HTTP routers” which boil down to extracting arguments from the first line of an HTTP request.

This comment sounds a bit disingenuous. Routing involves way more than extracting arguments from the first line of an HTTP request, but its complexity is not due to HTTP. Routing is based on content negotiation, and essentially everyone is free to design their own personal content negotiation process, and very often they do.

Take, for example, content type. Do file extensions matter? Does the Content-Type header mean anything? If both are used what should prevail? What should the router do if none was passed?

On top of that, then let's add HATEOAS, HAL, content type API versioning, custom headers, etc...

In the end developers need to map HTTP requests to an action, and HTTP is not the problem.

Libraries are helpful not because the problem is complex, but because ready-made solutions are better than rolling our own. There are plenty of libraries not because HTTP is complex, but because plenty of people have their personal preference.
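The content-type questions above can be sketched as one possible precedence policy (my own toy example; every framework picks its own answers):

```go
package main

import (
	"fmt"
	"mime"
	"path/filepath"
)

// negotiateType answers the questions in one arbitrary way: a recognized
// file extension wins, then the Content-Type header, then a default.
func negotiateType(path, contentTypeHeader string) string {
	if ext := filepath.Ext(path); ext != "" {
		if t := mime.TypeByExtension(ext); t != "" {
			return t
		}
	}
	if contentTypeHeader != "" {
		return contentTypeHeader
	}
	return "application/octet-stream"
}

func main() {
	// No extension, so the header prevails.
	fmt.Println(negotiateType("README", "text/plain")) // text/plain
}
```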


> extracting arguments from the first line of an HTTP request.

The reason this sounds simple is that the use of “arguments” conceals a good deal of semantics. Yes, you can cut a huge amount of complexity out by using a subset of HTTP, and I do recommend that, but by switching to a “modern” RPC protocol you also lose a good deal of introspection, low-effort ramp-up for new contributors, and highly replicated semantics on the server and client side. You can debug or interact with curl (or a number of similar tools), or a browser, or virtually any programming language, with many choices of implementation. There is a ton of commodity tooling for log parsing, playback inspection, testing, fuzzing, load balancing, etc. You get a lot for “free”, and the potential complexity by itself is a poor excuse for throwing the baby out with the bathwater.

I also realize you may have acknowledged this by referencing “systemic” benefits, but I’d like to spell out just how large they are.


Case in point: CacheControl [1]. Plug it in and you get flexible caching — interoperable with NGINX, the browser, and everybody else — for very little. But it relies on the protocol. If you respond to GET /stars/98765 with 200 (OK) and an “under construction” placeholder, it’s going to get confused.

[1] https://github.com/ionrock/cachecontrol


RFC 7240 also defines an optional way for the client to signal whether it wishes the request to be processed asynchronously: https://tools.ietf.org/html/rfc7240#section-4.1


My first impression is that RFC 7240 isn't appropriate for this use case, because it is proposed as a way for the client to signal optional/preferred behavior, whereas here said behavior is already made mandatory by the server.

To put it differently, why would it matter if a client POSTs with Prefer: respond-async if the server response will always be async? Whether that header is present or not, the HTTP response is already designed to always return a Location header.


https://tools.ietf.org/html/rfc7231#section-6.5.9

> It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.


Outstanding explanation. I would like to emphasize one point:

> If you have multiple systems with data, and you want to merge the data together into a unified whole for some reason

This isn’t just about “big data” or “data warehousing” or such OLAPy concerns. It’s really about the bread and butter of modern information systems.

“Get some ID from service A, use it to query service B.” Does that sound familiar? That’s data integration for you. That’s what RDF / Linked Data is about.


This article does not have a valid argument against using JSON-LD.

It may be read as an argument against making namespaces a ubiquitous part of JSON, in the same way as they are ubiquitous in XML (even though not part of the base XML spec). It’s a valid argument against getting to the point where namespaces pop up in every JSON tutorial, and every JSON serializer takes a namespace map, and every user of JSON has to deal with them.

But JSON-LD (as RDF) uses namespaces to solve a particular problem (merging data from disparate sources). Solutions involve costs; that's normal. You might weigh this cost against the expected payoff and decide it's not worth it for you, but the article doesn't do that.


RDF was not designed for arbitrary graphs. It was specifically designed for resource descriptions on the Semantic Web.

> reliance on the web infrastructure (IRI, DNS, web servers, ...)

This is the whole point of RDF. It is what enables Linked Data.

> I read the history of JSON-LD by one of its main author [1], and it does not originated in the RDF community.

Not sure what you mean exactly, but here’s from http://manu.sporny.org/2014/json-ld-origins/:

> JSON-LD started around late 2008 as the work on RDFa 1.0 was wrapping up. We were under pressure [...] to come up with a good way of programming against RDFa data.

JSON-LD definitely was always an RDF serialization, created by people intimately familiar with RDF and the Semantic Web.


> RDF was not designed for arbitrary graphs.

Exactly. That's precisely my point. So now my problem is other scholars asking "why don't you use our model X or Y based on RDF" when the only thing I care about is the graph part. And I don't use RDF because of the points I exposed earlier. I should have emphasized that I'm working in a given field (lexicography), and that the RDF issues I face may be irrelevant for some other use cases.

As for the JSON-LD origin, I stand corrected.

