Yep, I wish I'd had access to this back when I was working on something on the Rails side. Ingesting large amounts of data from Shopify across multiple OAuth access points, while being limited by the PostgreSQL backend and Shopify's rate limits, was a pain to set up with Redis and Sidekiq. I ended up forking Sidekiq to accomplish that -- essentially using Sidekiq as a concurrency framework to build in a different set of behavior. Reading through the GenStage documentation, I could have implemented something better.
The way I implemented it (with Ruby) left gaps of idle time, and huge amounts of data got staged in the queues. I couldn't figure out how to make it better. Since I only had a week to come up with something while the infrastructure was melting down, it was acceptable at the time. Looking at this, though, the demand-based backpressure would work very well.
There are a couple of problems in my current work that I think I can use this for -- this is great work!
Already did. That wasn't the issue. I get that you're trying to help, but you don't really understand the problem and why GenStage would help with it. The issue isn't N+1 queries but concurrency and reasoning about concurrency, something that Rails and many Rails developers seem to avoid.
This looks incredible, congratulations to the Elixir team!
Perhaps more exciting than the first part of the post is the second bit about the future. It's fantastic to see such a clear path forward for concurrency in Elixir. Definitely looking forward to GenStage.Flow.
It's demand-based. The consumer requests a fixed number of events, and the producer sends up to, but no more than, that many until more demand arrives.
In addition to that, demand is asynchronous: the consumer can request more events, in any amount, at any time it wants to.
The default implementation requests more when half of the current demand is processed. So if you set :max_demand to 100, when the consumer has processed 50 events, we request 50 more.
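To make the arithmetic above concrete, here is a rough simulation of that demand loop in Python. This is not GenStage code and not its message protocol; the `Producer`, `Consumer`, and `run` names are made up for illustration, and the only behavior taken from the comments above is "ask for `max_demand` up front, then ask for half of it again after processing half".

```python
class Producer:
    """Hypothetical producer: emits integers, never more than it was asked for."""

    def __init__(self):
        self.next_event = 0
        self.pending_demand = 0

    def ask(self, count):
        # Demand accumulates; the consumer may ask at any time, in any amount.
        self.pending_demand += count

    def emit(self, limit):
        # Send at most `limit` events and never exceed the outstanding demand.
        n = min(limit, self.pending_demand)
        events = list(range(self.next_event, self.next_event + n))
        self.next_event += n
        self.pending_demand -= n
        return events


class Consumer:
    """Hypothetical consumer: re-asks once half of max_demand is processed."""

    def __init__(self, producer, max_demand=100):
        self.producer = producer
        self.threshold = max_demand // 2
        self.processed_since_ask = 0
        self.asks = [max_demand]  # record of every demand request sent
        producer.ask(max_demand)

    def handle(self, events):
        for _ in events:
            self.processed_since_ask += 1
            if self.processed_since_ask == self.threshold:
                self.producer.ask(self.threshold)
                self.asks.append(self.threshold)
                self.processed_since_ask = 0


def run(total_events, batch_size=10, max_demand=100):
    producer = Producer()
    consumer = Consumer(producer, max_demand)
    delivered = 0
    peak_demand = producer.pending_demand
    while delivered < total_events:
        events = producer.emit(min(batch_size, total_events - delivered))
        consumer.handle(events)
        delivered += len(events)
        peak_demand = max(peak_demand, producer.pending_demand)
    return {"asks": consumer.asks, "peak_demand": peak_demand, "delivered": delivered}
```

Calling `run(200)` records the asks as `[100, 50, 50, 50, 50]`, and the producer's outstanding demand never goes above 100, which is the whole point: the consumer bounds how much can be in flight.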
I am on my phone (sorry) but the documentation for the GenStage module has a section on the message protocol between consumers and producers. GenStage is simply one implementation of this message protocol.
I believe it's similar to window updates in HTTP/2. Consumers send an async message to producers with a count of messages they are willing to accept, and it's up to producers not to exceed the count.
I'm not quite at the point where I can appreciate how useful this is, because I don't really understand concurrency itself. What is a good resource from which to learn about concurrency, and especially Elixir/Erlang's approach to it?
If you want to watch a video, check out Joe Armstrong (one of the creators of Erlang) talk about concurrency and Erlang[0]. There are a bunch of Erlang/Elixir videos [1] posted by Erlang Solutions on YouTube.
Before diving into Erlang/Elixir, I'd make sure to learn the basics of concurrency/parallelism with more basic languages such as Java, Python or Ruby. i.e. start with the easy stuff, then with that strong foundation choose as you prefer!
Honestly, learning actor-model concurrency with Erlang (and probably Elixir) is a lot easier than dealing with concurrency in the "basic" languages you list. Concurrency in Java, Python, or Ruby isn't "the easy stuff" compared to concurrency in Erlang/Elixir.
Conceptually, threads are probably easier than actors for a beginner. I'm not saying that it's easier to implement correct code with threads (in fact it's more complex), but most devs go from OO to FP, not the other way around.
These days there is such an explosion of languages, frameworks, techniques, etc. that it must be scary for a beginner. Given that, I'd advise taking the safe (average) path.
But if you learn about concurrency via threads then you have to learn about mutexes... if you learn about mutexes then you have to learn about condition variables... and if you learn about condition variables then you have t...
Well you see where this is going. I think it's important to learn about Threads, Mutexes, Condition Variables, and why sharing isn't caring, but I don't think those lessons necessarily need to come before learning about the Actor model.
On the other hand, if you want to sell the Actor model it might be good to first go through the complex techniques needed for trying to get concurrency right with threads.
Most devs go from OO to FP because of dominance of OO in industry (with knock-on effects on pedagogy) from sometime around the late-1980s/early-1990s until recently. Shortly before that, the same would be true of procedural to either OO or FP. And probably at one point the same would have been true of unstructured imperative to procedural.
But that doesn't mean that concurrency in popular OO languages of today is easier than concurrency in Erlang/Elixir (which may be examples of functional-ish languages, which I assume is the relevance of your OO to FP statement), nor does it mean that concurrency in formerly popular procedural languages is easier than in any particular OO language, or that concurrency in unstructured imperative languages is easier than in any particular procedural language.
I think the only argument left for "why I should learn OO before FP" is the lingering suspicion no one would tolerate the pain involved with OO once they reached that point.
All OTP abstractions are built on the process creation/supervision/messaging primitives. So while Erlang doesn't have this particular abstraction, it is built on the same primitives that both Erlang and Elixir use.
Could this be used to allow Elixir to load balance a third party server?
For example: You have an Elixir load balancer that manages requests for three other servers and distributes load based on the 'demand' that the consumer communicates back to the balancer?
Yep. The neat thing about this is that these are composable primitives, so you could imagine having an arbitrarily shaped tree of demand-creating producer-consumers and consumers. And with distribution, it'd be possible to scale this out pretty far.
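For the load balancer question specifically, the idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not how GenStage dispatches internally: `DemandBalancer`, `ask`, and `submit` are hypothetical names, and "send to whichever backend currently has the most outstanding demand" is one possible policy.

```python
from collections import deque


class DemandBalancer:
    """Hypothetical balancer: hands requests only to backends that asked for work."""

    def __init__(self):
        self.demand = {}       # backend name -> outstanding demand
        self.buffer = deque()  # requests waiting for some backend to ask

    def ask(self, backend, count):
        # A backend reports how many more requests it is willing to take.
        self.demand[backend] = self.demand.get(backend, 0) + count
        return self._drain()

    def submit(self, request):
        # A new request arrives; it is assigned now or buffered until demand exists.
        self.buffer.append(request)
        return self._drain()

    def _drain(self):
        assignments = []
        while self.buffer:
            # Pick the backend with the most outstanding demand (assumed policy).
            backend = max(self.demand, key=self.demand.get, default=None)
            if backend is None or self.demand[backend] == 0:
                break
            self.demand[backend] -= 1
            assignments.append((backend, self.buffer.popleft()))
        return assignments
```

A backend that is slow simply asks less often, so it receives less work without the balancer having to measure its load directly.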
Are there any plans to use these primitives to communicate with servers not written in Elixir? For example can you foresee Elixir being used to manage and coordinate heterogeneous architectures?
I'm on the Erlang rather than the Elixir side of the BEAM, but it's pretty easy and normal in the BEAM to communicate with heterogeneous processes, either by implementing a dirty or regular NIF, opening a port, building a C Node or using JInterface with Java, using CORBA haha, or just opening a socket.
You'd still have the regular BEAM reduction-counting backpressure in place, and it would be trivial to have your flow components report their work rates and/or demand requests to whatever monitoring service (statsd, etc.) that you might care to set up. Or to respond to state query messages like sys:get_state/1.
It's definitely worth trying!
I used akka-stream once, and have always hoped for something like that in the Erlang world. Finally, here it is. Thanks to the Elixir team.