This is the biggest use case for ARM in the short term, and it isn't close.
Lambdas are small, isolated, stateless, and highly abstract. All of these things mean most users can probably just change one line and immediately see a 20-40% cost reduction.
Practically speaking, there is no reason for x86 lambdas to exist anymore (outside of, "we ship a go binary compiled to x86, need to change one line in the CI to compile to ARM, ok done").
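(For the compiled-binary case, that one line really is tiny. A rough sketch, assuming a Go function on the custom/provided runtime where the binary has to be named bootstrap; the exact build command is illustrative:

    # before: GOOS=linux GOARCH=amd64 go build -o bootstrap .
    GOOS=linux GOARCH=arm64 go build -o bootstrap .

Plus flipping the function's architecture setting to arm64 when you deploy.)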
More broadly speaking: as I type this on an M1 MacBook Air, a device with no fan, capable of matching the single-core performance of the brand new i7-11700KF and Ryzen 9 5900X, talking about how new AWS ARM processors will reduce Lambda costs by up to 40% while increasing performance by 20% over x86... we need consumer ARM chips, and we need broader non-Mac desktop application support. Seems like the writing is on the wall, and developers can either keep up with the curve or be left behind.
Right now, very specifically: Looking at you Autodesk. We're coming up on a year of M1. Where's the non-Rosetta release of, say, Fusion 360?
ARM is cheaper on AWS per unit of work done, in the general case. But there should be no assumption that ARM is the snappy little brother that will launch your lambdas in no time while x86 is still starting up; that is the impression I get from reading the comments mentioning ARM. It won't launch your virtualized environment noticeably faster. It'll just do more work for less money, and 40% is probably not the figure that you'll see in real life.
>and 40% is probably not the figure that you’ll see in real life.
There are lots of companies running on Graviton 2 seeing more or less that kind of cost reduction with the switch, including Twitter, to the point that AWS currently can't meet demand for those instances.
Not every workload will benefit, but it certainly isn't a niche market.
Let's be fair here, AWS in general is fairly expensive. So is it really a surprise that customers are clamoring to use a lower-priced (and equally performant) offering?
I don't think it will have any effect on Lambdas, but it does with TeamCity spinning up cloud agents: a t3.medium instance with Ubuntu takes about 1m 20s from launch to starting a build, while a t4g.medium with Ubuntu takes around 37 seconds.
This seems pretty huge ('19% better performance at 20% lower cost') considering many (most?) Lambda workloads will be arch-agnostic, not even needing updated scripts/installed libs etc. Then many more will only need a trivial re-compile.
I'm sure they don't need me to tell them they'll be inundated with demand...
> This seems pretty huge ('19% better performance at 20% lower cost')
I believe the performance boost will not be meaningful and the only important factor is cost per GB*s, given that most Lambda workloads are IO bound or simple event handlers that barely take 1s to run anyway.
Nevertheless this is exciting stuff, especially given that NodeJS might be Lambda's most popular runtime and thus we might only be a toggle flip away from a 20% discount.
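A sketch of that toggle with the AWS CLI, to make it concrete (function name and zip are placeholders; as far as I recall the architecture is set alongside the code package, so double-check the current flag names in the docs):

    aws lambda update-function-code \
        --function-name my-node-fn \
        --zip-file fileb://function.zip \
        --architectures arm64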
Golang is the top dog for Lambda IMO. The AWS Lambda SDK for Golang includes the entire Lambda runtime, so you can deploy a single 20 MB binary with an Alpine Linux base (speaking of the ECR-backed Lambdas) and that's it. With NodeJS you either have to use the AWS-provided CentOS-based images, which weigh in at several hundred MB, or do some multistage magic to build a minimal Alpine image including the Lambda runtime and AWS SDK, which altogether still weighs in at almost a hundred MB.
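For reference, a minimal sketch of such a single-binary Go function (names are placeholders, not anyone's production code):

    package main

    import (
        "context"

        "github.com/aws/aws-lambda-go/lambda"
    )

    // handler just echoes a fixed response for any incoming event.
    func handler(ctx context.Context, event map[string]interface{}) (string, error) {
        return "processed", nil
    }

    func main() {
        lambda.Start(handler)
    }

Build it with CGO_ENABLED=0 for the target arch and the whole deployable is that one static binary.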
TBH, as the container-image-based Lambdas have a 10 GB image size limit, I'm not sure the size matters as much.
It's certainly easier to figure out what's going on in a smaller container though! I've had to debug some nasty situation with layers in Python Lambdas before and it's not fun...
Well, call me surprised. Personally I never bothered using Python runtimes in a production environment, mainly because NodeJS is widely described as pretty much the ideal lambda runtime, and as far as I know there is absolutely no compelling reason to pick Python over it.
Also, for performance, Golang is undoubtedly by far the optimal AWS Lambda runtime.
Consequently, I see NodeJS as pretty much the default runtime in every single application, with a bit of Java (renowned for being by far the worst AWS Lambda runtime) used primarily to leverage code reuse, and of course Golang.
If anyone reading this picked Python for your AWS Lambda runtime, do you mind telling why? I would love to hear your rationale as I'm sure I'm missing an important take.
Ours are all python, none node. In many cases they replaced existing (non-Lambda) python code, though others were green. Why rewrite the code, or switch to node for greenfield stuff? We'd need a compelling reason to adopt node, not to avoid it.
I assume the answer's roughly the same for anyone - it's odd to me that you're talking about 'ideal Lambda runtime', I'd be surprised if many people care. You're going to choose your language by however you were going to choose your language otherwise, and then run it on Lambda. And python is more popular than node in general.
If you really wanted every last gram of performance, you'd be running an optimised compiled binary anyway, not using any of the provided runtimes.
> Ours are all python, none node. In many cases they replaced existing (non-Lambda) python code, though others were green. Why rewrite the code, or switch to node for greenfield stuff? We'd need a compelling reason to adopt node, not to avoid it.
That's a great point. I admit I followed the same exact rationale, motivated by code reuse, to go with Java runtimes in projects that consisted mainly of peeling responsibilities out of legacy Java services, and Java is notoriously by far the worst performing Lambda runtime.
Granted, this was before lambda's pricing granularity was updated from 100ms down to 1ms. Nowadays, a JDK invocation easily costs 50x the cost of a NodeJS lambda invocation, not to mention the concurrency consequences.
> I assume the answer's roughly the same for anyone - it's odd to me that you're talking about 'ideal Lambda runtime', I'd be surprised if many people care.
Sure, there are other constraints and requirements, but in the context of AWS Lambdas, taking into account the way they are priced, the fact that they run on a single vCPU, and that they are mainly IO-bound, NodeJS is very hard to beat. If you're starting a project, are not totally ignorant of AWS Lambda's pricing, and are free to pick whatever runtime you want, it is very hard to justify any rational choice other than NodeJS or, alternatively, Golang.
> If you really wanted every last gram of performance, you'd be running an optimised compiled binary anyway
Hence Golang.
Meanwhile, NodeJS gets you about 90% where Golang takes you without taking a single step outside of the happy path.
> If anyone reading this picked Python for your AWS Lambda runtime, do you mind telling why? I would love to hear your rationale as I'm sure I'm missing an important take.
Libraries and code reuse. Python has libraries for everything, and we have written a whole host of Python libraries of our own. We know how to do what we want to do in Python and many of our lambda functions started their life in either python CLI scripts or as part of a Flask app. "Copy pasting" working code into AWS Lambda is much quicker and easier than rewriting in Node or Go.
If we're 'moving' to anything it would be moving some functions to C(++)
We work with machine learning and computer vision stuff, so python is our language of choice for the main codebase (which is non-lambda). Given that, it makes sense to stick with python for lambda related things for code reuse and lower cognitive overhead.
> I believe the performance boost will not be meaningful and the only important factor is cost per GB*s, given that most Lambda workloads are IO bound or simple event handlers that barely take 1s to run anyway.
At @dayjob we’ve got plenty of Lambda workloads where we’re butting up against the maximum execution time limit; while that's mostly a “not optimally architected for Lambda” problem, an extra ~20% capacity before that becomes a constraint rather than a concern would be welcome.
It will probably take a while to ramp up - it's explicitly opt-in and I expect there's a huge amount of existing code that creates lambdas without being able to specify the architecture. For example, Terraform and CloudFormation don't support it yet. And if you're using a 3rd party layer you need to wait for them to publish a layer that explicitly supports that architecture.
Sure, but then you have the CDK, etc. I'd be interested to know how many people create lambdas in the console and not via some other tooling, which needs to be updated to support this.
Precisely. Once you start to consider how other ISAs factor into that (like perhaps RISC-V on the horizon) I really wonder if ARM has a chance to take off in the first place. A ~50% price/performance boost isn't negligible, but it's also not very compelling. Especially for larger companies who'd need to invest more time/money into transitioning their backend.
50% price/performance is very compelling. If you talk to a large company and say "you can cut your compute bill in half with a single re-deploy" they will be very happy.
> A ~50% price/performance boost isn't negligible, but it's also not very compelling.
My take is that a 50% discount in price/performance and a 20% discount in price per execution unit that's only a toggle flip away is something everyone loves and is eager to jump on.
The only exception is when your workload stubbornly stays within free tier limits, in which case you have no compelling reason to do the work.
I would love to know what the margin boost is to AWS as a result of Graviton. It's cheaper to customers but I have to imagine better margins to them - so kind of a win/win. It's just amazing how they're pushing it out to every AWS service at this point. We transitioned over our RDS instances to Graviton this last weekend and have a nice performance boost with ~10% savings.
Disclosure as I'm Co-Founder and CEO of Vantage - but we'll likely be adding Graviton recommendations to our suite of cost saving recommendations on https://www.vantage.sh/ to give per resource views of potential savings (i.e. you can save X% on this Lambda function based upon the costs we saw by switching to Graviton, etc.)
From folks I know at AWS, service teams are all being heavily pushed to run their services on Graviton, to the point of having to justify why they couldn't. The cost savings internally are just as significant.
I am now going to go down the rabbit hole of ARM-specific bugs in the libraries I use, because so far none of the applications I develop (on ARM Ubuntu) have encountered any.
I think they will transition all services that do not expose infrastructure (think Cognito, Pinpoint, VPC, API Gateway, CloudFormation, ...) ASAP.
This is awesome. In my benchmarking on some EC2 instances the Graviton 2 is a great processor for web workloads (frontend web apps). It really is 20% more performance for 20% less cost.
Where in the world are GCE and Azure with ARM processors to compete? It's been how many years now and nothing to show yet? Customers aren't going to wait much longer...
> Where in the world is GCE and Azure with ARM processors to compete?
I can't speak for Azure, but Google Cloud looks like they've gone all-in with AMD's EPYC platform for their premium option, and offers the E2 instance family for those that want cheap compute and don't care what the underlying platform is.
> Customers aren't going to wait much longer...
I can't imagine anyone outside of hobbyists and tiny startups changing their entire hosting platform due to it not having an ARM offering.
> I can't imagine anyone outside of hobbyists and tiny startups changing their entire hosting platform due to it not having an ARM offering.
Don’t the economics work the other way though? The bigger the workload the bigger the saving and the easier it will be to justify moving. Probably a degree of flexibility in pricing to keep the biggest customers though!
Microsoft is partnering with (as well as being an investor in) Ampere. And Google is going with AMD EPYC, offering a vCPU as a single CPU core instead of a thread.
I mean Amazon has been preparing for this since 2015 when they acquired Annapurna Labs.
Considering TSMC somehow increased their planned capacity for 5nm by 50%, there may be a chance you will see that going to Microsoft or Google in 2023.
This is awesome, saving that much money can be great for startups.
Oddly, I can't see Lambdas being enough of a cost to justify it for hobbyists such as myself.
But this is a great sign of things to come, so much energy is consumed by data centers. Then again, I wonder how much code will randomly break, plus AWS’s dependency management is literally bundle it up locally and upload a zip.
What happens if an ARM package can't be built locally?
> Then again, I wonder how much code will randomly break, plus AWS’s dependency management is literally bundle it up locally and upload a zip.
I much prefer using Docker containers since they added support for this. There's an established toolchain which supports things like this and it fits well into popular deployment pipelines from things like GitLab/Github.
> AWS’s dependency management is literally bundle it up locally and upload a zip.
That's true and not great IMO either, I also wish the zip wasn't necessary - in the 'bootstrap' binary case just let me upload the binary! - but in case you're not aware of layers (just guessing from 'dependencies') it can be improved a bit:
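Roughly, you publish your dependencies once as a layer and attach it to the function, so the per-deploy zip only carries your own code. A sketch with made-up names (the --compatible-architectures flag is, as far as I know, how a layer advertises arm64 support; double-check against the current CLI):

    # deps live under python/ (or nodejs/node_modules/) inside the zip
    zip -r layer.zip python/
    aws lambda publish-layer-version \
        --layer-name my-deps \
        --zip-file fileb://layer.zip \
        --compatible-runtimes python3.9 \
        --compatible-architectures arm64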
Ah I'd forgotten about that, thanks! I haven't created any since then other than via Serverless, which... a large part of my distaste for it is the opaqueness and not knowing what it's doing, but I assume it uses the zip method.
I wish CloudFormation allowed for more than 4096 bytes of inlined code. The underlying API allows for 50MB; sometimes it's really nice to be able to deploy from CloudFormation without all of the overhead of S3 or ECR.
As already mentioned, Docker images are also available and that's my preferred solution.
If your concern is more about having access to ARM hardware, CodeBuild is a reasonable option. CodePipeline is more expensive than it should be given the feature set, but CodeBuild is cheap and easy to use directly.
Alternatively, I've been intending to move some of my build to Lambda itself. Easy to bootstrap if you use an architecture independent language like Python to script your build. And if your code takes more than 15 minutes to build, should it really be a Lambda function?
To be completely clear, if you just need a Python script to basically sync a few AWS services together, this is great. That's how I use Lambdas now.
My main concern is that if I'm using something like NumPy and I pull my dependencies locally on my x86-64 machine, I have no idea what's going to work once I push it up to AWS ARM.
> This is awesome, saving that much money can be great for startups.
This doesn't sound very realistic. Cost-conscious organizations would hardly consider Lambdas a reasonable option unless they're well within the free tier limits. Otherwise Lambdas are far more expensive than simply handling requests directly with a service running on EC2/ECS/Fargate.
If you have really well-tuned autoscaling or really consistent request workloads. My company is still provisioned for peak load at all times, which is not cheap.
We have another product built on lambdas where the compute is cheaper than the monitoring, security tooling, etc attached to the account.
Depends on your workload - it’s about $3.50 per month for the smallest T series VM on the SPOT market in the cheap regions (us-east-1, us-east-2, us-west-2), that is your next step up from lambda for minimal workloads. You can run a whole lot (many many millions) of lambda invocations before your bill gets anywhere close to 3.50. But after you exceed that point a dedicated instance running your workload full time starts to look nice.
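Rough numbers to back that up, using the published x86 pricing (~$0.0000166667 per GB-second plus $0.20 per million requests) and assuming a small 128 MB / 100 ms function:

    0.125 GB x 0.1 s = 0.0125 GB-s per invocation  ~ $0.00000021
    + request charge                               ~ $0.00000020
    $3.50 / ~$0.0000004 per invocation             ~ 8-9 million invocations

So somewhere in the high single-digit millions of small invocations per month before a $3.50 instance breaks even; heavier functions cross over much sooner.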
You could also run a Digital Ocean droplet for less, which might make sense if you're on a tinkerer budget and don't need IAM access control or VPC access.
I wonder if the ARM-powered Lambdas have a different vCPU scaling. Supposedly, you get between 2-6 vCPUs for x86-64 Lambdas, depending on how much memory you configure for the Lambda.
But, the x86-64 vCPU is not a "real core". It's a hyperthread (SMT), or 1/2 a core. So I'm curious if the scale is still 2-6 vCPUs for Graviton, where a vCPU == A real core...since there's no SMT on Graviton.
Ignoring SMT (as others already mentioned it's not used for AWS Lambda), I also expect different vCPU scaling and they also hint at that in the detailed blog post for this feature [1]:
> Workloads using multithreading and multiprocessing, or performing many I/O operations, can experience lower execution time and, as a consequence, even lower costs.
Interesting. Just tried running "cat /proc/cpuinfo" on an X86 Lambda, and I get:
siblings : 2
cpu cores : 2
Which would imply no hyperthreads, to the degree that their hypervisor is telling the truth. So Lambdas are then one of the few places where an x86 core is a core. Thanks for the info.
I highly recommend Andy Jassy's infrastructure keynote from re:Invent 2020. Among many other things, he discusses why Amazon never gives you a single hyperthread (it's mostly about cache coherency, IIRC).
Thank you for the reference. Clearly you are right that Lambda runs on EC2. Specifically they mention that Lambda runs on EC2 bare metal instances (i.e. without any of the containerization), which makes a lot of sense to me.
I feel in this case the distinction between EC2 and EC2 bare metal is very important. Based on the GP's use of "EC2"[0] it's clear they meant "the EC2 containerization model/code", rather than "EC2" in the sense of "the EC2 hardware and/or software together".
I'll correct my original response: "While Lambda does run on EC2 hardware, Lambda uses Firecracker rather than the EC2 containerization model/code. I would be very surprised if we can correctly infer anything about Lambda's containers based on EC2's containers."
Maybe one day in the future (or already?) EC2 will also use Firecracker. At which point this comment should be deprecated.
[0] "You can look at the EC2 t instance series to see how AWS might think about constraining resources."
Yes, I had assumed SMT because they say they give you between 2-6 vCPUs. I assumed that meant 2, 4, or 6 hyperthreads, where they could keep you from sharing a core with neighbors. But it seems you get real cores.
Ideally on serverless infrastructure the system would "move" jobs around so they run on the optimal platform for what they are doing continuously (based on price or based on raw compute, etc.)? I know there are many reasons this will be difficult in real life, but it sounds like where it should be heading.
> Ideally on Serverless infrastructure the system would "move" jobs around so they run on the optimal platform for what they are doing continuously? (Based on price or based on raw compute etc)
AWS Lambda supports about half a dozen runtimes, some based on interpreted languages, and some based on compiled languages. Thus you can't simply move golang or rust code between completely different architectures and expect things to work.
Also, one of AWS Lambda's use cases is having your runtime call precompiled binaries. I'd be pissed if my Lambdas ceased to work because AWS decided to run my trusty python-calling-C++-lib Lambdas on ARM just because they want to push people towards Graviton.
Hopefully ECS/fargate will also be supported soon. I tried shifting our CI workers to ARM but it resulted in not being able to use them to build ECS images, which was not great.
Cross-compilation is iffy at best -- too many developers of build scripts don't think about it, and end up not supporting it (e.g. Makefile doing codegen using a binary built during the run for the target arch) or having subtle and mysterious bugs (./configure saves things for the build arch, not the target arch)
There seems to have been a rain of ARM permutations lately, most of them proprietary?

Apple has their M1 and M2 (coming). MS has their SQ1, SQ2. Tesla has the FSD chip. AWS has Graviton. Nvidia, Snapdragon, Broadcom... I am sure there are many more.

I know Apple has not publicly documented their chips. I don't know for sure about the others.

Will all of this make building compilers magnitudes more complex? If you have a codebase and you want it to be optimized for M2, with all of its goodies like the GPUs and neural whatnot, and then need the same codebase to run as fast as possible on the Graviton CPUs, that sounds difficult?

Maybe there is a subset of opcodes shared across all or most of them, but is it reasonable to limit compilers to them? I am sure I am just confused, but it seems to me to be a bit of a mess.
That subset of opcodes is the ARM instruction set. So yes GCC will have no problem. Even the SIMD extensions are under ARM's purview.
The neural processors, GPUs, TPUs, etc. aren't part of the CPU anyway (they're part of the SoC).
This fragmentation has some downsides, but having a good compiler isn't one of them. It's more likely all the different ways these SoCs handle booting, connecting to peripherals, etc. This will create a huge burden on something like the Linux kernel if it attempts to support them all (for evidence, check out the recent work on Linux M1 support... fascinating stuff).
ARM itself is proprietary, including the ISA and several micro-architectures that implement the ISA.
I don't think it's shocking that folks who licensed a proprietary ISA, and then paid a stiff license fee to implement their own microarch off of a proprietary reference microarch, would keep their chips proprietary.
RISC-V is the only hope in this arena, and luckily David Patterson is generally signal enough that something might be a good idea.
Google v. Oracle seems to suggest that you could do that, potentially? (IANAL.)
On the other hand, the fact that Arm doesn't produce chips might be an important factor, if there's an argument that the Java API is incidental to Java IP but the ARM ISA is the substance of the IP.
I think this would also depend on the extent to which the ARM ISA specifically is protected by Arm's patents, since Google v. Oracle was strictly related to copyright. It might be the case that what you're suggesting is plainly illegal, or I could see it being a kind of grey area where it's fine per se, so long as no one actually puts it into practice without purchasing a patent license from Arm.
Suspect that ISA can’t be patented but efficient implementation of aspects of it are. (IANAL again)
However the cost of implementing the Arm ISA in a clean room is probably more than the licensing cost from Arm - they are spreading their costs over lots of customers and you are not - so why would you do it?
I changed a few very simple Lambda routines running Node.js v14 from "x86_64" to "arm64". Nothing complicated: modules that run relatively quickly doing some very basic calculations. I was seeing double the running time. One routine that was reporting a billed duration of 2049 ms with "x86_64" reported 4243 ms with "arm64". Switched it back to "x86_64" and the runtime halved as expected.
So at least with Node.js, I see no benefit with the arm64 architecture. YMMV.
I have been using a Graviton 2 based EC2 to host some production web server and database workloads. I love the savings!
Publishing my docker images to my registry as ARM images from my x86 based CI pipeline is a little weird though. Would be nice if docker baked in all the QEMU magic and let me use `docker build --target-arch arm64`.
Use docker buildx; it's the BuildKit engine, which fully supports cross-platform builds. It's essentially just what you mention: add a CLI flag and it spits out an ARM image even on your Intel machine. See: https://www.docker.com/blog/multi-arch-build-and-images-the-...
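The invocation ends up pretty close to what you wished for. A sketch (image name is a placeholder; the QEMU piece is typically registered once via the binfmt helper image):

    # one-time: register QEMU handlers on the x86 host
    docker run --privileged --rm tonistiigi/binfmt --install arm64
    # build (and push) an arm64 image from an x86 machine
    docker buildx build --platform linux/arm64 \
        -t registry.example.com/app:latest --push .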
Depending on what your idea of a computer is, the reference carrier board for the Raspberry Pi 4 compute module has a PCIe x1 slot. The Jetson family of developer kits have PCIe connectivity. The 4GB Jetson Nano has an M.2 E-key slot, which carries a PCIe x1 lane; with the right adapter, available from Amazon or Alibaba or wherever, you can plug in a PCIe card of your choice. The Xavier NX has two M.2 slots, which again, with adapters, can connect to PCIe cards. Though I'm not sure of the use case, as the Jetson line has fairly capable GPUs built in.
So... my use case for GPU Lambdas is inference: say 500 inferences/day at ~1 sec/inference, and it has to be real time as the output is user facing.
Can they really spin up quick enough to actually respond in real time? I don't really want to have to keep them warm as I'll end up paying through the nose for it.
For some of my use cases the real-time response isn't essential; it just needs one-off containers for training. I figured I could just go with ECS, but thought if these were wrapped as Lambdas it would just be a matter of toggling the base image/Lambda type.
Thanks for this! I was hoping it was going to be this easy for a SAM based Go app but didn't have it in me to wade through the AWS docs. This post is $.
You can run an arbitrary binary. Keywords for searching docs are 'bootstrap' (the entrypoint name) and 'custom runtime'.
In this way you can also run whatever version of the provided language runtimes you want, not limited to the official ones if you want newer python syntax or node stdlib features than are available, for example.
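For the curious, a bootstrap is just an executable that loops over the Lambda Runtime API. A minimal sketch in shell that returns a canned response (a real one would invoke your actual handler with the event):

    #!/bin/bash
    set -euo pipefail
    API="http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation"
    while true; do
      HEADERS="$(mktemp)"
      # long-poll for the next event (a real handler would use $EVENT)
      EVENT="$(curl -sS -LD "$HEADERS" "${API}/next")"
      REQUEST_ID="$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)"
      # report this invocation's result back to Lambda
      curl -sS -X POST "${API}/${REQUEST_ID}/response" -d '{"ok":true}' > /dev/null
    done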
Not at all. If you want to run assembly code that targets a specific architecture, you need to know what that architecture is going to be. At its heart Lambda is just a Linux container running a CGI app.
Everything is assembly; that's how your CPU works. If you run C code, Rust, Go, whatever, you have to target an architecture so the compiler can generate the correct assembly instructions. Even Python or JS interpreters must be recompiled for each architecture you wish to run those languages on. Lambda doesn't change that; as the comment above says, it's just a container.
The whole point of containers is to abstract away the underlying OS & hardware. If you need CPU-architecture-specific code, running it on Lambda seems a bit pointless to me.
The default docker runtime does not run the containers in a VM but instead on top of the host kernel using namespaces. In this case the kernel which the container utilizes is the same kernel running the rest of the system. The container is limited by the CPU architecture and configuration of the underlying kernel.
If your container runtime uses an emulator (say QEMU) you are able to run containers which do not match the host system, albeit at a slower speed than running the container using a matching architecture and namespaces.
Containers are more about abstracting away kernel state tables, but not deeper env like processor arch. You can see this in how there was a lot of work into deploying aarch64 containers over the past year to support M1 docker.
And lambda is more about changing the process lifecycle from traditional always using resources. Sort of the inetd for hyperscalers.
The whole aarch64 and Docker thing is my point. I use Docker to abstract that pain so I don't have to care about it (to a certain degree). If all of a sudden I need to worry about my underlying arch, then I might as well run on VMs. For Lambda, it's more of a micro-OS anyway than a generic Docker container.
I'm not sure what you mean by not caring about underlying arch. With docker you use an arch-specific image. The author may transparently provide multiple arch versions of the image, but I wouldn't call it abstracting. (not in the sense fat-binaries do for example) Specifically, if you then make your own app image on top of the base one, it will again need to be built for each architecture separately.
Docker explicitly doesn't abstract aarch64. That's why it took tons of work to build up the ecosystem this far and even then and still you'll come across public images that aren't multiarch.
If you can compile the single task your lambda will perform down to as close to bare metal as possible, without affecting your workflow, why not?
I use JavaScript in the few Lambdas I have because of developer experience. What little additional cost it would add is offset by the speed of development for me, and by how important that speed is to my tasks.
Lambda was never meant for low-latency CPU apps, where speed is king, due to startup cost. I don't see why you would run architecture-specific code on Lambda.
There are languages like Go, Rust or C that are compiled. If your lambda uses compiled languages you need to recompile to the right platform.
Additionally, a lot of interpreted languages depend on native libraries. You need to build those libraries for the right arch. It happens a lot if you want, for example, to deploy a Rails app on Lambda.
By the way, Lambda works great for low latency apps. The "startup cost" is negligible if you know how to implement it. You can serve ML models from a lambda, and in that case you want a math library optimized for the platform. Unless you have a lot of traffic a Lambda usually is cheaper.
I wouldn't have a task that would be quite so extensive on Lambda (though tbh I don't know how its cost compares to AWS's video encoder service).
But in a similar vein, extracting interesting thumbnails at scale would be one use case where Lambda with assembly/compiled static bins may be a good choice.
Unless you want to restrict lambda to only run interpreted code with interpreter & binary dependencies provided by AWS, or run things in emulation, you can't really hide it.
It's a question of how much you can do with the tool. If Lambda only supported, say, pure Python or Node libraries you'd run into cases where you couldn't use it or would see big performance penalties. Being able to push a container with, say, a high-performance image processing library which your Lambda needs makes it a lot more versatile but also forces you to think about things like x86 or ARM.
Overall, I see that as a win: you can get started by putting code in something like Python or JavaScript and not caring but when you do hit more complex dependencies you have a path which is an extension of what you've already been using rather than having to switch to a completely different service like ECS.
10,000,000 300ms invocations with 1gb of memory is under 30 dollars. That's dirt cheap, over a month of CPU time that you can burst to tens of thousands of concurrent executions without caring.
Not saying AWS doesn't come with a premium, but lambda isn't exactly the poster-child for expensive compute.