Why cloud bandwidth is so obscenely expensive and what you can do about that (kerkour.com)
154 points by randomint64 on Oct 26, 2023 | 158 comments


I wonder if

  There also is Hetzner and their dedicated servers. You pay for a server with a 
  1Gb/s connection and all egress is free. Unfortunately, I can't recommend them 
  as each time I've tried to create an account, their abuse detection systems 
  banned my account even before I had the occasion to enter a payment method.
is related to

  I've myself won the jackpot in the past on Netlify where I forgot to put a 
  sleep in a while (true) loop, making my script flood my website and generate a 
  lot of traffic. For the story, I quickly migrated the website to another 
  provider, deleted my account, and never had to pay the bill :)
Is there some kind of blacklist if you do something like that? But I do know Hetzner has some extremely strict policies regarding account creation.


Hetzner's default process requires contacting a human before your account is ready for any use.

Looks like they have some exceptions, so some people manage to just make an account and use it. And it looks like there's some requirement in German law that forces their hand. But I don't think the OP got flagged for anything.


> Is there some kind of blacklist if you do something like that?

No; but oddly enough, the people who "do things like that" generally tend to come from certain countries — I guess countries with cultures that don't place much weight on the concept of "incurring a debt of honor" by consuming someone else's resources without them ever knowing about it.

So most systems don't generally need a big, manually curated and ever-growing blacklist; they just need to block registrations from IP addresses / ASNs of ISPs headquartered in these countries; and/or block payment attempts from credit cards issued by banks headquartered in these countries. That immediately stops 90% of such abuse.

And of the remaining 10%, half of it is still people from those same countries — just using foreign IPs through residential-botnet VPNs, and stolen credit cards they purchased on scammer forums. Blocking these is a bit of an art, but it's possible: there are always patterns to the requests themselves, often because those same scammers try to solve all their problems with money, and so have also purchased scam-site kits to run on the hosting they acquire — things like cryptocurrency "drainers." If you're a VPS hosting provider, you can just detect these by the SHAs of the files; if you're a dedicated hosting provider with no access to customers' disks, it's still pretty easy to pick these out by the outbound signature of the network traffic they generate — as they almost always rely on making requests to particular third-party SaaS information systems that you can turn into an IDS detection fingerprint.
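A minimal sketch of the file-hash half of that (the digest below is a placeholder, not a real kit signature, and a real provider would maintain the set from abuse reports):

  import hashlib
  from pathlib import Path

  # Placeholder digest set; populate from samples of known scam-site kits.
  KNOWN_KIT_SHA256 = {
      "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
  }

  def scan_customer_volume(mount_point):
      """Flag files whose SHA-256 matches a known scam-kit artifact."""
      hits = []
      for path in Path(mount_point).rglob("*"):
          if path.is_file():
              digest = hashlib.sha256(path.read_bytes()).hexdigest()
              if digest in KNOWN_KIT_SHA256:
                  hits.append(path)
      return hits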

(I'm personally in a different position in this ecosystem — my company operates one of the public informational SaaS services that these scam-site kits like to use. From my company's perspective, these scammers are perfectly normal paying customers, not intending to scam us... but we don't want these people as customers, so we still detect this fraudware by the fingerprint it makes in our API request logs, and permaban the users who deploy such kits by every fingerprinting metric we can.)

---

Though, on another note, I suppose you could call the observational "IP reputation" metrics gathered by providers like https://www.ipqualityscore.com/ something like a blacklist — and I'm sure hosting providers like Hetzner check your "IP score" before letting you register. But these aren't blacklists in the sense of being manually curated.

Instead, what these providers curate is something like a distributed version of an SSHGuard blocklist: a bunch of the provider's own "observer nodes", all over the world, observe what IPs are hitting them with DDoSes and other botnet-like activities, and consider these IPs temporarily compromised for as long as that activity persists (because any device infected by a botnet can potentially be repurposed as a part of a residential-proxy VPN network — and that means that any traffic observed to come from such a device, can't be trusted to be originating from that device.)

IIRC, these providers will also "go undercover" to buy access to both commercial and residential-botnet VPNs; cycle through them to find out what all the available exit-node IP addresses are from a client's perspective — and then mark all these as compromised as well.


As far as I can tell, the author is French and lives in France. I've heard plenty of stereotypes about the French, but them not caring about debts of any kind doesn't come to mind.

Then again, I seriously doubt Netlify's cost actually reflected the damage incurred. Cloud providers inflate their bills massively, and if they did incur a serious enough loss, they'd pursue the matter in court; a couple thousand euros lost is worth getting your legal team involved for. Deleting your account doesn't clear your debt, nor does it make you untraceable.

Most likely, Netlify noticed the large bill associated with a deleted account, concluded that the resources spent didn't incur them enough loss to care, and waived the fee. Companies like Amazon will sometimes waive huge bills due to bugs if you ask nicely anyway.

I don't think there's an international fraud registry that works for this kind of abuse. The best you can do is verify the identity of your customers and let the banks and/or legal system handle frauds.

Outside of France there are places that register debt to your name, making it hard to get loans or mortgages or even things like phone contracts exceeding a certain monthly fee. There's also the American credit score system, of course, which will bite offenders in other ways down the line.

With the popularity of services like privacy.com, where you can create virtual credit cards that will just disappear when you don't want to pay your bills anymore, I think this type of abuse has been calculated into the pricing structure.


Cards from privacy.com are easy to ban because they always come from the same banks, and their card numbers always start with the same eight digit sequence. Just ban anything from that starting eight digit sequence, and you're done.
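In code it's about as simple as it sounds; a sketch with made-up prefixes (these are not privacy.com's actual BINs):

  # Made-up prefixes for illustration; NOT privacy.com's real BINs.
  BLOCKED_BIN_PREFIXES = {"12345678", "23456789"}

  def card_is_blocked(card_number):
      digits = "".join(ch for ch in card_number if ch.isdigit())
      return any(digits.startswith(p) for p in BLOCKED_BIN_PREFIXES)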

Sadly, that also hurts all of the legitimate customers from privacy.com.


> As far as I can tell, the author is French and lives in France. I've heard plenty of stereotypes about the French, but them not caring about debts of any kind doesn't come to mind.

No, I wasn't implying anything about France; but then, the author wasn't doing the thing that most "people causing problems for hosting providers" do, which goes more like so:

1. register with a stolen credit card that validates at the time, but won't accept payment when the provider goes to collect for the month;

2. rack up billable usage doing some kind of scam; and then

3. when the account gets closed for non-payment, immediately register again, from a new (VPNed) IP, using a new (stolen) identity, with a new (stolen) card.

4. Optionally: do this "in bulk" with multiple accounts at once, perhaps even with scripted automatic bulk account registrations, account "aging" to avoid registration-recency being used as a fraud-score calculation, etc. (You're more likely to see this type of attacker on API services where the service has some kind of per-customer rate-limiting and the attacker doesn't appreciate being rate-limited — they just configure their client software to round-robin their workload across many accounts.)

If you were raised to see this as "using up someone else's resources and depriving others of those resources", then this probably sounds unethical to you, and you will avoid doing it even if it's "easy" to do. But if you weren't, then this probably just looks like an "infinite money glitch" in real life.

If you want me to be concrete about the part of the world where these fraudulent users come from: it's CIS countries. It's hard to tell which people are responsible any more specifically than that — the various CIS countries crop up pretty evenly in attack logs. This is likely because there are many VPN services run in each of these countries, that specifically serve the "other CIS countries" market, and even more specifically serve the "your country is blacklisted from service X? we got you, bro" market.

(I have been witness to posts on scammer forums over the last year or two that specifically said something to the effect of "full identity kits [IP VPN, identity and matching credit card] for sale! Russian kits on discount because they're unlikely to be accepted pretty much anywhere useful. Ukrainian kits marked up with a premium right now, because the west is a big fan of them at the moment, and so is more hesitant to ban them / write rules against them.")

> With the popularity of services like privacy.com where you can create virtual credit cards that will just disappear when you don't want to pay your bills anymore, I think this type of abuse had been calculated into the pricing structure.

There's a simple switch on pretty much every payment processor that, when enabled, rejects cards known to be prepaid/gift cards, only accepting cards that can actually carry a negative balance. Any post-paid usage-based-billing subscription service would have this switch enabled.

A paranoid provider like Hetzner, in addition, probably blocks the Privacy.com partnering card issuer's BIN numbers from being accepted at subscription time. I know our service sure does. (We block the BINs for Venmo and CashApp "cards" too.)
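On Stripe, for instance, that switch roughly boils down to checking the card's funding type before attaching it to a subscription. A sketch (key and IDs are placeholders):

  import stripe  # official stripe-python library

  stripe.api_key = "sk_test_..."  # placeholder

  def acceptable_for_postpaid_billing(payment_method_id):
      # Stripe reports a card's funding type as "credit", "debit",
      # "prepaid", or "unknown"; reject prepaid/gift-style cards.
      pm = stripe.PaymentMethod.retrieve(payment_method_id)
      return pm.type == "card" and pm.card.funding not in ("prepaid", "unknown")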


I've been their customer for many years; I remember signing up online with no red flags in the process. Some years later I signed up an account for a client as well, again with no problems.


Based on the pricing you see at VPS providers, bare metal providers, and colocation, I have long called BS on large cloud provider bandwidth pricing. It's completely insane, like more than 10X higher than VPSes and more than 100-1000X higher than wholesale (bare metal and colo).

Take this for example: https://www.fdcservers.com/

That company is profitable, which means they have to be paying less than this for bandwidth or at least selling bandwidth at cost and paying less for other things. I'm sure they are overselling but even then this model couldn't work if bandwidth were anywhere even close to what cloud charges for it. (We run a few things there for high bandwidth simple services and it works great and seems to be as advertised. There are many other companies with insanely low pricing on bandwidth, but they're one of the cheapest.)

The key to understanding why I really think AWS/GCP/etc. price bandwidth the way they do is that ingress is free and egress is absurdly expensive. I call this "roach motel pricing" after the old roach motel ad: "roaches go in, but they don't come out." The obvious objective is to make data flow effortlessly in, but make it costly to either relay data for edge or endpoint-first computation or move data out. The whole pricing structure is designed to suck the entire computing ecosystem into large data centers and host it internally, which is their business model. It works very well.

The instant I saw this pricing structure, back when cloud was first being introduced, I understood what was happening.


Searching for terms like "wholesale bandwidth transit" might be a better reference for comparison of actual cost. The providers you mention depend somewhat on customers who statistically underutilize what they buy, and on obscure TOS clauses to chase away heavy users.


That's the list price. You don't pay the list price if you purchase at high volume. Bare metal and VPS providers are surely very high volume customers.

This is perhaps an argument against pure DIY colo, but most small scale colo isn't connected "naked" to the Internet but gets racked in a data center that offers a bandwidth package and does peering for you. The only people who peer directly tend to be clouds, bare metal hosts, CDNs, and ISPs. Even if you have your own IP space and an ASN you are usually behind someone who is bulk purchasing transit and handling peering.

To be fair very large customers can negotiate costs down at cloud providers too, but you have to be spending tens of thousands a month to even have a chance at getting their attention for any kind of special deal.


The company I used to work for had an AWS bill in the 7 figures, and sometimes 8 (a very time-based offering).

AFAICT we had a blanket 20% off the bill, nothing special about egress.


It's probably the closest proxy for which you can get meaningful numbers.


One complicating factor of egress is that it's not all created equal. Some providers have really good peering agreements. Others (even some big players...) have bad ones. Some cheaper-than-AWS options aren't really selling the same thing.

It's really hard to evaluate that before you hit issues in production, too.


> Take this for example: https://www.fdcservers.com/

You took "as example" literally the cheapest provider that exists in this space. (and lowest network quality/connections)


Okay, then here's more:

Hetzner, Datapacket, Hivelocity, Reliablesite, and OVH.

I did pick them as an extreme example, but even the more expensive bare metal and colo providers are like 100X cheaper than cloud. VPS providers like Digital Ocean and Vultr are ~10X cheaper.

Cloud sells egress bandwidth at an enormous markup. Meanwhile ingress is free. If this were a technical limitation ingress would be billed too because there's nothing asymmetric about the Internet. The asymmetry you usually see on home connections is mostly a DOCSIS (cable modem) limitation or a way of segmenting personal vs. business accounts.

BTW we run a few things at FDCServers (among others) and monitor them and they seem fine. We do use multiple providers in case one of them has a huge issue.

Edit: these providers can't be overselling too much, because the fact that they have such cheap bandwidth makes them a magnet for bandwidth-heavy workloads: game servers, seed boxes, DIY CDNs, TURN servers, etc.


> Cloud sells egress bandwidth at an enormous markup. Meanwhile ingress is free. If this were a technical limitation ingress would be billed too because there's nothing asymmetric about the Internet.

Symmetry is the entire problem.

1) To maintain settlement-free peering, you need traffic symmetry. Amazon, on their own, transmits far more data than they receive, so they benefit far more from additional ingress traffic than from additional egress. Hence the economic signal that ingress is free.

2) If you have unbalanced network traffic, your scaling factor becomes whichever transmit direction is more heavily utilized. A 100 Gbps circuit can generally send and receive 100 Gbps simultaneously, but it can't be used to send 150 Gbps and receive 50 Gbps. So if you have a bunch of people all wanting to transmit far more than they receive, you have to upgrade your circuits and routers earlier than you otherwise would for a given aggregate amount of traffic. That is expensive, while adding more ingress traffic is free.


Assuming that's all true, it still doesn't explain the magnitude of the markup. VPS providers can do $0.01/gig outbound almost across the board. This includes some very large ones, and because bandwidth is cheap there people tend to host a lot of high bandwidth things there.

GCP, AWS, etc. are roughly 10X more expensive than this. Are they 10X better?

The cost is very high. I can't stress this enough. A terabyte outbound at AWS is roughly $80. For around $300/month I can get 1 Gbps unmetered at a highly regarded bare metal provider and transmit well over 200 TiB in a month. I know because I've done it continuously for years. It works fine. That means AWS is marking up bandwidth by 50X at a minimum.
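Back-of-the-envelope with those rough numbers:

  1 Gbit/s ≈ 0.125 GB/s ≈ 324 TB/month if fully saturated
  200 TiB ≈ 220,000 GB; at ~$0.08/GB that's ~$17,600/month at AWS
  $17,600 / $300 ≈ 59x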

(The above are rough numbers, but they capture the magnitude of the difference.)

If what you say is true and the markup is reasonable then all these companies should be out of business. Either that or they're overselling bandwidth by a giant margin, but that doesn't check out because they specifically advertise as places to put your extremely high bandwidth work loads. If your business model depends on people not doing X, don't advertise as a place for people to do X.


Note: at no point in my comment did I say I thought the specific dollar amount of the bandwidth markup was reasonable or unreasonable. I pointed out reasons why differential bandwidth pricing for ingress/egress exists at the major cloud providers, when you claimed there were no technical reasons.

Also note: VPS providers generally do not exist at the scale where they could or would pursue settlement-free peering. They buy transit from one (or, hopefully, more) upstream providers who have to worry about the first point I made. The VPS providers still have to worry about point #2, but they can do that by courting different kinds of customers, without the baggage (in the context of running a network) of a big legacy content distribution business that both Amazon and Google have.


A word of warning when it comes to Vultr. While they are listed as a member of the "Bandwidth Alliance", they don't discount egress automatically. You need to request the discount from support.

See: https://www.answeroverflow.com/m/1118148928915918990


Beware of cloud providers that don't "charge for egress" but do charge on a metric that is effectively a proxy for egress.

For example, a cloud streaming service that charges for video delivery per minute streamed is just charging for egress with extra steps.
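With made-up numbers, the unit conversion makes this obvious:

  5 Mbit/s stream ≈ 0.625 MB/s ≈ 37.5 MB/minute ≈ 0.0375 GB/min
  a hypothetical $0.002/minute fee ÷ 0.0375 GB/min ≈ $0.053/GB

That per-minute fee is an egress price, just denominated in minutes.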


Eh? If you're delivering video at scale there's no way anybody is going to do that for free?

Whereas with AWS they charge you for traffic between AWS regions last time I checked?


They're not asking for free service, they're just noting that a claim of "free egress" might be a lie.


Any CDN that doesn't charge for egress (like Cloudflare) will deliver your videos without charges.


Delivering video will instantly promote you to their business plan where you are charged


The CTO of Cloudflare denied that.


AFAIK it's allowed with R2.


> For example, a cloud streaming service that charges for video delivery by per minute streamed is just charging for egress with extra steps.

It's even worse than paying for egress because the cloud provider gets to capture any gains in compression technology, right?


You see egress with extra steps, I see billing for work relevant to my business and my customers


Cloudflare R2 (their S3 clone) has free egress.

We are in the middle of migrating about 5TB of fairly active data from Google Cloud Storage to R2. For years now, most of our cloud bill has been egress from GCS. Activity spiked recently and doing something about it became urgent.

R2 has an S3-compatible API (we're using the Amazon-provided S3 Java SDK). Storage is 25% cheaper than GCS standard. Making our code multi-provider was pretty easy. I'm actually looking forward to next month's hosting bill.
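The same idea in Python/boto3 (account ID and credentials are placeholders) is just pointing the client at R2's S3-compatible endpoint:

  import boto3

  r2 = boto3.client(
      "s3",
      endpoint_url="https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
      aws_access_key_id="<R2_ACCESS_KEY_ID>",
      aws_secret_access_key="<R2_SECRET_ACCESS_KEY>",
      region_name="auto",  # R2 uses "auto" as its region
  )

  r2.upload_file("local.dat", "my-bucket", "remote.dat")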

I feel like I'm slowly getting pushed off of GCP due to pricing. Last year we moved some stateless image processing infrastructure to AWS Elastic Beanstalk because RAM is much, much cheaper there.


You kinda pay in reduced performance. For example, Civitai uses R2 for delivering the images, and it frequently doesn't load at all for me, or loads very slowly.

When your users have to use a VPN to move their traffic to another Cloudflare data center so the site actually loads, something is wrong...


Heh, this entire thread is "Fast, cheap, reliable... pick two," where everyone wonders why all three are so much more expensive.


Not really. There is a cost to being fast and reliable, but then the biggest cloud providers slap an additional >10x increase on top because they can get people to pay it.

You can also buy high quality connections at datacenters and plug them into your own servers.


General question: can we just call object storage "object storage"? If there's going to be yet another provider to remember after S3, B2, and now R2, on top of the various aliases for products like virtual servers, shared hosting, VLANs, and other things people present as new products with the most random abbreviations, I'm really going to lose track.


we need something like this, but across all cloud providers: https://www.azureperiodictable.com/


It's a bit worrying though that one day they might decide to not leave the money on the table.


Oracle Cloud offers 10TB of free bandwidth per month and is part of the Bandwidth Alliance with Cloudflare, I think.

Edgecast (Edgio) is free as far as I can tell, except you pay for support if you want it and some extra features.

Datapacket has unmetered bandwidth for example.

There are plenty of solutions, but they aren’t very well known because of the dominance AWS and others have.

Not to mention, it might be cheaper to just use direct connect with a cloud provider or just grab an ASN and host stuff in a local datacenter.


One thing to keep in mind - if you accidentally run up a bill because of a mistake, there’s a good chance you can reach out to support and they will credit your account. $5k may be a lot to an individual, but the cloud provider’s costs for that service are significantly less and they shouldn’t mind forgiving the charge. Case in point, when the author deleted his account and switched providers, the old host didn’t chase him for the charges.


That’s still a lot of hopes and prayers you do not end in financial ruin.

Instead, the cloud providers could offer prepaid credits/billing maximums to let individuals sleep at night. I would love to run my side project off of AWS to gain the experience, but no way I feel comfortable with a potentially unlimited liability because I configured something poorly. My joke project can take the uptime hit if it means I know that I will never spend more than $X to host it.


GCP used to offer a simple type-a-number-here daily billing limit. For a small business or personal project you could even safely put it at something super low like $2/day (or much less), since their "Always Free" tier covered a decent amount of usage.

You can still do it, but they got rid of the special-cased Console setting and instead provide a migration snippet to programmatically control it using Cloud Run or another of their automation features.

Ah, here's the page, I guess:

https://cloud.google.com/billing/docs/how-to/budgets

Configuring a cap:

https://cloud.google.com/billing/docs/how-to/notify#cap_disa...
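From memory, the command-line flavor of creating a budget looks something like this (flags approximate; check `gcloud billing budgets create --help`). The budget itself only alerts; the "cap" doc above wires that alert, via Pub/Sub, to something that disables billing:

  gcloud billing budgets create \
    --billing-account=0X0X0X-0X0X0X-0X0X0X \
    --display-name="side-project-cap" \
    --budget-amount=25USD \
    --threshold-rule=percent=0.9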


At least AWS sort of provides tooling to approximate this. I have some alerts set up in CloudWatch for anomalous Network Out, estimated charges exceeding a threshold, and a raw Network Out limit that, if exceeded, shuts instances down.
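The estimated-charges alarm is a few lines of boto3 (the SNS topic ARN is a placeholder; note that billing metrics only exist in us-east-1 and have to be enabled in the billing preferences first):

  import boto3

  cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

  cloudwatch.put_metric_alarm(
      AlarmName="monthly-bill-over-50-usd",
      Namespace="AWS/Billing",
      MetricName="EstimatedCharges",
      Dimensions=[{"Name": "Currency", "Value": "USD"}],
      Statistic="Maximum",
      Period=21600,  # the billing metric only updates a few times a day
      EvaluationPeriods=1,
      Threshold=50.0,
      ComparisonOperator="GreaterThanThreshold",
      AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],
  )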

Over time, bandwidth costs have gone down at AWS. What's finally causing me to think about jumping ship (surprised OP didn't mention OVH's unmetered stuff) is their upcoming per-month pricing for IPv4 addresses. The alternative is proxying through Cloudflare, which I don't particularly want to do.


It's still absurd having to hand-craft that type of protection against racking up enormous bills, especially with cloud providers now actively promoting their various "free" tiers to hobbyists.

It reminds me of EU roaming before the regulatory cap, when a single video call close to the border could cost you thousands of Euros.


The thing is that GCP used to have this, but got rid of it a couple years ago. So is there really demand for it among most of their customers?


Google killing a product cannot be used to assess popularity. Especially if the product’s existence would potentially hurt revenue.


It can certainly raise questions about popularity, however, as long as you also adequately consider their tendency for rent-seeking, monopolistic behavior, etc.

In this specific case it's more that I'm having a hard time picturing serious "cloud"-based web companies that decide "Oh yeah sure, we'll just shut our site down if we get too many visitors." That would mean this feature is useful primarily for hobbyists, who are more likely to be put off by uncapped costs (and react to them the way they do to roaming phone charges) and who aren't a reliable source of revenue anyway, which weakens the revenue argument for removing the feature.


Yeah, again "sorta" "approximates".

AWS is famous for making those alerts hard to set up and error-prone. Honestly, I don't understand why individuals (as opposed to large companies) use them at all.


I agree with you. I understand why a "shut down all services when this amount of money is spent" button shouldn't be enabled by default, but it should at least be an option.

This is a good reason to avoid Amazon and friends. They're set up to let you fail, and that makes them terrible companies to host your private products with.


That's been true during recent flush years, where money was cheap and nobody really cared, but there are signs all over the place that the attitude is changing. Everything that used to be free is being monetized and everything that was monetized is seeing increased prices.

We should expect that these sort of implicit affordances are being trimmed as well, or at least that they're likely to be soon.


Yeah, I rather suspect they'll be much less forgiving of this sort of thing in the future.

All that takes is a lean quarter and some MBA to coin a term for rookie mistake-windfalls, and they will come to feel entitled to revenue from it.


Bob, I see the collateral spend of your clients is down this quarter. Can we see that bump up a bit? Mmmmkay that would be great. Thanks. Good talk.


Cloud bandwidth is cheap if you commit to spending a certain amount per month.

Our bandwidth is less than $0.001/GB with like a $1500 monthly commitment in AWS. Fastly was even cheaper (which is why we're using Fastly).

Call your AM and talk. It's not hard, and it'll save you a ton of money.


We had a commit that was like three orders of magnitude more, and our price was 1-2c for CDN and 2c for tier-1 cloud (but only "partner" networks; the rest was list). That was after us threatening to leave (and having the capability to do so). So either you are mistaken or your AM really, really likes you.


No, it's that your people suck at negotiating.

Again, we got quotes from Fastly, Akamai, and AWS. Our commit is/was $1500/month (we're on Fastly now). Fastly gave us like 6 months free to switch, both bandwidth and requests.


So you mean to tell me you negotiated roughly 99% off their list price with a $1.5k commit?


Our CEO did, yes. Again, just ask for a quote, and get quotes from other providers. It's literally just an email (or phone call).


You mean, cloud bandwidth is cheap if you don't use much bandwidth. That's still an order of magnitude or two worse than the backbone cost.


Is that something generally available to customers? I've done some searches and can't find anything other than savings plans for EC2 etc.


On AWS you have to call and ask them. We got quotes from AWS, Fastly, and Akamai. All of them are substantially cheaper than the public rates.


I'm surprised this is about egress without mentioning the new Sippy: https://developers.cloudflare.com/r2/data-migration/sippy/

It allows you to incrementally migrate off of providers like S3 and onto the egress-free Cloudflare R2. Very clever idea.


I think egress would benefit from some regulation. This would enable other companies to compete with specific AWS services, without competing with AWS as a whole. Though I probably wouldn't price-cap egress itself, but rather force Amazon to offer cheap peering.


Agree 100%. I've always found it surprising there isn't a requirement for symmetric bandwidth charges. I think it's fine if AWS wants to have expensive bandwidth, but having free ingress + exorbitant egress seems like a transparent attempt at preventing customers from partial migrations to competitors.


Amazon & Co. will always rule for the exact same reason Microsoft Office has and will: the combination-of-services effect creates a natural moat. There's nothing you can do about that other than break up the large entities; it forms by default unless you do. It requires a comically enormous burden placed on the entity to break it, because of how fantastically strong the combination effect is for customers; cheap peering won't touch it.

So long as they don't put too many barriers up (raising the cost through unnecessary regulation to cut out the competition, which would benefit AWS), we'll always have competitive alternatives, including running your own infrastructure inexpensively with companies like Hetzner or Cloudflare (depending on what you need).


> So long as they don't put too many barriers up (raising the cost through unnecessary regulation to cut out the competition, which would benefit AWS), we'll always have competitive alternatives, including running your own infrastructure inexpensively with companies like Hetzner or Cloudflare (depending on what you need).

I mean, if the article mentions small businesses in particular, their needs are probably met by software running on a few VPSes. As long as they don't want to leverage any advanced features that the PaaS/SaaS providers would give them, most of the affordable providers should be enough: Hetzner, Contabo, Time4VPS, or even Scaleway, DigitalOcean, Vultr and so on.

Probably pick a company that has been around for >5 or >10 years instead of something on LowEndBox that might disappear a year or two down the line, but other than that it's likely to be fine (as long as you have backups figured out and have either some redundancy/failover, or just are okay with downtime).


I'm kind of amazed that AWS is the cheapest on this list, as they're typically the first example people give when they talk about overcharging for bandwidth.

When I worked at Malwarebytes between 2008 and 2014 we measured our bandwidth in petabytes, but it was still only about $25k/month. This was obviously a custom contract, but it really is interesting how bandwidth pricing has remained relatively static over the last decade.


Yeah I think the list is only highlighting the worst offenders. It would be more useful to also include the cheap VPS providers mentioned later, that would be:

- Scaleway: $0

- Hetzner: ~$490 _(article says bandwidth is free but I think above 20TB it's charged at 1 Euro/TB)_

- DigitalOcean/Vultr: ~$5k depending on compute usage _(you get a free bandwidth allowance based on compute)_


How far does the FUP go on that $0 offer, though?

It doesn't say on the pricing page, the VPS info page, or the various PDF documents with terms and conditions. Maybe you need to sign up and get to the ordering page before you can see the conditions applying to your VPS, or maybe I overlooked it, but this is where I stopped looking.


Great question. "Free Bandwidth" is almost worse than cheap bandwidth because in reality it means: "Secret bandwidth pricing/limiting"


Just to be clear, was Malwarebytes using AWS?


We used a combination of CDNs and balanced according to region (some were better in Australia than others, for instance).


> There also is Hetzner and their dedicated servers. You pay for a server with a 1Gb/s connection and all egress is free.

There's also OVH, and they have things like 1 Gbit/s guaranteed, up to 5 Gbit/s (not guaranteed), EPYC 7313 servers with 25, 50, or 100 Gbit/s guaranteed private bandwidth (from OVH servers to OVH servers) for... 200 EUR / month.

Re-using the example from TFA, you could max the bandwidth of such a server 24/7, be "downgraded" from 5 Gbit/s to 1 Gbit/s guaranteed, and yet it'd cost you... 200 EUR/month.

Versus hundreds of thousands in egress bills in the cloud.

200 EUR / month vs hundreds of thousands.

But yay. Go cloud!


Combine the two. Set up a large nginx reverse proxy with cache on the 200 EUR/month machine in Hetzner or OVH and serve 99.9% of the traffic from this location (js, css, images, videos, etc.). Use the cloud for convenience in managing the complex parts in the backend.
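A minimal sketch of such a cache (hostnames are placeholders; TLS config omitted):

  proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static:100m
                   max_size=500g inactive=30d use_temp_path=off;

  server {
      listen 80;
      server_name cdn.example.com;

      location / {
          proxy_pass https://origin.example.com;  # the cloud backend
          proxy_cache static;
          proxy_cache_valid 200 301 302 7d;
          proxy_cache_use_stale error timeout updating;
          add_header X-Cache-Status $upstream_cache_status;
      }
  }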


Why not just go Cloudflare then? On our sites, most of what Cloudflare caches is js, css, images, etc...


Summary: use a provider that doesn't charge for egress, and optimize your egress usage.


Hetzner comes to mind…


Hetzner's peerings aren't the best, especially outside EU/US. Plus they recently intercepted traffic to one of their servers, an attack likely perpetrated by the German state.


They have regions outside of Germany too. And IIRC, the people being wiretapped were doing some illegal stuff.


Also it was a court order. All legit providers have to comply with pen register / trap and trace warrants.


Exactly, they were communicating in the Russian language.


I mean, I have no idea what they were doing but they literally said (in a deleted HN comment) that what they were doing was illegal.


There are a couple things this article glosses over. With Vercel vs. Netlify, be mindful of the many other fees they charge besides bandwidth. I wouldn’t be surprised if the costs are comparable when that’s accounted for. Sometimes the answer with these things is just to be diligent. Don’t be surprised at what you get if you stick your hand in a meat grinder. Put up guards in your CI process if you have to.

The other thing is that I’m not sure how willing I am to jump to endorse an efficient cache policy as a blanket prescription like they seem to. I suppose I’m not opposed to the advice, but it has very strong “draw the owl” vibes. Project requirements vary, and I’d even argue that sometimes minimal or no caching is wise when requirements are subject to change or you’re doing a lot of static deployments because the savings can sometimes not be worth the headache. End-users are very demanding nowadays. Sometimes “it’ll roll over soon” isn’t good enough. Caching is tricky. Make sure you really get it right in addition to doing the math on the cost-savings if you go down that road.


Serious Q for big cloud spenders: how much have you negotiated egress down from the list price?


Not an extreme cloud spender, but we have basically free egress on Azure. Well, "waived" for up to 15% of the total monthly bill, so we can't go too crazy.

It’s not secret information: https://azure.microsoft.com/en-us/blog/azure-egress-fee-waiv...


If you can sign a yearly bandwidth commit (not sure what the minimum bandwidth requirement is, but 1PB / year may be in the ballpark), you will get prices that are extremely competitive (maybe 90%+ off base list pricing?).


You can get it down to ~1c/GB at best; then if you want cheaper, you've got to build your own.


In my experience pricing can go far lower than that. The sibling commenter saying $0.001/GB on an 18PB /year commit seems reasonable to me.


You said 90% off, which I've seen personally and which amounts to roughly 1c (most of the list starts at like 8-10c in the US). The sibling says they somehow got 90% off that original 90%, which seems silly and definitely not something I would count on.


Did you see the sibling comment, posted 45 minutes before yours, saying they pay 0.1c/GB at AWS when committing to spending $1500/month?


Did you see my comment in a sibling thread? If this was common then you could earn $$$ via egress arbitrage…


What would your hypothetical egress arbitrage look like? Keep in mind we are specifically talking about cloudfront bandwidth - so being able to route to an upstream without that upstream also paying for bandwidth likely is not possible.


1. Run a proxy in your VPC and redirect all CF requests to someone else's VPC

2. ????

3. Profit! (assuming you charged them >2.1c/GB)


$1500/mo commit got our bandwidth down to < $.001/GB in AWS via CloudFront.


So the $23K DDoS bill at AWS from the article would be only $476. Not bad.


I'm not sure you can. At least at the scale we use AWS, it's a flat % off the top based on our spend.


IME you can, when you apply some creative negotiation coupled with multi-million-dollar spend. Usually to their peered ("partner") networks, though, not generic transit.


Do you have time, or do you have money? If you have time, then build some boxes and put them in a colo. If you have money then negotiate a cloud contract. Even if you have time, it might be better to start prototyping on a free tier on the cloud, and then maybe go colo, and then back to cloud.


Is there a way to protect against the "denial of wallet" attack?


Yes, there is a way to protect against that attack!

All you need to do is have triggers in place that will shut down or slow down your service when the costs are exceeding some amount.

(If you get a moment of viral growth, you can always disable the trigger.)

Unfortunately, AWS makes it extremely difficult to build such a trigger, and I'm not sure about other cloud providers.
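The usual DIY plumbing is a billing alarm publishing to SNS, which invokes a Lambda. A sketch of the bluntest possible Lambda, assuming EC2 instances are where your spend is:

  import boto3

  ec2 = boto3.client("ec2")

  def handler(event, context):
      # Blunt kill switch: stop every running instance when the
      # billing alarm's SNS notification invokes this function.
      resp = ec2.describe_instances(
          Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
      )
      ids = [
          inst["InstanceId"]
          for res in resp["Reservations"]
          for inst in res["Instances"]
      ]
      if ids:
          ec2.stop_instances(InstanceIds=ids)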


There’s always rate limiting. Cloudfront supports it, API gateway support it, and it’s super easy to set up.


It's the idiot tax.


I love the term "denial of wallet attack".


I’ve just never seen bandwidth be a significant part of the bill for any sass I’ve built/worked for. I’m super curious about this. What kind of sass businesses use huge amounts of bandwdith but don’t provide enough commercial value to the customers to make it viable? Can folks with first-hand experience please share here?


Bandwidth is rarely an issue if you're transferring mostly text, often an issue if you're transferring images (except for highly cacheable static images), and always an issue if you're transferring video or game assets. That's just how it works because the orders of magnitude differ so much.

An image resource can easily be 100x some text (or HTML or API response), and a video can easily be 1000x an image.

Another factor is cacheability. If you're a B2B SaaS business, you'll have few clients and can typically take much more advantage of caching. If you're B2C, that flips, and caching might not provide much benefit.

Anecdotally, on a medium sized ecommerce store, AWS egress for product images was a huge cost.


Thank you, the e-commerce example is definitely something I hadn’t thought about. Having high quality imagery helps make sales but comes at a cost. Looking around at some sites I buy from, I think I can squint and see this “out in plain sight”.


In fact, an interesting detail of the ecommerce image situation is that you can somewhat trade off bandwidth and compute. Creating resized images is expensive (we had ~2m unique images viewed per month), but if you can resize an image down and compress it well, you might save 80% of the bandwidth on it and deliver a better UX due to faster images. We did a bit of modelling to figure out how many sizes we wanted to deliver in order to get the best compute/bandwidth trade-off. This can be tricky for many companies, but we were in a weird position because we had a large catalog relative to order volume, due to how the business worked.

Video is an even better one. While we didn't have to deal with it in ecommerce, so many social applications are built around video. That doesn't necessarily mean YouTube/TikTok/etc, but also smaller things like BeReal, which I'd guess is built entirely on cloud infra, and even embedded videos in Mastodon posts. A similar thing could be said about podcasting/audio: not as bad as video, but there are specialist podcast hosts for a reason, and it's not that they do a great job, it's that they have cheap bandwidth.


Why does it need to be SaaS?

My problem was hosting too many photos on the landing pages. Literally like 60 photos on 4 different landing pages.

The page looked fantastic and people would email me that I have the best site on the internet. But if I ever went viral, I'd always go down. I have since changed so the pages only have like 20 pictures now.

I only now became profitable, and it has little to do with anything other than waiting for word of mouth to take over.

This might seem like a crappy business, but now it's a few thousand per year in passive profit. Who knows what the future holds. It's year 7 and the growth is exponential despite me ending all marketing.


100% - referencing SaaS is just me being narrow-minded. Your example hits close to home for me. I feel that the more creative people (like, but not limited to, technical folks) can try new things without economic penalties, the more we all benefit from the inevitable innovations. Compared to how things worked when I started, I love how inexpensive experiments are today, even the ones that don't "blow up". I totally see your point, too — there are still some weighty risks, and egress costs can be one of them.


TL;DR - author claims the hardest part of running a cloud PaaS is networking, thus the cloud providers charge you a ton for egress to outsource the optimization problem to you.

It's not clear to me this claim is true - they're in the business of solving exactly this optimization problem - and they have, they're just capturing maximum profit.

If it was really the hardest / most costly - then other providers wouldn't be able to offer it so much cheaper. It's clearly a solved problem.

This is just the knob the pricing departments found to be the one to dial to juice profits to the max.

The article seems like a puff piece to promote other providers.

But it could be useful to people shopping for cheap egress.


> author claims the hardest part of running a cloud PaaS is networking

This is obviously false to me as someone who has done data center stuff and devops before.

The hardest part is running complex services like databases, storage, and runtime environments like Kubernetes or serverless and doing a good enough job to deliver high uptime for a gigantic variety of customer workloads over which you have basically no control.

The other hardest problem is security, because you are by definition running a ton of untrusted code on shared infrastructure. (scream emoji!)

You can sort of tell software management is where the pain is because that's what keeps people on the cloud. That's the selling point. Without those managed services compute, storage, and bandwidth are ridiculously cheaper on VPS and bare metal providers.

Networking is comparatively easy. It's mostly a game of over-engineering. You design your internal data center fabric to handle way more data than you think you will need.

There was a time maybe a decade or two ago when networking at scale was a lot harder but today there are numerous off the shelf products designed for hyper-scale that you can just buy and deploy. This is part of why VPS and bare metal providers can sell bandwidth so cheap. You don't have to roll your own high performance fiber backplane fabric anymore.


??? This is obviously true. The bandwidth available in a geographic location is, practically speaking, limited. Roads are cheap to pave, but if there was congestion 24/7 and you paved all you could, then you would need to start rationing road usage. For a business the best way to ration is to charge more. Solving this problem requires a massive amount of engineering work, which would also justify the increased cost.


Bandwidth in a geographic location is certainly limited... there are only so many fiber strands available, and fiber transceivers only go up to a certain speed, +/- wave division; there are costs of transceivers and whatnot. But in most places where you'll find datacenters, the limit is rather high. It's also usually possible to run more fiber.

If you take a look at the Seattle Internet Exchange [1], which is pretty public about their operations, they've got a router with 2x 36-port 400GbE line cards: three member-facing 400GbE connections and 163 member-facing 100GbE connections. That's a huge amount of bandwidth available at the exchange, some (perhaps much) of which is then delivered by the member to fiber that goes elsewhere. And SIX isn't the only peering location in the Seattle metro.

SIX's traffic graph shows a peak of 2.8 Tbps; AMS-IX's graph shows a peak of 11.7 Tbps. AMS-IX doesn't offer 400GbE, though needing more than 100 Gbps in one location is pretty rare.

[1] https://www.seattleix.net/


There is a difference between being able to run more fiber, and being able to do it cheaply. The more you scale, the more you need everything you depend on to scale. For many places that calculation makes it not worth it.


There is so much dark fiber out there. Bothering to activate it is a different story.


Because of advancements in xWDM, it is not limited. By replacing networking equipment (0.000001% of total cost) you can 10x, 100x, or more the bandwidth on an existing cable.

The same (single mode) optical cable that was once capable of running 2 Mbps is now able to run 400 × 100 Gbit/s.

Now wireless is starting to repeat this evolution.

This is one of the reasons for the old joke: "Do you know how to amass a small fortune in Telecom?"

"Start with a big fortune"


Bandwidth pricing works the same as product pricing; you are falling for the classic engineer trap of thinking the sum of the parts somehow determines the price. Cloud bandwidth prices are many orders of magnitude more than what it "costs" to transfer.


And you're not physically limited by how many machines you can put in a data center?


How does Cloudflare offer it for free?


Um. It’s not. They have a free plan, but it’s subsidized by their paying customers.


And a lot of their "good" features are only available to paying customers. Image resizing, Argo (fancy... routing aware something or another to your server), more than a few cache rules, and a range of other custom stuff.

I think they do a really good job of it, personally. It's trivial to get your free account set up to play with and even host a pretty serious amount of "simple" traffic, and you can easily explore the features, see what's available for paid (they're quite transparent about the assorted costs which is nice), etc.

It feels a bit like an honest version of Microsoft and Adobe's "look the other way for casual piracy, because you're familiar with the product..." of the late 90s and early 2000s. "Hey, try it out for free, and if the free stuff works for you, great. But if you need more, this is what else we offer!"


I'm not saying Cloudflare is free

I'm saying they don't charge bandwidth


They are eating the costs, as a marketing ploy. It won’t be free forever. They’ve said as much in their press releases. Basically, they are trying to pressure AWS and friends to lower their egress fees. Once they’re lowered, they’ll start charging for it.


Source? I'm incredibly skeptical that one day they will suddenly make a 180 and start charging for bandwidth


I'd literally have to find the press releases from the day they announced R2, which kinda goes into depth on it a bit (and I can't find it). But logically, why wouldn't they do a 180? It has to be eating into their margins. Wholesale costs alone would estimate that it's eating 15-20% of their revenue.

Remember when Uber used to be cheaper than a taxi? It's the same issue, different market; there is so much being subsidized by other money that if they ever run into any financial issues, hire the wrong executive, or simply get market dominance ... they'll shift this all to the customer.


It's not going to happen and you're talking out your ass. Speculation is fine but let's not pretend like you have any source for your pet theories

By the way I say all of this as a Cloudflare Enterprise customer. I imagine you have a tiny fraction of the experience I do with Cloudflare and completely fail to understand the ramifications of what such a decision would entail on their business


Start searching dude: https://www.cloudflare.com/en-gb/press-releases/. You lost my respect in this convo so I give zero fucks, believe in whatever you want to believe. Fairies, Monopoly money, whatever.


You won this one. Your sources of "just trust me bro" and "start searching dude" are truly notable.


If you're going to claim a company said something so impactful, you should be prepared to back it up.


The author is just spitballing in their role as a Cloudflare shill. They don’t actually know jack shit about public cloud netops or pricing strategy.

The so-called “bandwidth alliance” they spruik so hard is a negotiating vehicle for Cloudflare, intended to improve their position when haggling over peering, to support profitability of their DDoS protection service.


Where’s the profitability in the DDoS service? Even free accounts have had massive DDoS attacks mitigated - for free. There is no pricing anywhere in their portfolio for DDoS mitigation or bandwidth. Cloudflare sells a lot of products but DDoS protection isn’t one of them.

It’s a hook to get users on the platform to potentially upsell later but if all you need is DDoS protection they provide it for free.


Enterprise DDoS protection is their goddamn flagship product. The Cloudflare website just says “call us”. That’s all. And it’s because E&G deals aren’t made through a website.

Read the 10-K before spouting off about what value creation a middleman provides, and how it captures that value as revenue. Consumer services are last on their list and given little space. Large customers are explicitly stated as Cloudflare’s sales focus. And the entire top of the list of their disclosed competitors is everyone who sells a WAF or network protection apparatus to that segment.

> they provide it for free

This is naive. As ever, if it’s free, that means you’re part of the product. For social media, as is well understood, that’s your eyeballs sold to advertisers. In the case of Cloudflare, it’s consumer traffic that they pass as a middleman, used to build peering leverage through sheer volume. The clearest confirmation of this, however, is still Cloudflare going to all the time and trouble to wrangle together an industry collective - the so-called “bandwidth alliance” - for the express purpose of driving down settlement-based peering fees.

They are not altruists. This is the architecture of their competitive advantage.


> Enterprise DDoS protection is their goddamn flagship product. The Cloudflare website just says “call us”. That’s all. And it’s because E&G deals aren’t made through a website.

What does this have to do with anything? Cloudflare understands their customers very well and if you need "Enterprise" functionality of course it's behind contact us - just like nearly any other "enterprise" tier with any solution.

> Read the 10-K before spouting off about what value creation a middleman provides, and how it captures that value as revenue.

The unwarranted hostility across your comment tells me you have personal issues with Cloudflare that are entering the discussion. That's fine, and you can use whatever you prefer/works for you.

> This is naive. As ever, if it’s free, that means you’re part of the product.

Again, unwarranted hostility. Cloudflare goes into depth explaining why they offer their free tier and relatively inexpensive plans until "Enterprise". In short, free is where they roll out changes in the architecture first to test them. This is fair. Additionally, because they get a lot of traffic with the free and inexpensive plans they are able to provide a better experience to all users across plans due to the significant amounts of traffic they are able to capture and analyze.

Comparing their approach with social media platforms grabbing eyeballs to attract advertising revenue is disingenuous at best.

> The clearest confirmation of this, however, is still Cloudflare going to all the time and trouble to wrangle together an industry collective - the so-called “bandwidth alliance” - for the express purpose of driving down settlement-based peering fees.

What is wrong with this? I've built networks, bought transit, done peering, etc. Cloudflare has excellent points with their "no egress charges" and bandwidth alliance. The obscene egress bandwidth markup from other providers is absurd. They clearly take advantage of people without the experience of peering, buying transit, etc who end up believing "that's just what bandwidth costs".

> They are not altruists.

I never said they were. Please show me a large and publicly traded company that is.


> The unwarranted hostility across your comment tells me you have personal issues with Cloudflare that are entering the discussion.

Mistaking bluntness for hostility is an error. Turning it into a vague ad hominem is a choice.

> Cloudflare goes into depth explaining why they offer their free tier

That seems credulous. Read financial statements, not puff pieces, to analyse and understand a company.


>The second, less known, reason is that between compute, storage and network, network is actually the hardest part to scale for cloud providers. Thus they need to find a way to incentive their customers to optimize their network usage by themselves. And what better motivation to optimize something than taking a huge chunk of your profits?

This doesn't really make sense as an explanation.

Cloud networking has two components:

1. The network connecting the datacentres to the internet (the datacentre edge infrastructure).

2. The network connecting the servers and virtual machines to that edge.

1 is a solved problem and the variable costs are going to be based on any transit connections used. What cloud providers charge for bandwidth here is out of proportion to that, and they don't bill for ingress anyway, so this makes no sense.

2 is a technically more complex area, except it's very clear that getting people to use this infrastructure less is not the motivation because cloud providers don't charge for intra-DC traffic. You can ferry data around between servers in the same AZ as much as you like and not pay anything. The same is true of transferring data to and from e.g. S3 and EC2.

In other words, the more technically complex and sophisticated network component is the one which isn't billed at all. Meanwhile, boring internet connectivity is billed at an exorbitant rate, but only in one direction.

It's really quite obvious at this point that the motivation is vendor lock-in. You are penalised if you take data out of a big cloud provider but not to put data in. This means as soon as you put anything in AWS you are suddenly motivated to do as much in AWS as possible.

This is fundamentally anti-competitive. If AWS starts offering some infrastructure service, it can offer free bandwidth to that service for its own customers, but some third party providing a comparable service can't, at least unless it also starts using AWS.

But that's not even the best of it. If you want even better proof that these charges are BS, just look at Snowball. This is a service where you can transfer data to S3 by having AWS mail you a hard drive. Amazingly, they have the gall to charge per GB for this in addition to the fixed fee you pay... but only for exporting data.

In other words, I can pay $150 and get loaned an SSD, fill it up with 14TB of data, and import that data to S3. If I pay $300 to get data out of S3, I also pay $0.03/GB. Internet transit is not even involved here!

It is pure anti-competitive lock-in. It's also really against the principles of the internet, in that the internet couldn't have developed as it did if we accepted different billing for communicating with different IPs (can you imagine "I don't want to send traffic to a foreign IP, I'll pay international rates!"). Yet cloud providers work on this principle. If you think about it it is effectively a net neutrality violation, just not on the part of residential ISPs - they charge you less to communicate with their own services than with others (zero-rating).


What you can do about it is go back to the early 2000s and rack up your own boxes in a colo, with some reasonable unmetered bandwidth allocations, and skip the whole "cloud" nonsense.

My personal box is a 6C/12T Xeon from a few years back, 128GB ECC RAM, and about 24TB usable disk plus 2TB NVMe, for running a lot of other VMs that I run actual services in (remote backup, my web presence, communications, game servers, the works). Then a 25Mbit/100Mbit burst connection at a local datacenter for a bit over $100/mo.

A 12vCPU/96GB RAM GCE N2 box with similar specs as my box is about $1800/mo. So... given that "a few months of cloud spend" buys my system, I genuinely don't see the case for "cloud" if you're using a lot of data, or a lot of RAM, or a lot of CPU. Yes, it's managed and convenient, but it's horridly expensive for what you get compared to "owning your own hardware."

I'd been spending quite a bit more a month on various cloud instances to do the same thing I do with my server.


If you want bandwidth, VPSes are quite cheap. 100Mbps averages out to a little over 32TB/mo, and there are tons of low-CPU, low-RAM systems with 32TB/mo of transfer for cheap - now under $10/mo: https://lowendbox.com/blog/hosteons-they-survived-dedipath-a...
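
The arithmetic behind that 32TB figure, for anyone who wants to check it:

  mbps = 100
  seconds_per_month = 30 * 24 * 3600
  tb = mbps / 8 * seconds_per_month / 1e6  # MB/s sustained for a month -> TB
  print(f"{tb:.1f} TB/mo")                 # ~32.4 TB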

People should rack their own RPi 5s or NUCs, at least - there are various providers for both - and consider moving up from there. The remaining folks who will accept & rack a mid-tower box for under $75 are saints.


Sure, I'll use a $5/mo DO Droplet (maybe $6/mo now for the new ones?) and their 1TB transfer pool if I just need raw bandwidth, but tbh, not a lot of workloads are pure bandwidth. And "capable cloud boxes for various purposes" start adding up. If I just wanted to shovel out a static website, sure - but I can do that a variety of other ways too.

Once you start running "real" things, CPU and RAM use start climbing - mostly RAM, in my experience. I run a Matrix homeserver - Synapse really struggles with <4GB of RAM - so it's one of a few things on a VM with 16GB. I've also got Minecraft and Valheim servers running for friends and family; 8-12GB is reasonable for both. You certainly can do it in less, and I've done it, but a lot of problems go away when you can just throw RAM at the problem.

Also, most of the cheap VPS boxes are badly IOPS-limited on disk, unless you start paying a good bit more for faster storage. I split my box between a 2TB NVMe mirror for VM root drives and "high write traffic" stuff (the gameserver drives and the like), and my large, slow, spinning-rust array for backup storage and "big stuff." Again, you can do it with cloud, but storage adds up in a hurry - 24TB of "slow" disk is a grand a month.

It just depends on what you need. I really like cloud boxes for light-use "toy" boxes, and if that's what you want, they're great. I just cannot wrap my head around spending a lot of money on "serious cloud compute" when I can build and rack a similarly capable box for 2-3 months' worth of the equivalent cloud spend. I understand the opex vs capex issues for a business, but, man, it sure seems stupid to me.

Then again, I'm the kind of person who's had a box racked up in a colo (or somewhere similar - random basements on a good connection are nice) for the majority of my adult life. So I'm pretty clearly biased. But I did do a few years of "hey, cloud, this is cool" - and eventually sat down, evaluated my needs and the costs, and went back to a colo'd box.


The point of this article was liberating bandwidth, and I was highlighting how even common answers like your Droplet are far from the value end of what's available. I showed an 18x better price point than what you would have gone to, on transfer alone (the 8GB of memory is also a colossal advantage vs the 0.5-1GB on a Droplet).

I think for a lot of people, a NUC with dual M.2 drives is a much more affordable and realistic entry point, one that would probably survive even your fairly demanding gameserving use cases. And it would cost less than half as much.

Overall I'm very interested in a future where we do have scale-out edge caches, where we really do only need modest resources to maintain wide-scale connectivity. Many personal services aren't at this scale or this architecturally complex, but being able to run efficient, scalable personal systems would be an interesting end to move towards. Matrix would be a good example of a service where good fan-out would be nice to have.


> What you can do about it is go back to the early 2000s and rack up your own boxes in a colo, with some reasonable unmetered bandwidth allocations, and skip the whole "cloud" nonsense.

Yep, we used to pay about $35/mo for an unmetered 1Gbit 1U slot at our local datacenter. What surprised me (but maybe it's standard these days?) is that you don't even need to install the server physically yourself. You just buy it, ship it to the datacenter, and wait for remote access.


The key nice thing here is "local datacenter" - it's nice to be able to drive over to your hardware when needed.

I've found some nice colo deals, but I'm a bit hesitant about being too far away and needing remote hands - though I should work out exactly how much I'm paying for power...


I think that's great for hosting your own stuff - just personally not something I'd want to do once I really need high availability (power outages, network outages). But I suspect it might actually be a great way for compute-bound startups to start cheap.


So...

Why do you need such high availability that a rare power outage, network outage, or equipment failure is such a big deal?

I ask having spent a lot of my college years (and some beyond) trying to run "enterprise grade uptime on a college student budget" - with the appropriate frustration. I've come around to a more casual, "As long as I can reasonably get it back up in a day or two, meh..." way of thinking about various services.

It's important to figure out what you need and what you want - a few friends and I once had a stack of rather compute-heavy boxes in a rack, and we ran most of them without a UPS. The provided UPS wasn't enough amps for what we were running, so we just put the core stuff on battery and left the crunchers subject to outages. Everything could restart easily enough on its own - annoying, but not a big deal.

I'd argue that most startups don't need multi-region, globally backed up infrastructure, and that they'd be a lot better off with less, cashflow-wise. But I guess that's not the "built to scale to 10B users" story VC funds expect to see these days, so... jump through the hoops provided?


Customer trust is one of the most important assets, it's hard to earn, and once lost it's incredibly hard to get back.

Hosting your own infra makes that risk much higher. Especially if you're providing a critical path service.

I'm not saying that you need some overkill multi-region globally backed up setup, it definitely varies and depends on the nature of your business. If what you're doing works great for you, keep doing it!


I am thinking about doing something similar. Out of curiosity, do you use a reverse proxy to protect against malicious attacks? That's probably the only part I would consider leaving in the cloud if I were to host my own servers at home.


What's different about a box hanging out in the cloud with open ports, vs at home or in a colo with open ports?

I use CloudFlare to front most of my web stuff for bandwidth diversion - my blog is a purely static Jekyll site, and I've told CF to "cache everything," so if I get hit with a spike of traffic, it isn't hitting my server (much - comments are still locally hosted with a Discourse forum integration, but if that's overloaded it doesn't block page render).
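
If you're doing the same, one quick sanity check on a "cache everything" rule is Cloudflare's CF-Cache-Status response header - the URL below is a placeholder for your own CF-fronted site:

  import urllib.request
  with urllib.request.urlopen("https://example.com/") as resp:
      # HIT = served from Cloudflare's cache; MISS/DYNAMIC = the request
      # went back to the origin server.
      print(resp.headers.get("CF-Cache-Status"))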

My home bandwidth simply isn't up to hosting. But if I can ever get someone to tap the fiber a quarter mile from my place and give me a drop, I'd move to hosting at home.


In the early 2000s, before Google even IPO'ed, they experimented with exactly this kind of colo with "unlimited" bandwidth allocations to run their search crawler. It turns out something labeled unlimited isn't truly unlimited.


Man, some of the early "Google before they had their own DC" stories are epic, if you can get an SRE talking at some point.


> My personal box ... for a bit over $100/mo.

Is that just $50/U?


It's $75 for a 1U slot (which somehow gives me 2U of space) with dual power, and the rest is bandwidth and an IP block (I've got a /29 so I can put different services on different IPs, because that's... just a thing I do - I'm fundamentally a "first decade of the 2000s Linux sysadmin" in how I think).
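
(For the curious, here's what a /29 actually gives you - using the documentation range as a stand-in:)

  import ipaddress
  net = ipaddress.ip_network("203.0.113.0/29")
  print(net.num_addresses)       # 8 addresses in the block
  print(len(list(net.hosts())))  # 6 usable once network/broadcast are excluded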


Heathen! Not everyone can afford that PhD in enterprise class hardware management you must have earned to pull this off. And besides, what are you going to do when you wake up tomorrow and you're the next FAANG? You'll never scale to 2B DAU on that box and we all need to be ready for that, just in case. I knew a guy who knew a guy who founded a FAANG, so it could happen to you too. You have to be ready!

/s


Oof. Thanks for the /s - I've quite literally heard that argument made in earnest for "cloud": that you have to design to scale, that physical hardware can't scale rapidly, etc.

I assume, if I suddenly had reason to scale, I could call my colo and get another slot pretty darn quick. They're local and quite responsive, though I do get the feeling their "public shared rack" offering was created to deal with me - mine was the first server in it...


But seriously, for scaling up, just have a plan for how to move things to the cloud quickly. Containerization and building everything on Kubernetes help.


Or don't!

Not everything "has to scale." My stuff isn't built to, and I see no reason to do so.

It's my personal server. I run some stuff for my convenience, a quiet backwater community forum, a Matrix chat server, etc. If any of that really starts to ramp up, I can always split it out into its own VM. If it exceeds what my current server can handle (exceedingly unlikely), I'll go rack up another box for it. And if I get a super sudden spike in traffic, something is probably quite weird. The only thing likely to see that is my personal blog, which is already cached in Cloudflare with a typical ~60% cache hit rate, going north of 90% during a traffic spike - being a static site, I've told CF to cache literally everything there.

I probably could move most of my stuff to "the cloud" in an evening's sysadmin, but... why? I already went through the process of moving it off the cloud to save myself an awful lot of money.


The easiest plan for scaling up is "don't buy the literal biggest server out there," because if you make any money at all while scaling up, you'll be able to do a hardware swap to a massive machine, which buys you tons of time to work out what to do next.

Here's Let's Encrypt doing just that: https://letsencrypt.org/2021/01/21/next-gen-database-servers...


Can anyone take a crack at why the hell you need a company and its computer farm 3000 miles away to merely sync your calendar - CalDAV or whatever it is - between two devices that are perfectly capable of talking to each other, or to a local-first mediating process or tool? I don't get what it is about syncing things, particularly the basics like Contacts and Calendar/Reminders, that requires you to send the data thousands of miles away and back when it's all in the same room.


Here's the answer: because it's not written for just 2 devices. It's written to sync however many devices you want -- 3 or 4 or 5 or whatever. Both your phones, your tablet, your laptop, and your watch.

And it has to sync when your devices might never actually be turned on at the same time. And you want it to sync when you're traveling, and you don't want to open your home firewall for security reasons.

So it requires a single, centralized, always-on server as the source of truth. And given the speed of light, whether that server is 3 miles or 3,000 miles away is largely irrelevant.
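
(Back-of-envelope on the speed-of-light point - light in fiber travels at roughly 200,000 km/s, about two-thirds of c:)

  km = 4800             # ~3,000 miles
  fiber_speed = 200_000 # km/s, rough figure for light in glass
  print(f"one-way: {km / fiber_speed * 1000:.0f} ms")  # ~24 ms
  # Round trip ~50 ms - imperceptible for a background calendar sync.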

And writing additional code to sync directly when devices are on the same LAN and turned on at the same time is largely redundant as a special case - you still have to sync with the server anyway, since you generally want the information backed up in case your devices are lost, stolen, or damaged. Using your examples, Contacts and Calendar are often some of the most important things in someone's life to keep synced remotely.
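
To make "central source of truth" concrete, here's a minimal last-write-wins sketch in Python - purely illustrative, not any real product's protocol - showing how devices that are never online at the same time still converge:

  server = {}  # item id -> (timestamp, value); the always-on source of truth

  def push(device_items):
      # Keep an update only if it's newer than what the server holds.
      for key, (ts, value) in device_items.items():
          if key not in server or server[key][0] < ts:
              server[key] = (ts, value)

  def pull():
      # A device replaces its local copy with the server's state.
      return dict(server)

  push({"dentist": (100, "3pm Tue")})       # phone syncs, then goes offline
  push({"dentist": (200, "moved to 4pm")})  # laptop syncs hours later
  print(pull()["dentist"])                  # (200, 'moved to 4pm')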

Does that answer your question?


I just feel like there's a middle ground, like Resilio Sync, where you share some of the responsibility for making sure things sync, but it can still technically do the deed for you. You can even set an automation to trigger a sync after you edit your calendar. I feel like middle grounds like this, or EteSync, should be more prevalent.

I wouldn't mind that, since it's local, and I prefer not to depend on external hosting services wherever I can have a more accountable, static solution a little closer to home.


But almost nobody wants that middle ground. That's a tiny, tiny, tiny minority.

I, and everyone I know, wants something that always works, all the time, no matter where they are, that they never have to think about.

Not something they "have to share in ensuring", not something they need to "set an automation" for. For almost everyone, external hosting services are a feature rather than a dependency, because homes are subject to fires and natural disasters. And nobody wants to deal with finding a friend across town to host the backup NAS you'd have to set up and periodically maintain.

There's just no market for it from any major consumer company.


Because it's simpler for them to develop if they make the server the source of truth - and then they can sell it to you.

And for most people, it works fine, but when you hit edge cases you're dead in the water.


Put simply: there's no incentive for most organizations to implement it any other way.


Hey everyone!

I'm David Aronchick - co-founder of Bacalhau[1] - and we're building an open source platform specifically to help with this. Basically, we let you do compute over data, making it much easier to schedule jobs (container, WASM, arbitrary binary) to run where the data is stored, even over irregular/unreliable networks. We've been able to show 90%+ savings on things like log aggregation. Happy to answer any questions!

David Aronchick aronchick@expanso.io

[1] https://github.com/bacalhau-project/bacalhau



