Having spent some time in outbound sales (after tech burnout), I'd say the most important aspect (as many comments note) is "relationships". The best training for that is to go out and make them. We had sales training every single day, so it's really not something where you can pick up a book or go to a weekend class and walk away effective. That said, books and classes are a good way to find your footing.
Never Eat Alone - Keith Ferrazzi (networking & relationship building)
Never Sit in the Lobby - Glenn Poulos (sales & relationships)
Getting to Yes - Roger Fisher (negotiation, particularly "principled negotiation")
The Joy of Selling - Steve Chandler
The Psychology of Selling - Brian Tracy
In one of our quarterly division trainings, our office manager gave us Dale Carnegie's How to Win Friends and Influence People and told us that if we learned nothing else, we should study that book.
It's been over a decade since my sales time, but the 2 sales techniques I haven't forgotten are "selling isn't telling" and "feel, felt, found". As you can imagine, they are about relating to people, not giving technical/spec speeches.
It's something you have to practice every day; make sales a part of your job title -- not simply something you do on top of running the company. It's an integrated layer no different from any other software maintenance task, except the maintenance is of the relationships with the people you want to sell to.
For any other tech types who may someday find they need sales skills, I highly recommend actual job experience in outbound sales (with a company that provides frequent sales training). It was a massive culture shock that gave me the professional people and relationship skills I had struggled with.
We should just build more CLI tools; that way an agentic AI can just run `yourtool --help` to learn how to use it. Instead of needing an MCP server to access e.g. Jira, it should just call a CLI tool `jira`. Better CLI tools for everything would help both AI and humans alike.
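As a minimal illustration (the `jira` command, its subcommand, and its flag are all made up here, not a real tool), a self-describing CLI in Go gets `-h`/`--help` output for free from the standard flag package:

    package main

    import (
        "flag"
        "fmt"
        "os"
    )

    // Hypothetical "jira" CLI: an agent (or a person) can discover usage by
    // running `jira -h` or `jira list -h`, no MCP server required.
    func main() {
        listCmd := flag.NewFlagSet("jira list", flag.ExitOnError)
        project := listCmd.String("project", "", "project key to list issues from")

        if len(os.Args) < 2 {
            fmt.Fprintln(os.Stderr, "usage: jira <list> [flags]")
            os.Exit(2)
        }

        switch os.Args[1] {
        case "list":
            listCmd.Parse(os.Args[2:]) // -h/--help prints the flag docs and exits
            fmt.Printf("would list issues for project %q\n", *project)
        default:
            fmt.Fprintln(os.Stderr, "usage: jira <list> [flags]")
            os.Exit(2)
        }
    }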
Kind of sad seeing businesses getting screwed by closed source proprietary software, then making the same choices all over again.
Nutanix also seeing huge demand.
Not everyone is repeating their mistakes, with Proxmox and XCP-ng seeing a huge new level of business as well, which is nice.
I'm part of the Apache CloudStack project and that too is seeing unparalleled levels of demand.
The KVM hypervisor has sort of become the de facto choice, thanks to the virt-v2v tool, which can help migrate VMware guests.
Always nice to see folks talking about VM snapshots - they're an extremely powerful tool for building systems of all kinds. At AWS, we use snapshots in Lambda SnapStart (along with cloning, and snapshots are distributed across multiple workers), in Aurora DSQL (where we clone and restore a snapshot of Postgres on every database connection), in AgentCore Runtime, and in a number of other places.
> But Firecracker comes with a few limitations, specifically around PCI passthrough and GPU virtualization, which prevented Firecracker from working with GPU Instances
Worth mentioning that Firecracker supports PCI passthrough as of 1.13.0. But that doesn't diminish the value of Cloud Hypervisor - it's really good to have multiple options in this space with different design goals (including QEMU, which has the most features).
> We use the sk_buff.mark field — a kernel-level metadata flag on packets - to tag health check traffic.
Clever!
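For anyone curious how traffic picks up a mark like that from userspace, here's a rough Go sketch (the mark value and address are placeholders, and this is my guess at the mechanism, not their actual code): the SO_MARK socket option sets skb->mark on packets from that socket, which kernel filters can then match on.

    package main

    import (
        "context"
        "net"
        "syscall"

        "golang.org/x/sys/unix"
    )

    // Dialer whose sockets carry a mark; iptables/nftables rules matching
    // "meta mark 0x1" can then treat the health-check traffic specially.
    // Setting SO_MARK requires CAP_NET_ADMIN.
    func markedDialer(mark int) *net.Dialer {
        return &net.Dialer{
            Control: func(network, address string, c syscall.RawConn) error {
                var serr error
                if err := c.Control(func(fd uintptr) {
                    serr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_MARK, mark)
                }); err != nil {
                    return err
                }
                return serr
            },
        }
    }

    func main() {
        d := markedDialer(0x1) // placeholder mark value
        conn, err := d.DialContext(context.Background(), "tcp", "10.0.0.5:8080")
        if err != nil {
            panic(err)
        }
        conn.Close()
    }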
> Light Sleep, which reduces cold starts to around 200ms for CPU workloads.
If you're restoring on the same box, I suspect 200ms is significantly above the best you can do (unless your images are huge). Do you know what you're spending those 200ms doing? Is it just creating the VMM process and setting up kvm? Device and networking setup? I assume you're mmapping the snapshot of memory and loading it on demand, but wouldn't expect anywhere near 200ms of page faults to handle a simple request.
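To make the mmap-on-demand idea concrete, here's a rough Go sketch (the file name and layout are invented; a real snapshot obviously also carries device and register state): mapping the memory file means the restore itself doesn't read the whole image, and only the pages actually touched get faulted in.

    package main

    import (
        "fmt"
        "os"

        "golang.org/x/sys/unix"
    )

    func main() {
        // Hypothetical guest-memory snapshot file.
        f, err := os.Open("memory.snap")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        fi, err := f.Stat()
        if err != nil {
            panic(err)
        }

        // MAP_PRIVATE + lazy faulting: nothing is read from disk until a page
        // is first touched, so the mapping step is nearly instant.
        mem, err := unix.Mmap(int(f.Fd()), 0, int(fi.Size()), unix.PROT_READ, unix.MAP_PRIVATE)
        if err != nil {
            panic(err)
        }
        defer unix.Munmap(mem)

        // Touching one byte faults in just that page (typically 4 KiB).
        fmt.Println("first byte:", mem[0])
    }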
Not sure if my "products" compare to yours, but I’ve seen some success with a few of them over the years, maybe there are some takeaways (or pitfalls to avoid) for you:
CloudCamping (PMS): 250+ Businesses, 2023
- Positioned as more modern, more accessible, and more affordable than the competition
- Limited competition due to the complexity of the product
- Personally visited campgrounds to demo the product
- Sent physical postcards (old school!) to campgrounds with product updates and announcements
- Due to limited competition, it now ranks very high in SEO for the German market
The Road to React & The Road to Next: 1000+ Users, 2024
- Gave away The Road to React for free in exchange for an email, grew the mailing list this way
- Benefited from early timing (luck!), it was the first book on the topic
- Initial version wasn’t polished, but I kept iterating and improving it each year
- In 2025, released the paid course The Road to Next to my audience, now over 1,000 students enrolled
SoundCloud (DJ/Producing as “Schlenker mit Turnbeutel”)
- Active from 2010–2015 as a hobby, grew to 10,000+ followers (a lot for the time)
- SoundCloud allowed 1,000 direct messages per track
- Carefully selected 1,000 high-engagement listeners in my music niche and personally messaged them to check out new tracks
So yeah: a mix of timing/luck, outreach that doesn't scale, and being better than the competition, I'd say.
That's funny, there isn't any mandate for using LLMs at my company. Everybody has just quietly added them to their workflows without being told.
LLMs are like chainsaws. In the right hands, they can help you do the same job faster. In the wrong hands, they can cut off a limb. If someone drops a thousand-line PR of total slop, it's not an AI failure, it's a human failure.
Personally, I use LLMs for fill-in-the-middle jobs and boosting test coverage. It's a modest benefit, but I am definitely shipping higher quality software in a shorter period of time compared to before.
They're not bad as starting points for learning new things, either. I've learned a number of new frameworks and libraries by using LLMs as a "better Google". Now I'm using it to learn Rust.
As long as you use your knowledge and experience to evaluate the output, LLMs can only help. I would posit that people committing garbage from LLMs will have no problem writing insecure and buggy code without assistance.
I caveat all this with the totally valid security and privacy concerns around sending code and data to 3rd parties that haven't been vetted rigorously. That said, local LLMs solve that problem handily and are good enough most of the time.
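For what it's worth, the local route really is low friction. A rough Go sketch against Ollama's local HTTP API (the model name and prompt are just examples; this assumes an `ollama serve` instance on the default port):

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    type generateRequest struct {
        Model  string `json:"model"`
        Prompt string `json:"prompt"`
        Stream bool   `json:"stream"`
    }

    type generateResponse struct {
        Response string `json:"response"`
    }

    func main() {
        body, _ := json.Marshal(generateRequest{
            Model:  "qwen2.5-coder:7b", // example model; any locally pulled model works
            Prompt: "Explain Rust's borrow checker in two sentences.",
            Stream: false,
        })

        // Code and data never leave the machine.
        resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        var out generateResponse
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            panic(err)
        }
        fmt.Println(out.Response)
    }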
The last time I used a leetcode-style interview was in 2012, and it resulted in a bad hire (who just happened to have trained on the questions we used). I've hired something like 150 developers so far; here's what I ended up with after a few years of trial and error:
1. Use recruiters and your network: wading through the sheer volume of applications was nasty even before COVID, and I don't even want to imagine what it's like now. A good recruiter or a recommendation can save a lot of time.
2. Do either no take-home test, or one that takes at most two hours. I do discuss the solution candidates came up with, so as long as they can demonstrate they know what they did there, I don't care too much how they did it. If I do this part, it's just to establish some baseline competency.
3. Put the candidate at ease - nervous people don't interview well, which is another problem with non-trivial tasks in technical interviews. I rarely do any live coding; if I do, it's pairing, and for management roles, to e.g. probe how they manage disagreement and such. Developers mostly shine when not under pressure, and I try to see that side of them.
4. Talk through past and current challenges, technical and otherwise. This is by far the most powerful part of the interview IMHO. Had a bad manager? Cool, what did you do about it? I'm not looking for them having resolved whatever issue we talk about, I'm trying to understand who they are and how they'd fit into the team.
I've been using this process for almost a decade now, and currently don't think I need to change anything about it with respect to LLMs.
I kinda wish it were more merit-based, but I haven't found a way to do that well yet. Maybe it's me, or maybe it's just not feasible. The work I tend to be involved in seems way too multifaceted for a single standard test to seriously predict how well a candidate will do on the job. My workaround is to rely on intuition for the most part.
- figure out if you know what those keywords are (what each means, why you would do that, why it's better than prior solutions, some real-world examples)
- after a few weeks of this, you'll have a list of companies and words that you're interested in
- go on LinkedIn (or HN) and look for people working at those companies and/or using those words
- ask them for a 15 minute chat to hear how they are approaching the problems you're interested in (not for a job, but to hear how they talk about it)
- use what you learned from the previous step to write some blog posts / articles / tutorials / tiny projects that let you see how much you know and then can later show people when you apply to those jobs
- reach out to the groups / companies you want to work with, say that you're interested in that area, and ask if you could have a chat about the work they are doing.
- remember that a) everyone is always hiring even if they don't have a job post, b) most job posts never make it public, c) shoot your shot
Once the connection is upgraded, you lose all the metadata included in the HTTP headers (because it's not HTTP anymore) and all protections relying on it.
Also CORS and SOP can be bypassed: https://dev.to/pssingh21/websockets-bypassing-sop-cors-5ajm
Of course you can reimplement everything by hand (and you must if you use WebSockets), but with SSE/Mercure you don't have to because it's plain old HTTP.
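To illustrate the "plain old HTTP" point, here's a minimal SSE endpoint in Go (the path and payloads are made up): every request still arrives with its normal headers, so cookies, Authorization, CORS middleware, and the rest of the HTTP machinery keep applying.

    package main

    import (
        "fmt"
        "net/http"
        "time"
    )

    func events(w http.ResponseWriter, r *http.Request) {
        // Existing auth/CORS middleware sees r.Header as usual; nothing is "upgraded away".
        w.Header().Set("Content-Type", "text/event-stream")
        w.Header().Set("Cache-Control", "no-cache")

        flusher, ok := w.(http.Flusher)
        if !ok {
            http.Error(w, "streaming unsupported", http.StatusInternalServerError)
            return
        }

        for i := 1; i <= 3; i++ {
            fmt.Fprintf(w, "data: update %d\n\n", i)
            flusher.Flush()
            time.Sleep(time.Second)
        }
    }

    func main() {
        http.HandleFunc("/events", events)
        http.ListenAndServe(":8080", nil)
    }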
In Ashburn, VA, I can buy Dark Fiber for $750 MRC to any datacenter in the same city. I can buy Dark Fiber for $3-5K MRC to any random building in the same city.
That Duplex Dark Fiber with DWDM can run 4 Tbps of capacity at 100GE (40x 100GE). Each 100GE transceiver costs $2-4K NRC depending on manufacturer - $160K NRC for 40x. (There are higher densities as well, like 200/400/800GE; 100GE is just getting cheap.)
In AWS, utilizing 1x100GE will cost you >$1MM MRC. For significantly less than that - let's say an absolutely worst-case $5K MRC + $200K NRC - you can get 40x100GE.
Now you have extra money for 4x redundancy, fancy routers, over-spec'd servers, world-class talent, and maybe a yacht if your heart desires.
This seems like quite a lot of setup and hassle for what could be handled some other way with less fuss, like chamber[0] or Doppler[1]. Heck, even the classic .env seems like a better choice in every way.
What are the advantages of a configuration like this? The HTTP interface, the non-encrypted cache, and the separate-agent setup don't seem secure enough to satisfy most companies these days.
Aside from all the valid points listed in the blog, I found out that the frontend engineers at my company save some queries in a central library and reuse them even when they don't need all the fields those queries return, just to save themselves the time spent writing queries. So they are basically using GraphQL as REST in the end, and now we have the worst of both worlds.
This basic monitoring primitive is the first thing we started with at Cronitor[1]. I was a software engineer at my day job and needed a way to be alerted when something doesn't happen.
We have a decent free plan that would probably work for you.
You can do not only classic heartbeat checks but also high-level API checks (single request and multi request) and browser checks to make sure your service is behaving as expected.
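For readers who haven't used this pattern, here's a rough Go sketch of the heartbeat idea (the ping URL and job command are placeholders, not any provider's real endpoint): the job pings a monitor when it finishes, and the service alerts when the ping fails to arrive on schedule.

    package main

    import (
        "log"
        "net/http"
        "os/exec"
    )

    // Placeholder heartbeat endpoint; a real service gives you a unique URL per monitor.
    const pingURL = "https://monitoring.example.com/ping/nightly-backup"

    func main() {
        // Run the actual job (placeholder command).
        if err := exec.Command("/usr/local/bin/backup.sh").Run(); err != nil {
            log.Fatalf("job failed, not sending heartbeat: %v", err)
        }

        // Tell the monitor the job completed; a missing ping triggers an alert.
        resp, err := http.Get(pingURL)
        if err != nil {
            log.Fatalf("heartbeat ping failed: %v", err)
        }
        resp.Body.Close()
        log.Println("heartbeat sent:", resp.Status)
    }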
I've been using it for my latest API - I was looking for a tool that allowed me to describe APIs similarly to GraphQL, in a design-first sorta way. All the OpenAPI editors just felt crazy clunky and made the data relationships within the API non-obvious. TypeSpec is a great tool, really helped me out here - it was exactly what I was looking for!
If anyone wants to eval this locally versus codellama, it's pretty easy with Ollama[0] and Promptfoo[1]:
prompts:
  - "Solve in Python: {{ask}}"
providers:
  - ollama:chat:codellama:7b
  - ollama:chat:codegemma:instruct
tests:
  - vars:
      ask: function to return the nth number in fibonacci sequence
  - vars:
      ask: convert roman numeral to number
  # ...
YMMV based on your coding tasks, but I notice gemma is much less verbose by default.
The downside of v8 isolates is that you have to reinvent a whole bunch of stuff to get good isolation (of both security and resources).
Here's an example. Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process. They need to be sandboxed in isolated processes. Chrome sandboxes them in isolated processes.
Process isolation is slightly heavier weight (though forking is wicked fast) but more secure. Processes give you the advantage of using cgroups to restrict resources, namespaces to limit network access, etc.
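A rough, Linux-only Go sketch of that one-process-per-workload idea (purely illustrative; this is not how Cloudflare or Chrome actually do it, and cgroup limits would be applied separately): clone the child into fresh PID/network/mount/user namespaces.

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        if len(os.Args) > 1 && os.Args[1] == "worker" {
            // The untrusted workload would run here, inside its own namespaces.
            fmt.Println("worker pid:", os.Getpid()) // prints 1 in the new PID namespace
            return
        }

        cmd := exec.Command("/proc/self/exe", "worker")
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
            // New PID, network, mount, and user namespaces for the child.
            Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNET |
                syscall.CLONE_NEWNS | syscall.CLONE_NEWUSER,
            UidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getuid(), Size: 1}},
            GidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getgid(), Size: 1}},
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }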
Once you've forked a process, though, you're not far off from just running something like Firecracker. This is both true and intense bias on my part. I work on https://fly.io, we use Firecracker. We started with v8 and decided it was wrong. So obviously I would be saying this.
Firecracker has the benefit of hardware virtualization. It's pretty dang fast. The downside is, you need to run on bare metal to take advantage of it.
My guess is that this is all going to converge. v8 isolates will someday run in isolated processes that can take advantage of hardware virtualization. They already _should_ run in isolated processes that take advantage of OS level sandboxing.
At the same time, people using Firecracker (like us!) will be able to optimize away cold starts, keep memory usage small, etc.
The natural end state is to run your v8 isolates or wasm runtimes in a lightweight VM.
Fun weekend project but definitely not production-ready (no tests, no error handling, concurrent requests will cause a race condition, etc.). If readers are looking for something production-ready to use, consider https://github.com/go-redis/redis_rate (which implements GCRA/leaky bucket), or https://github.com/ulule/limiter (which uses a much simpler algorithm, but has good middleware).
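For reference, using redis_rate is only a few lines; this roughly follows the library's README (the Redis address and key are placeholders):

    package main

    import (
        "context"
        "fmt"

        "github.com/go-redis/redis_rate/v10"
        "github.com/redis/go-redis/v9"
    )

    func main() {
        ctx := context.Background()
        rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

        limiter := redis_rate.NewLimiter(rdb)

        // GCRA-based limit: at most 10 requests per second for this key.
        res, err := limiter.Allow(ctx, "user:42", redis_rate.PerSecond(10))
        if err != nil {
            panic(err)
        }
        fmt.Println("allowed:", res.Allowed, "remaining:", res.Remaining)
    }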
Lots of interesting OSS observability products have come out in recent years. One of the more impressive (and curious, for many reasons) IMHO is OpenObserve: https://github.com/openobserve/openobserve .
As opposed to just a stack, they are implementing just about the whole backend shebang from scratch.
> [1] https://imgur.com/a/3E17Dts
This is generated on device with llama.cpp compiled to WebAssembly (aka wllama) and running SmolLM2-360M. [1] How is this different from the user clicking the link? In the end, your local Firefox will fetch the link in order to summarize it, the same way you would have followed the link and read through the document in reader mode.
[1] https://blog.mozilla.org/en/mozilla/ai/ai-tech/ai-link-previ...