You can do this all in fly.io, no cloudflare container needed.
The whole selling point of fly is lightweight and fast VMs that can be "off" when not needed and start on-request. For this, I would:
Set up a "performance" instance, with auto-start on and auto-restart-on-exit _off_, running a simple web service that accepts an incoming request, does the processing and upload, and then exits. All you need is the fly config, a Dockerfile, and the service code (e.g. Python). A simple API app like that, which only exists to ffmpeg-process something, can start very fast (milliseconds). Something that needs to load a bigger model such as Whisper will also still work, just a bit slower. Fly takes care of automatically starting stopped instances on an incoming request for you.
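A minimal fly.toml sketch of that setup might look like this (the app name and port are placeholders; field names and accepted values should be checked against the current fly.toml reference, since they have changed over time):

```toml
app = "audio-processor"   # placeholder app name

[build]
  dockerfile = "Dockerfile"

[[vm]]
  size = "performance-1x"  # a "performance" instance

[http_service]
  internal_port = 8080
  auto_start_machines = true   # wake a stopped machine on incoming request
  auto_stop_machines = "stop"  # stop it again when idle
  min_machines_running = 0     # allow everything to be off between requests

[[restart]]
  policy = "never"  # auto-restart-on-exit off: exiting means "done"
```

With `min_machines_running = 0`, you pay nothing for idle time; the first request after a stop eats the (small) boot latency.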
(In my use case: app where people upload audio, to have it transcribed with whisper. I would send a ping from the frontend to the "whisper" service even before the file finished uploading, saying "hey wake up, there's audio coming soon", and it was started by the time the audio was actually available. Worked great.)
That's a good trick (the "get ready" ping). It reminds me of how early Instagram was considered fast because they did the photo upload in the background while you were typing your caption so that by the time you hit "upload" it was already there and appeared instantly.
It may be even easier not to leave a VM around in a stopped state at all. Using either the fly CLI or their Machines API, you can kick off a one-off machine that runs an arbitrary script on boot and is destroyed when that script exits.
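A sketch of that with flyctl (the image, app name, and script are placeholders; check `fly machine run --help` for the exact flags your flyctl version supports):

```shell
# One-off machine: boots, runs the command, and is destroyed on exit (--rm).
fly machine run registry.fly.io/my-app:latest \
  --app my-app \
  --rm \
  "python process.py --input https://example.com/file.wav"
```

The same thing can be done programmatically by POSTing a machine config to the Machines API, which is handy when another service needs to trigger the job.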
For at least some codebases, I'm not sure this is a useful metric, because you don't usually put the whole codebase in your context at the same time.
For example in my current case, there are lots of files with CSS, SVG icons in separate files, old database migration scripts, etc. Those don't go in the LLM context 99% of the time.
Maybe a more useful metric would be "what percentage of files that have been edited in the last {n} days fit in the context"?
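A rough sketch of that metric in Python. The file discovery step (e.g. via `git log --since` to find recently edited files) is left out; this just takes file sizes directly. The 4-characters-per-token ratio and the 200k-token window are rough assumptions, not measurements:

```python
# Rough sketch: how much of a model's context window would the files
# edited in the last n days occupy?
CHARS_PER_TOKEN = 4       # rough heuristic for code
CONTEXT_TOKENS = 200_000  # assumed context window size

def recent_edit_context_fraction(recent_file_sizes: dict[str, int]) -> float:
    """Fraction of the context window occupied by recently edited files.

    recent_file_sizes maps filename -> size in characters, e.g. gathered
    from files touched in the last n days of git history.
    """
    total_tokens = sum(recent_file_sizes.values()) / CHARS_PER_TOKEN
    return total_tokens / CONTEXT_TOKENS

# Example: three files edited in the last 7 days.
sizes = {"app.py": 40_000, "models.py": 120_000, "views.py": 80_000}
print(f"{recent_edit_context_fraction(sizes):.0%} of the context window")
# -> 30% of the context window (240k chars ≈ 60k tokens of 200k)
```

Anything well under 100% suggests the "active" part of the codebase fits comfortably, even if the whole repo does not.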
eSIM, global, variable pricing per country with per-GB billing, anonymous crypto payments and no KYC. Although it seems to not have some of the additional security features of the OP.
This looks interesting but I’m struggling to find any examples of what this actually entails/produces/looks like. Most of the guides are about setting up your environment, checking out code, etc.
So I've actually built patterns using their system. Basically you define a layout using JS and defining a series of points and offsets and lines. You can refer to variables such as body measurements or other dimensions for bags. https://freesewing.eu/ is their more "consumer" facing site where you can enter your measurements and then download sewing patterns sized for you specifically.
One of the other nice things they do as part of the pattern design process is testing that the pattern makes sense at many "scales". You can actually define a "body" the size of a doll and use it to make doll clothing, or make really size-inclusive clothing; there are members of the community with varying disabilities, such as forms of dwarfism, who otherwise struggle to find appropriately sized clothes.
This does happen: for example in MacBook repair, it is common to buy defective motherboards in order to salvage the chips off them (which are Apple-specific, hence not purchasable elsewhere). Those boards often come from China, and often have holes drilled through them, I guess exactly to prevent them from being repaired.
It's a shame, because some of those boards could (and would, they are valuable enough) be fully repaired by a skilled repair person. Instead, the chips are picked off and the rest goes to waste.
I did buy a batch once that didn't have holes drilled, and they all turned out to have all sorts of strange, often random issues, so I suspect those were RMAs that somehow "fell off the back of a truck" and escaped the drilling.
OpenAI has a "flex" processing tier, which works like the normal API but where you accept higher latency and higher error rates in exchange for 50% off (the same as batch pricing). It also supports prompt caching for further savings.
For me, it works quite well for low-priority things, without the hassle of using the batch API. Usually the added latency is just a few seconds extra, so it would still work in an agent loop (and you can retry requests that fail at the "normal" priority tier.)
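The "try flex, then retry at normal priority" pattern could be sketched like this. The retry wrapper is my own illustration; `service_tier="flex"` is the documented request parameter for flex processing, but the commented-out client call, model name, and retry counts are placeholders:

```python
import time

def with_flex_fallback(call, retries=2, backoff=2.0):
    """Try a request at the flex tier; after repeated failures,
    fall back to the default tier.

    `call(service_tier)` performs the actual API request with the
    given tier and raises on failure.
    """
    for attempt in range(retries):
        try:
            return call("flex")
        except Exception:
            # Flex requests can be rejected under load; back off and retry.
            time.sleep(backoff * (attempt + 1))
    # Last resort: pay full price for normal priority.
    return call("default")

# Usage with the openai SDK might look like (untested sketch):
# client = OpenAI()
# result = with_flex_fallback(lambda tier: client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Summarize this ticket..."}],
#     service_tier=tier,
# ))
```

For agent loops this keeps the happy path cheap while bounding worst-case latency to a couple of retries plus one full-price call.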
That's interesting but it's a beta feature so it could go away at any time. Also not available for Codex agentic models (or Pro models for that matter).
Exactly what I was suspecting! I called so many times: nothing found, everything works as expected.
As for Starlink: I noticed that clouds and weather (rain, snow) have no real effect. The frequency must not be absorbed much by water in the air, or some similar effect. Only hard blockage, by buildings or big tree canopies, causes it to struggle.
Sorry, I was unclear: I meant 50s or 70s air travel compared to present-day air travel. (Which on reconsideration might not be particularly relevant, haha.)