This has been one of my favorite frameworks to use for AI development. It's pretty dead-simple to use, but it also has a great quality: it gets the f** out of your way when you need it to. With a fast-moving layer beneath it (i.e. the AI providers), I need to be able to adopt and experiment with new provider features without having to wait for your library to update.
I think the API generally shows great taste. Having typed functions as the core abstraction (which shares some similarity with DSPy Signatures) was a great move.
Congrats on the launch, can't wait to see the platform come together more and more.
As new model releases support longer and longer context windows, there is a lot of discussion around whether RAG is still relevant.
RAG is here to stay for a while:
(1) Enterprises have much more data than will reasonably fit in a context window any time soon
(2) Even if you can technically put 1M tokens in, that does not mean the model can effectively use it all
(3) Longer input = higher latency and cost for inference
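To put rough numbers on (3) - these are purely illustrative token counts, not real benchmarks or pricing - prefill work (and typically price) grows roughly linearly with input size:

    # Back-of-the-envelope sketch for point (3); chunk sizes and counts are made up.
    full_context_tokens = 1_000_000      # dump "everything" into the window
    retrieved_tokens = 8 * 500           # e.g. top-8 retrieved chunks of ~500 tokens each

    # Rough proxy for the extra prefill latency/cost per query.
    print(full_context_tokens / retrieved_tokens)   # -> 250.0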
I have really enjoyed the conversations I have had with Jacopo and Ciro over the years. They have revisited a lot of the assumptions behind commonly used tools/infrastructure in the data space and built something with a much better developer experience.
This is great! I personally found the original Anthropic MCP documentation pretty lacking in terms of how Claude Desktop uses the MCP server(s), what the constraints are, etc. For example, there is a pretty hard timeout that will cause the MCP server to crash.
Glad there's a simple-to-use solution for creating my own server where I can make some different design choices!
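For anyone curious what one of those design choices can look like, here's a minimal sketch of keeping tool handlers bounded so a slow call doesn't blow past the client-side timeout. It assumes the Python MCP SDK's FastMCP helper; the tool itself, the server name, and the 20-second budget are made up for illustration.

    import asyncio
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("notes-demo")  # hypothetical server name

    async def do_summarize(path: str) -> str:
        # placeholder for the real (potentially slow) work
        await asyncio.sleep(1)
        return f"summary of {path}"

    @mcp.tool()
    async def summarize(path: str) -> str:
        """Summarize a local file, but give up before the client gives up on us."""
        try:
            return await asyncio.wait_for(do_summarize(path), timeout=20)
        except asyncio.TimeoutError:
            return "Summarization took too long; try a smaller file."

    if __name__ == "__main__":
        mcp.run()  # stdio transport by default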
What does the app provide? I have long considered creating an app that combines Notes and Voice Memos, with a way of tracking alternative ideas for each line or section.
- Create projects
- Projects can contain notes and audio (uploaded or recorded in the browser)
- Then there's an AI chat in the project where the docs/audio are available as context (multimodal models are used)
It's definitely very early; the AI and UX need a lot of work. But it has definitely helped me get over some “humps” with writing songs.
For extra context: I write songs with acoustic guitar and vocals, but I would say they are pretty simple overall.
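If it helps, here's a simplified sketch of how I think about the structure (illustrative field names, not the literal schema):

    from dataclasses import dataclass, field

    @dataclass
    class Note:
        title: str
        body: str

    @dataclass
    class AudioClip:
        title: str
        path: str  # uploaded file or an in-browser recording

    @dataclass
    class Project:
        name: str
        notes: list[Note] = field(default_factory=list)
        clips: list[AudioClip] = field(default_factory=list)
        # the project chat passes notes + clips to a multimodal model as context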
I think there are some updates I can make to this to give some guidance around point (1), that you have to know a lot of things. You can use AI assistance to help learn more. But ultimately, I don't believe we are at the point where the AI makes you "smarter", though it _can_ make you more productive.
I disagree with (3) based on my experience. That feeling happens when I am not providing enough context. I rarely have experiences where I step back to provide more context and still end up in a dumb loop. Highly recommend providing lots of context + breaking down the problem more.
I would love to dig deeper into an example you have where you feel you "never get your time back". Because in general I am saving a lot of time from how much less typing I have to do.
I'd rather spend my time writing the code myself, because it is enjoyable, than spend the same amount of time gathering context and verbosely explaining it in text to an AI.
> Highly recommend providing lots of context + breaking down the problem more.
I don't have to do this with a human; I can delegate and know they will do it on their own. Also, the AI will make the same mistake if I come back a week later, whereas a human learns online. This is at the core of the lost time, and I don't think AIs are close to this level without an advancement that takes us beyond transformers.
Interesting! I find I very often have to do this with humans. But yes, the repeated mistakes are a thing. This is where ensuring I output context along the way is helpful. Although it's maybe annoying to "re-refer" to the context, I find that it helps the AI make the same mistakes less often.
The example is kind of opposite of the point GP's trying to make, isn't it? I mean, "cheeseburger" is a very well-known and a very specific thing. You ask a cook to make you one, they'll know what you want. Try asking for "a hamburger with patty topped with sliced cheese and standard condiments", and you'll probably get back a confused look, "er, you mean you want a cheeseburger, right?".
Same pattern happens with LLMs, in my experience. GP says LLM inference is "sort of a decompression process for a lossy copy of the Internet" - but in these terms, if asking it for a cheeseburger means decompressing parts of the latent space around the term "cheeseburger", then asking for "a hamburger with patty topped with sliced cheese and standard condiments" is making it decompress a much larger space around multiple terms, then filter the result down to a semantically relevant subspace, and then run extra inference on that.
If you think about it, the very reason we (humans) give names to things is to avoid having to repeatedly do that decompression and filtering every time we want to refer to a specific thing. We call the modified "hamburger" a "cheeseburger" precisely to avoid having to talk like GP suggests we should talk to LLMs, so I very much think this advice is backwards.
Cursor kept giving me burgers without meat, without the top bun, double-wrapped, etc. I started giving it nested bullet points, with the amount of detail roughly representing the gut-felt relative volume of each function, and then the agent started macro-expanding the instructions much better. It did relocate, rename, and reorganize functions as needed. Then I could ask for fixes, manually trim extra bits, and fill in unmarked TODOs to reach the desired end result.
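Roughly the shape of input I mean (a made-up example, not my actual notes):

    - page with a row of sticky toggle buttons
      - each button remembers its state across reloads (this is the bulk of it)
        - read the saved state on load
        - write the state on every click
      - one Reset button that clears the saved state and un-toggles everything (small)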
It also seemed to do better with occasional stern instructions, like "ok but you're wrong, fix $item", than with consistently teacher-student-like interactions like "Great! However, there is...", IME. I suppose students' solutions are more likely to be wrong and/or less sophisticated.
The end product is on the Internet, nothing secret or inappropriate... I'm just reluctant to post "my first HTML" on HN.
Kind of like that; I had to word the input so that the instructions looked like collapsed code and the correlation was apparent. It didn't get it right from a mere description of the end result.
The code was just an HTML file with some sticky buttons that would reset. The AI left some state set, left the reset function empty, had handling code scattered everywhere, etc., and just didn't get it. Being able to just keep rubber-stamping the AI until it breaks was a huge time saving, but it wasn't quite as much of an IQ saving.
With the example, "Add RBAC to my application", I’ve had success telling the LLM, “I want your help creating a plan to add RBAC to my application. I’m sending the codebase as context (or just the relevant parts if the entire codebase is too large). Please respond with a high-level, nontechnical outline of how we might do this.” Then take each step and recursively flesh it out until each step is thoroughly planned.
It’s wise to get the LLM to think more broadly and more collaboratively with questions like, “What question should I ask you that I haven’t yet?” and similar.
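A rough sketch of that loop in code, if it helps make it concrete. This assumes the OpenAI Python client; the prompts, model name, and depth cutoff are all illustrative, not a fixed recipe.

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # any capable model works here
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def plan(task: str, context: str, depth: int = 0, max_depth: int = 1) -> str:
        # Ask for a high-level, nontechnical outline first.
        outline = ask(
            f"I want your help creating a plan to {task}.\n"
            f"Here is the relevant context:\n{context}\n"
            "Respond with a high-level, nontechnical outline of how we might do this."
        )
        if depth >= max_depth:
            return outline
        # Then take each step and recursively flesh it out.
        details = [
            plan(f"accomplish this step: {step}", context, depth + 1, max_depth)
            for step in outline.splitlines() if step.strip()
        ]
        return outline + "\n\n" + "\n\n".join(details)

    print(plan("add RBAC to my application", context="<the relevant parts of the codebase>"))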
I noticed a lot of people complaining about Cursor agent being bad. And my initial experience was bad. After working with it for a while, I found a workflow that works for me; I hope it's helpful for you too!
The main reason people think that is that most of the MS Office suite's document formats are things no programmer wants to touch if they can help it, and usually when you're parsing them into something else, you can drop all the weirdness. Nobody loses sleep because the script taking an xlsx file as input doesn't properly parse the Excel graph that someone else put into it.
They're all incredibly capable formats (from a user perspective, anyway), with the caveat that they're utter hell to work with from a programming perspective. It's easier to just toss a file into a black-box parser/serializer and hope that all the text you need comes out properly on the other side.
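The "black box" approach in practice is basically this - a sketch assuming python-docx with a hypothetical input file, where you grab the paragraph text and silently drop everything else (charts, text boxes, tracked changes, most formatting):

    from docx import Document

    doc = Document("report.docx")  # hypothetical input file
    text = "\n".join(p.text for p in doc.paragraphs)
    print(text)  # hope the text you need came out on the other side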
Actually generating docx or xlsx files (non-trivial ones) that look exactly like another input file, where you have to account for every notable difference in formatting, is a ton of work. Most people who have touched webdev will probably at some point have had to format their emails for Outlook's half-assed, ancient HTML engine, and even there you at least control what it's going to look like.
I think a lot of people unfamiliar with docx think of it as effectively the same thing as any other rich-text format. Because of that, they assume converting it to other formats is trivial, not realizing that those other formats support only a small subset of the functionality docx does.
Definitely. Finance professionals, academic researchers, and legislators, among many others, encounter similar version-control issues. We are currently focused on law because of our domain expertise and because the problem is particularly pronounced there.
I can see something like this being useful in other law related areas. My wife's court office has hacked together some horrendous workflow for tracking offenders as they progress through the system using OneNote and Word, but it suffers from all the synchronization issues and conflicts you'd expect.
This would be great in engineering for specifications and such. We have product lifecycle management software to track document revisions and handle approvals but it doesn't integrate with .docx files, it just stores them. You have to download files manually and diff them in Word or Beyond Compare for redlines.
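For what it's worth, the crude text-level redline you can get programmatically looks something like this - a sketch assuming python-docx, with hypothetical file names; anything that isn't plain paragraph text (tables, comments, formatting) is simply lost:

    import difflib
    from docx import Document

    def paragraphs(path: str) -> list[str]:
        return [p.text for p in Document(path).paragraphs]

    old = paragraphs("spec_rev_A.docx")
    new = paragraphs("spec_rev_B.docx")
    for line in difflib.unified_diff(old, new, fromfile="rev A", tofile="rev B", lineterm=""):
        print(line)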