Gemini Pro 2.5 has a context window of 1 million tokens, and Google plans to raise that to 2 million soon. One token is roughly 0.75 words, so 1 million tokens is in the ballpark of 3,000 pages of code.
You can add some tutorials or language docs as context without any problem, and the bigger your project gets, the more context comes from the codebase itself. You can also convert APIs/documentation into a RAG index and expose it as an MCP tool to the LLM.
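As a rough sketch of the retrieval side of such a tool: a toy keyword scorer standing in for a real embedding index (the MCP wiring itself would go through an MCP SDK and is omitted; all names and sample docs here are illustrative):

```python
# Toy retrieval core for a docs-RAG tool: scores chunks by keyword
# overlap with the query. A real setup would use embeddings, but the
# shape of the tool-facing function is the same.

def search_docs(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    query_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

docs = [
    "The context window limits how many tokens the model sees at once.",
    "RAG retrieves relevant chunks and injects them into the prompt.",
    "Fine-tuning adjusts model weights on task-specific examples.",
]
print(search_docs("how does RAG retrieve chunks", docs, top_k=1))
```

The LLM never sees the whole corpus; it calls the tool with a query and only the top-scoring chunks get injected into its context.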
> Gemini Pro 2.5 has a context window of 1 million tokens, and Google plans to raise that to 2 million soon. One token is roughly 0.75 words, so 1 million tokens is in the ballpark of 3,000 pages of code.
You mean around 3,000 files of 3,000 characters each? That is a lot. I've played with some other LLMs in agentic tools, but at work we use Copilot, and when I add context through drag and drop it seems to be limited to a few dozen files.
Still I don't totally understand how that huge of a context works for Gemini. I guess you don't provide the whole context for every request? So it keeps (but also updates) context for a specific session?
Gemini is better than Sonnet if you have broad questions that concern a large codebase; the context size seems to help there. People also use subagents for specific purposes to keep each context manageable, where possible.
On a related note, I think the agent metaphor is a bit harmful, because it suggests state while the LLM itself is stateless.
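Concretely, "keeping" a session just means the client resends the growing message list on every turn. A minimal sketch, with a stub standing in for the real model call:

```python
# Statelessness in practice: the "session" lives entirely in this list,
# and every request ships the full history back to the model.

def fake_llm(messages: list[dict]) -> str:
    # Stub model: just reports how much history it was given.
    return f"reply based on {len(messages)} prior messages"

history = []

def send(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)  # the whole history is resent each turn
    history.append({"role": "assistant", "content": reply})
    return reply

send("What does this codebase do?")
print(send("And where is the entry point?"))  # sees all 3 prior messages
```

So nothing persists on the model side; the "agent" is the loop plus the transcript it keeps replaying.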
Yes, it takes an existing model and fine-tunes it. Customizing ChatGPT would basically be extensive prompt engineering within a session. Maybe using their API? I have never tried it personally.
When I fine-tuned a Mistral 7B model, it took hundreds of examples in the Alpaca style.
It's a lot of work. Maybe OpenAI has a more efficient way of doing it, because in my case I had to adjust each prompt manually.
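For reference, an Alpaca-style record is just an instruction/input/output triple; a sketch of writing a few out as JSONL (field names follow the Alpaca format, the example contents are made up):

```python
import json

# Alpaca-style fine-tuning records: an instruction, optional input,
# and the desired output.
examples = [
    {
        "instruction": "Summarize the function in one sentence.",
        "input": "def add(a, b): return a + b",
        "output": "Adds two numbers and returns the result.",
    },
    {
        "instruction": "Translate the sentence to French.",
        "input": "The cat sleeps.",
        "output": "Le chat dort.",
    },
]

# One JSON object per line, the format most fine-tuning scripts expect.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

The tedious part is exactly what the comment describes: writing and hand-checking hundreds of these records.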
How do you do this?