Hacker News

I'm sorry, but from a _practical_ standpoint, it feels like mostly fluff. Someone was advertising today on an HN hiring post that they would create a basic chatbot for a specific set of documents for $15,000. This feels like the type of web page that person would use to confuse a client into thinking that was a fair price.

Practically speaking, the starting point should be APIs such as OpenAI's, or open-source frameworks and software. For example, llama_index: https://github.com/jerryjliu/llama_index. You can use something like that, or another GitHub repo built on it, to create a customized chatbot application in a few minutes to a few days. (It should not take two weeks and $15,000.)
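To be concrete about why this is a few-minutes job: the core retrieve-then-prompt loop those frameworks wrap is tiny. A toy sketch, with bag-of-words overlap standing in for real embeddings and a stubbed `llm` callable in place of an actual API client (all names here are illustrative, not any framework's API):

```python
# Toy sketch of the retrieve-then-prompt ("RAG") pattern that frameworks
# like llama_index wrap. Bag-of-words overlap stands in for a real
# embedding model; the llm callable stands in for an API client.
from collections import Counter
import math

def embed(text):
    # Hypothetical stand-in for an embedding model: token counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=2):
    # Rank documents by similarity to the question, keep the top k.
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def answer(question, docs, llm):
    # Stuff the retrieved context into the prompt and call the model.
    context = "\n".join(retrieve(question, docs))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    return llm(prompt)

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is open Monday through Friday.",
    "Shipping is free on orders over $50.",
]
print(retrieve("how long do refunds take", docs, k=1)[0])
# -> Refunds are processed within 5 business days.
```

Swap in real embeddings and a real completion endpoint and you have the skeleton of most "chat with your documents" products.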

It would be good to see something detailed that demonstrates an actual use case for fine-tuning. Also, I don't believe academic benchmarks are appropriate in that case. If you really were dead set on avoiding a leading-edge closed LLM and doing actual fine-tuning, you would want a person to look at the outputs and judge them in their specific context, such as handling customer support requests for that system.



What are you even talking about? Why would anyone be building chatbots at this point? Chatbots are the hello world of using an LLM API; they have nothing to do with what this article is about.


Almost every LLM application has an element of conversing with or querying an AI based on some knowledge set or set of tasks, etc.

100% this web page (or similar) will be used to basically scam clients into overpaying for simple wrappers around llama_index, LangChain, etc. Some people will spend a week wasting their time trying to fine-tune an open-source LLM on a wholly inadequate dataset before realizing they can use OpenAI and something from GitHub. But most will not admit that.

Sure, a few people doing basically research projects for a large company or university will find some of the information useful. But realistically, probably not so much if they have to actually deliver a working system in a reasonable amount of time that would justify the business expense.


No, this information is for software engineers applying LLMs in their applications. I'm working on an LLM-based system, and I'm just soloing a startup with no academic background; it's certainly no research project. I've been at it for 3 months, 1-2 hours per night, and I've already applied half the patterns in this article.

I'll grant you that the way the author presents his ideas seems a bit academic, but I assure you all of this information is just the immediate stuff you run into as a software engineer trying to integrate LLMs into your systems. No one is jumping to fine-tuning before they've even tried GPT-4 on their problem.


For some use cases, legal constraints (proprietary/private data, copyright, terms of service) prevent the use of a third-party API.

On the other hand, directly using an off-the-shelf model, even the best ones, may not meet your performance requirements.

That’s where fine-tuning an open LLM is necessary.


1) Dropbox can be replicated in a couple of hours with SFTP (or whatever that iconic HN comment was). 2) The devil is in the details. How do you get the data out of the documents? Are they PDFs? Do they have tables? Do they have images? Sure, creating embeddings from text is simple, and shoving that into a prompt to get an answer is easy, but getting that text out of different documents can be tricky.
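For contrast, here's the "simple" half of that pipeline: once you do have clean text, splitting it into overlapping chunks for embedding is a few lines. A minimal sketch (the window and overlap sizes are arbitrary illustrations, not recommendations):

```python
def chunk(text, size=200, overlap=50):
    # Split text into overlapping character windows so that sentences
    # cut at one chunk boundary still appear whole in the next chunk.
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The hard part is everything before this function gets called: tables flattened into word soup, multi-column layouts read in the wrong order, text baked into images. None of that is solved by the embedding step.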

Source: I finished a chat-with-your-documents project a month ago.


My point wasn't really that chat-with-documents is particularly easy for every application, just that the article isn't particularly practical advice for common use cases.


Hard disagree. Evaluation is hard. Handling stupid mistakes from LLMs is something that has to be taken into account from day 1.

Fine-tuning is not that needed, in my experience.
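Concretely, by "handling stupid mistakes from day 1" I mean defensive wrappers like this sketch: LLMs routinely return malformed JSON or wrap it in prose, so validate every response and retry. The `llm` callable and the retry policy are placeholders for your own client and limits:

```python
import json

def call_with_retry(llm, prompt, retries=3):
    # Defensive wrapper around an LLM call that must return a JSON object:
    # parse and validate every response, retry with a nudge on failure.
    last_err = None
    for _ in range(retries):
        raw = llm(prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            return data
        except (json.JSONDecodeError, ValueError) as e:
            last_err = e
            prompt += "\nReturn ONLY a valid JSON object."
    raise RuntimeError(f"no valid response after {retries} tries: {last_err}")
```

Pair this with an evaluation set of real inputs and expected outputs, and you catch regressions the moment you change a prompt.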


>SFTP

rsync



