
Ok, impressive!

What are real world use cases for 7B family of models? Is anyone using them for anything productive?



They're quite good at generating scaffolds and ideas (Mistral specifically).

You can use them for trivial NLP tasks ("between 0 and 1 how similar are these two sentences? Respond with an explanation."). Because it's a small model, you can run it 4 or 5 times and take the average pretty quickly.
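A rough sketch of that run-it-a-few-times-and-average idea, in case it helps. `generate` is a hypothetical stand-in for whatever calls your local model (prompt string in, reply string out), and the score parsing is an assumption of mine: it takes the first number between 0 and 1 in the reply and skips replies that have none.

```python
import re
from statistics import mean
from typing import Callable, Optional

# Prompt wording taken from the comment above; the sentence labels are an assumption.
PROMPT = (
    "Between 0 and 1, how similar are these two sentences? "
    "Respond with an explanation.\n"
    "Sentence A: {a}\nSentence B: {b}"
)


def parse_score(reply: str) -> Optional[float]:
    """Return the first number in [0, 1] found in the reply, or None."""
    for match in re.findall(r"\d*\.?\d+", reply):
        value = float(match)
        if 0.0 <= value <= 1.0:
            return value
    return None


def average_similarity(
    generate: Callable[[str], str], a: str, b: str, runs: int = 5
) -> float:
    """Ask the model `runs` times and average the scores that could be parsed."""
    prompt = PROMPT.format(a=a, b=b)
    scores = []
    for _ in range(runs):
        score = parse_score(generate(prompt))
        if score is not None:  # skip replies with no usable number
            scores.append(score)
    if not scores:
        raise ValueError("no reply contained a score between 0 and 1")
    return mean(scores)
```

Averaging only makes sense if the model samples with some randomness (temperature above 0); with greedy decoding every run returns the same reply. The naive parser can also be fooled by stray numbers in the explanation, so asking the model to put the score first is probably safer.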


7B coding models? Having massive amounts of questionable code :)


Welp, looks like I'm out of a job. Perhaps management will suit me well, where I can make massive amounts of questionable decisions.


Yeah, same experience here


They're perfectly fine for storytelling and basic chatbot duty. Generating basic code boilerplate also works just fine.


They make good classifiers when fine-tuned.


Interesting!

I'd have one use case for classification: user text (from a jira issue) mapped to the team responsible for the fix.

Can you share some tutorials? I only just managed to get this working on Windows/CUDA:

https://colab.research.google.com/drive/1vk8i01apaSp59GVV2yI...

It's been a royal pain to set up.
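For the issue-to-team use case, a minimal sketch of how the training data could be laid out before any fine-tuning, assuming you can export (issue text, team) pairs. The `instruction`/`input`/`output` field names are an assumption (a common instruction-tuning layout), not something a particular tool is confirmed to require, and the team names are made up.

```python
import json
from typing import Iterable, Sequence, Tuple


def to_jsonl(examples: Iterable[Tuple[str, str]], teams: Sequence[str]) -> str:
    """Turn (issue_text, team) pairs into one JSON record per line."""
    instruction = (
        "Classify the issue into exactly one of these teams: " + ", ".join(teams)
    )
    lines = []
    for text, team in examples:
        if team not in teams:
            # Catch label typos early; they would otherwise become new classes.
            raise ValueError(f"unknown team: {team!r}")
        record = {"instruction": instruction, "input": text, "output": team}
        lines.append(json.dumps(record))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical examples; real data would come from the issue tracker export.
    sample = [
        ("Login page returns 500 after deploy", "backend"),
        ("Button overlaps footer on mobile", "frontend"),
    ]
    print(to_jsonl(sample, ["backend", "frontend"]))
```

Keeping the label set fixed in the instruction means the model only ever has to emit one of the known team names, which makes the output easy to check.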


Do you have any pointers on how to get started with fine-tuning Mistral locally?


Use axolotl



