Hacker News

I'll play around with it some more later. I was running llava-v1.5-7b-q4.llamafile, which is the example they recommend trying first at https://github.com/Mozilla-Ocho/llamafile

Groq looks interesting and might be a better option for me. Thank you.



I got better performance, 20.18 tokens per second, using tinyllama-1.1b-chat-v1.0.Q8_0.llamafile from https://huggingface.co/Bojun-Feng/TinyLlama-1.1B-Chat-v1.0-l...

If anyone reading this has had trouble with a larger model, that might be the one to try next.
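For what it's worth, a throughput figure like the 20.18 tokens per second above is just generated tokens divided by wall-clock generation time. A minimal sketch; the token count and timing below are made-up sample values, not measurements from my run:

```python
# Sketch of how a tokens-per-second figure is derived.
# These numbers are hypothetical samples, not real benchmark data.
generated_tokens = 256     # tokens produced during the run
elapsed_seconds = 12.69    # wall-clock time spent generating
tokens_per_second = generated_tokens / elapsed_seconds
print(f"{tokens_per_second:.2f} tokens/sec")  # → 20.17 tokens/sec
```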




