
I like self-hosting random stuff on Docker, and Ollama has been a great addition. I know it isn't really, but it feels on par with ChatGPT.
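Once the container is up, talking to it is a single HTTP call. A minimal Python sketch, assuming the default port 11434 is published and you've already pulled a model:

    import requests

    # Ollama serves a simple REST API on localhost:11434 by default.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",  # any model you've pulled
            "prompt": "Summarize why local LLMs are useful.",
            "stream": False,  # wait for one complete response
        },
    )
    print(resp.json()["response"])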

It works perfectly on my 4090, but I've also seen it work perfectly on my friend's M3 laptop. It feels like an excellent alternative for when you don't need the heavyweights but want something bespoke and private.

I've integrated it with my Obsidian notes for 1) note generation and 2) fuzzy search.
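The fuzzy search half is less magic than it sounds: embed each note once through Ollama's embeddings endpoint, then rank notes by cosine similarity to the query. A rough sketch of mine in Python - the vault path and the nomic-embed-text model are just what I happen to use:

    import glob
    import os

    import numpy as np
    import requests

    VAULT = os.path.expanduser("~/vault")  # wherever your vault lives

    def embed(text):
        # One vector per prompt from Ollama's embeddings endpoint.
        r = requests.post(
            "http://localhost:11434/api/embeddings",
            json={"model": "nomic-embed-text", "prompt": text},
        )
        return np.array(r.json()["embedding"])

    # Embed every note up front (cache this in practice).
    notes = {
        path: embed(open(path, encoding="utf-8").read())
        for path in glob.glob(f"{VAULT}/**/*.md", recursive=True)
    }

    def search(query, k=5):
        q = embed(query)
        # Cosine similarity, highest first.
        sim = lambda v: np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))
        return sorted(notes, key=lambda p: -sim(notes[p]))[:k]

    print(search("notes about gardening"))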

I've used it as an assistant for mental health and medical questions.

I'd much rather use it to query things about my music or photos than whatever the big players have planned.



There's actually a very popular Obsidian plugin called Smart Connections that integrates RAG + an LLM into your notes.

https://github.com/brianpetro/obsidian-smart-connections


Ollama is not a model; it's the software that runs models.


Not even that - it's a wrapper around the software (llama.cpp) that actually runs the model.


which model are you using? what size/quant/etc?

thanks!


Come join us on Reddit’s /r/localllama. Great community for local LLMs.


Not the parent, but I started using Llama 3.1 8b and it's very good.

I'd say it's as good as or better than GPT 3.5 based on my usage. Some benchmarks: https://ai.meta.com/blog/meta-llama-3-1/

Looking forward to trying other models like Qwen and Phi in the near future.
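If you want to script against it, the official ollama Python client keeps things short. A minimal sketch, assuming you've already done an "ollama pull llama3.1:8b":

    import ollama  # pip install ollama

    # Chat with the local llama3.1 8b model through the Ollama daemon.
    reply = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Explain RAG in two sentences."}],
    )
    print(reply["message"]["content"])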


I found it not to be as good in my case for code generation and suggestions. I'm using a quantized version; maybe that's the difference.


I'd be interested in other people's recommendations as well. Personally I'm mostly using openchat with q5_k_m quantization.

OpenChat is imho one of the best 7B models. I could run bigger ones, but at least for me they monopolize too many resources to keep loaded all the time.
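If you want to pin a specific quant like that, the Ollama library publishes tagged variants of most models. Something like the below - the exact tag is from memory, so check the model's page or "ollama list" first:

    import ollama

    # Tag is illustrative; quant variants follow a name:size-version-quant pattern.
    TAG = "openchat:7b-v3.5-q5_K_M"

    ollama.pull(TAG)  # fetch that specific quantization
    reply = ollama.generate(model=TAG, prompt="Say hello.")
    print(reply["response"])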


Agree. Please provide more details on this setup or a link.


Just try a few models on your machine? It takes seconds plus however long it takes to download the model.
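For example, a quick-and-dirty comparison loop - substitute whatever tags you've actually pulled:

    import time

    import ollama

    CANDIDATES = ["llama3.1:8b", "qwen2:7b", "phi3"]  # whatever is on your machine
    PROMPT = "Write a Python function that merges two sorted lists."

    for model in CANDIDATES:
        start = time.time()
        out = ollama.generate(model=model, prompt=PROMPT)
        print(f"--- {model} ({time.time() - start:.1f}s) ---")
        print(out["response"][:400])  # a few hundred chars is enough to judge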


I would prefer some personal recommendations - I've had some success with Llama3.1-8B/8bit and Llama3.1-70B/1bit, but this is a fast-moving field, so I think the details are worth sharing.


New LLM Prompt:

Write a reddit post as though you were a human, extolling how fast, intelligent, and useful $THIS_LLM_VERSION is... Be sure to include personal stories and your specific final recommendation to use $THIS_LLM_VERSION.



