Have you tried a llamafile? Not sure what platform you are using. From their README:
> … by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.
Low cost to experiment IMO. I am personally on macOS with an M1 chip and 64 GB of memory and it works perfectly, but the idea behind the project is to democratize access to generative AI, so it is at least possible it will run on whatever hardware you have.
I should have qualified the meaning of “works perfectly” :) No 70B for me, but I am able to experiment with plenty of quantized models (I am running a Llama model successfully, and latency isn’t terrible).
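In case it helps, the setup really is just download-and-run: grab a `.llamafile`, `chmod +x` it, and execute it, and it serves a local chat UI plus an OpenAI-compatible API (port 8080 by default). Here's a minimal sketch of talking to it from Python, adapted from the pattern in the project README; the placeholder `api_key` and `model` values are just conventions the local server ignores, and the filename is one example model, not a requirement:

```python
# Minimal sketch, assuming a llamafile is already running locally, e.g.:
#   chmod +x llava-v1.5-7b-q4.llamafile
#   ./llava-v1.5-7b-q4.llamafile
# Requires the client library: pip install openai (no OpenAI account needed).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # llamafile's default local endpoint
    api_key="sk-no-key-required",         # placeholder; the local server ignores it
)

completion = client.chat.completions.create(
    model="LLaMA_CPP",  # name is not meaningful locally; the loaded model answers
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```

Nice side effect of the OpenAI-compatible API: anything you prototype against the local llamafile can later be pointed at a hosted endpoint by changing `base_url`.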