A local, uncensored AI.
Obviously I’m assuming this wouldn’t be a paperback.
Llama?
Use llama.cpp if you need to run it on a potato. Use KoboldCpp if you have a GPU. Go to Hugging Face and download any uncensored GGUF model of your choice. Pay OpenAI zero dollars.
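If you'd rather script the download than click around, the huggingface_hub package can fetch a GGUF file directly. A minimal sketch, with placeholder repo and file names you'd swap for whatever model you actually pick:

```python
# pip install huggingface_hub
from huggingface_hub import hf_hub_download

# Both IDs below are placeholders; browse Hugging Face for an
# uncensored GGUF model and substitute its real repo and filename.
model_path = hf_hub_download(
    repo_id="SomeUser/SomeModel-GGUF",
    filename="somemodel.Q4_K_M.gguf",
)
print(model_path)  # local file you can point llama.cpp or KoboldCpp at
```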
Beware that most forms of self-hosted AI (i.e. everything that isn't llama.cpp, KoboldCpp, or Easy Diffusion) are generally extremely difficult, borderline impossible, to install from their GitHub repos. Anything having to do with voice TTS, for example, you can fucking forget about it. I've been hacking away at the three most popular open-source ones for months and still haven't gotten even one of them to actually work and stop having Python dependency issues. It'll be a great day when they stop using Python for this stuff.
I hate Python and I've barely used it. It's a dependency nightmare. I wrote something that calls the OpenAI API in Java and managed to install AUTOMATIC1111 and Stable Diffusion, but I'll go the easy route. I have an older 6 GB card, a GTX 1660 I think. Is that enough for KoboldCpp?
That should work in KoboldCpp. I'm running mine on an older GPU than that, but with more VRAM. Use the gpulayers parameter to control exactly how much of the work gets offloaded to the GPU, which in turn controls how much VRAM gets used up.
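If you'd rather drive the same setup from code, the llama-cpp-python bindings expose the same idea as an n_gpu_layers argument. A minimal sketch, assuming a GPU-enabled build and a placeholder model path:

```python
# pip install llama-cpp-python  (use a CUDA/cuBLAS build for GPU offload)
from llama_cpp import Llama

llm = Llama(
    model_path="somemodel.Q4_K_M.gguf",  # placeholder path to your GGUF file
    n_gpu_layers=20,  # layers offloaded to the GPU; raise until VRAM runs out
    n_ctx=2048,       # context window; larger also eats VRAM
)

out = llm("Q: What year was the album 'Let It Be' released?\nA:", max_tokens=32)
print(out["choices"][0]["text"])
```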
I've tried llama3, the 8B version. It just made up a joke about the famous song "Stop" by the Beatles: it's on the 1969 album "Let It Be" and it contains the line "Stop, you don't know what it's like / Being on your own…"
I might invest in some better hardware and try the 27B version haha XD
OpenWebUI with Ollama is really good. Ollama is an easy install, and OpenWebUI just needs Docker, which seems complicated but is actually very easy.
Ollama alone works as well, but it's just a CLI, so not the best experience.
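Even without a frontend, though, Ollama isn't locked to the CLI: it serves a REST API on localhost:11434 that anything can call. A minimal sketch in Python, assuming you've already pulled a model (the name here is just an example):

```python
# pip install requests
import requests

# Ollama serves a REST API on localhost:11434 by default.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # assumes you've already run `ollama pull llama3`
        "messages": [{"role": "user", "content": "Name a Beatles album."}],
        "stream": False,    # one complete JSON reply instead of a stream
    },
)
print(resp.json()["message"]["content"])
```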
Probably. I think that might be the top open source small LLM. I really need to set it up and try it.
It's really just download and install. Just follow the instructions on the website, and check the README on GitHub for how to use it.
If you want a nice-looking web interface instead of a command-line interface, you can also download one of the many ready-to-use frontends. You can find them in the GitHub README.
I installed it yesterday and the CLI is pretty slow (on Windows), but the REST API is pretty quick.
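That matches how the API behaves: you can also turn streaming on and watch tokens arrive as they're generated instead of waiting on the CLI. A sketch against the /api/generate endpoint, again with an example model name:

```python
# pip install requests
import json
import requests

# /api/generate streams newline-delimited JSON by default, so tokens
# print as soon as the model produces them.
with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Name a Beatles album."},
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if line:
            chunk = json.loads(line)
            print(chunk.get("response", ""), end="", flush=True)
```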