What locally hosted open-source LLMs are people using for #coding and having good success with?
I’ve got a MacBook Pro M4 Pro with 48 GB RAM and 2 TB drive space. Loading qwen3-coder-next:q4_K_M on it results in the MacBook going about 25 GB into swap. Surprisingly the machine is still usable, although it does show signs of stress.
Through my employer, I have access to (read: I run it) a Proxmox cluster with 5 members. 3 members have 768 GB RAM each, and each has either 36 or 48 hyperthreaded cores spread across 2 NUMA domains. I’ve loaded qwen3-coder-next:q8_0 on it because the model takes about 85 GB RAM. I pinned Ollama to a single NUMA domain. The cluster members don’t have any fancy GPUs, so I’m left running this model on CPU only. As you might expect, it’s not the fastest thing around, but it does run.
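For anyone curious about the NUMA pinning step, here’s roughly how it can be done with `numactl` (node 0 is just an example; check `numactl --hardware` for your own layout):

```shell
# Inspect the NUMA topology first to see node indices,
# core counts, and per-node memory
numactl --hardware

# Bind Ollama's CPU scheduling and memory allocations to a
# single NUMA node so the model weights stay node-local
# (node 0 here is an assumption; adjust for your machine)
numactl --cpunodebind=0 --membind=0 ollama serve
```

Keeping both CPU and memory on one node avoids cross-node memory traffic, which matters a lot for CPU-only inference since the workload is heavily memory-bandwidth-bound.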
Anyway, I’m just exploring and trying to see how useful these open-source models are, mainly for coding but for other tasks as well, relative to Claude or Gemini Pro, both of which I have paid access to.
What models have you had reasonable success with for coding?
#ai #llm