
huggingface-local-models
Officialhuggingface/skills
10,656Added Jun 5, 2026
llama-cppgguflocal-inferencequantizationcpu-inferencehuggingfaceopenai-compatible
Summary
Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.