huggingface-local-models

Official
huggingface/skills
10,656
Added Jun 5, 2026
llama-cpp, gguf, local-inference, quantization, cpu-inference, huggingface, openai-compatible

Summary

Use this skill to select and run models locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. It covers finding GGUF models, choosing a quantization, looking up exact GGUF files, converting models to GGUF, and running an OpenAI-compatible local server.

SKILL.md