huggingface-local-models

Official

huggingface/skills

10,656Added Jun 5, 2026

llama-cppgguflocal-inferencequantizationcpu-inferencehuggingfaceopenai-compatible

Summary

Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.

huggingface-local-models

Summary

SKILL.md