Community-curated code
llama.cpp enables high-performance LLM inference in plain C/C++, with support for a wide range of hardware backends and model architectures.