Community curated code

Lemonade is a local AI server that lets users run optimized LLMs on their own hardware, which keeps data private and avoids the recurring cost of hosted inference.

FastFlowLM runs large language models efficiently on AMD Ryzen AI NPUs, with no GPU required.

vLLM is an engine for LLM inference and serving, designed for high throughput and efficient memory use.