Community curated code

FastFlowLM enables efficient execution of large language models on AMD Ryzen AI NPUs, running inference directly on the NPU without requiring a GPU.
vLLM is an engine for LLM inference and serving, designed for high throughput and efficient memory management via its PagedAttention mechanism.