Shimmy is a Rust-based inference server providing local, OpenAI-compatible endpoints for machine learning models.
Shimmy is a Python-free Rust inference server designed to provide OpenAI-compatible endpoints for GGUF models. It allows users to run AI models locally, ensuring privacy and eliminating the need for external API calls.
Key features:
Shimmy is ideal for developers looking for a reliable and efficient way to run machine learning models locally while maintaining control over their data and environment.