bookmrks.io - Discovery, refined.
Tags: api-server, command-line-tool, developer-tools, gguf, gpt, huggingface, huggingface-models, huggingface-transformers, inference-server, llama, llamacpp, llm, llm-inference, local-ai, lora, machine-learning, ollama-api, open-source-coding-agent, openai-compatible, rust, rust-crate, transformers
Source: github.com

Python-free Rust Inference Server for OpenAI API

Shimmy is a Rust-based inference server providing local, OpenAI-compatible endpoints for machine learning models.

Tech Stack
GitHub, Cloudflare, Cloudflare Workers, Nginx, Fly.io, Railway, Render, Node.js, JavaScript, Cargo, Rust, Express, Jest, Docker, Bash, Python, Codecov, GitHub Actions, C, Objective-C, Ruby
Summary

Shimmy is a Python-free Rust inference server that exposes OpenAI-compatible endpoints for GGUF models. It lets users run AI models entirely on their own hardware, keeping data private and eliminating the need for external API calls.

Key features:

  • Single Binary - Download and run without any compilation or dependencies.
  • Automatic Model Discovery - Finds models from local directories and caches without manual configuration.
  • Advanced MoE Support - Efficiently runs large mixture-of-experts models on consumer hardware with intelligent CPU/GPU processing.
  • Zero Configuration - Automatically allocates ports and detects model adapters.
  • Compatibility - Works seamlessly with existing OpenAI SDKs and tools.

Shimmy is ideal for developers looking for a reliable and efficient way to run machine learning models locally while maintaining control over their data and environment.
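Because Shimmy speaks the OpenAI wire format, any client that can POST a standard chat-completion payload can talk to it. Below is a minimal sketch in Python; the URL, port, and model name are illustrative assumptions, not values taken from the project's documentation.

```python
import json

# Hypothetical local endpoint — Shimmy auto-allocates its port, so the
# actual address depends on your setup; this URL is an assumption.
SHIMMY_URL = "http://localhost:11435/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# "llama-3.2-1b" is a placeholder; use a model Shimmy discovered locally.
payload = build_chat_request("llama-3.2-1b", "Hello, Shimmy!")
body = json.dumps(payload)

# To send: POST `body` to SHIMMY_URL with Content-Type: application/json,
# or simply point an existing OpenAI SDK's base_url at the local server.
print(json.loads(body)["messages"][0]["role"])  # prints: user
```

The same payload works unchanged against the official OpenAI API, which is what makes drop-in SDK compatibility possible.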
