miLLM User Manual

Welcome to the miLLM (Mechanistic Interpretability LLM Server) user manual.

miLLM is an OpenAI-compatible LLM inference server with built-in support for Sparse Autoencoder (SAE) feature steering and real-time activation monitoring. It serves as the inference backbone for mechanistic interpretability research, providing a REST API that any tool — including miStudio, Open WebUI, or custom scripts — can use to run steered inference.

What miLLM Does

  • Serves LLMs via an OpenAI-compatible API (/v1/chat/completions)
  • Attaches SAEs to model layers for real-time feature analysis
  • Steers model behavior by amplifying or suppressing specific features
  • Monitors activations to observe which features fire during inference
  • Manages profiles to save and restore steering configurations
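Because the API is OpenAI-compatible, any OpenAI-style client can talk to miLLM. Below is a minimal sketch of a chat request body for the /v1/chat/completions endpoint named above; the server URL, model identifier, and message content are illustrative placeholders, not values from this manual.

```python
import json

# Sketch of a request against miLLM's OpenAI-compatible endpoint.
# The /v1/chat/completions path comes from this manual; the host, port,
# and model name below are assumptions for illustration only.
url = "http://localhost:8000/v1/chat/completions"  # assumed default address
payload = {
    "model": "my-local-model",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Explain SAE feature steering briefly."}
    ],
}
body = json.dumps(payload)  # POST this with Content-Type: application/json
```

Existing OpenAI SDKs should also work by pointing their base URL at the miLLM server instead of the OpenAI API.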

Quick Navigation

  • Getting Started: What miLLM is and how to install it
  • Model Management: Downloading, loading, and quantizing models
  • SAE Management: Downloading and attaching Sparse Autoencoders
  • Feature Steering: Manipulating model behavior via features
  • Probe Monitoring: Real-time activation observation
  • Profiles: Saving and sharing steering configurations
  • API Reference: OpenAI-compatible and management endpoints
  • Troubleshooting: Common issues and fixes