miLLM User Manual

Welcome to the miLLM (Mechanistic Interpretability LLM Server) user manual.

miLLM is an OpenAI-compatible LLM inference server with built-in support for Sparse Autoencoder (SAE) feature steering and real-time activation monitoring. It serves as the inference backbone for mechanistic interpretability research, providing a REST API that any tool — including miStudio, Open WebUI, or custom scripts — can use to run steered inference.

What miLLM Does

  • Serves LLMs via an OpenAI-compatible API (/v1/chat/completions)
  • Attaches SAEs to model layers for real-time feature analysis
  • Steers model behavior by amplifying or suppressing specific features
  • Monitors activations to observe which features fire during inference
  • Manages profiles to save and restore steering configurations
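Because the API is OpenAI-compatible, any OpenAI-style client can talk to miLLM. Below is a minimal sketch of a chat request body for the /v1/chat/completions endpoint named above; the server URL, model identifier, and message content are illustrative placeholders, not values from this manual.

```python
import json

# Sketch of a request against miLLM's OpenAI-compatible endpoint.
# The /v1/chat/completions path comes from this manual; the host, port,
# and model name below are assumptions for illustration only.
url = "http://localhost:8000/v1/chat/completions"  # assumed default address
payload = {
    "model": "my-local-model",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Explain SAE feature steering briefly."}
    ],
}
body = json.dumps(payload)  # POST this with Content-Type: application/json
```

Existing OpenAI SDKs should also work by pointing their base URL at the miLLM server instead of the OpenAI API.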

Quick Navigation

  • Getting Started: What miLLM is and how to install it
  • Model Management: Downloading, loading, and quantizing models
  • SAE Management: Downloading and attaching Sparse Autoencoders
  • Feature Steering: Manipulating model behavior via features
  • Probe Monitoring: Real-time activation observation
  • Profiles: Saving and sharing steering configurations
  • API Reference: OpenAI-compatible and management endpoints
  • Troubleshooting: Common issues and fixes