skip to content
Documentation

Documentation

Conifer docs

Run open-weight models on your own hardware, from download to first token to a governed fleet. Every step stays on the machine in front of you.


Conifer is the runtime layer between a model and the metal. It fits the weights to your memory, picks a quantization that lands without swap, sizes the context window, and holds decode fast under load. These pages cover the work above that line: which model to run, what the catalog terms mean, where the engine spends its time, and how to give a local model tools without anything leaving your hardware. Concept before detail, so the map below doubles as the route.

Start here

Start with What is Conifer? The rest of the runtime is built on that one idea. Then install the app, load your first model, and run your first chat. Cold app to a streaming reply takes about five minutes. To skip the UI and point an existing client at the local API, one command stands it up:

terminal
conifer serve --model qwen3-8b

The sections

Each section owns its information and links out to the rest. Selection pages recommend and point. Concept pages define a term once. Engine pages explain the internals, and reference is for lookup. Nothing gets re-explained two pages over, and the trail from “which model?” to “why is it fast?” stays short.

SectionWhat’s in it
Getting startedInstall, your first model, your first chat. The five-minute path.
Choosing a modelWhich model for your task, with a catalog this large: coding, reasoning and math, writing, science, law, long context, or a good all-rounder.
How models workThe catalog terms, defined once: dense vs. sparse, instruct vs. reasoning, modalities, and quantization.
Inside the engineThe four surfaces Conifer tunes: the architecture, the model, the prompt, and the tool call.
CLI & local APIDrive the runtime from the terminal and serve an OpenAI-compatible endpoint on localhost.
Agents, tools, grantsGive a local model files, calendar, and notes under a deny-by-default grant model.
Security & governanceWhat stays on the machine, the threat model, and running Conifer across a team without a collection server.
ReferenceTroubleshooting, performance tuning, keyboard shortcuts, and the glossary.