THE FORMAT

The .mint file

An AST-grounded knowledge graph of your codebase. 313 symbols, 24 files, extracted in 0.20s. Every function signature, every type relationship, every parameter -- parsed from your source tree and SHA-256 verified. Not prose. Structure.

# project.mint
format: mint/v1
symbols: 313
files: 24
hash: a3f8c2e1...
[graph]
  execute → fn(input: Input, retry: u32) → Result<Output>
  parse_config → fn(path: &Path) → Config
  validate → fn(schema: Schema, data: &[u8]) → bool
  ... 310 more symbols
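The [graph] lines above can be read straight into a queryable table. A minimal illustrative parser, assuming the `name → fn(params) → return` line shape shown in the example (this is a sketch, not the official .mint grammar):

```python
# Parse .mint [graph] lines of the form shown above into a dict.
# The line format is inferred from the example; the real grammar may differ.
import re

GRAPH_LINE = re.compile(r"^\s*(\w+) → fn\((.*?)\) → (.+)$")

def parse_graph(text: str) -> dict[str, dict]:
    """Map each symbol name to its parameter list and return type."""
    symbols = {}
    for line in text.splitlines():
        m = GRAPH_LINE.match(line)
        if not m:  # skip non-symbol lines like "... 310 more symbols"
            continue
        name, params, ret = m.groups()
        symbols[name] = {
            "params": [p.strip() for p in params.split(",") if p.strip()],
            "returns": ret.strip(),
        }
    return symbols

graph = """
  execute → fn(input: Input, retry: u32) → Result<Output>
  parse_config → fn(path: &Path) → Config
"""
table = parse_graph(graph)
print(table["execute"]["returns"])  # Result<Output>
```

Once parsed, an agent can look up any symbol's exact parameter types in O(1) instead of re-reading prose.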
PRIOR ART

Why existing formats fall short

LSIF and SCIP solve code intelligence for IDEs -- hover info, go-to-definition, find-references. They encode position-level data (line:col ranges) optimized for editor navigation, not semantic understanding. Griffe extracts Python API signatures but is single-language and designed for human-readable doc sites, not machine consumption.

None of these formats were designed for the problem AI agents actually face: understanding what an API does, how its types compose, and which documentation sections reference which symbols. .mint is purpose-built for that gap.

ANATOMY

Four layers of structured intelligence

Each layer is independently queryable. Agents request only what they need.

01

Symbol Graph

Every exported function, class, method, and type with full signatures, parameter types, and return types. Tree-sitter parsed, not regex matched.

02

Relationship Map

Call graphs, dependency chains, inheritance hierarchies, import/export edges. Not just what exists -- how symbols connect across module boundaries.

03

Doc-Symbol Bindings

Which markdown files reference which symbols, down to the section. When a signature changes, we know exactly which paragraphs are now stale.

04

Agent Context

CLAUDE.md, AGENTS.md, llms.txt -- derived from the symbol table, not hand-written. Section-addressable so agents can request specific API surfaces.
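The doc-symbol binding of layer 03 can be sketched as a scan that maps each symbol to the headings of the markdown sections that mention it. The function name and the heading-based section addressing here are illustrative assumptions, not the actual .mint implementation:

```python
# Illustrative doc-symbol binding: for each symbol, record which
# markdown section headings contain a reference to it.
import re

def bind_symbols(markdown: str, symbols: set[str]) -> dict[str, set[str]]:
    """Map each symbol to the headings of sections that mention it."""
    bindings = {s: set() for s in symbols}
    heading = "(top)"
    for line in markdown.splitlines():
        if line.startswith("#"):
            heading = line.lstrip("# ").strip()
        for s in symbols:
            # Word-boundary match so "execute" doesn't hit "executor".
            if re.search(rf"\b{re.escape(s)}\b", line):
                bindings[s].add(heading)
    return bindings

doc = "# Usage\nCall execute() to run.\n# Config\nparse_config reads TOML.\n"
b = bind_symbols(doc, {"execute", "parse_config"})
assert b["execute"] == {"Usage"}
```

With bindings like these, a changed signature for `execute` immediately identifies the "Usage" section as potentially stale.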

THE EVIDENCE

Why AST-grounded context beats prose

Research on repository-level code generation shows a 17% improvement in task completion when models receive structured type context versus natural language descriptions of the same API (RepoBench, 2024). The reason is straightforward: prose is ambiguous, types are not.

When an agent reads “this function takes a config object,” it has to guess the shape. When it reads fn(config: AppConfig) → Result<Server>, there is no ambiguity. .mint gives agents the latter for every symbol in your codebase.

README: written once, forgotten

READMEs describe intent at a point in time. They cannot keep pace with refactors, renamed parameters, or changed return types. Most are outdated within weeks of the last major change.

CLAUDE.md: manual maintenance

CLAUDE.md requires a human to update it after every API change. In practice, this means it drifts silently. The agent trusts it. The agent hallucinates.

.mint: verified on every push

Every symbol in a .mint file is SHA-256 hashed against the actual AST node. If the hash does not match, we flag it. Drift is detected in <2ms, not discovered by a user filing a bug.
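The hash-then-compare idea can be sketched with Python's standard library, hashing a canonical dump of a symbol's AST node so that formatting and comments don't trigger false drift. `symbol_hash` and the canonicalization choice are assumptions for illustration, not the documint-mcp internals:

```python
# Sketch of per-symbol drift detection: SHA-256 over a canonical dump
# of the symbol's AST node. Comment/whitespace edits leave the hash
# unchanged; structural edits change it.
import ast
import hashlib

def symbol_hash(source: str, name: str) -> str:
    """Hash the AST node of one top-level function, ignoring formatting."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            canonical = ast.dump(node, include_attributes=False)
            return hashlib.sha256(canonical.encode()).hexdigest()
    raise KeyError(name)

v1 = "def execute(input, retry=3):\n    return run(input, retry)\n"
v2 = "def execute(input, retry=3):  # comment only\n    return run(input, retry)\n"
v3 = "def execute(input, retry=5):\n    return run(input, retry)\n"

assert symbol_hash(v1, "execute") == symbol_hash(v2, "execute")  # no drift
assert symbol_hash(v1, "execute") != symbol_hash(v3, "execute")  # drift flagged
```

Checking one stored hash against one recomputed hash is a constant-time comparison per symbol, which is what makes sub-millisecond per-symbol checks plausible.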

PIPELINE

From source to knowledge graph in 0.20s

Tree-sitter parses your source files into concrete syntax trees. We walk the CST, extract exported symbols with full type annotations, trace cross-module relationships, and serialize everything into a single portable .mint artifact.

Source ──→ Tree-sitter CST ──→ Symbol Extract ──→ .mint

01

Parse

Tree-sitter produces a concrete syntax tree per file. Incremental reparsing means only changed files are re-processed on subsequent runs.

02

Extract

We walk exported nodes and capture function signatures, class hierarchies, type aliases, constants, and interfaces -- with full parameter and return type annotations.

03

Relate

Import/export edges, call sites, type dependencies, and inheritance chains are traced across module boundaries to build the relationship graph.

04

Serialize

The graph, signatures, and doc-symbol bindings are serialized into a single .mint file. SHA-256 hashes per symbol enable O(1) drift detection.
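The Extract step (02) can be sketched with Python's built-in ast module standing in for tree-sitter; the real pipeline is multi-language and CST-based, so this single-language sketch is an illustration only:

```python
# Sketch of signature extraction: walk top-level function definitions
# and render "name(params) -> return" strings, keeping annotations.
# Python's ast module stands in for tree-sitter here.
import ast

def extract_signatures(source: str) -> list[str]:
    """Collect 'name(params) -> return' strings for top-level functions."""
    out = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            params = ", ".join(
                f"{a.arg}: {ast.unparse(a.annotation)}" if a.annotation else a.arg
                for a in node.args.args
            )
            ret = ast.unparse(node.returns) if node.returns else "None"
            out.append(f"{node.name}({params}) -> {ret}")
    return out

src = """
def validate(schema: Schema, data: bytes) -> bool:
    ...
"""
print(extract_signatures(src))  # ['validate(schema: Schema, data: bytes) -> bool']
```

Serializing these records alongside per-symbol hashes is what turns a one-off parse into a verifiable artifact.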

BENCHMARKS

Measured on real codebases

313
Symbols extracted (avg project)
0.20s
Full extraction time
<2ms
Per-symbol drift check
17%
Task completion gain (structured vs prose)
COMPARISON

.mint vs the alternatives

README, CLAUDE.md, and llms.txt were designed for humans or as ad-hoc agent context. .mint is the first format designed specifically for machine consumption with verified accuracy.

.mint · README · CLAUDE.md · llms.txt
AST-grounded symbol graph
Full type signatures + params
Cross-symbol relationship map
Auto-generated from source
SHA-256 verified against AST
Drift detection on push
Section-addressable by agents
Human-readable
AI agent context
CI integration

Give your agents structured context

Generate a .mint file from your codebase. Every symbol, typed and verified. Every relationship, traced and queryable. Your agents deserve better than stale prose.

INSTALL
> pip install documint-mcp