They train on your code.
A complete coding agent that executes entirely on your machine. No API calls. No usage caps.
Data Extraction
001 · Every prompt. Every file. Every fix. It flows through infrastructure you don't control, improving systems they want to use to replace you.
Artificial Scarcity
002 · Slowdowns, overages, caps. Right when you're deep in a sprint, the meter decides you've had enough.
Silent Downgrades
003 · They silently downgrade you to cheaper models during peak load. Full price, degraded experience.
Cloud Dependency
004 · Every completion makes a round trip across the internet. Thousands of tiny interruptions, every single day.
A complete AI coding agent running entirely on your own hardware. No usage limits. No cloud dependency.
[Diagram: Your Machine · Your Code (keystrokes · files) → RIG → Local Inference → Response · <300ms · On Device · RIG Model Active]
Flights. Spotty Wi-Fi. Network outages. Nothing stops your flow.
Refactor the whole codebase. Riff on an idea all day. Run agent loops without thinking about cost.
Your code, keystrokes, and files never leave your machine. Not anonymized. Not aggregated. Not sent.
No round-trip to a data center. Inference happens on your machine, in single-digit milliseconds.
Rig is a closed system: model, context, tools, and inference, engineered together for one job. Real coding work.
Step 01
Step 02
The model is compressed to run efficiently on consumer machines, carefully preserving the reasoning patterns that matter most.
Step 03
The result is an 8 GB model that fits comfortably in memory on a MacBook. Full reasoning. Local execution. Zero cost per token.
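For intuition, the arithmetic behind a footprint like that is simple. Here is a back-of-the-envelope sketch in Rust; the parameter count and 4-bit weight format are illustrative assumptions, not published specs:

```rust
// Back-of-the-envelope weight memory for a compressed model.
fn quantized_weight_gb(params: f64, bits_per_weight: f64) -> f64 {
    params * bits_per_weight / 8.0 / 1e9
}

fn main() {
    // ASSUMPTION: ~14B parameters at 4 bits per weight is a guess, not
    // a spec. Weights alone land near 7 GB; embeddings, KV cache, and
    // runtime overhead bring the total near the 8 GB quoted above,
    // which is why it fits in 16 GB of unified memory with room to spare.
    let gb = quantized_weight_gb(14e9, 4.0);
    println!("weights alone: {gb:.1} GB"); // ~7.0 GB
}
```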
Model Size
Model size (memory required): 8 GB · Fits in 16 GB unified memory · Accuracy loss: <0.3%
[ 01 ]
Builds a connected model of modules, dependencies, and relationships so reasoning happens across files and aligns with your architecture (first sketch after this list).
[ 02 ]
Edits that respect function contracts, type boundaries, and dependency graphs, reducing bugs and regressions.
[ 03 ]
Explore → Plan → Execute workflows ensure multiple steps are reasoned out before changes occur (second sketch below).
[ 04 ]
From refactors to test generation to feature builds, it coordinates tools, code edits, web search, and commands as needed.
[ 05 ]
Each agent runs in its own workspace so experiments are safe, parallel workflows don't clash, and code changes stay isolated until you merge them (third sketch below).
[ 06 ]
Custom Rust inference engine optimized for CUDA and Metal, delivering up to 144 tokens per second on consumer hardware.
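A minimal sketch of the context-graph idea in [ 01 ], assuming a plain adjacency-map design; the types and names here are illustrative, not Rig's internals:

```rust
use std::collections::{HashMap, HashSet};

/// Illustrative context graph: symbols as nodes, dependencies as edges.
#[derive(Default)]
struct ContextGraph {
    /// symbol -> set of symbols it depends on
    deps: HashMap<String, HashSet<String>>,
}

impl ContextGraph {
    fn add_dep(&mut self, from: &str, to: &str) {
        self.deps.entry(from.into()).or_default().insert(to.into());
    }

    /// Everything transitively reachable from `start`: the slice of the
    /// repo worth reasoning over when editing `start`.
    fn reachable(&self, start: &str) -> HashSet<String> {
        let mut seen = HashSet::new();
        let mut stack = vec![start.to_string()];
        while let Some(sym) = stack.pop() {
            if seen.insert(sym.clone()) {
                if let Some(next) = self.deps.get(&sym) {
                    stack.extend(next.iter().cloned());
                }
            }
        }
        seen
    }
}

fn main() {
    let mut g = ContextGraph::default();
    g.add_dep("api::handler", "db::query");
    g.add_dep("db::query", "db::pool");
    // Editing the handler pulls its whole dependency slice into context.
    println!("{:?}", g.reachable("api::handler"));
}
```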
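The Explore → Plan → Execute loop in [ 03 ] can be pictured as a small state machine. This is a sketch of the shape of such a loop, not Rig's actual agent code; the phase names and steps are assumptions:

```rust
/// Illustrative agent phases: no edit happens before a full plan exists.
#[derive(Debug)]
enum Phase {
    Explore,           // read code, gather context
    Plan(Vec<String>), // complete ordered plan, written before any edit
    Execute(Vec<String>),
    Done,
}

fn advance(phase: Phase) -> Phase {
    match phase {
        // Exploration produces a complete plan before anything changes.
        Phase::Explore => Phase::Plan(vec![
            "extract helper from duplicated block".into(),
            "update call sites".into(),
            "run tests".into(),
        ]),
        // Only a finished plan unlocks execution.
        Phase::Plan(steps) => Phase::Execute(steps),
        Phase::Execute(mut steps) => {
            if steps.is_empty() {
                Phase::Done
            } else {
                let step = steps.remove(0);
                println!("applying: {step}");
                Phase::Execute(steps)
            }
        }
        Phase::Done => Phase::Done,
    }
}

fn main() {
    let mut phase = Phase::Explore;
    loop {
        phase = advance(phase);
        if matches!(phase, Phase::Done) {
            break;
        }
    }
}
```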
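And one standard way to get the per-agent isolation described in [ 05 ] is git worktrees: each agent gets its own checkout on its own branch. A hedged sketch, since Rig's actual mechanism isn't documented here:

```rust
use std::io::{Error, ErrorKind};
use std::process::Command;

/// Give an agent an isolated working tree on its own branch, so parallel
/// experiments never touch your main checkout and nothing lands until
/// you review and merge the branch. (Illustrative; run inside a git repo.)
fn create_workspace(agent_id: u32) -> std::io::Result<()> {
    let branch = format!("agent-{agent_id}");
    let path = format!("../workspaces/{branch}");
    let status = Command::new("git")
        .args(["worktree", "add", "-b", branch.as_str(), path.as_str()])
        .status()?;
    if !status.success() {
        return Err(Error::new(ErrorKind::Other, "git worktree add failed"));
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Two agents, two isolated checkouts; merge their branches when ready.
    create_workspace(1)?;
    create_workspace(2)
}
```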
Latency
0ms
No round-trip required
Privacy
100%
Air-gapped by design
Cost / Token
$0
Your GPU, your tokens
Uptime
Local
No dependency on cloud
Custom Model
Optimized for consumer hardware
Inference
Cross-OS using Rust
Context Graph
Repo-wide code understanding
Terminal UI
Built in Rust and blazing fast
Heavily Tuned
Consistent tool calls and plan use
Opinionated
Focused on code correctness
rig://localhost · offline
λ rig init
RIG
> Scanning hardware...
> Found M4 · 16GB RAM
> Loading RIG Model OK
> Indexing 2,418 files · 87,102 symbols
✓ Ready. Network OFF · Telemetry OFF
λ find the memory
We're inviting engineers to run it on real code and help shape what ships.