INSIGHT

Why we distill 30B into 2B instead of serving the big model

Cost, latency, and where the data lives: the case for small, on-device models, drawn from our WigtnOCR work.

2026.05.22 · 5 min read · WIGTN Research

Right-sized beats biggest

For most enterprise problems, the right-sized model that runs where the data lives beats the biggest model behind an API. WigtnOCR is our proof: a 2B student matched its 30B teacher on the benchmark that mattered.

Cost: predictable spend, with no per-token billing surprises as volume grows
Latency: inference runs on a single consumer GPU, with no network round-trip
Privacy: sensitive government documents never leave the building

The question is rarely “is this model smart enough?” It is “is this the smallest model that is smart enough?”
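
As a rough illustration of that question, model selection can be framed as "filter by the quality bar, then take the smallest survivor." The sketch below is a minimal example under stated assumptions: the `Candidate` type, the `smallest_sufficient` helper, the model names, and every score are hypothetical placeholders, not WigtnOCR measurements or our actual evaluation tooling.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical model label
    params_b: float  # parameter count, in billions
    score: float     # score on the benchmark that matters (higher is better)


def smallest_sufficient(
    candidates: Sequence[Candidate], min_score: float
) -> Optional[Candidate]:
    """Return the smallest candidate whose score meets the bar, or None."""
    passing = [c for c in candidates if c.score >= min_score]
    return min(passing, key=lambda c: c.params_b, default=None)


# Placeholder numbers for illustration only; these are NOT measured results.
candidates = [
    Candidate("teacher-30b", 30.0, 0.90),
    Candidate("student-2b", 2.0, 0.90),
    Candidate("tiny-0.5b", 0.5, 0.70),
]

choice = smallest_sufficient(candidates, min_score=0.88)
print(choice.name if choice else "no candidate meets the bar")
```

The quality bar comes first and size is only the tie-breaker among models that clear it, so a smaller model never wins by being cheaper alone.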

