INSIGHT
Why we distill 30B into 2B instead of serving the big model
Cost, latency, and where the data lives — the case for small, on-device models from our WigtnOCR work.
2026.05.22 · 5 min read · WIGTN Research
Right-sized beats biggest
For most enterprise problems, the right-sized model that runs where the data lives beats the biggest model behind an API. WigtnOCR is our proof: a 2B student matched its 30B teacher on the benchmark that mattered.
- Cost — predictable, no per-token surprises at scale
- Latency — single consumer GPU, no network round-trip
- Privacy — sensitive government documents never leave the building
The question is rarely “is this model smart enough?” It is “is this the smallest model that is smart enough?”