WigtnOCR: a 2B parser that reads Korean gov forms like a model 15× its size
A 2B-parameter document parser distilled from a 30B teacher, ranked #1 on KoGovDoc-Bench, a Korean government-document benchmark, while running on a single consumer GPU.

The problem
Existing OCR engines and rule-based parsers break down on Korean government documents: they miss tables, forms, and complex layouts. State-of-the-art VLM parsers are tuned for English and Chinese, and 30B-parameter models are too expensive to deploy in production.
The approach
WigtnOCR distills a 30B teacher into a 2B student through pseudo-label distillation and LoRA fine-tuning on 2,667 Korean government document pages — reaching teacher-level accuracy on OmniDocBench while running on a single consumer GPU.
- Student: Qwen3-VL-2B-Instruct · Teacher: Qwen3-VL-30B-Instruct
- LoRA rank=8 fine-tuning, trained in 31 minutes with DeepSpeed ZeRO-2
- Ranked #1 overall among 6 parsers on KoGovDoc-Bench, ahead of models 10–30× larger
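To give a feel for why a rank-8 LoRA run can be so short, here is a rough arithmetic sketch of how many trainable parameters one adapter adds to a single weight matrix. The 2048×2048 layer shape is a hypothetical value chosen for illustration, not the student model's actual dimensions, and the helper name is ours.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight.

    LoRA learns two low-rank factors: A with shape (rank x d_in)
    and B with shape (d_out x rank). The original weight stays frozen.
    """
    return rank * (d_in + d_out)


# Hypothetical layer shape, for illustration only.
D_IN = 2048
D_OUT = 2048
RANK = 8  # the rank quoted in the list above

added = lora_trainable_params(D_IN, D_OUT, RANK)
full = D_IN * D_OUT  # size of the frozen weight matrix

print(f"{added} adapter params vs {full} frozen params ({added / full:.2%})")
# prints: 32768 adapter params vs 4194304 frozen params (0.78%)
```

Under these assumed dimensions, the adapter trains well under 1% of the parameters in the layer it wraps, which is the general reason LoRA fine-tuning is cheap compared with updating every weight.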
Teacher-level accuracy at 1/15th the size, on hardware you already own — that is the whole point.
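The pseudo-label step in this approach can be sketched roughly as follows: the teacher parses each unlabeled page, and its output becomes the training target for the student. This is a minimal illustration, not the released pipeline. `TrainingExample`, `build_pseudo_labels`, `fake_teacher`, and the empty-output filter are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class TrainingExample:
    image_path: str  # page image the student will see
    target: str      # teacher output, used as the training label


def build_pseudo_labels(
    image_paths: Iterable[str],
    teacher_parse: Callable[[str], str],
    min_chars: int = 1,
) -> List[TrainingExample]:
    """Run the teacher on each unlabeled page and keep non-trivial outputs."""
    examples: List[TrainingExample] = []
    for path in image_paths:
        text = teacher_parse(path).strip()
        # Assumed quality filter: skip pages where the teacher returned
        # (almost) nothing. The source does not describe its filtering.
        if len(text) >= min_chars:
            examples.append(TrainingExample(image_path=path, target=text))
    return examples


def fake_teacher(path: str) -> str:
    """Stand-in for the 30B teacher so the sketch runs without a model."""
    if path.endswith("blank.png"):
        return ""
    return f"parsed text for {path}"


dataset = build_pseudo_labels(
    ["page_001.png", "blank.png", "page_002.png"], fake_teacher
)
print(len(dataset))  # prints: 2
```

In a real run, `teacher_parse` would wrap a call to the teacher model, and the resulting examples would feed the LoRA fine-tuning of the student.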
Open
Model weights, training data, and evaluation code are all released on Hugging Face and GitHub.