MODEL · EMNLP 2026 (IN PREP)

WigtnOCR: a 2B parser that reads Korean gov forms like a model 15× its size

A 2B-parameter document parser distilled from a 30B teacher. It ranks #1 on KoGovDoc-Bench, a Korean government-document benchmark, while running on a single consumer GPU.

2026.05.20 · 12 min read · WIGTN Research

The problem

Existing OCR engines and rule-based parsers fail on Korean government documents: they miss tables, forms, and complex layouts. State-of-the-art VLM parsers are tuned for English and Chinese, and 30B-parameter models are too expensive to deploy in production.

The approach

WigtnOCR distills a 30B teacher into a 2B student: the teacher's outputs serve as pseudo-labels, and the student is LoRA fine-tuned on 2,667 Korean government document pages. The student reaches teacher-level accuracy on OmniDocBench while running on a single consumer GPU.
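To make the pseudo-label step concrete, here is a minimal sketch of how teacher outputs can be turned into training targets. It is an illustration under assumptions, not the released pipeline: `build_pseudo_labels`, `fake_teacher`, and the `non_empty` filter are hypothetical names, the quality filter itself is an assumption, and a real run would call the 30B teacher on page images instead of the stub used here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PseudoLabel:
    page_id: str  # identifier of the source page image
    target: str   # teacher output used as the training target


def build_pseudo_labels(
    page_ids: Iterable[str],
    teacher_parse: Callable[[str], str],
    keep: Callable[[str], bool],
) -> List[PseudoLabel]:
    """Run the teacher over unlabeled pages and keep outputs that pass `keep`."""
    labels = []
    for page_id in page_ids:
        output = teacher_parse(page_id)  # teacher prediction stands in for a human label
        if keep(output):                 # ASSUMPTION: some quality filter is applied
            labels.append(PseudoLabel(page_id, output))
    return labels


# Stand-in teacher (hypothetical): a real pipeline would query the 30B model.
def fake_teacher(page_id: str) -> str:
    return "" if page_id.endswith("blank") else f"# Form {page_id}\n| field | value |"


# Hypothetical filter: drop pages where the teacher produced nothing.
def non_empty(output: str) -> bool:
    return bool(output.strip())


labels = build_pseudo_labels(["p1", "p2-blank", "p3"], fake_teacher, non_empty)
print([label.page_id for label in labels])  # ['p1', 'p3']
```

The student is then fine-tuned on these (page, target) pairs as if they were ground truth, so any filtering applied at this stage directly shapes what the student learns.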

  • Student: Qwen3-VL-2B-Instruct · Teacher: Qwen3-VL-30B-Instruct
  • LoRA rank=8 fine-tuning, trained in 31 minutes with DeepSpeed ZeRO-2
  • #1 overall on KoGovDoc-Bench among 6 parsers, ahead of models 10–30× larger

Teacher-level accuracy at 1/15th the size, on hardware you already own: that is the whole point.
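A rough sense of why rank-8 LoRA trains quickly comes from counting parameters. The sketch below uses the standard LoRA formulation, where a frozen weight W of shape d_out × d_in is augmented with a product B·A of rank r, so only r·(d_in + d_out) values are trained per layer. The 2048 × 2048 layer size is an assumption chosen for illustration, not a figure from the model card, and which layers WigtnOCR adapts is not stated here.

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer.

    A has shape (rank x d_in) and B has shape (d_out x rank);
    the original weight stays frozen.
    """
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight itself."""
    return d_in * d_out


# ASSUMPTION: a 2048 x 2048 projection, used only as an example size.
d = 2048
rank = 8  # the rank stated in the article

added = lora_added_params(d, d, rank)
total = full_params(d, d)
print(added)           # 32768
print(total // added)  # 128
```

For a layer of this assumed size, the adapter trains 128 times fewer parameters than full fine-tuning of the same weight, which is consistent with a short training run on modest hardware.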

Open

Model weights, training data, and evaluation code are all released on Hugging Face and GitHub.

