OCR Models

OCR (Optical Character Recognition) models convert documents such as PDFs, scans, and images into clean, structured text. These vision-language models process visual content end-to-end, making them ideal for document digitization, data extraction, and content accessibility workflows.

  • LightOnOCR-2-1B: A compact 1B-parameter end-to-end multilingual vision-language model by LightOn. Converts documents (PDFs, scans, images) into clean, naturally ordered text with strong multilingual, LaTeX, and scan coverage.

Last updated

Was this helpful?