Sinhala OCR — Image to Sinhala Text
Sinhala OCR: turn the letters in an image into words.
Drop a Sinhala photo, scan, or screenshot. The text appears below in editable Sinhala Unicode. The OCR runs entirely in your browser — your image never leaves your device.
Drop a Sinhala image here, or click to pick a file.
JPEG, PNG, or WebP. Browser-local — your image never uploads.
What this can extract
- Printed Sinhala from books, magazines, and newspapers
- Sinhala signage and screenshots
- Scanned PDFs (one page at a time, with each page saved as an image)
- Mixed Sinhala + English documents
Handwritten Sinhala is hit-and-miss — Tesseract is trained on printed glyphs. For best results, use the highest-resolution version of the source you have.
How it works
AkuruLiyo runs Tesseract.js in your browser with the Sinhala-trained language model (sin.traineddata, maintained by the Tesseract open-source community). The first OCR triggers a one-time download of about 10 MB for the language model. Subsequent runs skip that download because the model is stored in your browser's IndexedDB cache.
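The flow described above can be sketched roughly as follows. This is a minimal sketch, not AkuruLiyo's actual source: `extractSinhalaText`, `OcrWorker`, and `CreateWorker` are hypothetical names, and the worker factory is passed in as a parameter so the snippet stays self-contained. The shapes are modelled on the `createWorker(langs)`, `worker.recognize(image)`, and `worker.terminate()` calls in Tesseract.js v5, on the assumption that this is the version in use.

```typescript
// Minimal shape of the worker this sketch relies on (hypothetical type names,
// modelled on the Tesseract.js v5 worker: recognize() and terminate()).
interface OcrWorker<Img> {
  recognize(image: Img): Promise<{ data: { text: string } }>;
  terminate(): Promise<unknown>;
}

// Factory signature modelled on Tesseract.js v5 `createWorker(langs)`.
type CreateWorker<Img> = (langs: string | string[]) => Promise<OcrWorker<Img>>;

// Run OCR on one image and return the recognized text.
// "sin" selects sin.traineddata; adding "eng" is an assumption made here
// for mixed Sinhala + English documents.
async function extractSinhalaText<Img>(
  createWorker: CreateWorker<Img>,
  image: Img,
  langs: string[] = ["sin", "eng"],
): Promise<string> {
  const worker = await createWorker(langs);
  try {
    const result = await worker.recognize(image);
    return result.data.text.trim();
  } finally {
    // Always release the worker, even if recognition throws.
    await worker.terminate();
  }
}
```

In a browser, the first argument would be the `createWorker` export of `tesseract.js` and the second the dropped file. Injecting the factory keeps the function easy to exercise with a fake worker.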
Nothing is uploaded. We don't have a backend for this — by design, the OCR pipeline never sees a server. The image you drop into the page stays in your tab; close the tab and it's gone.
Frequently asked questions
Is this Sinhala OCR free?
How accurate is it?
Why does the first run take a while?
The first run downloads the Sinhala language model (sin.traineddata, ≈10 MB) into your browser's IndexedDB. After that, the model is cached and every subsequent OCR starts in milliseconds. The download only happens once per browser.
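The download-once behaviour is a cache-first lookup. The sketch below illustrates that pattern only: Tesseract.js manages its own caching internally, and `ModelCache`, `MemoryCache`, and `loadModel` are hypothetical names, with an in-memory `Map` standing in for IndexedDB so the example runs outside a browser.

```typescript
// Hypothetical async key-value store; in a browser this role is played by IndexedDB.
interface ModelCache {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

// In-memory stand-in for IndexedDB so the sketch runs anywhere.
class MemoryCache implements ModelCache {
  private store = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.store.get(key);
  }

  async set(key: string, value: Uint8Array): Promise<void> {
    this.store.set(key, value);
  }
}

// Cache-first load: download only on a miss, then store the bytes for later runs.
async function loadModel(
  key: string,
  cache: ModelCache,
  download: () => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const cached = await cache.get(key);
  if (cached !== undefined) {
    return cached; // later runs: served from the cache, no network
  }
  const bytes = await download(); // first run: the one-time model download
  await cache.set(key, bytes);
  return bytes;
}
```

Calling `loadModel("sin.traineddata", cache, download)` twice with the same cache invokes `download` only on the first call.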