Question 1

Is this Sinhala OCR free?

Accepted Answer

Yes. It is completely free, with no daily limits and no signup. It runs in your browser, so we never see what you OCR.

Question 2

How accurate is it?

Accepted Answer

Excellent on clean printed Sinhala (books, articles, signs), good on photographed pages with reasonable lighting, and modest on handwriting or very low-resolution sources. The cleaner the source, the better the output. Mixed Sinhala and English pages also work, because Tesseract handles both scripts.

Question 3

Why does the first run take a while?

Accepted Answer

The first OCR downloads the Sinhala language model (sin.traineddata, ≈10 MB) into your browser's IndexedDB. After that, the model is cached and every subsequent OCR starts in milliseconds. The download only happens once per browser.
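The download-once behaviour can be pictured as a cache-first lookup. The sketch below is only an illustration of that pattern, not the tool's actual code: the ModelStore interface, the MemoryStore class, and the loadModel function are hypothetical names, and an in-memory Map stands in for IndexedDB so the example runs outside a browser.

```typescript
// Hypothetical storage interface; in a browser this role would be played by IndexedDB.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

// In-memory stand-in so the sketch runs anywhere (it does not persist across reloads).
class MemoryStore implements ModelStore {
  private readonly items = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items.get(key);
  }

  async set(key: string, value: Uint8Array): Promise<void> {
    this.items.set(key, value);
  }
}

interface LoadedModel {
  bytes: Uint8Array;
  fromCache: boolean;
}

// Return the cached model if present; otherwise download it once and store it.
async function loadModel(
  store: ModelStore,
  key: string,
  download: () => Promise<Uint8Array>,
): Promise<LoadedModel> {
  const cached = await store.get(key);
  if (cached !== undefined) {
    return { bytes: cached, fromCache: true };
  }
  const bytes = await download();
  await store.set(key, bytes);
  return { bytes, fromCache: false };
}
```

Called twice with the same key, loadModel runs the download callback only on the first call; the second call is served from the store, which is why later runs start quickly.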

Question 4

Can I edit and format the extracted text?

Accepted Answer

Yes. Click 'Open in editor →' after OCR and the text opens in a fresh AkuruLiyo document, where you can fix glyph errors, add headings, lists, formatting, and images, and export to PDF, Word, or Markdown.

Question 5

Can I OCR a PDF?

Accepted Answer

Not directly. v1 accepts only image files (JPEG, PNG, WebP). For a PDF, export or screenshot each page as an image and OCR the images one by one. Bulk PDF OCR is a candidate for v2 once we add a backend surface.
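As a rough illustration of the file-type rule above, an upload check might look like the following. This is a sketch under assumptions, not the tool's real validation: checkUpload and SUPPORTED_TYPES are hypothetical names, and the check relies on the standard MIME types for JPEG, PNG, WebP, and PDF.

```typescript
// Standard MIME types for the three image formats the answer lists.
const SUPPORTED_TYPES: ReadonlySet<string> = new Set([
  "image/jpeg",
  "image/png",
  "image/webp",
]);

interface UploadCheck {
  ok: boolean;
  message: string;
}

// Hypothetical helper: decide whether a file's MIME type can be sent to OCR.
function checkUpload(mimeType: string): UploadCheck {
  const type = mimeType.trim().toLowerCase();
  if (SUPPORTED_TYPES.has(type)) {
    return { ok: true, message: "" };
  }
  if (type === "application/pdf") {
    // PDFs get a specific hint that points to the page-by-page workaround.
    return {
      ok: false,
      message: "PDFs are not supported yet; export each page as an image first.",
    };
  }
  return { ok: false, message: "Unsupported file type. Use JPEG, PNG, or WebP." };
}
```

In a browser, the argument would typically come from the type property of the selected File object.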

Sinhala OCR — Image to Sinhala Text
