Channel: Maia Atlantis: Ancient World Blogs

From my diary


I pulled up the OCR project for the Book of Asaph the Physician in FineReader 11 this lunchtime. It's a 6th-century Jewish medical text, which apparently contains interesting quotes from classical writers.

Readers may remember (I can hardly remember myself) that I was experimenting with deskewing the pages, increasing the brightness, etc., in order to improve the OCR.

Pretty much the last thing that I did was to open the PDF and import it into FR11, without doing any of that preparatory work. I ran the OCR anyway, just to see what the raw result would look like.

The raw result is certainly better than some of the rubbish that I have had to clean up in the past. But correcting it would be far from simple. I think deskewing etc. would be the answer. However, there are 250 pages to do, one at a time. It might be a gentle task to do some time.

