Ocular Historical Document Recognition System
Overview
Ocular is a state-of-the-art historical OCR system described in the following papers:
Improved Typesetting Models for Historical OCR [PDF]
Taylor Berg-Kirkpatrick and Dan Klein.
ACL 2014.
Unsupervised Transcription of Historical Documents [PDF]
Taylor Berg-Kirkpatrick, Greg Durrett, and Dan Klein.
ACL 2013.
Ocular can transcribe collections of documents printed in historical fonts. The system is unsupervised: you don't need document images labeled with human transcriptions in order to learn a particular historical font. Instead, Ocular learns the font directly from the set of input document images you want transcribed.