Document Understanding Cooperating with OCR

A Document Understanding System Incorporating Character

Yasuaki Nakano, Hiromichi Fujisawa, Osamu Kunisaki, Kunihiro Okada and Toshihiro Hananoi
Proceedings of the 8th ICPR, pp.801-803 (1986)

Abstract

A document understanding system that combines document structure analysis with character recognition has been developed and applied to an Optical Character Reader (OCR) system to generate format data automatically. In the registration step, a sample document with a table-type layout is scanned by the system. The OCR extracts the rectangular fields from the document, recognizes the characters in the fields, and then identifies their label names. The OCR generates the format data of each field from the relations between fields and label names by consulting a knowledge base. In the recognition step, the system recognizes handwritten Chinese characters (Kanji) using the generated format data.
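
The registration step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the Field record, the KNOWLEDGE_BASE table, its label names and character categories, and the rule that an entry field lies directly to the right of or below its label are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A rectangular field extracted from the scanned form (hypothetical record)."""
    x: int
    y: int
    w: int
    h: int
    text: str = ""  # recognized printed text; empty for a blank entry field


# Hypothetical knowledge base: label name -> (where its entry field lies,
# which character category is expected there). Not taken from the paper.
KNOWLEDGE_BASE = {
    "Name": ("right", "kanji"),
    "Address": ("right", "kanji"),
    "Date": ("below", "numeric"),
}


def neighbor(label, fields, direction):
    """Return the nearest blank field directly right of or below the label."""
    best = None
    for f in fields:
        if f.text:  # skip fields that already contain printed text
            continue
        if direction == "right":
            aligned = f.y == label.y and f.x >= label.x + label.w
            dist = f.x - (label.x + label.w)
        else:
            aligned = f.x == label.x and f.y >= label.y + label.h
            dist = f.y - (label.y + label.h)
        if aligned and (best is None or dist < best[0]):
            best = (dist, f)
    return best[1] if best else None


def generate_format_data(fields):
    """Pair each known label with its entry field and expected character category."""
    format_data = []
    for label in fields:
        rule = KNOWLEDGE_BASE.get(label.text)
        if rule is None:  # blank field or unknown label
            continue
        direction, category = rule
        target = neighbor(label, fields, direction)
        if target is not None:
            format_data.append({
                "label": label.text,
                "box": (target.x, target.y, target.w, target.h),
                "category": category,
            })
    return format_data
```

In this sketch, the recognition step would read each "box" region and restrict the recognizer to the listed "category"; how the actual system encodes its format data is not stated in the abstract.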


mail address: ← sorry for the trouble, but please type it in by hand

First Written Before June 17, 1998
Transplanted to KSU Before June 19, 2003
Transplanted to So-net April 22, 2007
Last Update April 22, 2007