Segmentation Methods for Character Recognition

Hiromichi Fujisawa, Yasuaki Nakano and Kiyomichi Kurino

Proceedings of the IEEE, Vol.80 [7], pp.1079-1092 (1992)

This paper discusses character segmentation methods, a key technology for character recognition that determines the usability and applicability of optical character readers. A pattern-oriented segmentation method that leads to document structure analysis is presented. A first example of advanced character segmentation is touching handwritten numeral segmentation. Connected pattern components are extracted instead of a pixel image, and spatial interrelations between components are measured to group them into meaningful character patterns. Stroke shapes are analyzed in the case of touching characters. A method of finding the touching positions can separate about 95% of connected numerals correctly. Ambiguities are handled by multiple hypotheses and verification by recognition. An extended form of pattern-oriented segmentation is also discussed by presenting another example of tabular form recognition. Document images of tabular forms are analyzed, and frames in the tabular structure can be extracted. By identifying semantic relationships between label frames and data frames, information on the form can be properly recognized. Advanced character segmentation with a document structure analysis capability is becoming increasingly significant in automating information extraction from various kinds of documents.
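As a rough illustration of the pattern-oriented idea in the abstract (extract connected components, then group them by their spatial relations), the following Python sketch may help. It is not the method of the paper: the function names, the 8-connectivity choice, and the grouping rule (merge components whose column ranges overlap) are assumptions made only for this example, and the stroke-shape analysis used for touching numerals is not covered.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the
    8-connected foreground regions of a binary image (lists of 0/1)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_by_horizontal_overlap(boxes):
    """Hypothetical grouping rule: merge boxes whose column ranges
    overlap, and return the merged boxes ordered left to right."""
    groups = []
    for top, left, bottom, right in sorted(boxes, key=lambda b: b[1]):
        if groups and left <= groups[-1][3]:
            g = groups[-1]
            groups[-1] = (min(g[0], top), g[1],
                          max(g[2], bottom), max(g[3], right))
        else:
            groups.append((top, left, bottom, right))
    return groups
```

Calling `group_by_horizontal_overlap(connected_components(image))` on a small binary image yields one candidate box per character-like group. A fuller system in the spirit of the abstract would keep several alternative groupings as hypotheses and let a recognizer verify them; that step is omitted here.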



First Written Before June 16, 1998
Transplanted to KSU Before May 15, 2003
Transplanted to So-net May 3, 2005
Last Update April 8, 2007

© Yasuaki Nakano 1998-2007