Tuesday, December 7, 2010

Reading #17: Distinguishing Text from Graphics in On-line Handwritten Ink (Bishop)

Comments
Francisco
Summary
The paper presents a system that separates text strokes from graphics strokes. Unlike the previous paper, which relied on entropy, this one proposes an HMM-based method to distinguish text from graphics.

Independent Stroke Model
Nine features are extracted to represent each stroke. An MLP model is trained for classification. The objective function is the cross-entropy error.
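
For reference, the usual two-class cross-entropy error is E = -sum_n [ t_n ln y_n + (1 - t_n) ln(1 - y_n) ], where t_n is the target label of stroke n and y_n is the network output. Below is a minimal sketch assuming this standard form; the paper's exact expression may differ, and the function name, the label convention (1 = text, 0 = graphics) and the clamping constant are my own choices for illustration.

```python
import math

def cross_entropy_error(targets, outputs, eps=1e-12):
    """Standard two-class cross-entropy summed over all strokes.

    targets: 0/1 labels (assumed convention: 1 = text, 0 = graphics)
    outputs: predicted probability that each stroke is text
    """
    total = 0.0
    for t, y in zip(targets, outputs):
        # Clamp the prediction so log(0) can never occur.
        y = min(max(y, eps), 1.0 - eps)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total
```

A confident correct prediction contributes almost nothing to the error, while a confident wrong prediction contributes a large amount, which is what pushes the MLP toward well-calibrated outputs.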

Hidden Markov Model
The paper first proposes a uni-partite HMM, consisting of a transition matrix and an emission probability distribution over stroke features. This HMM uses only temporal context.

[Figure: uni-partite HMM]
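
To make the structure concrete, here is the textbook factorization of a first-order HMM over stroke labels. The symbols are my own assumptions, not necessarily the paper's notation: t_n is the label (text or graphics) of the n-th stroke in temporal order, x_n is its feature vector, and N is the number of strokes.

```latex
p(x_1, \dots, x_N, t_1, \dots, t_N)
  = p(t_1) \prod_{n=2}^{N} p(t_n \mid t_{n-1}) \prod_{n=1}^{N} p(x_n \mid t_n)
```

The first product is the transition part and the second is the emission part; the independent stroke model corresponds to dropping the dependence of t_n on t_{n-1}.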

The paper then proposes a bi-partite HMM, which additionally models the gaps between strokes. The Viterbi algorithm is used to find the most probable sequence of labels.

[Figure: bi-partite HMM]
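
As a rough illustration of the decoding step, here is a small Viterbi sketch for a plain chain of stroke labels (the uni-partite case only; the gap observations of the bi-partite model are not modeled). The function name, argument layout and label indices (0 = text, 1 = graphics) are assumptions made for this example, not taken from the paper.

```python
import math

def viterbi(emission_probs, transition, initial):
    """Return the most probable label sequence for a chain of strokes.

    emission_probs[n][k]: likelihood of stroke n's features under label k
    transition[j][k]:     probability that label k follows label j
    initial[k]:           probability that the first stroke has label k
    """
    n_states = len(initial)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # score[k]: best log-probability of any label path ending in state k
    score = [log(initial[k]) + log(emission_probs[0][k]) for k in range(n_states)]
    backpointers = []

    for obs in emission_probs[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            # Best previous state for arriving in state k at this stroke.
            best_prev = max(range(n_states),
                            key=lambda j: score[j] + log(transition[j][k]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + log(transition[best_prev][k])
                             + log(obs[k]))
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With sticky transitions (for example 0.9 for staying in the same label), a stroke whose own features weakly favor graphics but which sits between two confidently text strokes is decoded as text. That is exactly the kind of correction temporal context provides over the independent model.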
Experiment and Result
The algorithm is tested on the Cambridge test set and the Redmond test set. Both HMM models outperform the independent stroke model. On the Redmond test set, however, the bi-partite HMM performs worse than the uni-partite HMM.
Discussion
The paper gives us an idea of how to distinguish text from graphics. The decision should depend not only on the features of each stroke but also on its context. Context does increase recognition accuracy here, and it seems very useful in recognition problems in general.

In the experimental results, we can see that almost half of the graphics strokes are still recognized as text. So although text is rarely recognized as graphics, the cost is that many graphics strokes are misclassified. I don't think it is easy to distinguish text from graphics. Sometimes it is hard even for people to tell them apart.

1 comment:

  1. I agree, this is not an easy problem. In fact, without context I think it is impossible to recognize certain shapes infallibly. An "O", for instance, is clearly a circle or an ellipse in another context. I also have trouble recognizing which button is "post" and which is "cancel" when they are in Chinese like here; fortunately I can use context to find out (post must be the one on the left).
