Abstract
Text is important for video retrieval, indexing, and understanding. However, detecting and extracting it is challenging because of varying backgrounds, low contrast between text and non-text regions, and perspective distortion. In this paper, we propose a novel two-phase approach that tackles this problem using discriminative features and edge density. The first phase defines and extracts a novel feature called edge distribution entropy, then uses this feature to remove most non-text regions. The second phase employs a support vector machine (SVM) to further distinguish real text regions from non-text ones. To generate inputs for the SVM, three additional novel features are defined and extracted from each region: foreground pixel distribution entropy, skeleton/size ratio, and edge density. After text regions have been detected, text is extracted from those regions that are surrounded by sufficient edge pixels. A comparative study using two publicly accessible datasets shows that the proposed method significantly outperforms four selected state-of-the-art methods in text detection and extraction accuracy.
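The abstract names edge density and edge distribution entropy but does not define them. The sketch below shows one plausible reading, for illustration only: edge density as the fraction of edge pixels in a region, and edge distribution entropy as the Shannon entropy of edge-pixel counts over a grid of blocks. The function names, the `blocks` parameter, and the block-grid formulation are assumptions and may differ from the definitions used in the paper.

```python
import math


def edge_density(edge_map):
    """Fraction of pixels marked as edges in a binary edge map.

    edge_map is a list of rows, each a list of 0/1 values.
    """
    total = sum(len(row) for row in edge_map)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in edge_map) / total


def edge_distribution_entropy(edge_map, blocks=2):
    """Shannon entropy (in bits) of edge counts over a blocks x blocks grid.

    Assumed definition for illustration; not taken from the paper.
    """
    height = len(edge_map)
    width = len(edge_map[0]) if height else 0
    counts = [0] * (blocks * blocks)
    for y, row in enumerate(edge_map):
        for x, value in enumerate(row):
            if value:
                # Map the pixel to its grid cell.
                by = min(y * blocks // height, blocks - 1)
                bx = min(x * blocks // width, blocks - 1)
                counts[by * blocks + bx] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    # Sum only over non-empty cells, since 0 * log(0) is taken as 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c)
```

Under this reading, a first-phase filter could compare the entropy of each candidate region against a threshold; both the threshold and the grid size would be hypothetical tuning parameters here.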
Original language | English |
---|---|
Pages (from-to) | 322-328 |
Number of pages | 7 |
Journal | Chinese Journal of Electronics |
Volume | 23 |
Issue number | 2 |
Publication status | Published - 01 Jan 2014 |
Keywords
- Edge density
- Edge distribution entropy
- Foreground pixel distribution entropy
- Text detection
- Text extraction