Abstract
Semi-structured data records contained in Web pages provide useful information for shopping agents and metasearch engines. In this paper, we present a visual segmentation-based data record extraction (VSDR) method to extract data records from such Web pages. The VSDR method first segments a Web page into semantic blocks using the spatial closeness and visual resemblance of data records; neighboring and non-neighboring data records are then extracted using a compress-and-collapse technique. Experimental results show that, unlike existing methods, which generate good results only on their own test domains, VSDR is a general data record extraction method that produces stable, good results on a wide range of Web pages.
| Original language | English |
|---|---|
| Title | International Conference on Information Reuse and Integration |
| Pages | 502-507 |
| Number of pages | 6 |
| ISBN (Electronic) | 1-4244-1500-4 |
| Digital Object Identifiers (DOIs) | |
| Status | Published - 13 August 2007 |
| Event | International Conference on Information Reuse and Integration - Las Vegas, United Kingdom of Great Britain and Northern Ireland. Duration: 13 August 2007 → 15 August 2007 |
Conference
| Conference | International Conference on Information Reuse and Integration |
|---|---|
| Country/Territory | United Kingdom of Great Britain and Northern Ireland |
| City | Las Vegas |
| Period | 13 August 2007 → 15 August 2007 |
Fingerprint
View information on the research topics of 'Visual Segmentation-Based Data Record Extraction From Web Documents'. Together, they form a unique fingerprint.