Abstract
A robot-based procedure is described for traversing a collection of hyperlinked documents written in HTML and converting them to well-formed, XML-compliant XHTML. Transcluded chemical content invoked using the 〈embed〉 or 〈applet〉 HTML elements is converted to the XHTML-recommended 〈object〉 form. Additional attributes, such as a title or derived chemical attributes such as a SMILES descriptor, are added to improve the indexing of the resulting document collection. Conformance tests for the popular Web browsers are reported.
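
The kind of conversion described above can be illustrated with a minimal Python sketch. It is not the authors' robot. It assumes a simple attribute mapping (`src` to `data`, with `type`, `width`, and `height` carried over) and falls back to the file name for `title`. The names `XhtmlRewriter` and `to_xhtml`, and the sample file `caffeine.mol`, are hypothetical. Only 〈embed〉 is handled; 〈applet〉 conversion and SMILES derivation are left out.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that have no content and are written as <tag /> in XHTML.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlRewriter(HTMLParser):
    """Re-emit HTML as XHTML-style markup, turning <embed> into <object>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    @staticmethod
    def _attrs(attrs):
        # XHTML needs quoted values; bare attributes become name="name".
        return "".join(f" {k}={quoteattr(k if v is None else v)}" for k, v in attrs)

    def handle_starttag(self, tag, attrs):
        if tag == "embed":
            a = dict(attrs)
            # Assumed mapping: src -> data; type/width/height copied if present.
            obj = [("data", a.get("src") or "")]
            obj += [(k, a[k]) for k in ("type", "width", "height") if a.get(k)]
            # Assumed fallback: use the file name when no title is supplied.
            obj.append(("title", a.get("title") or a.get("src") or "embedded object"))
            self.parts.append(f"<object{self._attrs(obj)}></object>")
        elif tag in VOID:
            self.parts.append(f"<{tag}{self._attrs(attrs)} />")
        else:
            self.parts.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_endtag(self, tag):
        # <embed> and void elements were already closed when opened.
        if tag != "embed" and tag not in VOID:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data))


def to_xhtml(html):
    parser = XhtmlRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


source = ('<P>Caffeine: <EMBED SRC="caffeine.mol" '
          'TYPE="chemical/x-mdl-molfile" WIDTH=200 HEIGHT=200></P>')
result = to_xhtml(source)
ET.fromstring(result)  # raises ParseError if the fragment is not well-formed
print(result)
```

The sketch does not repair unbalanced tags, so genuinely malformed HTML would still need a tidying step before the well-formedness check passes.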
Original language | English |
---|---|
Pages (from-to) | 253-258 |
Number of pages | 6 |
Journal | Journal of Chemical Information and Computer Sciences |
Volume | 41 |
Issue number | 2 |
Early online date | 26 Mar 2001 |
Publication status | Published - Mar 2001 |