The ART Corpus was developed primarily to add value to scientific papers through semantic markup, making it easier for natural language processing and semantic web applications to automatically extract information about core scientific concepts. The corpus can also serve as a training set for machine learning algorithms that automate the annotation of papers with CISP metadata.
Liakata, M. (Creator), Soldatova, L. N. (Creator) (29 Apr 2009). The ART Corpus. Prifysgol Aberystwyth | Aberystwyth University. Readme(.txt), Description_ART_Corpus(.pdf), ART_Corpus(ar.gz). 10.20391/f1c7c532-61fb-4914-8845-33463ff13105