Abstract
SAPIENTA stands for “Semantic Annotation of Papers: Interface & ENrichment Tool Automated”. It incorporates a machine learning classifier, trained using Conditional Random Fields (CRF), that identifies Core Scientific Concepts (CoreSCs). The classifier has been evaluated on 265 chemistry and biochemistry papers, yielding an average accuracy of more than 50% across the 11 CoreSCs. The automatically generated concepts have been used to produce automatic summaries, which chemistry experts evaluated in a question answering task, yielding a precision of 75% and a recall of 66%. SAPIENTA also allows multi-label annotation at the sentence level and has been used by three biology experts to annotate 50 biology papers from PubMed Central that are relevant to Cancer Risk Assessment (CRA).
Original language | English |
---|---|
Media of output | Online |
Publication status | Published - 2011 |