In view of the increasing need to facilitate the processing of scientific paper content, we present a scheme for annotating full papers with zones of conceptualisation, which reflect the information structure and knowledge types that constitute a scientific investigation. The latter are the Core Scientific Concepts (CoreSCs) and include Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. The CoreSC scheme has been used to annotate a corpus of 265 full papers in physical chemistry and biochemistry, and we are currently automating the recognition of CoreSCs in papers. We discuss how the CoreSC scheme relates to other views of scientific papers and how it could be used to help identify negation and speculation in scientific texts.
|Title of host publication||NeSp-NLP '10: Proceedings of the Workshop on Negation and Speculation in Natural Language Processing|
|Editors||Roser Morante, Caroline Sporleder|
|Publisher||Association for Computational Linguistics|
|Publication status||Published - 2010|
- text analysis
- text processing