Abstract
Text representation is a key task for many natural language processing applications such as document classification, ranking, and sentiment analysis. Its goal is to represent unstructured text documents numerically so that they can be processed mathematically. Most existing methods leverage the power of deep learning to produce a representation of text. However, these models do not account for the fact that text itself is usually semantically ambiguous and conveys limited information. For this reason, it is necessary to draw on an external knowledge base to better understand text.
In this paper, we propose a novel framework named Text Concept Vector, which leverages both a neural network and a knowledge base to produce a high-quality representation of text. Formally, a raw text is first conceptualized and represented by a set of concepts through a large taxonomy knowledge base. A neural network then transforms the conceptualized text into a vector that encodes both the semantic information and the concept information of the original text. We test our framework on both sentence-level and document-level tasks. The experimental results illustrate the effectiveness of our work.
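As a rough illustration of the two-stage idea described above (conceptualize a text against a taxonomy, then encode the result with a neural layer), the following is a minimal sketch and not the framework's actual implementation. The toy taxonomy, its weights, the names `conceptualize`, `make_layer`, and `encode`, and the single untrained tanh layer are all assumptions made for illustration. The sketch covers only the concept side of the representation and leaves out the word-level semantic input.

```python
import math
import random

# Hypothetical toy taxonomy: term -> {concept: weight}.
# A real system would query a large is-a knowledge base instead.
TOY_TAXONOMY = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "banana": {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}
CONCEPTS = sorted({c for cs in TOY_TAXONOMY.values() for c in cs})
DIM = 4  # size of the output vector (arbitrary for this sketch)


def conceptualize(text):
    """Map a raw text to a normalized distribution over concepts."""
    scores = {c: 0.0 for c in CONCEPTS}
    for token in text.lower().split():
        for concept, weight in TOY_TAXONOMY.get(token, {}).items():
            scores[concept] += weight
    total = sum(scores.values())
    if total > 0:
        scores = {c: s / total for c, s in scores.items()}
    return scores


def make_layer(n_in, n_out, seed=0):
    """Create randomly initialized (untrained) dense layer weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def encode(text, layer):
    """Pass the concept distribution through one tanh layer to get a vector."""
    scores = conceptualize(text)
    x = [scores[c] for c in CONCEPTS]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in layer]


layer = make_layer(len(CONCEPTS), DIM)
vec = encode("apple banana", layer)
```

Here the ambiguous term "apple" is pulled toward the "fruit" concept by its context word "banana", which is the kind of disambiguation an external knowledge base is meant to supply. In practice the layer weights would be learned on a downstream task rather than left random.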
Original language | English |
---|---|
Pages (from-to) | 103-114 |
Number of pages | 12 |
Journal | Expert Systems with Applications |
Volume | 96 |
Early online date | 22 Nov 2017 |
DOIs | |
Publication status | Published - 15 Apr 2018 |
Keywords
- text representation
- knowledge base
- representation learning
- network embedding