TY - JOUR
T1 - Fuzzy orders-of-magnitude based link analysis for qualitative alias detection
AU - Shen, Qiang
AU - Boongoen, Tossapon
N1 - Funding Information:
This work is sponsored by UK EPSRC grant EP/D057086. The authors are grateful to the team members for their contribution, while taking full responsibility for the views expressed in this paper.
PY - 2012/4
Y1 - 2012/4
N2 - Alias detection has been the significant subject being extensively studied for several domain applications, especially intelligence data analysis. Many preliminary methods rely on text-based measures, which are ineffective with false descriptions of terrorists' name, date-of-birth, and address. This barrier may be overcome through link information presented in relationships among objects of interests. Several numerical link-based similarity techniques have proven effective for identifying similar objects in the Internet and publication domains. However, as a result of exceptional cases with unduly high measure, these methods usually generate inaccurate similarity descriptions. Yet, they are either computationally inefficient or ineffective for alias detection with a single-property based model. This paper presents a novel orders-of-magnitude based similarity measure that integrates multiple link properties to refine the estimation process and derive semantic-rich similarity descriptions. The approach is based on order-of-magnitude reasoning with which the theory of fuzzy set is blended to provide quantitative semantics of descriptors and their unambiguous mathematical manipulation. With such explanatory formalism, analysts can validate the generated results and partly resolve the problem of false positives. It also allows coherent interpretation and communication within a decision-making group, using this computing-with-word capability. Its performance is evaluated over a terrorism-related data set, with further generalization over publication and email data collections.
AB - Alias detection has been the significant subject being extensively studied for several domain applications, especially intelligence data analysis. Many preliminary methods rely on text-based measures, which are ineffective with false descriptions of terrorists' name, date-of-birth, and address. This barrier may be overcome through link information presented in relationships among objects of interests. Several numerical link-based similarity techniques have proven effective for identifying similar objects in the Internet and publication domains. However, as a result of exceptional cases with unduly high measure, these methods usually generate inaccurate similarity descriptions. Yet, they are either computationally inefficient or ineffective for alias detection with a single-property based model. This paper presents a novel orders-of-magnitude based similarity measure that integrates multiple link properties to refine the estimation process and derive semantic-rich similarity descriptions. The approach is based on order-of-magnitude reasoning with which the theory of fuzzy set is blended to provide quantitative semantics of descriptors and their unambiguous mathematical manipulation. With such explanatory formalism, analysts can validate the generated results and partly resolve the problem of false positives. It also allows coherent interpretation and communication within a decision-making group, using this computing-with-word capability. Its performance is evaluated over a terrorism-related data set, with further generalization over publication and email data collections.
KW - Orders-of-magnitude reasoning
KW - alias detection
KW - fuzzy set
KW - intelligence data
KW - link analysis
KW - similarity measure
UR - http://hdl.handle.net/2160/13537
UR - http://www.scopus.com/inward/record.url?scp=84857764366&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2010.255
DO - 10.1109/TKDE.2010.255
M3 - Article
SN - 1041-4347
VL - 24
SP - 649
EP - 664
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 4
M1 - 5677516
ER -