TY - GEN
T1 - Can Hallucination Reduction in LLMs Improve Online Sexism Detection?
AU - Ding, Leyuan
AU - Rajapaksha, Praboda
AU - Myat, Aung Kaung
AU - Farahbakhsh, Reza
AU - Crespi, Noel
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/7/31
Y1 - 2024/7/31
N2 - Online sexism is a pervasive problem with a significant impact on targeted individuals and on social inequality. Automated tools are now widely used to identify sexist content at scale, but most of these tools provide no explanation beyond generic categories such as ‘toxicity’, ‘abuse’ or ‘sexism’. This paper explores the impact of hallucination reduction in LLMs on sexism detection at three levels: binary sexism, four categories of sexism, and fine-grained vectors, with a focus on explainability in sexism detection. We successfully applied Neural Path Hunter (NPH) to GPT-2, with the purpose of “teaching” the model to hallucinate less. The hallucination-reduced GPT-2 achieved accuracy rates of 83.2% for binary detection, 52.2% for four-category classification and 38.0% for 11-vector fine-grained classification, respectively. The results indicate that: i) while model performance may slightly lag behind the baseline models, hallucination-reduction methods have the potential to significantly influence LLM performance across various applications, beyond just dialogue-response systems; and ii) such methods could also mitigate model bias and improve generalization capabilities, depending on dataset quality and the selected hallucination-reduction technique.
AB - Online sexism is a pervasive problem with a significant impact on targeted individuals and on social inequality. Automated tools are now widely used to identify sexist content at scale, but most of these tools provide no explanation beyond generic categories such as ‘toxicity’, ‘abuse’ or ‘sexism’. This paper explores the impact of hallucination reduction in LLMs on sexism detection at three levels: binary sexism, four categories of sexism, and fine-grained vectors, with a focus on explainability in sexism detection. We successfully applied Neural Path Hunter (NPH) to GPT-2, with the purpose of “teaching” the model to hallucinate less. The hallucination-reduced GPT-2 achieved accuracy rates of 83.2% for binary detection, 52.2% for four-category classification and 38.0% for 11-vector fine-grained classification, respectively. The results indicate that: i) while model performance may slightly lag behind the baseline models, hallucination-reduction methods have the potential to significantly influence LLM performance across various applications, beyond just dialogue-response systems; and ii) such methods could also mitigate model bias and improve generalization capabilities, depending on dataset quality and the selected hallucination-reduction technique.
KW - GPT-2
KW - Hallucination
KW - LLM
KW - RoBERTa
KW - Sexism detection
UR - http://www.scopus.com/inward/record.url?scp=85201020796&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-66329-1_40
DO - 10.1007/978-3-031-66329-1_40
M3 - Conference Proceeding (Non-Journal item)
AN - SCOPUS:85201020796
SN - 9783031663284
T3 - Lecture Notes in Networks and Systems
SP - 625
EP - 638
BT - Intelligent Systems and Applications - Proceedings of the 2024 Intelligent Systems Conference IntelliSys Volume 1
A2 - Arai, Kohei
PB - Springer Nature
T2 - Intelligent Systems Conference, IntelliSys 2024
Y2 - 5 September 2024 through 6 September 2024
ER -