Can Hallucination Reduction in LLMs Improve Online Sexism Detection?

Leyuan Ding*, Praboda Rajapaksha, Aung Kaung Myat, Reza Farahbakhsh, Noel Crespi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference proceeding (non-journal type)

Abstract

Online sexism is a pervasive problem with a significant impact on targeted individuals and on social inequality. Automated tools are now widely used to identify sexist content at scale, but most of them offer no explanation beyond generic categories such as ‘toxicity’, ‘abuse’ or ‘sexism’. This paper explores the impact of hallucination reduction in LLMs on sexism detection at three levels: binary sexism detection, four-category classification of sexism, and fine-grained classification over 11 vectors, with a focus on explainability. We applied Neural Path Hunter (NPH) to GPT-2 in order to “teach” the model to hallucinate less. The hallucination-reduced GPT-2 achieved accuracies of 83.2% for binary detection, 52.2% for four-category classification, and 38.0% for the 11-vector fine-grained classification. The results indicate that (i) although the model's performance may lag slightly behind the baseline models, hallucination-reduction methods have the potential to significantly influence LLM performance across applications beyond dialogue-response systems, and (ii) such methods could potentially mitigate model bias and improve generalization, depending on dataset quality and the hallucination-reduction technique selected.

Original language: English
Title: Intelligent Systems and Applications - Proceedings of the 2024 Intelligent Systems Conference IntelliSys Volume 1
Editors: Kohei Arai
Publisher: Springer Nature
Pages: 625-638
Number of pages: 14
ISBN (Print): 9783031663284
Digital Object Identifiers (DOIs)
Status: Published - 31 Jul 2024
Event: Intelligent Systems Conference, IntelliSys 2024 - Amsterdam, Netherlands
Duration: 05 Sep 2024 to 06 Sep 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1065 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: Intelligent Systems Conference, IntelliSys 2024
Country/Territory: Netherlands
City: Amsterdam
Period: 05 Sep 2024 to 06 Sep 2024
