Developing robust systems to detect abuse is a crucial part of online content moderation and plays a fundamental role in creating an open, safe and accessible Internet. It is of growing interest to both host platforms and regulators, in light of recent public pressure (HM Government, 2019). Detection systems are also important for social scientific analyses, such as understanding the temporal and geographic dynamics of abuse.
Advances in machine learning and NLP have led to marked improvements in the performance of abusive content detection systems (Fortuna & Nunes, 2018; Schmidt & Wiegand, 2017). For instance, in 2018 Pitsilis et al. trained a classification system on Waseem and Hovy’s 16,000 tweet dataset and achieved an F-Score of 0.932, compared with Waseem and Hovy’s original 0.739, an increase of nearly 20 points (Pitsilis, Ramampiaro, & Langseth, 2018; Waseem & Hovy, 2016). Key innovations include the use of deep learning and ensemble architectures, contextual word embeddings, dependency parsing, and the inclusion of user-level variables within models (Badjatiya, Gupta, Gupta, & Varma, 2017; Zhang et al., 2018). Researchers have also addressed numerous tasks beyond binary abusive content classification, including identifying the target and strength of abuse, as well as automatically moderating content (Burnap & Williams, 2016; Davidson, Warmsley, Macy, & Weber, 2017; Santos, Melnyk, & Padhi, 2018). However, considerable challenges and unaddressed frontiers remain, spanning technical, social and ethical dimensions. These issues constrain abusive content detection research, limiting its impact on the development of real-world detection systems.