Abstract
The rise of manosphere and feminist discourse on social networks calls
for a holistic measure of the level of sexism in an online community.
Such an indicator is important for policymakers and moderators of
online communities (e.g., subreddits) and for computational social
scientists: the former can revise moderation strategies according to a
community's degree of sexism, while the latter can compare the temporal
evolution of sexism across platforms and communities, match it against
real-world events, and infer social-scientific insights. In this paper,
we build a model that provides a comparable, holistic indicator of
toxicity targeted at male and female identities and individuals. Unlike
previous supervised NLP methods, which require annotating toxic
comments at the target level (e.g., labeling comments that are
specifically toxic toward women), our indicator uses supervised NLP to
detect the presence of toxicity and an unsupervised word embedding
association test to identify the target automatically. We apply
our model to gender discourse communities (e.g., r/TheRedPill, r/MGTOW,
r/FemaleDatingStrategy) to measure the level of toxicity toward each gender
(i.e., sexism). Our results show that our framework accurately and
consistently (93% correlation) measures the level of sexism in a
community. Finally, we discuss how our framework can be generalized to
measure qualities other than toxicity (e.g., sentiment, humor) toward
general-purpose targets and thus serve as an indicator of different
forms of polarization.