When the Reddit community loves something, that person or thing can gain instant celebrity status and a tidal wave of virtual (and sometimes monetary) love. But in the darkest corners of Reddit, there are some truly disturbing conversations happening.
Ben Bell, a data scientist at Idibon, set out to identify the worst of the worst and the best of the best. San Francisco-based Idibon is developing a natural language processing service that Bell applied to Reddit.
“I set out to scientifically measure toxicity and supportiveness in Reddit comments and communities,” Bell wrote in a blog post about his findings. “I then compared Reddit’s own evaluation of its subreddits to see where they were right, where they were wrong, and what they may have missed.”
Bell defined toxic comments as those engaging in an outright attack on another user, or those that contained overtly bigoted statements. The study also weighed toxic comments against those he defined as supportive: comments whose language expresses support or appreciation for another user.
He then tapped the Reddit API to pull data from the top 250 subreddits by subscribers, plus those mentioned in an AskReddit thread about toxicity on the site that had received more than 150 upvotes. All of the comments involved were also annotated by humans.
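To make the scoring idea concrete, here is a minimal sketch of how per-subreddit percentages could be derived from human-labeled comments. It assumes each comment carries exactly one label ("toxic", "supportive" or "neutral") and that a score is simply the share of a subreddit's comments with that label. Bell's actual formula is not described here, and the function name, label names and sample subreddits below are hypothetical.

```python
from collections import Counter, defaultdict


def subreddit_scores(labeled_comments):
    """Return {subreddit: (toxicity_pct, supportiveness_pct)}.

    Assumption: a score is the plain percentage of a subreddit's
    comments carrying that label. This is an illustration, not
    Bell's published method.
    """
    counts = defaultdict(Counter)
    for subreddit, label in labeled_comments:
        counts[subreddit][label] += 1

    scores = {}
    for subreddit, label_counts in counts.items():
        total = sum(label_counts.values())
        scores[subreddit] = (
            100 * label_counts["toxic"] / total,
            100 * label_counts["supportive"] / total,
        )
    return scores


# Hypothetical annotated sample: (subreddit, human-assigned label)
sample = [
    ("r/example_a", "toxic"),
    ("r/example_a", "neutral"),
    ("r/example_a", "supportive"),
    ("r/example_a", "neutral"),
    ("r/example_b", "supportive"),
    ("r/example_b", "neutral"),
]

scores = subreddit_scores(sample)
for name, (toxicity, supportiveness) in sorted(scores.items()):
    print(f"{name}: {toxicity:.1f}% toxic, {supportiveness:.1f}% supportive")
```

A plain share like this treats every comment equally; a real analysis might also weight comments by how the community voted on them.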
Here’s an interactive graph of the results:
According to Bell’s calculations: The most toxic community is /r/ShitRedditSays with 44 percent Toxicity and 1.7 percent Supportiveness scores. The subreddit finds bigoted posts around Reddit, but the conversations around these posts often then turn ugly, Bell says.
The most bigoted subreddit was /r/TheRedPill, a subreddit Bell describes as being “dedicated to proud male chauvinism, where bigoted comments received overwhelming approval from the community at large.”