Social networks including Facebook, Twitter, and Pinterest tap AI and machine learning systems to detect and remove abusive content, as does LinkedIn. The Microsoft-owned platform — which has over 660 million users, 303 million of whom are active monthly — today detailed its approach to handling profiles containing inappropriate content, which ranges from profanity to advertisements for illegal services.
As software engineer Daniel Gorham explained in a blog post, LinkedIn initially relied on a block list — a set of human-curated words and phrases that ran afoul of its Terms of Service and Community Guidelines — to identify and remove potentially fraudulent accounts. However, maintaining it required a significant amount of engineering effort, and the list tended to handle context rather poorly. (For instance, while the word “escort” was sometimes associated with prostitution, it was also used in contexts like a “security escort” or “medical escort.”)
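To make the context problem concrete, here is a minimal sketch of a word-level block list check. The list contents, the `flag_profile` helper, and the sample profiles are all hypothetical; LinkedIn's actual list and matching logic are not public.

```python
import re

# Hypothetical block list for illustration only; the real list is not public.
BLOCK_LIST = {"escort"}

def flag_profile(text, block_list=BLOCK_LIST):
    """Return True if any block-listed word appears as a whole word in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & block_list)

# Made-up profile snippets: one plausibly abusive, two legitimate.
profiles = [
    "Offering escort services, discreet and available nightly",
    "Medical escort for patients travelling between hospitals",
    "Security escort and event protection specialist",
]

# The matcher cannot tell the three apart, so every profile is flagged.
flags = [flag_profile(p) for p in profiles]
print(flags)
```

Because the check looks only at individual words, the two legitimate profiles are flagged along with the first one, which is the kind of false positive the article describes.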
This motivated LinkedIn to adopt a machine learning approach involving a convolutional neural network, a class of algorithm commonly applied to imagery analysis, trained on public member profile content. The training data consisted of accounts labeled as either “inappropriate” or “appropriate,” where the former were accounts removed for inappropriate content identified using the block list and a manual review. Gorham notes that only a “very small” portion of accounts have ever been restricted in this way, so the “appropriate” examples had to be downsampled from the entire LinkedIn member base to prevent algorithmic bias.
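The downsampling step can be sketched roughly as follows. This is an illustration under stated assumptions, not LinkedIn's pipeline: the `downsample` function, the 1:1 class ratio, and the placeholder profile strings are all invented for the example.

```python
import random

def downsample(appropriate, inappropriate, ratio=1.0, seed=0):
    """Build a labeled training set by sampling the large 'appropriate' pool.

    ratio is the assumed number of appropriate examples kept per
    inappropriate example; the real ratio is not stated in the source.
    Labels: 0 = appropriate, 1 = inappropriate.
    """
    rng = random.Random(seed)
    target = min(len(appropriate), int(len(inappropriate) * ratio))
    sampled = rng.sample(appropriate, target)  # sample without replacement
    data = [(text, 0) for text in sampled] + [(text, 1) for text in inappropriate]
    rng.shuffle(data)
    return data

# Placeholder data: a large pool of ordinary profiles and a few removed ones.
appropriate_pool = [f"ordinary profile {i}" for i in range(1000)]
removed_accounts = [f"removed profile {i}" for i in range(10)]

training_set = downsample(appropriate_pool, removed_accounts)
print(len(training_set), sum(label for _, label in training_set))
```

With the assumed 1:1 ratio, the 1,000 ordinary profiles are cut down to 10 so that neither class dominates the training set.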
To further reduce bias, LinkedIn identified problematic words responsible for high levels of false positives and sampled appropriate accounts containing those words from the member base. These accounts were then manually labeled and added to the training set, after which the model was trained and deployed in production.
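One way to surface such problematic words, sketched here as an assumption rather than LinkedIn's documented method, is to compute for each word how often appropriate accounts containing it are wrongly flagged. The `false_positive_words` helper, the `min_count` threshold, and the toy examples are hypothetical.

```python
import re
from collections import Counter

def false_positive_words(examples, min_count=2):
    """Rank words by how often appropriate accounts containing them are flagged.

    examples: iterable of (text, true_label, predicted_label),
    where 1 = inappropriate and 0 = appropriate.
    Returns (word, false_positive_rate) pairs, highest rate first.
    """
    seen = Counter()     # appropriate accounts containing the word
    flagged = Counter()  # of those, how many were predicted inappropriate
    for text, truth, predicted in examples:
        if truth != 0:
            continue  # only appropriate accounts can be false positives
        for word in set(re.findall(r"[a-z]+", text.lower())):
            seen[word] += 1
            if predicted == 1:
                flagged[word] += 1
    rates = {w: flagged[w] / seen[w] for w in seen if seen[w] >= min_count}
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy labeled data with made-up predictions.
examples = [
    ("medical escort", 0, 1),
    ("security escort", 0, 1),
    ("medical doctor", 0, 0),
    ("escort services nightly", 1, 1),
]

ranking = false_positive_words(examples)
print(ranking)
```

Words at the top of the ranking would be candidates for the targeted sampling described above: pull more appropriate accounts that contain them, label those by hand, and add them to the training set.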
Gorham says the abusive account detector scores new accounts daily, and that it was run on the existing member base to identify old accounts containing inappropriate content. Going forward, LinkedIn intends to use Microsoft translation services to ensure consistent performance across all languages, and to refine and expand the training set to increase the scope of content it is able to identify with the model.
“Detecting and preventing abuse on LinkedIn is an ongoing effort requiring extensive collaboration between multiple teams,” wrote Gorham. “Finding and removing profiles with inappropriate content in an effective, scalable manner is one way we’re constantly working to provide a safe and professional platform.”
LinkedIn’s uses of AI extend beyond abusive content detection. In October 2019, it pulled back the curtain on a model that automatically generates text descriptions for images uploaded to LinkedIn, built using Microsoft’s Cognitive Services platform and a unique LinkedIn-derived data set. Separately, its Recommended Candidates feature learns the hiring criteria for a given role and automatically surfaces relevant candidates in a dedicated tab. And its AI-driven search engine uses data such as what people post on their profiles and the searches that candidates perform to predict best-fit jobs and job seekers.