Google releases findings on how to spot fake reputation builders


Online reputation in the form of Twitter followers, web traffic, and YouTube views can mean big bucks.

Modern snake-oil salesmen who want to boast exaggerated popularity on social media or e-commerce websites will happily pay freelancers to inflate their reputation. Google, Twitter, and Facebook, however, make their money based on the reliability of their websites, and so Google has now released findings on how to automatically spot these purveyors of fake reputations, or “crowdturfers.”

“Automatically detecting crowdturfing gigs is an important task because it allows us to remove the gigs before buyers can purchase them, and eventually, it will allow us to prohibit sellers from posting these gigs. To detect crowdturfing gigs, we built machine-learned models using the manually labeled 1,550 gig dataset,” wrote a team of researchers in a recently presented paper, which was partly supported by a Google Faculty Research Award. A Google Research blog post yesterday highlighted the work. It’s worth noting that, at the moment, their methods are quite sophisticated and available only to those who can replicate their detection system.

Freelancing micro-task websites, such as oDesk, Fiverr, and Freelancer, have built a cottage industry out of globally distributed part-time workers. I’ve used these sites for research help and email list building. Nonprofit Samasource uses similar systems to help impoverished women and youth find work.

These sites offer great services. But fake reputations make them less trustworthy.

“Amazingly, one seller (crorkservice) has sold 601,210 gigs and earned at least $3 million over the past two years. In other words, one user from Moldova has earned at least $1.5 million/year, which is orders of magnitude larger than $2,070, the GNI (Gross National Income) per capita of Moldova,” explained the researchers.

Their machine-learning detection system has a pretty high rate of accuracy, based on the researchers’ own dataset. “Our experimental results show that these models can effectively detect crowdturfing gigs with an accuracy rate of 97.35 percent. Using these classification models, we identified 19,904 crowdturfing gigs in Fiverr, and we found that 70.7 percent were social media targeting gigs, 27.3 percent were search engine targeting gigs, and 2 percent were user traffic targeting gigs.”
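To put those percentages in concrete terms, here is a rough back-of-the-envelope sketch in Python that converts the quoted shares into approximate gig counts. The per-category counts are my own estimates from the rounded percentages, not figures from the paper, and the function name is purely illustrative.

```python
# Figures quoted from the researchers' paper.
TOTAL_GIGS = 19_904
SHARES_PERCENT = {
    "social media": 70.7,
    "search engine": 27.3,
    "user traffic": 2.0,
}


def approximate_counts(total, shares_percent):
    """Convert percentage shares into rounded gig counts.

    Assumption: the published percentages are themselves rounded,
    so these counts are estimates, not the paper's exact numbers.
    """
    return {
        name: round(total * pct / 100)
        for name, pct in shares_percent.items()
    }


counts = approximate_counts(TOTAL_GIGS, SHARES_PERCENT)
for name, n in counts.items():
    print(f"{name}: about {n:,} gigs")
```

By this estimate, roughly 14,000 of the flagged gigs targeted social media, which is where the bulk of the fake-reputation business appears to be.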

Eventually, they hope their technology will help companies such as Twitter automatically ban bots and fake followers. You can read the full paper here.

