Every time a digital ad is shown under cost-per-impression pricing, an advertiser pays. But a third or more of those ad impressions are generated by software bots pretending to be humans loading pages.
Media valuation firm Integral Ad Science has revealed to VentureBeat a new dimension of this ad threat — an out-in-the-open, “volunteer botnet” website that invites website owners, videomakers, and other content owners to automatically generate or buy fake page loads for others, in exchange for getting fake traffic for their own site or content.
A prime example of this kind of botnet, according to Integral, is a Hong Kong-based site called HitLeap.
HitLeap is straightforward about what it is doing. On its home page, a headline announces that the site offers “a Traffic Exchange service, which automatically delivers free traffic to your website.”
Here’s how it works, according to Integral:
- You sign up with the webpage(s) that you want traffic for, and you download a piece of software. It contains a special HitLeap browser that can automatically load pages while sitting minimized in a corner of any computer’s screen.
- You configure your setup for such things as time on page (which indicates how much time a “site visitor” has spent on a page) and traffic source (to indicate where the traffic is supposed to be coming from). You can indicate the traffic comes from another site, for instance, or from a cloud service.
- The software generates page loads on tens of thousands of other sites in its network, according to the site. You get credits for generating traffic, which you can use to create traffic for your pages or content. You can also purchase traffic, or get credits for traffic through referrals to the site.
‘Truly a bot’
But this isn’t “people sharing traffic,” Integral’s director of data science Jason Shaw pointed out. It’s a “fully automated service [and] is truly a bot.”
He added that it appears HitLeap has been around for at least a couple of years, and other sites may be engaged in similar practices — a kind of traffic-generating voluntary botnet that has been little discussed. Integral said it came across the site on “certain forums,” where comments talked about “making money within three months.”
Most efforts to detect traffic fraud, Shaw noted, are focused on malware, where software bots operate in the background without the knowledge of the computers’ owners.
“Previous voluntary botnets were very much in the malware” space, he told me, where they operate underground and are oriented toward distributed denial-of-service (DDoS) attacks.
But the HitLeap kind of “voluntary botnet” is different: It operates out in the open and is intended to generate revenue for its participants.
To date, HitLeap says it has delivered over 275 billion “hits,” which in this case means page loads. Each page contains one or more display ads, and each load contributes to the cost-per-thousand-displays that many advertisers are paying.
But the page doesn’t need to be owned by the person generating the ad traffic. An example would be a video clip on a YouTube page that generates ad impressions for YouTube and the video clip owner with each load.
According to Integral’s tests, the average page load from HitLeap lasts 12 seconds, which means the receiving site records that visitor as “viewing” the page for that time.
HitLeap asks if you are a bot
Running HitLeap day and night generates about 7,500 hits a day, Integral said. Depending how ads are arrayed on a page, each hit could result in one or more ad impressions. Some fraudulent publishers layer ads on ads so that, although they’re not viewable by humans, they might record dozens of ad impressions for one page load.
If you assume five ads per page, Integral said, a HitLeap client could generate about 37,500 ad impressions a day. With the 25,000 active users claimed by the site, that translates into more than 930 million impressions daily.
Integral says its own data suggests there are at least three to four thousand active users at any given time. At the top end of that estimate, the site would generate about 150 million impressions a day. If you assume a modest $1 per thousand impressions rate for display ads, that would mean the network is creating about $150,000 in revenue from ads every day.
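The arithmetic behind these estimates can be reproduced in a few lines. The sketch below uses only the figures quoted above (7,500 hits per client per day, five ads per page, and a $1 rate per thousand impressions); the function and constant names are ours, chosen for illustration, and the inputs are Integral's assumptions rather than measured values.

```python
# Back-of-the-envelope estimate using the figures quoted in the article.
HITS_PER_CLIENT_PER_DAY = 7_500  # Integral's figure for running HitLeap day and night
ADS_PER_PAGE = 5                 # assumption used in Integral's estimate
CPM_DOLLARS = 1.0                # assumed $1 per thousand impressions


def daily_impressions(active_users: int) -> int:
    """Ad impressions per day across all active clients."""
    return active_users * HITS_PER_CLIENT_PER_DAY * ADS_PER_PAGE


def daily_revenue(active_users: int) -> float:
    """Ad revenue per day, in dollars, at the assumed CPM."""
    return daily_impressions(active_users) / 1_000 * CPM_DOLLARS


if __name__ == "__main__":
    scenarios = [
        ("users claimed by the site", 25_000),
        ("top end of Integral's estimate", 4_000),
    ]
    for label, users in scenarios:
        print(
            f"{label}: {daily_impressions(users):,} impressions/day, "
            f"${daily_revenue(users):,.0f}/day"
        )
```

At 4,000 active users, this yields the 150 million impressions and $150,000 a day cited above. At the 25,000 users the site claims, the same assumptions give 937.5 million impressions a day.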
However, it’s not entirely clear if any laws from any country have been broken. After all, it’s your browser going to these sites — except, of course, it’s a program entering the URLs and automatically clicking “go.” One angle, Integral pointed out, is that HitLeap or the publishers or both may be conducting fraud, since any site knowingly receiving this traffic is being paid for something — human visitors — it isn’t delivering.
HitLeap makes money from premium memberships that enable users to set up custom URLs, from fees to purchase traffic instead of trading it, and, one assumes, by generating traffic for some of its own pages.
Integral said it has not contacted the site.
Our inquiry to HitLeap was sent through its contact form, which, interestingly, has both a picture-based Captcha and a checkbox to double-verify that the sender is not a bot.
We received the following response, which essentially says that few ads these days generate revenue from impressions on page loads. In fact, payment per thousand ad impressions is still very common. Additionally, the prospect that a site owner would manually load a page a thousand times for, say, $1 to $4 for one to four ads, is not realistic:
The promoted websites can contain/utilize all sorts of 3rd party services, e.g. social network plugins, embedded video content and ads from advertisers. Your main concern in this case seems to be that the ad networks providing the ads might not like the web traffic that HitLeap can help its members deliver. If a simple load of the website would be good enough to defraud an ad network, what would stop anyone from simply refreshing their own page until becoming a millionaire? The answer is that it really isn’t that simple.
Nowadays, any ad network worth its salt has implemented very strict rules on the traffic that they count towards the advertisers’ metrics. In most cases, it means that a simple website load will not be enough — the visitor probably needs to click on the ads themselves and most likely a whole host of other metrics need to be good enough before the ad network even considers billing the advertiser for the visitor. They have to use these rules, because the Internet is increasingly traversed by all sorts of automatic systems, the best example being search engine crawlers, and a naive approach of counting the impressions of the ad and billing the advertiser proportional to that number simply doesn’t work in today’s web. If it did, well, we’d have people refreshing their websites all day long.
We do not encourage our members to violate any kind of service agreement that they might have with any 3rd party services who are providing content for their websites. We trust that any service providers (including the ad networks in question) are capable of enforcing their own terms of service on their clients, should it even become necessary.
In response, Integral has offered its feedback on HitLeap’s email:
Filtration of search crawler traffic is widely adopted throughout the online advertising ecosystem, and is very straightforward since crawlers such as those Google and Bing use to index web content declare themselves through their user agent string. However, filtration of ‘sophisticated’ invalid traffic — such as volunteer botnets like HitLeap and malware bots — is significantly more challenging. As an MRC [Media Rating Council]-certified vendor of brand safety and viewability measurement, Integral Ad Science is required to filter invalid traffic from reported metrics, including search engine bots and other forms of crawlers, and specializes in detecting more sophisticated fraudulent traffic.
Distributed fraud efforts, such as the volunteer botnets employed by HitLeap or botnets of compromised PCs, behave in a sophisticated and scalable manner that abuse the current industry standard of paying by the impression, or by the viewable impression.
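To illustrate the distinction Integral draws, here is a minimal Python sketch of user-agent-based crawler filtering. It is an illustration under stated assumptions, not Integral's method: the function names and the impression record format are hypothetical, and the pattern covers only the Googlebot and bingbot tokens that those crawlers include in their user agent strings.

```python
import re

# Tokens that Google's and Bing's crawlers include in their User-Agent header.
# Illustrative only; a production filter would use a maintained bot list.
DECLARED_CRAWLER_PATTERN = re.compile(r"googlebot|bingbot", re.IGNORECASE)


def is_declared_crawler(user_agent: str) -> bool:
    """Return True if the user agent identifies itself as a known crawler."""
    return bool(DECLARED_CRAWLER_PATTERN.search(user_agent))


def filter_impressions(impressions: list[dict]) -> list[dict]:
    """Drop impressions whose user agent declares a crawler.

    Each impression is a dict with a "user_agent" key (a hypothetical format).
    """
    return [imp for imp in impressions if not is_declared_crawler(imp["user_agent"])]


if __name__ == "__main__":
    sample = [
        {"user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
        {"user_agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
                       "(KHTML, like Gecko) Chrome/40.0 Safari/537.36"},
    ]
    kept = filter_impressions(sample)
    print(f"kept {len(kept)} of {len(sample)} impressions")
```

A filter like this removes only traffic that announces itself. An automated client that presents an ordinary browser user agent passes straight through, which is why the more sophisticated traffic Integral describes requires other detection signals.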