Quantifying Online Advertising Fraud: Ad-Click Bots vs Humans [pdf] (oxford-biochron.com)
48 points by McKittrick on Feb 9, 2015 | 37 comments


I guess because the domain had the word "Oxford" in it and the file was a PDF, I assumed it was a purely academic publication, but apparently it's from a company that sells bot-fraud detection software: http://oxford-biochron.com/


How to get more credibility for blog posts:

- s/tldr/Abstract

- Use LaTeX, and convert to PDF


Yes, two software executives posing as academics confused us about bots posing as humans.


I know this is anecdotal, but I don't find their findings to be true in practice, at least as far as Facebook goes.

If Facebook tells me I got 180 clicks, my JavaScript detects 100 page views, and 45 people fill out my form, then I know at least 25% of those clicks came from real people, or from bots sophisticated and motivated enough to continue through filling out a form on the destination page.

There is certainly click fraud on every platform: clicks that only Facebook/LinkedIn/Google report but that never appear in server logs.
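To make the arithmetic above explicit, here is a minimal Python sketch of the funnel. The function and key names are made up for illustration, and the only inputs are the three numbers quoted in this comment. Note that a reported click with no tracked page view is not necessarily fraud (the visitor may have bounced before the script loaded, or blocked it), so the "unseen" share is only an upper bound on that kind of loss.

```python
def funnel_shares(reported_clicks: int, page_views: int, form_fills: int) -> dict:
    """Express each funnel stage as a fraction of platform-reported clicks."""
    if reported_clicks <= 0:
        raise ValueError("reported_clicks must be positive")
    return {
        # Clicks the platform reported that never produced a tracked page view.
        "unseen_share": (reported_clicks - page_views) / reported_clicks,
        # Clicks that loaded the page and ran the tracking script.
        "page_view_share": page_views / reported_clicks,
        # Lower bound on genuine clicks: visitors who completed the form.
        "form_fill_share": form_fills / reported_clicks,
    }


# Numbers from the comment: 180 reported clicks, 100 page views, 45 forms.
result = funnel_shares(180, 100, 45)
for stage, share in result.items():
    print(f"{stage}: {share:.1%}")
```

With these numbers the form-fill share is exactly 25%, which is the "at least 25%" floor mentioned above; the remaining clicks are simply unverified, not proven fake.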

I'd be interested in knowing what your personal experience has been. I know there are many people on Hacker News who've advertised something in the past.


Your data may be anecdotal, but it's vastly more data than the paper provides (zero data to back up its claims).

My personal experience with advertising matches yours. Likes on Facebook can come from the strangest sources, but the clicks on ads are mostly real users that fill out forms and buy products.


The large majority of bots nowadays run JavaScript, and are almost indistinguishable from humans if you don't look at their activity over time.

Filling a form with coherent content is a reasonable proof of humanness, but only if you're looking at bots that weren't designed to handle that type of landing page.


That's what I wonder about.

Someone could run a Facebook ad that links to a CAPTCHA test, and then post the results.

That would provide some useful results...

I should also mention that my targeting is news feed exclusive. In the past I've used sidebar ads, but their results are garbage compared to news feed ads.


I hear FB ads are relatively non-botty. Likes are botted to death, but ads are pretty clean.


A little bit of LaTeX can bring a lot of authenticity to what is basically a corporate blog post. How about some of the actual data used to make this claim?


http://oxford-biochron.com/who-we-are/ Conflict of interest much? The company writes a paper saying it did research on this...


Actually, that's really awesome. I've been researching MaxMind's, ThreatMetrix's, and Neustar's offerings to protect a large site that gets targeted with a lot of spam. ThreatMetrix and Neustar are both too expensive, and MaxMind didn't really work all that well.

This paper basically touched on everything I wished for in a bot detecting product, at a price that seems totally reasonable.


The authors are selling a product to detect click bots. Take the results with a grain of salt.


With all those "5 bucks for 1000 page likes" companies, there are a ton of people who are simply paid to click like on things. It's not bots, it's humans from places outside the US. To avoid detection, these humans also like random US pages, so it's less obvious which likes they were paid for.


The problem with display advertising is that too much of the industry is focused on clicks. Anti-fraud companies can come up with ways to try to mitigate click fraud, but it won't do much of anything, as those methods can be, and will be, gamed by black hats. There's too much money at stake.

I deal with it every day and the best method is to educate the client by explaining why a click is a poor indicator of performance. Work with the client to come up with measurable goals to track click-through/view-through conversions on these goals and ultimately try to measure impact on ROI. It's really not THAT difficult for most campaigns.

The most difficult part is that the client becomes aware of all those wasted dollars on previous campaigns that they thought were high performance because of a high CTR.


> "Work with the client to come up with measurable goals to track click-through/view-through conversions on these goals and ultimately try to measure impact on ROI. It's really not THAT difficult for most campaigns."

Strongly disagree here. It really is THAT difficult for most campaigns. What you are talking about is attribution, and display attribution in particular is still in the dark ages compared to anything click-based. It is IMHO by far and away the toughest problem to tackle in the industry right now. Even more so than fraud, because if you have a clear sense of what is actually driving revenue, the fraud just becomes another factor for bid algorithms to consider.

Coming up with the value of a view-through conversion, etc. is non-trivial. Further, even getting revenue data from view-throughs is not easy for most advertisers that don't have an ad server in place (think everyone using the vanilla AdWords tracking on the GDN). Specifically, Google gives you view-through conversions, but not view-through revenue, even though they clearly have the data.

I agree that too much of the industry is focused on clicks, and publishers are still loving branding clients that go after impressions because they see it as an easy commission that is super simple to automate management for.

That said, I wish any company with a display offering would do more to prove the value of it from an attribution standpoint. Why do I need to have DFA for accessing full exposure-to-conversion path data? Wouldn't that make it much easier for me to sell in the value of display to my org/clients so I would spend even more?

Personally, I'm dying to see what Google does with Adometry, and what FB does with Atlas in terms of proving the value of display from a data-driven dynamic attribution standpoint. Static models are broken and display is a much more difficult beast to tackle.


Aren't most advertising campaigns focused on acquisitions these days?


No, and they are not focused on clicks either. The majority of the adspend* cares about impressions, viewability and lift.

*display adspend


> Aren't most advertising campaigns focused on acquisitions these days?

Nope, much of it is branding. Some are focused on clicks, some are focused on viewability... it's sort of a turning point in the industry...

> No, and they are not focused on clicks either. The majority of the adspend* cares about impressions, viewability and lift. *display adspend

Most or not, it's still a significant amount. The clients you mention may be more concerned with impressions/viewability/lift, but they're still vulnerable to being gamed the same way as someone who cares about clicks. Viewability is already being manipulated by the same bots that generate fraudulent clicks. It's a great metric in theory, but take it with a grain of salt.

If anything, these companies (Moat, IAS, Oxford...) are the ones who should be most concerned about combating bots.


> The clients you mention may be more concerned with impressions/viewability/lift, but they're still vulnerable to be gamed the same way as someone who cares about clicks.

No arguing there. I was just pointing out facts to the previous commenters.

I don't know about the turning point tho...


Looks like snake oil sellers pushing their marketing b###t with scientific sauce. A recent report from ComScore http://www.comscore.com/Insights/Blog/Ad-Fraud-and-NonHuman-... has completely different numbers.


Most click-fraud is carried out by setting up plausible-looking websites, signing them up to ad networks, and then clicking on the adverts that are served up on those sites.

There's nothing in this article about what kind of sites the adverts were being served on, which is a fairly glaring omission.


This document looks official, and replicates the format of a proper scientific paper. But it's extremely short, offers almost zero details about their "analysis" and produces dramatic conclusions that further the business goals of the authors.

I personally would not trust this at all.


Could someone fill me in on the business model of these click farms? How do they profit from clicking other people's ads?

I'm sure it will be perfectly obvious in hindsight.


They will typically set up several generic websites with ads, and set their click farm loose to represent fake traffic while clicking on ads for X% of impressions. You could also be a mercenary and drive traffic to other website operators who want traffic/rev, or even dilute quality/waste spend on competitor ad campaigns.

Since these click farms are typically just infected computers, they can likely set up other tasks to monetize: DDoS, email, BTC mining, etc...


they also own some of the sites on which ads are clicked


I don't believe that Facebook and Google are clicking their own ads


I wonder what Google's internal research shows. Are they themselves not committing fraud by failing to publicize this to their customers?


Google has massive teams that are devoted to detecting click fraud and getting rid of it. They really really want to detect and remove click fraud.

If I'm selling a product that makes $50 profit, all I care about is how much I have to pay to get a conversion. If it takes 100 clicks for a conversion at $0.50 per click, that is the same ROI as 50 clicks for a conversion at $1 per click. So if half the clicks are from bots, I'm just going to bid half as much. Maybe in the short term I'm not going to, but certainly I will adjust in the long term.
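A quick sketch of that bid arithmetic in Python, using only the numbers from this comment. The function names are illustrative, and the model is deliberately simplified: it assumes bot clicks never convert and that the advertiser only cares about cost per conversion.

```python
def cost_per_conversion(clicks_per_conversion: float, cost_per_click: float) -> float:
    """Ad spend needed to get one conversion."""
    return clicks_per_conversion * cost_per_click


def adjusted_bid(clean_bid: float, bot_fraction: float) -> float:
    """Bid that keeps cost per conversion unchanged when a share of clicks is bots.

    Assumes bot clicks never convert, so only (1 - bot_fraction) of the
    paid clicks have any chance of producing a sale.
    """
    if not 0.0 <= bot_fraction < 1.0:
        raise ValueError("bot_fraction must be in [0, 1)")
    return clean_bid * (1.0 - bot_fraction)


# Both scenarios from the comment cost the same per conversion.
with_bots = cost_per_conversion(100, 0.50)   # 100 clicks at $0.50 each
without_bots = cost_per_conversion(50, 1.00)  # 50 clicks at $1.00 each
print(with_bots, without_bots)

# If half the clicks are bots, a $1.00 bid drops to $0.50.
print(adjusted_bid(1.00, 0.5))
```

Under these assumptions the advertiser's spend per conversion stays at $50 either way, which is the point of the comment: fraud ends up priced into the bid rather than paid for indefinitely.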

Which of course is not what Google wants. Google doesn't want to make money at the expense of creating long term customers. Google wants the conversion rates on their platform to be better than the conversion rates on other platforms, which is directly comparable at the customer end.

So by detecting and controlling click fraud, Google gets higher CPCs (which they like), their platform has higher user end conversion rates (which they like), it builds trust and consistency in their platform (which they like), and in the long term they don't even lose money because CPCs will just adjust upwards as fraud goes down.


I don't dispute that it is in Google's interest to reduce click-fraud, but it is also in their interest to reduce the appearance of click-fraud too!


As long as the money keeps flowing the shareholders are happy.


This is disturbing to say the least.


Don't tell Facebook


Uhh, the first point references Wikipedia... I'm not sure if I trust the authors.


They are using it to define the word; that is a perfectly reasonable thing to use Wikipedia for.


Unless this is some academic joke flying over my head, the authors' contact info is listed as "firstname.lastname@oxford-biochron.com". This report seems hastily thrown together, but that doesn't necessarily invalidate the results. It'd be interesting to see others run it, or a comment from Google.


That's commonly done in a bunch of academic papers where authors are from the same institution - saves space mostly


That is not a joke. That is either email address obfuscation or a way to save characters.



