Dealing with discrepancies or how I learned to stop worrying and love TCP
As online advertising has grown from an experiment on a marketer’s checklist to a critical tool in the proverbial toolbox, so has the demand for actionable metrics of performance.
At first, measuring engagement was straightforward. A site serves a user an ad (delivered by an unbiased third party, the ad server), and the user clicks on that ad to go to whatever page the marketer desires. Ad servers then collect click and impression counts, which serve two primary purposes. First, marketers use these numbers to draw insights into how their campaigns are performing. Second, marketers pay their advertising partners based on things like the number of clicks.
Soon, marketers clamored for deeper insights. Technology vendors introduced cookies to attribute an action on the site, such as a product purchase or online signup (called a “conversion”), to an ad impression or click. This attribution process is where things get tricky, and it is what this blog post will attempt to explain.
Even the most novice campaign manager has, at some point, dealt with the headache of reconciling an ad campaign’s results. Often, when working with multiple partners, the reported numbers of clicks and ad impressions vary, sometimes greatly. When you’re paying partners based on things like the number of clicks, that variance quickly becomes a real problem for the campaign manager.
“Why did this happen?” the campaign manager wonders. Was there an issue with the ad trafficking or setup? Is one party collecting data incorrectly? Is there questionable activity the two parties are filtering differently? Are moths crawling into the servers and messing with the data (1)?
The answer could turn out to be a combination of all of the above. Discrepancies in impression, click, and conversion numbers can come from a myriad of factors, including:
- garden-variety mistakes (such as trafficking errors or fat-fingering)
- infrastructure mismatches (such as when dealing with multiple ad verification companies)
- fraudulent/bot activity (common with click discrepancies)
In this series, I will take a detailed look at the most common causes of discrepancies, including a closer look at the impact TCP/IP itself has on the numbers we report. We will see that tracking down the exact source of a discrepancy can often be such a Sisyphean task that it is worth setting limits on how far to chase it.
What is a “discrepancy”?
Let’s start with the basics. What constitutes a discrepancy, and why do discrepancies show up in metrics as basic as ad impression counts?
Generally, a discrepancy means that one party reports a figure for some metric that differs by more than 20% from the figure reported by another party involved in the same campaign. Discrepancies come in several kinds:
- Ad impression discrepancies are usually mild, since an impression is nothing more than “render this ad on the page,” though we’ll see cases in which those, too, can blow up.
- Conversion discrepancies can often arise from differences in attribution — one partner thinks it deserves credit for a conversion, but a partner that has more visibility into the entire plan attributes it to someone else.
- Click discrepancies are typically harder to nail down, as they sometimes involve the murky world of fraudulent activity.
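As a rough illustration of the 20% rule of thumb above, the check can be sketched in a few lines of Python. The function names, the report layout, and the choice to measure the gap relative to the larger of the two counts are all assumptions made for this example; partners may well agree on a different baseline.

```python
THRESHOLD_PCT = 20.0  # rule-of-thumb threshold from the text above


def discrepancy_pct(ours: int, theirs: int) -> float:
    """Gap between two reported counts, as a percentage of the larger count.

    Using the larger count as the baseline is an assumption of this sketch.
    """
    baseline = max(ours, theirs)
    if baseline == 0:
        return 0.0  # both parties reported nothing, so there is no gap
    return abs(ours - theirs) / baseline * 100


def flag_discrepancies(report_a: dict, report_b: dict,
                       threshold: float = THRESHOLD_PCT) -> dict:
    """Return {metric: pct} for metrics both parties report that exceed the threshold."""
    flagged = {}
    for metric in report_a.keys() & report_b.keys():
        pct = discrepancy_pct(report_a[metric], report_b[metric])
        if pct > threshold:
            flagged[metric] = round(pct, 1)
    return flagged


# Hypothetical numbers for two partners on the same campaign.
ad_server = {"impressions": 100_000, "clicks": 1_200, "conversions": 40}
partner = {"impressions": 97_000, "clicks": 800, "conversions": 38}
print(flag_discrepancies(ad_server, partner))
```

With these made-up numbers only the click count is flagged: impressions differ by 3% and conversions by 5%, while clicks differ by about a third.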
Infrastructure & its effect on discrepancies
The infrastructure of online advertising has made great strides in recent years, and a number of companies have popped up to deal with the many issues advertisers face. For instance, how does an advertiser know that the ads are running as expected? Perhaps they are appearing on a different part of the website, either by mistake or intentionally. While we, and many other companies, have measures in place to manage many of these concerns, an opportunity opened in the ad tech ecosystem for companies to specialize solely in online ad verification, and those companies have become critical partners of ours.
Ad Verification Partners
These companies, at a basic level, put a piece of code within the ad that identifies the site on which the ad is running. Often, they employ sophisticated schemes to penetrate iframes in order to see the underlying page (2). They wrap the original ad tag in a script that detects the page. If the page matches a predetermined list of allowed pages (or, alternatively, does not match a predetermined blacklist of websites the ad should not appear on), the impression is delivered. If not, it is blocked.
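The deliver-or-block decision described above can be sketched as follows. This is only an illustration of the list check, not any vendor’s actual wrapper script: real verification tags run in the browser and have to work out the page URL first, which is the hard part. The function names, the host normalization, and the example domains are all hypothetical.

```python
from typing import Optional, Set
from urllib.parse import urlsplit


def page_host(page_url: str) -> str:
    """Extract a lowercase host from a page URL, dropping a leading 'www.'."""
    host = (urlsplit(page_url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def should_deliver(page_url: str,
                   allowed: Optional[Set[str]] = None,
                   blocked: Optional[Set[str]] = None) -> bool:
    """Decide whether to deliver the impression on the detected page.

    If an allowed list is given, the host must be on it. Otherwise, if a
    blocked list is given, the host must not be on it. A page whose host
    cannot be determined is blocked (an assumption of this sketch).
    """
    host = page_host(page_url)
    if not host:
        return False
    if allowed is not None:
        return host in allowed
    if blocked is not None:
        return host not in blocked
    return True


# Hypothetical lists and pages.
allowed_sites = {"example-news.test"}
blocked_sites = {"nefarious.test"}
print(should_deliver("https://www.example-news.test/story", allowed=allowed_sites))
print(should_deliver("https://nefarious.test/popup", blocked=blocked_sites))
```

The first call delivers and the second blocks. Note that the two modes fail differently: an allowed list blocks every unknown site, while a blocked list delivers on every unknown site.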
As a first pass, it is essential that the ad verification company’s whitelist/blacklist agrees with the corresponding list held by the demand-side platform (DSP). As sites are flagged by the ad verification partner, the DSP’s list should be updated to match.
The Exchange Landscape
However, in the advertising exchange landscape, this verification process gets even more complicated. Real-time bidding exchanges work by sending a bid request to many DSPs. This usually takes the form of a JSON POST containing information about the request, such as the IP address and the URL. If the supply partner does not report the page the same way the ad verification partner detects it, there will be instances where the buying platform and the verification company disagree on which site the ad appeared on.
In cases like these, it is essential to get as much detail as possible from all partners in order to resolve the discrepancy. If Advertiser 1 says, “I served an impression on site X at timestamp Y,” and Verification Company 2 says, “Actually, at timestamp Y, I detected a request for nefarious site Z,” it can become apparent where the problem originated. Updating the blacklist can then keep further instances from occurring.
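A minimal sketch of that reconciliation, assuming both parties can export row-level logs that share an impression ID. The field names (`impression_id`, `timestamp`, `site`) are hypothetical; in practice the logs may only line up on timestamps, which makes the join fuzzier than shown here.

```python
from collections import Counter


def site_mismatches(dsp_log, verifier_log):
    """List impressions where the two parties disagree about the site.

    Each log is a list of dicts; the verifier's view is indexed by the
    shared impression ID (an assumption of this sketch).
    """
    detected = {row["impression_id"]: row["site"] for row in verifier_log}
    mismatches = []
    for row in dsp_log:
        seen = detected.get(row["impression_id"])
        if seen is not None and seen != row["site"]:
            mismatches.append(
                (row["impression_id"], row["timestamp"], row["site"], seen)
            )
    return mismatches


def blacklist_candidates(mismatches):
    """Count how often each unexpected site was detected."""
    return Counter(seen for _, _, _, seen in mismatches)


# Hypothetical log rows.
dsp_log = [
    {"impression_id": "a1", "timestamp": "2015-03-01T12:00:00Z", "site": "x.test"},
    {"impression_id": "a2", "timestamp": "2015-03-01T12:00:05Z", "site": "x.test"},
    {"impression_id": "a3", "timestamp": "2015-03-01T12:00:09Z", "site": "x.test"},
]
verifier_log = [
    {"impression_id": "a1", "site": "x.test"},
    {"impression_id": "a2", "site": "z.test"},
    {"impression_id": "a3", "site": "z.test"},
]
for mismatch in site_mismatches(dsp_log, verifier_log):
    print(mismatch)
print(blacklist_candidates(site_mismatches(dsp_log, verifier_log)))
```

Sites that keep turning up in the mismatch counts are the natural candidates for the blacklist update.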
A site mismatch like this looks like a random, sporadic occurrence. But could something more insidious be at work?
Check back for part 2 in this series, when I give a deeper dive into bot activity, its impact on click discrepancies, and how we work to root it out.