A Guide to Online Advertising Data and Privacy
I’ve worked in Online Advertising for my whole career, and after leaving the gentle embrace of a tech giant, I’m reassessing the industry as an outsider, especially when it comes to data privacy & humane tech. That’s why I started this blog.
As I do that soul searching, I thought it might be useful if I did a download of what I know about online advertising data and privacy in a (relatively) concise format. That’s what I’ve done below.
I try to keep it brief and include links if you want to learn more.
Feel free to skip sections that you feel you have a handle on. There is some meatier content towards the end on recent regulations and industry changes.
In search ads, advertisers create simple text ads and select keywords and some other targeting. When those keywords are typed into a search engine, the ad enters an auction where the highest bid wins (Google also accounts for some other relevancy factors, which you can read about here).
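To make the auction idea concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not how Google or any other search engine actually ranks ads: the `Ad` class, the `relevance` score and the "bid times relevance" ranking are all hypothetical stand-ins for the "other relevancy factors" mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ad:
    advertiser: str
    keyword: str
    bid: float               # the most the advertiser will pay per click
    relevance: float = 1.0   # hypothetical 0-1 relevancy score (an assumption)


def run_auction(query: str, ads: List[Ad]) -> Optional[Ad]:
    """Return the winning ad for a search query, or None if no keyword matches."""
    words = query.lower().split()
    eligible = [ad for ad in ads if ad.keyword.lower() in words]
    if not eligible:
        return None
    # Rank by bid weighted by relevance. If every ad has relevance 1.0,
    # this reduces to plain "highest bid wins".
    return max(eligible, key=lambda ad: ad.bid * ad.relevance)


ads = [
    Ad("ShoeCo", "sneakers", bid=2.00, relevance=0.5),
    Ad("RunFast", "sneakers", bid=1.50, relevance=0.9),
    Ad("HatHut", "hats", bid=3.00),
]
winner = run_auction("red sneakers", ads)
print(winner.advertiser if winner else "no ad shown")
```

In this made-up example the lower bidder wins because its ad is scored as more relevant, which is the general idea behind mixing relevancy into the auction.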
The biggest players in search ads are Google, Microsoft, Yahoo, Yelp and, increasingly, Amazon for shopping searches.
This model has been very profitable. It has also provided actually useful ads to users of the search engines. And so, in its simplest form, it seems to be one of the less controversial forms of online advertising.
However, Google and other search engines do collect data on searches for all of their users. They then provide that data as targeting information for advertisers on their search and display ads products.
Traditional Sponsorships and reservations
Traditional sponsorships and direct reservations happen when a publisher works with an advertiser to sell ad impressions on their website. The contract is usually for a certain amount of their website traffic, say for 50M impressions within 30 days.
In late 2018 the share of display revenue transacted this way was around 58%, with the rest of the spend transacted primarily through behavioral and retargeting-based ‘programmatic’ auctions.
Sponsorships and reservations are usually struck because of overlap of audience and/or content between the advertiser and publisher.
Publishers and advertisers do have tools to target site users based on cookie data, but this is less of a focus in sponsorships than in retargeting and behavioral advertising (more on these later).
Sponsorships generally have richer and more engaging formats that customers are more likely to engage with and enjoy when compared to more scaled programmatic ads.
The ads can be intrusive or distracting, but usually they are not overbearing and user data isn’t over-exploited.
If the use of data for targeting went away, there would still be a lot of value to advertisers and publishers for running these types of ads.
Social Media ads
Users of social media platforms knowingly enter information about themselves into the social network. That includes demographic information, location, interests and hobbies, political affiliation, religion and friends.
All of those pieces of data can be used by advertisers to target ads to those people.
The ads most often appear in the user’s feed, alongside posts by their friends and family.
Large social networks like Facebook gather data about users’ activities on the platform as well as across the web on sites that use Facebook sign in.
All of this information is tied to the social network’s user ID.
This is a huge amount of very personal information, and Facebook, in particular, has been known to share this data very widely, including with some very shady characters like Russian trolls and Cambridge Analytica, who sought to use Facebook advertising to influence the results of the 2016 US presidential election.
Facebook and social media in general have come under heavy criticism after using private data in this way.
Retargeting (also known as remarketing) is when you visit a website and then that website starts showing you ads all throughout the web. A common way of doing this is if you start the checkout process to buy something and then abandon your cart to do something else. Then you see the product you didn’t quite buy following you all around the internet in ads.
This happens when the site drops a cookie on your browser and then shares those cookies with advertising platforms so they can serve you ads. Google, Amazon and Facebook all have successful retargeting businesses, as do a number of other smaller companies, such as Criteo.
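The cookie hand-off described above can be sketched in a few lines of Python. This is a toy simulation under assumptions: the browser's cookie jar is just a dict, and `AdPlatform`, `visit_shop`, `visit_other_site` and the `ad_id` cookie name are hypothetical names I made up for illustration, not any real platform's API.

```python
import uuid


class AdPlatform:
    """Hypothetical ad platform that remembers what each cookie ID left in a cart."""

    def __init__(self):
        self.seen = {}  # cookie_id -> list of abandoned products

    def record_abandoned_cart(self, cookie_id, product):
        self.seen.setdefault(cookie_id, []).append(product)

    def pick_ad(self, cookie_id):
        products = self.seen.get(cookie_id)
        if products:
            # Retarget with the most recently abandoned product
            return f"Still thinking about {products[-1]}?"
        return "generic ad"


def visit_shop(browser_cookies, platform, abandoned_product):
    # The first visit drops a cookie with a random ID; later visits reuse it
    cookie_id = browser_cookies.setdefault("ad_id", str(uuid.uuid4()))
    # The shop shares the cookie ID and the abandoned product with the platform
    platform.record_abandoned_cart(cookie_id, abandoned_product)


def visit_other_site(browser_cookies, platform):
    # Another site asks the platform for an ad, passing along the same cookie ID
    return platform.pick_ad(browser_cookies.get("ad_id"))


browser = {}
platform = AdPlatform()
visit_shop(browser, platform, "blue running shoes")
print(visit_other_site(browser, platform))
```

A browser without the cookie (or one that blocks it) just gets the generic ad, which is why the browser controls discussed later matter so much for this business.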
Retargeting is becoming more difficult now because of stricter controls in all the major browsers (Safari, IE, Firefox, Chrome). More on third party cookie controls later on.
Advertising technology can derive likely behaviors for users by watching what they do on and off the web.
Common categories are:
- Demographic data (age, gender, income level)
- Location data (including neighborhood as proxy for psychographics & income)
- Psychographics (what the person is interested in, often derived from what websites they have been to and what they have bought before)
- In-market data (whether a person has been comparison shopping for a large purchase, such as a car or consumer electronics)
In addition to gleaning this information from a customer’s behavior on the web, companies can upload a customer database and match it to cookies online to target users.
Large retail stores, for example, can match their in-store customers to the same people on the internet by using their customer relationship management software and specialized technology for matching offline data with cookies and device IDs.
You can read more about behavioral targeting here.
Certain ad tech platforms have begun using GPS coordinates and cell-tower location data to pinpoint users on a very granular level to show them ads for local businesses.
Large tech companies, like Google, have rules about how this can be done. They try to make sure advertisers cannot use the data to infer who the user is (i.e. their name, address, or other personally identifiable information). However, other smaller players such as Factual have fewer rules.
In-app ads function by a slightly different mechanism, relying on a mobile device’s ID for advertising (IDFA) rather than a cookie in order to personalize ads.
The same kind of things can be done with these device IDs that can be done with cookies. AdTech platforms and large exchanges such as Google’s AdMob and Twitter’s MoPub amass large amounts of data about each user, which can be used to target ads.
The device IDs are anonymized, but a motivated actor could easily combine location data with an IDFA to back out something pretty close to personally identifiable information, such as an address, name or email address.
Data Management Platforms and Third party data
There are ad tech companies that specialize in helping large advertisers manage their cookie and device data, called Data Management Platforms.
These guys help companies upload their customer relationship management (CRM) databases to match their customers online and organize them into targetable lists.
For example, Target might upload all of their customer data and organize a category of ‘new moms’ who are likely to buy a certain bundle of products. Then they can match these customers to cookies and device IDs using the Data Management Platform and target ads to them online.
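Here is a rough sketch of what that matching step might look like. It assumes the match key is a hashed email address and that the platform holds a lookup table from hashed emails to cookie and device IDs. Both are assumptions for illustration, and `hash_email`, `build_segment`, `id_graph` and all of the sample data are hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize and hash an email so raw addresses are not passed around."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def build_segment(crm_rows, id_graph, segment):
    """Return the cookie/device IDs for CRM customers in the given segment."""
    targetable = set()
    for row in crm_rows:
        if row["segment"] != segment:
            continue
        # Customers the platform has never seen online simply don't match
        targetable.update(id_graph.get(hash_email(row["email"]), []))
    return targetable


# Hypothetical CRM export from the advertiser
crm_rows = [
    {"email": "Ana@example.com", "segment": "new_moms"},
    {"email": "bob@example.com", "segment": "grillers"},
    {"email": "cy@example.com", "segment": "new_moms"},
]

# Hypothetical platform-side table: hashed email -> known cookie/device IDs
id_graph = {
    hash_email("ana@example.com"): ["cookie-123", "device-abc"],
    hash_email("bob@example.com"): ["cookie-456"],
}

new_moms = build_segment(crm_rows, id_graph, "new_moms")
print(sorted(new_moms))
```

Note that hashing here is only a matching convenience. Anyone holding the same email can compute the same hash, so it should not be mistaken for real anonymization.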
Third party data providers collect data from website publishers and aggregate the data into useful behavioral categories for advertisers to target, even if they haven’t been able to make the behavioral inferences themselves.
Big Data and AI
Companies with large data sets on internet users are able to use ‘big data’ and artificial intelligence to make predictions about users online in order to target them with ads they are more likely to click.
If you have ever found yourself thinking about something and then you saw an ad for it, it’s likely not because your phone has been listening to you talk, but rather that your behavior online had been analyzed and indicated that you might be interested in that thing.
Companies with more data, more computing power and more data scientists tend to be best at this, so it should come as no surprise that this field is dominated by Google and Facebook.
Public opinion on AdTech
There has recently been a backlash about the use of behavioral data for ad targeting. A recent survey covered by Forbes said that 83% of consumers thought Behavioral Targeting was immoral!
Recent Facebook scandals such as the Cambridge Analytica scandal have highlighted just how willing tech companies are to share data with actors who may not have the users’ best interests in mind.
Retargeting ads are generally considered creepy or annoying, but less invasive than behaviorally targeted ads.
I consider search ads and traditional sponsorships the least morally questionable of the online advertising practices. However, it is worth considering that all of your search data as a consumer can now be used to fuel not only behavioral ads, but the search results themselves.
The ads and results may be more ‘relevant’ to the user, but they may also be manipulative, and exacerbate a ‘magic mirror’ phenomenon of showing the user only what they want to see.
The European Union adopted the General Data Protection Regulation (GDPR) in 2016, and it came into effect in 2018. It requires that websites gather consent to share your cookie data with a limited, specified set of third parties.
This was a big step towards transparency in the industry, but many publishers and tech providers have interpreted the law in a very lax way. It remains to be seen how many teeth the regulation has.
GDPR also has the effect of consolidating the advertising market around a handful of the largest players such as Google and Facebook, of which most users have heard and to whom they are therefore most likely to give consent to use their data, versus smaller ad tech companies.
This has the somewhat perverse effect of being anticompetitive, giving large companies advantages over smaller local European AdTech firms.
My opinion is that anti-competitive forces of GDPR are still worth it if the regulation draws greater scrutiny of privacy practices by the industry as a whole.
Having worked with some of these smaller companies personally, I’ve found they seem to be willing to take greater risks with user data than larger companies, so I’m not that sad if some of them go out of business.
California Consumer Privacy Act
So far the CCPA is the strongest data privacy regulation in the US. It will require websites to “Provide a clear and conspicuous link on the business’ Internet homepage, titled ‘Do Not Sell My Personal Information,’ to an Internet Web page…”
It also requires websites to share what they know about the users and allow them to delete their data from the site.
It goes into effect in 2020, and I’ll be anxiously watching to see what effect the regulation has on the industry.
I think this is a great start. I’d love to see more regulation of this kind at a Federal level.
US Data Privacy Regulation
Google and other players are engaging with the US government to lobby for national data privacy regulation, as Sundar Pichai, Google’s CEO, mentioned in this article.
It remains to be seen what this regulation will look like, and I hope to dive into some possibilities in a future post.
It is in the best interests of Google and the other large tech companies to engage at this stage to shape the regulation in a way that benefits them. I only hope that the legislation will be good for internet users as well.
Browsers privacy settings
Apple has recently adopted a posture of protecting its users’ privacy in a much stronger way than the other tech giants and led the way in protecting users from tracking with its Intelligent Tracking Prevention (ITP) policy change in 2017. Ad Tech players quickly found ways around this, which led Apple to a stronger update in 2018.
ITP means that ad tech companies cannot track users via third party cookies, which greatly limits the ability of ad tech companies to serve retargeted or behaviorally targeted ads.
Other browsers such as Firefox followed suit in early 2019.
Google eventually also acquiesced in 2019 by limiting cookie tracking on its Chrome browser, but to a much lesser degree than Safari or Firefox.
It’s important to point out that these browser privacy measures only pertain to what is known as ‘third party cookies’, or cookies that come from a domain other than the domain of the site the user is on.
This is important because it also tends to concentrate economic power in online advertising into the hands of the larger tech companies like Google and Facebook, who have many properties throughout the web with billions of users that they are still allowed to collect data from. Smaller companies that only provide advertising technology do not have this advantage.
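To show what ‘third party’ means in practice, here is a simplified check in Python. It is a sketch under a big assumption: it treats the last two labels of a hostname as the ‘site’, whereas real browsers use the Public Suffix List to work out the registrable domain (so this toy version gets domains like `example.co.uk` wrong). The function names and example domains are hypothetical.

```python
from urllib.parse import urlparse


def site_of(hostname: str) -> str:
    """Crude stand-in for the registrable domain: the last two labels."""
    return ".".join(hostname.lower().split(".")[-2:])


def is_third_party(page_url: str, cookie_domain: str) -> bool:
    """True if the cookie belongs to a different site than the page being viewed."""
    page_host = urlparse(page_url).hostname or ""
    # Cookie domains are often written with a leading dot, e.g. ".example.com"
    return site_of(page_host) != site_of(cookie_domain.lstrip("."))


page = "https://www.news-site.example/article"
print(is_third_party(page, "news-site.example"))   # the publisher's own cookie
print(is_third_party(page, ".adtracker.example"))  # a cookie set by an ad tech domain
```

The publisher's own cookie counts as first party and is left alone, while the ad tech domain's cookie is the kind these browser measures restrict. A company whose own sites you visit directly is always on the first party side of this check.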
The Industry Adjusts to 3rd party cookies disappearing
Assuming that browser changes and regulations completely eliminate the market for behavioral advertising based on third party cookies, there will still be forms of behavioral targeting.
Publishers are still allowed to use their own data to target ads. They may gain some incremental economic power over advertisers because of this.
There is also a trend for publishers to share 1st party data with other publishers to create bigger data sets that are more attractive to advertisers. This practice is known as ‘data coops’ and the shared first party data is known as ‘2nd party data’.
As I mentioned above, very large publishers have an advantage in a world where third party cookies are not allowed. This has been one cause of M&A activity in the AdTech and media markets, including with Verizon, AT&T, Disney and others.
In the end, it’s unclear if there will be durable privacy gains for users resulting from regulating third party cookies because of these responses.
The European Union has led antitrust measures, with cases and fines leveled at Google and Facebook. The fines sound like huge amounts of money, but are a relatively small cost of doing business for the tech giants.
More recently, calls to break up the tech giants have been heard from political leaders and analysts within the US, and the FTC and Justice Department have begun investigations into Apple, Amazon, Facebook and Google.
While not all of these actions pertain specifically to online advertising, there are profound implications for consumers whose data is currently being used widely by these platforms.
If any of Google, Facebook and Amazon are broken up, the data owned by the smaller companies would become less valuable to advertisers than the larger combined data set is currently.
Steps you can take to prevent tracking
There are some good articles written on this already that I’m going to share here. I’ll be going into this more in depth in future posts, but wanted to provide steps for action for those of you who want to act now.
Check what the companies know about you and adjust your preferences:
PS- this is by no means an exhaustive list, it’s really just scratching the surface of the hundreds of companies that have information about you, but it is a good start that includes the very largest players.
Ad Blocking on Browsers
Brave: the browser I use most of the time now. I hope to do a deeper dive on these guys soon. In short, they want to do an ad model that is not based on cookies and behavioral advertising. Will they succeed? I’m skeptical, but it’s a dream worth supporting.
Some sites don’t work super well on this browser, so I sometimes still use Chrome.
Ultimately you have a lot of control of your web experience if you decide to take control. That means deciding what advertising you find valuable and what you find invasive. It means deciding what data you give up and what data you will not allow to be harvested.
I gave you some knowledge and tools here to help you start making those decisions, and I’ll continue to write about our brave new world.
I think it is worth pointing out now that the internet is an amazing place. All the world’s information is at your fingertips! You can find and connect to friends you haven’t seen in years and who now live on the other side of the world! Even the ads are amazingly relevant and useful (even if they take that to the point of being creepy)! The point of writing this is not to denigrate the web or even the way that web-based companies make money. It’s to make sure that you are making decisions about how to use it consciously and intentionally, and not ceding those decisions to an algorithm.