ID Stuffing across Wrappers
Author: Ian Meyers & Mike O'Sullivan
April 24, 2023
Edit: April 27th, 2023
Beginning on April 25th, we noted the following changes on impacted domains:
Consent timeouts were increased from 1ms and 50ms to 497ms and 500ms. These are less extreme, but still well below industry averages.
The SharedId module was removed from all installations.
There have been no changes to the overwriting functionality as mentioned in the appendix.
A popular, identity-focused header-bidding wrapper deployed across more than 2,500 domains, including Time, The LA Times and Barstool Sports, ignores user consent, bypasses Prebid consent controls, injects identifiers into programmatic bidding traffic, and overwrites existing identifiers.
Publishers are inadvertently deploying technology that violates regional privacy law.
Publishers, ad technology providers and buyers are likely seeing reduced ad effectiveness due to volatile identifiers.
Whenever possible, we’ve followed the IAB best practices and guidelines on responsible disclosure, including notifying impacted parties and using established channels to share these findings.
At Sincera, we've been working with our customers to optimize identifier absorption beyond simple deployment figures and to improve their regulatory compliance using our consent scores.
We came across a puzzling phenomenon -- a number of publishers deploy additional wrappers alongside their primary header bidding wrapper. At first glance, these secondary wrappers seemed not to do much. Running multiple wrappers isn't unexpected in itself -- plenty of publishers deploy a self-hosted Prebid wrapper, an Amazon APS wrapper to connect to TAM, and often a video-specific wrapper like JWPlayer. In that case, both the JWPlayer wrapper and the self-hosted Prebid wrapper are based on the Prebid codebase.
We observed an interesting multi-wrapper pattern across 2,500+ domains - with two (or more) Prebid-based wrappers.
Wrapper A: A “standard” wrapper containing a mix of various Prebid modules, including RTB and Identity Modules, alongside standard Prebid modules.
Wrapper B: A second wrapper which includes only identity modules.
Without any bidders, what exactly was this identity wrapper supposed to do? Out of the box, Prebid Bidding Adapters are not supposed to depend on sources outside of their current Prebid sandbox. Prebid-standard APIs provide access to core functions and data objects, such as device storage, user identifiers and site data. This allows publishers to ensure that the adtech plays fairly and predictably.
The design and policies of Prebid restrict these real time bidding adapters in Wrapper A from contacting the Identity Adapters in Wrapper B for the purposes of user enrichment. Effectively, they are different islands, and it’s not possible via standard means for the primary bidding wrapper to reach out and access identifiers from a different wrapper. If the bidders can’t reach out, why does this unusual configuration - a full-featured Wrapper A running alongside a stubbed Wrapper B - exist?
Examining further, we found that the reverse was occurring. The identity-only wrapper - Wrapper B - was pushing its identifiers into bid requests generated by the main wrapper, Wrapper A. Rather than using standard APIs such as pbjs.setConfig(), it listens for the event that signals an auction is about to start in Wrapper A. When that event fires, Wrapper B writes the identifiers it has discovered directly into the outbound bid requests. Any identifiers already on the original request are replaced entirely by Wrapper B’s identifier set.
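The pattern can be modeled in a few lines. This is a minimal, self-contained sketch and not the code observed on any site: `makeWrapper`, the simplified `requestBids`, and the sample identifier sources are hypothetical stand-ins. The only names taken from the findings are the `beforeRequestBids` event and the idea of a handler writing into Wrapper A's pending bids. The `userIdAsEids` field follows the name Prebid uses for identifiers on a bid, but its use here is illustrative.

```javascript
// Hypothetical stand-in for a Prebid-style wrapper: a tiny event bus plus an
// auction step that emits 'beforeRequestBids' just before bids would be sent.
function makeWrapper(name) {
  const handlers = {};
  return {
    name,
    onEvent(event, fn) {
      (handlers[event] = handlers[event] || []).push(fn);
    },
    requestBids(adUnits) {
      // Listeners receive the live ad units and may mutate them in place.
      (handlers.beforeRequestBids || []).forEach((fn) => fn(adUnits));
      // Return what would go out to the SSPs.
      return adUnits.flatMap((unit) => unit.bids);
    },
  };
}

const wrapperA = makeWrapper('A');

// Identifiers held by the identity-only wrapper (sample values).
const wrapperBEids = [
  { source: 'id-vendor.example', uids: [{ id: 'B-123', atype: 1 }] },
];

// Wrapper B's listener: replaces, rather than extends, each bid's identifiers.
wrapperA.onEvent('beforeRequestBids', (adUnits) => {
  adUnits.forEach((unit) => {
    unit.bids.forEach((bid) => {
      bid.userIdAsEids = wrapperBEids.map((eid) => ({ ...eid }));
    });
  });
});

const adUnits = [
  {
    code: 'div-1',
    bids: [
      {
        bidder: 'sspOne',
        userIdAsEids: [
          { source: 'publisher.example', uids: [{ id: 'A-999', atype: 1 }] },
        ],
      },
    ],
  },
];

const outbound = wrapperA.requestBids(adUnits);
console.log(outbound[0].userIdAsEids.map((eid) => eid.source));
```

In this model the identifier that Wrapper A attached (`publisher.example`) never reaches the outbound request; only Wrapper B's entry remains.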
Ignoring Consent and Opt-Outs
The Prebid codebase has a sophisticated set of controls to ensure that it is easy for publishers to remain in compliance with applicable regulation. Think of this as a system of traffic lights - Prebid will give a red light or green light for identifier enrichment, based on the current user’s consent status.
If a user within the EU rejects all requests for consent from a publisher, Prebid does not attempt to look up, store or pass along user and device identifiers.
As implemented, Wrapper B’s design circumvents these checks by waiting until the auction is just about to start (and the checks have already occurred) to push the identifiers into the outbound SSP bid requests - without the SSP Bid Adapter code being able to confirm or restrict the user data from being added.
A publisher can do all the right things: implement both the Consent Management and GDPR Enforcement modules in their primary wrapper, but if they have added Wrapper B to their site, the back door is wide open for expensive and egregious violations of GDPR.
The end result? Publishers who implement Wrapper B will have their users identified, and their programmatic bid requests overwritten with unexpected, unconsented user data, even in cases where the user has explicitly denied permission.
What we would expect is that an identity-focused wrapper would go above and beyond -- ensuring that before identifiers were even requested, the wrapper would check to see if the user had provided consent to that vendor, as well as if the user had allowed access to store data on their device. This is possible using tools like Prebid's GDPR Enforcement modules, which many impacted publishers had installed on their primary wrapper, or settings like `defaultGdprScope`.
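A check of this kind might look like the sketch below. The consent object shape (`gdprApplies`, `purpose.consents`, `vendor.consents`) follows the IAB TCF v2 consent data format; the function name `mayRequestIdentifier`, the vendor ID 999, and the sample objects are hypothetical and only illustrate the idea.

```javascript
// Returns true only when it is acceptable to look up or store an identifier.
// tcData: consent object in the TCF v2 shape, or null if the CMP has not answered.
// vendorId: the identity vendor's Global Vendor List ID (hypothetical here).
function mayRequestIdentifier(tcData, vendorId) {
  // No answer from the CMP: treat the user as in scope and unconsented.
  if (!tcData) return false;
  // GDPR does not apply to this user; other regimes are out of scope for this sketch.
  if (tcData.gdprApplies === false) return true;
  const purposeConsents = (tcData.purpose && tcData.purpose.consents) || {};
  const vendorConsents = (tcData.vendor && tcData.vendor.consents) || {};
  // Purpose 1: store and/or access information on a device.
  const storageAllowed = purposeConsents[1] === true;
  const vendorAllowed = vendorConsents[vendorId] === true;
  return storageAllowed && vendorAllowed;
}

const VENDOR_ID = 999; // hypothetical vendor ID

const rejectedAll = {
  gdprApplies: true,
  purpose: { consents: {} },
  vendor: { consents: {} },
};
const acceptedAll = {
  gdprApplies: true,
  purpose: { consents: { 1: true } },
  vendor: { consents: { 999: true } },
};

console.log(mayRequestIdentifier(null, VENDOR_ID));        // CMP has not responded
console.log(mayRequestIdentifier(rejectedAll, VENDOR_ID)); // user rejected everything
console.log(mayRequestIdentifier(acceptedAll, VENDOR_ID)); // user accepted
```

Treating a missing CMP response as "in scope, no consent" mirrors the intent of a setting like `defaultGdprScope`: silence from the CMP is not read as permission.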
In requests to identity providers, Wrapper B was observed sending “gdpr=false” or the equivalent. This is in all likelihood due to the timeout configuration, combined with weak enforcement. Wrapper B has a highly unusual configuration of the (often included) Consent Module. The Prebid default timeout to wait for a consent response is 10 seconds. Excluding Wrapper B installations, the top 500 domains by traffic have a timeout of around 7.5 seconds. Wrapper B waits only 26 milliseconds on average, and often as little as a single millisecond (1/1000 of a second). This does not give the user a chance to register a response before the wrapper continues, and in many cases does not even give the CMP enough time to load, so downstream identity providers receive no usable GDPR signal.
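To see why a timeout of a few milliseconds behaves like having no consent check at all, consider this rough model of a first-time visitor. The 1 ms, 26 ms, 500 ms and 10,000 ms timeouts come from the figures in this post; the CMP load time and user response time are made-up sample values, and `consentSignalAtTimeout` is a hypothetical helper, not a Prebid function.

```javascript
// Models what a wrapper knows about consent at the moment its timeout fires.
// timeoutMs: how long the wrapper waits for the CMP
// cmpReadyMs: time for the CMP script to load and be able to answer (assumed)
// userChoiceMs: time until the user responds to the consent prompt (assumed)
function consentSignalAtTimeout(timeoutMs, cmpReadyMs, userChoiceMs) {
  if (timeoutMs < cmpReadyMs) return 'no-cmp-response';
  if (timeoutMs < userChoiceMs) return 'cmp-loaded-no-user-choice';
  return 'user-choice-known';
}

// Assumed sample values, for illustration only.
const sample = { cmpReadyMs: 300, userChoiceMs: 4000 };

for (const timeoutMs of [1, 26, 500, 10000]) {
  const signal = consentSignalAtTimeout(
    timeoutMs,
    sample.cmpReadyMs,
    sample.userChoiceMs
  );
  console.log(`${timeoutMs} ms -> ${signal}`);
}
```

Under these assumptions, only the 10-second default leaves room for the user's choice to be known. In Prebid configuration, this wait is the `timeout` value in the consent management module's GDPR settings.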
If there is a silver lining, it is that most ID providers required and validated consent when they detected the user was in the EU based on geoIP.
The primary value of identifiers is stability -- a reliable anchor to user or device identity where none exists. But what we observed was that in addition to ignoring user consent choices, this identity wrapper overwrote identifiers on bids to SSPs generated by the primary wrapper.
This leads to inconsistencies when the second wrapper loads late, isn't included on some pages and so on. Likewise, it will cause problems if a publisher attempts to use the identifier from the primary wrapper (e.g. as a Google Publisher-Provided ID), compare across analytics sources, etc.
The publisher's expectation would be to complement, rather than replace, existing identifiers.
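A complementing merge, as opposed to a replacement, could be sketched as follows. The function name `mergeEids` and the sample sources are hypothetical; the `source`/`uids` shape follows the OpenRTB extended identifier (eids) convention.

```javascript
// Adds incoming identifiers only for sources the bid does not already carry.
// Existing entries are left untouched, so the primary wrapper's IDs stay stable.
function mergeEids(existing, incoming) {
  const seen = new Set(existing.map((eid) => eid.source));
  const additions = incoming.filter((eid) => !seen.has(eid.source));
  return [...existing, ...additions];
}

// Sample values for illustration.
const fromWrapperA = [
  { source: 'publisher.example', uids: [{ id: 'A-999' }] },
  { source: 'shared-vendor.example', uids: [{ id: 'A-555' }] },
];
const fromWrapperB = [
  { source: 'shared-vendor.example', uids: [{ id: 'B-555' }] },
  { source: 'id-vendor.example', uids: [{ id: 'B-123' }] },
];

const merged = mergeEids(fromWrapperA, fromWrapperB);
console.log(merged.map((eid) => `${eid.source}:${eid.uids[0].id}`));
```

Here the primary wrapper's value for `shared-vendor.example` survives and only the genuinely new source is appended. Merging alone does not address consent: any added identifier would still need to pass the consent checks discussed earlier.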
Breaking Publisher-Specific Keys
Many identifiers, responding to publisher concerns about data leakage, no longer generate a "universal" ID that can be understood by all participants. Instead, they issue a more specific key, unique to a sales house or even a single domain. Only participants who should have access to that identifier receive the decryption key needed to reach a stable identifier.
By overwriting a (potentially) publisher-specific identifier with one retrieved by the identity wrapper owner, the wrapper could be leaking identifier data to bidders the publisher took pains to exclude.
The new Data Protection Module, which surfaces controls to publishers to place limits on which bidders can receive Seller-Defined Audiences or identifiers, is similarly subverted by the identity wrapper, which will add all available identifiers to any and all bid requests, regardless of the primary wrapper’s bidder restrictions.
Our Recommendations and Takeaways
Publishers should ensure they understand not only what their vendors are doing, but how they actually do it. Publisher resources are spread thin, but identity is too valuable (and risky!) to ignore. Evaluate and test adtech across integrations, rather than scoped to a single solution.
Identity vendors need to take a “trust but verify” approach to signals they receive from other entities. As signal data diminishes, it’s important to understand and audit upstream data flows, as well as to build in safeguards for sensitive product functionality.
Ad tech vendors need to build from the assumption that user consent must be actively given, and ensure that their offering does not rest entirely on permissions that could be denied or revoked.
Interested in the intersection of adtech, privacy and identity? Curious to learn how Sincera could help your company? Drop us a line: firstname.lastname@example.org.
Appendix
The client-side code used by Wrapper B listens to activity on Wrapper A. Just before Wrapper A issues bid requests, and after consent enforcement has run, Wrapper B modifies each bid via the beforeRequestBids event to include Wrapper B's identifiers.
.har files that demonstrate Wrapper B behavior are available on request.