Here’s today’s AdExchanger.com news round-up.
When All The Data’s Not Enough
Last week, YouTube CEO Neal Mohan warned OpenAI that ingesting data from public videos to train its AI models would be a terms of service violation.
OpenAI fully did this, by the way.
The response was mostly eye-rolling about Google bemoaning another company’s disregard for original content in pursuit of its own data collection.
But the dilemma of transcribing and ingesting YouTube videos, not to mention AI data licensing deals by the likes of Reddit and Tumblr, demonstrates a painful fact about AI software: The amount of content required to make these chatbots work is staggering. Even the internet, a seemingly endless well of text and imagery, is too little and wholly undifferentiated.
That data shortage has led to mass corner-cutting by AI software companies, The New York Times reports.
Many AI firms are turning to “synthetic data” – which is to say, AI models generate content that imitates web publishing, and that content is then fed back into the system as training material.
Google itself originally said it would only transcribe and ingest YouTube video data for purposes of Google Translate. That is, until 2022, when Google broadened that self-imposed rule to include Bard and its cloud-based AI products.
Stuck In The Sand
Publishers have a few bones to pick with Google Chrome’s Privacy Sandbox.
Not only is the Sandbox currently incapable of supporting most digital advertising use cases, according to the IAB Tech Lab, but publishers also worry it will deepen their dependence on Google’s ad products, Digiday reports.
The Privacy Sandbox auction works sequentially, with the results of one auction fed into another (try to sort out all the diagrams – good luck). This workflow arguably bypasses the traditional role of the ad server and supply-side platforms within open RTB auctions, which could amount to self-preferencing by Google, an anonymous source with knowledge of the matter said. For publishers and SSPs who saw benefits from unifying the ad auction, moving back to a waterfall-like setup raises alarm bells. Not to mention the fact that there are hierarchies like “top-level sellers” and “top-level auctions.” Any guesses as to which tech partner publishers will end up feeling obligated to put in those spots?
Naturally, Google denies any self-serving motives. But, regardless, all the Sandbox squabbling could delay cookie deprecation further.
See You Later, Legislator
Take this with the usual grain of salt, but a federal data privacy law is in the works, the Washington Post reports.
On Sunday, Senate Commerce Committee Chair Maria Cantwell (D-Wash.) and House Energy and Commerce Committee Chair Cathy McMorris Rodgers (R-Wash.) floated a bipartisan bill, though it’s in draft form.
The proposed legislation, called the American Privacy Rights Act, would mandate that companies earning more than $40 million in annual gross revenue disclose their data collection practices and only collect the data they need to market their products. The act would also prohibit companies from discriminating against users based on data they collect.
Users would have the right to access their data, fix mistakes or delete their customer profiles altogether. People would also be able to opt out of targeted ads.
And, in a nod to a liberal priority, individuals would be allowed to sue companies that failed to delete their data or meet the other requirements.
But Wait, There’s More!
Passive TV viewership is behind the growth of free ad-supported TV channels. [The Verge]
Apple TV+ grows its US streaming market share but still lags its competitors. [9to5Mac]
Google explains why Ad Strength is ‘so important’ as it addresses industry concerns. [Search Engine Land]
Paramount Global is in exclusive merger talks with Skydance Media. [WSJ]
The most popular TV shows of the streaming era. [Bloomberg]
The Soros Fund is building an audio empire. [Semafor]