218 | ⬅️👀 All Major AI Models Lean Left Politically

It’s Friday. And we have a teaser for the next-gen humanoid robot from Figure. The Figure 02 coming August 6th. Look at the hand and fingers!!

Here we go.

In today’s Brainyacts:

AI music generator startups counterpunch
AI events in August and September
All major AI models lean left politically and other AI model news
Get a legal counsel job at an AI startup and more news you can use
👋 to all subscribers!

To read previous editions, click here.

Lead Memo

👊🥊 AI Music Generation Startups Counterpunch

Two AI startups filed their answers yesterday in lawsuits that will likely become a crucial battleground for copyright in the AI era. And, wow!

Recent lawsuits filed by major record labels, including Sony Music Entertainment, UMG Recordings, Inc., and Warner Records, Inc., against AI music generation startups Suno and Udio have brought to light significant questions about the legal and ethical use of copyrighted material in AI training processes. The core of the complaints by these labels centers on allegations of unlicensed copying and exploitation of sound recordings. However, the bold responses from the accused startups present a contrasting perspective that highlights a critical issue: how do AI models utilize ingested content to create new works, and what does this mean for copyright law?

The Labels’ Complaints

It’s important to note that the plaintiffs are not alleging that the output of these services infringes copyright, only that the copying and use of the music is itself unlawful. From the complaint: “Plaintiffs are not . . . alleging that these outputs themselves infringe the Copyrighted Recordings.”

So, what are the plaintiffs alleging? The record labels argue that Suno and Udio have engaged in widespread copying of sound recordings without obtaining necessary permissions, thus infringing on copyright laws. The lawsuits emphasize several key points:

Unauthorized Copying: The labels assert that Suno and Udio have copied numerous sound recordings without authorization, an act that they claim constitutes direct copyright infringement.
Commercial Exploitation: They argue that the startups have commercially exploited these recordings, as both companies monetize their AI-generated music services.
Harm to the Music Industry: The plaintiffs contend that this unauthorized use deprives artists and rightsholders of fair compensation and threatens the overall value and diversity of music.
Rejection of Fair Use: The labels reject any fair use defense, arguing that the mass copying of sound recordings for commercial purposes does not qualify as fair use.
Lack of Transparency: The lawsuits accuse Suno and Udio of being evasive about the extent of their copying, which the labels interpret as an attempt to conceal deliberate copyright infringement.

The Startups’ Defense (and Offense)

In its bold response, Udio mounts a defense built on the transformative nature of its AI models and the concept of fair use:

Fair Use Argument: Udio claims that its use of existing sound recordings for analysis and pattern identification falls under fair use. It argues that its model does not store copies of sound recordings but instead uses the information to generate entirely new musical renditions. From Udio’s answer, “[we] use existing sound recordings as data to mine and analyze for the purpose of identifying patterns in the sounds of various musical styles, all to enable people to make their own new creations.”
Transformative Nature: Udio emphasizes that their AI service is not a repository of pre-existing content. Instead, it creates new works based on a "vast store of information" about musical styles.
Non-infringing Outputs: The startups argue that the outputs generated by their AI models do not infringe on copyrights, as the process involves creating new, original works rather than reproducing existing ones.
Misunderstanding of Technology: Udio contends that the plaintiffs misunderstand the technology, pointing out that similarities in generated music could stem from the general musical style rather than direct copying of specific recordings.
Copyright Misuse: Udio also hints at a defense based on copyright misuse, accusing the plaintiffs of leveraging their aggregated copyrights to gain unfair market advantages.

Oh, but wait. It gets really good: the startups don’t stop at accusing the labels of using copyright to reduce competition in the marketplace. They also argue that the plaintiffs violated the startups’ terms of service when the record labels used the services to “prove” copyright infringement.

“So when Plaintiffs’ lawyers prompted Udio with the lyrics to “My Way,” see id. ¶ 61, they flagrantly violated Udio’s Terms of Service—which are designed to ensure that the product is used to generate new artistic expression.”

In other words: not only are the labels bullies who are ignorant of how our technology works, they also flagrantly ignored our terms of service, which clearly spell out that what they did to prove their point is in fact a prohibited use of our service. Damn!

Ok, this will be interesting and fun to watch play out. But it does re-raise the fundamental challenge in this arena, a.k.a. the core question.

The Core Question: How Do AI Models Use Ingested Content?

The crux of the debate lies in understanding how AI models process and utilize ingested content to create new works. Udio’s statement that their model is a "vast store of information" about musical styles rather than a library of pre-existing content raises important questions.

AI models, including large language models (LLMs), typically "tokenize" input data, breaking it down into smaller components (tokens). These tokens are then analyzed in relation to one another to generate new content. This process theoretically deconstructs the original work into such small pieces that the new output may not resemble the original in any recognizable way. However, this raises complex questions:

Tokenization and Originality: Does the tokenization process effectively dismantle the original copyrighted work into unrecognizable components, thereby creating something entirely new and non-infringing?
Use of Multiple Versions: If multiple versions of a song are used in training, can it be said that the AI model utilized all these versions, or is this irrelevant if the new work is considered original?
Copyrightability of Tokens: If tokens are seen as individual bytes of data, can these be subject to copyright, and does this challenge the notion of originality in AI-generated content?
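
For readers who want to see what "tokenizing" looks like in practice, here is a minimal toy sketch in Python. It is an illustration only, under loud assumptions: the lyric is made up, the word-level tokenizer and the pair-counting step are simplified stand-ins, and none of the function names come from Suno, Udio, or any real library. Real music models work on audio and use far more sophisticated methods. The point is simply that what gets analyzed is a sequence of numbers and the statistics of how those numbers follow one another.

```python
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a toy stand-in for a real tokenizer)."""
    return text.lower().split()


def build_vocab(tokens):
    """Give each distinct token an integer ID, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace each token with its integer ID."""
    return [vocab[token] for token in tokens]


def pair_counts(ids):
    """Count how often each token ID is immediately followed by another."""
    return Counter(zip(ids, ids[1:]))


# A made-up lyric, used purely for illustration.
lyric = "la la la sing along la la"

tokens = tokenize(lyric)
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
pairs = pair_counts(ids)

print(tokens)  # the words, split apart
print(ids)     # the same words as numbers
print(pairs)   # which number tends to follow which
```

Notice that the pair counts are a statistical summary, not a copy of the lyric. In this tiny example, though, you could still largely reconstruct the original line from the IDs and the vocabulary, which is one reason the questions above are not easy to answer.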

Where are we going?

The lawsuits against Suno and Udio highlight a critical, unresolved issue at the intersection of AI and copyright law: how AI models use ingested content to create new works. As AI technology continues to advance, the legal frameworks governing copyright will need to evolve to address these complexities. See the recent report from the US Copyright Office below. Understanding the tokenization process and its implications for originality and copyrightability is crucial in this ongoing debate. The answers to these questions will shape the future of AI-generated content and its place within the legal landscape of intellectual property. Too bad few people actually understand what is going on inside these models well enough to describe it crisply.

If you want to read the filings in full (and you should if you have any interest), here is a link to one of them (the two are largely the same).

Spotlight

📅 🎤 Upcoming AI Events

Date:	AI Conference:	Location:
Aug 7, 2024	AI Enterprise Scale - AI Impact Boston	Boston MA
Aug	Ai4 2024	Las Vegas, NV
Aug 19 to 22, 2024	AI Creative Summit + Bootcamp	Digital
Sept 5 to 6, 2024	Conversational AI Innovation Summit	San Francisco, CA
Sept 5 to 6, 2024	K1st World Symposium	Stanford, CA
Sept 9 to 11, 2024	Generative AI for Automotive USA 2024	Detroit, MI
Sept 9 to 11, 2024	Software-Defined Vehicles USA 2024	Ann Arbor, MI
Sept 9 to 12, 2024	Efficient Generative AI Summit	San Jose, CA
Sept 9 to 12, 2024	AI Hardware & Edge Summit	San Jose, CA
Sept 10 to 11, 2024	The AI Conference 2024	San Francisco, CA
Sept 11, 2024	AI Powered Supply Chain - AI Impact SF	San Francisco, CA
Sept 11 to 12, 2024	AI for Defense Summit	Washington, DC
Sept 11 to 12, 2024	AIAI Berlin	Berlin, Germany
Sept 16 to 18, 2024	Machine Learning in Quantitative Finance	Amsterdam, NL
Sept 16 to 18, 2024	Responsible AI Summit	London, UK
Sept 18, 2024	Data Science Salon MIA	Miami, FL
Sept 18 to 19, 2024	CDAO Government	Washington, DC
Sept 18 to 20, 2024	Credit Risk Management, Modelling and Validation	Amsterdam, NL
Sept 23 to 25, 2024	Machine Learning in Quantitative Finance	New York, NY
Sept 24 to 26, 2024	AI for Pharma and Health Care	Amsterdam, NL

AI Model Notables

► All major AI chatbots found to lean left politically - even Elon Musk’s Grok.

► New “thermometer” method prevents an AI model from being overconfident about wrong answers.

► A sneaky new model from Google appears out of nowhere and takes over the lead spot on a performance benchmark, beating out GPT-4o and Claude 3.5 Sonnet. We don’t know much about it yet. The benchmark, the LMSYS Chatbot Arena, is a crowdsourced open platform where people can test and evaluate LLMs.

► Microsoft says OpenAI is now a competitor in AI and search even as it has invested over $13 billion into it.

► Google tweaks Search to help hide explicit deepfakes.

► TikTok is one of Microsoft’s biggest AI cloud computing customers.

► Where is OpenAI putting its investment money? Across the whole spectrum of the market, from consumer-focused AI startups to pro-tools to B2B. Is this a diversified approach out of necessity (the winners are unclear) or belief (AI will indeed have an impact across the entire spectrum)? This is a bogus question as the answer is clearly the latter.

News You Can Use:

➭ The EU’s AI Act is now in force (as of yesterday) and major compliance obligations come with it.

➭ The US Copyright Office adopts the term “digital replica” to refer to deepfakes in the first of a series of reports on the impact of AI. This report calls out the need for more specific federal laws on the topic. Read the full report HERE.

➭ Backlash over a commercial aired during the Olympics showing how a parent might use Google’s GenAI - Dear Google, who wants an AI-written fan letter?

➭ Taco Bell to roll out AI drive-thru ordering in hundreds of locations by end of year (recall McDonald’s ended their AI drive-thru experiment with IBM).

➭ Despite getting kicked to the curb by McDonald’s, IBM has now booked more than $2 billion worth of generative AI business. Most of that comes from consulting signings, with the rest coming from software.

➭ JOB ALERT: Join a rising healthcare genAI startup as their legal counsel.


Who is the author, Josh Kubicki?

Some of you know me. Others do not. Here is a short intro. I am a lawyer, entrepreneur, and teacher. I have transformed legal practices and built multi-million dollar businesses. Not a theorist, I am an applied researcher and former Chief Strategy Officer, recognized by Fast Company and Bloomberg Law for my unique work. Through this newsletter, I offer you pragmatic insights into leveraging AI to inform and improve your daily life in legal services.

DISCLAIMER: None of this is legal advice. This newsletter is strictly educational and is not legal advice or a solicitation to buy or sell any assets or to make any legal decisions. Please be careful and do your own research.