209 | 🍽️ 🍝 Eating at AI dinner parties + synthetic data

Brainyacts #209

It’s Tuesday. In two days it is Independence Day here in the US. Last night I drove to Indiana (where they allow the sale of “real” fireworks, like the ones that go hundreds of feet into the air). I've been doing it for 20 years or more now. That said, I hope to see you all on Friday as I avoid any accidental explosions!!! 😬 🎇

Let’s launch into it.

In today’s Brainyacts:

  1. Understanding synthetic data

  2. Online Presence Analysis Prompt

  3. Microsoft’s AI boss on “fair use” and other AI model news

  4. A guide to AI lawsuits and more news you can use

    👋 to all subscribers!

Lead Memo

🥸 📈 Synthetic data: What you should know

Gartner predicts that 75% of businesses will be using generative AI to create synthetic data by 2026. Why? Real data is expensive and legally troublesome, and, believe it or not, we are running out of it.

So what is synthetic data?

Synthetic data refers to artificially generated information that mimics the statistical properties and patterns of real-world data without containing actual individual records. It serves as a proxy for authentic data, offering a solution to privacy concerns, data scarcity, and the need for diverse datasets in AI training.

Imagine you're trying to teach a computer to recognize cats. Normally, you'd need thousands of real cat photos. But what if you could create realistic "fake" cat photos instead? That's the basic idea behind synthetic data.

Here's why it's useful:

  1. Privacy: You can create data that looks real without using anyone's personal information.

  2. Cost: It's often cheaper than collecting real data.

  3. Flexibility: You can generate exactly the kind of data you need, even rare scenarios.

As for how it's made:

  1. First, you study the patterns in real data.

  2. Then, you create computer programs (algorithms) that can generate new data following those same patterns.

  3. These programs use complex math and statistics to ensure the fake data looks and behaves just like real data.

It's a bit like a very advanced version of the predictive text on your phone. The computer learns the rules of how real data looks and then creates new data following those rules.
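For the curious, the three steps above can be sketched in a few lines of Python. This is a toy illustration, not how production generators work: it assumes the real data follows a simple bell curve, while real tools model far richer patterns. The billable-hours numbers and the function names `fit_pattern` and `generate_synthetic` are made up for this example.

```python
import random
import statistics


def fit_pattern(real_values):
    """Step 1: study the real data by summarizing it as a mean and a standard deviation."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def generate_synthetic(mean, stdev, n, seed=0):
    """Steps 2 and 3: draw n new values from a normal distribution with the same pattern.

    Assumption: a normal (bell curve) distribution is a good enough model of the data.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical "real" data: monthly billable hours for ten lawyers (invented for illustration).
real_hours = [152, 160, 148, 171, 165, 158, 149, 162, 155, 168]

mean, stdev = fit_pattern(real_hours)
synthetic_hours = generate_synthetic(mean, stdev, n=1000)

print(f"real:      mean={mean:.1f}, stdev={stdev:.1f}")
print(f"synthetic: mean={statistics.mean(synthetic_hours):.1f}, "
      f"stdev={statistics.stdev(synthetic_hours):.1f}")
```

The synthetic list is 100 times larger than the original, and its average and spread should land close to the real ones, yet every value in it was generated rather than copied from a real person's record.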

Potential Risks and Considerations

While synthetic data presents numerous advantages, its use is not without risks:

  1. Data Quality: Inaccuracies or biases in the original data may be amplified in synthetic datasets, potentially leading to further flawed AI models.

  2. Over-reliance: Exclusive use of synthetic data may result in AI systems that perform well on artificial scenarios but fail in real-world applications. It’s like binging on carbs and avoiding protein.

  3. Legal Admissibility: The use of synthetic data in legal proceedings may face challenges regarding authenticity and reliability if the foundational data of an AI model is called into question.

  4. Ethical Concerns: Generation of synthetic legal data raises questions about the potential for misuse, such as creating false precedents or misleading evidence.

  5. Regulatory Compliance: The use of synthetic data must still align with data protection laws and with ethical guidelines in legal practice.

So what?

  • Synthetic data is part of the AI privacy concerns megatrend.

  • Nearly 70% of consumers around the world are concerned about their online privacy.

  • And nearly 60% of consumers say the use of AI in collecting and processing personal data is a significant threat to their privacy.

  • More specifically, KPMG reports that 63% of people are concerned that generative AI could potentially expose their personal data via breaches or other unauthorized access.

  • Synthetic data is the ideal solution for companies that want to train AI models while maintaining customer privacy.

Spotlight

🎥 💄 Online Presence Analysis Prompt

Throwback Tuesday. Let’s do a simple prompt for those of you who may still be new to prompting or who have fallen into a rut.

This prompt is designed to help users conduct a comprehensive analysis of an individual's online presence. It guides you through researching the person, assessing their approachability, understanding how their network perceives them, and considering a cynical view of their online activity.

I ran it on myself and on two others who I know have an online presence. Both Joshua and Andrew have been outspoken and caused some stirs, so I wanted to see how well this prompt could capture and explain that. It picked up on these nuances and gave a good jumping-off point to learn more.

▶︎▶︎ PROMPT

Your task is to research [your name], a [describe who you are within a business context]. (If there are multiple options for the same name, check which one to analyze before proceeding) Answer the following questions in your assessment:

How can this person's online presence be summarized from different perspectives?

What does their online presence indicate about their approachability?

How might their followers and professional network perceive this person based on their online activity?

What would a cynical view of their online activity reveal?

Results:

Joshua Browder, founder of DoNotPay

Andrew Arruda, founder of ROSS Intelligence

AI Model Notables

Microsoft’s AI boss thinks it’s perfectly okay to steal content if it’s on the open web.

Elon Musk revealed that Grok-2 will be released in August and will address the problem of LLMs being trained on data from other models.

Meta changes its labels for AI-generated images after complaints from photographers. The company had flagged images with minor modifications as "Made with AI."

Apple releases a new AI model (called 4M) to the public, showing it is a serious research and development player in AI, not just a consumer-friendly AI packager.

News You Can Use:

China AI Startups Head to Singapore in Bid for Global Growth

YouTube now lets you request removal of AI-generated content that simulates your face or voice 

This is Big Tech’s playbook for swallowing the AI industry (and avoiding antitrust actions (for now))

This man spent a year using AI to create menus for AI dinner parties - this is how it went.

AI lawsuits worth watching: A curated guide

Louisiana governor vetoes political deepfakes bill

Who is the author, Josh Kubicki?

Some of you know me. Others do not. Here is a short intro. I am a lawyer, entrepreneur, and teacher. I have transformed legal practices and built multi-million dollar businesses. Not a theorist, I am an applied researcher and former Chief Strategy Officer, recognized by Fast Company and Bloomberg Law for my unique work. Through this newsletter, I offer you pragmatic insights into leveraging AI to inform and improve your daily life in legal services.

DISCLAIMER: None of this is legal advice. This newsletter is strictly educational and is not legal advice or a solicitation to buy or sell any assets or to make any legal decisions. Please be careful and do your own research.