OpenAI admits it’s ‘impossible’ to create ChatGPT-like tools without using copyright material, amid court battles over intellectual property theft allegations
Trouble continues to brew for OpenAI over copyright issues and its use of copyrighted resources to train its AI-powered chatbot without compensation.
While the OpenAI fiasco that led its board of directors to strip Sam Altman of his position as CEO is out of the way, the company can’t catch a break as more trouble is seemingly brewing. As 2023 came to an end, The New York Times publicly announced its plans to sue Microsoft and OpenAI over AI unfairly using its copyrighted material, which it says negatively impacted the outlet monetarily.
Recently joining the fray, two non-fiction authors filed a class-action lawsuit against Microsoft and OpenAI for intellectual property theft, further staking a claim of $150,000 as restitution for damages. For those unaware, AI-powered chatbots like OpenAI’s ChatGPT or Microsoft’s Copilot (formerly Bing Chat) rely heavily on already existing information and resources from the internet (predominantly from websites) for training purposes.
The issue here is that the AI chatbots use the information to curate specific and detailed responses to queries, with only “subtle” attribution to the source. What’s more, no compensation is provided to content creators for using their work to train these models.
OpenAI recently admitted that it’s “impossible” to create tools like ChatGPT without copyrighted material from the internet in its submission to the House of Lords Communications and Digital Select Committee. For an AI chatbot to provide users with accurate information, it has to refer to vast resources already existing on the internet. However, the twist is that virtually everything on the internet right now is copyrighted.
Because copyright today covers virtually every sort of human expression – including blogposts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials.
OpenAI indicated that limiting its training data set to copyright-free material would create AI chatbots that cannot meet the average user’s minimum requirements. Per the company’s submission and defense strategy, it’s apparent that “fair use” of copyrighted content is its entire lifeline.
Fair use of copyrighted resources creates a gray area, ultimately presenting a scenario where chatbots can obtain and use copyrighted information without necessarily seeking permission from the owner first. “Legally, copyright law does not forbid training,” OpenAI added.
There’s no AI without copyrighted content
OpenAI, one of the most sought-after companies when it comes to generative AI, has openly admitted that it’s next to impossible to create AI-powered chatbots like ChatGPT without using copyrighted material to train the models. This is despite having unlimited access to Microsoft resources, on top of its initial multi-billion dollar investment in the technology.
In the past few months, ChatGPT has suffered several setbacks, including reports that it’s getting dumber and a decline in its user base. This is amid speculation that OpenAI is running on fumes and on the verge of bankruptcy. Granted, running a chatbot daily is quite a costly affair. Reportedly, it’s to the tune of $700,000 per day and one water bottle per query for cooling. A report highlighted that generative AI could consume enough energy to power a small country for a year by 2027.
While the matter is still in court, it’ll be interesting to see how things pan out. President Biden issued an Executive Order addressing safety and privacy concerns revolving around AI, but guardrails for the technology remain a major concern among most users.
AI chatbots have been spotted having lucid hallucinations, erroneously recommending a Food Bank as a tourist attraction, and even asking readers to take part in a poll to determine the cause of a woman’s unfortunate passing. If this happened while the chatbots had access to copyrighted material, it raises a lot of concern about how much damage the technology would cause when restricted to copyright-free data. In the meantime, Google’s Bard could potentially rise through the ranks, given its unlimited access to the entire internet.
What are your thoughts on AI chatbots using copyrighted resources without compensation and sweeping the issue under the rug as “fair use”? Let us know in the comments.
Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You’ll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.