Jon Thomas
- Jul 5, 2023

Lawsuit Claims OpenAI Stole ‘Massive Amounts Of Personal Data’ To Train ChatGPT

The Technocrat mind sees access to any and all data in the universe as an unalienable right. Thus, Technocrats have no moral or ethical restraint when hoovering up otherwise protected data such as medical records and personal data. This behavior underscores Technocracy’s disdain for laws and the rule of law.

OpenAI stole “massive amounts of personal data” to train ChatGPT, a lawsuit alleges.

The proposed class-action suit claims that Sam Altman’s company “secretly” harvested data to train its large language models so that its chatbot could replicate human language.

“Despite established protocols for the purchase and use of personal information, Defendants took a different approach: theft,” the lawyers wrote in the 157-page lawsuit, filed on Wednesday in the US District Court for the Northern District of California.

The lawsuit alleges that OpenAI crawled the web to amass huge amounts of data, including vast quantities taken from social-media sites. OpenAI’s proprietary AI corpus of personal data, WebText2, for example, scraped huge amounts of data from Reddit posts and the websites they linked to, the lawsuit claims.

The data accessed included “private information and private conversations, medical data, information about children — essentially every piece of data exchanged on the internet it could take — without notice to the owners or users of such data, much less with anyone’s permission,” per the lawsuit.

This amounted to “the negligent and otherwise illegal theft of personal data of millions of Americans who do not even use AI tools,” the lawsuit claims.

OpenAI did not immediately respond to Insider’s request for comment, made outside of regular working hours.

As well as scraping the “digital footprints” of the wider public, the lawsuit claims that OpenAI also stores and discloses users’ private information, including the details they enter to create OpenAI accounts, their chat log data, and social media information.

Alongside people who use ChatGPT directly, this includes data from people using applications that have integrated ChatGPT, such as Snapchat, Stripe, Spotify, Microsoft Teams, and Slack, the lawsuit alleges. The companies did not immediately respond to Insider’s request for comment.

The lawsuit is seeking a temporary freeze on commercial access to and commercial development of OpenAI’s products until the company has implemented more regulations and safeguards, including allowing people to opt out of data collection and preventing its products from “surpassing human intelligence and harming others.”

The lawsuit also seeks financial compensation for people whose data was accessed to train the bots.

As well as OpenAI, major backer Microsoft was named as a defendant.

The plaintiffs were identified only by their initials, occupations, and state, which their lawyers said was to “avoid intrusive scrutiny as well as any potentially dangerous backlash.”

Generative AI, which can create text, audio, images, and videos, has exploded in popularity since OpenAI released ChatGPT in November 2022.

People have been using generative AI for personal, professional, and academic purposes, though there are concerns about its access to data.

