[LINK] OpenAI just admitted it can't identify AI-generated text.

Stephen Loosley stephenloosley at outlook.com
Sat Jul 29 00:23:11 AEST 2023


OpenAI just admitted it can't identify AI-generated text.

That's bad for the internet and it could be really bad for AI models.


By Alistair Barr, Jul 28, 2023
https://www.businessinsider.com/openai-cant-identify-ai-generated-text-bad-for-internet-models-2023-7


Large language models and AI chatbots are beginning to flood the 
internet with auto-generated text.

It's becoming hard to distinguish AI-generated text from human writing.

OpenAI launched a system to spot AI text, but just shut it down because 
it didn't work.


Beep beep boop. Did a machine write that, or did I?

As the generative AI race picks up, this will be one of the most 
important questions the technology industry must answer.

ChatGPT, GPT-4, Google Bard, and other new AI services can create 
convincing and useful written content. Like all technology, it is 
being used for both good and ill: it can make writing software code 
faster and easier, but it can also churn out factual errors and 
outright lies.

So a reliable way to tell AI-generated text from human writing is 
foundational.


OpenAI, the creator of ChatGPT and GPT-4, realized this a while ago. In 
January, it unveiled a "classifier to distinguish between text written 
by a human and text written by AIs from a variety of providers."

The company warned that it's impossible to reliably detect all 
AI-written text.

However, OpenAI said good classifiers are important for tackling several 
problematic situations. Those include false claims that AI-generated 
text was written by a human, running automated misinformation campaigns, 
and using AI tools to cheat on homework.
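
OpenAI hasn't published how its classifier worked under the hood, but 
to make the idea concrete, here is a minimal sketch of one common 
detection approach: scoring text by its perplexity under a reference 
language model, roughly the signal that tools like GPTZero are 
reported to use. The choice of GPT-2 and the cutoff below are my 
illustrative assumptions, not anything OpenAI has described:

# Toy illustration, not OpenAI's classifier: score text by its
# perplexity under a reference language model. Model-generated text
# tends to look "unsurprising" to a similar model, so it scores low.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2 (lower = less surprising)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels supplied, the model returns mean cross-entropy loss.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return float(torch.exp(loss))

# The cutoff is an arbitrary assumption for illustration; real detectors
# calibrate it on data, and even then accuracy is poor.
CUTOFF = 40.0
verdict = "likely AI" if perplexity("Beep beep boop.") < CUTOFF else "likely human"
print(verdict)

The trouble, as OpenAI's shutdown suggests, is that perplexity scores 
for human and machine writing overlap heavily, so any single cutoff 
misfires often.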

Less than seven months later, the project was scrapped.

"As of July 20, 2023, the AI classifier is no longer available due to 
its low rate of accuracy," OpenAI wrote in a recent blog. "We are 
working to incorporate feedback and are currently researching more 
effective provenance techniques for text."
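
OpenAI doesn't say what those "provenance techniques" might be. One 
candidate from the research literature is statistical watermarking 
(e.g. Kirchenbauer et al., 2023): the generating model subtly biases 
its word choices so that anyone holding a secret key can later test 
for that bias. Here is a toy sketch with a fake "model" standing in 
for a real one; the vocabulary, key, and list-splitting scheme are all 
my own simplifications:

import random

VOCAB = list(range(1000))
KEY = 1234  # secret shared by generator and detector

def green_list(prev_token: int) -> set[int]:
    """Half the vocabulary, pseudo-randomly keyed off the previous token."""
    rng = random.Random(prev_token * 1_000_003 + KEY)
    return set(rng.sample(VOCAB, len(VOCAB) // 2))

def generate_watermarked(length: int) -> list[int]:
    """A fake 'model' that always picks its next token from the green list."""
    rng = random.Random()
    tokens = [0]
    for _ in range(length):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens

def green_fraction(tokens: list[int]) -> float:
    """Share of tokens that fall in their predecessor's green list."""
    hits = sum(t in green_list(p) for p, t in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

print(green_fraction(generate_watermarked(200)))           # ~1.0: watermarked
print(green_fraction([0] + random.choices(VOCAB, k=200)))  # ~0.5: unmarked

The attraction is that detection doesn't depend on how human-like the 
text reads, only on statistics the generator deliberately planted.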


The implications

If OpenAI can't spot AI writing, how can anyone else? Others are working 
on this challenge, including a startup called GPTZero. But OpenAI, with 
Microsoft's backing, is considered the leader in this field.

Once we can't tell the difference between AI and human text, the world 
of online information becomes more problematic.

There are already spammy websites churning out automated content using 
new AI models. Some of them have been generating ad revenue, along with 
lies such as "Biden dead. Harris acting President, address 9 a.m." 
according to Bloomberg.

This is a very journalistic way of looking at the world. I get it. Not 
everyone is obsessed with making sure information is accurate. So here's 
a more worrying possibility for the AI industry:

If tech companies inadvertently use AI-produced data to train new 
models, some researchers worry those models will get worse. They will 
feed on their own automated content and fold in on themselves, in 
what's being called AI "model collapse."

A group of AI researchers from fancy universities including Oxford, 
Cambridge and Toronto has been studying what happens when text produced 
by a GPT-style AI model (like GPT-4) forms most of the training dataset 
for the next models.

"We find that use of model-generated content in training causes 
irreversible defects in the resulting models," they concluded in a 
recent research paper.
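
You can see the mechanism in a toy simulation. This is my own sketch, 
not the paper's experiment, and every number in it is an arbitrary 
assumption: each "generation" re-estimates a token distribution from a 
finite sample of the previous generation's output, and rare tokens 
that happen to draw zero counts disappear for good.

import numpy as np

rng = np.random.default_rng(42)

# "Human" language: a few common tokens plus a long tail of rare ones.
vocab_size = 1000
probs = np.ones(vocab_size)
probs[10:] = 0.01            # tokens 10..999 form the rare tail
probs /= probs.sum()

for generation in range(10):
    # Each new "model" is trained only on 5,000 tokens sampled from
    # the previous model, by re-estimating token frequencies.
    sample = rng.choice(vocab_size, size=5000, p=probs)
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()
    print(f"gen {generation}: tokens surviving = {(probs > 0).sum()}")

Once a token draws zero counts it can never be sampled again, so each 
generation knows strictly less of the original distribution than the 
last. That is the irreversibility the paper describes.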

After seeing what could go wrong, the authors issued a plea and made an 
interesting prediction.

"It has to be taken seriously if we are to sustain the benefits of 
training from large-scale data scraped from the web," they wrote.

"Indeed, the value of data collected about genuine human interactions 
with systems will be increasingly valuable in the presence of content 
generated by LLMs in data crawled from the Internet."

We can't begin to tackle this existential problem if we can't tell 
whether a human or a machine wrote something online. I emailed OpenAI to 
ask about their failed AI text classifier and the implications, 
including Model Collapse. A spokesperson responded with this statement: 
"We have nothing to add outside of the update outlined in our blog post."

I wrote back, just to check if the spokesperson was a human. "Hahaha, 
yes I am very much a human, appreciate you for checking in though!" they 
replied.
