
Marketing

Michael King

July 28, 2023

Why You Can’t Trust GenAI Detection Tools – Episode 8 – FridAI

TL;DR

Mike King explains why generative AI content detection tools can’t be trusted at face value, and why looking at score trends across an author’s work is a better approach than judging any single result.


Greetings and salutations, folks.

Welcome back to another edition of FridAI. I’m your host, Mike King, CMO of AIPRM, and what I want to talk to you about today is why you can’t trust generative AI content detection tools. This is a subject I’m very, very passionate about, because these are very unreliable tools, and people are taking their output at face value. It’s such a dangerous thing.

Generative AI Detection Tools are Putting Academic Careers at Risk

Here’s an example, and I’m seeing a lot more of this happening on social media. This example comes from Dr. Timnit Gebru. Someone emailed her saying, “Hey, I’m this 4.0 GPA student. I take a lot of pride in writing my papers. My professor ran it through a tool called Turnitin, and it said that it was made by generative AI.”

So they’re about to get a zero on the paper. No one at the school will listen. Their academic career is at risk from a tool that is not reliable. People often take these outputs from computers at face value because we’re just so used to computers being right or accurate. This is a dangerous assumption to make when people just don’t understand how these tools work or why they don’t work.

OpenAI’s Tool was Only Accurate 26% of the Time

For example, OpenAI had their own classifier that tried to determine whether or not content was generated by a generative AI tool, and it was only 26% accurate.

Let’s run that back.

The people who built the most popular, and arguably the most powerful, large language model have difficulty identifying whether something shown to them was generated using their own tool. So why do we think that third-party tools, which have put less academic effort and rigor into what they’re doing, are better at it? In fact, OpenAI went ahead and sunsetted that tool because they realized the output was so bad.

A few months ago, our founder, Christoph, tested 13 different generative AI content detectors, and he was not impressed with the results.

False Negatives & False Positives

This is largely because these tools give you a lot of false negatives and false positives.

Here are a couple of famous examples on this slide, where both a passage from the Constitution and a passage from the Bible were misidentified as generated by a generative AI tool. If something like that can happen, how can you trust these tools to judge how real people are actually using and creating content?
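
To make those two failure modes concrete, here’s a minimal sketch in Python with made-up labels and detector outputs. It shows how a detector can report a decent-looking overall accuracy while still flagging a meaningful share of human writing as AI-generated, which is exactly the situation that puts students and authors at risk.

```python
# Hypothetical data: 1 = "AI-generated", 0 = "human-written"
truth      = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # mostly human authors
prediction = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]   # detector output

true_pos  = sum(1 for t, p in zip(truth, prediction) if t == 1 and p == 1)
false_pos = sum(1 for t, p in zip(truth, prediction) if t == 0 and p == 1)
true_neg  = sum(1 for t, p in zip(truth, prediction) if t == 0 and p == 0)
false_neg = sum(1 for t, p in zip(truth, prediction) if t == 1 and p == 0)

accuracy            = (true_pos + true_neg) / len(truth)
false_positive_rate = false_pos / (false_pos + true_neg)   # humans wrongly flagged
false_negative_rate = false_neg / (false_neg + true_pos)   # AI text that slips through

print(f"accuracy: {accuracy:.0%}")                         # 70%
print(f"false positive rate: {false_positive_rate:.0%}")   # 25% of human writing flagged
print(f"false negative rate: {false_negative_rate:.0%}")   # 50% of AI text missed
```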

Some Tools Claim to Do Better

Some tools claim to do it better. This comes from a study that Originality AI put out where they compare themselves against other tools and claim to be very close to one hundred percent accurate, generating almost no false negatives or false positives. Despite this study, when I’ve played with these tools in real life, I’ve not seen them be consistently accurate, nowhere near the ninety-plus percent they’re claiming.

Is Watermarking the Solution?

A lot of people are beginning to talk about watermarking.

The White House got together with some of the big generative AI companies and got them all to commit to the idea of watermarking what they generate. This will probably work quite well for imagery, because there’s a lot of metadata in images, and you can hide little artifacts in the image itself that are perceptible or imperceptible to the end user. But for text, I don’t think that’s realistic. Either way, what will pop up in the wake of any watermarks is watermark removal tools.

When we talk about watermarks for text, people are talking about using different Unicode characters, distributed in particular patterns within the text. The user doesn’t notice, because the Unicode character for a given letter can look identical to the one you get when you type it directly on your keyboard.

Nevertheless, this will cause a bunch of tools to pop up that do the opposite of what the watermarking process does and just remove it. So I don’t think that’s necessarily the answer either.
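
As a toy illustration of that point, and not any vendor’s actual watermarking scheme, here’s a short Python sketch that hides zero-width Unicode characters in a piece of text as a “mark” and then strips them back out, which is roughly all a text watermark removal tool would have to do.

```python
ZERO_WIDTH_SPACE = "\u200b"   # invisible to the reader, visible to a detector

def add_watermark(text: str) -> str:
    # Insert an invisible character after every space as a crude mark.
    return text.replace(" ", " " + ZERO_WIDTH_SPACE)

def remove_watermark(text: str) -> str:
    # A "watermark removal tool" can be this simple: delete the hidden characters.
    return text.replace(ZERO_WIDTH_SPACE, "")

original = "This sentence was generated by a model."
marked = add_watermark(original)

print(marked == original)                    # False: the mark is embedded
print(marked)                                # looks identical on screen
print(remove_watermark(marked) == original)  # True: the mark is gone
```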

Precision vs Accuracy

What we need to be aware of is that these tools are not accurate, but we should expect them to be precise. This is the same as any other tool based on some sort of statistical model: there’s no guarantee it will be much better than, say, 70% accuracy. So if you’re going to use these tools, you have to understand that if you run the same person’s writing through them, you should get a precise, consistent answer. Meaning that if something changes about how that person writes, you should see that change reflected in the scores.

But if you put a single piece of content into one of these tools and get a score, that’s not enough information to tell you whether it was actually generated using a generative AI tool.
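
Here’s a small, hypothetical illustration of that precision-versus-accuracy distinction: a detector that gives the same human author consistently similar scores is precise, even when those scores are flat-out wrong.

```python
from statistics import mean, stdev

# Hypothetical "percent likely AI-generated" scores for four articles
# the author wrote entirely by hand.
human_written_scores = [62, 64, 61, 63]

print(f"mean score: {mean(human_written_scores):.0f}%")     # ~63%: inaccurate, these are human-written
print(f"spread: {stdev(human_written_scores):.1f} points")  # ~1.3: precise, the output is consistent
# What makes a tool usable for trend analysis is this consistency over time,
# not the absolute number it reports for any single article.
```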

Determine the Trend

What you should be doing is looking at the trend. If you’re going to use tools like this, you want to identify the trend across your authors: track it from before ChatGPT existed, and then look at what it looks like after.

In this fictitious example, we’re looking at a variety of blog posts by the same author and running them through several of the available detection tools, in this case five example tools. We get the scores from each tool for every article and track them as a time series. That way we can see what the average percentage across all the tools is, and whether it goes up or down.

So, if we see a step change after ChatGPT rolled out, the indication is that the author made a shift in how they write, and it likely involved generative AI. But if the trend looks very similar to what it was before ChatGPT was released, you can be fairly sure the tool is just not accurate for that author, and the person probably did not use a generative AI tool to write the content.
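
Here’s a minimal sketch of that trend approach in Python, with made-up scores and made-up tool names: average each author’s detector scores across several tools, then compare the average before and after ChatGPT’s public release to look for a step change.

```python
from datetime import date
from statistics import mean

CHATGPT_RELEASE = date(2022, 11, 30)

# (author, publish date, {detector: "percent likely AI-generated"})
articles = [
    ("alice", date(2022, 6, 1),  {"tool_a": 18, "tool_b": 25, "tool_c": 30}),
    ("alice", date(2022, 9, 15), {"tool_a": 22, "tool_b": 28, "tool_c": 26}),
    ("alice", date(2023, 2, 10), {"tool_a": 71, "tool_b": 64, "tool_c": 80}),
    ("alice", date(2023, 5, 3),  {"tool_a": 68, "tool_b": 75, "tool_c": 72}),
]

def average_score(author: str, after_release: bool) -> float:
    # Average the per-article detector scores for one author,
    # split on ChatGPT's release date.
    per_article = [
        mean(tools.values())
        for name, published, tools in articles
        if name == author and (published > CHATGPT_RELEASE) == after_release
    ]
    return mean(per_article)

before = average_score("alice", after_release=False)
after = average_score("alice", after_release=True)
print(f"average 'AI-generated' score before ChatGPT: {before:.0f}%, after: {after:.0f}%")
# A step change like this (roughly 25% -> 72%) suggests the author changed how they
# write, possibly with generative AI; a flat trend suggests the detectors are just noisy.
```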

Ultimately, I don’t necessarily think it matters whether content was AI-assisted, but the sentiment that these tools are accurate, and the conclusions we draw about the person behind the content as a result, is the wrong one to have. We definitely need to be a lot more thoughtful about the output of these tools. We cannot take them at face value.

As always, if there’s anything you want me to cover about generative AI, please feel free to drop it in the comments so I can think about what I’m talking about next week. And also, as always, if you aren’t using AIPRM, I definitely recommend that you do.

It’s free.

Go ahead and download it.

We’ve got all these great prompts, so you just have to remember: There’s a prompt for that. But there are no prompts for us to determine whether or not your content was generated with generative AI because it doesn’t make sense.

See you next week.

Join in the conversation over at our Community Forum on AIPRM.

Thank you for reading.

Written by

An artist and a technologist all rolled into one, Michael King is the CEO and founder of the digital marketing agency iPullRank, focused on technical SEO, content strategy, and machine learning. King consults with enterprise and mid-market companies all over the world, including brands like SAP, American Express, Nordstrom, SanDisk, General Mills, and FTD. King's background in Computer Science and as an independent hip-hop musician sets him up for deep technical and creative solutions for modern marketing problems. Check out his book "The Science of SEO" from Wiley Publishing.
