Artificial intelligence (AI) detectors, also referred to as AI writing detectors or AI content detectors, are software tools that help users identify text generated by AI tools such as ChatGPT. AI detectors are useful in several fields, such as publishing and education, because they can help journal editors determine whether a research article uses original language rather than AI-generated text. They can also help universities establish the originality of dissertations submitted by candidates.
But how do these AI detectors work? [1] These tools use machine learning (ML) and natural language processing (NLP) to identify linguistic patterns and sentence structures, and they compare the submitted text with existing datasets to determine whether a block of text was written by a human or generated by AI. Are these AI content detectors accurate and reliable? Not entirely. These AI detectors are considered reliable 7 out of 10 times on a sample size of 100 articles [1] and can be very useful in ensuring that the content received is mostly original.
With the rise of AI and the increasing popularity of ChatGPT and its subsequent versions, several AI detectors have been developed with varying degrees of reliability. It is essential to understand the functioning of these AI detectors and choose the correct one for your work to ensure good accuracy. In the subsequent sections, this article will delve into the characteristics, functioning, and ethical use of these AI detectors to help ensure accuracy in your content.
AI detectors[2] are software tools that can help determine whether text and images have been generated using AI. These tools are mainly used in educational settings to ensure academic integrity.
In the past few years, the use of AI has surged with the advent of generative AI, that is, AI that generates content such as text and images in response to prompts. So, how does generative AI work? While earlier systems relied on hand-coded rules to respond to inquiries, generative AI learns from publicly available sources to generate suitable responses.
The AI research company OpenAI developed the generative pre-trained transformer (GPT) and used it to build its early language models GPT-1, GPT-2, and GPT-3 (released in 2020), the last of which has 175 billion parameters. In 2022, OpenAI released its AI chatbot, ChatGPT, which revolutionized the AI industry because the responses it generated were much more conversational, creative, and human-like. ChatGPT soon came to be used for diverse purposes.
Its realistic responses generated significant interest and prompted debate about whether generative AI could pose a risk to academia. This is because, with the advent of newer and improved models after ChatGPT, generating all types of content has become relatively easy. The responses have become more sophisticated, tempting some researchers to write their research papers using this technology.
However, over-dependence on such AI models to write academic text can amount to a form of plagiarism mainly because these models generate responses by parsing text from publicly available sources, eventually affecting the originality and accuracy of the academic text. This is where AI writing detectors come into the picture to check on the excessive use of AI-generated content. AI detectors use algorithms to help detect text that has been generated by AI and are increasingly being used by universities and journals.
AI detectors use the principles of ML and NLP to process the input and distinguish between human- and AI-generated content. Four standard techniques that AI detectors use are classifiers, embeddings, perplexity, and burstiness, as described below. [1]
A classifier is an ML model that sorts or categorizes the provided data into predetermined categories. Classifiers use labelled training data; that is, they learn from examples of text that have already been categorized as human- or AI-generated. These classifiers can analyze and discover patterns in text independently and require fewer resources to distinguish human vs. AI outputs. The classifier then assigns a confidence score that indicates the likelihood of the text being AI-generated. However, false positives can occur if manually written text matches the patterns that the model associates with AI-generated text.
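To make the idea of a labelled-data classifier with a confidence score concrete, here is a minimal sketch using a naive Bayes bag-of-words model. The class name `ToyDetector`, the training sentences, and the choice of naive Bayes are all assumptions made for illustration; the detectors discussed in this article do not disclose their models, and a real detector would train on far more data.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

class ToyDetector:
    """Hypothetical naive Bayes classifier with two labels: 'ai' and 'human'."""

    def fit(self, examples):
        # examples: (text, label) pairs that were labelled in advance
        self.word_counts = {"ai": Counter(), "human": Counter()}
        self.doc_counts = Counter()
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["ai"]) | set(self.word_counts["human"])
        return self

    def _log_score(self, tokens, label):
        # log prior + sum of log likelihoods with add-one (Laplace) smoothing
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        total = sum(self.word_counts[label].values())
        for token in tokens:
            if token in self.vocab:  # words never seen in training are skipped
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def ai_confidence(self, text):
        """Return a score between 0 and 1; higher means 'more likely AI'."""
        tokens = tokenize(text)
        log_ai = self._log_score(tokens, "ai")
        log_human = self._log_score(tokens, "human")
        top = max(log_ai, log_human)  # subtract the max for numerical stability
        p_ai, p_human = math.exp(log_ai - top), math.exp(log_human - top)
        return p_ai / (p_ai + p_human)

# Made-up training examples, for illustration only
training = [
    ("in conclusion it is important to note that", "ai"),
    ("it is important to note that overall", "ai"),
    ("honestly i kinda loved that weird little book", "human"),
    ("we got lost twice but the trip was great", "human"),
]
detector = ToyDetector().fit(training)
print(detector.ai_confidence("it is important to note"))
print(detector.ai_confidence("we got lost"))
```

Note that a human who happens to write "it is important to note" would also receive a high score here, which is exactly the false-positive risk described above.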
Embeddings are numerical representations of real-world objects that ML and AI systems use to capture complex knowledge and its relationships, much as humans do. These numerical representations are called vectors, and they help capture the structure and context of words and the semantic relationships between them. Embeddings are of different types: word, contextual, transformer-based, document, image, graph, and knowledge graph.
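As a rough illustration of how vectors can express semantic relationships, the sketch below compares toy embeddings with cosine similarity. The three-dimensional vectors are hand-picked for illustration and are not taken from any real model; real embeddings typically have hundreds of dimensions and are learned from data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, chosen by hand so related words point the same way
toy_embeddings = {
    "essay": [0.90, 0.80, 0.10],
    "dissertation": [0.85, 0.90, 0.05],
    "banana": [0.10, 0.05, 0.95],
}

related = cosine_similarity(toy_embeddings["essay"], toy_embeddings["dissertation"])
unrelated = cosine_similarity(toy_embeddings["essay"], toy_embeddings["banana"])
print(related > unrelated)  # related words score higher than unrelated ones
```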
Perplexity [3,4] measures how well a language model is able to “predict the next word” in a sequence. Simply put, this metric quantifies the model’s “surprise” when encountering new words. Hence, a lower score (lower degree of surprise) means that the text is more predictable and more likely to be AI-generated. As human writing tends to have higher perplexity (for example, because of less predictable word choices), AI writing detectors are likely to classify more predictable text as AI content. However, this feature may also generate false positives when human writing is well-structured and displays characteristics similar to AI-generated text.
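Perplexity is conventionally computed as the exponential of the average negative log-probability that a language model assigns to each token. The sketch below applies that formula to per-token probabilities; the probability lists are made-up inputs standing in for the output of a real language model, which this example does not include.

```python
import math

def perplexity(token_probabilities):
    """exp of the mean negative log-probability over all tokens."""
    if not token_probabilities:
        raise ValueError("need at least one token probability")
    total_log = sum(math.log(p) for p in token_probabilities)
    return math.exp(-total_log / len(token_probabilities))

# Hypothetical probabilities a model might assign to each next word
predictable_text = [0.9, 0.8, 0.85, 0.9]   # the model is rarely surprised
surprising_text = [0.2, 0.05, 0.3, 0.1]    # the model is often surprised

print(perplexity(predictable_text))
print(perplexity(surprising_text))
```

The first value is lower than the second, so a perplexity-based detector would treat the first text as more likely to be AI-generated.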
Burstiness [4] is similar to perplexity, except that it focuses on sentences instead of words, measuring variations in sentence structure, length, and complexity. High burstiness indicates that the text has more varied sentence structures and word usage patterns, which is more characteristic of human-generated content. AI generators may produce more monotonous text and use some words repetitively based on their training data. So, AI-generated text may have lower burstiness.
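There is no single agreed formula for burstiness, so the following sketch uses one simple proxy: the coefficient of variation (standard deviation divided by mean) of sentence lengths in words. Both the proxy and the sample sentences are assumptions chosen for illustration, and the sentence splitter is deliberately naive.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on ., ! and ? and return the word count of each sentence."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text)]
    return [len(s.split()) for s in sentences if s]

def burstiness(text):
    """Coefficient of variation of sentence lengths; 0.0 means no variation."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew up."
varied = "Stop. The storm that rolled in last night knocked out power for hours. We waited."

print(burstiness(uniform))
print(burstiness(varied))
```

Under this proxy, the uniform passage scores 0.0 and the varied passage scores higher, matching the idea that more varied sentence structure is more characteristic of human writing.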
Although AI detectors are increasingly being used by educational institutions and academic journals to detect AI content in dissertations, essays, articles submitted for publication, etc., their results are not 100% accurate. [5] AI detectors may misidentify human-written content as AI-generated (a false positive) or fail to detect AI content that is actually present (a false negative). A study [6] by Stanford researchers has also shown that articles written by non-native English speakers are more likely to be flagged as AI-generated because the writing may not conform to the stylistic patterns of the detector’s training data.
In addition, many AI detectors that use open-source models have a much higher rate of false positives. People have also found ways to fool these AI content detectors, such as by adding whitespace to the text, introducing misspellings, removing grammatical articles, and using homoglyphs, which are different characters that look the same as ordinary letters or numbers.
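The two error types can be quantified as rates. The sketch below computes the false positive rate (share of human-written texts wrongly flagged) and the false negative rate (share of AI-generated texts missed) from a list of true labels and detector verdicts. The function name and the eight sample results are hypothetical and do not come from any real evaluation.

```python
def error_rates(is_ai, flagged_as_ai):
    """Return (false_positive_rate, false_negative_rate) for paired results."""
    pairs = list(zip(is_ai, flagged_as_ai))
    false_positives = sum(1 for truth, flag in pairs if not truth and flag)
    false_negatives = sum(1 for truth, flag in pairs if truth and not flag)
    human_total = sum(1 for truth, _ in pairs if not truth)
    ai_total = sum(1 for truth, _ in pairs if truth)
    fpr = false_positives / human_total if human_total else 0.0
    fnr = false_negatives / ai_total if ai_total else 0.0
    return fpr, fnr

# Made-up results: the first four texts are human-written, the last four AI-generated
truth = [False, False, False, False, True, True, True, True]
verdict = [True, False, False, False, True, True, False, False]

fpr, fnr = error_rates(truth, verdict)
print(fpr, fnr)  # 0.25 0.5
```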
Considering this, AI detectors should be used with caution and should not be relied on completely to detect AI-generated content.
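The homoglyph trick mentioned above can be illustrated with Python's standard `unicodedata` module. This sketch flags letters whose Unicode name does not begin with "LATIN"; the function name is hypothetical, and the check is a simplification that would also flag legitimate text in non-Latin scripts, so it is only meaningful for text expected to be in English.

```python
import unicodedata

def find_non_latin_letters(text):
    """Return (index, character, Unicode name) for each non-Latin letter."""
    suspects = []
    for index, char in enumerate(text):
        if not char.isalpha():
            continue  # ignore digits, spaces, and punctuation
        name = unicodedata.name(char, "")
        if not name.startswith("LATIN"):
            suspects.append((index, char, name))
    return suspects

# U+0430 is CYRILLIC SMALL LETTER A, which looks like the Latin letter "a"
disguised = "p\u0430per"
print(find_non_latin_letters(disguised))
print(find_non_latin_letters("paper"))
```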
Although you can use AI detectors to identify AI-generated text, with practice and a few tips, you can also try to identify such text manually. [7] AI-generated text has a few patterns and characteristics that can be easily identified once you train your eye to detect them. These patterns exist because AI models predict the next word in a sequence based on the data they have been trained on and, therefore, lack variety in their sentences. There are usually no surprises in AI-generated text because it tends to be repetitive.
Here are a few common patterns that you could use to detect AI-generated text manually:
AI detectors and plagiarism checkers, although similar in some aspects, have several key differences, as shown in the following table. [8]
| | AI Detectors | Plagiarism Checkers |
| Purpose | Identify the origin of the content, whether generated by AI writing tools or humans | Identify unoriginal text, that is, instances where text has been copied without proper attribution |
| Method | | |
| Challenges | | |
With the increasing advances in AI technology for generating images and videos, including models such as DALL-E, Ideogram, Midjourney, etc., AI detectors for such content also need to become more sophisticated to help curb the spread of incorrect information.
With easy access to AI-based image- and video-generating tools, the number of such images and videos has increased. Unfortunately, the technology is being misused to spread misinformation about people and events. Detecting such images and videos has therefore become increasingly necessary.
Sometimes, detection is easy even with an untrained eye because when we look at an image, we can instantly tell that something is wrong. There may be inconsistencies in the color, the number of objects, etc. However, for a more thorough detection, AI detectors can be of good use. Some of the common AI image and video detectors include AI or Not, Hugging Face, V7 Deepfake Detector, Illuminarty, etc.
Q1. Are AI detectors 100% accurate and reliable?
A1. No, AI detectors are not 100% accurate and reliable because they depend on their training data. They may generate false positives if human-written text falls within the patterns on which they were trained, or false negatives if AI-generated text is particularly creative and similar to human-written content. False negatives are becoming more common because of the increasingly sophisticated AI writing tools that are currently being developed. Therefore, AI detectors are not foolproof and should not be your only resource to detect AI-generated content.
Q2. What should I do if my manually written draft gets flagged as AI-generated?
A2. Having your content flagged as AI-generated despite having written it manually can be demotivating. However, as mentioned earlier, AI detectors may often generate false positives. This happens if the text you’ve written very closely matches the training data on which the detector is trained, including the specific patterns it looks for.
Here’s a list of ways by which you could make your text appear more “human.” [9] Read through your content again and see if you can make any of these changes to avoid getting flagged as AI-generated.
Here’s an example of a human-written text that was flagged as AI-generated.
Use varied sentence starters and combine similar sentences to avoid repetition of the idea.
Example:
Q3. Can AI detectors identify plagiarism and vice versa?
A3. AI writing tools such as ChatGPT and its subsequent versions generate content using different combinations of words from the data on which they are trained. Since these tools do not reuse specific passages but rather follow learned patterns and structures to generate new sentences, traditional plagiarism checkers may not be able to flag AI-generated content. [10] Plagiarism checkers identify copied content using text-matching algorithms that compare the submitted text against existing data. Consequently, plagiarism checkers may not be able to identify AI-generated text, and AI detectors may not be able to accurately identify all plagiarized text.
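To show why text matching catches copied passages but not newly generated ones, here is a minimal sketch based on overlapping word n-grams. The function names, the choice of three-word n-grams, and the sample sentences are assumptions for illustration; commercial plagiarism checkers do not publish their exact algorithms.

```python
def word_ngrams(text, n=3):
    """Return the set of consecutive n-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(submitted, source, n=3):
    """Share of the submitted text's n-grams that also appear in the source."""
    submitted_ngrams = word_ngrams(submitted, n)
    if not submitted_ngrams:
        return 0.0  # text shorter than n words has nothing to match
    shared = submitted_ngrams & word_ngrams(source, n)
    return len(shared) / len(submitted_ngrams)

source_text = "the quick brown fox jumps over the lazy dog"
copied = "quick brown fox jumps over"
reworded = "a fast brown fox leaps over a sleepy dog"

print(overlap_ratio(copied, source_text))    # 1.0
print(overlap_ratio(reworded, source_text))  # 0.0
```

The reworded sentence shares no three-word sequence with the source, so this kind of matching reports no overlap even though the meaning is the same, which mirrors how newly generated AI text can pass a plagiarism check.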
Q4. Are AI detectors ethical to use in academic writing?
A4. An increasing number of journals and educational institutions are now using AI detectors in assessing the originality of the papers submitted as coursework or for publication. It is acceptable and ethical to use AI detectors, but users should understand that these detectors should not be the only source for testing originality as they are not completely reliable or accurate. They may generate false positives and false negatives, as discussed earlier. These tools should be used with caution and their responses should be cross-checked to ensure that there is no bias.
Thus, while AI detectors can help detect whether your content is human- or AI-generated and are being increasingly used by academics, they should be used with caution to ensure that the instances getting flagged as AI are indeed generated by AI and that original, human-written content doesn’t get flagged.
We hope this article has helped you understand the use and functionality of AI detectors and how they would be more useful as a conversation starter to help improve students’ and researchers’ manuscripts rather than being the final decision in their research journey.
References