What is AI writing detection?
AI writing detection uses artificial intelligence and machine learning technologies to analyze text and identify content produced by generative AI, a phenomenon that has taken the education community by storm.
The sudden introduction of Large Language Models (LLMs) in late 2022 has transformed the way content is created, bringing both potential benefits and risks for academia. Many now embrace how appropriate use of generative AI, such as ChatGPT, can bolster creativity, whilst others are concerned about its potential threat to academic integrity.
In a recent study by Casal and Kessler (2023) on how accurately linguists could distinguish between AI and human writing, 72 linguistics experts correctly identified AI-generated content only 38.9% of the time.
As the human eye struggles to distinguish AI-written content from that of a human, AI writing detection tools are stepping in to help educators sort through reams of student papers, separating original writing from text that a generative AI tool may have produced.
However, the recent emergence of more sophisticated writing tools has encouraged further development in AI writing detection, and many institutions are looking to understand how organizations like Turnitin are helping to distinguish AI-generated writing from ethical, original writing, and how they will continue to do so in the future.
Why is AI writing detection important?
First, why does AI writing matter? Many industries, academia in particular, are facing up to the reality that generative AI is here to stay.
Following the release of our AI writing detection capabilities, Turnitin’s own 2023 study, conducted by Tyton Partners in collaboration with Anthology, Macmillan Learning, Lumina Foundation, Every Learner Everywhere, and the Bill and Melinda Gates Foundation, found that three times as many students as faculty reported being regular users of generative AI writing tools like ChatGPT.
Many institutions have tried to prohibit the use of tools like ChatGPT across their networks. Kevin Roose comments, “Sure, a school can block the ChatGPT website on school networks and school-owned devices. But students have phones, laptops and any number of other ways of accessing it outside of class.” The ubiquity of this emerging technology makes banning it potentially obstructive to student skills development. In the same 2023 study by Tyton Partners, 46% of students reported that they would use generative AI writing tools even if prohibited by their instructor or institution.
Whilst educators have varying views on the appropriateness of using AI in assignments, many are now embracing AI writing as a teaching aid. Its appropriate use has even been described as helping students overcome creative blocks and providing starter ideas that they can build upon in their own unique voice. But despite these positive impacts, educators want to mitigate the potential for misuse of generative AI, harnessing AI writing detection as a backstop that helps institutions place confidence in their students’ integrity.
If they are uninformed about the capabilities of AI, both as an obstacle and an opportunity, educators may miss the window to adjust their practices and stay ahead of the AI curve. Revising assessment formats and processes, reevaluating mechanisms for proof of learning, and adopting an AI writing detection tool are all requisites for limiting the likelihood of AI being the sole author of their students’ papers. Jennifer Rose, Lecturer at Manchester University, comments, “Humans need to stay ahead of AI skills for their work to be worthwhile, so lecturers need to embrace the technology as a timesaving device that opens new opportunities. Lecturers are uniquely placed to help students learn to use AI effectively – both during their studies and in their future employment.”
Having collected reflective feedback from students who’d used AI for learning, Marc Watkins notes, “Students shared that using these [AI] apps in scaffolded assignments can enhance their creative process, a promising outcome, when both students and faculty approach the technology with caution.” Caution is key to the conversation around AI writing, and spotlights the question: What should educators do when AI goes too far? Where should we draw the line between using generative AI for learning and using it for less-than-ethical purposes?
“It’s undeniable that generative AI has the potential to enhance creativity and increase productivity. Yet, losing the ability to distinguish between natural and synthetic content could also empower nefarious actors … the dangers behind machines writing text, drawing pictures or making videos are manifold” (Beyer). To illustrate the scale of the challenge for academic writing: in just the seven weeks after the launch of Turnitin’s AI writing detection tool, Turnitin assessed 38.5 million submissions for AI writing. Of these, 9.6% of papers contained over 20% AI writing and 3.5% contained between 80% and 100% AI writing.
Bringing the AI writing detection tool into the classroom as both a deterrent and investigative tool creates reassurance for educators as they grapple with this new and confusing dimension of the education landscape. Rather than constantly questioning the possibility that students have taken to AI to write their papers, faculty members can use AI writing detection as a data point to understand whether AI use is a problem or is appropriate, according to their institution’s guidelines and academic integrity policy.
How does AI writing detection work?
When a student submits a paper to Turnitin, the document is broken down into segments of text, which are overlapped with each other to capture each sentence in context. Each individual segment from the student paper is run against Turnitin’s AI writing detection tool to determine the likelihood that it was written by generative AI. The system finally calculates an average AI writing score for all segments in the document to deliver an overall prediction of how much text was written by AI.
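To make that pipeline concrete, here is a minimal sketch in Python of the segment-and-average idea described above. The window and stride sizes are illustrative assumptions, and segment_score is a hypothetical stand-in for a real classifier; this is not Turnitin’s actual implementation.

```python
from typing import Callable, List

def split_into_overlapping_segments(sentences: List[str],
                                    window: int = 5,
                                    stride: int = 3) -> List[str]:
    """Group sentences into overlapping windows so each sentence is scored in context."""
    segments = []
    start = 0
    while start < len(sentences):
        segments.append(" ".join(sentences[start:start + window]))
        if start + window >= len(sentences):
            break
        start += stride
    return segments

def document_ai_score(sentences: List[str],
                      segment_score: Callable[[str], float]) -> float:
    """Average per-segment AI-likelihood scores into one document-level prediction."""
    segments = split_into_overlapping_segments(sentences)
    scores = [segment_score(seg) for seg in segments]  # segment_score is hypothetical
    return sum(scores) / len(scores) if scores else 0.0
```

Because the windows overlap, each sentence is evaluated more than once and in the company of its neighbors, which is what lets the final average reflect sentence-level context rather than isolated lines.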
To dig deeper, have you ever wondered how your phone predicts the next word you’ll type when composing a message? The technology behind large language models, like ChatGPT, is similar. They are trained on vast amounts of text from across the internet and, in the most general terms, generate sequences of words by repeatedly picking a highly probable next word. Human writing, of course, differs in this sense: it tends to be inconsistent and idiosyncratic, so the probability of predicting the next word in a human-written sequence is lower.
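To illustrate that intuition, the sketch below uses the open GPT-2 model to compute perplexity, a standard measure of how predictable a passage is to a language model; lower perplexity means more predictable, more “machine-like” text. This is only a common proxy for the idea described above, not how Turnitin’s detector works.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Return the model's perplexity on the text: lower means more predictable."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

# Compare a formulaic sentence with a more idiosyncratic one.
print(perplexity("The quick brown fox jumps over the lazy dog."))
print(perplexity("Grandma's stew always tasted faintly of woodsmoke and stubbornness."))
```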
Turnitin’s AI writing detection system is built around a deep-learning architecture called the transformer model, which makes accurate next-word predictions by modeling language. As a result, Turnitin is able to leverage the model to identify subtle statistical patterns of AI-generated writing.
In a differentiation analysis of AI-generated and human-written scientific content, Ma et al. (2023) found that “while AI has the potential to generate scientific content that is as accurate as human-written content, there is still a gap in terms of depth and overall quality. The AI-generated scientific content is more likely to contain errors in factual issues.”
At present, Turnitin’s AI writing detection tool can detect content from the GPT-3 and GPT-3.5 language models. The writing characteristics of GPT-4 are also consistent with earlier model versions, meaning Turnitin’s AI writing detection tool can detect content from GPT-4 (ChatGPT Plus) most of the time, too. We are actively working on expanding our model to enable us to better detect content from other AI language models.
How accurate is AI writing detection?
Integrity is the heartbeat of Turnitin, and with integrity comes clarity and honesty in the way we communicate with the education community. AI writing detection tools are designed to give you a reliable guide to start conversations with a submitting student, but we want to be clear that the AI writing report is not an absolute proof or disproof of AI writing.
At Turnitin, we strive to maximize the effectiveness of our detector, but we must consider false positive rates when assessing the AI writing score attached to a student paper. False positives occur when fully human-written text is identified as being AI-generated. While the risk of false positives from Turnitin’s AI writing detection tool is low (less than 1% for a document with over 20% AI-generated content), their presence highlights the importance of using AI writing detection as a signaling tool, treating the AI writing score as one piece of an investigative puzzle. The AI writing score alone is not always sufficient when investigating academic misconduct; it exists to facilitate formative conversations with students, and has much more impact when assessed in combination with other supporting factors.
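As a purely illustrative sketch of how that guidance might be encoded, the hypothetical helper below suggests further review only when the score reaches the 20% range referenced above; the threshold and wording are assumptions for illustration, not Turnitin policy or an official API.

```python
def triage_ai_score(ai_score_percent: float) -> str:
    """Map a document-level AI writing score (0-100) to a suggested next step (illustrative only)."""
    if not 0 <= ai_score_percent <= 100:
        raise ValueError("score must be a percentage between 0 and 100")
    if ai_score_percent < 20:
        # Scores below 20% carry a higher false-positive risk; do not act on the score alone.
        return "Low signal: treat with caution and rely on other evidence."
    return "Notable signal: review alongside drafts, notes, and a conversation with the student."
```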
As large language models grow, we are focused on adapting and optimizing our AI writing detection tool based on our learnings from real-world document submissions.
How can institutions introduce AI writing detection to students and faculty members?
The introduction of a new AI writing detection tool is likely to be daunting for both students and faculty members across an institution. Students with positive intent may be wondering why their integrity is under scrutiny. Faculty members may be hesitant to take on new technologies due to time constraints, or even fail to fully understand the changing landscape around generative AI writing and why being on high alert for this type of misconduct is necessary.
Remaining transparent with members of your institution about how an AI writing detection tool will be adopted is key to preventing confusion and panic.
Communicate the impact of generative AI to both educators and students
As generative AI tools become even more accessible and sophisticated, there are both benefits and drawbacks associated with their presence in the classroom. Being aware of the impact that generative AI stands to have on reforming education in the coming years will give your institution a head start in modernizing its pedagogical approaches, including proof-of-learning methods. Without this insight, educators run the risk of falling behind in their teaching practices and failing to safeguard their assignments against misuse. Similarly, students who lack an understanding of the impact of this powerful tool expose themselves to unintentional academic misconduct.
Learn how to use and interpret the AI writing detection tool
Educators and investigators should understand the capabilities and limitations of the AI writing detection tool before adopting it as part of their assessment process. Because false positives can occur, we do not claim that our AI writing detection tool is foolproof. When initially approaching the possibility of a false positive in a student paper, we recommend offering the benefit of the doubt to your students. Only when further, more definitive evidence has been gathered should an investigator move forward with next steps according to the institution’s academic integrity policy.
Be transparent about how and when the AI writing detection tool will be used
Whilst the AI writing score is visible only to Turnitin instructors and administrators, this does not discount the importance of openly discussing the usage of AI writing detection with students, whether this be verbally, via email, or by updating your institution’s academic integrity policy. Educators must be open about when and how the AI writing score will be closely monitored. Will it be used during formative assessment, or summative only? Will students be able to see the tool in action ahead of time to understand its capabilities? How will the institution manage potential AI writing misuse cases?
Susan D’Agostino quotes Nestor Pereira, Vice Provost of Academic and Learning Technologies at Miami Dade College (USA), as describing AI writing detection tools as “a springboard for conversations with students.” Pereira goes on to say that students who are inclined to use generative AI to replace their writing may think twice about it if an AI writing detection tool is in place within the institution.
How can institutions use AI writing detection as an investigative tool?
Having an investigative process in place is indispensable should you encounter a paper potentially written by a generative AI tool, or should a potential false positive arise when using an AI writing detection tool. While the risk of false positives is low, being prepared to have a direct conversation with a student can make the investigation as pain-free as possible for student and educator alike.
We must remember that although AI writing detection tools are an investigative aid, “...as we all glide into an artificially drafted future, it's clear that a human questioning mindset will be needed. Indeed, our investigative skills and critical thinking techniques could be in more demand than ever before” (O'Brien, 2023).
Download the AI writing report
As a first step toward preparing for a conversation with your student, we recommend downloading your student’s AI writing report and analyzing it in full alongside any other findings you may have. Use the AI writing report to evaluate the student’s work, remembering that the AI writing score is only an indicator. The downloaded report also notes that false positives occur more often when the AI writing score is less than 20%. Our goal is to provide the insights needed to make good decisions, but the AI writing report is only one piece of the puzzle. Educators know their students and their work, and that is an equally important piece of data.
Rely on educator relationships
An educator’s relationship with their student can act as a good starting point when investigating potential AI writing in a student’s paper. Whatever the score highlighted by the AI writing detection tool, an accusation should never be made without a respectful dialogue with the student in question. Has the educator worked with the student throughout the semester? Has feedback been offered during the writing process? As part of their interactions with the student, do they recall the student having sufficient subject knowledge? Educators will have built relationships with the student; use that as the filter for evaluating what happened.
Ask for proof of critical thinking
If a student is the true author of the paper they have submitted, they are likely to have several items that can corroborate their thought process as they wrote it: for example, research notes, outlines, document version histories and metadata, or printouts of earlier drafts. The student may also have received feedback from you, another educator, a peer, or a trusted reviewer; are there witnesses to the writing process? These artifacts can be a solid indicator of human versus AI writing.
Assess previous writing samples
Use the student’s previous writing samples to compare writing style, grammar sophistication, and vocabulary complexity to the paper in question. Do they match? We understand that as time elapses, you may struggle to acquire a writing sample from a period before generative AI existed. If this is the case, can you put guardrails in place to protect future assignments from AI misuse? Ask your students to write a piece under test conditions (and make it clear to them why you are doing this). This will firstly deter your students from using generative AI to write their papers, whilst offering the reassurance that you have original writing to compare against should the issue of potential AI writing arise in future.
Assume positive intent
If you are unable to reach a definitive evaluation of AI versus human writing, this may be the right time to start assuming positive intent. If there is any question of uncertainty during your investigation—from AI writing detection to having all of the right conversations and asking the right questions—we recommend moving forward without accusation or penalty. The good news is that this experience alone should act as a powerful deterrent for any potential future misuse.
Conclusion: What academic leaders need to know about AI writing detection
Generative AI is developing at speed, but so too is AI writing detection. At Turnitin, we recognize the needs of our education community and are already hard at work building detection systems for future Large Language Models.
But AI writing detection does not stop at technology—it is simply one part of a whole. Evidence-based practice is an integral part of investigative and decision-making processes in all walks of life, and we mustn't make exceptions when dealing with the sensitive area of academic misconduct.
An AI writing detection tool ultimately serves as an indication that AI may have been used to write a paper, but it cannot provide a definitive conclusion on its own. We encourage human interpretation to take precedence when looking for AI writing in a student’s paper, considering the possibility of false positives, intentionality, and, above all, what we know about our students and their skills.