Unexpected Bedfellows: Using ChatGPT to Uphold Academic Assessment Integrity

Authors:: Arno Louw
Published:: Wednesday, September 20, 2023
Columns:: Emerging Technologies and Trends

The release of ChatGPT has many in higher education worried about the threat it poses to academic integrity, but what if it could be used to help?

Photo of a wolf and a goat sleeping next to each other. — *Credit: Mariya Chechulina / Shutterstock.com © 2023*

The sudden uproar around ChatGPT and its text-generating artificial intelligence has sparked debate at South African universities and around the world. Presumptions abound that students might misuse AI to compose written work and that faculty will be unable to verify the authenticity of documents submitted for academic credit. Used as a co-assessor, however, ChatGPT can actually heighten the ability to ensure the integrity of academic assessment.

ChatGPT is a large language model created by OpenAI that uses deep learning algorithms to generate human-like text in response to prompts or questions. Although advanced AI language generators are useful for academic and professional writing, concerns arise about academic integrity if students employ the tools carelessly or mischievously. The concern is that ChatGPT could upend traditional assessment methods because it can generate high-quality written work that is difficult or impossible to identify as having been written by a machine, making it challenging to ensure the integrity of academic work.^Footnote1 As the availability and sophistication of predictive text and language generators increase, leaders of educational institutions must remain vigilant and adopt strategies to prevent academic misconduct. How, then, can ChatGPT be used to improve academic assessment?

The Growing Power and Reach of Generative AI

Predictive text technology is widely available on devices and applications, where it uses machine learning to suggest commonly used words and phrases and so speed up text input. OpenAI, initially a nonprofit focused on reinforcement learning research and meta-learning robots, released GPT-2 in 2019 before unveiling GPT-3 in 2020 and GPT-4 in March 2023. These language models demonstrate unprecedented natural language processing capabilities. GPT-3 has about 175 billion parameters and can generate text in various formats and perform complex tasks.

Despite increasing sophistication, however, predictive text software remains error-prone. Neural networks, modeled after the structure of the human brain, are a type of machine-learning model.^Footnote2 The transformer architecture used by GPT-3 avoids the limitations of earlier recurrent neural network architectures, which processed text sequentially, by using attention mechanisms to model relationships across an entire sequence. ChatGPT employs deep learning algorithms to produce text that closely resembles human writing. AI language generators such as ChatGPT bring potential benefits for academic and professional writing. ChatGPT is a tool for a long-awaited evolution in writing and language communication, and it has the potential to revolutionize the way we communicate and learn.

The fear, however, is that ChatGPT could undermine conventional evaluation methods by generating high-quality writing, calling into question the authenticity and integrity of academic work. This concern is based on the assumption that academic assessments have remained the same over the past decades, focusing on writing styles and academic fashion while overlooking the importance of proven research and logic to support arguments. Disregarding ChatGPT based on these assumptions overlooks the potential benefits and applications of the technology in educational settings. One effective strategy to ensure authenticity is to investigate whether a piece of academic writing shares authorship with another piece of writing whose author is known. In such cases, ChatGPT can become a helpful tool.

Enlisting ChatGPT for Assessment

When asked to identify ten ways of establishing the authenticity of academic written text, ChatGPT offered a range of methods: plagiarism detection, style analysis, topic analysis, timeframe analysis, syntax analysis, vocabulary analysis, plagiarism comparison, grammar analysis, content analysis, and semantic analysis. ChatGPT also suggested a rubric that includes criteria such as clarity of argument, originality and plagiarism, language proficiency, cohesion and coherence, and adherence to academic standards. Each criterion can be assessed on a scale from 1 to 5. A total score can be calculated to determine the essay's overall quality.
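The rubric arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not a scoring scheme taken from ChatGPT's output: the criterion names come from the paragraph above, while the equal weighting, the `total_score` helper, and the example scores are assumptions made for the sketch.

```python
# Criteria named in the article; weighting them equally is an assumption.
CRITERIA = (
    "clarity of argument",
    "originality and plagiarism",
    "language proficiency",
    "cohesion and coherence",
    "adherence to academic standards",
)

MIN_SCORE, MAX_SCORE = 1, 5  # each criterion is rated on a 1-to-5 scale


def total_score(scores: dict[str, int]) -> int:
    """Sum the per-criterion scores after validating each one."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for criterion in CRITERIA:
        value = scores[criterion]
        if not MIN_SCORE <= value <= MAX_SCORE:
            raise ValueError(
                f"{criterion!r} must be between {MIN_SCORE} and {MAX_SCORE}, got {value}"
            )
    return sum(scores[c] for c in CRITERIA)


# Hypothetical scores for one essay, in the order of CRITERIA.
example = dict(zip(CRITERIA, (4, 5, 3, 4, 4)))
print(total_score(example), "out of", MAX_SCORE * len(CRITERIA))  # 20 out of 25
```

An assessor who values one criterion more than another could replace the plain sum with a weighted one; the validation step stays the same.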

Although ChatGPT can be a valuable tool in helping establish authenticity, it should be combined with other assessment methods to ensure academic integrity. It is up to educators, who are also the assessors, to maintain academic assessment integrity.^Footnote3 Technology has made academic assessment more efficient and effective, but human intelligence remains critical in ensuring fair and accurate assessments. As such, assessors should remain vigilant in assessing the quality and validity of academic work.

Similarity, Grammar, and Pedagogy

Similarity checkers are tools that compare submitted texts against known content in an effort to uncover instances of plagiarism. Many institutions have developed policies and protocols that include similarity checking. However, similarity reports should not be the only factor determining plagiarism because they can sometimes produce disputable evidence.^Footnote4 ChatGPT can also be effectively utilized as a similarity checker. As a language model, ChatGPT can go beyond exact text matches and recognize similarities in meaning and intent, making it a valuable tool for identifying subtle instances of potential plagiarism. Its ability to analyze and assess text at a deeper level enhances its capacity to serve as an efficient and accurate similarity checker, providing valuable insights for maintaining academic integrity in educational institutions.
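To make the contrast between exact text matching and similarity of meaning concrete, here is a minimal sketch of the kind of surface-level comparison a basic similarity checker might perform. It assumes word 3-gram overlap measured with the Jaccard index; that choice, the function names, and the sample sentences are assumptions for illustration and do not describe how any commercial checker or ChatGPT works.

```python
import re


def word_ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 3) -> float:
    """Share of n-grams the two texts have in common (0.0 to 1.0)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0  # a text shorter than n words has nothing to compare
    return len(grams_a & grams_b) / len(grams_a | grams_b)


source = "Technology has made academic assessment more efficient"
verbatim_copy = "Technology has made academic assessment more efficient"
paraphrase = "Digital tools have made grading faster and better"

print(jaccard_similarity(source, verbatim_copy))  # 1.0
print(jaccard_similarity(source, paraphrase))     # 0.0
```

The paraphrase scores zero even though it expresses much the same idea, which is the gap that a language model, or a human assessor, is meant to close.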

Grammar checkers are powerful writing assistants that check for spelling, grammar, punctuation, style, and tone. They can help identify sentence structure and syntax errors and suggest corrections to improve clarity and readability, which makes them an excellent tool for proofreading and improving the overall quality of a piece of writing. As a language model that can generate human-like responses to a wide range of queries, ChatGPT can provide insights on various topics, from basic knowledge to complex ideas. It can help with brainstorming, research, and idea generation. ChatGPT can guide an essay's content, suggest alternative approaches, and offer examples to help clarify points. ChatGPT and grammar checkers therefore serve different purposes and have different strengths, and both can be helpful when assessing an academic essay. Grammar checkers can help identify errors that might affect the clarity and coherence of the writing, while ChatGPT can provide insights and suggestions that help the writer develop and refine arguments. Using ChatGPT together with plagiarism and grammar checks adds value to the assessment process.

Using similarity and grammar checkers and ChatGPT in combination with human assessors could allow for an even more thorough investigation of authenticity. ChatGPT can efficiently scan through a vast amount of text to identify potential similarities, provide suggestions for improvements in writing style, and highlight possible areas of concern. ChatGPT's ability to analyze language at a deeper level allows it to detect similarities in meaning and intent, providing a more nuanced assessment of the text. Additionally, ChatGPT can offer valuable feedback on writing quality, grammar, and coherence, enhancing the overall assessment process.

Human assessors bring their expertise, knowledge, and understanding of the subject matter to the assessment process. They have the ability to contextualize the information provided by ChatGPT and other similarity checkers, interpret the results, and make informed judgments about the quality and originality of the academic work. Human assessors can identify instances of plagiarism that may go beyond verbatim copying or close paraphrasing, utilizing their critical thinking skills to detect more subtle forms of plagiarism that ChatGPT might miss. They also have the discretion to evaluate the relevance and appropriateness of the sources used in the work and assess the overall coherence and cohesion of the essay.

The combination of human intelligence and the capabilities of intelligent agents like ChatGPT allows for a more comprehensive and accurate assessment of academic work. While the technology tools can expedite similarity checking and offer valuable insights, human assessors can apply critical thinking skills, domain knowledge, and ethical judgment to ensure fair and transparent evaluation while upholding academic integrity. Together, they form a collaborative and practical approach to maintaining assessment integrity when using ChatGPT and similar AI technologies in the academic context.

The potential for ChatGPT to revolutionize assessment processes can be harnessed through creative pedagogies and engagement. Extracting text from video-generated transcripts and using ChatGPT to construct quizzes, for example, is a practical assessment strategy to assess students' comprehension of learning content. Another example could be a mechanism that forces students to re-read and revise their work, adding sources and in-text references and highlighting where arguments originated. Such a mechanism could be incorporated from the first year of study, ensuring that students develop an understanding of proper referencing. Adding a slide summary to written work can also reinforce understanding and logic. For continuous assessment, a battery of assignments building up to a final assignment with the exact intended instructions is a better practice. A balanced approach can address academic misconduct while leveraging ChatGPT's capabilities.

Better Writing through Assessments

ChatGPT can also identify areas where writing can be improved. While traditional similarity checkers may focus solely on detecting text similarities, ChatGPT can analyze the overall quality of the writing and provide feedback on writing quality, grammar, and coherence. Suggesting improvements in writing style, phrasing, and sentence structure assists in enhancing the clarity of the text. This feature especially benefits students looking to improve their writing skills and produce higher-quality academic work. This can be particularly useful for students who are not native speakers of the language they are writing in and for instructors who want to provide more comprehensive feedback on their students' writing.

Moreover, ChatGPT can provide suggestions for improving the clarity and coherence of a text, which can be valuable for academic writing that requires a high level of precision and clarity. It can also analyze the structure and organization of a text, identifying areas where the writer can improve the flow and structure of their ideas. ChatGPT's ability to generate human-like responses also fosters an engaging and interactive learning experience. Students can use it as a writing companion, stimulating creativity and promoting originality. Instead of merely flagging potential issues, ChatGPT can act as a co-author, encouraging students to re-read and refine their work before submission.

ChatGPT can revolutionize how academic works are assessed and evaluated by providing more nuanced and comprehensive feedback on writing quality, plagiarism, and overall authenticity. However, it is essential to note that ChatGPT should be accompanied by ethical guidelines and accountability measures to ensure its responsible and fair use.

Human Intervention During an Academic Assessment

Despite the advancements in technology and tools like ChatGPT, similarity checkers, and grammar checkers, human intelligence still plays a critical role in ensuring accurate and fair assessments of academic essays and assignments. Technology tools have limitations in comprehending the nuances and complexities of language that human beings can perceive. The human assessor's contextualization and interpretation of the information provided by these tools are crucial in making informed judgments about the quality of written work. Human intelligence is also necessary for identifying instances of plagiarism that these tools might miss.

Research Experiments Used to Test Authenticity

I used two strategies to test assessment authentication using ChatGPT. The first strategy involves asking ChatGPT to compare written work submitted by a student to another piece of writing that is known to have been written by that student. The second strategy asks ChatGPT to compare a piece of work submitted by a student to another piece of writing known to have been written by someone else. In each case, ChatGPT analyzed the writing styles of the documents being reviewed and provided feedback about the likelihood that both were written by the same person, as well as specific evidence supporting the analysis.
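Both strategies come down to giving ChatGPT two writing samples and asking for a style comparison with supporting evidence. The sketch below shows one way an assessor might assemble such a request. The prompt wording and the `build_authorship_prompt` helper are hypothetical; the exact prompts used in these experiments are not reproduced in this article, and the code only builds the text to paste into ChatGPT rather than calling any service.

```python
def build_authorship_prompt(known_text: str, submitted_text: str) -> str:
    """Return a prompt asking a chat model to compare two writing samples.

    The wording is a hypothetical illustration, not the author's exact prompt.
    """
    known, submitted = known_text.strip(), submitted_text.strip()
    if not known or not submitted:
        raise ValueError("both writing samples must be non-empty")
    return (
        "Compare the writing styles of the two texts below.\n"
        "State how likely it is that they were written by the same person, "
        "and give specific evidence for your assessment.\n\n"
        f"TEXT A (author known):\n{known}\n\n"
        f"TEXT B (submitted work):\n{submitted}\n"
    )


# Made-up sample texts; in practice these would be the student's earlier
# writing (Strategy 1) or a published source (Strategy 2), plus the submission.
prompt = build_authorship_prompt(
    "I work in instructional design at a university.",
    "ICTs have changed how assessment feedback reaches students.",
)
print(prompt)
```

For Strategy 1, `known_text` would be an earlier piece by the same student; for Strategy 2, it would be text by a different, known author.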

Strategy 1

The strategy for this test is to ask ChatGPT whether two texts were written by the same author. Students hand in a piece of written work, such as a short biography, before doing a formal assignment. If a later assignment seems suspicious, the instructor can use ChatGPT to compare the assignment text against the biography. When I ran this experiment, ChatGPT delivered the following response:

It is likely that the two paragraphs were written by the same person. Both paragraphs discuss the same organization, the University of Johannesburg (UJ), and relate to education and technology. Additionally, the first paragraph mentions the author's work at the Centre for Academic Technologies (CAT) at UJ, while the second paragraph describes the impact of information and communication technologies (ICTs) on assessment at UJ. Furthermore, the author's name, [writer], is mentioned in the first paragraph, and his role as Senior Specialist: Instructional and E-learning at CAT is described. These details suggest that the author of both paragraphs is likely to be [writer].

To add another layer to this exercise, I submitted a third piece of writing and asked ChatGPT to compare it with the other two.

ChatGPT's response:

It is possible that this was written by [writer]. The paragraph discusses how UJ has used ICTs to improve student feedback and engagement, which is in line with [writer's] area of expertise as a Teaching and Learning Consultant (Instructional Design) at the Centre for Academic Technologies (CAT) at UJ. The paragraph also uses similar language and terminology as the previous paragraphs, such as "ICTs" and "online surveys." However, without further information, it is impossible to determine with certainty whether [writer] is the author of this particular paragraph.

Strategy 2

For the second experiment, I searched online for writings on a particular topic that was the subject of a student assignment. I copied some text written by a credible author on that topic and pasted it into ChatGPT. I then pasted in text written by a student on the topic in question and asked ChatGPT to compare the writing styles of the two texts. In the same manner, assessors can compare a student's writing with the written work of other students, which can also result in the detection of suspicious similarities.

The results were as follows:

It is unlikely that these two paragraphs were written by the same author. The first paragraph discusses a specific exercise for students to complete in order to compare their writing styles, which is related to instructional design and learning, while the second paragraph discusses the benefits of ICT for students, which is related to education and technology. Additionally, the first paragraph uses technical language and terminology related to electronic submission, while the second paragraph uses more general language and does not mention any specific technologies or methods. These differences suggest that the paragraphs were written by different authors with different areas of expertise and writing styles.

As an AI language model, I cannot confirm or deny the authorship of a text without further information. However, based on the content of the paragraph and the fact that it discusses topics related to [writer's] area of expertise, it is possible that he or she could have written it. Nonetheless further investigation or confirmation is needed to determine the actual author of the text.

The results of these experimental tests illustrate the kinds of conclusions and comparisons that ChatGPT can generate to help authenticate an academic document and protect its integrity during an assessment. A final question on authenticity, then, is whether a human assessor is still needed.

Conclusion

ChatGPT can be a valuable tool for promoting originality and enhancing the overall quality of academic writing. However, using it ethically and impartially is vital to ensuring academic assessment integrity. Academic institutions must embrace ChatGPT while also taking measures to mitigate its associated risks. Proposed assessment strategies, such as requiring students to add in-text references, can enhance the accuracy of plagiarism detection and enable a more comprehensive analysis of writing quality. By using ChatGPT as a co-author, students can benefit from a neutral assessment that may enhance the integrity and authenticity of their academic written work.

However, it is important to note that human intelligence continues to play a crucial role in identifying instances of plagiarism that similarity checkers might miss. By drawing on their knowledge of and expertise in the subject matter being assessed, human assessors can detect plagiarism that involves rewording text or substituting synonyms. Additionally, although grammar checkers can identify errors and suggest corrections, the human assessor should review these suggestions and make a final judgment about the context and intended meaning of the text.

While ChatGPT and other AI technologies can aid in assessing and improving academic writing, they should be used as assistive authoring tools alongside human intelligence. Co-utilization ensures fair and transparent evaluation processes. The responsibility of upholding and promoting academic assessment integrity remains with students and academic institutions. By using ChatGPT ethically and impartially, academic institutions can enhance the quality of academic writing and ensure academic assessment integrity.

Notes

1. Wagdy Sawahel, "Embrace It or Reject It? Academics Disagree about ChatGPT," University World News: Africa Edition, February 7, 2023.
2. Ben Lutkevich and Ivy Wigmore, "What is Predictive Text?" TechTarget, October 2022; Ronald Schmelzer, "GPT-3 AI Language Model Sharpens Complex Text Generation," TechTarget, October 22, 2020; Bernard Marr, "What Is GPT-3 And Why Is It Revolutionizing Artificial Intelligence?" Bernard Marr & Co., October 2020; and Jürgen Schmidhuber, "Deep Learning in Neural Networks: An Overview," Neural Networks 61 (January 2015): 85–117.
3. Sawahel, "Embrace It or Reject It?" and Sarah Clark, "Demystifying ChatGPT," Learning Guild, February 2, 2023.
4. Jon Nathanson, "At what point does similarity become plagiarism?" Quora, September 9, 2019.

Thomas Arnoldus (Arno) Louw is Senior Instructional Designer: e-Learning Specialist at the University of Johannesburg.

ParentTopics:: Artificial Intelligence (AI), Assessment and Evaluation, Instructional Design