Passage one:
I’ll tell you how the Sun rose —
A Ribbon at a time —
Passage two:
Harry stepped into the clearing — heart thudding — the wand trembling slightly in his hand. The moon hung low — casting silvery shadows that danced at the edge of his vision.
A world divided by a punctuation mark — versatile enough to act at times as a comma, practical to the point of adding clarity to some of our writings and chubby enough to mirror the girth of an “m” — is better than one divided by hatred. The em dash — a tool of intense emotion for Emily Dickinson — has caught the attention of those familiar with the text output of ChatGPT.
The popular chatbot appears to be biased towards the em dash, if online chatter is anything to go by. The more often it appears, the higher the chance that the passage was generated by ChatGPT, but that’s just a thought among some podcasters and content creators on social media. The first extract above is by none other than Dickinson, while the other is a ChatGPT creation.
LuxeGen, a fashion podcast, for example, called the em dash a “ChatGPT hyphen”, and so have many in a ChatGPT subreddit (“Is an em dash (—) proof of AI manipulation?”).
The appearance of the punctuation mark could be related to the datasets that have been used for training. The models may have over-indexed on em dashes and certain stylistic choices, making them appear disproportionately often.
But that shouldn’t set off alarm bells or force you to use fewer em dashes.
“I just picked up the novel Martyr! by Kaveh Akbar, which was published in 2024. It has five em dashes in the first two pages. It has five more in the first three pages of Chapter 1. So I think it’s safe to say that em dashes do still exist in the wild, deployed fluently and effectively by our best writers, the way they always have been. It’s absurd to think that Akbar would erase those dashes from his manuscript now, or that he’ll worry people might think AI wrote his sentences. AI also tends to use ‘atmosphere’ a lot. Does that mean we should avoid that word too?” J.T. Bushnell, a senior instructor at Oregon State University’s School of Writing, Literature and Film, told The Telegraph.
To deliver a full em dash, said Bushnell, takes more work on the keyboard, “so you really have to know what you’re doing to get there, which means they appear very rarely in student work”.
He pointed out that in most situations, his students can use other punctuation as a substitute, “so it’s rare to see them even attempt a dash” and when they do, it’s usually “a hyphen posing incorrectly as a dash”.
“If I suddenly saw a bunch of students using em dashes properly, with nuance and intention, it might raise questions for me about whether there was AI assistance, but the dashes alone don’t provide an answer to the question.”
It’s not just the em dash that’s suddenly furrowing brows; certain words are being given similar treatment.
Last year, four researchers from the University of Tübingen and Northwestern University took a look at “excess word usage” after LLM writing tools began making news. The researchers, Dmitry Kobak, Rita González-Márquez, Emoke-Ágnes Horvát and Jan Lause, studied vocabulary changes in 14 million PubMed abstracts from 2010 to 2024 and found that “the appearance of LLMs led to an abrupt increase in the frequency of certain style words”. One example is an uptick in the usage of the word “delve”.
“I have seen a hypothesis that the word ‘delve’ is popular in some developing English-speaking countries (example: Nigeria) where OpenAI hired many remote workers to provide instructions for training ChatGPT. Chatbot LLMs like ChatGPT are fine-tuned using human feedback, where human raters provide ratings for several candidate answers. Maybe the raters from Nigeria liked the word ‘delve’ because it’s popular in Nigeria. But maybe not! It’s only a hypothesis. Maybe the raters simply liked the word ‘delve’ because it sounds fancy,” Dmitry Kobak, researcher at Tübingen University, told The Telegraph.
Professors are yet to find an effective way to spot a chatbot-prepared abstract or article. Dr Brinda Bose, professor of English Studies at Jawaharlal Nehru University, told this newspaper: “The students’ written work is evaluated along with their classroom participation and project presentations, which gives us an intuitive sense of how much they have grasped and how they are thinking things through, that then should be reflected in their long papers at a similar level. If there is a stark discrepancy in standards between the two, then it may be something to investigate. The other clue is not so much the ‘grammatically perfect’ essay but the ‘carefully careless’ one with a few odd/obvious punctuation and spelling mistakes thrown into an impeccable, if bland, piece of writing.”
Bushnell, who is also the author of the book The Step Back, likewise says there is “no single giveaway”.
“Usually, it has more to do with departures from the normal cognitive patterns you’re used to seeing. When you ask a student to reflect on a short story they’ve read, usually they say whether or not they liked it, and then they say why, and the tone of their writing matches their level of enthusiasm. When those things are disjointed, or overly formal, or replaced by pithy summaries of theme, you sense a cognitive departure behind the writing that raises your suspicions. But even then, it’s not appropriate to call them anything except suspicions,” he said.