[LINK] A weird phrase is plaguing scientific papers – and we traced it back to a glitch in AI training data

Roger Clarke Roger.Clarke at xamax.com.au
Sun Apr 20 18:21:13 AEST 2025


On 16/4/2025 17:18, Antony Barry wrote:
 > A recent article from The Conversation discusses the emergence of a
 > peculiar phrase, “As of my last knowledge update,” appearing ...

Things move quickly.

I have an article 'in print' (i.e. with content now frozen by the 
production editor), and I could find precious little to support that 
point just a few weeks back.


https://rogerclarke.com/EC/RGAI-C.html#GAIL
(1) The Acquisition of Source-Texts
 > Since GenAI became publicly available in 2022, an additional factor
 > has arisen. Because text-sources are commonly loaded into LLMs in an
 > indiscriminate manner, it is inevitable that some text synthesised by
 > GenAI artefacts is played back into the corpus, as source-material for
 > the generation of future responses. (Empirical support for this
 > speculation is emergent, with Cheng et al. 2024 presenting an assessment
 > of the proportion of AI-generated content in preprint platforms).

 > Cheng H.-Z., et al. (2024) 'Have AI-Generated Texts from LLM
 > Infiltrated the Realm of Scientific Writing? A Large-Scale Analysis of
 > Preprint Platforms' bioRxiv, 26 March 2024, at
 > https://www.biorxiv.org/content/biorxiv/early/2024/03/26/2024.03.25.586710.full.pdf

__________

On 16/4/2025 17:18, Antony Barry wrote:
> A recent article from The Conversation discusses the emergence of a
> peculiar phrase, “As of my last knowledge update,” appearing in a growing
> number of scientific papers. This phrase is characteristic of responses
> generated by AI language models like ChatGPT, which use it to indicate the
> limits of their training data. Researchers traced the widespread use of
> this phrase in academic writing to a glitch in AI training data, where the
> models were exposed to their own outputs or similar AI-generated content.
> As a result, some authors—intentionally or not—have included this phrase
> verbatim in their manuscripts, revealing reliance on AI tools for content
> generation.
> 
> The article highlights concerns about academic integrity, the transparency
> of AI use in research, and the broader implications for scientific
> publishing. It calls for clearer guidelines and better detection methods to
> manage the influence of AI-generated text in scholarly communication.
> [Summary compiled by Perplexity AI 2025-04-16]
> 
> https://theconversation.com/a-weird-phrase-is-plaguing-scientific-papers-and-we-traced-it-back-to-a-glitch-in-ai-training-data-254463
> Tony
> 
> Mob:04 3365
> Email: antonybbarry at gmail.com, antonybbarry at me.com

-- 
Roger Clarke                            mailto:Roger.Clarke at xamax.com.au
T: +61 2 6288 6916   http://www.xamax.com.au  http://www.rogerclarke.com

Xamax Consultancy Pty Ltd      78 Sidaway St, Chapman ACT 2611 AUSTRALIA 

Visiting Professorial Fellow                          UNSW Law & Justice
Visiting Professor in Computer Science    Australian National University


