[LINK] A weird phrase is plaguing scientific papers – and we traced it back to a glitch in AI training data
Roger Clarke
Roger.Clarke at xamax.com.au
Sun Apr 20 18:21:13 AEST 2025
On 16/4/2025 17:18, Antony Barry wrote:
> A recent article from The Conversation discusses the emergence of a
> peculiar phrase, “As of my last knowledge update,” appearing ...
Things move quickly.
I have an article 'in print' (i.e. with content now frozen by the
production editor), and I could find precious little to support that
point just a few weeks back.
https://rogerclarke.com/EC/RGAI-C.html#GAIL
(1) The Acquisition of Source-Texts
> Since GenAI became publicly available in 2022, an additional factor
> has arisen. Because text-sources are commonly loaded into LLMs in an
> indiscriminate manner, it is inevitable that some text synthesised by
> GenAI artefacts is played back into the corpus, as source-material for
> the generation of future responses. (Empirical support for this
> speculation is emergent, with Cheng et al. 2024 presenting an assessment
> of the proportion of AI-generated content in preprint platforms).
> Cheng H.-Z., et al. (2024) 'Have AI-Generated Texts from LLM
> Infiltrated the Realm of Scientific Writing? A Large-Scale Analysis of
> Preprint Platforms' bioRxiv, 26 March 2024, at
> https://www.biorxiv.org/content/biorxiv/early/2024/03/26/2024.03.25.586710.full.pdf
__________
On 16/4/2025 17:18, Antony Barry wrote:
> A recent article from The Conversation discusses the emergence of a
> peculiar phrase, “As of my last knowledge update,” appearing in a growing
> number of scientific papers. This phrase is characteristic of responses
> generated by AI language models like ChatGPT, which use it to indicate the
> limits of their training data. Researchers traced the widespread use of
> this phrase in academic writing to a glitch in AI training data, where the
> models were exposed to their own outputs or similar AI-generated content.
> As a result, some authors—intentionally or not—have included this phrase
> verbatim in their manuscripts, revealing reliance on AI tools for content
> generation.
>
> The article highlights concerns about academic integrity, the transparency
> of AI use in research, and the broader implications for scientific
> publishing. It calls for clearer guidelines and better detection methods to
> manage the influence of AI-generated text in scholarly communication.
> [Summary compiled by Perplexity AI 2025-04-16]
>
> https://theconversation.com/a-weird-phrase-is-plaguing-scientific-papers-and-we-traced-it-back-to-a-glitch-in-ai-training-data-254463
> Tony
>
> Mob:04 3365
> Email: antonybbarry at gmail.com, antonybbarry at me.com
--
Roger Clarke mailto:Roger.Clarke at xamax.com.au
T: +61 2 6288 6916 http://www.xamax.com.au http://www.rogerclarke.com
Xamax Consultancy Pty Ltd 78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Visiting Professorial Fellow UNSW Law & Justice
Visiting Professor in Computer Science Australian National University