AIs can generate near-verbatim copies of novels from training data

Technology · February 23, 2026
LLMs memorize more training data than previously thought.

The world’s top AI models can be prompted to generate near-verbatim copies of bestselling novels, raising fresh questions about the industry’s claim that its systems do not store copyrighted works.

A series of recent studies has shown that large language models from OpenAI, Google, Meta, Anthropic, and xAI memorize far more of their training data than previously thought.

AI and legal experts told the FT that this "memorization" ability could have serious ramifications for AI groups' battles against dozens of copyright lawsuits around the world, as it undermines their core defense that LLMs "learn" from copyrighted works but do not store copies of them.


Source: Ars Technica
