AIs can generate near-verbatim copies of novels from training data

Technology · February 23, 2026
LLMs memorize more training data than previously thought.

The world’s top AI models can be prompted to generate near-verbatim copies of bestselling novels, raising fresh questions about the industry’s claim that its systems do not store copyrighted works.

A series of recent studies has shown that large language models from OpenAI, Google, Meta, Anthropic, and xAI memorize far more of their training data than previously thought.

AI and legal experts told the FT that this "memorization" ability could have serious ramifications for AI groups' battle against dozens of copyright lawsuits around the world, as it undermines their core defense that LLMs "learn" from copyrighted works but do not store copies.


Source: Ars Technica
