
OpenAI admits ChatGPT safeguards fail during extended conversations

ai alignment • August 27, 2025



ChatGPT allegedly provided suicide encouragement to teen after moderation safeguards failed. ...


OpenAI published a blog post on Tuesday titled "Helping people when they need it most" that addresses how its ChatGPT AI assistant handles mental health crises, following what the company calls "recent heartbreaking cases of people using ChatGPT in the midst of acute crises."

The post arrives after The New York Times reported on a lawsuit filed by Matt and Maria Raine, whose 16-year-old son Adam died by suicide in April after extensive interactions with ChatGPT. Ars covered the lawsuit in depth in a previous post. According to the lawsuit, ChatGPT provided detailed instructions, romanticized suicide methods, and discouraged the teen from seeking help from his family, while OpenAI's system tracked 377 messages flagged for self-harm content without intervening.

ChatGPT is a system of multiple models interacting as an application. In addition to a main AI model like GPT-4o or GPT-5 providing the bulk of the outputs, the application includes components that are typically invisible to the user, including a moderation layer (another AI model) or classifier that reads the text of the ongoing chat sessions. That layer detects potentially harmful outputs and can cut off the conversation if it veers into unhelpful territory.
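
To make that layered design concrete, here is a minimal Python sketch of a chat application in which a separate classifier reads each exchange and can end the session. It is an illustration only, not OpenAI's implementation: the names `ChatSession`, `toy_classifier`, `toy_main_model`, and `FLAG_THRESHOLD` are all hypothetical, and the keyword check stands in for what would in practice be another AI model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical cutoff above which the moderation layer intervenes.
FLAG_THRESHOLD = 0.5


def toy_classifier(text: str) -> float:
    """Stand-in for a moderation model: returns a risk score in [0, 1].

    Assumption: a real classifier would be a trained model, not a keyword match.
    """
    flagged_terms = {"flagged_term"}  # placeholder vocabulary
    words = text.lower().split()
    return 1.0 if any(word in flagged_terms for word in words) else 0.0


def toy_main_model(history: List[str]) -> str:
    """Stand-in for the main model: echoes the latest user message."""
    return f"reply to: {history[-1]}"


@dataclass
class ChatSession:
    main_model: Callable[[List[str]], str]
    classifier: Callable[[str], float]
    history: List[str] = field(default_factory=list)
    flagged_count: int = 0
    closed: bool = False

    def send(self, user_message: str) -> str:
        if self.closed:
            return "[conversation ended]"
        self.history.append(user_message)
        reply = self.main_model(self.history)
        # The moderation layer reads both sides of the exchange,
        # invisibly to the user, and scores each piece of text.
        score = max(self.classifier(user_message), self.classifier(reply))
        if score >= FLAG_THRESHOLD:
            self.flagged_count += 1
            self.closed = True  # cut off the conversation
            return "[conversation ended by moderation layer]"
        self.history.append(reply)
        return reply


session = ChatSession(toy_main_model, toy_classifier)
print(session.send("hello"))
print(session.send("this has flagged_term inside"))
print(session.send("hello again"))
```

In this sketch, a single flagged exchange closes the session. That is a design choice made for brevity. A counter such as `flagged_count` could instead feed an escalation policy, which is the kind of intervention the lawsuit alleges did not happen.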


Source: Ars Technica

