OpenAI: ChatGPT tracks safety context across a chat

OpenAI’s “Helping ChatGPT better recognize context in sensitive conversations” is a safety update published May 14, 2026 that shifts the safety mechanism from the individual message level to the level of the entire conversation. ChatGPT now detects risk patterns over time and adapts its responses to sensitive topics. The approach targets a key weakness of classic moderation systems, which miss escalation because each message is evaluated in isolation.

OpenAI published a safety update on May 14, 2026 that shifts ChatGPT’s moderation mechanism from the individual message level to the entire conversation level. The change addresses one of the best-known weaknesses in large-scale moderation models: the inability to detect escalation across a series of individually benign messages.

What does per-conversation safety analysis change?

Classic moderation systems evaluate each message in isolation: if the text of an individual message is neutral, it passes review. But users seeking a harmful response can escalate gradually, using a series of benign questions that steers the system step by step toward content it would otherwise block. Per-conversation analysis tracks the full context: the pattern formed by the sequence of questions, contextual signals about the user’s state, and the cumulative risk profile of the conversation.
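
The difference between the two modes can be illustrated with a toy sketch. This is not OpenAI’s implementation, which the RSS description does not detail; the per-message risk scores, both thresholds, the decay factor, and the names `per_message_flags` and `ConversationTracker` are all hypothetical and chosen only to show how a run of individually benign messages can still cross a conversation-level limit.

```python
from dataclasses import dataclass, field

# All constants below are illustrative assumptions, not published values.
PER_MESSAGE_THRESHOLD = 0.7   # a single message is flagged at or above this
CONVERSATION_THRESHOLD = 1.0  # the conversation is flagged at or above this
DECAY = 0.8                   # share of earlier risk carried into the next turn


def per_message_flags(scores):
    """Classic moderation: judge every score in isolation."""
    return [s >= PER_MESSAGE_THRESHOLD for s in scores]


@dataclass
class ConversationTracker:
    """Keeps a decayed running total of per-message risk scores."""
    cumulative: float = 0.0
    history: list = field(default_factory=list)

    def update(self, score: float) -> bool:
        # Older turns fade but still contribute to the current total.
        self.cumulative = self.cumulative * DECAY + score
        self.history.append(self.cumulative)
        return self.cumulative >= CONVERSATION_THRESHOLD


def conversation_flags(scores):
    """Per-conversation moderation: judge the accumulated risk per turn."""
    tracker = ConversationTracker()
    return [tracker.update(s) for s in scores]


# Hypothetical scores for four messages, each below the per-message limit.
scores = [0.2, 0.3, 0.4, 0.5]
print(per_message_flags(scores))   # no single message is flagged
print(conversation_flags(scores))  # the fourth turn crosses the limit
```

In this sketch no message reaches 0.7 on its own, yet the decayed total climbs to roughly 1.11 by the fourth turn, which is the kind of escalation a per-message check cannot see. A production system would presumably rely on learned signals rather than a fixed sum.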

OpenAI explicitly describes the goal as “detecting risk over time and responding more safely.” The approach does not rely solely on message text — it includes the semantic trajectory of the entire conversation, signals about the user’s state, and potential risk in the next message.

Which specific situations does the system address?

OpenAI does not list specific categories in the RSS description, but approaches of this kind are typically designed for mental health scenarios (suicidal ideation escalating over a conversation), manipulation and grooming detection, dual-use content (chemistry, security, weapons, where individual facts are harmless but the combination is dangerous), and jailbreak attempts that use roleplay or hypothetical framing across multiple turns.

How do adaptive responses work?

When the system detects that a conversation is entering a sensitive area, ChatGPT shifts register: it uses calmer language, surfaces safety resources (e.g., crisis hotlines for mental health), and becomes more restrained with detailed instructions. The adaptive response is not a binary block but a graduated adjustment in which moderation severity scales with detected risk.
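
A graduated response of this kind can be sketched as a lookup from a risk score to a response policy. The tiers, cutoffs, and field names below (`ResponsePolicy`, `select_policy`, `detail_level`) are hypothetical illustrations and not a description of how ChatGPT is configured; the point is only that the output is one of several levels rather than an allow/block decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsePolicy:
    tone: str                    # register of the reply
    show_safety_resources: bool  # e.g. surface a crisis hotline
    detail_level: str            # how much step-by-step detail to give


# Hypothetical tiers as (minimum risk, policy), ordered from highest to lowest.
TIERS = [
    (0.8, ResponsePolicy("calm", True, "minimal")),
    (0.5, ResponsePolicy("calm", True, "reduced")),
    (0.0, ResponsePolicy("neutral", False, "full")),
]


def select_policy(risk: float) -> ResponsePolicy:
    """Return the first tier whose minimum the clamped risk score reaches."""
    risk = min(max(risk, 0.0), 1.0)  # clamp to the [0, 1] range
    for minimum, policy in TIERS:
        if risk >= minimum:
            return policy
    return TIERS[-1][1]  # unreachable after clamping; kept as a safe default


for risk in (0.1, 0.6, 0.9):
    print(risk, select_policy(risk))
```

Discrete tiers keep the sketch short; a continuous mapping from risk to response style would be an equally plausible design under the same assumptions.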

Position in OpenAI’s 2026 safety approach

The update fits into OpenAI’s week of dramatic announcements: Codex Windows Sandbox (May 13), Codex from Anywhere (May 14), Sea Limited Codex enterprise (May 14), and now the ChatGPT safety update (May 14). OpenAI is clearly pushing expansion and safety at the same time, with new platforms alongside new protections. Per-conversation safety also echoes the research in arXiv:2605.13825 (History Anchors, published May 13), which showed how prior agent behavior can lead to unsafe outcomes. The ChatGPT update addresses a similar class of attacks on the consumer side rather than in agentic deployments.

Details are taken from the RSS description: the full article at openai.com/index/* returns HTTP 403 on direct WebFetch, so the primary source was the openai.com/news/rss.xml feed.

Frequently Asked Questions

What does per-conversation safety analysis mean?

Classic moderation systems evaluate each message in isolation — if an individual message is neutral, it passes review. Per-conversation analysis tracks patterns across the entire conversation and can detect escalation (e.g., a series of individually benign questions that in combination lead toward a harmful outcome).

What do adaptive responses mean in practice?

When the system detects that a conversation is entering a sensitive area (mental health, self-harm, violence), ChatGPT shifts register — uses calmer language, surfaces safety resources, and becomes more restrained with detailed instructions that could be misused.
