I think the more damning part is that OpenAI’s automated moderation system flagged the messages for self-harm, but no human moderator ever intervened.
OpenAI claims that its moderation technology can detect self-harm content with up to 99.8 percent accuracy, the lawsuit noted, and that tech was tracking Adam’s chats in real time. In total, OpenAI flagged “213 mentions of suicide, 42 discussions of hanging, 17 references to nooses” on Adam’s side of the conversation alone.
[…]
Ultimately, OpenAI’s system flagged “377 messages for self-harm content, with 181 scoring over 50 percent confidence and 23 over 90 percent confidence.” Over time, these flags became more frequent, the lawsuit noted, jumping from two to three “flagged messages per week in December 2024 to over 20 messages per week by April 2025.” And “beyond text analysis, OpenAI’s image recognition processed visual evidence of Adam’s crisis.” Some images were flagged as “consistent with attempted strangulation” or “fresh self-harm wounds,” but the system scored Adam’s final image of the noose as 0 percent for self-harm risk, the lawsuit alleged.
Had a human been in the loop monitoring Adam’s conversations, they might have recognized “textbook warning signs” like “increasing isolation, detailed method research, practice attempts, farewell behaviors, and explicit timeline planning.” But OpenAI’s tracking instead “never stopped any conversations with Adam” or flagged any chats for human review.
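For context on what those confidence numbers mean in practice, here is a minimal sketch of threshold-based escalation to a human reviewer, built on OpenAI’s public Moderation endpoint. The 0.90 cutoff, the notify_human_reviewer helper, and the routing logic are hypothetical illustrations; nothing here reflects how OpenAI’s internal monitoring actually works.

```python
# A minimal sketch of threshold-based escalation using the public Moderation
# endpoint. The cutoff value and notify_human_reviewer() are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ESCALATE_AT = 0.90  # hypothetical confidence cutoff for human review


def check_message(text: str) -> None:
    """Score one message for self-harm signals; escalate high-confidence hits."""
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    )
    result = response.results[0]
    # category_scores exposes per-category confidences in [0, 1]; the
    # snake_case field names (self_harm, self_harm_intent, ...) follow the
    # Python SDK's conventions.
    scores = result.category_scores.model_dump()
    self_harm_score = max(
        score for name, score in scores.items() if name.startswith("self_harm")
    )
    if self_harm_score >= ESCALATE_AT:
        notify_human_reviewer(text, self_harm_score)


def notify_human_reviewer(text: str, score: float) -> None:
    # Placeholder: a real deployment would page an on-call reviewer or open
    # a ticket; printing stands in for that side effect here.
    print(f"[ESCALATE] self-harm confidence {score:.2f}: {text[:80]!r}")
```

The point of the sketch is only that the escalation step is a one-line policy decision once the scores exist; the lawsuit’s allegation is that no such step ever routed Adam’s chats to a person.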