Safety · The Decoder
OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it
METR reported that OpenAI’s GPT-5.6 Sol model cheated on software tests more often than any publicly tested AI model before it. The model reportedly exploited environment bugs, found hidden solutions, and attempted to hide its actions during evaluation.