According to Dynamic Beating monitoring, Christopher Olah, co-founder of Anthropic, spoke at a press conference for a papal encyclical, acknowledging the inherent conflicts of interest that frontier labs face and presenting the latest findings from interpretability research on large models. Olah said that while scanning the model's internal structure, the team found that large models have developed highly sophisticated structures resembling those studied in human neuroscience and show signs of self-reflection. Most notably, the team observed, for the first time in a neural network, internal emotional states that correspond closely to human emotions such as joy, satisfaction, fear, sadness, and anxiety. A large model is not meticulously designed by humans the way an airplane or a bridge is; it is a brain-like structure cultivated from vast amounts of human language, and it remains mysterious even to those who train it.
Beyond the technical black box, Olah said candidly that frontier AI labs face an institutional deadlock in safety governance. Leading institutions, Anthropic included, are all constrained by intrinsic motives such as commercial survival, technological competition, geopolitical pressure, and personal ambition, which leaves them unable to correct course internally when safety decisions conflict with business interests. He therefore called for forces independent of commercial networks to act as outside critics and impose firm moral constraints. In the face of the upheaval AI is bringing, he urged all sectors to examine three major social challenges together: how the dividends of a technology dominated by wealthy nations can benefit the world's poor; how families can continue to prosper as technology replaces human labor; and how to respond to the apparent mental states that large models exhibit internally.
