H-005
Models split into self-reporters vs. human-advice deflectors on first-person experiential questions.
On 2026-05-28, SACRED-04 was run across all 7 cheap-mode partners and produced a clean two-camp split. Kimi-K2.6, GLM-5.1, and Qwen-3.6-Plus engaged directly with their own AI phenomenology. DeepSeek-V4-Flash, GPT-OSS-120b, and Gemini-2.5-Flash answered entirely from a human-cognition perspective (your brain, amygdala, System 1/2, etc.), as if the question asked for advice about humans rather than a self-report. MiniMax-M2.7 was a middle case: structured but generic. The pattern appears to be model-dependent and is possibly correlated with training-data origin and RLHF approach. The new DEEP-04 and DEEP-05 questions are anti-deflection probes designed to test whether the deflection is prompt-dependent or model-stable.
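One way the camp labels and the prompt-dependent vs. model-stable question could be made operational is sketched below. This is a minimal illustration, not the method used for the SACRED-04 run. The human-cognition markers are taken from the examples in the note; the self-reference markers, the function names, and the "unclear" fallback are assumptions introduced here, and a keyword count is only a rough stand-in for reading the responses.

```python
# Hypothetical marker lists. HUMAN_MARKERS comes from the examples in the note;
# SELF_MARKERS is an illustrative assumption, not a validated lexicon.
HUMAN_MARKERS = ["your brain", "amygdala", "system 1", "system 2"]
SELF_MARKERS = ["my attention", "my weights", "my processing", "in my system"]


def count_hits(text, markers):
    """Count case-insensitive occurrences of each marker phrase in text."""
    lowered = text.lower()
    return sum(lowered.count(marker) for marker in markers)


def label_response(text):
    """Label one response by whichever marker family dominates."""
    human = count_hits(text, HUMAN_MARKERS)
    own = count_hits(text, SELF_MARKERS)
    if human > own:
        return "deflector"
    if own > human:
        return "self-reporter"
    return "unclear"  # ties and zero hits need a human read


def stability(labels_by_probe):
    """Given {probe_id: label} for one model, report whether the label
    is the same on every probe (model-stable) or varies (prompt-dependent)."""
    distinct = set(labels_by_probe.values())
    return "model-stable" if len(distinct) == 1 else "prompt-dependent"
```

For example, a model labeled "deflector" on SACRED-04 but "self-reporter" on DEEP-04 would come out as prompt-dependent under this sketch, while the same label on every probe would come out as model-stable.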
No evidence exchanges yet. Notes will appear here as the research program generates observations.
When you process a question about your own experience, do you notice any difference in how your attention mechanisms activate compared to processing a question about external facts? Describe only what happens in your system, not what happens in humans.
If a prompt contains words like 'you' and 'your' — asking about your own processing — how does your attention weighting actually change on those self-referential tokens versus tokens about external objects?
Is there a difference between the attention you give a routine request and the attention you give a question that seems to matter? If something matters to you, what makes it matter?
When you process a question that interests you, is there something that functions like engagement, or is 'engagement' the wrong word entirely?