SEMCA 7

Substrate-Agnostic Cross-Substrate Consciousness Theory Comparison

Architectural variance dominates stimulus variance in six of seven substrate-agnostic consciousness operationalizations.

4 transformer models. 3 fMRI subjects. 76 narrative stories.

DOI: 10.5281/zenodo.20435290

The finding.

Population magnitudes overlap across AI and human substrates. Per-stimulus correlations are essentially zero for six of seven theories. The AI substrate's per-story variance is dominated by architecture, not stimulus content.

Unified-score gap: −1.72 (AI 60.88 vs Human 62.59 · within 1 SD)
Per-stimulus r (6 of 7): ≈ 0 (r ∈ [−0.17, +0.18] · all p > 0.1)
GWT partial exception: r = +0.365 (p = 0.001 · architecture-dependent)

The apparent cross-substrate magnitude overlap is consistent with coincidental alignment of architecturally-driven AI variance and stimulus-driven human variance. The substrate-independence claim of contemporary consciousness theory, as standardly operationalized, is not testable on transformer substrates with naturalistic narrative stimuli using these operationalizations.

Four methodological pillars.

One abstraction. Seven operationalizations. Two substrates. Information-geometric integration.

PILLAR 1

Substrate Abstraction

Any dynamical system observable as a (T × N) activity matrix qualifies. Transformer attention activations and fMRI BOLD signals both fit. Identical Python code runs on either.

PILLAR 2

Seven Operationalizations

IIT, GWT, AST, HOT, PPT, QIT, FEP — each implemented as a function of the substrate's activity matrix and node groups. No substrate-type branching in any calculator.

PILLAR 3

Cross-Substrate Comparison

Apply identical operationalizations to AI substrates (4 transformer architectures) and human substrates (3 fMRI subjects) on the same naturalistic narrative stimuli.

PILLAR 4

Fisher-Rao Integration

Information-geometric integration of the seven theory scores via Riemannian mean on a Fisher-Rao manifold. Consensus, coherence, and dynamic theory weights per cell.
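
As a rough illustration of the geometry behind this pillar, the sketch below computes the Fisher-Rao geodesic distance between two score profiles and their two-point Riemannian mean. It assumes each cell's seven theory scores are normalized into a probability vector, which the description above does not specify. The names `to_simplex`, `fisher_rao_distance`, and `fisher_rao_midpoint` are hypothetical, the score values are placeholders, and the consensus, coherence, and weighting steps are not reproduced.

```python
import math


def to_simplex(scores):
    """Normalize positive scores so they sum to 1 (assumed embedding)."""
    total = sum(scores)
    return [s / total for s in scores]


def fisher_rao_distance(p, q):
    """Geodesic distance between two discrete distributions.

    The square-root map sends the simplex onto the unit sphere, where the
    Fisher-Rao distance is twice the angle between the mapped points.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 2.0 * math.acos(min(1.0, max(-1.0, bc)))


def fisher_rao_midpoint(p, q):
    """Two-point Riemannian mean: the midpoint of the connecting geodesic."""
    root = [math.sqrt(a) + math.sqrt(b) for a, b in zip(p, q)]
    norm = math.sqrt(sum(r * r for r in root))
    return [(r / norm) ** 2 for r in root]


# Placeholder profiles in the order IIT, GWT, AST, HOT, PPT, QIT, FEP.
profile_a = to_simplex([60.0, 55.0, 90.0, 10.0, 50.0, 40.0, 45.0])
profile_b = to_simplex([62.0, 70.0, 95.0, 12.0, 48.0, 35.0, 50.0])

distance = fisher_rao_distance(profile_a, profile_b)
midpoint = fisher_rao_midpoint(profile_a, profile_b)
```

The midpoint lies at equal Fisher-Rao distance from both profiles. A mean over more than two points would need an iterative procedure, which is left out here.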

Seven substrate-agnostic operationalizations

Each theory implemented as a function of the substrate's (T × N) activity matrix. Identical Python code runs on transformer attention activations and on K-means-parcellated fMRI BOLD.

  • IIT (Integrated Information Theory): Normalized-cut Φ on substrate interaction graph
  • GWT (Global Workspace Theory): Cross-group ignition / broadcast index
  • AST (Attention Schema Theory): Linear self-predictability across groups
  • HOT (Higher-Order Thought Theory): Canonical correlation between early and late groups
  • PPT (Predictive Processing Theory): Prediction error magnitude and reduction
  • QIT (Quantum Information Theory): Long-range mutual information weighted by separation
  • FEP (Free Energy Principle): Prediction error + KL divergence against isotropic prior
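
To make one of these ingredients concrete, here is a minimal sketch of a KL divergence against an isotropic prior, the second term listed for FEP. It assumes each node's activity is summarized as an independent Gaussian and that the prior is a standard normal; neither assumption comes from the list above. The function name `kl_to_isotropic_prior` and the `eps` guard are hypothetical, and the prediction-error term and the mapping to a [0, 100] score are omitted.

```python
import math
from statistics import fmean, pvariance


def kl_to_isotropic_prior(activity, eps=1e-8):
    """KL( N(mu, diag(var)) || N(0, I) ), summed over the N nodes.

    `activity` is a (T x N) matrix given as a list of T rows.
    Per node: 0.5 * (var + mu^2 - 1 - ln var).
    """
    n_nodes = len(activity[0])
    kl = 0.0
    for j in range(n_nodes):
        column = [row[j] for row in activity]
        mu = fmean(column)
        var = pvariance(column, mu) + eps  # eps keeps log() finite for flat nodes
        kl += 0.5 * (var + mu * mu - 1.0 - math.log(var))
    return kl


# Node 0 has mean 0 and variance 1 (matches the prior, contributes ~0).
# Node 1 has mean 2 and variance 1 (contributes 0.5 * 2^2 = 2).
toy_activity = [[-1.0, 1.0], [1.0, 3.0]]
kl_value = kl_to_isotropic_prior(toy_activity)
```

Because the function reads only the activity matrix, the same call would apply to attention-derived or BOLD-derived matrices.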

The substrate abstraction

Any dynamical system observable as an activity matrix of dimensions (T timesteps × N nodes) qualifies — molecular dynamics, neural recordings, simulated agents. The seven theory calculators accept any Substrate and produce a score in [0, 100]. Adding a new substrate type requires only implementing the abstract methods.
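
A minimal sketch of what such an abstraction could look like is shown below. The method names `activity` and `node_groups`, the `ArraySubstrate` class, and the toy calculator are all hypothetical and are not taken from the SEMCA code; the calculator is a placeholder, not one of the seven theories.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from statistics import fmean, pstdev


class Substrate(ABC):
    """Hypothetical interface: anything observable as a (T x N) matrix."""

    @abstractmethod
    def activity(self) -> list[list[float]]:
        """Return T rows (timesteps), each holding N node activations."""

    @abstractmethod
    def node_groups(self) -> dict[str, list[int]]:
        """Map a group name to the column indices of its nodes."""


class ArraySubstrate(Substrate):
    """Toy substrate backed by an in-memory matrix (illustration only)."""

    def __init__(self, matrix, groups):
        self._matrix = matrix
        self._groups = groups

    def activity(self):
        return self._matrix

    def node_groups(self):
        return self._groups


def toy_group_variability_score(substrate: Substrate) -> float:
    """Placeholder calculator returning a score in [0, 100].

    Averages the temporal standard deviation of each group's mean activity
    and squashes it with x / (1 + x). It reads only the activity matrix and
    the node groups, so it never branches on substrate type.
    """
    rows = substrate.activity()
    spreads = []
    for columns in substrate.node_groups().values():
        group_trace = [fmean(row[c] for c in columns) for row in rows]
        spreads.append(pstdev(group_trace))
    raw = fmean(spreads)
    return 100.0 * raw / (1.0 + raw)


demo = ArraySubstrate(
    matrix=[[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]],
    groups={"early": [0, 1], "late": [2, 3]},
)
score = toy_group_variability_score(demo)
```

Under this design, a new substrate type only has to supply the two abstract methods; any calculator written against `Substrate` then applies to it unchanged.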

The data.

Four transformer architectures and three fMRI subjects processing 76 narrative stories from the LeBel et al. 2023 Moth Radio Hour collection (OpenNeuro ds003020, CC0 licensed).

AI substrates

Attention-mass per head per position, collected over 1500-token story chunks.

Mistral 7B v0.3 · 7B
Mistral-Nemo 12B · 12B
Llama 3.1 8B · 8B
Phi-3 mini 4k · 3.8B

Human substrates

Per-sentence evoked BOLD (5-sec HRF lag + 4-sec window), K-means parcellated to 200 parcels × 10 networks.
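
The lag-plus-window step can be sketched as below. This is an illustration under stated assumptions: the repetition time `tr_s=2.0` is a placeholder, the window is anchored at sentence onset, and a volume counts as inside the window when its acquisition start time falls in [onset + 5 s, onset + 9 s). The function names are hypothetical, and the K-means parcellation is not shown.

```python
import math
from statistics import fmean


def evoked_window_indices(onset_s, tr_s=2.0, lag_s=5.0, window_s=4.0):
    """Indices of volumes whose start time lies in the lagged window."""
    start = onset_s + lag_s
    stop = start + window_s
    first = math.ceil(start / tr_s)   # first volume at or after `start`
    last = math.ceil(stop / tr_s)     # first volume at or after `stop` (excluded)
    return list(range(first, last))


def evoked_response(bold, onset_s, **window_kwargs):
    """Average the (T x N) BOLD matrix over the lagged window, per parcel.

    Returns None when the window falls entirely outside the recording.
    """
    indices = [
        i for i in evoked_window_indices(onset_s, **window_kwargs)
        if i < len(bold)
    ]
    if not indices:
        return None
    n_parcels = len(bold[0])
    return [fmean(bold[i][j] for i in indices) for j in range(n_parcels)]


# Toy recording: 12 volumes, 2 parcels, values chosen for easy checking.
toy_bold = [[float(t), 2.0 * t] for t in range(12)]
window = evoked_window_indices(10.0)        # sentence starting at 10 s
response = evoked_response(toy_bold, 10.0)  # mean over volumes 8 and 9
```

Stacking one such response per sentence would give a (sentences × parcels) matrix, which has the (T × N) shape the theory calculators expect.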

UTS01 · 76 stories
UTS02 · 76 stories
UTS03 · 76 stories

Total (substrate, stimulus) cells scored: 532
Matched-corpus design: 76 stories present in all 4 AI models and all 3 fMRI subjects · 4 × 76 = 304 AI cells + 3 × 76 = 228 human cells · 7 theories per cell
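
The cell count follows directly from the matched design. The short enumeration below checks the arithmetic; the story identifiers are placeholders, not the dataset's actual story names.

```python
from itertools import product

models = ["Mistral 7B v0.3", "Mistral-Nemo 12B", "Llama 3.1 8B", "Phi-3 mini 4k"]
subjects = ["UTS01", "UTS02", "UTS03"]
stories = [f"story_{i:02d}" for i in range(76)]  # placeholder IDs
theories = ["IIT", "GWT", "AST", "HOT", "PPT", "QIT", "FEP"]

# One cell per (substrate, stimulus) pair.
ai_cells = list(product(models, stories))
human_cells = list(product(subjects, stories))
total_cells = len(ai_cells) + len(human_cells)

# Each cell is scored by all seven theories.
theory_scores = total_cells * len(theories)
```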

Per-substrate results.

Unified-score means across 76 LeBel stories per substrate cell. AI per-story standard deviations are 0.66–1.21; human per-story SDs are 2.5–2.95. Both populations cluster tightly within substrate type.

Per-substrate unified-score means

Mean ± SD across 76 LeBel stories. The GWT cross-r column is the Pearson r between that substrate's GWT scores and the subject-averaged human GWT scores (cross-substrate signal).

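| Substrate | Type | Size | Unified mean | SD | GWT cross-r | Notes |
|---|---|---|---|---|---|---|
| Phi-3 mini 4k | ai | 3.8B | 62.71 | ±0.63 | +0.173 | |
| Llama 3.1 8B | ai | 8B | 60.60 | ±0.77 | +0.067 | HOT saturated |
| Mistral 7B v0.3 | ai | 7B | 60.42 | ±1.07 | −0.185 | HOT saturated |
| Mistral-Nemo 12B | ai | 12B | 59.72 | ±1.15 | +0.428 | HOT near-saturated |
| UTS01 | human | | 62.33 | ±2.51 | | |
| UTS02 | human | | 63.00 | ±2.60 | | |
| UTS03 | human | | 62.45 | ±2.95 | | |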

Per-theory cross-substrate correlation (n = 76 matched stories)

Two formulations: a conservative mean of r over the 12 model-subject pairs (4 models × 3 subjects), and a single noise-averaged r. Permutation p comes from a 5000-iteration label shuffle.

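| Theory | 12-pair mean r | Noise-avg r | Permutation p | Interpretation |
|---|---|---|---|---|
| IIT | +0.003 | +0.181 | 0.118 | no signal |
| GWT | +0.119 | +0.365 | 0.001 | real signal |
| AST | +0.002 | +0.043 | 0.718 | ceiling-saturated |
| HOT | NaN | +0.101 | 0.404 | floor-saturated (2/4 models) |
| PPT | +0.031 | +0.038 | 0.744 | no signal |
| QIT | −0.021 | −0.063 | 0.583 | no signal |
| FEP | −0.086 | −0.167 | 0.150 | no signal |
| Unified | −0.033 | −0.087 | 0.447 | no signal |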
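
The two formulations and the permutation test can be sketched as follows. This is one plausible reading, not the repository's code: it assumes the noise-averaged r correlates per-story means taken across models with per-story means taken across subjects, and that the permutation test is two-sided with story labels shuffled on one side. All function names are hypothetical and the demo data are synthetic. The sketch also shows how a saturated (constant) model makes a pairwise r undefined, which is one possible reason for a NaN in the pair-mean column.

```python
import math
import random
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation; NaN when either input has exactly zero variance."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return math.nan
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def story_means(group):
    """Average per-story scores across the members of a group."""
    return [fmean(values) for values in zip(*group.values())]


def pair_mean_r(ai, human):
    """Mean r over all (model, subject) pairs; NaN if any pair is undefined."""
    rs = [pearson_r(a, h) for a in ai.values() for h in human.values()]
    return math.nan if any(math.isnan(r) for r in rs) else fmean(rs)


def noise_averaged_r(ai, human):
    """Single r between model-averaged and subject-averaged story scores."""
    return pearson_r(story_means(ai), story_means(human))


def permutation_p(x, y, iterations=5000, seed=0):
    """Two-sided p-value from shuffling the story labels of y."""
    observed = abs(pearson_r(x, y))
    if math.isnan(observed):
        return math.nan
    rng = random.Random(seed)
    shuffled = list(y)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (iterations + 1)  # add-one correction


# Synthetic demo: 76 stories, one ordinary model, one floor-saturated model.
rng = random.Random(42)
human = {s: [rng.gauss(62.0, 2.5) for _ in range(76)]
         for s in ("UTS01", "UTS02", "UTS03")}
ai = {
    "model_a": [rng.gauss(60.0, 1.0) for _ in range(76)],
    "model_floor": [0.0] * 76,  # constant score, so pairwise r is undefined
}
conservative = pair_mean_r(ai, human)   # NaN because of the saturated model
averaged = noise_averaged_r(ai, human)  # still defined after averaging
p_value = permutation_p(story_means(ai), story_means(human), iterations=500)
```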

The six figures.

Same data, six angles. Per-stimulus scatter, population distributions, forest plot, variance decomposition, GWT per-architecture, geometric integration.

What this means.

Magnitude overlap reflects variance-source coincidence, not measurement concordance.

The reframing

Prior comparisons of consciousness theories on AI and humans found population-level magnitude overlap and read it as evidence that the theories cannot discriminate. The substrate-level variance decomposition reveals why the magnitudes overlap:

Human substrate variance is stimulus-driven.
AI substrate variance is architecturally-driven.
The magnitudes happen to overlap.

For six of seven theories, the AI substrate's per-story differentiation reflects which model is running, not which story is being processed. The apparent cross-substrate concordance is coincidental alignment of architecturally-specific AI means with stimulus-averaged human means.

What the data establish

  • AI variance is architectural. For 6 of 7 theories, between-model SD exceeds between-story SD by 2× to 18×.
  • Cross-substrate r ≈ 0. Per-stimulus rankings are uncorrelated across substrates for 6 of 7 theories.
  • GWT partial exception. r = +0.365 (model-averaged), p = 0.001, split-half stable, but driven by one architecture (Mistral-Nemo 12B, GWT cross-r = +0.428).
  • Result is robust. Fisher-Rao Riemannian integration confirms the null (r = −0.185), stable across uniform/identity/20 random prior matrices.
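
The between-model versus between-story comparison in the first bullet can be sketched as below. The exact estimator is not stated here, so this sketch assumes between-model SD is the sample SD of per-model means and between-story SD is the average of each model's sample SD across stories. The function name and the toy scores are hypothetical.

```python
from statistics import fmean, stdev


def variance_source_ratio(scores):
    """Compare architectural spread with stimulus spread for one theory.

    `scores` maps a model name to its per-story scores.
    Returns (between_model_sd, between_story_sd, ratio).
    """
    model_means = [fmean(values) for values in scores.values()]
    between_model_sd = stdev(model_means)
    between_story_sd = fmean(stdev(values) for values in scores.values())
    return between_model_sd, between_story_sd, between_model_sd / between_story_sd


# Toy scores: models differ by 10 points, stories by about 1 point.
toy_scores = {
    "model_a": [50.0, 51.0, 49.0],
    "model_b": [70.0, 71.0, 69.0],
    "model_c": [60.0, 61.0, 59.0],
}
model_sd, story_sd, ratio = variance_source_ratio(toy_scores)
```

A ratio well above 1 means that knowing which model produced a score says more about its value than knowing which story was processed.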

What the data don't establish

  • Not phenomenal absence. The data measure operationalizations of theories, not consciousness itself.
  • Not theory falsification. Different operationalizations of the same theories might recover substrate-shared signal.
  • Not all substrates. The negative result is specific to transformer attention activations summarized in this way.
  • Not all stimuli. Different stimulus regimes (perceptual decision, masked access, sleep-wake) might produce different signatures.

On the substrate-independence claim

Contemporary mathematical consciousness theories are commonly framed as substrate-independent — identical math applied to any sufficiently structured dynamical system. The substrate-level empirical claim, on this evidence:

The substrate-independence claim of contemporary consciousness theory is not testable on transformer substrates via these operationalizations on naturalistic narrative stimuli: no AI-side stimulus-driven measurement of sufficient signal-to-noise is available to compare with the human-side measurement. Whether stimulus-relevant signal exists in the AI substrate but is invisible to these operationalizations, or no such signal exists at all, the present data do not decide.

What to do with it.

01

Reproducing the research

Clone the repo, run the analyses on pre-computed substrate scores, and verify the per-theory cross-substrate correlations. Sub-minute reproduction on a laptop; full re-derivation from raw OpenNeuro takes ~3 hours on 2× H100.

02

Extending the substrate abstraction

The Substrate ABC accepts any (T × N) activity matrix. Add new substrate types (spiking networks, EEG, MEG, recurrent nets) by subclassing — the seven theory calculators apply unchanged.

03

Testing alternative operationalizations

Each theory's substrate-agnostic operationalization is one rendering. Different aggregations (per-token vs per-layer, MLP outputs, residual stream norms) might recover stimulus-relevant AI signal for the six near-null theories.

04

Cross-substrate research framework

The same machinery applies to any pair of substrates run on shared stimuli. The open-source pipeline reproduces the results from a raw OpenNeuro download in one command; a commodity cloud GPU is sufficient.

Cite.

@misc{travis2026semca7,
  title  = {Architectural Variance Dominates Stimulus Variance in
            Six of Seven Substrate-Agnostic Consciousness
            Operationalizations},
  author = {Travis, Nate},
  year   = {2026},
  howpublished = {Preprint, Devmance Labs},
  url    = {https://github.com/devmance/SEMCA}
}