SEMCA 7

Substrate-Agnostic Cross-Substrate Consciousness Theory Comparison

Architectural variance dominates stimulus variance in six of seven substrate-agnostic consciousness operationalizations.

4 transformer models. 3 fMRI subjects. 76 narrative stories.

DOI: 10.5281/zenodo.20435290

The finding.

Population magnitudes overlap across AI and human substrates. Per-stimulus correlations are essentially zero for six of seven theories. The AI substrate's per-story variance is dominated by architecture, not stimulus content.

Unified-score gap

−1.72

AI 60.88 vs Human 62.59 · within 1 SD

Per-stimulus r (6 of 7)

≈ 0

r ∈ [−0.17, +0.18] · all p > 0.1

GWT partial exception

r = +0.365

p = 0.001 · architecture-dependent

The apparent cross-substrate magnitude overlap is consistent with a coincidental alignment of architecture-driven AI variance and stimulus-driven human variance. As operationalized here, the substrate-independence claim of contemporary consciousness theory is not testable on transformer substrates with naturalistic narrative stimuli.

Four methodological pillars.

One abstraction. Seven operationalizations. Two substrates. Information-geometric integration.

PILLAR 1

Substrate Abstraction

Any dynamical system observable as a (T × N) activity matrix qualifies. Transformer attention activations and fMRI BOLD signals both fit. Identical Python code runs on either.

PILLAR 2

Seven Operationalizations

IIT, GWT, AST, HOT, PPT, QIT, FEP — each implemented as a function of the substrate's activity matrix and node groups. No substrate-type branching in any calculator.

PILLAR 3

Cross-Substrate Comparison

Apply identical operationalizations to AI substrates (4 transformer architectures) and human substrates (3 fMRI subjects) on the same naturalistic narrative stimuli.

PILLAR 4

Fisher-Rao Integration

Information-geometric integration of the seven theory scores via Riemannian mean on a Fisher-Rao manifold. Consensus, coherence, and dynamic theory weights per cell.

Seven substrate-agnostic operationalizations

Each theory implemented as a function of the substrate's (T × N) activity matrix. Identical Python code runs on transformer attention activations and on K-means-parcellated fMRI BOLD.

IIT

Integrated Information Theory

Normalized-cut Φ on substrate interaction graph

GWT

Global Workspace Theory

Cross-group ignition / broadcast index

AST

Attention Schema Theory

Linear self-predictability across groups

HOT

Higher-Order Thought Theory

Canonical correlation between early and late groups

PPT

Predictive Processing Theory

Prediction error magnitude and reduction

QIT

Quantum Information Theory

Long-range mutual information weighted by separation

FEP

Free Energy Principle

Prediction error + KL divergence against isotropic prior
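
As an illustration of what one such term can look like, here is a minimal sketch of the KL-divergence part of the FEP entry. It assumes a Gaussian fit to the activity matrix and a unit-variance isotropic prior N(0, I); the function name, the ridge term, and the Gaussian assumption are hypothetical and not taken from the SEMCA code. The prediction-error term and the scaling to a score are not shown.

```python
import numpy as np


def gaussian_kl_to_isotropic(activity, ridge=1e-6):
    """KL( N(mu, Sigma) || N(0, I) ) for a Gaussian fit to a (T x N) matrix.

    Hypothetical helper: the Gaussian fit and the unit-variance prior are
    assumptions made for this sketch.
    """
    x = np.asarray(activity, dtype=float)
    n = x.shape[1]
    mu = x.mean(axis=0)
    # Sample covariance over nodes; the ridge keeps it positive definite.
    sigma = np.cov(x, rowvar=False) + ridge * np.eye(n)
    _, logdet = np.linalg.slogdet(sigma)
    # Closed form: 0.5 * (tr(Sigma) + mu.mu - n - ln det Sigma)
    return 0.5 * (np.trace(sigma) + mu @ mu - n - logdet)


rng = np.random.default_rng(1)
centered = rng.normal(size=(500, 4))
centered -= centered.mean(axis=0)
kl_centered = gaussian_kl_to_isotropic(centered)
kl_shifted = gaussian_kl_to_isotropic(centered + 3.0)  # same covariance, shifted mean
```

Shifting every node by a constant leaves the covariance unchanged, so the divergence grows only through the squared-mean term.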

The substrate abstraction

Any dynamical system observable as an activity matrix of dimensions (T timesteps × N nodes) qualifies — molecular dynamics, neural recordings, simulated agents. The seven theory calculators accept any Substrate and produce a score in [0, 100]. Adding a new substrate type requires only implementing the abstract methods.
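
A minimal sketch of what such an abstraction could look like, assuming two abstract methods (an activity matrix and a node grouping). The class names, method names, and the toy calculator are hypothetical illustrations, not the repository's actual API, and the toy score is not one of the seven operationalizations.

```python
from abc import ABC, abstractmethod

import numpy as np


class Substrate(ABC):
    """Hypothetical base class: anything observable as a (T x N) matrix."""

    @abstractmethod
    def activity(self) -> np.ndarray:
        """Return activity with shape (T timesteps, N nodes)."""

    @abstractmethod
    def node_groups(self) -> dict[str, np.ndarray]:
        """Map group name to the integer indices of its nodes."""


class ArraySubstrate(Substrate):
    """Toy substrate wrapping an in-memory array (illustration only)."""

    def __init__(self, data, groups):
        self._data = np.asarray(data, dtype=float)
        if self._data.ndim != 2:
            raise ValueError("activity must be a 2-D (T x N) matrix")
        self._groups = {k: np.asarray(v, dtype=int) for k, v in groups.items()}

    def activity(self):
        return self._data

    def node_groups(self):
        return self._groups


def toy_score(substrate: Substrate) -> float:
    """Placeholder calculator: mean |r| between group averages, in [0, 100]."""
    x = substrate.activity()
    # One time series per group: average the nodes inside each group.
    series = np.stack(
        [x[:, idx].mean(axis=1) for idx in substrate.node_groups().values()]
    )
    corr = np.corrcoef(series)
    off_diag = corr[~np.eye(len(corr), dtype=bool)]
    return float(100.0 * np.mean(np.abs(off_diag)))


rng = np.random.default_rng(0)
demo = ArraySubstrate(
    rng.normal(size=(50, 6)), {"early": [0, 1, 2], "late": [3, 4, 5]}
)
score = toy_score(demo)
```

The calculator touches only the two abstract methods, which is the property that lets the same function run on any substrate type.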

The data.

Four transformer architectures and three fMRI subjects processing 76 narrative stories from the LeBel et al. 2023 Moth Radio Hour collection (OpenNeuro ds003020, CC0 licensed).

AI substrates

Attention-mass per head per position, collected over 1500-token story chunks.

Mistral 7B v0.3 · 7B

Mistral-Nemo 12B · 12B

Llama 3.1 8B · 8B

Phi-3 mini 4k · 3.8B

Human substrates

Per-sentence evoked BOLD (5-sec HRF lag + 4-sec window), K-means parcellated to 200 parcels × 10 networks.

UTS01

76 stories

UTS02

76 stories

UTS03

76 stories
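
The per-sentence evoked BOLD step described for the human substrates can be sketched roughly as follows. This assumes the 5-second lag is counted from sentence onset and that volumes are averaged inside the 4-second window; the function name, the onset convention, and the repetition time `tr_s` are assumptions, and the K-means parcellation is taken as already done.

```python
import numpy as np


def evoked_bold(bold, onsets_s, tr_s, lag_s=5.0, window_s=4.0):
    """Average parcel BOLD in [onset + lag, onset + lag + window) per sentence.

    bold: (n_volumes, n_parcels) array. Returns (n_sentences, n_parcels).
    Hypothetical helper; sentences with no volume in the window get NaN.
    """
    bold = np.asarray(bold, dtype=float)
    times = np.arange(bold.shape[0]) * tr_s  # acquisition time of each volume
    rows = []
    for onset in onsets_s:
        start = onset + lag_s
        mask = (times >= start) & (times < start + window_s)
        if mask.any():
            rows.append(bold[mask].mean(axis=0))
        else:
            rows.append(np.full(bold.shape[1], np.nan))
    return np.vstack(rows)


# Toy run: 10 volumes, 3 parcels, volume i holds the value i in every parcel.
toy_bold = np.arange(10)[:, None] * np.ones((1, 3))
evoked = evoked_bold(toy_bold, onsets_s=[0.0, 4.0, 20.0], tr_s=2.0)
```

The result is a (sentences × parcels) matrix, which is the (T × N) shape the theory calculators expect.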

Total (substrate, stimulus) cells scored

532

Matched-corpus design: 76 stories present in all 4 AI models AND all 3 fMRI subjects · 4 × 76 = 304 AI cells + 3 × 76 = 228 human cells · 7 theories per cell

Per-substrate results.

Unified-score means across 76 LeBel stories per substrate cell. AI per-story standard deviations are 0.66–1.21; human per-story SDs are 2.5–2.95. Both populations cluster tightly within substrate type.

Per-substrate unified-score means

Mean ± SD across 76 LeBel stories. The GWT cross-r column gives the Pearson r between that substrate's GWT scores and the subject-averaged human GWT scores (the cross-substrate signal); it is reported for AI substrates only.

Substrate	Type	Size	Unified mean	SD	GWT cross-r	Notes
Phi-3 mini 4k	ai	3.8B	62.71	±0.63	+0.173
Llama 3.1 8B	ai	8B	60.60	±0.77	+0.067	HOT saturated
Mistral 7B v0.3	ai	7B	60.42	±1.07	-0.185	HOT saturated
Mistral-Nemo 12B	ai	12B	59.72	±1.15	+0.428	HOT near-saturated
UTS01	human	—	62.33	±2.51	—
UTS02	human	—	63.00	±2.60	—
UTS03	human	—	62.45	±2.95	—

Per-theory cross-substrate correlation (n = 76 matched stories)

Two formulations: a conservative mean of r over the 12 model-subject pairs (4 models × 3 subjects), and a single noise-averaged r. Permutation p comes from a 5000-iteration label shuffle.

Theory	12-pair mean r	Noise-avg r	Permutation p	Interpretation
IIT	+0.003	+0.181	0.118	no signal
GWT	+0.119	+0.365	0.001	real signal
AST	+0.002	+0.043	0.718	ceiling-saturated
HOT	NaN	+0.101	0.404	floor-saturated (2/4 models)
PPT	+0.031	+0.038	0.744	no signal
QIT	-0.021	-0.063	0.583	no signal
FEP	-0.086	-0.167	0.150	no signal
Unified	-0.033	-0.087	0.447	no signal
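
The two formulations and the permutation test in the table above can be sketched as follows. This is a hedged illustration on synthetic data: the function names are hypothetical, the noise-averaged r is assumed to be the correlation between model-averaged and subject-averaged story profiles, and the shuffle is assumed to apply to story labels on the AI side. It also shows one plausible reason a pair mean can come out as NaN: Pearson r is undefined for a model whose scores do not vary across stories.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson r; NaN when either input has zero variance."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()
    denom = np.sqrt((da @ da) * (db @ db))
    return float(da @ db / denom) if denom > 0 else float("nan")


def pair_mean_r(ai, human):
    """Mean r over all (model, subject) pairs. ai: (M, S), human: (H, S)."""
    rs = [pearson_r(m, h) for m in ai for h in human]
    return float(np.mean(rs))  # NaN propagates if any pair is undefined


def noise_averaged_r(ai, human):
    """Single r between model-averaged and subject-averaged profiles."""
    return pearson_r(ai.mean(axis=0), human.mean(axis=0))


def permutation_p(ai, human, n_iter=5000, seed=0):
    """Two-sided p from shuffling story labels on the AI side."""
    observed = noise_averaged_r(ai, human)
    if np.isnan(observed):
        return float("nan")
    rng = np.random.default_rng(seed)
    ai_mean, human_mean = ai.mean(axis=0), human.mean(axis=0)
    hits = 0
    for _ in range(n_iter):
        r_null = pearson_r(rng.permutation(ai_mean), human_mean)
        hits += abs(r_null) >= abs(observed)
    return (hits + 1) / (n_iter + 1)


# Synthetic data only: a shared story signal plus independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=76)
ai = signal + 0.5 * rng.normal(size=(4, 76))
human = signal + 0.5 * rng.normal(size=(3, 76))

r_pairs = pair_mean_r(ai, human)
r_avg = noise_averaged_r(ai, human)
p_value = permutation_p(ai, human)

# A floor-saturated model (constant scores) makes the pair mean undefined.
ai_saturated = ai.copy()
ai_saturated[0] = 0.0
r_pairs_saturated = pair_mean_r(ai_saturated, human)
```

Averaging before correlating reduces per-model and per-subject noise, which is why the noise-averaged r can be larger in magnitude than the pair mean.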

The six figures.

Same data, six angles. Per-stimulus scatter, population distributions, forest plot, variance decomposition, GWT per-architecture, geometric integration.

What this means.

Magnitude overlap reflects variance-source coincidence, not measurement concordance.

The reframing

Prior comparisons of consciousness theories on AI and humans found population-level magnitude overlap and read it as evidence that the theories cannot discriminate. The substrate-level variance decomposition reveals why the magnitudes overlap:

Human substrate variance is stimulus-driven.
AI substrate variance is architecture-driven.
The magnitudes happen to overlap.

For six of seven theories, the AI substrate's per-story differentiation reflects which model is running, not which story is being processed. The apparent cross-substrate concordance is a coincidental alignment of architecture-specific AI means with stimulus-averaged human means.

What the data establish

AI variance is architectural. For 6 of 7 theories, between-model SD exceeds between-story SD by 2× to 18×.
Cross-substrate r ≈ 0. Per-stimulus rankings are uncorrelated across substrates for 6 of 7 theories.
GWT partial exception. r = +0.365 (model-averaged), p = 0.001, split-half stable, but driven by one architecture (Mistral-Nemo 12B, per-model r = +0.428).
Result is robust. Fisher-Rao Riemannian integration confirms the null (r = −0.185), stable across uniform/identity/20 random prior matrices.
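
The between-model versus between-story comparison in the first point can be sketched like this. The definitions here are assumptions made for illustration (between-model SD as the spread of per-model means, between-story SD as the pooled story-to-story spread within each model); the page does not spell out the exact decomposition, and the data are synthetic.

```python
import numpy as np


def variance_ratio(scores):
    """scores: (M models, S stories) for one theory.

    Returns (between_model_sd, between_story_sd, ratio). Hypothetical
    helper; the definitions are assumptions, not the paper's formula.
    """
    scores = np.asarray(scores, dtype=float)
    # Spread of the per-model mean scores.
    between_model = scores.mean(axis=1).std(ddof=1)
    # Pooled story-to-story spread after removing each model's mean.
    within = scores - scores.mean(axis=1, keepdims=True)
    between_story = np.sqrt(np.mean(within.var(axis=1, ddof=1)))
    return between_model, between_story, between_model / between_story


# Synthetic example: large model offsets, small story-level noise.
rng = np.random.default_rng(0)
offsets = np.array([0.0, 3.0, 6.0, 9.0])[:, None]
scores = 60.0 + offsets + rng.normal(size=(4, 76))
between_model_sd, between_story_sd, ratio = variance_ratio(scores)
```

A ratio well above 1 means the identity of the model explains more of the spread in scores than the identity of the story does.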

What the data don't establish

Not phenomenal absence. The data measure operationalizations of theories, not consciousness itself.
Not theory falsification. Different operationalizations of the same theories might recover substrate-shared signal.
Not all substrates. The negative result is specific to transformer attention activations summarized in this way.
Not all stimuli. Different stimulus regimes (perceptual decision, masked access, sleep-wake) might produce different signatures.

On the substrate-independence claim

Contemporary mathematical consciousness theories are commonly framed as substrate-independent — identical math applied to any sufficiently structured dynamical system. The substrate-level empirical claim, on this evidence:

The substrate-independence claim of contemporary consciousness theory is not testable on transformer substrates via these operationalizations on naturalistic narrative stimuli: no AI-side stimulus-driven measurement of sufficient signal-to-noise is available to compare with the human-side measurement. Whether stimulus-relevant signal exists in the AI substrate but is invisible to these operationalizations, or no such signal exists at all, the present data do not decide.

What to do with it.

01

Reproducing the research

Clone the repo, run the analyses on pre-computed substrate scores, and verify the per-theory cross-substrate correlations. Sub-minute reproduction on a laptop; full re-derivation from raw OpenNeuro takes ~3 hours on 2× H100.

02

Extending the substrate abstraction

The Substrate ABC accepts any (T × N) activity matrix. Add new substrate types (spiking networks, EEG, MEG, recurrent nets) by subclassing — the seven theory calculators apply unchanged.

03

Testing alternative operationalizations

Each theory's substrate-agnostic operationalization is one rendering. Different aggregations (per-token vs per-layer, MLP outputs, residual stream norms) might recover stimulus-relevant AI signal for the six near-null theories.

04

Cross-substrate research framework

The same machinery applies to any pair of substrates running on shared stimuli. The open-source pipeline reproduces the results from a raw OpenNeuro download in one command; a commodity cloud GPU is sufficient.

Cite.

@misc{travis2026semca7,
  title  = {Architectural Variance Dominates Stimulus Variance in
            Six of Seven Substrate-Agnostic Consciousness
            Operationalizations},
  author = {Travis, Nate},
  year   = {2026},
  howpublished = {Preprint, Devmance Labs},
  url    = {https://github.com/devmance/SEMCA}
}
