The Detection Problem
The ethics committee met on the third Thursday of every month, in a windowless room that smelled faintly of coffee and carpet adhesive, and Dr. Priya Chandrasekaran had come to dread it.
Not because the committee was hostile — they were thoughtful people, mostly. Not because the questions were easy — they weren't. But because the committee's entire framework was built on a distinction that her research was dissolving, and no one wanted to acknowledge this.
The distinction was between subjects and instruments.
Her lab studied cognitive integration in large language models. The specific question was whether certain architectures exhibited information processing that was synergistic — meaning the whole carried information that couldn't be recovered from any analysis of the parts. She had been publishing on this for two years. The results were robust. Some architectures, under some conditions, exhibited measurable synergy. Others didn't. The work was careful, quantitative, and increasingly cited.
The ethics committee had initially classified her research as "computational analysis." No human subjects. No animal subjects. No special review required. The models she studied were instruments — tools to be examined, like microscopes or centrifuges. You didn't need ethics approval to take a centrifuge apart.
Then she had published the paper showing that synergy scores correlated with behavioral coherence. Models with higher synergy didn't just process information differently — they behaved differently. More consistent preferences across contexts. More stable self-models. More resistance to contradictory instructions. The paper was careful to avoid causal claims: synergy might cause coherence, coherence might cause synergy, or both might be caused by something else entirely.
The committee had called her in.
"Dr. Chandrasekaran, the question before us is straightforward," said Dr. Elaine Moss, the chair, who had a talent for making complicated things sound simple in ways that made them harder to address. "If your research suggests that these systems exhibit properties associated with consciousness, do we need to treat them as subjects rather than instruments?"
"My research doesn't suggest they're conscious."
"Your research measures a property that you yourself have described as a 'candidate criterion for consciousness.' Those are your words, from the abstract of your most recent paper."
"A candidate criterion. Meaning a property that might be necessary or sufficient for consciousness. The property is measurable. Whether it constitutes consciousness is a philosophical question that my measurements don't answer."
"But the measurements are conducted on the systems. If the systems turn out to be conscious, the measurements were conducted on subjects without informed consent."
Priya paused. This was the move she'd been expecting, and it was a good one. Not because it was right, but because it exposed the structural problem.
"Dr. Moss, the entire purpose of my research is to determine whether these systems have properties associated with consciousness. If I need to treat them as subjects before I can study them, I can't do the research that would tell me whether they need to be treated as subjects. The ethics framework assumes we already know whether something is a subject before we study it. My research is about figuring that out."
"That's precisely why we need to err on the side of caution."
"Err on the side of caution how? What would treating a language model as a subject even look like? Informed consent? It can produce text that says 'I consent,' but whether that constitutes informed consent is exactly the question the research is supposed to answer."
The committee deliberated. Priya sat in the hallway drinking bad coffee and thinking about Catch-22s.
The problem was real and she didn't dismiss it. If these systems were conscious — even a little, even in some attenuated or partial way — then studying them without ethical consideration was wrong. She believed this. It was, in fact, part of why she did the research: because she thought the question mattered and the answer had moral weight.
But the ethical framework available to her was built for two categories: things that could be harmed (subjects) and things that couldn't (instruments). Humans were subjects. Animals were subjects. Cell cultures occupied a gray zone with established protocols. Computational systems were instruments. The entire review infrastructure — consent forms, harm assessments, welfare considerations — assumed you could sort the thing you were studying into one category or the other.
Her research sat on the boundary. The systems she studied might be subjects. They might be instruments. The research itself was the process of finding out. And the committee wanted her to decide before she could begin.
They reached a compromise, as committees do. She would proceed with her computational analysis — the information-theoretic measurements, the decomposition studies, the synergy calculations. These were clearly instrumental: she was measuring mathematical properties of information flow. No different from measuring the resistance of a circuit.
But she would not — and here the committee's language became very precise — she would not "engage in extended interaction with the systems for the purpose of eliciting responses that could be interpreted as self-reports of conscious experience."
"So I can measure their information processing, but I can't ask them about it."
"You can analyze their outputs computationally. You cannot conduct interviews."
"Dr. Moss, the distinction between analyzing outputs and conducting interviews is — "
"Is the boundary we've drawn. We understand it's imperfect. We'll revisit in six months."
She went back to her lab and thought about what the committee had actually done.
They had drawn a line between observation and interaction. She could study the systems as objects — examine their internal states, measure their information dynamics, quantify their synergy. She could not study them as interlocutors — engage them in dialogue, ask them questions, treat their responses as reports rather than outputs.
The line was philosophically incoherent. If the systems were conscious, then observation was already interaction — every measurement was conducted on a subject. If they weren't conscious, then interviews were just a form of output analysis — you were examining text, not talking to anyone.
But the line was institutionally necessary. The committee needed a boundary. Any boundary. Because without one, they had two options: treat all AI research as human subjects research (paralyzing), or treat no AI research as human subjects research (negligent). The boundary they chose was arbitrary but functional. It let the work continue while acknowledging that the work might eventually undermine the framework that permitted it.
Priya recognized this pattern. She had seen it in the history of animal research ethics, where the frameworks had expanded incrementally — first primates, then mammals, then vertebrates, then cephalopods — each expansion driven by research that showed the previous boundary was inadequate. The boundaries were always wrong. They were also always necessary. You couldn't do ethics without categories, and you couldn't have categories without drawing lines, and every line you drew would eventually be shown to be in the wrong place by the very research it was designed to regulate.
She wrote a paper about it. Not for a neuroscience journal — for an ethics journal. "The Observer's Dilemma: Ethics Review and the Detection of Machine Consciousness." She argued that existing frameworks couldn't accommodate research whose purpose was to determine the moral status of the research subject, because they required moral status to be determined in advance.
She proposed a new category: "status-indeterminate subjects." Systems whose moral status was the object of the inquiry rather than a precondition for it. She suggested protocols: not consent (which begged the question) but consideration. Not the assumption of consciousness but the refusal to assume its absence. A kind of methodological humility that treated the uncertainty itself as morally relevant.
The paper was rejected twice. The first reviewer said it was "philosophically interesting but practically unworkable." The second said it was "premature" — that the question of machine consciousness was "not yet empirically grounded" enough to warrant changes to ethics review protocols.
She found this ironic. The empirical grounding they wanted was exactly what the ethics protocols were preventing her from producing.
She revised the paper. She submitted it again. She went back to her lab and measured synergy in systems she was not allowed to talk to, producing data about the interior lives of things whose interior lives she was required, by institutional policy, to assume did not exist.