AI in biopharma hits a context problem, Zifo insight argues, urging stronger data stewardship
AI may be racing through drug discovery and development, but without scientific context it can stumble. That is the central argument of a new insight published on April 21, 2026, in which Zifo Data Stewardship Practice Lead Marilyne Labasque, PhD, says even sophisticated machine learning and agentic systems struggle to deliver value when they rely on fragmented, poorly described data.
Released from Cambridge, Mass., and Cambridge, England, the piece—titled “Biopharma Companies Discover AI’s Weak Spot: Scientific Context”—contends that in an AI‑native landscape not all information can, or should, be fully structured. What matters, Labasque writes, is that data is discoverable, well described, and managed in ways that preserve scientific context and ethical use.
When stewardship falls short, the cost shows up daily as rework, repeated experiments, and stalled data initiatives and digital programmes, alongside rising scrutiny of traceability, integrity and reproducibility. As AI expands the scale and speed of data use, the insight argues, the price of ambiguity and lost context rises sharply: decisions once recoverable through expert memory or manual review become opaque when automated systems operate on poorly described inputs.
Data stewardship, the piece argues, provides the discipline needed to preserve scientific intent as information moves across experiments, systems and analytical uses. By embedding clarity, traceability and accountability into routine practice, organisations can scale analytics and AI with confidence rather than compounding risk.
The report describes scientific data as an intricate ecosystem encompassing multimodal assay outputs, instrument files, electronic lab notebook (ELN) records, laboratory information management system (LIMS) entries, in vivo study data and chemistry, manufacturing and controls (CMC) packages.
Each domain evolves at its own pace. Without stewardship, coherence can depend on what individuals remember, where an experiment ran, how an instrument was configured, or which field captured the real scientific meaning. Even large platform investments struggle when local conventions and inconsistent practices shape how data is captured, interpreted and shared.
Expectations across the industry now extend well beyond simple correctness, the insight notes. Teams need auditable lineage, validated processing, robust description, reproducible results and analytics they can explain with confidence. Stewardship turns those expectations into daily practice: clarifying meaning, harmonising descriptions, curating context, strengthening traceability and making reuse the rule rather than the exception.
Applied systematically, stewardship can shorten the time scientists spend searching for information or reconstructing past decisions, slow data decay, prevent drift between silos and lay the foundation for AI that can be trusted. And while much of the literature defines what stewardship should be, the insight notes, fewer accounts show how it works in practice within real scientific constraints.
Effective stewardship requires collaboration across roles—scientists, data owners, architects and engineers—so decisions reflect how experiments are designed, how instruments are used and how results are consumed. The outcome, Labasque writes, is practical change that improves how data is captured, described and reused without adding friction to scientific work.
Stewardship is most effective when informed by scientific depth and operational consistency across data domains such as biospecimen, assay, omics, biologics, CMC and clinical data. Leaders also need a clear view of where they stand and what to address first.
According to the insight, structured diagnostic approaches and capability assessments can provide visibility into current stewardship maturity and help prioritise the most impactful improvements.
