What We Mean by Collaborative Research Ecosystems
The future of outcomes research isn't a better database — it's a shared infrastructure where patients, providers, researchers, and industry work from the same foundation.
Ben Smith
Founder & CEO, Principia Health Sciences
Every few years, someone declares that healthcare data is finally going to be interoperable. Standards are published. Initiatives are launched. Pilot programs run successfully in controlled environments. And then the industry goes back to moving data around in spreadsheets and encrypted ZIP files.
The problem isn’t technical. We’ve had the technical ability to share data securely and at scale for years. The problem is structural: the incentives, governance models, and trust frameworks that would make data sharing routine simply don’t exist in most research contexts.
That’s what a Collaborative Research Ecosystem is designed to solve.
Beyond the data warehouse
The traditional approach to multi-stakeholder research is the centralized data warehouse. Everyone sends their data to a central repository, a coordinating center cleans and harmonizes it, and researchers submit queries against the combined dataset.
This model works. It’s also slow and expensive, and it creates governance nightmares. Who owns the combined dataset? Who controls access? What happens when a contributing site wants to withdraw? How do you handle data that’s subject to different regulatory regimes?
A Collaborative Research Ecosystem takes a different approach. Instead of centralizing data, it creates a shared infrastructure layer that allows each stakeholder to contribute data on their terms while still enabling the combined analysis that makes multi-site research valuable.
The three layers
A functional research ecosystem needs three things:
A common data model that doesn’t require common data. Each participating site keeps its data in whatever format and system it already uses. The ecosystem provides mapping and harmonization tools that translate local data into a shared analytical model — without requiring sites to change their workflows or replace their systems. In practice that means being standards-fluent: ingesting via FHIR and SMART on FHIR, harmonizing to the OMOP Common Data Model so every source lines up, and producing SDTM/CDASH outputs when the work is submission-grade. The standards do the heavy lifting; the sites don’t have to.
Governance that scales. Data use agreements, IRB approvals, consent management, and access controls need to be built into the infrastructure, not bolted on as an afterthought. Every query, every export, every analysis needs to be auditable and tied to a specific authorization.
Analytics that respect boundaries. Federated analytics, privacy-preserving computation, and tiered access models allow researchers to ask questions of data they can’t directly access. The insights move; the raw data doesn’t have to.
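To make the three layers concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how CuRE is implemented: the Site and Ecosystem classes, the field mappings, the protocol ID, and the small-count threshold of 5 are all hypothetical, and real harmonization to a model like OMOP involves far more than renaming fields. What it shows is the shape of the idea: each site maps its own records into a shared model, every query is checked against an authorization and logged, and only an aggregate count leaves the site.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Site:
    """A participating site (hypothetical); raw records stay inside this object."""
    name: str
    records: list      # local rows in the site's own schema
    field_map: dict    # local field name -> shared-model field name

    def harmonized(self) -> list:
        # Layer 1: translate local rows into the shared model, leaving the source untouched.
        return [
            {self.field_map[k]: v for k, v in row.items() if k in self.field_map}
            for row in self.records
        ]

    def count_where(self, field_name: str, value: str) -> int:
        # Layer 3: only this aggregate leaves the site, never the rows themselves.
        return sum(1 for row in self.harmonized() if row.get(field_name) == value)


@dataclass
class Ecosystem:
    sites: list
    authorizations: set        # {(researcher, protocol_id)}
    min_cell_size: int = 5     # assumed small-count suppression threshold
    audit_log: list = field(default_factory=list)

    def federated_count(self, researcher, protocol_id, field_name, value) -> Optional[int]:
        allowed = (researcher, protocol_id) in self.authorizations
        # Layer 2: every attempt is logged, allowed or not, and tied to an authorization.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "researcher": researcher,
            "protocol_id": protocol_id,
            "query": (field_name, value),
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{researcher} is not authorized under {protocol_id}")
        total = sum(site.count_where(field_name, value) for site in self.sites)
        # Suppress small totals that could point to individual patients.
        return total if total >= self.min_cell_size else None


# Two sites with different local schemas, mapped to the same shared field.
clinic_a = Site("clinic_a", [{"dx": "E11"}] * 4 + [{"dx": "I10"}], {"dx": "condition"})
clinic_b = Site("clinic_b", [{"diagnosis_code": "E11"}] * 3, {"diagnosis_code": "condition"})
eco = Ecosystem([clinic_a, clinic_b], {("alice", "P-001")})
print(eco.federated_count("alice", "P-001", "condition", "E11"))
```

In this toy setup the authorized query returns a combined count of 7 across both clinics, while a query that matches only one patient returns nothing at all, and an unauthorized researcher gets an error that is still written to the audit log. Production systems would typically add stronger protections, such as per-site suppression or formal privacy-preserving computation, on top of this basic pattern.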
Why this matters now
Three trends are converging to make collaborative research ecosystems not just possible but necessary:
Regulatory pressure for real-world evidence. FDA, EMA, and payers increasingly expect evidence from real-world settings, not just controlled trials. Generating that evidence requires data from clinical practice, which means engaging providers and health systems as research partners, not just data sources.
Patient expectations. Patients increasingly expect to participate in research as partners, to have visibility into how their data is used, and to benefit from the research they contribute to. A walled-garden approach to data doesn’t meet those expectations.
Economic reality. The cost of traditional prospective data collection is becoming prohibitive for many research questions. Organizations that can leverage existing clinical data — with appropriate consent and governance — will have a structural advantage.
What we’re building
CuRE is our implementation of this model. It’s not a database or an analytics platform — it’s the connective tissue that allows a disease association, its member clinics, their patients, and industry sponsors to operate as a unified research program while each maintains control of their own data, systems, and relationships.
It works because it’s one platform, not a toolbox. Twelve products — from EHR connectivity and patient-reported outcomes to harmonization, analytics, and decision support — compose on a single governed record. Every event lands on the same shared substrate, so there are no flat-file islands and no nightly reconciliation between systems that were never meant to talk to each other.
Is it harder to build than a data warehouse? Yes. But it solves the right problem: not “how do we store data” but “how do we do research together.”