DataLab

About DataLab

DataLab exists because truly useful data is rare—yet the frontier of AI development only moves forward when high-quality data makes it possible.

We understand the three core pillars driving AI: models, chips, and data. We are convinced that with the right datasets—the third, underdeveloped pillar—you can push the entire frontier forward. Imagine a task force explicitly focused on producing the next 100 or 1,000 ImageNet-grade datasets. Where would that take AI?

Our Goal

We aim to create the intellectual scaffolding that AI researchers look for when navigating ambiguity in data quality, data selection, medical complexity, and annotation strategy. This includes methodological advice, evaluation design, and guidance on the safe and effective use of sensitive real-world data.

Our Approach

DataLab is deeply technical but grounded in reality. We think at the margin, always weighing the marginal value of each datapoint for learning and the opportunity cost of choosing the wrong dataset. We treat data decisions the way an applied researcher treats resource allocation: every cohort you build, every inclusion rule, every missing field has a cost. The question is always: what does this next datapoint buy us, and what do we lose if we chase the wrong thing?

Our team of in-house experts constantly innovates to produce, repackage, and surface novel datasets from existing real-world data. We aim to be the single source of truth on using real-world data correctly to train and evaluate AI models.

What We Do

Scientific partnership for commercial opportunities.

We speak directly to foundation model researchers to navigate frontier-level technical discussions.

Building high-value datasets and data products.

Deep methodological discipline, exposure to commercial data applications, and rigorous processes turn the lab's work into new product opportunities.

Leading AI data research.

We maintain a presence within the broader academic community by publishing cutting-edge data research, designing evaluations and benchmarks, and identifying gaps in today's training and evaluation data.

Read more on our blog and our research page.

Join Us

Interested in our mission? Learn about our people or visit our contact page to get in touch.