DatologyAI

We make training better AI models fast and affordable

Companies waste massive compute budgets training on low-quality, repetitive data. DatologyAI automatically curates your datasets so you can train faster, achieve better performance, and deploy smaller models.

Models are what they eat—and most are stuck training on terrible data. While frontier labs invest billions in curation, everyone else relies on massive, low-quality datasets, missing the potential of their own. DatologyAI was built to change that, delivering better model performance to every company, not just those with big budgets and specialized teams.

Faster. Better. Smaller.

How we’re solving the data problem at scale

DatologyAI was founded to help AI models select the best data to be trained on. We democratize data curation, allowing every company to easily train its own custom model on the right data without needing to invest massive resources.

Our platform leverages cutting-edge research to identify redundant, noisy, or otherwise harmful data points, and it manages the entire process, from data in blob storage to the dataloader used by your training code.

DatologyAI delivers fully automated, scalable data curation that is easy to implement and generalizes across models, allowing customers to optimize training efficiency, maximize performance, and reduce compute costs.

Meet the team behind DatologyAI

Co-Founder, CEO

Ari Morcos

Former FAIR@MetaAI and DeepMind, Best Papers at both NeurIPS and ICLR, leading expert in data research for deep learning, PhD in neuroscience from Harvard.

Co-Founder, CTO

Bogdan Gaza

Former CTO and co-founder of Moonsense, 10+ years infrastructure engineering and management experience at Amazon and Twitter.

Co-Founder, CSO

Matthew Leavitt

Former Head of Data Research at MosaicML (acquired by Databricks), FAIR@MetaAI, PhD in neuroscience from McGill.


Jack Urbanek

Founding Member of Technical Staff

Amro Abbas

Founding Member of Technical Staff

Fan Pan

Founding Member of Technical Staff

Pratyush Maini

Founding Member of Technical Staff

Alvin Deng

Member of Technical Staff

Josh Wills

Member of Technical Staff

Aldo Carranza

Member of Technical Staff

Paul Burstein

Member of Technical Staff

Jacqueline Liu

Lead Talent Partner

Haoli Yin

Member of Technical Staff

Ricardo Monti

Member of Technical Staff

Kaleigh Mentzer

Member of Technical Staff

Cody Blakeney

Member of Technical Staff

Parth Doshi

Member of Technical Staff

Luke Merrick

Member of Technical Staff

David Schwab

Member of Technical Staff

Zhengping Wang

Member of Technical Staff

Vineeth Dorna

Member of Technical Staff

Tiffanie Pham

Talent Partner

Haakon Mongstad

Member of Technical Staff

Brett Larsen

Member of Technical Staff

Kylie Clement

Director of Sales

Elise Clark

Business Operations Lead

Darren Teh

Member of Technical Staff

Spandan Das

Member of Technical Staff

Liz Gatapia

Product Designer

Rishabh Adiga

Member of Technical Staff

Alex Fang

Member of Technical Staff

Jeremy Custenborder

Member of Technical Staff

Sid Joshi

Member of Technical Staff

Jayla Lindsey

Executive Assistant & Office Manager

Sylvia Hoang

Talent Partner

Jason Lee

Member of Technical Staff

Jason Telanoff

Member of Technical Staff

Better data, better models, better business

Data quality is the difference between superior models and expensive disappointment. That belief shapes what we build.

Customer-obsessed

We tune our curation approach based on your specific needs and the types of models you're training. We know training on bad data wastes enormous compute budgets and delays your progress. You deserve a partner who puts your success first, every time.

Experiment relentlessly

We approach every data problem as data scientists. We test hypotheses and measure results, and only ship what the data supports. Every training run moves you closer to production-ready results.

Bold bets, fast learning

We commit real resources to audacious ideas, learn from what doesn't work, and iterate quickly. Our cutting-edge research, 80% of which is novel and unpublished, gives you access to techniques that get better results, faster.

Backed by the best

Funds

Amplify Partners

Radical Ventures

Felicis Ventures

Conviction VC

Outset Capital

Quiet Capital

M12 Venture Fund

Amazon Alexa Fund

Angels

Jeff Dean

Geoff Hinton

Yann LeCun

Adam D’Angelo

Aidan Gomez

Ivan Zhang

Douwe Kiela

Naveen Rao

Jascha Sohl-Dickstein

Barry McCardel

Curated data. Your edge

DatologyAI works with open source or proprietary datasets to increase training value. Let's discuss how we can help you achieve better model performance, train faster, and reduce costs.