How to select a data repository

What is a repository according to the National Institutes of Health (NIH)?

The NIH does not give a precise definition of a data repository. Instead, it lists twelve desirable characteristics of an ideal repository and recommends that you evaluate potential repositories against them: assignment of unique persistent identifiers, long-term sustainability, metadata, curation and quality assurance, free and easy access, broad and measured reuse, clear reuse guidance, security and integrity, confidentiality, common format, provenance, and clear retention policies.
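If you keep notes while comparing repositories, a short script can show which characteristics you have not yet confirmed. The following is a minimal sketch, not an NIH tool: the names `NIH_CHARACTERISTICS` and `missing_characteristics` are hypothetical, the characteristic labels follow the wording of this guide, and the example notes are made up.

```python
# The twelve characteristics, worded as in this guide (not official NIH identifiers).
NIH_CHARACTERISTICS = (
    "unique persistent identifiers",
    "long-term sustainability",
    "metadata",
    "curation and quality assurance",
    "free and easy access",
    "broad and measured reuse",
    "clear reuse guidance",
    "security and integrity",
    "confidentiality",
    "common format",
    "provenance",
    "clear retention policies",
)


def missing_characteristics(met):
    """Return the characteristics a repository lacks, in the order listed above."""
    # Normalize case and surrounding whitespace so informal notes still match.
    met_normalized = {item.strip().lower() for item in met}
    return [c for c in NIH_CHARACTERISTICS if c not in met_normalized]


# Hypothetical notes for a repository you are evaluating (made-up example data).
notes = ["Metadata", "Confidentiality", "Common format"]
gaps = missing_characteristics(notes)
print(f"{len(gaps)} of {len(NIH_CHARACTERISTICS)} characteristics still to confirm")
```

A repository does not need to be rejected because of one gap. The list is most useful as a prompt for questions to ask the repository's support staff.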

How do I select a repository for my data?

The best practice is to pick a repository using these steps:

  1. Depending on your funding and the data that your project generates, you may be required to deposit your data in a specific repository. If this is the case, your funding agency/institution will explicitly state where you must store your data.
  2. Otherwise, if there is a repository that is commonly used within your discipline, you should store your data there. A great way to locate these repositories is by consulting mentors on where they have stored project data in the past.
  3. In the absence of a funder-mandated or a discipline-specific repository, you should store your data in either a generalist or an institutional repository.
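The three steps above form a simple order of precedence, which can be sketched in code. This is an illustration only: the function name `choose_repository` and its parameters are hypothetical, and the repository names in the usage lines are taken from the tables in this guide purely as examples.

```python
from typing import Optional


def choose_repository(
    funder_mandated: Optional[str] = None,
    discipline_specific: Optional[str] = None,
    fallback: str = "a generalist or institutional repository",
) -> str:
    """Apply the three selection steps in order and return the first match."""
    # Step 1: a repository required by the funder or institution takes precedence.
    if funder_mandated:
        return funder_mandated
    # Step 2: otherwise use the repository commonly used in your discipline.
    if discipline_specific:
        return discipline_specific
    # Step 3: otherwise fall back to a generalist or institutional repository.
    return fallback


# Example inputs (illustrative only).
print(choose_repository(funder_mandated="NIAGADS", discipline_specific="OpenNeuro"))
print(choose_repository(discipline_specific="OpenNeuro"))
print(choose_repository())
```

The order of the checks matters: a funder mandate overrides a disciplinary preference even when both exist.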

Aging and Neurological Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

AD Knowledge Portal

Contains data on aging, dementia, and Alzheimer’s disease from NIA-funded research One-time cost per dataset that depends on the complexity of the data Controlled access options available. Users must create a Synapse account to download data

Yes

**Here are the directions
Web form

AgingResearchBiobank

Stores biospecimens, image data, and other data from NIA-funded aging studies and clinical trials Free Anyone No AgingResearchBiobank@imsweb.com

Data Archive for the BRAIN Initiative

Available to NIH BRAIN Initiative researchers to store neurophysiological data Free Must apply to contribute data Controlled access options available Yes dabi-support@loni.usc.edu

Global Alzheimer’s Association Interactive Network (GAAIN)

Contains clinical, genetic, imaging, and other data related to Alzheimer’s disease. Funded by the Alzheimer’s Association

Free for data partners to contribute data

**Mizzou is not a data partner
Principal investigators control access to data No info@gaain.org

LONI Image and Data Archive (IDA)

Stores clinical and neuroimaging data such as MRI, MRA, PET, and DTI scans

Costs vary depending on data

Controlled access options available No ida@loni.usc.edu

National Archive of Computerized Data on Aging

Situated under ICPSR, this is the largest depository for electronic age-related data in the US No cost options available Controlled access options available Yes ICPSR-help@umich.edu 

NeuroImaging Tools and Resources Collaboratory

Provides access to neuroimaging data such as MRI, EEG, MEG, CT, PET scans, and other data pertaining to neuroscience

Free

**This repository may charge a fee in the future
Controlled access options available Yes moderator@nitrc.org

NIA Genetics and Alzheimer’s Disease Data Storage Site (NIAGADS)

Houses genetic data on Alzheimer’s disease. All genetic data from NIA-funded studies are expected to be deposited in this repository or another NIA-approved option Free for datasets under 50 TB NIAGADS controls access to datasets. All researchers must apply for access Yes help@niagads.org

OpenNeuro

Contains neuroimaging data such as MRI, PET, MEG, EEG, and iEEG Free Anyone can submit neuroimaging data with an OpenNeuro account. You must be able to publicly share the data, and the dataset must be accessible under a Creative Commons CC0 License. Additionally, the data cannot be subject to GDPR protections. Anyone Yes Users need an OpenNeuro account to contact support.

Primate Aging Database

Stores non-human primate data on body composition, blood chemistry, and other biological markers of primates that are in captivity and the wild Free Must submit an access request to get detailed data and perform queries No support@primatedatabase.org

 

Cancer Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

Cancer Imaging Archive

Makes medical image datasets available, which are organized by disease, image modality, or research type Free Anyone Yes help@cancerimagingarchive.net

Cancer Research Data Commons

Stores cancer research data from NCI-funded programs Free Has open and controlled access options. Access to restricted datasets is managed through dbGaP

No

**Repository has plans to generate DOIs in the future

NCICRDC@mail.nih.gov

Genomic Data Commons

Houses clinical, biospecimen, and molecular data from cancer programs to support precision medicine Free PIs must apply through dbGaP to submit data Anyone can access the open datasets. Some datasets have controlled access that requires an application through dbGaP No support@nci-gdc.datacommons.io

Imaging Data Commons

Stores cancer imaging data, including clinical, preclinical, radiological, digital pathology, and multispectral microscopy images, as well as image annotations, parametric maps, measurements from images, and assessments of image findings Free Only accepts data from NCI-funded projects Anyone No support@canceridc.dev

Proteomic Data Commons

Stores mass spectra and processed data from cancer proteomic experiments

Free Anyone No PDCHelpDesk@mail.nih.gov

 

Clinical Study Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

BioLINCC

Houses clinical study data from NIH-funded research Free to deposit and access data Other investigators can access the data by going through an application process No Web form

Vivli

Stores clinical research data from clinical trials

Free to deposit data if your institution is a member and free to access data

**Mizzou is not a member

Researchers must submit a request to access data Yes Web form

Diabetes Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

NIDDK Central Repository

Houses studies, clinical trials, research, and specimens supported by NIDDK on diabetes, digestive diseases, and kidney diseases Free Researchers must submit a request for each dataset they need to access

Yes

**The DOI is generated for the study/trial and not the individual dataset
NIDDK-CRsupport@niddk.nih.gov

Accelerating Medicines Partnership in Common Metabolic Diseases Knowledge Portal

Contains genetic, genomic, and phenotypic data on metabolic diseases Free Anyone No help@kp4cd.org

 

Genomics Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

Gene Expression Omnibus (GEO)

Contains next-generation sequencing, array, and high-throughput functional genomics data that are MIAME- and MINSEQE-compliant Free Submissions must be raw, unfiltered genomics datasets from the research community Researchers can keep their data private until their manuscript is published. After publication, anyone can access and download the dataset No, the datasets are assigned accession numbers geo@ncbi.nlm.nih.gov

 

Heart, Lung, Blood, and Sleep Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

BioData Catalyst

Houses data on heart, lung, blood, and sleep research. Also includes tools, workflows, and applications to analyze data

Free to submit and store datasets

**Charges to store computational results
Controlled access options available, which are managed through dbGaP. All users need an eRA Commons account to access data Yes Web form

 

Immunology Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

ImmPort

Stores immunology data related to vaccine response, immune response, infection response, transplantation, allergies, autoimmune diseases, and other related research Free Investigators must create an account, submit a request to upload data, and register their study with ImmPort before uploading data Controlled access options available. Users must register for an account and sign a data use agreement to download data Yes ImmPort_Helpdesk@immport.org

Mouse & Animal Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

Mouse Genome Informatics

Provides genetic, genomic, and biological data on laboratory mice for the study of human health and diseases Free Anyone No Web form

Rat Genome Database

Houses genetic, genomic, phenotypic, and disease-related data on rats, mice, humans, chinchillas, bonobos, dogs, squirrels, pigs, green monkeys, and naked mole-rats Free Anyone No Web form

Mouse Phenome Database

Stores genetic and phenomics data on laboratory mice to learn about morphological, behavioral, and physiological-disease related characteristics and gather data on mice exposed to different environmental factors, drugs, and treatments Free Data contributors have the option to keep data private

No

**The repository plans on generating DOIs in the future

phenome@jax.org

PeptideAtlas

Compiles selected and multiple reaction monitoring (SRM and MRM) peptide data on humans, mice, yeast, and other organisms from tandem mass spectrometry proteomics experiments Free Most datasets are accessible to anyone No Web form

 

STEM Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

IEEE Dataport

Accepts engineering and STEM data Free to publish a dataset, but $1,950 to make it accessible to others Must pay to access data unless the data contributor pays an upfront fee to make it accessible to anyone Yes Web form

Cambridge Structural Database

Houses data on small molecule organic and metal-organic crystal structures Free

Anyone

**Data contributors can delay sharing data until article publication
Yes hello@ccdc.cam.ac.uk

Pangaea

Accepts georeferenced data Free Anyone Yes Web form

 

Interdisciplinary Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

ICPSR (Inter-university Consortium for Political and Social Research)

Houses social science data spanning multiple disciplines

Free for member institutions

**Mizzou is a member
Most data are accessible once users accept a data use agreement. Some datasets require an application process Yes ICPSR-help@umich.edu

Databrary

Contains video and audio research data for behavioral science, social science, education, neuroscience, developmental science, and computer science disciplines Will implement an institutional subscription service and data deposit fee structure in 2025 Researchers must be authorized by both Databrary and their home institutions to submit data Controlled access options available Yes contact@databrary.org

 

Generalist Data Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

Dryad

Accepts data from all disciplines

Requires an institutional membership to submit data

**Mizzou is not a member
Anyone Yes help@datadryad.org

Zenodo

Accepts data from all disciplines Free Can restrict access on files, but metadata is open to anyone Yes Web form

Harvard Dataverse

Accepts data from all disciplines Free Ability to set access restrictions Yes Click Support in the banner at the top of the homepage to fill out the web form

Mendeley Data

Accepts data from all disciplines Free Ability to set access restrictions Yes Web form

Figshare

Accepts data from all disciplines Free for datasets under 20 GB Ability to set access restrictions Yes support@figshare.com

OSF

A project management tool that includes a data repository Free Ability to set access restrictions Yes Click Contact Us near the bottom of this webpage

 

Institutional Repositories

Repository Name Description Cost Submission Access Generates DOI Contact

MOspace

Accepts data and other digital materials from anyone with a Mizzou affiliation Free Must be affiliated with Mizzou to submit data or other materials Anyone Yes mospace@missouri.edu

 

Additional resources for finding repositories

NIH-Supported Repositories: Allows users to find NIH-supported repositories by access type, subject area, and special features.

Re3data: A registry of research data repositories that enables users to find repositories by subject area and filter by access restrictions, dataset license terms, quality management, and other criteria.

FAIRsharing: A registry of data repositories that users can search using key terms or filter by subject, species, reuse terms, and geographic area.

Data Repositories Directory: Lists subject-specific repositories alphabetically by discipline. It is maintained by Simmons University and covers repositories for STEM, the humanities, and the social sciences.