Data Repositories

How to select a data repository for your research needs.

Depositing Data

Before uploading your data to a repository, make sure that the files are complete and well-described. Following best practices for research data management will make this step easier.

Clinical research data requires additional handling to ensure that patient privacy and confidentiality are respected, in accordance with the Health Insurance Portability and Accountability Act (HIPAA) and the terms of informed consent. Protected Health Information (PHI) must be removed prior to data sharing. The HIPAA Privacy Rule describes two ways to de-identify data:

  • Expert determination method: a person with expert knowledge analyzes the data and verifies that there is only a very small risk that someone can be re-identified
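  • Safe harbor method: the 18 types of identifiers listed in the Privacy Rule (for example, names, telephone numbers, and Social Security numbers) are removed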
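
As a rough illustration of the mechanical part of the safe harbor method, the sketch below strips identifier fields from tabular records before sharing. The field names and the `drop_identifier_fields` helper are hypothetical, and the field list is an assumption for this example only. It does not cover every identifier category in the Privacy Rule, and removing columns does not by itself make a dataset de-identified.

```python
# Hypothetical field names standing in for direct identifiers. A real dataset
# needs its own review against the identifier categories in the Privacy Rule.
ASSUMED_IDENTIFIER_FIELDS = frozenset(
    {"name", "email", "phone", "mrn", "street_address"}
)


def drop_identifier_fields(records, identifier_fields=ASSUMED_IDENTIFIER_FIELDS):
    """Return copies of the records without the listed identifier fields.

    Field names are compared case-insensitively. The input is not modified.
    """
    return [
        {key: value for key, value in record.items()
         if key.lower() not in identifier_fields}
        for record in records
    ]


if __name__ == "__main__":
    rows = [
        {"name": "Jane Doe", "Email": "jane@example.org",
         "age_group": "40-49", "diagnosis_code": "E11"},
    ]
    print(drop_identifier_fields(rows))
```

Free-text fields, dates, and rare combinations of values can still identify a person, which is one reason to ask for help with de-identification rather than rely on a script like this alone.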

Researchers at NYU Langone Health may contact DataCore for assistance with data de-identification.

Data sharing agreements (also known as 'data transfer agreements' and 'data use agreements') are contracts that bind the party who is sharing data and the recipient of the data. The terms of a data sharing agreement describe the data that is being shared and set guidelines for using it. The Data Sharing and Transfer guide includes a decision tree for determining whether a data sharing agreement is required. NYULH researchers may contact the Sponsored Programs Administration (SPA) to discuss creating or reviewing such agreements.

Another resource that is available to all NYU researchers is the Data Curation Network, which offers feedback to improve dataset discovery, reuse, and interoperability prior to public sharing. Researchers may contact Michelle Yee at the Health Sciences Library for more information on how to get started.

NYU Data Catalog

Sharing data through a data catalog is another way to make your data more discoverable. Unlike data repositories, data catalogs do not store data. Like a traditional card catalog for books, data catalogs contain information about datasets and how to find (and access) them. They can extend the reach of data that is deposited in a repository.
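
To make the distinction concrete, here is a minimal sketch of what a catalog entry might hold: descriptive information and a pointer to where the data lives, but not the data itself. The `CatalogRecord` fields, the `search` helper, and the example URL are all hypothetical and do not reflect the NYU Data Catalog's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """Describes a dataset and where to find it. Holds no data files."""
    title: str
    description: str
    held_at: str      # repository or organization that stores the data
    access_url: str   # where to request or download the data


def search(records, keyword):
    """Return records whose title or description mentions the keyword."""
    needle = keyword.lower()
    return [
        record for record in records
        if needle in record.title.lower() or needle in record.description.lower()
    ]


# Hypothetical entry used only for illustration.
example = CatalogRecord(
    title="Community Health Survey",
    description="Annual survey responses on health behaviors.",
    held_at="Example Repository",
    access_url="https://example.org/datasets/community-health-survey",
)

if __name__ == "__main__":
    for hit in search([example], "survey"):
        print(hit.title, "->", hit.access_url)
```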

The NYU Data Catalog is designed to:

  • Increase the visibility of research data generated by NYU researchers
  • Facilitate collaboration across departments and institutes at NYU
  • Help NYU researchers locate and understand datasets generated at external organizations
  • Support the process of re-using research data