Understanding the NIH Data Management and Sharing Policy
The National Institutes of Health (NIH) Data Management and Sharing (DMS) Policy went into effect on January 25, 2023. The policy aims to promote data stewardship and enable the sharing of scientific data generated through NIH-funded research. If you are an IT professional supporting researchers, you need to understand the policy’s implications and be able to offer practical guidance on managing and sharing data effectively.
At the core of the DMS Policy is the requirement that every grant application or renewal for research that generates scientific data include a Data Management and Sharing Plan (DMSP). This plan outlines how the research data will be managed and shared throughout the funded period. Compliance with the approved DMSP becomes a term and condition of the award, so noncompliance can affect future funding decisions.
Defining Scientific Data
The DMS Policy defines “scientific data” as the recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings. This includes metadata, associated documentation, and any specialized tools needed to access or manipulate the data. It’s important to note that scientific data does not include laboratory notebooks, preliminary analyses, or physical objects, such as laboratory specimens.
Key Elements of a DMSP
The NIH has provided guidance on the recommended elements to be included in a DMSP, which should be concise (two pages or less). These elements are:
- Descriptive Information: Briefly describe the scientific data to be managed and shared, including metadata, associated documentation, and any specialized tools needed to access or manipulate the data.
- Standards and Metadata: Outline the standards, if any, that will be applied to the scientific data and associated metadata, such as data formats, data dictionaries, data identifiers, and definitions.
- Data Preservation, Access, and Associated Timelines: Provide details on when the scientific data will be made available to other users and for how long. Identify any differences in timelines for different subsets of data.
- Data Sharing Considerations: Describe any applicable factors that may limit the extent of data sharing, such as legal, ethical, or technical issues. Communicate any potential limitations on subsequent data use.
- Oversight of Data Management and Sharing: Indicate how compliance with the DMSP will be monitored and managed, the frequency of oversight, and by whom (e.g., title, roles).
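For teams that track plans in a script or internal tool, the elements above can be treated as a simple checklist. The following is a minimal sketch, not an NIH format: the `DmspDraft` class, its field names, and the sample text are all hypothetical and simply mirror the five elements listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class DmspDraft:
    """Hypothetical container with one free-text field per element above."""
    descriptive_information: str = ""
    standards_and_metadata: str = ""
    preservation_access_timelines: str = ""
    sharing_considerations: str = ""
    oversight: str = ""


def missing_elements(draft: DmspDraft) -> list[str]:
    """Return the names of elements that are empty or whitespace-only."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name).strip()]


# Illustrative draft with only two elements written so far.
draft = DmspDraft(
    descriptive_information="De-identified imaging sessions with acquisition metadata.",
    standards_and_metadata="Data dictionary maintained alongside the dataset.",
)

print(missing_elements(draft))
# ['preservation_access_timelines', 'sharing_considerations', 'oversight']
```

A check like this only confirms that each element has some text; it says nothing about whether the content would satisfy a reviewer.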
Navigating Data Storage and Sharing Options
As an IT professional, you play a crucial role in helping researchers navigate the various data storage and sharing options available to comply with the NIH DMS Policy. Let’s explore some key considerations and recommendations:
Institutional Data Storage Solutions
Many academic and research institutions offer centralized data storage solutions that can be leveraged for NIH-funded projects. These solutions typically include:
- Network Drives: Most institutions provide network drives (e.g., H:\ or G:\ drives) for storing productivity data, such as documents and spreadsheets. However, these drives are generally not suitable for storing raw research data.
- Cloud-based Storage: Institutional cloud storage solutions, like Box or OneDrive, can be used for storing and sharing productivity data. These cloud-based platforms often have file size limitations and are not recommended for housing large research datasets.
- Electronic Lab Notebooks (ELNs): Some institutions, like the Medical College of Wisconsin, offer free access to ELN platforms, such as LabArchives. ELNs can be used to store data that would typically be documented in a physical lab notebook, but they should not be considered a primary storage system for raw research data.
Research Computing and Data Storage Resources
To address the needs of researchers generating large scientific datasets, many institutions have dedicated research computing and data storage resources:
- Research Computing Storage: Institutions often provide high-performance storage solutions, such as network-attached storage (NAS) or parallel file systems, specifically designed for active research projects. These storage options typically offer features like data replication, large storage capacities, and support for various operating systems.
- Institutional Data Repositories: Some universities and research centers have established institutional data repositories to facilitate the long-term preservation and sharing of research data. These repositories may offer metadata management, data curation, and access control features to help researchers comply with funder requirements.
External Data Repositories
In addition to institutional resources, the NIH encourages the use of external, publicly accessible data repositories for sharing scientific data. Examples of NIH-recommended repositories include:
- NIH-funded Repositories: The NIH maintains a list of NIH-supported data repositories, such as the Database of Genotypes and Phenotypes (dbGaP) and the Sequence Read Archive (SRA), which are specifically designed to host and share genomic and other types of research data.
- Generalist Repositories: Repositories like Dryad, Figshare, and Zenodo offer a more general-purpose solution for sharing a wide range of scientific data types. These platforms often provide features like persistent identifiers, version control, and support for various file formats.
When selecting a data repository, it’s essential to consider factors such as the type of data, the level of access control required, and the repository’s alignment with the NIH’s data sharing policies.
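One way to make these selection factors concrete is to record them per repository and filter on them. The sketch below is illustrative only: the `Repository` class, the `shortlist` function, and every catalog entry are hypothetical placeholders, not descriptions of real repositories. Actual capabilities should be taken from each repository’s own documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """Hypothetical record of the selection factors for one repository."""
    name: str
    data_types: frozenset[str]
    access_levels: frozenset[str]  # assumed vocabulary: "open", "controlled"


def shortlist(catalog: list[Repository], data_type: str, access_level: str) -> list[str]:
    """Return names of repositories that accept the data type at the access level."""
    return [
        repo.name
        for repo in catalog
        if data_type in repo.data_types and access_level in repo.access_levels
    ]


# Placeholder catalog; names and attributes are invented for illustration.
CATALOG = [
    Repository("example-genomics-repo", frozenset({"genomic"}), frozenset({"controlled"})),
    Repository("example-generalist-repo", frozenset({"genomic", "imaging", "tabular"}), frozenset({"open"})),
    Repository("example-institutional-repo", frozenset({"imaging", "tabular"}), frozenset({"open", "controlled"})),
]

print(shortlist(CATALOG, "imaging", "controlled"))
# ['example-institutional-repo']
```

Alignment with NIH’s data sharing policies is not modeled here; that factor usually needs a human review rather than a lookup.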
Budgeting for Data Management and Sharing
The NIH DMS Policy requires applicants to budget for the costs associated with data management and sharing activities. These costs may include:
- Data Curation and Preparation: Expenses related to organizing, formatting, and documenting the data to ensure it is sharable and understandable.
- Data Storage and Preservation: Fees for using institutional or external data repositories, including any associated maintenance and hosting costs.
- Data Sharing and Access: Costs related to enabling access to the shared data, such as user support, data transfer, and any applicable access fees.
It’s important to work closely with your institution’s research administration and IT support teams to accurately estimate these data management and sharing costs and include them in your grant application budget.
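A rough estimate can be built by pricing each of the three cost categories above separately and summing them. The sketch below shows the arithmetic only; the `DmsCostInputs` fields and all figures are hypothetical assumptions, and real rates should come from your institution and the chosen repository.

```python
from dataclasses import dataclass


@dataclass
class DmsCostInputs:
    """Hypothetical inputs; all rates and quantities are placeholders."""
    curation_hours: float
    curation_hourly_rate: float
    storage_tb: float
    storage_cost_per_tb_year: float
    project_years: int
    sharing_and_access_fees: float = 0.0


def estimate_dms_costs(inputs: DmsCostInputs) -> dict[str, float]:
    """Return a cost per category plus the overall total."""
    curation = inputs.curation_hours * inputs.curation_hourly_rate
    storage = inputs.storage_tb * inputs.storage_cost_per_tb_year * inputs.project_years
    sharing = inputs.sharing_and_access_fees
    return {
        "curation": curation,
        "storage": storage,
        "sharing": sharing,
        "total": curation + storage + sharing,
    }


# Illustrative figures: 80 hours at 50/hour, 5 TB at 100 per TB-year for 4 years.
estimate = estimate_dms_costs(
    DmsCostInputs(
        curation_hours=80,
        curation_hourly_rate=50.0,
        storage_tb=5,
        storage_cost_per_tb_year=100.0,
        project_years=4,
        sharing_and_access_fees=500.0,
    )
)
print(estimate["total"])  # 6500.0
```

Here curation contributes 80 × 50 = 4000, storage 5 × 100 × 4 = 2000, and sharing 500, for a total of 6500. Keeping the categories separate makes it easier to justify each line when the budget is reviewed.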
Ongoing Plan Revisions and Compliance Monitoring
The DMS Plan is not a static document; it may need to be updated or revised over the course of a research project. Changes in the types of data generated, the availability of more appropriate data repositories, or shifts in the sharing timeline may necessitate plan revisions. Investigators should communicate with their NIH Program Officer and Grants Management Specialist to ensure any necessary updates to the DMS Plan are approved.
Furthermore, NIH staff will monitor compliance with the approved DMS Plan during the annual Research Performance Progress Report (RPPR) process. Investigators must be prepared to demonstrate adherence to the plan and address any issues that may arise during the award period.
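To prepare for the annual progress report, it can help to keep the plan’s sharing timeline in a machine-readable form and check it before the report is due. This is a minimal sketch under that assumption; the `SharingMilestone` class, the dataset names, and the dates are hypothetical and are not part of any NIH system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SharingMilestone:
    """Hypothetical record of one dataset's planned and actual sharing dates."""
    dataset: str
    due: date
    shared_on: Optional[date] = None  # None means not yet shared


def overdue(milestones: list[SharingMilestone], as_of: date) -> list[str]:
    """Return datasets whose due date has passed without being shared."""
    return [m.dataset for m in milestones if m.shared_on is None and m.due < as_of]


# Illustrative timeline for a single project.
milestones = [
    SharingMilestone("baseline-survey", date(2024, 6, 30), shared_on=date(2024, 6, 1)),
    SharingMilestone("imaging-batch-1", date(2024, 12, 31)),
    SharingMilestone("imaging-batch-2", date(2025, 12, 31)),
]

print(overdue(milestones, as_of=date(2025, 3, 1)))
# ['imaging-batch-1']
```

Anything the check flags is a prompt either to deposit the data or to discuss a plan revision with NIH staff before the report is submitted.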
Conclusion
The NIH’s new Data Management and Sharing Policy represents a significant shift in the way research data is managed and shared. As an experienced IT professional, you play a vital role in guiding researchers through the complexities of data storage, sharing, and compliance with this policy. By leveraging institutional resources, external data repositories, and effective budgeting strategies, you can help researchers navigate the DMSP requirements and ensure their NIH-funded projects meet the necessary data management and sharing standards.
Staying up to date with developments in this area and communicating regularly with researchers, research administration, and NIH program staff will be crucial in supporting your institution’s compliance with the DMS Policy. By providing practical insights and technical expertise, you can help researchers maximize the impact of their work and contribute to the broader goals of data sharing and scientific advancement.