Flywheel and the NIH Strategic Plan for Data Science

In January 2023, new NIH data management and sharing mandates went into effect. As a platform focused on the data management needs of healthcare and biomedical researchers, Flywheel offers a complete, enterprise-scale solution for governance and compliance with these new mandates. In this article, we'll outline the NIH's reasoning for its Data Management and Sharing (DMS) Policy and discuss how Flywheel can support compliance for individual researchers and whole institutions.

DMS Policy Rationale

The Strategic Plan for Data Science recognizes that the growing costs of data management, computation, and related infrastructure could reduce the amount of funding that goes to the scientific aims of research. Data resources are currently siloed, and differing formats make research data difficult to find, share, and use. Too much funding goes toward ineffective data infrastructures rather than the science itself. There is “no general system to transform, or harden, innovative algorithms” for real-world use.

These challenges exist not only at the NIH ecosystem level, but also in most research institutions. There are significant opportunities to improve research and innovation productivity while making data management, analysis, and sharing more efficient.

The Strategic Plan for Data Science outlines a number of themes aimed at improving the data management and sharing ecosystem.

  • Common infrastructure and architecture

  • Leverage commercial tools, technologies, services, and expertise

  • Enhance data sharing, access, and interoperability

  • Ensure the security and confidentiality of patient and participant data

  • Improve the ability to capture, curate, validate, store, and analyze clinical data for biomedical research

  • Data standards, including standardized data vocabularies and ontologies

  • Economies of scale and synergies and prevent unnecessary duplication

Flywheel as a company and a platform is well aligned to support the objectives of the NIH Strategic Plan for Data Science.

Common infrastructure and architecture. Flywheel is an open, extensible infrastructure designed to support a broad range of data management challenges in healthcare and biomedical research. The platform handles multimodal data as well as structured EMR and clinical data, and it provides open, standards-based APIs to enable interoperability with other systems.

Leverage commercial tools, technologies, services, and expertise. Flywheel is a unique, market-leading company supporting open data and open science at over 250 research institutions. We recognize that research data management challenges are growing beyond what individual labs, institutions, or grant-funded projects can sustain in the long run, and we are developing and maintaining a sophisticated, enterprise-scale data management and sharing platform in response. The platform runs on all major public clouds and is compliant with the needs of the NIH STRIDES program. Flywheel works across a broad range of research domains and offers the technology and expertise to integrate with institutional and NIH infrastructures.

Enhance data sharing, access, and interoperability. A founding objective for Flywheel was to support data sharing for reuse and reproducibility. Data sharing is enhanced through a suite of tools for data and metadata curation, quality control, and harmonization. Access is managed through secure projects with role-based access controls, catalogs of published datasets, search and cohort selection, and efficient smart copy of data for reuse. Interoperability is ensured through the use of standards-based technologies, APIs, and formats, as well as support for both general and domain-specific metadata standards.

Ensure the security and confidentiality of patient and participant data. Flywheel operates under SOC 2 security standards to ensure the overall security of the platform, and it protects data through secure projects with role-based access controls. The platform also offers a comprehensive suite of tools for managing data privacy. Multimodal de-identification tools support a broad range of data types, including imaging, pathology, EMR, clinical data, and more. With the support of NIH SBIR funding, the company is developing tools for intelligent PHI/PII screening to further reduce data privacy risks.
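To make the idea of role-based access controls on a project concrete, here is a minimal Python sketch. It is illustrative only and is not Flywheel's API or implementation: the role names, the action names, and the `Project` class are all hypothetical, chosen simply to show how a project can map each member to a role and each role to a set of permitted actions.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical role names, for illustration only.
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"
    ADMIN = "admin"


# Hypothetical mapping of each role to the actions it permits.
PERMISSIONS = {
    Role.READ_ONLY: {"view"},
    Role.READ_WRITE: {"view", "upload"},
    Role.ADMIN: {"view", "upload", "manage_members"},
}


class Project:
    """A project whose members each hold exactly one role."""

    def __init__(self, label):
        self.label = label
        self.members = {}  # user id -> Role

    def grant(self, user, role):
        """Add a member or change an existing member's role."""
        self.members[user] = role

    def is_allowed(self, user, action):
        """Return True only if the user is a member whose role permits the action."""
        role = self.members.get(user)
        return role is not None and action in PERMISSIONS[role]


project = Project("imaging-study")
project.grant("alice", Role.ADMIN)
project.grant("bob", Role.READ_ONLY)
```

In this sketch, a user who is not a member of the project is denied every action by default, which is the property that keeps a project "secure" unless access is explicitly granted.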

Improve the ability to capture, curate, validate, store, and analyze clinical data for biomedical research. These are the fundamental capabilities of Flywheel as a data management platform for healthcare and biomedical research. The platform provides a range of options for capturing both prospective and retrospective data, including web upload, automated connectors, bulk import tools, and scriptable APIs. Curation is supported through metadata indexing, classification and label harmonization, support for domain standards such as BIDS, image annotation, reader studies, visual inspection workflows, and more. Validation is supported through templates for protocol validation, plug-in algorithms for data validation, and custom scripting as needed. Flywheel manages storage of data and metadata with versioning. Analysis is supported through open APIs and SDKs for access by popular data science tools, and through automation of portable, reproducible plug-in algorithms.

Data standards, including standardized data vocabularies and ontologies. Flywheel supports standards wherever possible, including DICOM, FHIR/HL7, OMOP, and more. The platform also supports domain standards through its extensible metadata framework. For example, the Brain Imaging Data Structure (BIDS) standard, commonly used in the neuroimaging community, has been implemented using this framework. The platform can be configured to support adoption of the vocabularies and ontologies required for a given area of study.

Economies of scale and synergies and prevent unnecessary duplication. Flywheel is fundamentally designed to support economies of scale and synergies and to prevent unnecessary duplication. Economies of scale are achieved by providing a common, proven, supported platform, enabling focus on the research at hand rather than on building or replicating complex, redundant infrastructures. Synergies are promoted via collaboration tools, standardization of data and metadata, and data and algorithm sharing. Unnecessary duplication is avoided through sharing by reference rather than copying data. Further, compute-in-place capabilities reduce the need to download and copy data.
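The difference between sharing by reference and copying can be illustrated with a small Python sketch. This is a generic content-addressed storage pattern under stated assumptions, not a description of how Flywheel stores data: the `BlobStore` and `SharedProject` classes and their methods are hypothetical names introduced for this example.

```python
import hashlib


class BlobStore:
    """Stores each distinct payload once, keyed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the existing blob if this content is already stored.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)


class SharedProject:
    """A project holds file names that point at blobs, not the bytes themselves."""

    def __init__(self, label, store):
        self.label = label
        self.store = store
        self.files = {}  # file name -> blob digest

    def add_file(self, name, data: bytes):
        self.files[name] = self.store.put(data)

    def share_to(self, other, name):
        # Sharing copies only the reference (the digest), never the payload.
        other.files[name] = self.files[name]

    def read(self, name) -> bytes:
        return self.store.get(self.files[name])


store = BlobStore()
source = SharedProject("original-study", store)
reuse = SharedProject("secondary-analysis", store)
source.add_file("scan.dcm", b"example image bytes")
source.share_to(reuse, "scan.dcm")
```

After the share, both projects can read the file, yet the store still holds a single copy of the underlying bytes.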

Flywheel provides a complete data management solution that meets the requirements specified by the DMS Policy. Contact us for more details about how we can help you with compliance.