Moving to the Cloud? Best Practices Guide and Resources for Cloud-Based Neuroimaging Research

Andrew Worth

The Action Collaborative on Neuroscience Data in the Cloud, part of the National Academies of Sciences, Engineering, and Medicine, has issued guidance for researchers conducting cloud-based research, the “Hitchhiker’s Guide to Using Cloud-Based Resources for Neuroimaging Research.”

Whether you’re a freshly minted assistant professor who wants to use the latest methods or a seasoned P.I. with a history of grant awards and publications who wants to keep up with, and share, best practices for reproducible research, the report is a resource for you. It was developed to help investigators and administrators at different levels of experience understand, access, and successfully use cloud-based tools in neuroscience research.

The guide provides best practices and links to resources for everything you need to consider, including costs, privacy, security, data size/complexity/scope, access to computational resources and expertise, cloud-compliant tools and analysis pipelines, and sharing data. After all, you want your data to be FAIR (Findable, Accessible, Interoperable, and Reusable), right?

Key Takeaways to Get Started

Some of the insights in the report that are particularly valuable include:

The Hidden Costs of Cloud Computing – Some of the costs associated with cloud computing are unexpected. Consider “hidden” costs such as long-running computational jobs, ingress/egress fees, and inefficient compute management.
Storing Data So It Can Be Queried – It is important to structure large and multimodal data so they can be explored. Attaching metadata makes the data accessible programmatically and more intuitive for anyone interacting with it. Ideally, this metadata should be generated automatically by processing pipelines.
Start with Getting Data Organized – Data organization should be built into pipelines from the start instead of saved for a later stage.
- Raw data should be distinguished from derived products, saved with read-only permissions, and shouldn’t be duplicated for multiple researchers. A raw data repository can help support access controls.
- Consistent naming conventions like BIDS can also help make projects widely shareable.
Cutting Down on Data Copies – Being able to explore data without downloading it can reduce replication.
De-Identification and Privacy – A lot of participant information is captured via DICOM files, including birthdates, embedded text and even facial structure. Look for a way to de-identify DICOM tags and other multimodal data.
The Hidden Cost of Curation – Curating and organizing your data to comply with IRB requirements and data standards takes time from team members over the months or years of your study.
Allocating Compute Costs – Making individual labs and teams responsible for their own computing costs helps them learn more about cloud computing and make smart decisions about resource consumption.
Software Pipelines That Scale – Researchers can save time and cost by using existing published pipelines, such as containerized software packages. Docker is a helpful tool for developing containerized analytical programs that can then be scaled in parallel.
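
To make the data-organization takeaways above more concrete, here is a minimal Python sketch, not code from the report. It assumes a BIDS-style layout for an anatomical scan, writes a JSON metadata sidecar next to the raw file, marks both read-only, and then finds files by querying the metadata. The function names, subject and session labels, and metadata values are hypothetical and chosen only for illustration; a real pipeline would generate the metadata from the acquisition itself.

```python
import json
import os
import stat
import tempfile
from pathlib import Path
from typing import List


def bids_anat_path(root: Path, subject: str, session: str, suffix: str = "T1w") -> Path:
    """Build a BIDS-style path for an anatomical image (labels are hypothetical)."""
    stem = f"sub-{subject}_ses-{session}_{suffix}"
    return root / f"sub-{subject}" / f"ses-{session}" / "anat" / f"{stem}.nii.gz"


def store_raw(image_path: Path, payload: bytes, metadata: dict) -> Path:
    """Write a raw file plus a JSON sidecar, then remove write permission from both."""
    image_path.parent.mkdir(parents=True, exist_ok=True)
    image_path.write_bytes(payload)
    sidecar = image_path.with_name(image_path.name.replace(".nii.gz", ".json"))
    sidecar.write_text(json.dumps(metadata, indent=2))
    for path in (image_path, sidecar):
        # Read-only for owner and group; no access for others.
        os.chmod(path, stat.S_IRUSR | stat.S_IRGRP)
    return sidecar


def find_by_metadata(root: Path, key: str, value) -> List[Path]:
    """Return the sidecar files under root whose metadata has key == value."""
    return sorted(
        p for p in root.rglob("*.json")
        if json.loads(p.read_text()).get(key) == value
    )


# Illustrative run in a temporary directory with placeholder bytes, not a real image.
root = Path(tempfile.mkdtemp()) / "raw"
image = bids_anat_path(root, "01", "01")
store_raw(image, b"placeholder", {"Modality": "MR", "MagneticFieldStrength": 3})
matches = find_by_metadata(root, "MagneticFieldStrength", 3)
print(image.relative_to(root))
print([m.name for m in matches])
```

Keeping the metadata in a sidecar next to each file means a query only has to read small JSON files, which fits the guide's point about exploring data without copying it. Access control on a shared cloud bucket would normally be enforced by the storage service rather than by file permissions alone, so the `chmod` call here is only a local stand-in.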

If you are curious how Flywheel can save you time with data organization, privacy, security, analysis pipelines and data sharing, reach out to us at info@flywheel.io. Flywheel is cloud-agnostic and supports on-premises computing.

Andrew Worth, Ph.D., is a Senior Scientific Solutions Engineer at Flywheel and the Founder and CTO of Neuromorphometrics, which builds a model of the living human brain from MRI scans for “ground truth” comparisons.
