Flywheel Delivers FAIR Principles

The FAIR acronym is a nice way to summarize four important aspirations of modern research practice: scholarly data should be Findable, Accessible, Interoperable, and Reusable. The article describing the FAIR aspirations is excellent, and we recommend reading it. Some limitations of current practice are described here. Our company was founded to advance research and we embrace these principles.

Flywheel, software used by thousands of researchers, embodies tools and technology that deliver on the FAIR principles.

About Flywheel

Flywheel is an integrated suite of software tools that (a) stores data and metadata in a searchable database, (b) includes computational tools to analyze the data, and (c) provides users with both browser-based and command line tools to manage data and perform analyses. Our customers use these tools on a range of hardware platforms: cloud systems, on-premise clusters and servers, and laptops.

Flywheel supports users throughout a project’s life cycle. The software can import data directly from the instrument (like an MR scanner), extract metadata from the instrument files, and store that metadata in the database. Auxiliary data from other sources can also be imported into the database. The user can view, annotate, and analyze the data, keeping track of all the scientific activities. Finally, the data and analyses can be shared widely when it is time to publish the results.

FAIR Data Principles Implemented

Findable

Flywheel makes data ‘Findable’ by search and browsing. The Flywheel search tools address the entire site’s dataset, looking for data with particular features. It is straightforward, for example, to find the diffusion-weighted imaging data for female subjects between the ages of 30 and 45. The user can contact the owners of the data for access, and the data returned by a search can be placed in a virtual project (Collection) for reuse and further analysis.
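
The example query above can be pictured as a filter over metadata records. The following is only a minimal sketch in plain Python, not Flywheel's search API: the Acquisition record, its field names, and the find_acquisitions helper are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Acquisition:
    """Hypothetical, simplified metadata record for one acquisition."""
    subject_id: str
    sex: str          # "F" or "M"
    age_years: int
    measurement: str  # e.g. "diffusion", "anatomical", "functional"


def find_acquisitions(records, measurement, sex, min_age, max_age):
    """Return records matching the measurement type, sex, and inclusive age range."""
    return [
        r for r in records
        if r.measurement == measurement
        and r.sex == sex
        and min_age <= r.age_years <= max_age
    ]


# Made-up records used only to exercise the filter.
records = [
    Acquisition("s01", "F", 34, "diffusion"),
    Acquisition("s02", "M", 40, "diffusion"),
    Acquisition("s03", "F", 52, "diffusion"),
    Acquisition("s04", "F", 45, "anatomical"),
    Acquisition("s05", "F", 45, "diffusion"),
]

matches = find_acquisitions(records, "diffusion", "F", 30, 45)
print([r.subject_id for r in matches])
```

The filter is only as good as the metadata attached to each record, which is why metadata quality matters so much for search.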

Search is most effective when there are high quality metadata associated with the data and analyses. Flywheel creates a deep set of metadata by scanning and classifying the image data. Users can attach specific searchable keywords and add data-specific notes at many levels: the overall project, the session, the specific data file, or the analyses. Users can find data by searching based on these descriptions.

Accessible

Our customers frequently observe that there is a conflict between making data accessible (sharing) while complying with health privacy rules. We live in a world with privacy officers on the one hand and open data advocates on the other.

Flywheel delivers an accessible solution that is respectful of both principles. We implemented a rigorous user-rights management system that is easy to use. Access to the data and analyses is controlled through a simple web-based interface. The system implements the different roles that are needed during a project’s life cycle. At first perhaps only the principal investigator and close collaborators have access; later, additional people (reviewers, other scientists) might be granted access to check the data and analyses. When ready, the anonymized data and full descriptions of the analyses can be made publicly viewable. An effective system that manages a project through these stages is complicated to write, but Flywheel makes the system easy-to-use through its browser interface.
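
One way to picture the staged access described above is a table that maps each project stage to the roles allowed to view it. This is a rough sketch under our own assumptions; the stage names, role names, and can_view function are hypothetical and do not describe Flywheel's actual permission model.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical project stages, following the life cycle described in the text."""
    PRIVATE = 1  # principal investigator and close collaborators only
    REVIEW = 2   # reviewers and other scientists are granted access
    PUBLIC = 3   # anonymized data and analyses are publicly viewable


# Hypothetical mapping from project stage to the roles allowed to view the project.
VIEWERS_BY_STAGE = {
    Stage.PRIVATE: {"pi", "collaborator"},
    Stage.REVIEW: {"pi", "collaborator", "reviewer"},
    Stage.PUBLIC: {"pi", "collaborator", "reviewer", "public"},
}


def can_view(role: str, stage: Stage) -> bool:
    """Return True if the given role may view a project at the given stage."""
    return role in VIEWERS_BY_STAGE[stage]


print(can_view("reviewer", Stage.PRIVATE))
print(can_view("reviewer", Stage.REVIEW))
```

Keeping the rules in a single table means that widening access as a project matures changes the stage, not the code.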

Interoperable

Most scientists have felt the frustration of learning that a dataset is available, but the file format or organization of the data files requires substantial effort to decode and use. The medical imaging community has worked to reduce this burden by defining standardized file and directory organizations. Flywheel is committed to using and promoting these standards.

Our experience teaches us that well intentioned file formats and directory organizations are not enough. Flywheel stores far more information than what one finds in the header of a DICOM or NIfTI file or the BIDS directory structure. Our commitment to interoperability includes reading in files and directories in these standards and even writing Flywheel data into these formats. Beyond this, we are committed to tools that import and export data and metadata between Flywheel and other database systems.

Flywheel is further committed to supporting the interoperability of computational tools. We have opened our infrastructure so that users can analyze data using Flywheel-defined containerized algorithms, their own containers, or their own custom software. The Flywheel standards are clearly defined based on industry-standard formats (e.g., JSON, Docker, Singularity) so that other groups can use them and in this way support computational interoperability.

Reusable

From its inception, Flywheel was designed to make data reusable. Users at a center can share data within their group or across groups, reuse data by combining it from different groups, and create and share computational tools. The user can select data from any project and merge it into a new project. Such reused data is called a Collection in Flywheel. The original data remain securely in place, and the user can analyze the collection as a new virtual project. All the analyses, notes, and metadata of the original data remain attached to the data as they are reused.

Equally important, the computational methods are carefully managed and reusable. Each container for algorithms is accompanied by a precise definition of its control parameters and how they were set at execution time. This combination of container and parameters is called a Flywheel Gear, and the specific Gear that was executed can be reused and shared.
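
The idea of pairing a container with the exact parameters used at execution time can be sketched as a small record that serializes to JSON and can be reloaded to repeat the run. The GearRun class, its field names, and the example image name are hypothetical illustrations, not the Flywheel Gear format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class GearRun:
    """Hypothetical record of one execution: a container plus its parameter settings."""
    image: str                                  # container image name (made up here)
    version: str                                # container version
    config: dict = field(default_factory=dict)  # control parameters as set at run time

    def to_json(self) -> str:
        """Serialize the record so it can be stored or shared."""
        return json.dumps(asdict(self), sort_keys=True)


run = GearRun(image="example/skull-strip", version="1.0.2", config={"threshold": 0.5})

# Reloading the stored record yields the same container and the same parameters.
replayed = GearRun(**json.loads(run.to_json()))
print(replayed == run)
```

Because the parameters travel with the container reference, someone else can rerun the same computation without guessing how it was configured.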

More

The FAIR principles are an important part of the Flywheel system. We have also been able to design in additional functionality that supports these principles.

  • Security and data backup are very important and fundamental. The ability to import older data into the modern technology has been valuable to many of our customers.
  • The visualization tools built into Flywheel help our customers check for accuracy and data quality as soon as the data are part of the system.
  • The programming interface, supported by endpoints accessible in three different scientific programming languages, permits users to test their ideas in a way that gracefully leads to shared data and code.

Tekne 2019 Award Cloud Computing Finalist — Flywheel's Biomedical Imaging Research Platform

MINNEAPOLIS, Sept. 24, 2019: Flywheel.io, a leading biomedical imaging research platform provider and an essential building block for imaging artificial intelligence (AI), has been acknowledged as a finalist for the Minnesota High Tech Association 2019 Tekne Awards in the Cloud Computing category. For the past two decades, the Tekne Awards have recognized organizations that are leading-edge innovators in science and technology in Minnesota.

Flywheel has been named a finalist in the Tekne 2019 Cloud Computing award category for its Biomedical Imaging Research Platform. The Flywheel platform is a comprehensive scientific workflow for imaging data management, including data capture, curation, computation, and collaboration, all integrated to accelerate imaging research within clinical processes, collaborative multi-site studies, and scalable imaging AI initiatives. 

Flywheel is designed to speed and improve imaging research and scientific discovery: from standardizing and structuring massive volumes of historical imaging data in life sciences for AI and machine learning, to reducing time and costs for clinical imaging device scanning (magnetic resonance (MR), positron emission tomography (PET), etc.) and study processes, to testing the functionality of research design, all while ensuring research projects adhere to the scientific reproducibility, regulatory compliance, security, and collaboration mandates needed for National Institutes of Health (NIH) funding requirements.

Flywheel’s ease-of-use and scalability make Flywheel.io one of the bioinformatic industry's most respected imaging research workflow platforms with global customers including Fortune 500 life science organizations, AI specialists, and dozens of top-tier NIH-funded imaging research centers at universities like Stanford and Columbia.

"We are honored to be named as a finalist for the Cloud Computing Tekne 2019 Awards. Flywheel is privileged to help principal investigators, data scientists, and imaging center directors build imaging research processes for the future...so they can do more science and less IT in their pursuit of healthcare discoveries," said Flywheel CEO, Travis Richardson.

“This year’s Tekne Award finalists demonstrate thought leadership and are spearheading technology innovation in Minnesota and around the world,” said Jeff Tollefson, President and CEO of the Minnesota High Tech Association. “We look forward to further recognizing these organizations at the 2019 Tekne Awards as well as highlighting the impressive science and technology community here in Minnesota.” 

 

A full list of Tekne finalists and November 20 gala details are available online at tekneawards.org.  The event emcee is Paul Douglas.

To learn more about the Flywheel - Biomedical Imaging Research Platform, go to https://flywheel.io

ClickToTweet:

@Flywheel_io is honored to be a finalist for the Minnesota High Tech Association @MHTA 2019 Tekne Awards in the Cloud Computing category for its biomedical imaging research platform. Go to https://ctt.ac/peqbs+ to learn more about Flywheel. @Flywheel_IO #MachineLearning #ImagingResearch #Bioinformatics

About Flywheel - Biomedical Imaging Research Platform

The Flywheel - Biomedical Imaging Research Platform is the leading provider of scalable and collaborative imaging research processes in an easy-to-use and scientifically sound platform. Flywheel users are creating the building blocks of AI and machine learning based Precision Medicine in a demanding scientific and regulatory environment with high security, privacy, compliance, reproducibility, scalability, and performance requirements. Founded in 2012, Flywheel serves customers who are using millions of imaging research data in the Flywheel platform to transform the future of healthcare and medicine.

About the Minnesota High Tech Association (MHTA)

The Minnesota High Tech Association is an innovation and technology association united in fueling Minnesota's prosperity. We bring together the people of Minnesota's science and technology ecosystem and lead the way in bringing science and technology issues to leaders at Minnesota's Capitol and Washington, D.C. MHTA is the only membership organization that represents Minnesota's entire technology-based economy. Our members include organizations of every size − involved in virtually every aspect of technology creation, production, application and education in Minnesota.

Flywheel.io: Brad Canham | 612-223-7359 | bradcanham@flywheel.io

Minnesota High Tech Association: Claire Ayling | 952-230-4553 | cayling@mhta.org


Flywheel and Google Partner to Deliver Industry’s First Cloud-Based MRI Research Center at Columbia University

Flywheel launches cloud-scale collaborative research center on Google Cloud Platform at Columbia University’s Zuckerman Institute

MINNEAPOLIS, MN, July 12, 2018: Flywheel, the research informatics company, announces the deployment of its medical imaging research platform at Columbia University on Google Cloud Platform (GCP). The Flywheel platform has been used by the Columbia neuroscience and biomedical imaging community and their internal and external collaborators at Columbia’s Mortimer B. Zuckerman Mind Brain Behavior Institute since June 2017. With rapid user adoption and increased demands for large-scale data management and computation, Columbia has expanded the Flywheel program and migrated to Google Cloud.

Developed at the direction of Dr. John Thomas Vaughan, Director of the Columbia Magnetic Resonance Research Center, Professor of Biomedical Engineering, Radiology, and Applied Physics and Math, and a principal investigator at Columbia’s Zuckerman Institute, the Flywheel platform is science and industry’s first fully integrated cloud-based MRI research center. The launch at the Zuckerman Institute is the first node of a cloud-based Columbia MR Research Center (CMRRC), which will connect the Institute, Columbia University Irving Medical Center, Columbia’s School of Engineering and Applied Sciences, the Nathan Kline Institute for Psychiatric Research, the New York State Psychiatric Institute, and external collaborators in greater New York and around the world.

Columbia’s Zuckerman Institute has contracted Flywheel to facilitate efficient, GCP-based data acquisition, archiving, operations, and connectivity with other Center laboratories and collaborators. Flywheel provides a complete data management solution to meet the challenges of large, multi-modal imaging data sets as required by multi-center studies and machine learning based research. Flywheel on the Google Cloud Platform will meet the CMRRC data management requirements for unlimited, on-demand computational power and data storage.

"Flywheel, together with the GCP, are the nervous system of our Center," said Dr. Vaughan, "providing the innervation for the many coordinated functions of a modern, distributed laboratory potentially spanning the world. Leveraging the power of the cloud is obvious and central to the future of scientific investigation. We are fortunate to have Flywheel and GCP working with us at the Zuckerman Institute, connecting our Center across Columbia’s campuses as well as extend it to our collaborators around the world."

"We are excited to have Flywheel at the heart of Columbia’s MRI research cloud initiative," said Travis Richardson, Flywheel President. "With our partners at Google, we have created a truly innovative, highly scalable, and secure solution allowing leading research institutions to keep pace with the demands of modern imaging research including multi-modality studies, multi-center collaboration, and machine learning."

About Flywheel

Flywheel is a leading medical imaging informatics platform for researchers that's transforming the way research is conducted in academia, clinics, and the pharma industry. By providing tools to automate scientific data management, scale analytic computing, and securely share data and algorithms, we're on a mission to unleash the medical imaging community's creative energy by designing a more powerful way of doing research. Flywheel is headquartered in Minneapolis, MN, and has offices in the Bay Area, Boston, and Budapest. For more information on our mission and products, visit https://flywheel.io.

Contact for information:
Can Akgun, PhD
Director of Business Development
email: canakgun@flywheel.io


Flywheel's Comprehensive Support for BIDS

The Brain Imaging Data Structure (BIDS) is becoming a widely used standard in neuroimaging research collaboration. BIDS is a guide for how to organize neuroimaging and behavioral data and associated analytic applications.  Until now, there has been no standard, making research collaboration difficult.  The BIDS initiative, led by Dr. Russell Poldrack and Dr. Chris Gorgolewski at the Stanford Center for Reproducible Neuroscience, addresses this need and opens the door for better collaboration.

Flywheel, like BIDS, is about enabling researchers to collaborate with one another.  Flywheel is a comprehensive software platform for computational research and collaboration, while BIDS is a standard structure for sharing data and applications in neuroimaging.  The two are highly complementary.  With the release of Emerald 3, Flywheel has introduced comprehensive support for BIDS including:

  • BIDS Conversion and Curation
  • BIDS Import
  • BIDS View
  • BIDS App Gears
  • BIDS Export

Let’s take a closer look at how Flywheel and BIDS differ when it comes to organizing data.  Flywheel’s default method for organizing files is to group them as individual time sequenced acquisitions, representing each unique measurement or scan as they come off the instrument.  Flywheel adds value by capturing the metadata and classifying each file to further identify its specific type (e.g., anatomical, functional). In contrast, BIDS groups files into folders by their class.  For example, the anat folder contains anatomical data, while the func folder holds functional data.  BIDS also specifies a file naming convention that incorporates multiple attributes of the scan into the filename, including the subject, session, type of scan, etc (Figure 1).  

Flywheel can accommodate alternate views and organization of data, such as BIDS, using its extensible metadata capability.  To implement BIDS support, the Flywheel system adds BIDS-specific metadata such as BIDS folder and BIDS filename to the appropriate objects within the Flywheel hierarchy.
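
To make the BIDS folder and filename attributes concrete, here is a minimal sketch that builds a BIDS-style relative path from a few pieces of metadata. It covers only the subject, session, and task entities and the common folder and suffix names; the bids_path helper is our own illustration and is not part of Flywheel or of any BIDS tool.

```python
def bids_path(subject, session, folder, suffix, task=None, extension=".nii.gz"):
    """Build a BIDS-style relative path from a few metadata values.

    Only the subject, session, and task entities are handled; the full BIDS
    standard defines more entities and rules than this sketch covers.
    """
    entities = [f"sub-{subject}", f"ses-{session}"]
    if task is not None:
        entities.append(f"task-{task}")
    filename = "_".join(entities + [suffix]) + extension
    return f"sub-{subject}/ses-{session}/{folder}/{filename}"


# An anatomical scan goes in the anat folder, a functional scan in func.
print(bids_path("01", "01", "anat", "T1w"))
print(bids_path("01", "01", "func", "bold", task="rest"))
```

Storing the folder and filename as metadata, rather than physically moving files, is what lets the same data be presented in either organization.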

One key challenge many researchers face is converting original DICOM scans into a BIDS dataset. Flywheel’s new BIDS Curator Gear automates this process.  The BIDS Curator Gear uses a configurable template and Flywheel metadata to create and populate the necessary attributes for BIDS.  The BIDS Curator also leverages Flywheel’s ability to automate DICOM to NIfTI conversion to create the NIfTI files and metadata sidecars required by BIDS.  If some of the required attributes are missing in the existing metadata, that’s where the info editor comes in.  The info editor is a metadata editor for any custom metadata within Flywheel, including BIDS attributes.  With the info editor you can view attributes created by the BIDS Curator and edit any missing or incorrect attributes (Figure 2).

If you are a Flywheel Lab Edition user, you can easily import existing BIDS datasets into Flywheel using BIDS Import.  The BIDS Import tool uploads your existing BIDS dataset into Flywheel and retains all BIDS attributes as metadata.  This will allow you to reliably reproduce the dataset for subsequent export or use in BIDS App Gears.

With the BIDS metadata available in a Flywheel session, you can view your data in BIDS format using the BIDS View.  You can conveniently toggle back and forth between the standard Flywheel view and a BIDS view of your data.  The BIDS View will visually flag any invalid items to simplify finding items that need review and correction (Figure 3).

Flywheel’s new BIDS capabilities also include support for BIDS Apps.  BIDS Apps are enabled as standard Flywheel Gears with the ability to prepare the input dataset in BIDS format.  Your BIDS App Gears can incorporate existing BIDS Apps or new custom applications in BIDS App format.  The Flywheel Gear Exchange currently includes a few popular BIDS App Gears, such as MRIQC and FMRIPREP.  

Finally, Flywheel allows you to download your entire project in BIDS format.  BIDS Export retains all of your BIDS attributes and structure, whether it was created using BIDS Import or BIDS Curator.  BIDS Export also performs a full BIDS validation to ensure compliance with the BIDS standard.  

The new BIDS capabilities make Flywheel a great platform for those who want to incorporate the BIDS standards naturally into their research workflow, while saving a significant amount of time.  


Support for Multimodal Research Workflows

Flywheel supports multiple data types.  As a matter of fact, I often explain that the core of Flywheel is data-type agnostic. But what does that really mean and why is it important?  

The generalized scientist’s workflow consists of collaborating to acquire data, analyze data, and disseminate findings. We are in the age of computational research and collaborative science; in the age of multimodality and diverse data types. Single-purpose or single-modality systems are no longer adequate to support this diverse research workflow.

Data, especially imaging data, consists of file(s), coupled with metadata that describe the contents of said files.  Metadata could be as simple as subject ID or data acquisition timestamp, or as rich as a full description of the pixel data, subject details, and acquisition parameters. To fully support a data type beyond simple file handling, Flywheel provides a three-level framework:

  1.    Data Connector to capture the data at the source
  2.    Data Classifier to intelligently extract, or derive, metadata
  3.    Data Access to visualize and analyze the data

Data Connector: This is software that integrates with an instrument or a 3rd-party system that generates the data.  Flywheel categorizes data pulled into the system using a Data Connector as the “original” source of data. The purpose of the Connector is to simplify and automate data capture. Unlike the data-agnostic core, Connectors are built for specific data types with customization to accommodate communication protocols, folder structures, and file naming conventions.

Data Classifier: Once the data is captured, what can we learn about the data? What can we extract from the data to make it usable in our application, searchable in our search engine, and actionable in our processing engine? The richer the metadata, the richer the user experience will be. We accomplish this by deploying the Data Classifier as a Flywheel Gear that runs as soon as the data is captured. The Gear extracts the available metadata and updates the database.

Data Access:  Data access consists of two parts: visualization and computation. It is important to provide researchers with web-based and platform independent viewers. This is key to enabling a portable, diverse and collaborative community. Data access for computational purposes is accomplished through a set of  SDKs for commonly used analysis languages, such as MATLAB, Python, and R.

This datatype support paradigm is applicable for image-based data (e.g. DICOM and Microscopy), time series based data (e.g. EEG), as well as self-describing text-based files (e.g. CSV).
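
The capture-then-classify flow can be sketched with a self-describing CSV file as the data type. This is a toy illustration under our own assumptions: the Store class, its on_capture hook list, and csv_classifier are hypothetical stand-ins for the Connector, Classifier, and database, not Flywheel components.

```python
class Store:
    """Hypothetical in-memory stand-in for the database."""

    def __init__(self):
        self.files = {}       # file name -> raw bytes
        self.metadata = {}    # file name -> extracted metadata
        self.on_capture = []  # classifier hooks run after each capture

    def capture(self, name, content):
        """Connector step: store the file, then run every registered classifier."""
        self.files[name] = content
        for classify in self.on_capture:
            self.metadata.setdefault(name, {}).update(classify(name, content))


def csv_classifier(name, content):
    """Classifier step: for CSV files, record the column names from the header row."""
    if not name.endswith(".csv"):
        return {}
    lines = content.decode("utf-8").splitlines()
    if not lines:
        return {}
    return {"type": "tabular", "columns": lines[0].split(",")}


store = Store()
store.on_capture.append(csv_classifier)
store.capture("scores.csv", b"subject,score\ns01,12\n")

# Access step: the extracted metadata is now available for search or analysis.
print(store.metadata["scores.csv"])
```

Supporting a new data type in this picture means registering another classifier, while the storage core stays unchanged.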

Let’s take a closer look using DICOM as an example. We have a DICOM Connector that pulls data directly from the modality and completely avoids the hassle of handling files on unsecured, portable media. The software communicates with the imaging modality using the DICOM communication protocol. Once the data is stored in the appropriate project, the Classifier automatically runs and extracts the full DICOM header into the database. Then, commonly used fields such as subject ID, sex, and timestamp, as well as acquisition type are updated in the database. The data can then be viewed using a web-based viewer and accessed through the MATLAB or Python SDKs.

In addition to extracting metadata, the MR Classifier determines the data type as anatomical, functional, or localizer using multiple values in the header. This auto-classification simplifies data access and Gear execution.  
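
A rule-based classifier of this kind might look roughly like the sketch below. SeriesDescription and NumberOfTemporalPositions are standard DICOM attribute names, but the rules themselves are invented for illustration and are not the MR Classifier's actual logic.

```python
def classify_mr(header):
    """Guess the scan class from a few header values (illustrative rules only)."""
    description = header.get("SeriesDescription", "").lower()
    n_volumes = header.get("NumberOfTemporalPositions", 1)

    # Localizers are checked first because they are usually named explicitly.
    if "localizer" in description or "scout" in description:
        return "localizer"
    # Many time points, or a BOLD/fMRI label, suggest a functional scan.
    if "bold" in description or "fmri" in description or n_volumes > 1:
        return "functional"
    if "t1" in description or "t2" in description or "mprage" in description:
        return "anatomical"
    return "unknown"


print(classify_mr({"SeriesDescription": "T1w MPRAGE"}))
print(classify_mr({"SeriesDescription": "rest_BOLD", "NumberOfTemporalPositions": 200}))
```

Combining several header values, rather than relying on one field, makes the guess more robust to site-specific naming habits.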

This approach makes Flywheel an extensible framework to accommodate the varying and complex multimodality needs of the research community.  


Flywheel — Next Generation Research Collaboration

Millions of families are struggling with many unresolved diseases and large, complicated healthcare challenges. In neuroscience alone, billions of dollars are spent each year on Alzheimer’s, traumatic brain injury, autism, and other behavioral and neurodegenerative diseases.

The good news is that every day advances in technology and analytics are enabling scientists around the world to make amazing advances. The bad news is that the increasing complexity of new technologies is making it very difficult to share and build on each other’s work.

Flywheel offers a cloud-scale collaborative science platform to tackle these issues across both commercial and academic research. We were founded through a collaboration with leading universities and are solving problems every researcher faces on a regular basis. The end goal is to accelerate discovery through collaboration and reproducible research.

Let’s face it: computational science is becoming complicated. The size of the data and complexity of the data is increasing.  The tools and methods are also getting more complex. There are pressures from regulatory agencies to protect data, and pressures from funding agencies to share data, and pressures from publishing companies to promote open science. We are solving these next-generation problems.

Scientists around the world take for granted the ability to share personal data. Yet sharing scientific data and methods in a scaled and secure way is not as easy, and often not even possible.

We believe we can address these issues with a cloud-scale research platform that fosters collaboration and reproducible results.

A well-known study published in Nature states that “More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments”.  As science works through corroboration, this is a big problem and solving it will speed-up discoveries and unlock the potential of disruptive innovations.

So how do we do this?  We provide an open, extensible platform for organizing and managing diverse medical data. We are able to capture and organize large amounts of data from virtually any source to make it easy to find and use.  We leverage the power and scale of cloud infrastructure to dramatically speed up analysis. We enable collaboration and sharing in a secured and controlled way across labs, institutions, or across the globe. Basically, instead of moving huge volumes of data and replicating complex infrastructure, we bring the scientist to the data.

Flywheel targets researchers in both academic and commercial pharma/biotech. Reproducibility is critical to pharmaceutical and biotech companies as they validate any compound, biomarker, or diagnostic being considered for commercialization. Fundamentally it is about translational medicine and accelerating time to market.  We have customers today in the pharma space that are using Flywheel to manage distributed multi-site trials with academic collaborators.

Although our initial focus is neuroimaging, we are designed for multi-modality research.  Our team has expertise in medical imaging, specifically neuroimaging and MRI.  Research is increasingly multi-modality and multi-disciplinary, which provides a rich trajectory of adjacent markets, within imaging and also non-imaging based data.

The product is well received and our install base is growing.  We have 20 sites installed with leading institutions across the globe. We are particularly excited about our expanding work at Stanford University and Columbia University where we are moving towards institution-wide deployments. At Columbia alone, we have potential to connect up to 25 MRI systems and 100s of labs collaborating using Flywheel. We are excited about the viral nature of the product, which has led us to new opportunities at University of Pennsylvania, New York University, and others.

Stay tuned for more updates as we continue to develop Flywheel!


CMRR High Field Workshop 2017

Flywheel is happy to be a silver sponsor of the University of Minnesota's 2017 Workshop on High and Ultra-high Field Imaging and Training Workshop. This biennial event is organized and hosted by the Center for Magnetic Resonance Research (CMRR).

Under Dr. Kamil Ugurbil’s leadership, this scientific gathering brings together top researchers from universities and academic institutions around the world to disseminate and discuss the technical issues and applications of Magnetic Resonance Imaging (MRI) and Magnetic Resonance Spectroscopy (MRS) conducted with high magnetic fields (≥ 3 T). Presentations from experts in the major areas of high field MR research will cover fundamental principles, methodology, and biomedical applications in the brain as well as the other organ systems in the body. The auditorium is standing-room only, and the sessions are simulcast into spillover rooms in the facility.

The workshop also includes poster sessions and training courses covering topics such as Imaging Methods for the Connectome Projects, High-Field Parallel Transmission and Engineering, and MR Spectroscopy. Click here for a detailed program.

Alongside other sponsors such as Siemens Healthineers, GE Healthcare, Bruker, and skope, we are honored to provide researchers with the tools necessary to move research into the computational age.


What's New in Neuroscience Computing?

We are excited to showcase the latest in neuroimaging analytics during the upcoming Organization for Human Brain Mapping (OHBM) Annual meeting. Vijay Iyer from MathWorks and Michael Perry from Stanford’s CNI will present the latest features from MATLAB and Flywheel, as well as a technical demonstration of the integration between both products.

At Flywheel we are building a collaboration platform for data and compute management. This unique combination of historically differing products into a single research platform will streamline neuroimaging methods development and deployment.  We accomplish this through an easy to use web application combined with the power and flexibility of a suite of Software Development Kits (SDKs) aimed at empowering computational scientists.

Flywheel Components and MATLAB SDK. For more details, visit the Flywheel Platform page.

MATLAB, a popular technical computing language used by many neuroscientists, is building a growing list of neuro-specific communities and tools to support a wide range of research such as brain mapping, behavior psychophysics, and brain-computer interfaces. Join us at OHBM to learn more about new MATLAB features and how Flywheel can help empower you to new discoveries.

After the joint event with MathWorks, I invite you to visit the Flywheel booth for a personalized conversation and product demonstration.  We are making two major feature releases during OHBM.  First is data access and manipulation through SDKs.  Along with the MATLAB SDK, we are releasing the first versions of Python SDK and R SDK.  These SDK implementations are core to our strategy of enabling data analysis and Gear development.  Second, we will be showcasing our Data Explorer feature. This is a powerful, data-centric, facet-driven search engine.

As always, I am very interested in your feedback. So feel free to drop me a line with your thoughts and comments, and register to attend the workshop.

 


DockerCon 2017 Highlights

Flywheel leverages Docker heavily for software distribution and for algorithm sharing and execution. I had the pleasure of attending DockerCon ’17, and there were two presentations I’d like to highlight. One validates that containers are an accepted method to achieve shared data processing goals. The other relates to progress on making the Docker image distribution story more consistent in China.

Cool Genes: The Search for a Cure Using Genomics, Big Data and Docker

James Lowey, CIO at Translational Genomics Research Institute (TGEN), presented the system they designed to deal effectively with the genetic data they process in order to provide more effective treatments for patients.

Data Management

One of James’ starting slides reminds me of one we use: a ceiling-high pyramid of storage media that looks like it is about to topple. Everyone agrees there is value in these troves of unmanaged data. In many cases, the cost of using it is too high due to:

  • Low confidence of finding the data of interest, and that the contents match our memory.
  • Loss of institutional knowledge of what data is available, or how to access it.
  • Effort to retrieve a small bit of data across the whole set for broad analysis.
  • Changing standards over time for file formats, organization, compression.

Once you have Data Management, you are able to leverage the Docker ecosystem for the benefit of healthcare and research. Specifically, TGEN has developed a number of data processing pipelines and has constructed a system to execute them.

Automation

The existing ecosystem of Docker orchestration and cluster solutions and patterns means TGEN and other institutions can invest less in software engineering and more in new ways to analyze the genetic data to improve patient outcomes.

Docker Images provide a platform to ensure execution environments match development/test environments. In the case of TGEN, it is easy to imagine how this creates confidence that the treatment prescribed will not be compromised by such differences. This is one of the core reasons Flywheel has chosen containerization technology from the very start in the pursuit of Reproducible Research.

Collaboration

How do you bootstrap a collaboration network for data scientists to share not just ideas, but data conversion and analysis building blocks? Similar to the automation story, the Docker platform handles many of the packaging/distribution/execution concerns. Now the primary concern becomes establishing a standard way to represent inputs, outputs, execution semantics, and domain-specific variables. Once that is in place, others can contribute new tools that can easily be executed by your data execution engine.

Flywheel has an open specification fitting this mold (Flywheel Gears https://github.com/flywheel-io/gears) and manages the Flywheel Exchange https://github.com/flywheel-io/exchange where contributors can publish their gears for use across Flywheel environments.
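
To illustrate what a standard way to represent inputs, outputs, and configuration buys you, the sketch below checks a requested run against a small manifest before anything is executed. The manifest layout and the validate_invocation function are deliberately simplified and hypothetical; they are not the Flywheel Gear specification, which is defined in the repository linked above.

```python
def validate_invocation(manifest, inputs, config):
    """Check a requested run against a tool manifest; return a list of problems."""
    errors = []
    for name in manifest["inputs"]:
        if name not in inputs:
            errors.append(f"missing input: {name}")
    for key, spec in manifest["config"].items():
        # Fall back to the declared default when the caller gives no value.
        value = config.get(key, spec.get("default"))
        if value is None:
            errors.append(f"missing config: {key}")
        elif not isinstance(value, spec["type"]):
            errors.append(f"wrong type for config: {key}")
    return errors


# Hypothetical manifest for an imaginary tool.
manifest = {
    "name": "example-tool",
    "inputs": ["image"],
    "outputs": ["mask"],
    "config": {
        "threshold": {"type": float, "default": 0.5},
        "label": {"type": str},
    },
}

print(validate_invocation(manifest, {"image": "scan.nii.gz"}, {"label": "run1"}))
print(validate_invocation(manifest, {}, {"threshold": "high"}))
```

With a shared description like this, an execution engine can run a contributed tool it has never seen before without tool-specific glue code.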

Docker in China

Docker Hub is still coming to China! I had been concerned with the silence on this front since the initial partnership with Alibaba Cloud was announced last October. As a stakeholder in Flywheel’s software distribution strategy, I am excited at the prospect of unifying our process to lower complexity and risk to achieve higher customer satisfaction.

The project for offering Docker Hub in China is nearing completion, with expected availability this summer. The free service will be limited to public Docker Hub repositories and replicated from the existing Docker Hub. The details were missing for 1) how separate this China Docker Hub would be, and 2) whether there would be additional hurdles for use by Docker Image authors/publishers, or consumers.

JFrog reps said they would be offering a private Docker registry service within China that will not require a Mainland China business entity. I’m taking that lowered bar to entry with some skepticism. If JFrog can pull that off, and make it easy to use, it will be something I recommend to colleagues.