Support for Multimodal Research Workflows

Staff

Flywheel supports multiple data types. As a matter of fact, we often explain that the core of Flywheel is data-type agnostic. But what does that really mean and why is it important?

A scientist’s generalized workflow consists of collaborating to acquire data, analyze it, and disseminate findings. We are in the age of computational research and collaborative science, of multimodality and diverse data types. Single-purpose or single-modality systems can no longer support this diverse research workflow.

Data, especially imaging data, consists of one or more files coupled with metadata that describes their contents. Metadata can be as simple as a subject ID or a data acquisition timestamp, or as rich as a full description of the pixel data, subject details, and acquisition parameters. To fully support a data type beyond simple file handling, Flywheel provides a three-level framework:

Data Connector to capture the data at the source
Data Classifier to intelligently extract, or derive, metadata
Data Access to visualize and analyze the data

Data Connector: This is software that integrates with an instrument or a third-party system that generates the data. Flywheel categorizes data pulled into the system using a Data Connector as the “original” source of data. The purpose of the Connector is to simplify and automate data capture. Unlike the data-agnostic core, Connectors are built for specific data types and customized to accommodate communication protocols, folder structures, and file naming conventions.
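To make the folder-structure point concrete, here is a minimal sketch of the kind of mapping a Connector might perform. It is not Flywheel's implementation: the `<project>/<subject>/<session>/<file>` layout, the function name `parse_source_path`, and the output keys are all assumptions chosen for illustration.

```python
from pathlib import PurePosixPath


def parse_source_path(path: str) -> dict:
    """Map a hypothetical <project>/<subject>/<session>/<file> layout to labels."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4:
        # A real connector would be customized per source; this sketch
        # accepts exactly one assumed layout and rejects everything else.
        raise ValueError(f"unexpected layout: {path!r}")
    project, subject, session, filename = parts
    return {
        "project": project,
        "subject": subject,
        "session": session,
        "file": filename,
    }


# Example path following the assumed layout.
record = parse_source_path("study01/sub-007/2020-01-15/scan_001.dcm")
```

A different instrument or site would need a different parsing rule, which is why this logic belongs in the Connector rather than in the data-agnostic core.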

Data Classifier: Once the data is captured, what can we learn about it? What can we extract from the data to make it usable in our application, searchable in our search engine, and actionable in our processing engine? The richer the metadata, the richer the user experience. We accomplish this by deploying the Data Classifier as a Flywheel Gear that runs as soon as the data is captured. The Gear extracts the available metadata and updates the database.

Data Access: Data access consists of two parts: visualization and computation. It is important to provide researchers with web-based, platform-independent viewers. This is key to enabling a portable, diverse, and collaborative community. Data access for computational purposes is accomplished through a set of SDKs for commonly used analysis languages, such as MATLAB, Python, and R.

This data type support paradigm applies to image-based data (e.g. DICOM and microscopy), time-series data (e.g. EEG), and self-describing text-based files (e.g. CSV).
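For a self-describing text file such as CSV, the classification step can be as simple as reading the header row. The sketch below uses only Python's standard `csv` module; the function name `describe_csv` and the metadata keys `columns` and `row_count` are hypothetical, not part of any Flywheel Gear.

```python
import csv
import io


def describe_csv(text: str) -> dict:
    """Derive basic metadata from CSV text: column names and data row count."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])          # first row names the columns
    row_count = sum(1 for _ in reader)  # remaining rows are data
    return {"columns": header, "row_count": row_count}


# Made-up sample content, used only to exercise the function.
sample = "subject_id,age,score\nS01,34,0.82\nS02,29,0.91\n"
csv_metadata = describe_csv(sample)
```

Metadata like this is what would make the file searchable by column name without opening it.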

Let’s take a closer look using DICOM as an example. We have a DICOM Connector that pulls data directly from the modality and completely avoids the hassle of handling files on unsecured, portable media. The software communicates with the imaging modality using the DICOM communication protocol. Once the data is stored in the appropriate project, the Classifier automatically runs and extracts the full DICOM header into the database. Commonly used fields, such as subject ID, sex, timestamp, and acquisition type, are then updated in the database. The data can then be viewed using a web-based viewer and accessed through the MATLAB or Python SDKs.
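The two-step update described above (store the full header, then surface commonly used fields) might look roughly like the following. This is a sketch under stated assumptions: the header is taken to be already parsed into a plain dictionary keyed by standard DICOM attribute keywords (`PatientID`, `PatientSex`, `AcquisitionDateTime`), and the output field names and the function `promote_fields` are hypothetical. Acquisition type is left out here because it is derived rather than copied.

```python
def promote_fields(header: dict) -> dict:
    """Keep the full header and copy commonly used fields to the top level."""
    # Assumed mapping from output field name to DICOM attribute keyword.
    mapping = {
        "subject_id": "PatientID",
        "sex": "PatientSex",
        "timestamp": "AcquisitionDateTime",
    }
    record = {"header": dict(header)}  # full header stays available
    for field, keyword in mapping.items():
        value = header.get(keyword)
        if value is not None:          # skip fields the header lacks
            record[field] = value
    return record


# Made-up header values for illustration only.
example_header = {
    "PatientID": "S01",
    "PatientSex": "F",
    "AcquisitionDateTime": "20200115103000",
    "Manufacturer": "ExampleVendor",
}
promoted = promote_fields(example_header)
```

Keeping the full header alongside the promoted fields means less common attributes remain searchable even though they are not surfaced directly.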

In addition to extracting metadata, the MR Classifier determines the data type as anatomical, functional, or localizer using multiple values in the header. This auto-classification simplifies data access and Gear execution.
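A rule-based classifier of this kind can be sketched as below. The rules are illustrative assumptions, not the MR Classifier's actual logic: they combine three standard DICOM attributes (`ImageType`, `SeriesDescription`, `NumberOfTemporalPositions`) with hypothetical keyword lists, and return "unknown" when nothing matches.

```python
def classify_mr(header: dict) -> str:
    """Label a series as localizer, functional, anatomical, or unknown."""
    image_type = [str(v).upper() for v in header.get("ImageType", [])]
    description = str(header.get("SeriesDescription", "")).lower()
    temporal_positions = int(header.get("NumberOfTemporalPositions", 1) or 1)

    # Rule 1 (assumed): localizers are flagged in ImageType or the description.
    if "LOCALIZER" in image_type or "localizer" in description or "scout" in description:
        return "localizer"
    # Rule 2 (assumed): many time points or BOLD/fMRI naming suggests functional.
    if temporal_positions > 1 or "bold" in description or "fmri" in description:
        return "functional"
    # Rule 3 (assumed): common structural sequence names suggest anatomical.
    if any(tag in description for tag in ("t1", "t2", "mprage")):
        return "anatomical"
    return "unknown"


# Made-up headers for illustration only.
labels = [
    classify_mr({"ImageType": ["ORIGINAL", "PRIMARY", "LOCALIZER"]}),
    classify_mr({"SeriesDescription": "rest BOLD", "NumberOfTemporalPositions": 200}),
    classify_mr({"SeriesDescription": "T1w MPRAGE"}),
]
```

Using several header values rather than one makes the label less sensitive to site-specific naming, at the cost of rules that need tuning per scanner and protocol.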

This approach makes Flywheel an extensible framework to accommodate the varying and complex multimodality needs of the research community.