Insights on Research Data Management

Federated learning project connects pharma with university to train AI model


Federated learning holds the potential to dramatically accelerate machine learning in healthcare. This emerging AI training method changes how researchers access data and reduces challenges with compliance and data privacy. 

Traditionally, researchers training AI models have had to acquire and centralize large amounts of data to train and test their algorithms, which means the data has to travel outside the possession of its original owner. Federated learning inverts this model with decentralized training: the algorithm travels instead of the data.

For biomedical researchers, this reduces concerns about deidentification and other compliance issues, because the data itself never leaves its owner. AI researchers, in turn, can train their models on distributed, diverse data instead of being limited to what they can acquire and centralize.

Flywheel’s Federated Learning Project Success

Flywheel recently facilitated a federated learning project between a pharmaceutical company and a university healthcare provider, a case study that helps illustrate the potential of this cutting-edge technology. 

For this federated learning project, two Flywheel sites—one within an academic medical center, another at a pharmaceutical company—ingested a large volume of chest x-ray data. Flywheel’s containerized algorithms (referred to as Gears) automated preprocessing of the x-ray images to prepare them for machine learning—extracting the metadata, validating with QC, adjusting intensity scaling, and resizing. The containerized and version-controlled Flywheel Gears ensured consistency in the data preparation, a critical need for machine learning. 
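To make the preprocessing steps above more concrete, here is a minimal sketch of QC validation, intensity scaling, and resizing in plain Python. It is not the code inside the Flywheel Gears, which this article does not show; the function names (`passes_qc`, `rescale_intensity`, `resize_nearest`, `preprocess`), the specific QC rule, and the nearest-neighbor resizing are all assumptions chosen for illustration.

```python
from typing import List

Image = List[List[float]]  # a grayscale image as rows of pixel values


def passes_qc(image: Image, min_side: int = 2) -> bool:
    """Hypothetical QC rule: image is large enough, rectangular, and not blank."""
    if len(image) < min_side or len(image[0]) < min_side:
        return False
    if any(len(row) != len(image[0]) for row in image):
        return False
    flat = [p for row in image for p in row]
    return max(flat) > min(flat)  # a constant image carries no signal


def rescale_intensity(image: Image) -> Image:
    """Linearly map pixel values to the range [0, 1] (min-max scaling)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo  # non-zero for any image that passed QC
    return [[(p - lo) / span for p in row] for row in image]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize with nearest-neighbor sampling (an assumed, simple method)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image: Image, out_h: int, out_w: int) -> Image:
    """Run the steps in a fixed order so every site prepares data identically."""
    if not passes_qc(image):
        raise ValueError("image failed QC")
    return resize_nearest(rescale_intensity(image), out_h, out_w)
```

Packaging a fixed pipeline like this in a versioned container is what lets two separate sites produce comparable inputs, which is the consistency requirement the article describes.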

With the data consistently prepared, machine learning researchers at the pharmaceutical company could then train their AI model on the chest x-rays. Using NVIDIA FLARE, PyTorch, and the Flywheel Dataset API, researchers defined their training method and distributed their model from a central server to the two sites. Each site received the model, trained it on its own local data, and sent the resulting model weights, rather than the underlying images, back to the pharma researchers.
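The round trip described above (the model goes out, each site trains locally, and results come back) can be sketched in a few lines of plain Python. This is not the NVIDIA FLARE or Flywheel Dataset API, and the article does not say which aggregation method the project used; the sketch assumes federated averaging, where the server averages site results weighted by how many samples each site trained on. The toy linear model and the names `local_train`, `federated_average`, and `run_round` are hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = List[float]          # toy model: [slope, intercept]
Sample = Tuple[float, float]   # (input x, target y)


def local_train(weights: Weights, data: Sequence[Sample],
                lr: float = 0.1, epochs: int = 20) -> Weights:
    """Train on one site's private data with plain gradient descent (MSE loss)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average site weights, weighting each site by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(wts[i] * n for wts, n in updates) / total for i in range(dim)]


def run_round(global_weights: Weights,
              site_datasets: Sequence[Sequence[Sample]]) -> Weights:
    """One round: send the model out, train per site, aggregate what returns."""
    updates = [
        (local_train(list(global_weights), data), len(data))
        for data in site_datasets  # only weights leave a site, never `data`
    ]
    return federated_average(updates)


if __name__ == "__main__":
    # Two sites hold different samples of the same relationship y = 2x + 1.
    academic_site = [(0.0, 1.0), (1.0, 3.0)]
    pharma_site = [(2.0, 5.0), (3.0, 7.0)]
    model = [0.0, 0.0]
    for _ in range(50):
        model = run_round(model, [academic_site, pharma_site])
    print(model)  # close to [2.0, 1.0]
```

In a real deployment, a framework such as NVIDIA FLARE handles the communication, scheduling, and security around this loop, and the local step would be a PyTorch training loop rather than the hand-written gradient descent used here.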

AI developers and data scientists can see more technical details about this process in our video. 


This collaboration is just one example of how organizations can bring their data and algorithms together to advance healthcare innovation. With Flywheel facilitating collaborations between institutions and enabling consistent curation of data, researchers can train their models more efficiently and gain access to a greater volume of data to improve accuracy. And with Flywheel Exchange, researchers will soon be able to more easily discover data and collaborators for their own federated learning projects.