NCSA’s SEAS Team Makes Advanced Computing More Efficient and Accessible
High-performance computing (HPC) can often be challenging for researchers to use because it requires expertise in working with large datasets, scaling the software, and selecting the best user interface.
The National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign not only deploys and operates supercomputing systems, but also offers researchers simplified and efficient use of these systems.
The Scientific and Engineering Applications Support (SEAS) team at NCSA helps researchers get the most out of the hardware and software resources at their disposal. The SEAS team works with researchers on tasks such as installing Python packages, deploying AI models, and selecting the best parallel computation engines for their projects.
A novel computational framework described in a recently published PNAS paper has been influential in allowing the SEAS team to simplify and speed up the use of AI models to understand three-dimensional protein structure and predict the conformational diversity of proteins.
The paper is authored by Roland Haas, a senior research programmer in the SEAS group; Eliu Huerta, lead for translational AI at the U.S. Department of Energy’s (DOE) Argonne National Laboratory and CASE senior scientist at the University of Chicago; Hyun Park, an Illinois Ph.D. student in biophysics; and Parth Patel, an NCSA graduate research assistant.
As part of the project, the research team developed APACE, a computational tool designed to enhance the performance of AlphaFold2, an AI program that predicts protein structures. APACE is also designed to improve the accuracy and robustness of AlphaFold2's structure predictions. This technological breakthrough is poised to help biomedical researchers shed light on the fundamental mechanisms of life, develop new materials, and advance biotechnology.
To evaluate the efficiency and performance of APACE, the research team deployed the tool on the Delta supercomputer at NCSA to predict the structures of four exemplar proteins. Using up to 300 ensembles distributed across 300 NVIDIA A100 GPUs, APACE delivered results up to 100 times faster than the AlphaFold2 implementations used for comparison.
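The scaling idea described above, independent ensemble predictions spread across many GPUs, can be illustrated with a minimal Python sketch. This is not APACE's code: the function names `assign_round_robin` and `ideal_speedup` are hypothetical, and the model assumes every ensemble takes the same time and ignores data loading and communication, so it gives only an idealized upper bound rather than the measured speedups reported by the team.

```python
from collections import defaultdict


def assign_round_robin(num_ensembles: int, num_gpus: int) -> dict[int, list[int]]:
    """Map each GPU index to the ensemble indices it would process.

    Hypothetical helper: ensembles are treated as independent tasks and
    handed out to GPUs in round-robin order.
    """
    if num_ensembles <= 0 or num_gpus <= 0:
        raise ValueError("num_ensembles and num_gpus must be positive")
    assignment: dict[int, list[int]] = defaultdict(list)
    for ensemble in range(num_ensembles):
        assignment[ensemble % num_gpus].append(ensemble)
    return dict(assignment)


def ideal_speedup(num_ensembles: int, num_gpus: int) -> float:
    """Idealized speedup over a single GPU.

    Assumes equal cost per ensemble and no I/O or communication overhead,
    so wall-clock time is set by the GPU holding the most ensembles.
    """
    assignment = assign_round_robin(num_ensembles, num_gpus)
    busiest = max(len(tasks) for tasks in assignment.values())
    return num_ensembles / busiest


if __name__ == "__main__":
    for gpus in (1, 100, 300):
        print(gpus, "GPUs ->", ideal_speedup(300, gpus), "x (idealized)")
```

With one ensemble per GPU, the idealized bound equals the number of GPUs. Real runs fall below this bound because of the bottlenecks the sketch leaves out.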
The team later reproduced the work on the Polaris supercomputer at the Argonne Leadership Computing Facility and got similar results. The project’s success highlights the potential for such methods to be used in a variety of scientific disciplines and could even allow researchers to automate and accelerate scientific discovery.
“Foundation AI models have the potential to transform the practice of science if they are findable, accessible, and ready to use by the broader scientific community,” said Huerta. “This project demonstrates how to create and share the required scientific data infrastructure to truly democratize cutting-edge AI and leverage modern computing environments to maximize its science reach.”
Biomedical researchers have long struggled to understand how proteins are formed, a process known as protein folding. Proteins are made of chains of amino acids, which assemble into structured forms to perform specific functions. Understanding protein folding can help explain how biological processes work and how errors in protein folding can lead to diseases.
Until now, the major challenge has been predicting protein folding, which is extremely computationally intensive and involves intricate molecular interactions. Adding to the complexity, proteins can fold into a large number of possible conformations.
Traditional methods for studying protein structure, such as X-ray crystallography and cryo-EM, have been successful in providing static snapshots but have been unable to capture dynamic protein behaviors.
Now with APACE, researchers have access to a powerful tool that optimizes AlphaFold2 to run at scale on HPC platforms to deliver unprecedented performance and efficiency. The technology can study multi-protein complexes, capture results at higher resolution, and deliver results in less time compared to traditional methods.
“APACE allows drug researchers to drastically reduce the time required to screen out potential candidate compounds and thus focus on the most promising substances. This way, more compounds can be tested and the time to develop a new drug, for example, one tailored towards a specific viral strain, can be reduced,” said Haas.
By facilitating access to both data and computational power, APACE accelerates AI model calculations, resulting in significant speed improvements beneficial across scientific disciplines.
According to Huerta, the research team will continue to expand the APACE user base by making it more accessible. The team also plans to focus on overcoming the remaining bottlenecks in the system that limit processing speeds. In addition, the team hopes to use the methods developed to enhance AlphaFold2 on other foundational machine learning models, making them available for researchers worldwide for scientific advancements.