
The Future of Medical Research: HPC, AI & HBM


High-performance computing (HPC), designed for computationally intensive workloads, is helping life sciences and medical researchers get answers faster and more cost-efficiently. When combined with accelerated computing, AI, high-bandwidth memory (HBM) and other advanced memory architectures, HPC is powering faster drug discovery research.

The HPC market is expected to reach $49.9 billion in 2027, up from $36 billion in 2022, according to a recent MarketsandMarkets report. One of the biggest demand drivers is genomics research, which entails analyzing and identifying genetic variants associated with various diseases and responses to treatments. The report finds that HPC systems have improved the speed, accuracy and reliability of genomics research.

Accelerated computing

Accelerated computing uses specialized chips like GPUs for more efficient computations and improvements in speed and performance compared with CPU-only systems.

Accelerated computing pairs a traditional CPU system with an accelerator, such as Nvidia’s GPUs, to speed up workloads. HPC has been embracing it for several benefits, including speed and energy efficiency, said Dion Harris, director of accelerated computing at Nvidia.

Dion Harris (Source: Nvidia)

Nvidia has worked with developers and researchers for more than 15 years to help them leverage the computing power of parallel processors, such as GPUs, to accelerate their applications. GPUs offer “significant speed-ups in the order of multiples of 10×,” Harris said.
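To make that kind of speedup claim concrete, here is a minimal sketch of the sort of comparison behind it, assuming a CUDA-capable GPU and the open-source CuPy library (not any specific Nvidia application): the same dense matrix multiply timed on the CPU and the GPU.

```python
# Minimal sketch: time the same dense matrix multiply on CPU (NumPy)
# and GPU (CuPy). Assumes a CUDA-capable GPU and the cupy package;
# the actual speedup depends heavily on problem size and hardware.
import time
import numpy as np
import cupy as cp

n = 4096
a_cpu = np.random.rand(n, n).astype(np.float32)
b_cpu = np.random.rand(n, n).astype(np.float32)

t0 = time.perf_counter()
np.matmul(a_cpu, b_cpu)
cpu_s = time.perf_counter() - t0

a_gpu = cp.asarray(a_cpu)
b_gpu = cp.asarray(b_cpu)
cp.matmul(a_gpu, b_gpu)            # warm-up run: triggers kernel setup
cp.cuda.Device().synchronize()

t0 = time.perf_counter()
cp.matmul(a_gpu, b_gpu)
cp.cuda.Device().synchronize()     # wait for the asynchronous GPU kernel
gpu_s = time.perf_counter() - t0

print(f"CPU: {cpu_s:.3f} s  GPU: {gpu_s:.3f} s  speedup: {cpu_s / gpu_s:.1f}x")
```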

Calculations also can be done using a lot less energy because these applications and calculations are processed in less time, he said, noting that this is “becoming a huge benefit, especially as data centers are becoming more and more resource-constrained from an energy perspective.”

Another huge benefit, Harris said, is performing more computations at a lower cost. Even though adding GPUs raises the cost of the overall infrastructure, he explained, the gains in speed and the reductions in energy are much more significant than the incremental costs.

“When you look at the overall throughput per cost item or cost unit of the infrastructure, accelerated computing tends to improve the cost and economics of the data-center footprint as well as for a lot of these solutions,” he said.

This explains why Nvidia is seeing a large wave of adoption of accelerated computing in HPC and in medical and drug-discovery use cases.

Nvidia pioneered accelerated computing well over a decade ago when the company began collaborating with researchers to move from CPU-only to GPU-accelerated codes. Work has been done in a variety of applications, including drug discovery, materials science, quantum chemistry, seismic processing, and climate and weather modeling.

“We’ve now created a huge application ecosystem where people who are building supercomputers now see that they can get much more bang for their buck by having an accelerated portion of their system that can run all of these codes that have been optimized for GPUs,” Harris said.

What is important in building out an HPC system is identifying the specific workloads and what specific engine (CPU/GPU) they work best with, according to AMD. Most HPC systems today rely on heterogeneous computing, which uses both CPUs and GPUs, but numerous systems are still CPU-only, the company said.

What is the most important factor when selecting the CPU for your workloads? AMD said it is understanding what your workloads need from a CPU: Do they scale with cores? Are they memory-dependent? Do you need more PCIe lanes?

“When looking at CPUs, most HPC-focused workloads scale with cores and memory, so picking a CPU that has maximum core density and memory bandwidth to the cores will help your workloads perform better and scale efficiently,” AMD said.
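One way to act on that advice is to measure core scaling directly before committing to a CPU. The sketch below is illustrative, not an AMD tool: it times a hypothetical stand-in compute kernel at several worker counts. Near-linear gains suggest a core-bound workload; flat times suggest a memory- or I/O-bound one.

```python
# Sketch: checking whether a (hypothetical) workload scales with cores.
# If wall time barely improves as workers increase, the workload is
# likely memory- or I/O-bound rather than core-bound.
import time
from multiprocessing import Pool

def workload(seed: int) -> float:
    # Stand-in compute kernel; replace with a slice of your real job.
    total = 0.0
    for i in range(1, 2_000_000):
        total += (seed * i) % 7
    return total

if __name__ == "__main__":
    tasks = list(range(32))
    for workers in (1, 2, 4, 8):
        t0 = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(workload, tasks)
        print(f"{workers} workers: {time.perf_counter() - t0:.2f} s")
```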

What about the growing role of HBM in HPC? “There will always be a need for more memory and memory bandwidth in HPC workloads,” AMD said. “With vast amounts of data that needs to be computed, memory is a critical component in an HPC system.”

AMD expects that HBM and variations of it will play a larger role in GPU-specific computing because most of today’s workloads depend on bandwidth with an increasing need for faster and more energy-efficient performance.

AMD plans to target large language model (LLM) training and inference for generative AI workloads with the upcoming AMD Instinct MI300X accelerator supporting up to 192 GB of HBM3 memory.

HPC and AI

AI is being fused into classic HPC simulations and workloads, which is improving speed and opening up new use cases.

Accelerated computing also has taken hold in HPC for numerical simulations, and there is a huge trend toward embracing AI to further accelerate some of these simulation techniques, including in medical and drug-discovery use cases, Harris said.

But Harris said it is about more than doing things faster or more cost-effectively; it’s about doing things that were previously impossible.

AI is not improving things by 10× or 30×; it is improving them by 1,000× or 10,000× in terms of speed and overall turnaround time to get real insights from data, he added.

In some domains and scientific applications, where researchers can leverage AI, they see more transformative breakthroughs in terms of doing things that weren’t possible at all with numerical simulations, Harris said. “Those are the key drivers of why we think researchers are embracing both accelerated computing and AI as another means of transforming their workflows.”

One example cited is drug discovery, where bringing a new drug to market typically takes over a decade and costs billions of dollars. The process entails extensive research to screen how effective a drug can be and to determine its toxicity or adverse effects.

“A lot of this gets done computationally before you get to any clinical trials, so a large part of the drug-discovery process is done with computational biology,” Harris said.

A key process used to determine the compatibility of the target protein and the molecule being developed to treat the condition is called docking, which can be computationally intensive, he added. By using AI, what normally would be a full year of docking and protein analysis can be reduced to a couple of months.
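For intuition on why docking is so expensive, here is a toy sketch (not Nvidia’s pipeline): each candidate ligand is scored across many poses against the target, so the work grows with ligands × poses. The `score_pose` function is a hypothetical stand-in for a physics-based scoring function.

```python
# Toy illustration of docking's cost: every candidate ligand is scored
# in many poses against the target, so work grows as ligands x poses.
# score_pose() is a hypothetical stand-in for a physics-based scorer.
import heapq
import random

def score_pose(ligand_id: int, pose: int) -> float:
    # Placeholder energy model; real scorers evaluate thousands of
    # atom-pair interactions per pose.
    rng = random.Random(ligand_id * 1_000 + pose)
    return sum(rng.uniform(-1.0, 1.0) for _ in range(1_000))

def screen(num_ligands: int = 200, poses: int = 50, top_k: int = 5):
    results = []
    for ligand in range(num_ligands):
        # Keep each ligand's best (lowest-energy) pose.
        best_energy = min(score_pose(ligand, p) for p in range(poses))
        results.append((best_energy, ligand))
    # The lowest energies mark the most promising binders.
    return heapq.nsmallest(top_k, results)

print(screen())
```

Because each ligand is scored independently, the loop is embarrassingly parallel, which is why the workload maps so well onto GPUs and AI surrogates.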

Nvidia saw this take shape in several projects during the recent pandemic. A couple of key techniques were used during that time frame to condense the process of identifying potential solutions, Harris said, and AI-accelerated docking was one of them.

Another part of the process is sequencing, used to understand the structure of the virus. One tool applied to predicting the resulting protein structures is AlphaFold, and a couple of others take a similar approach.

What makes this AlphaFold approach unique, Harris said, is that it uses AI techniques that are typically used for LLMs.

The result was resolving the protein structures thousands of times faster than a cryo-electron–microscopy (cryo-EM) approach, which led to the next phase of the process: identifying the compounds that target that specific protein, he added.

CPUs, GPUs and the cloud

Companies like AMD, Intel and Nvidia are developing chips that specifically target AI and HPC workloads. They also are leveraging the advantages of the cloud.

One example is the Nvidia Grace Hopper Superchip, an accelerated CPU designed for giant-scale AI and HPC. The CPU and GPU are tightly coupled in the same package and connected via Nvidia’s NVLink-C2C interconnect, which allows very high-bandwidth communication between the CPU and GPU.

This lets the GPU access a large memory footprint more efficiently, Harris said. A key use case is graph neural networks, which benefit from these large memory footprints and are used for compound and drug screening.

“With Grace Hopper being able to access the very large memory footprint, it gives it the opportunity to leverage larger models that can then ultimately drive more accurate and more valuable outcomes from an inferencing standpoint,” Harris noted.
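For readers unfamiliar with graph neural networks, a minimal sketch of one message-passing step follows. The toy molecule, feature width and weights are made up for illustration; real compound-screening models stack many such layers over far larger graphs, which is where a large coherent memory footprint pays off.

```python
# Minimal sketch of one message-passing step in a graph neural network,
# the kind used for compound screening: each atom (node) aggregates its
# neighbors' features, then applies a learned transform. Memory use
# grows with graph size and feature width. NumPy only; the 4-atom
# molecule and random weights are purely illustrative.
import numpy as np

adjacency = np.array([        # toy ring molecule: 4 atoms, bonds as edges
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
], dtype=np.float64)

features = np.random.rand(4, 8)   # 8 features per atom
weights = np.random.rand(8, 8)    # learnable layer weights

def message_pass(adj, feats, w):
    # Average neighbor features (degree-normalized), transform, ReLU.
    degree = adj.sum(axis=1, keepdims=True)
    aggregated = (adj @ feats) / np.maximum(degree, 1.0)
    return np.maximum(aggregated @ w, 0.0)

hidden = message_pass(adjacency, features, weights)
print(hidden.shape)  # (4, 8): updated per-atom embeddings
```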

Nvidia’s GH200 Grace Hopper platform, based on a new Grace Hopper Superchip with the industry’s first HBM3e processor, is designed for accelerated computing and generative AI. (Source: Nvidia)

Harris said the superchip will be useful where an application runs work on both the CPU and GPU and that balance shifts across the application. It also gives applications that don’t yet have all of their code ported to the GPU a path to full acceleration using the same toolset and platform.

Grace Hopper also is configured with dynamic power shifting, automatically moving power to the processing unit that requires it most.

The power envelope can remain the same, for example at 650 W, but the chip can shift more of the power to where it is needed, such as 550 W to the GPU and 100 W to the CPU for AI-intensive applications, Harris said. “That’s part of the flexibility of the platform, and it really allows that seamless transition from CPU-heavy apps to GPU-accelerated apps.”

“While we think accelerated computing is the path forward and will service a lot of applications and workloads that we’ve been working with over the years to leverage the GPU, there are some applications that are still CPU-only and so we’ve developed our Grace CPU, which is very performance-based, built on the Arm Neoverse V2 architecture, taking advantage of all the latest and greatest technologies to deliver a very energy-efficient CPU,” Harris said.

One of the latest systems to adopt the Grace CPU is at the University of Bristol, UK, which participates in research across several different domains, including drug discovery. The Isambard 3 supercomputer, built on the NVIDIA Grace CPU Superchip, will be based at the Bristol & Bath Science Park.

The new system will feature 384 Arm-based Grace CPU Superchips to power medical and scientific research. It is expected to deliver 6× the performance and energy efficiency of Isambard 2, making it one of Europe’s most energy-efficient systems, achieving about 2.7 petaflops of FP64 peak performance while consuming less than 270 kilowatts of power.

The new Grace-powered system will continue to work on simulating molecular-level mechanisms to better understand Parkinson’s disease and find new treatments for osteoporosis and COVID-19.

Nvidia’s platforms also can be used in the cloud, which Harris called another definite trend in HPC. Researchers can use the same platform on-premises and in the cloud, still running their CUDA-based applications.

The company also recently announced a set of generative AI cloud services for customizing AI models to accelerate the creation of new proteins and therapeutics, as well as research in the fields of genomics, chemistry, biology and molecular dynamics.

Part of NVIDIA AI Foundations, the new BioNeMo Cloud service for both AI model training and inference is said to accelerate the most time-consuming and costly stages of drug discovery. Researchers can fine-tune generative AI applications on their own proprietary data and run AI model inference directly in a web browser or through new cloud application programming interfaces (APIs) that integrate into existing applications. It includes pre-trained AI models to help researchers build AI pipelines for drug development. 
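As an illustration of how such cloud APIs slot into existing code, here is a deliberately generic sketch. The endpoint URL, payload fields and auth scheme are placeholders, not Nvidia’s documented BioNeMo interface; only the pattern — submit a sequence, receive a prediction — is the point.

```python
# Hypothetical sketch of calling a cloud inference API from existing code.
# The URL, fields, and credential below are placeholders, NOT Nvidia's
# documented BioNeMo interface; they only illustrate the integration pattern.
import requests

API_URL = "https://example.com/v1/protein-structure"   # placeholder endpoint
API_KEY = "YOUR_API_KEY"                               # placeholder credential

def predict_structure(sequence: str) -> dict:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"sequence": sequence},
        timeout=300,
    )
    response.raise_for_status()
    return response.json()

result = predict_structure("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(result)
```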

Drug discovery companies, including Evozyne and Insilico Medicine, have adopted BioNeMo to support data-driven drug design for new therapeutic candidates. 

According to Nvidia, the generative AI models can quickly identify potential drug molecules and even design compounds or protein-based therapeutics from scratch. They can predict the 3D structure of a protein and how well a molecule will dock with a target protein.

AMD’s adaptable computing and AI technology also is powering medical solutions for drug discovery and faster diagnoses, offering performance and energy-efficiency advantages for HPC deployments. Touted as the fastest and most energy-efficient supercomputer, Frontier leverages both AMD CPUs and GPUs, delivering 1.194 exaflops of performance. AMD’s EPYC CPUs, with their high memory bandwidth, and the AMD Instinct GPU accelerators target HPC and AI data centers.

The Frontier supercomputer at Oak Ridge National Laboratory is powered by AMD EPYC processors and AMD Instinct accelerators and supports a range of scientific disciplines. One study example is the Cancer Distributed Learning Environment (CANDLE), which develops predictive simulations that could help identify and streamline trials for promising cancer treatments, reducing years of expensive clinical studies, AMD said.

AMD reports that the Instinct MI250X and EPYC processors hold the top two spots in the latest HPL-MxP mixed-precision benchmark, highlighting the convergence of HPC and AI workloads in the Frontier and LUMI supercomputers, which are used to power new research on cancer as well as climate change. Frontier posted a score of 9.95 exaflops of mixed-precision performance, while LUMI posted 2.2 exaflops in the HPL-MxP benchmark.

AMD’s EPYC processors also are being used to drive drug-formulation breakthroughs with higher efficiency and lower costs. Microsoft recently announced that Molecular Modelling Laboratory (MML) is using Microsoft Azure HPC + AI to deploy Azure virtual machines (VMs) powered by EPYC processors, scaling up its capacity for modeling simulations and driving down delivery time.

“MML is one of the pioneers in doing computational R&D on the cloud,” MML CEO Georgios Antipas said in a video interview for Microsoft. “We apply quantum chemical modeling, AI, and state-of-the-art electron microscopy to the study and development of immune interventions and drug design. Traditional drug design is both lengthy and very costly. Establishing a safety profile for a new drug can typically take a few years at best.”

He also noted that HPC in the cloud is important for the pharmaceutical industry, enabling companies to leverage substantially scaled-up architecture and to reduce development costs during the proof-of-concept stage. “It is really a game changer for our kind of scope.”

The 3rd-generation AMD EPYC processors were selected for their high clock speed and very high CPU density per VM, which made a “tremendous” difference in simulation times for MML, resulting in a “considerable” decrease in computation time.

AMD’s 4th-generation EPYC processor family offers workload-optimized compute to address different business needs. The EPYC “Genoa” processor suits most HPC workloads, while the “Bergamo” processors are best suited for cloud-native applications and the “Genoa-X” processors address technical computing workloads.

The standard EPYC 9004 Series processor, codenamed Genoa, offers high performance and energy efficiency, providing up to 96 cores for greater application throughput and more actionable insights. The EPYC 97x4 processors, codenamed Bergamo, are claimed to be the industry’s first x86 processors purpose-built for cloud-native computing, meeting ongoing demand for efficiency and scalability by leveraging the AMD “Zen 4c” architecture to deliver the needed thread density and scale, AMD said.

Lastly, the EPYC 9004 processors with AMD 3D V-Cache technology, known as Genoa-X, deliver 3× larger L3 cache than standard EPYC 9004 processors to store a significantly larger working dataset. “The 3D V-cache design also relieves some of the pressure on memory bandwidth, helping to speed up the processing time for technical computing workloads,” AMD said.
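The effect AMD describes can be pictured with cache blocking (tiling): operating on tiles that stay cache-resident reuses each loaded value many times per DRAM fetch, and a larger L3 allows larger tiles. The sketch below is a generic illustration of the technique, not code tuned for any specific EPYC part.

```python
# Sketch of cache blocking (tiling), the effect a larger L3 amplifies:
# tiles that fit in cache are reused across many inner products per
# DRAM fetch, easing memory-bandwidth pressure. Pure NumPy; the tile
# size is illustrative, not tuned for any particular CPU.
import numpy as np

def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 128) -> np.ndarray:
    n = a.shape[0]
    c = np.zeros((n, n), dtype=a.dtype)
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            for k in range(0, n, tile):
                # Each tile is reused while it is cache-resident;
                # a bigger cache allows bigger tiles and more reuse.
                c[i:i+tile, j:j+tile] += (
                    a[i:i+tile, k:k+tile] @ b[k:k+tile, j:j+tile]
                )
    return c

a = np.random.rand(512, 512)
b = np.random.rand(512, 512)
assert np.allclose(tiled_matmul(a, b), a @ b)  # same result, blocked access
```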

