Normal view
-
Quantum Zeitgeist
- Classiq, NVIDIA, and Tel Aviv Medical Center Launch Quantum Computing Initiative for HealthcareQuantum software company Classiq, in collaboration with NVIDIA and the Tel Aviv Sourasky Medical Center, has launched the Quantum Computing for Life Sciences & Healthcare Center. The initiative aims to develop quantum algorithms and applications to revolutionise life sciences and healthcare, including drug discovery, molecular analysis, and personalised medical treatments. The centre will also address challenges in supply chain and treatment coordination. Classiq CEO Nir Minerbi believes the
-
Quantum Zeitgeist
- Global Investment Summit: UK Showcases Pioneering Innovations to World’s Top CEOs and InvestorsThe UK's Global Investment Summit will host over 200 CEOs, including Stephen Schwarzman of Blackstone, David Solomon of Goldman Sachs, Amanda Blanc of Aviva, Ignacio Galán of Iberdrola, and Jamie Dimon of JP Morgan Chase. The summit will showcase British innovations in AI, quantum computing, agri-tech, clean growth, advanced manufacturing, life sciences and fashion. Barclays, HSBC and Lloyds Bank are confirmed sponsors. Companies such as McLaren, Aston Martin, Fruit Cast Ltd, Delta G, Quantum DX
Global Investment Summit: UK Showcases Pioneering Innovations to World’s Top CEOs and Investors
-
NVIDIA deep learning blog
- Gen AI for the Genome: LLM Predicts Characteristics of COVID VariantsA widely acclaimed large language model for genomic data has demonstrated its ability to generate gene sequences that closely resemble real-world variants of SARS-CoV-2, the virus behind COVID-19. Called GenSLMs, the model, which last year won the Gordon Bell special prize for high performance computing-based COVID-19 research, was trained on a dataset of nucleotide sequences — the building blocks of DNA and RNA. It was developed by researchers from Argonne National Laboratory, NVIDIA, the Unive
Gen AI for the Genome: LLM Predicts Characteristics of COVID Variants
A widely acclaimed large language model for genomic data has demonstrated its ability to generate gene sequences that closely resemble real-world variants of SARS-CoV-2, the virus behind COVID-19.
Called GenSLMs, the model, which last year won the Gordon Bell special prize for high performance computing-based COVID-19 research, was trained on a dataset of nucleotide sequences — the building blocks of DNA and RNA. It was developed by researchers from Argonne National Laboratory, NVIDIA, the University of Chicago and a score of other academic and commercial collaborators.
When the researchers looked back at the nucleotide sequences generated by GenSLMs, they discovered that specific characteristics of the AI-generated sequences closely matched the real-world Eris and Pirola subvariants that have been prevalent this year — even though the AI was only trained on COVID-19 virus genomes from the first year of the pandemic.
“Our model’s generative process is extremely naive, lacking any specific information or constraints around what a new COVID variant should look like,” said Arvind Ramanathan, lead researcher on the project and a computational biologist at Argonne. “The AI’s ability to predict the kinds of gene mutations present in recent COVID strains — despite having only seen the Alpha and Beta variants during training — is a strong validation of its capabilities.”
In addition to generating its own sequences, GenSLMs can also classify and cluster different COVID genome sequences by distinguishing between variants. In a demo available on NGC, NVIDIA’s hub for accelerated software, users can explore visualizations of GenSLMs’ analysis of the evolutionary patterns of various proteins within the COVID viral genome.
Reading Between the Lines, Uncovering Evolutionary Patterns
A key feature of GenSLMs is its ability to interpret long strings of nucleotides — represented with sequences of the letters A, T, G and C in DNA, or A, U, G and C in RNA — in the same way an LLM trained on English text would interpret a sentence. This capability enables the model to understand the relationship between different areas of the genome, which in coronaviruses consists of around 30,000 nucleotides.
In the NGC demo, users can choose from among eight different COVID variants to understand how the AI model tracks mutations across various proteins of the viral genome. The visualization depicts evolutionary couplings across the viral proteins — highlighting which snippets of the genome are likely to be seen in a given variant.
“Understanding how different parts of the genome are co-evolving gives us clues about how the virus may develop new vulnerabilities or new forms of resistance,” Ramanathan said. “Looking at the model’s understanding of which mutations are particularly strong in a variant may help scientists with downstream tasks like determining how a specific strain can evade the human immune system.”
GenSLMs was trained on more than 110 million prokaryotic genome sequences and fine-tuned with a global dataset of around 1.5 million COVID viral sequences using open-source data from the Bacterial and Viral Bioinformatics Resource Center. In the future, the model could be fine-tuned on the genomes of other viruses or bacteria, enabling new research applications.
To train the model, the researchers used NVIDIA A100 Tensor Core GPU-powered supercomputers, including Argonne’s Polaris system, the U.S. Department of Energy’s Perlmutter and NVIDIA’s Selene.
The GenSLMs research team’s Gordon Bell special prize was awarded at last year’s SC22 supercomputing conference. At this week’s SC23, in Denver, NVIDIA is sharing a new range of groundbreaking work in the field of accelerated computing. View the full schedule and catch the replay of NVIDIA’s special address below.
NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics. Learn more about NVIDIA Research and subscribe to NVIDIA healthcare news.
Main image courtesy of Argonne National Laboratory’s Bharat Kale.
This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. DOE Office of Science and the National Nuclear Security Administration. Research was supported by the DOE through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19, with funding from the Coronavirus CARES Act.
-
NVIDIA deep learning blog
- Next-Gen Neural Networks: NVIDIA Research Announces Array of AI Advancements at NeurIPSNVIDIA researchers are collaborating with academic centers worldwide to advance generative AI, robotics and the natural sciences — and more than a dozen of these projects will be shared at NeurIPS, one of the world’s top AI conferences. Set for Dec. 10-16 in New Orleans, NeurIPS brings together experts in generative AI, machine learning, computer vision and more. Among the innovations NVIDIA Research will present are new techniques for transforming text to images, photos to 3D avatars, and speci
Next-Gen Neural Networks: NVIDIA Research Announces Array of AI Advancements at NeurIPS
NVIDIA researchers are collaborating with academic centers worldwide to advance generative AI, robotics and the natural sciences — and more than a dozen of these projects will be shared at NeurIPS, one of the world’s top AI conferences.
Set for Dec. 10-16 in New Orleans, NeurIPS brings together experts in generative AI, machine learning, computer vision and more. Among the innovations NVIDIA Research will present are new techniques for transforming text to images, photos to 3D avatars, and specialized robots into multi-talented machines.
“NVIDIA Research continues to drive progress across the field — including generative AI models that transform text to images or speech, autonomous AI agents that learn new tasks faster, and neural networks that calculate complex physics,” said Jan Kautz, vice president of learning and perception research at NVIDIA. “These projects, often done in collaboration with leading minds in academia, will help accelerate developers of virtual worlds, simulations and autonomous machines.”
Picture This: Improving Text-to-Image Diffusion Models
Diffusion models have become the most popular type of generative AI models to turn text into realistic imagery. NVIDIA researchers have collaborated with universities on multiple projects advancing diffusion models that will be presented at NeurIPS.
- A paper accepted as an oral presentation focuses on improving generative AI models’ ability to understand the link between modifier words and main entities in text prompts. While existing text-to-image models asked to depict a yellow tomato and a red lemon may incorrectly generate images of yellow lemons and red tomatoes, the new model analyzes the syntax of a user’s prompt, encouraging a bond between an entity and its modifiers to deliver a more faithful visual depiction of the prompt.
- SceneScape, a new framework using diffusion models to create long videos of 3D scenes from text prompts, will be presented as a poster. The project combines a text-to-image model with a depth prediction model that helps the videos maintain plausible-looking scenes with consistency between the frames — generating videos of art museums, haunted houses and ice castles (pictured above).
- Another poster describes work that improves how text-to-image models generate concepts rarely seen in training data. Attempts to generate such images usually result in low-quality visuals that aren’t an exact match to the user’s prompt. The new method uses a small set of example images that help the model identify good seeds — random number sequences that guide the AI to generate images from the specified rare classes.
- A third poster shows how a text-to-image diffusion model can use the text description of an incomplete point cloud to generate missing parts and create a complete 3D model of the object. This could help complete point cloud data collected by lidar scanners and other depth sensors for robotics and autonomous vehicle AI applications. Collected imagery is often incomplete because objects are scanned from a specific angle — for example, a lidar sensor mounted to a vehicle would only scan one side of each building as the car drives down a street.
Character Development: Advancements in AI Avatars
AI avatars combine multiple generative AI models to create and animate virtual characters, produce text and convert it to speech. Two NVIDIA posters at NeurIPS present new ways to make these tasks more efficient.
- A poster describes a new method to turn a single portrait image into a 3D head avatar while capturing details including hairstyles and accessories. Unlike current methods that require multiple images and a time-consuming optimization process, this model achieves high-fidelity 3D reconstruction without additional optimization during inference. The avatars can be animated either with blendshapes, which are 3D mesh representations used to represent different facial expressions, or with a reference video clip where a person’s facial expressions and motion are applied to the avatar.
- Another poster by NVIDIA researchers and university collaborators advances zero-shot text-to-speech synthesis with P-Flow, a generative AI model that can rapidly synthesize high-quality personalized speech given a three-second reference prompt. P-Flow features better pronunciation, human likeness and speaker similarity compared to recent state-of-the-art counterparts. The model can near-instantly convert text to speech on a single NVIDIA A100 Tensor Core GPU.
Research Breakthroughs in Reinforcement Learning, Robotics
In the fields of reinforcement learning and robotics, NVIDIA researchers will present two posters highlighting innovations that improve the generalizability of AI across different tasks and environments.
- The first proposes a framework for developing reinforcement learning algorithms that can adapt to new tasks while avoiding the common pitfalls of gradient bias and data inefficiency. The researchers showed that their method — which features a novel meta-algorithm that can create a robust version of any meta-reinforcement learning model — performed well on multiple benchmark tasks.
- Another by an NVIDIA researcher and university collaborators tackles the challenge of object manipulation in robotics. Prior AI models that help robotic hands pick up and interact with objects can handle specific shapes but struggle with objects unseen in the training data. The researchers introduce a new framework that estimates how objects across different categories are geometrically alike — such as drawers and pot lids that have similar handles — enabling the model to more quickly generalize to new shapes.
Supercharging Science: AI-Accelerated Physics, Climate, Healthcare
NVIDIA researchers at NeurIPS will also present papers across the natural sciences — covering physics simulations, climate models and AI for healthcare.
- To accelerate computational fluid dynamics for large-scale 3D simulations, a team of NVIDIA researchers proposed a neural operator architecture that combines accuracy and computational efficiency to estimate the pressure field around vehicles — the first deep learning-based computational fluid dynamics method on an industry-standard, large-scale automotive benchmark. The method achieved 100,000x acceleration on a single NVIDIA Tensor Core GPU compared to another GPU-based solver, while reducing the error rate. Researchers can incorporate the model into their own applications using the open-source neuraloperator library.
- A consortium of climate scientists and machine learning researchers from universities, national labs, research institutes, Allen AI and NVIDIA collaborated on ClimSim, a massive dataset for physics and machine learning-based climate research that will be shared in an oral presentation at NeurIPS. The dataset covers the globe over multiple years at high resolution — and machine learning emulators built using that data can be plugged into existing operational climate simulators to improve their fidelity, accuracy and precision. This can help scientists produce better predictions of storms and other extreme events.
- NVIDIA Research interns are presenting a poster introducing an AI algorithm that provides personalized predictions of the effects of medicine dosage on patients. Using real-world data, the researchers tested the model’s predictions of blood coagulation for patients given different dosages of a treatment. They also analyzed the new algorithm’s predictions of the antibiotic vancomycin levels in patients who received the medication — and found that prediction accuracy significantly improved compared to prior methods.
NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics.
-
Quantum Zeitgeist
- Cleveland Clinic and IBM Launch Quantum Innovation Program for Healthcare Start-upsCleveland Clinic has launched the Quantum Innovation Catalyzer Program, a competitive initiative for start-ups to explore quantum computing applications in healthcare and life sciences. Four companies will be selected to receive a 24-week immersive experience, including access to IBM Quantum System One computer for research. The program is part of Cleveland Clinic’s and IBM’s 10-year partnership aimed at advancing biomedical research through quantum and advanced computing. The application for th
Cleveland Clinic and IBM Launch Quantum Innovation Program for Healthcare Start-ups
-
Quantum Zeitgeist
- Zapata AI and Mila Partner to Advance Generative AI and Quantum Algorithms for EnterprisesZapata AI, an industrial generative AI company, has partnered with Mila - Quebec AI Institute, the world's largest academic deep learning research centre. The collaboration aims to advance machine learning and quantum algorithms, potentially benefiting industries such as life sciences, financial services, and manufacturing. Zapata AI's CTO, Yudong Cao, and Mila's Executive Vice-President, Stéphane Létourneau, expressed excitement about the partnership. They will work with Professor Guillaume Rab
Zapata AI and Mila Partner to Advance Generative AI and Quantum Algorithms for Enterprises
-
NVIDIA deep learning blog
- Brains of the Operation: Atlas Meditech Maps Future of Surgery With AI, Digital TwinsJust as athletes train for a game or actors rehearse for a performance, surgeons prepare ahead of an operation. Now, Atlas Meditech is letting brain surgeons experience a new level of realism in their pre-surgery preparation with AI and physically accurate simulations. Atlas Meditech, a brain-surgery intelligence platform, is adopting tools — including the MONAI medical imaging framework and NVIDIA Omniverse 3D development platform — to build AI-powered decision support and high-fidelity surgery
Brains of the Operation: Atlas Meditech Maps Future of Surgery With AI, Digital Twins
Just as athletes train for a game or actors rehearse for a performance, surgeons prepare ahead of an operation.
Now, Atlas Meditech is letting brain surgeons experience a new level of realism in their pre-surgery preparation with AI and physically accurate simulations.
Atlas Meditech, a brain-surgery intelligence platform, is adopting tools — including the MONAI medical imaging framework and NVIDIA Omniverse 3D development platform — to build AI-powered decision support and high-fidelity surgery rehearsal platforms. Its mission: improving surgical outcomes and patient safety.
“The Atlas provides a collection of multimedia tools for brain surgeons, allowing them to mentally rehearse an operation the night before a real surgery,” said Dr. Aaron Cohen-Gadol, founder of Atlas Meditech and its nonprofit counterpart, Neurosurgical Atlas. “With accelerated computing and digital twins, we want to transform this mental rehearsal into a highly realistic rehearsal in simulation.”
Neurosurgical Atlas offers case studies, surgical videos and 3D models of the brain to more than a million online users. Dr. Cohen-Gadol, also a professor of neurological surgery at Indiana University School of Medicine, estimates that more than 90% of brain surgery training programs in the U.S. — as well as tens of thousands of neurosurgeons in other countries — use the Atlas as a key resource during residency and early in their surgery careers.
Atlas Meditech’s Pathfinder software is integrating AI algorithms that can suggest safe surgical pathways for experts to navigate through the brain to reach a lesion.
And with NVIDIA Omniverse, a platform for connecting and building custom 3D pipelines and metaverse applications, the team aims to create custom virtual representations of individual patients’ brains for surgery rehearsal.
Custom 3D Models of Human Brains
A key benefit of Atlas Meditech’s advanced simulations — either onscreen or in immersive virtual reality — is the ability to customize the simulations, so that surgeons can practice on a virtual brain that matches the patient’s brain in size, shape and lesion position.
“Every patient’s anatomy is a little different,” said Dr. Cohen-Gadol. “What we can do now with physics and advanced graphics is create a patient-specific model of the brain and work with it to see and virtually operate on a tumor. The accuracy of the physical properties helps to recreate the experience we have in the real world during an operation.”
To create digital twins of patients’ brains, the Atlas Pathfinder tool has adopted MONAI Label, which can support radiologists by automatically annotating MRI and CT scans to segment normal structures and tumors.
“MONAI Label is the gateway to any healthcare project because it provides us with the opportunity to segment critical structures and protect them,” said Dr. Cohen-Gadol. “For the Atlas, we’re training MONAI Label to act as the eyes of the surgeon, highlighting what is a normal vessel and what’s a tumor in an individual patient’s scan.”
With a segmented view of a patient’s brain, Atlas Pathfinder can adjust its 3D brain model to morph to the patient’s specific anatomy, capturing how the tumor deforms the normal structure of their brain tissue.
Based on the visualization — which radiologists and surgeons can modify to improve the precision — Atlas Pathfinder suggests the safest surgical approaches to access and remove a tumor without harming other parts of the brain. Each approach links out to the Atlas website, which includes a written tutorial of the operative plan.
“AI-powered decision support can make a big difference in navigating a highly complex 3D structure where every millimeter is critical,” Dr. Cohen-Gadol said.
Realistic Rehearsal Environments for Practicing Surgeons
Atlas Meditech is using NVIDIA Omniverse to develop a virtual operating room that can immerse surgeons into a realistic environment to rehearse upcoming procedures. In the simulation, surgeons can modify how the patient and equipment are positioned.
Using a VR headset, surgeons will be able to work within this virtual environment, going step by step through the procedure and receiving feedback on how closely they are adhering to the target pathway to reach the tumor. AI algorithms can be used to predict how brain tissue would shift as a surgeon uses medical instruments during the operation, and apply that estimated shift to the simulated brain.
“The power to enable surgeons to enter a virtual, 3D space, cut a piece of the skull and rehearse the operation with a simulated brain that has very similar physical properties to the patient would be tremendous,” said Dr. Cohen-Gadol.
To better simulate the brain’s physical properties, the team adopted NVIDIA PhysX, an advanced real-time physics simulation engine that’s part of NVIDIA Omniverse. Using haptic devices, they were able to experiment with adding haptic feedback to the virtual environment, mimicking the feeling of working with brain tissue.
Envisioning AI, Robotics in the Future of Surgery Training
Dr. Cohen-Gadol believes that in the coming years AI models will be able to further enhance surgery by providing additional insights during a procedure. Examples include warning surgeons about critical brain structures that are adjacent to the area they’re working in, tracking medical instruments during surgery, and providing a guide to next steps in the surgery.
Atlas Meditech plans to explore the NVIDIA Holoscan platform for streaming AI applications to power these real-time, intraoperative insights. Applying AI analysis to a surgeon’s actions during a procedure can provide the surgeon with useful feedback to improve their technique.
In addition to being used for surgeons to rehearse operations, Dr. Cohen-Gadol says that digital twins of the brain and of the operating room could help train intelligent medical instruments such as microscope robots using Isaac Sim, a robotics simulation application developed on Omniverse.
-
NVIDIA deep learning blog
- The Fastest Path: Healthcare Startup Uses AI to Analyze Cancer Cells in the Operating RoomMedical-device company Invenio Imaging is developing technology that enables surgeons to evaluate tissue biopsies in the operating room, immediately after samples are collected — providing in just three minutes AI-accelerated insights that would otherwise take weeks to obtain from a pathology lab. In a surgical biopsy, a medical professional removes samples of cells or tissue that pathologists analyze for diseases such as cancer. By delivering these capabilities through a compact, AI-powered ima
The Fastest Path: Healthcare Startup Uses AI to Analyze Cancer Cells in the Operating Room
Medical-device company Invenio Imaging is developing technology that enables surgeons to evaluate tissue biopsies in the operating room, immediately after samples are collected — providing in just three minutes AI-accelerated insights that would otherwise take weeks to obtain from a pathology lab.
In a surgical biopsy, a medical professional removes samples of cells or tissue that pathologists analyze for diseases such as cancer. By delivering these capabilities through a compact, AI-powered imaging system within the treatment room, Invenio aims to support rapid clinical decision-making.
“This technology will help surgeons make intraoperative decisions when performing a biopsy or surgery,” said Chris Freudiger, chief technology officer of Silicon Valley-based Invenio. “They’ll be able to rapidly evaluate whether the tissue sample contains cancerous cells, decide whether they need to take another tissue sample and, with the AI models Invenio is developing, potentially make a molecular diagnosis for personalized medical treatment within minutes.”
Quicker diagnosis enables quicker treatment. It’s especially critical for aggressive types of cancer that could grow or spread significantly in the weeks it takes for biopsy results to return from a dedicated pathology lab.
Invenio is a member of NVIDIA Inception, a program that provides cutting-edge startups with technological support and AI platform guidance. The company accelerates AI training and inference using NVIDIA GPUs and software libraries.
Laser Focus on Cancer Care

Invenio’s NIO Laser Imaging System is a digital pathology tool that accelerates the imaging of fresh tissue biopsies. It’s been used in thousands of procedures in the U.S. and Europe. In 2021, it received the CE Mark of regulatory approval in Europe.
The company plans to adopt the NVIDIA Jetson Orin series of edge AI modules for its next-generation imaging system, which will feature near real-time AI inference accelerated by the NVIDIA TensorRT SDK.
“We’re building a layer of AI models on top of our imaging capabilities to provide physicians with not just the diagnostic image but also an analysis of what they’re seeing,” Freudiger said. “With the AI performance provided by NVIDIA Jetson at the edge, they’ll be able to quickly determine what kinds of cancer cells are present in a biopsy image.”
Invenio uses a cluster of NVIDIA RTX A6000 GPUs to train neural networks with tens of millions of parameters on pathologist-annotated images. The models were developed using the TensorFlow deep learning framework and trained on images acquired with NIO imaging systems.
“The most powerful capability for us is the expanded VRAM on the RTX A6000 GPUs, which allows us to load large batches of images and capture the variability of features,” Freudiger said. “It makes a big difference for AI training.”
On the Path to Clinical Deployment
One of Invenio’s AI products, NIO Glioma Reveal, is approved for clinical use in Europe and available for research use in the U.S. to help identify areas of cancerous cells in brain tissue.
A team of Invenio’s collaborators from the University of Michigan, New York University, University of California San Francisco, the Medical University of Vienna and University Hospital of Cologne recently developed a deep learning model that can find biomarkers of cancerous tumors with 93% accuracy in 90 seconds.
With this ability to analyze different molecular subtypes of cancer within a tissue sample, doctors can predict how well a patient will respond to chemotherapy — or determine whether a tumor has been successfully removed during surgery.
Beyond its work on brain tissue analysis, Invenio this year announced a clinical research collaboration with Johnson & Johnson’s Lung Cancer Initiative to develop and validate an AI solution that can help evaluate lung biopsies. The AI model will help doctors rapidly determine whether collected tissue samples contain cancer.
Lung cancer is the world’s deadliest form of cancer, and in the U.S. alone, lung nodules are found in over 1.5 million patients each year. Once approved for clinical use, Invenio’s NIO Lung Cancer Reveal tool aims to shorten the time needed to analyze tissue biopsies for these patients.
As part of this initiative, Invenio will run a clinical study before submitting the NVIDIA Jetson-powered AI solution for FDA approval.
-
Quantum Zeitgeist
- Assembly Theory: Bridging Physics and Biology to Decode Evolution and ComplexityAn international team of researchers, including Professor Sara Walker from Arizona State University and Professor Lee Cronin from the University of Glasgow, have developed a new theoretical framework called 'Assembly Theory'. This theory bridges physics and biology, providing a unified approach to understanding complexity and evolution in nature. The theory, which assigns a complexity score to molecules, could have implications for the search for alien life and efforts to create new life forms i
Assembly Theory: Bridging Physics and Biology to Decode Evolution and Complexity
-
Quantum Zeitgeist
- Haiqu and Perimeter Institute Join Forces to grow a research team at the Perimeter Institute’s Quantum Intelligence Lab (PIQuIL)Quantum computing software startup Haiqu has partnered with the Perimeter Institute, a leading scientific research centre in theoretical physics in Canada. The collaboration aims to enhance the performance of quantum processors and will see Haiqu establish a research team at the Perimeter Institute's Quantum Intelligence Lab (PIQuIL). The partnership will also offer internships at Haiqu, providing opportunities for a wide range of talent. PIQuIL's founder, Roger Melko, and Haiqu's CEO, Richard G
Haiqu and Perimeter Institute Join Forces to grow a research team at the Perimeter Institute’s Quantum Intelligence Lab (PIQuIL)
-
NVIDIA deep learning blog
- How Industries Are Meeting Consumer Expectations With Speech AIThanks to rapid technological advances, consumers have become accustomed to an unprecedented level of convenience and efficiency. Smartphones make it easier than ever to search for a product and have it delivered right to the front door. Video chat technology lets friends and family on different continents connect with ease. With voice command tools, AI assistants can play songs, initiate phone calls or recommend the best Italian food in a 10-mile radius. AI algorithms can even predict which sho
How Industries Are Meeting Consumer Expectations With Speech AI
Thanks to rapid technological advances, consumers have become accustomed to an unprecedented level of convenience and efficiency.
Smartphones make it easier than ever to search for a product and have it delivered right to the front door. Video chat technology lets friends and family on different continents connect with ease. With voice command tools, AI assistants can play songs, initiate phone calls or recommend the best Italian food in a 10-mile radius. AI algorithms can even predict which show users may want to watch next or suggest an article they may want to read before making a purchase.
It’s no surprise, then, that customers expect fast and personalized interactions with companies. According to a Salesforce research report, 83% of consumers expect immediate engagement when they contact a company, while 73% expect companies to understand their unique needs and expectations. Nearly 60% of all customers want to avoid customer service altogether, preferring to resolve issues with self-service features.
Meeting such high consumer expectations places a massive burden on companies in every industry, including on their staff and technological needs — but speech AI can help.
Speech AI can understand and converse in natural language, creating opportunities for seamless, multilingual customer interactions while supplementing employee capabilities. It can power self-serve banking in the financial services industry, enable food kiosk avatars in restaurants, transcribe clinical notes in healthcare facilities or streamline bill payments for utility companies — helping businesses across industries deliver personalized customer experiences.
Speech AI for Banking and Payments
Most people now use both digital and traditional channels to access banking services, creating a demand for omnichannel, personalized customer support. However, higher demand for support coupled with a high agent churn rate has left many financial institutions struggling to keep up with the service and support needs of their customers.
Common consumer frustrations include difficulty with complex digital processes, a lack of helpful and readily available information, insufficient self-service options, long call wait times and communication difficulties with support agents.
According to a recent NVIDIA survey, the top AI use cases for financial service institutions are natural language processing (NLP) and large language models (LLMs). These models automate customer service interactions and process large bodies of unstructured financial data to provide AI-driven insights that support all lines of business across financial institutions — from risk management and fraud detection to algorithmic trading and customer service.
By providing speech-equipped self-service options and supporting customer service agents with AI-powered virtual assistants, banks can improve customer experiences while controlling costs. AI voice assistants can be trained on finance-specific vocabulary and rephrasing techniques to confirm understanding of a user’s request before offering answers.
Kore.ai, a conversational AI software company, trained its BankAssist solution on 400-plus retail banking use cases for interactive voice response, web, mobile, SMS and social media channels. Customers can use a voice assistant to transfer funds, pay bills, report lost cards, dispute charges, reset passwords and more.
Kore.ai’s agent voice assistant has also helps live agents provide personalized suggestions so they can resolve issues faster. The solution has been shown to improve live agent efficiency by cutting customer handling time by 40% with a return on investment of $2.30 per voice session.
With such trends, expect financial institutions to accelerate the deployment of speech AI to streamline customer support and reduce wait times, offer more self-service options, transcribe calls to speed loan processing and automate compliance, extract insights from spoken content and boost the overall productivity and speed of operations.
Speech AI for Telecommunications
Heavy investments in 5G infrastructure and cut-throat competition to monetize and achieve profitable returns on new networks mean that maintaining customer satisfaction and brand loyalty is paramount in the telco industry.
According to an NVIDIA survey of 400-plus industry professionals, the top AI use cases in the telecom industry involve optimizing network operations and improving customer experiences. Seventy-three percent of respondents reported increased revenue from AI.
By using speech AI technologies to power chatbots, call-routing, self-service features and recommender systems, telcos can enhance and personalize customer engagements.
KT, a South Korean mobile operator with over 22 million users, has built GiGa Genie, an intelligent voice assistant that’s been trained to understand and use the Korean language using LLMs. It has already conversed with over 8 million users.
By understanding voice commands, the GiGA Genie AI speaker can support people with tasks like turning on smart TVs or lights, sending text messages or providing real-time traffic updates.
KT has also strengthened its AI-powered Customer Contact Center with transformer-based speech AI models that can independently handle over 100,000 calls per day. A generative AI component of the system autonomously responds to customers with suggested resolutions or transfers them to human agents for more nuanced questions and solutions.
Telecommunications companies are expected to lean into speech AI to build more customer self-service capabilities, optimize network performance and enhance overall customer satisfaction.
Speech AI for Quick-Service Restaurants
The food service industry is expected to reach $997 billion in sales in 2023, and its workforce is projected to grow by 500,000 openings. Meanwhile, elevated demand for drive-thru, curbside pickup and home delivery suggests a permanent shift in consumer dining preferences. This shift creates the challenge of hiring, training and retaining staff in an industry with notoriously high turnover rates — all while meeting consumer expectations for fast and fresh service.
Drive-thru order assistants and in-store food kiosks equipped with speech AI can help ease the burden. For example, speech-equipped avatars can help automate the ordering process by offering menu recommendations, suggesting promotions, customizing options or passing food orders directly to the kitchen for preparation.
HuEx, a Toronto-based startup and member of NVIDIA Inception, has designed a multilingual automated order assistant to enhance drive-thru operations. Known as AIDA, the AI assistant receives and responds to orders at the drive-thru speaker box while simultaneously transcribing voice orders into text for food-prep staff.
AIDA understands 300,000-plus product combinations with 90% accuracy, from common requests such as “coffee with milk” to less common requests such as “coffee with butter.” It can even understand different accents and dialects to ensure a seamless ordering experience for a diverse population of consumers.
Speech AI streamlines the order process by speeding fulfillment, reducing miscommunication and minimizing customer wait times. Early movers will also begin to use speech AI to extract customer insights from voice interactions to inform menu options, make upsell recommendations and improve overall operational efficiency while reducing costs.
Speech AI for Healthcare
In the post-pandemic era, the digitization of healthcare is continuing to accelerate. Telemedicine and computer vision support remote patient monitoring, voice-activated clinical systems help patients check in and receive zero-touch care and speech recognition technology supports clinical documentation responsibilities. Per IDC, 36% of survey respondents indicated that they had deployed digital assistants for patient healthcare.
Automated speech recognition and NLP models can now capture, recognize, understand and summarize key details in medical settings. At the Conference for Machine Intelligence in Medical Imaging, NVIDIA researchers showcased a state-of-the-art pretrained architecture with speech-to-text functionality to extract clinical entities from doctor-patient conversations. The model identifies clinical words — including symptoms, medication names, diagnoses and recommended treatments — and automatically updates medical records.
This technology can ease the burden of manual note-taking and has the potential to accelerate insurance and billing processes while also creating consultation recaps for caregivers. Relieved of administrative tasks, physicians can focus on patient care to deliver superior experiences.
Artisight, an AI platform for healthcare, uses speech recognition to power zero-touch check-ins and speech synthesis to notify patients in the waiting room when the doctor is available. Over 1,200 patients per day use Artisight kiosks, which help streamline registration processes, improve patient experiences, eliminate data entry errors with automation and boost staff productivity.
As healthcare moves toward a smart hospital model, expect to see speech AI play a bigger role in supporting medical professionals and powering low-touch experiences for patients. This may include risk factor prediction and diagnosis through clinical note analysis, translation services for multilingual care centers, medical dictation and transcription and automation of other administrative tasks.
Speech AI for Energy
Faced with increasing demand for clean energy, high operating costs and a workforce retiring in greater numbers, energy and utility companies are looking for ways to do more with less.
To drive new efficiencies, prepare for the future of energy and meet ever-rising customer expectations, utilities can use speech AI. Voice-based customer service can enable customers to report outages, inquire about billing and receive support on other issues without agent intervention. Speech AI can streamline meter reading, support field technicians with voice notes and voice commands to access work orders and enable utilities to analyze customer preferences with NLP.
Minerva CQ, an AI assistant designed specifically for retail energy use cases, supports customer service agents by transcribing conversations into text in real time. Text is fed into Minerva CQ’s AI models, which analyze customer sentiment, intent, propensity and more.
By dynamically listening, the AI assistant populates an agent’s screen with dialogue suggestions, behavioral cues, personalized offers and sentiment analysis. A knowledge-surfacing feature pulls up a customer’s energy usage history and suggests decarbonization options — arming agents with the information needed to help customers make informed decisions about their energy consumption.
With the AI assistant providing consistent, simple explanations on energy sources, tariff plans, billing changes and optimal spending, customer service agents can effortlessly guide customers to the most ideal energy plan. After deploying Minerva CQ, one utility provider reported a 44% reduction in call handling time, a 12.5% increase in first-contact resolution and average savings of $2.67 per call.
Speech AI is expected to continue to help utility providers reduce training costs, remove friction from customer service interactions and equip field technicians with voice-activated tools to boost productivity and improve safety — all while enhancing customer satisfaction.
Speech and Translation AI for the Public Sector
Because public service programs are often underfunded and understaffed, citizens seeking vital services and information are at times left waiting and frustrated. To address this challenge, some federal- and state-level agencies are turning to speech AI to achieve more timely service delivery.
The Federal Emergency Management Agency uses automated speech recognition systems to manage emergency hotlines, analyze distress signals and direct resources efficiently. The U.S. Social Security Administration uses an interactive voice response system and virtual assistants to respond to inquiries about social security benefits and application processes and to provide general information.
The Department of Veterans Affairs has appointed a director of AI to oversee the integration of the technology into its healthcare systems. The VA uses speech recognition technology to power note-taking during telehealth appointments. It has also developed an advanced automated speech transcription engine to help score neuropsychological tests for analysis of cognitive decline in older patients.
Additional opportunities for speech AI in the public sector include real-time language translation services for citizen interactions, public events or visiting diplomats. Public agencies that handle a large volume of calls can benefit from multilingual voice-based interfaces to allow citizens to access information, make inquiries or request services in different languages.
Speech and translation AI can also automate document processing by converting multilingual audio recordings or spoken content into translated text to streamline compliance processes, improve data accuracy and enhance administrative task efficiency. Speech AI additionally has the potential to expand access to services for people with visual or mobility impairments.
Speech AI for Automotive
From vehicle sales to service scheduling, speech AI can bring numerous benefits to automakers, dealerships, drivers and passengers alike.
Before visiting a dealership in person, more than half of vehicle shoppers begin their search online, then make the first contact with a phone call to collect information. Speech AI chatbots trained on vehicle manuals can answer questions on technological capabilities, navigation, safety, warranty, maintenance costs and more. AI chatbots can also schedule test drives, answer pricing questions and inform shoppers of which models are in stock. This enables automotive manufacturers to differentiate their dealership networks through intelligent and automated engagements with customers.
Manufacturers are building advanced speech AI into vehicles and apps to improve driving experiences, safety and service. Onboard AI assistants can execute natural language voice commands for navigation, infotainment, general vehicle diagnostics and querying user manuals. Without the need to operate physical controls or touch screens, drivers can keep their hands on the wheel and eyes on the road.
Speech AI can help maximize vehicle up-time for commercial fleets. AI trained on technical service bulletins and software update cadences lets technicians provide more accurate quotes for repairs, identify key information before putting the car on a lift and swiftly supply vehicle repair updates to commercial and small business customers.
With insights from driver voice commands and bug reports, manufacturers can also improve vehicle design and operating software. As self-driving cars become more advanced, expect speech AI to play a critical role in how drivers operate vehicles, troubleshoot issues, call for assistance and schedule maintenance.
Speech AI — From Smart Spaces to Entertainment
Speech AI has the potential to impact nearly every industry.
In Smart Cities, speech AI can be used to handle distress calls and provide emergency responders with crucial information. In Mexico City, the United Nations Office on Drugs and Crime is developing a speech AI program to analyze 911 calls to prevent gender violence. By analyzing distress calls, AI can identify keywords, signals and patterns to help prevent domestic violence against women. Speech AI can also be used to deliver multilingual services in public spaces and improve access to transit for people who are visually impaired.
In higher education and research, speech AI can automatically transcribe lectures and research interviews, providing students with detailed notes and saving researchers the time spent compiling qualitative data. Speech AI also facilitates the translation of educational content to various languages, increasing its accessibility.
AI translation powered by LLMs is making it easier to consume entertainment and streaming content online in any language. Netflix, for example, is using AI to automatically translate subtitles into multiple languages. Meanwhile, startup Papercup is using AI to automate video content dubbing to reach global audiences in their local languages.
Transforming Product and Service Offerings With Speech AI
In the modern consumer landscape, it’s imperative that companies provide convenient, personalized customer experiences. Businesses can use NLP and the translation capabilities of speech AI to transform the way they operate and interact with customers in real time on a global scale.
Companies across industries are using speech AI to deliver rapid, multilingual customer service responses, self-service features and information and automation tools to empower employees to provide higher-value experiences.
To help enterprises in every industry realize the benefits of speech, translation and conversational AI, NVIDIA offers a suite of technologies.
NVIDIA Riva, a GPU-accelerated multilingual speech and translation AI software development kit, powers fully customizable real-time conversational AI pipelines for automatic speech recognition, text-to-speech and neural machine translation applications.
NVIDIA Tokkio, built on the NVIDIA Omniverse Avatar Cloud Engine, offers cloud-native services to create virtual assistants and digital humans that can serve as AI customer service agents.
These tools enable developers to quickly deploy high-accuracy applications with the real-time response speed needed for superior employee and customer experiences.
Join the free Speech AI Day on Sept. 20 to hear from renowned speech and translation AI leaders about groundbreaking research, real-world applications and open-source contributions.
-
NVIDIA deep learning blog
- The Halo Effect: AI Deep Dives Into Coral Reef ConservationWith coral reefs in rapid decline across the globe, researchers from the University of Hawaii at Mānoa have pioneered an AI-based surveying tool that monitors reef health from the sky. Using deep learning models and high-resolution satellite imagery powered by NVIDIA GPUs, the researchers have developed a new method for spotting and tracking coral reef halos — distinctive rings of barren sand encircling reefs. The study, recently published in the Remote Sensing of Environment journal, could unlo
The Halo Effect: AI Deep Dives Into Coral Reef Conservation
With coral reefs in rapid decline across the globe, researchers from the University of Hawaii at Mānoa have pioneered an AI-based surveying tool that monitors reef health from the sky.
Using deep learning models and high-resolution satellite imagery powered by NVIDIA GPUs, the researchers have developed a new method for spotting and tracking coral reef halos — distinctive rings of barren sand encircling reefs.
The study, recently published in the Remote Sensing of Environment journal, could unlock real-time coral reef monitoring and turn the tide on global conservation.
“Coral reef halos are a potential proxy for ecosystem health,” said Amelia Meier, a postdoctoral fellow at the University of Hawaii and co-author of the study. “Visible from space, these halo patterns give scientists and conservationists a unique opportunity to observe vast and distant areas. With AI, we can regularly assess halo presence and size in near real time to determine ecosystem well-being.”
Sea-ing Clearly: Illuminating Reef Health
Previously attributed solely to fish grazing, reef halos can also indicate a healthy predator-prey ecosystem, according to researchers’ recent discoveries. While some herbivorous fish graze algae or seagrass near the protective reef perimeter, hunters dig around the seafloor for burrowed invertebrates, laying bare the surrounding sand.
These dynamics indicate the area hosts a healthy food buffet for sustaining a diverse population of ocean dwellers. When the halo changes shape, it signals an imbalance in the marine food web and could indicate an unhealthy reef environment.
In Hot Water
While making up less than 1% of the ocean, coral reefs offer habitat, food and nursery grounds for over 1 million aquatic species. There’s also huge commercial value — about $375 billion annually in commercial and subsistence fishing, tourism and coastal storm protection, and providing antiviral compounds for drug discovery research.
However, reef health is threatened by overfishing, nutrient contamination and ocean acidification. Intensifying climate change — along with the resulting thermal stress from a warming ocean — also increases coral bleaching and infectious disease.
Over half of the world’s coral reefs are already lost or badly damaged, and scientists predict that by 2050 all reefs will face threats, with many in critical danger.
Charting New Horizons With AI
Spotting changes in reef halos is key to global conservation efforts. However, tracking these changes is labor- and time-intensive, limiting the number of surveys that researchers can perform every year. Access to reefs in remote locations also poses challenges.
The researchers created an AI tool that identifies and measures reef halos from global satellites, giving conservationists an opportunity to proactively address reef degradation.
Using Planet SkySat images, they developed a dual-model framework employing two types of convolutional neural networks (CNNs). Relying on computer vision methods for image segmentation, they trained a Mask R-CNN model that detects the edges of the reef and halo, pixel by pixel. A U-Net model trained to differentiate between the coral reef and halo then classifies and predicts the areas of both.

The team used TensorFlow, Keras and PyTorch libraries for training and testing thousands of annotations on the coral reef models.
To handle the task’s large compute requirements, the CNNs operate on an NVIDIA RTX A6000 GPU, boosted by a cuDNN-accelerated PyTorch framework. The researchers received the A6000 GPU as participants in the NVIDIA Academic Hardware Grant Program.
The AI tool quickly identifies and measures around 300 halos across 100 square kilometers in about two minutes. The same task takes a human annotator roughly 10 hours. The model also reaches about 90% accuracy depending on location and can navigate various and complicated halo patterns.
“Our study marks the first instance of training AI on reef halo patterns, as opposed to more common AI datasets of images, such as those of cats and dogs,” Meier said. “Processing thousands of images can take a lot of time, but using the NVIDIA GPU sped up the process significantly.”
One challenge is that image resolution can be a limiting factor in the model’s accuracy. Course-scale imagery with low resolutions makes it difficult to spot reef and halo boundaries and creates less accurate predictions.
Shoring Up Environmental Monitoring
“Our long-term goal is to transform our findings into a robust monitoring tool for assessing changes in halo size and to draw correlations to the population dynamics of predators and herbivores in the area,” Meier said.
With this new approach, the researchers are exploring the relationship between species composition, reef health, and halo presence and size. Currently, they’re looking into the association between sharks and halos. If their hypothesized predator-prey-halo interaction proves true, the team anticipates estimating shark abundance from space.
-
NVIDIA deep learning blog
- AI-Fueled Productivity: Generative AI Opens New Era of Efficiency Across IndustriesA watershed moment on Nov. 22, 2022, was mostly virtual, yet it shook the foundations of nearly every industry on the planet. On that day, OpenAI released ChatGPT, the most advanced artificial intelligence chatbot ever developed. This set off demand for generative AI applications that help businesses become more efficient, from providing consumers with answers to their questions to accelerating the work of researchers as they seek scientific breakthroughs, and much, much more. Businesses that pr
AI-Fueled Productivity: Generative AI Opens New Era of Efficiency Across Industries
A watershed moment on Nov. 22, 2022, was mostly virtual, yet it shook the foundations of nearly every industry on the planet.
On that day, OpenAI released ChatGPT, the most advanced artificial intelligence chatbot ever developed. This set off demand for generative AI applications that help businesses become more efficient, from providing consumers with answers to their questions to accelerating the work of researchers as they seek scientific breakthroughs, and much, much more.
Businesses that previously dabbled in AI are now rushing to adopt and deploy the latest applications. Generative AI — the ability of algorithms to create new text, images, sounds, animations, 3D models and even computer code — is moving at warp speed, transforming the way people work and play.
By employing large language models (LLMs) to handle queries, the technology can dramatically reduce the time people devote to manual tasks like searching for and compiling information.
The stakes are high. AI could contribute more than $15 trillion to the global economy by 2030, according to PwC. And the impact of AI adoption could be greater than the inventions of the internet, mobile broadband and the smartphone — combined.
The engine driving generative AI is accelerated computing. It uses GPUs, DPUs and networking along with CPUs to accelerate applications across science, analytics, engineering, as well as consumer and enterprise use cases.
Early adopters across industries — from drug discovery, financial services, retail and telecommunications to energy, higher education and the public sector — are combining accelerated computing with generative AI to transform business operations, service offerings and productivity.

Generative AI for Drug Discovery
Today, radiologists use AI to detect abnormalities in medical images, doctors use it to scan electronic health records to uncover patient insights, and researchers use it to accelerate the discovery of novel drugs.
Traditional drug discovery is a resource-intensive process that can require the synthesis of over 5,000 chemical compounds and yields an average success rate of just 10%. And it takes more than a decade for most new drug candidates to reach the market.
Researchers are now using generative AI models to read a protein’s amino acid sequence and accurately predict the structure of target proteins in seconds, rather than weeks or months.
Using NVIDIA BioNeMo models, Amgen, a global leader in biotechnology, has slashed the time it takes to customize models for molecule screening and optimization from three months to just a few weeks. This type of trainable foundation model enables scientists to create variants for research into specific diseases, allowing them to develop target treatments for rare conditions.
Whether predicting protein structures or securely training algorithms on large real-world and synthetic datasets, generative AI and accelerated computing are opening new areas of research that can help mitigate the spread of disease, enable personalized medical treatments and boost patient survival rates.
Generative AI for Financial Services
According to a recent NVIDIA survey, the top AI use cases in the financial services industry are customer services and deep analytics, where natural language processing and LLMs are used to better respond to customer inquiries and uncover investment insights. Another common application is in recommender systems that power personalized banking experiences, marketing optimization and investment guidance.
Advanced AI applications have the potential to help the industry better prevent fraud and transform every aspect of banking, from portfolio planning and risk management to compliance and automation.
Eighty percent of business-relevant information is in an unstructured format — primarily text — which makes it a prime candidate for generative AI. Bloomberg News produces 5,000 stories a day related to the financial and investment community. These stories represent a vast trove of unstructured market data that can be used to make timely investment decisions.
NVIDIA, Deutsche Bank, Bloomberg and others are creating LLMs trained on domain-specific and proprietary data to power finance applications.
Financial Transformers, or “FinFormers,” can learn context and understand the meaning of unstructured financial data. They can power Q&A chatbots, summarize and translate financial texts, provide early warning signs of counterparty risk, quickly retrieve data and identify data-quality issues.
These generative AI tools rely on frameworks that can integrate proprietary data into model training and fine-tuning, integrate data curation to prevent bias and use guardrails to keep conversations finance-specific.
Expect fintech startups and large international banks to expand their use of LLMs and generative AI to develop sophisticated virtual assistants to serve internal and external stakeholders, create hyper-personalized customer content, automate document summarization to reduce manual work, and analyze terabytes of public and private data to generate investment insights.
Generative AI for Retail
With 60% of all shopping journeys starting online and consumers more connected and knowledgeable than ever, AI has become a vital tool to help retailers match shifting expectations and differentiate from a rising tide of competition.
Retailers are using AI to improve customer experiences, power dynamic pricing, create customer segmentation, design personalized recommendations and perform visual search.
Generative AI can support customers and employees at every step through the buyer journey.
With AI models trained on specific brand and product data, they can generate robust product descriptions that improve search engine optimization rankings and help shoppers find the exact product they’re looking for. For example, generative AI can use metatags containing product attributes to generate more comprehensive product descriptions that include various terms like “low sugar” or “gluten free.”
AI virtual assistants can check enterprise resource planning systems and generate customer service messages to inform shoppers about which items are available and when orders will ship, and even assist customers with order change requests.
Fashable, a member of NVIDIA Inception’s global network of technology startups, is using generative AI to create virtual clothing designs, eliminating the need for physical fabric during product development. With the models trained on both proprietary and market data, this reduces the environmental impact of fashion design and helps retailers design clothes according to current market trends and tastes.
Expect retailers to use AI to capture and retain customer attention, deliver superior shopping experiences, and drive revenue by matching shoppers with the right products at the right time.
Generative AI for Telecommunications
In an NVIDIA survey covering the telecommunications industry, 95% of respondents reported that they were engaged with AI, while two-thirds believed that AI would be important to their company’s future success.
Whether improving customer service, streamlining network operations and design, supporting field technicians or creating new monetization opportunities, generative AI has the potential to reinvent the telecom industry.
Telcos can train diagnostic AI models with proprietary data on network equipment and services, performance, ticket issues, site surveys and more. These models can accelerate troubleshooting of technical performance issues, recommend network designs, check network configurations for compliance, predict equipment failures, and identify and respond to security threats.
Generative AI applications on handheld devices can support field technicians by scanning equipment and generating virtual tutorials to guide them through repairs. Virtual guides can then be enhanced with augmented reality, enabling technicians to analyze equipment in a 3D immersive environment or call on a remote expert for support.
New revenue opportunities will also open for telcos. With large edge infrastructure and access to vast datasets, telcos around the world are now offering generative AI as a service to enterprise and government customers.
As generative AI advances, expect telecommunications providers to use the technology to optimize network performance, improve customer support, detect security intrusions and enhance maintenance operations.
Generative AI for Energy
In the energy industry, AI is powering predictive maintenance and asset optimization, smart grid management, renewable energy forecasting, grid security and more.
To meet growing data needs across aging infrastructure and new government compliance regulations, energy operators are looking to generative AI.
In the U.S., electric utility companies spend billions of dollars every year to inspect, maintain and upgrade power generation and transmission infrastructure.
Until recently, using vision AI to support inspection required algorithms to be trained on thousands of manually collected and tagged photos of grid assets, with training data constantly updated for new components. Now, generative AI can do the heavy lifting.
With a small set of image training data, algorithms can generate thousands of physically accurate images to train computer vision models that help field technicians identify grid equipment corrosion, breakage, obstructions and even detect wildfires. This type of proactive maintenance enhances grid reliability and resiliency by reducing downtime, while diminishing the need to dispatch teams to the field.
Generative AI can also reduce the need for manual research and analysis. According to McKinsey, employees spend up to 1.8 hours per day searching for information — nearly 20% of the work week. To increase productivity, energy companies can train LLMs on proprietary data, including meeting notes, SAP records, emails, field best practices and public data such as standard material data sheets.
With this type of knowledge repository connected to an AI chatbot, engineers and data scientists can get instant answers to highly technical questions. For example, a maintenance engineer troubleshooting pitch control issues on a turbine’s hydraulic system could ask a bot: “How should I adjust the hydraulic pressure or flow to rectify pitch control issues on a model turbine from company X?” A properly trained model would deliver specific instructions to the user, who wouldn’t have to look through a bulky manual to find answers.
With AI applications for new system design, customer service and automation, expect generative AI to enhance safety and energy efficiency, as well as reduce operational expenses in the energy industry.
Generative AI for Higher Education and Research
From intelligent tutoring systems to automated essay grading, AI has been employed in education for decades. As universities use AI to improve teacher and student experiences, they’re increasingly dedicating resources to build AI-focused research initiatives.
For example, researchers at the University of Florida have access to one of the world’s fastest supercomputers in academia. They’ve used it to develop GatorTron — a natural language processing model that enables computers to read and interpret medical language in clinical notes that are stored in electronic health records. With a model that understands medical context, AI developers can create numerous medical applications, such as speech-to-text apps that support doctors with automated medical charting.
In Europe, an industry-university collaboration involving the Technical University of Munich is demonstrating that LLMs trained on genomics data can generalize across a plethora of genomic tasks, unlike previous approaches that required specialized models. The genomics LLM is expected to help scientists understand the dynamics of how DNA is translated into RNA and proteins, unlocking new clinical applications that will benefit drug discovery and health.
To conduct this type of groundbreaking research and attract the most motivated students and qualified academic professionals, higher education institutes should consider a whole-university approach to pool budget, plan AI initiatives, and distribute AI resources and benefits across disciplines.
Generative AI for the Public Sector
Today, the biggest opportunity for AI in the public sector is helping public servants to perform their jobs more efficiently and save resources.
The U.S. federal government employs over 2 million civilian employees — two-thirds of whom work in professional and administrative jobs.
These administrative roles often involve time-consuming manual tasks, including drafting, editing and summarizing documents, updating databases, recording expenditures for auditing and compliance, and responding to citizen inquiries.
To control costs and bring greater efficiency to routine job functions, government agencies can use generative AI.
Generative AI’s ability to summarize documents has great potential to boost the productivity of policymakers and staffers, civil servants, procurement officers and contractors. Consider a 756-page report recently released by the National Security Commission on Artificial Intelligence. With reports and legislation often spanning hundreds of pages of dense academic or legal text, AI-powered summaries generated in seconds can quickly break down complex content into plain language, saving the human resources otherwise needed to complete the task.
AI virtual assistants and chatbots powered by LLMs can instantly deliver relevant information to people online, taking the burden off of overstretched staff who work phone banks at agencies like the Treasury Department, IRS and DMV.
With simple text inputs, AI content generation can help public servants create and distribute publications, email correspondence, reports, press releases and public service announcements.
The analytical capabilities of AI can also help process documents to speed the delivery of vital services provided by organizations like Medicare, Medicaid, Veterans Affairs, USPS and the State Department.
Generative AI could be a pivotal tool to help government bodies work within budget constraints, deliver government services more quickly and achieve positive public sentiment.
Generative AI – A Key Ingredient for Business Success
Across every field, organizations are transforming employee productivity, improving products and delivering higher-quality services with generative AI.
To put generative AI into practice, businesses need expansive amounts of data, deep AI expertise and sufficient compute power to deploy and maintain models quickly. Enterprises can fast-track adoption with the NeMo generative AI framework, part of NVIDIA AI Enterprise software, running on DGX Cloud. NVIDIA’s pretrained foundation models offer a simplified approach to building and running customized generative AI solutions for unique business use cases.
Learn more about powerful generative AI tools to help your business increase productivity, automate tasks, and unlock new opportunities for employees and customers.