
AWS and NVIDIA expand partnership to advance generative AI

By: Ryan Daws
29 November 2023 at 14:30

Amazon Web Services (AWS) and NVIDIA have announced a significant expansion of their strategic collaboration at AWS re:Invent. The collaboration aims to provide customers with state-of-the-art infrastructure, software, and services to fuel generative AI innovations.

The collaboration brings together the strengths of both companies, integrating NVIDIA’s latest multi-node systems with next-generation GPUs, CPUs, and AI software, along with AWS technologies such as Nitro System advanced virtualisation, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability.

Key highlights of the expanded collaboration include:

  1. Introduction of NVIDIA GH200 Grace Hopper Superchips on AWS:
    • AWS becomes the first cloud provider to offer NVIDIA GH200 Grace Hopper Superchips with new multi-node NVLink technology.
    • The NVIDIA GH200 NVL32 multi-node platform enables joint customers to scale to thousands of GH200 Superchips, providing supercomputer-class performance.
  2. Hosting NVIDIA DGX Cloud on AWS:
    • Collaboration to host NVIDIA DGX Cloud, an AI-training-as-a-service, on AWS, featuring GH200 NVL32 for accelerated training of generative AI and large language models.
  3. Project Ceiba supercomputer:
    • Collaboration on Project Ceiba, aiming to design the world’s fastest GPU-powered AI supercomputer with 16,384 NVIDIA GH200 Superchips and processing capability of 65 exaflops.
  4. Introduction of new Amazon EC2 instances:
    • AWS introduces three new Amazon EC2 instances, including P5e instances powered by NVIDIA H200 Tensor Core GPUs for large-scale generative AI and HPC workloads.
  5. Software innovations:
    • NVIDIA introduces software on AWS, such as NeMo Retriever microservice for chatbots and summarisation tools, and BioNeMo to speed up drug discovery for pharmaceutical companies.

This collaboration signifies a joint commitment to advancing the field of generative AI, offering customers access to cutting-edge technologies and resources.

Internally, Amazon robotics and fulfilment teams already employ NVIDIA’s Omniverse platform to optimise warehouses in virtual environments before real-world deployment.

The integration of NVIDIA and AWS technologies will accelerate the development, training, and inference of large language models and generative AI applications across various industries.

(Photo by ANIRUDH on Unsplash)

See also: Inflection-2 beats Google’s PaLM 2 across common benchmarks

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post AWS and NVIDIA expand partnership to advance generative AI appeared first on AI News.


Global AI security guidelines endorsed by 18 countries

By: Ryan Daws
27 November 2023 at 10:28

The UK has published the world’s first global guidelines for securing AI systems against cyberattacks. The new guidelines aim to ensure AI technology is developed safely and securely.

The guidelines were developed by the UK’s National Cyber Security Centre (NCSC) and the US’ Cybersecurity and Infrastructure Security Agency (CISA). They have already secured endorsements from 17 other countries, including all G7 members.

The guidelines provide recommendations for developers and organisations using AI to incorporate cybersecurity at every stage. This “secure by design” approach advises baking in security from the initial design phase through development, deployment, and ongoing operations.  

Specific guidelines cover four key areas: secure design, secure development, secure deployment, and secure operation and maintenance. They suggest security behaviours and best practices for each phase.

The launch event in London convened over 100 industry, government, and international partners. Speakers included representatives from Microsoft, the Alan Turing Institute, and cyber agencies from the US, Canada, Germany, and the UK.

NCSC CEO Lindy Cameron stressed the need for proactive security amidst AI’s rapid pace of development. She said, “security is not a postscript to development but a core requirement throughout.”

The guidelines build on existing UK leadership in AI safety. Last month, the UK hosted the first international summit on AI safety at Bletchley Park.

US Secretary of Homeland Security Alejandro Mayorkas said: “We are at an inflection point in the development of artificial intelligence, which may well be the most consequential technology of our time. Cybersecurity is key to building AI systems that are safe, secure, and trustworthy.

“The guidelines jointly issued today by CISA, NCSC, and our other international partners, provide a common-sense path to designing, developing, deploying, and operating AI with cybersecurity at its core.”

The 18 endorsing countries span Europe, Asia-Pacific, Africa, and the Americas. Here is the full list of international signatories:

  • Australia – Australian Signals Directorate’s Australian Cyber Security Centre (ACSC)
  • Canada – Canadian Centre for Cyber Security (CCCS) 
  • Chile – Chile’s Government CSIRT
  • Czechia – Czechia’s National Cyber and Information Security Agency (NUKIB)
  • Estonia – Information System Authority of Estonia (RIA) and National Cyber Security Centre of Estonia (NCSC-EE)
  • France – French Cybersecurity Agency (ANSSI)
  • Germany – Germany’s Federal Office for Information Security (BSI)
  • Israel – Israeli National Cyber Directorate (INCD)
  • Italy – Italian National Cybersecurity Agency (ACN)
  • Japan – Japan’s National Center of Incident Readiness and Strategy for Cybersecurity (NISC) and Japan’s Secretariat of Science, Technology and Innovation Policy, Cabinet Office
  • New Zealand – New Zealand National Cyber Security Centre
  • Nigeria – Nigeria’s National Information Technology Development Agency (NITDA)
  • Norway – Norwegian National Cyber Security Centre (NCSC-NO)
  • Poland – Poland’s NASK National Research Institute (NASK)
  • Republic of Korea – Republic of Korea National Intelligence Service (NIS)
  • Singapore – Cyber Security Agency of Singapore (CSA)
  • United Kingdom – National Cyber Security Centre (NCSC)
  • United States of America – Cybersecurity and Infrastructure Security Agency (CISA); National Security Agency (NSA); Federal Bureau of Investigation (FBI)

UK Science and Technology Secretary Michelle Donelan positioned the new guidelines as cementing the UK’s role as “an international standard bearer on the safe use of AI.”

“Just weeks after we brought world leaders together at Bletchley Park to reach the first international agreement on safe and responsible AI, we are once again uniting nations and companies in this truly global effort,” adds Donelan.

The guidelines are now published on the NCSC website alongside explanatory blogs. Developer uptake will be key to translating the secure by design vision into real-world improvements in AI security.

(Photo by Jan Antonin Kolar on Unsplash)

See also: Paul O’Sullivan, Salesforce: Transforming work in the GenAI era

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Global AI security guidelines endorsed by 18 countries appeared first on AI News.

Inflection-2 beats Google’s PaLM 2 across common benchmarks

By: Ryan Daws
23 November 2023 at 09:54

Inflection, an AI startup aiming to create “personal AI for everyone”, has announced a new large language model dubbed Inflection-2 that beats Google’s PaLM 2.

Inflection-2 was trained on 5,000 NVIDIA H100 GPUs, reaching approximately 10²⁵ floating point operations (FLOPs) of training compute and putting it in the same league as PaLM 2 Large. However, early benchmarks show Inflection-2 outperforming Google’s model on tests of reasoning ability, factual knowledge, and stylistic prowess.

On a range of common academic AI benchmarks, Inflection-2 achieved higher scores than PaLM 2 on most. This included outscoring the search giant’s flagship on the diverse Massive Multitask Language Understanding (MMLU) tests, as well as the TriviaQA, HellaSwag, and Grade School Math (GSM8k) benchmarks.

The startup’s new model will soon power its personal assistant app Pi to enable more natural conversations and useful features.

Thrilled to announce that Inflection-2 is now the 2nd best LLM in the world! 💚✨🎉

It will be powering https://t.co/1RWFB5RHtF very soon. And available to select API partners in time. Tech report linked…

Come run with us! https://t.co/8DZwP1Qnqo

— Mustafa Suleyman (@mustafasuleyman) November 22, 2023

Inflection said its transition from NVIDIA A100 to H100 GPUs for inference – combined with optimisation work – will increase serving speed and reduce costs despite Inflection-2 being much larger than its predecessor.  

An Inflection spokesperson said this latest model brings them “a big milestone closer” towards fulfilling the mission of providing AI assistants for all. They added the team is “already looking forward” to training even larger models on their 22,000 GPU supercluster.

Safety is said to be a top priority for the researchers, with Inflection being one of the first signatories to the White House’s July 2023 voluntary AI commitments. The company said its safety team continues working to ensure models are rigorously evaluated and rely on best practices for alignment.

With impressive benchmarks and plans to scale further, Inflection’s latest effort poses a serious challenge to tech giants like Google and Microsoft who have so far dominated the field of large language models. The race is on to deliver the next generation of AI.

(Photo by Johann Walter Bantz on Unsplash)

See also: Anthropic upsizes Claude 2.1 to 200K tokens, nearly doubling GPT-4

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Inflection-2 beats Google’s PaLM 2 across common benchmarks appeared first on AI News.

Anthropic upsizes Claude 2.1 to 200K tokens, nearly doubling GPT-4

By: Ryan Daws
22 November 2023 at 11:33

San Francisco-based AI startup Anthropic has unveiled Claude 2.1, an upgrade to its language model that boasts a 200,000-token context window—vastly outpacing the recently released 128,000-token GPT-4 Turbo model from OpenAI.

The release comes on the heels of an expanded partnership with Google that provides Anthropic access to advanced processing hardware, enabling the substantial expansion of Claude’s context-handling capabilities.

Our new model Claude 2.1 offers an industry-leading 200K token context window, a 2x decrease in hallucination rates, system prompts, tool use, and updated pricing.

Claude 2.1 is available over API in our Console, and is powering our https://t.co/uLbS2JNczH chat experience. pic.twitter.com/T1XdQreluH

— Anthropic (@AnthropicAI) November 21, 2023

With the ability to process lengthy documents like full codebases or novels, Claude 2.1 is positioned to unlock new potential across applications from contract analysis to literary study. 

The 200K token window represents more than just an incremental improvement—early tests indicate Claude 2.1 can accurately grasp information from prompts over 50 percent longer than GPT-4 before the performance begins to degrade.

Claude 2.1 (200K Tokens) – Pressure Testing Long Context Recall

We all love increasing context lengths – but what's performance like?

Anthropic reached out with early access to Claude 2.1 so I repeated the “needle in a haystack” analysis I did on GPT-4

Here's what I found:… pic.twitter.com/B36KnjtJmE

— Greg Kamradt (@GregKamradt) November 21, 2023
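
Kamradt’s methodology is straightforward to reproduce: pad a prompt with filler text, bury a single out-of-place fact (the “needle”) at a chosen depth, and check whether the model can recall it when asked. Below is a minimal sketch of that test loop; `query_model` is a hypothetical stand-in for whichever provider’s chat API is being evaluated.

```python
# Minimal sketch of a "needle in a haystack" long-context test.
# `query_model` is hypothetical; substitute your provider's client call.

FILLER = "The quick brown fox jumps over the lazy dog. "  # neutral padding
NEEDLE = ("The best thing to do in San Francisco is to eat a sandwich "
          "in Dolores Park on a sunny day.")
QUESTION = "What is the best thing to do in San Francisco?"

def build_context(total_chars: int, depth: float) -> str:
    """Pad to roughly total_chars and bury the needle at relative depth (0-1)."""
    haystack = FILLER * (total_chars // len(FILLER))
    cut = int(len(haystack) * depth)
    return haystack[:cut] + NEEDLE + haystack[cut:]

def run_test(query_model, lengths=(50_000, 400_000), depths=(0.0, 0.5, 0.9)):
    # Characters are a rough proxy for tokens (about 4 chars per English token).
    for chars in lengths:
        for depth in depths:
            prompt = build_context(chars, depth) + "\n\n" + QUESTION
            answer = query_model(prompt)
            print(f"chars={chars} depth={depth:.0%} "
                  f"recalled={'Dolores Park' in answer}")
```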

Anthropic also touted a 50 percent reduction in hallucination rates for Claude 2.1 over version 2.0. Increased accuracy could put the model in closer competition with GPT-4 in responding precisely to complex factual queries.

Additional new features include an API tool for advanced workflow integration and “system prompts” that allow users to define Claude’s tone, goals, and rules at the outset for more personalised, contextually relevant interactions. For instance, a financial analyst could direct Claude to adopt industry terminology when summarising reports.
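
As a concrete illustration of the system-prompt feature, here is a minimal sketch using Anthropic’s Python SDK and its Messages interface. The exact parameter names post-date the API surface available at Claude 2.1’s launch, so treat them as an assumption rather than the interface described in this article.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-2.1",
    max_tokens=500,
    # The system prompt fixes tone, goals, and rules for the whole exchange.
    system=("You are a financial analyst's assistant. Use standard industry "
            "terminology and keep every summary under 200 words."),
    messages=[
        {"role": "user", "content": "Summarise this earnings report: ..."},
    ],
)
print(response.content[0].text)
```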

However, the full 200K token capacity remains exclusive to paying Claude Pro subscribers for now. Free users will continue to be limited to Claude 2.0’s 100K tokens.

As the AI landscape shifts, Claude 2.1’s enhanced precision and adaptability promise to be a game changer—presenting new options for businesses exploring how to strategically leverage AI capabilities.

With its substantial context expansion and rigorous accuracy improvements, Anthropic’s latest offering signals its determination to compete head-to-head with leading models like GPT-4.

(Image Credit: Anthropic)

See also: Paul O’Sullivan, Salesforce: Transforming work in the GenAI era

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Anthropic upsizes Claude 2.1 to 200K tokens, nearly doubling GPT-4 appeared first on AI News.

Paul O’Sullivan, Salesforce: Transforming work in the GenAI era

By: Ryan Daws
21 November 2023 at 10:20

In the wake of the generative AI (GenAI) revolution, UK businesses find themselves at a crossroads between unprecedented opportunities and inherent challenges.

Paul O’Sullivan, Senior Vice President of Solution Engineering (UKI) at Salesforce, sheds light on the complexities of this transformative landscape, urging businesses to tread cautiously while embracing the potential of artificial intelligence.

Unprecedented opportunities

Generative AI has stormed the scene with remarkable speed. ChatGPT, for example, amassed 100 million users in a mere two months.

“If you put that into context, it took 10 years to reach 100 million users on Netflix,” says O’Sullivan.

This rapid adoption signals a seismic shift, promising substantial economic growth. O’Sullivan estimates that generative AI has the potential to contribute a staggering £3.5 trillion ($4.4 trillion) to the global economy.

“Again, if you put that into context, that’s about as much tax as the entire US takes in,” adds O’Sullivan.

One of its key advantages lies in driving automation, with the prospect of automating up to 40 percent of the average workday—leading to significant productivity gains for businesses.

The AI trust gap

However, amid the excitement, there looms a significant challenge: the AI trust gap. 

O’Sullivan acknowledges that despite being a top priority for C-suite executives, over half of customers remain sceptical about the safety and security of AI applications.

Addressing this gap will require a multi-faceted approach including grappling with issues related to data quality and ensuring that AI systems are built on reliable, unbiased, and representative datasets. 

“Companies have struggled with data quality and data hygiene. So that’s a key area of focus,” explains O’Sullivan.

Safeguarding data privacy is also paramount, with stringent measures needed to prevent the misuse of sensitive customer information.

“Both customers and businesses are worried about data privacy—we can’t let large language models store and learn from sensitive customer data,” says O’Sullivan. “Over half of customers and their customers don’t believe AI is safe and secure today.”

Ethical considerations

AI also prompts ethical considerations. Concerns about hallucinations – where AI systems generate inaccurate or misleading information – must be addressed meticulously.

Businesses must confront biases and toxicities embedded in AI algorithms, ensuring fairness and inclusivity. Striking a balance between innovation and ethical responsibility is pivotal to gaining customer trust.

“A trustworthy AI should consistently meet expectations, adhere to commitments, and create a sense of dependability within the organisation,” explains O’Sullivan. “It’s crucial to address the limitations and the potential risks. We’ve got to be open here and lead with integrity.”

As businesses embrace AI, upskilling the workforce will also be imperative.

O’Sullivan advocates for a proactive approach, encouraging employees to master the art of prompt writing. Crafting effective prompts is vital, enabling faster and more accurate interactions with AI systems and enhancing productivity across various tasks.

Moreover, understanding AI lingo is essential to foster open conversations and enable informed decision-making within organisations.

A collaborative future

Crucially, O’Sullivan emphasises a collaborative future where AI serves as a co-pilot rather than a replacement for human expertise.

“AI, for now, lacks cognitive capability like empathy, reasoning, emotional intelligence, and ethics—and these are absolutely critical business skills that humans need to bring to the table,” says O’Sullivan.

This collaboration fosters a sense of trust, as humans act as a check and balance to ensure the responsible use of AI technology.

By addressing the AI trust gap, upskilling the workforce, and fostering a harmonious collaboration between humans and AI, businesses can harness the full potential of generative AI while building trust and confidence among customers.

You can watch our full interview with Paul O’Sullivan below:

Paul O’Sullivan and the Salesforce team will be sharing their invaluable insights at this year’s AI & Big Data Expo Global. O’Sullivan will feature on a day one panel titled ‘Converging Technologies – We Work Better Together’.

The post Paul O’Sullivan, Salesforce: Transforming work in the GenAI era appeared first on AI News.

Microsoft recruits former OpenAI CEO Sam Altman and Co-Founder Greg Brockman

By: Ryan Daws
20 November 2023 at 13:44

AI experts don’t stay jobless for long, as evidenced by Microsoft’s quick recruitment of former OpenAI CEO Sam Altman and Co-Founder Greg Brockman.

Altman, who was recently ousted by OpenAI’s board for reasons that have had no shortage of speculation, has found a new home at Microsoft. The announcement came after unsuccessful negotiations with OpenAI’s board to reinstate Altman.

I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.

— Ilya Sutskever (@ilyasut) November 20, 2023

Microsoft CEO Satya Nadella – who has long expressed confidence in Altman’s vision and leadership – revealed that Altman and Brockman will lead Microsoft’s newly established advanced AI research team.

Nadella expressed excitement about the collaboration, stating, “We’re extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team. We look forward to moving quickly to provide them with the resources needed for their success.”

I’m super excited to have you join as CEO of this new group, Sam, setting a new pace for innovation. We’ve learned a lot over the years about how to give founders and innovators space to build independent identities and cultures within Microsoft, including GitHub, Mojang Studios,…

— Satya Nadella (@satyanadella) November 20, 2023

The move follows Altman’s abrupt departure from OpenAI. Former Twitch CEO Emmett Shear has been appointed as interim CEO at OpenAI.

Today I got a call inviting me to consider a once-in-a-lifetime opportunity: to become the interim CEO of @OpenAI. After consulting with my family and reflecting on it for just a few hours, I accepted. I had recently resigned from my role as CEO of Twitch due to the birth of my…

— Emmett Shear (@eshear) November 20, 2023

Altman’s role at Microsoft is anticipated to build on the company’s strategy of allowing founders and innovators space to create independent identities, similar to Microsoft’s approach with GitHub, Mojang Studios, and LinkedIn.

Microsoft’s decision to bring Altman and Brockman on board coincides with the development of its custom AI chip. The Maia AI chip, designed to train large language models, aims to reduce dependence on Nvidia.

While Microsoft reassures its commitment to the OpenAI partnership, valued at approximately $10 billion, it emphasises ongoing innovation and support for customers and partners.

As Altman and Brockman embark on leading Microsoft’s advanced AI research team, the industry will be watching closely to see what the high-profile figures can do with Microsoft’s resources at their disposal. The industry will also be observing whether OpenAI can maintain its success under different leadership.

(Photo by Turag Photography on Unsplash)

See also: Amdocs, NVIDIA and Microsoft Azure build custom LLMs for telcos

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Microsoft recruits former OpenAI CEO Sam Altman and Co-Founder Greg Brockman appeared first on AI News.

Umbar Shakir, Gate One: Unlocking the power of generative AI ethically

By: Ryan Daws
17 November 2023 at 08:54

Ahead of this year’s AI & Big Data Expo Global, Umbar Shakir, Partner and AI Lead at Gate One, shared her insights into the diverse landscape of generative AI (GenAI) and its impact on businesses.

From addressing the spectrum of use cases to navigating digital transformation, Shakir shed light on the challenges, ethical considerations, and the promising future of this groundbreaking technology.

Wide spectrum of use cases

Shakir highlighted the wide array of GenAI applications, ranging from productivity enhancements and research support to high-stakes areas such as strategic data mining and knowledge bots. She emphasised the transformational power of AI in understanding customer data, moving beyond simple sentiment analysis to providing actionable insights, thus elevating customer engagement strategies.

“GenAI now can take your customer insights to another level. It doesn’t just tell you whether something’s a positive or negative sentiment like old AI would do, it now says it’s positive or negative. It’s negative because X, Y, Z, and here’s the root cause for X, Y, Z,” explains Shakir.

Powering digital transformation

Gate One adopts an adaptive strategy approach, abandoning traditional five-year strategies for more agile, adaptable frameworks.

“We have a framework – our 5P model – where it’s: identify your people, identify the problem statement that you’re trying to solve for, appoint some partnerships, think about what’s the right capability mix that you have, think about the pathway through which you’re going to deliver, be use case or risk-led, and then proof of concept,” says Shakir.

By solving specific challenges and aligning strategies with business objectives, Gate One aims to drive meaningful digital transformation for its clients.

Assessing client readiness

Shakir discussed Gate One’s diagnostic tools, which blend technology maturity and operating model innovation questions to assess a client’s readiness to adopt GenAI successfully.

“We have a proprietary tool that we’ve built, a diagnostic tool where we look at blending tech maturity capability type questions with operating model innovation questions,” explains Shakir.

By categorising clients as “vanguard” or “safe” players, Gate One tailors their approach to meet individual readiness levels—ensuring a seamless integration of GenAI into the client’s operations.

Key challenges and ethical considerations

Shakir acknowledged the challenges associated with GenAI, especially concerning the quality of model outputs. She stressed the importance of addressing biases, amplifications, and ethical concerns, calling for a more meaningful and sustainable implementation of AI.

“Poor quality data or poorly trained models can create biases, racism, sexism… those are the things that worry me about the technology,” says Shakir.

Gate One is actively working on refining models and data inputs to mitigate such problems.

The future of GenAI

Looking ahead, Shakir predicted a demand for more ethical AI practices from consumers and increased pressure on developers to create representative and unbiased models.

Shakir also envisioned a shift in work dynamics where AI liberates humans from mundane tasks to allow them to focus on solving significant global challenges, particularly in the realm of sustainability.

Later this month, Gate One will be attending and sponsoring this year’s AI & Big Data Expo Global. During the event, Gate One aims to share its ethos of meaningful AI and emphasise ethical and sustainable approaches.

Gate One will also be sharing with attendees GenAI’s impact on marketing and experience design, offering valuable insights into the changing landscape of customer interactions and brand experiences.

As businesses navigate the evolving landscape of GenAI, Gate One stands at the forefront, advocating for responsible, ethical, and sustainable practices and ensuring a brighter, more impactful future for businesses and society.

Umbar Shakir and the Gate One team will be sharing their invaluable insights at this year’s AI & Big Data Expo Global. Find out more about Umbar Shakir’s day one keynote presentation here.

The post Umbar Shakir, Gate One: Unlocking the power of generative AI ethically appeared first on AI News.

Amdocs, NVIDIA and Microsoft Azure build custom LLMs for telcos

By: Ryan Daws
16 November 2023 at 12:09

Amdocs has partnered with NVIDIA and Microsoft Azure to build custom Large Language Models (LLMs) for the $1.7 trillion global telecoms industry.

Leveraging the power of NVIDIA’s AI foundry service on Microsoft Azure, Amdocs aims to meet the escalating demand for data processing and analysis in the telecoms sector.

The telecoms industry processes hundreds of petabytes of data daily. With the anticipation of global data transactions surpassing 180 zettabytes by 2025, telcos are turning to generative AI to enhance efficiency and productivity.

NVIDIA’s AI foundry service – comprising the NVIDIA AI Foundation Models, NeMo framework, and DGX Cloud AI supercomputing – provides an end-to-end solution for creating and optimising custom generative AI models.

Amdocs will utilise the AI foundry service to develop enterprise-grade LLMs tailored for the telco and media industries, facilitating the deployment of generative AI use cases across various business domains.

This collaboration builds on the existing Amdocs-Microsoft partnership, ensuring the adoption of applications in secure, trusted environments, both on-premises and in the cloud.

Enterprises are increasingly focusing on developing custom models to perform industry-specific tasks. Amdocs serves over 350 of the world’s leading telecom and media companies across 90 countries. This partnership with NVIDIA opens avenues for exploring generative AI use cases, with initial applications focusing on customer care and network operations.

In customer care, the collaboration aims to accelerate the resolution of inquiries by leveraging information from across company data. In network operations, the companies are exploring solutions to address configuration, coverage, or performance issues in real-time.

This move by Amdocs positions the company at the forefront of ushering in a new era for the telecoms industry by harnessing the capabilities of custom generative AI models.

(Photo by Danist Soh on Unsplash)

See also: Wolfram Research: Injecting reliability into generative AI

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Amdocs, NVIDIA and Microsoft Azure build custom LLMs for telcos appeared first on AI News.

Kinetic Consulting launches Macky AI – the first AI business consulting platform available to any business

16 November 2023 at 12:01

Kinetic Consulting, the leading boutique consulting company providing business growth consultancy, has released macky.ai, the first AI business consulting platform that offers any organisation an easy, non-prompt-based AI consulting solution for up to 55 different business categories. The platform is powered by OpenAI’s artificial intelligence technology.

What is Macky AI?

The Macky AI platform overcomes some of the key hurdles preventing the mass adoption of AI in a business environment. No training is required for employees to begin taking advantage of the consulting platform, and no knowledge is required of how to prompt the AI to get the right output or judge whether the output is suitable. That hard work has already been done by the software’s creator, Kinetic Consulting. Platform users are asked a maximum of three questions to generate the desired output.

The creators of the Macky AI software have curated the everyday requirements of key departments in a business and determined what suitable output a generative AI solution can produce for each. An example may be something as simple as generating a job description for a new employee, or something more complex, such as creating a new business process or reengineering an existing one. Tasks like these are typically handed to consultants. Macky AI aims to reduce the cost of everyday consulting needs so that companies can empower their employees to complete these tasks without engaging costly consultants.

By freeing up the costs paid for these lower-level activities, companies can now divert effort and funds to develop higher-value business initiatives, such as business roadmaps and growth strategy plans. These higher value and more complex business requirements will remain better suited for traditional consulting. The Macky AI platform is unique because it also provides its users with traditional consultants for more complex needs. The ability for an organisation to have the best of both worlds, all on one platform, is made possible on Macky AI. The future of consulting will be the augmentation of AI and human consultants.

Macky AI provides new consulting options for SMEs

A 2023 OECD report[1] on the outlook for SMEs in OECD countries highlights that the majority are currently operating in highly challenging environments. The report cites the COVID-19 pandemic, rising geopolitical tensions, high inflation, tighter monetary and fiscal policy, and supply-chain disruptions as major pressures on SMEs. Retaining and attracting staff has also become a major issue: according to the Future of Business Survey, it was the second most pressing challenge faced by SMEs older than two years in the first quarter of 2022[2]. Many SMEs also depleted their cash reserves during the pandemic and now find it difficult to raise the capital needed to cover the rising costs of goods and services and to fund digital transformation projects.

Outside of the OECD, a thriving SME ecosystem is even more critical. In the Gulf region, SMEs contribute even more to the economy than their counterparts in OECD countries. Within the UAE, for example, SMEs represent 94% of the companies and institutions operating in the country, contribute more than 50% to the country’s GDP, and account for 86% of the private sector’s workforce. Across the rest of the GCC, SMEs employ 80% of the workforce in Saudi Arabia, 43% in Oman, 57% in Bahrain, 23% in Kuwait, and 20% in Qatar.

The importance of having healthy and thriving SMEs is recognised as the primary pillar of strength for any economy. The challenging environment and rising cost of capital make it difficult for SMEs to afford traditional consulting. Ironically, this is the time when consulting is most needed to help SMEs navigate, transform, and thrive again. Macky AI gives SMEs affordable access to consulting services using artificial intelligence. The AI business consulting platform provides an on-demand service for key business challenges, such as analysing a profit and loss statement to identify cost savings and developing a 12-month marketing plan to increase sales.

The future of consulting

Business consulting, like most industries, is undergoing a period of disruption. Technological advances such as artificial intelligence are accelerating how consulting will be delivered. Critics of the technology may argue that because AI is not 100% accurate in its outputs and is prone to error, it should not be used. This argument is fundamentally flawed because even human-based consulting is prone to errors. All outputs, whether delivered by human or AI consultants, should be checked for accuracy. Generative AI technology has now advanced to the point where it is highly useful in business and education environments.

AI technologies should be embraced rather than resisted if they are fit for purpose. Macky AI is designed to be specifically for business-related needs, and even in the open question section of the platform, the AI has been programmed not to answer questions that are not business-related. The objective of restricting it for business purposes only is to ensure that if employers give it to their employees, it will not be used for personal needs.

“As advancements in AI evolve, we need to accept that it will become a natural part of how we interact with things, get answers to our questions, and help solve complex problems. The future of consulting will be an augmentation between AI and human consultants. This is the inevitable evolutionary path. The percentages of AI usage versus human is unknown at this stage. However, I am 100% confident it will not be all traditional human consulting for much longer. Macky AI is the first step towards bringing AI into the workplace in a controlled environment for a specific business purpose. By empowering SMEs with affordable consulting outputs for business tasks, we are also helping SMEs overcome everyday business challenges and thrive in the future. Macky AI is designed to democratise consulting, making it accessible to all organisations regardless of size.” said Joe Tawfik, founder of Macky AI.

[1] OECD (2023), OECD SME and Entrepreneurship Outlook 2023, OECD Publishing, Paris, https://doi.org/10.1787/342b8564-en.

[2] OECD-World Bank-Meta Future of Business Survey, Data for Good, (March 2022).

(Editor’s note: This article is sponsored by Kinetic Consulting)

The post Kinetic Consulting launches Macky AI – the first AI business consulting platform available to any business appeared first on AI News.

Wolfram Research: Injecting reliability into generative AI

15 November 2023 at 10:30

The hype surrounding generative AI and the potential of large language models (LLMs), spearheaded by OpenAI’s ChatGPT, appeared at one stage to be practically insurmountable. It was certainly inescapable. More than one in four dollars invested in US startups this year went to an AI-related company, while OpenAI revealed at its recent developer conference that ChatGPT continues to be one of the fastest-growing services of all time.

Yet something continues to be amiss. Or rather, something amiss continues to be added in.

One of the biggest issues with LLMs is their tendency to hallucinate. In other words, they make things up. Figures vary, but one frequently cited rate is 15%-20%. One Google system notched up 27%. This would not be so bad if the models did not come across so assertively while doing so. Jon McLoone, Director of Technical Communication and Strategy at Wolfram Research, likens it to the ‘loudmouth know-it-all you meet in the pub.’ “He’ll say anything that will make him seem clever,” McLoone tells AI News. “It doesn’t have to be right.”

The truth is, however, that such hallucinations are an inevitability when dealing with LLMs. As McLoone explains, it is all a question of purpose. “I think one of the things people forget, in this idea of the ‘thinking machine’, is that all of these tools are designed with a purpose in mind, and the machinery executes on that purpose,” says McLoone. “And the purpose was not to know the facts.

“The purpose that drove its creation was to be fluid; to say the kinds of things that you would expect a human to say; to be plausible,” McLoone adds. “Saying the right answer, saying the truth, is a very plausible thing, but it’s not a requirement of plausibility.

“So you get these fun things where you can say ‘explain why zebras like to eat cacti’ – and it’s doing its plausibility job,” says McLoone. “It says the kinds of things that might sound right, but of course it’s all nonsense, because it’s just being asked to sound plausible.”

What is needed, therefore, is a kind of intermediary which is able to inject a little objectivity into proceedings – and this is where Wolfram comes in. In March, the company released a ChatGPT plugin, which aims to ‘make ChatGPT smarter by giving it access to powerful computation, accurate math[s], curated knowledge, real-time data and visualisation’. Alongside being a general extension to ChatGPT, the Wolfram plugin can also synthesise code.

“It teaches the LLM to recognise the kinds of things that Wolfram|Alpha might know – our knowledge engine,” McLoone explains. “Our approach on that is completely different. We don’t scrape the web. We have human curators who give the data meaning and structure, and we lay computation on that to synthesise new knowledge, so you can ask questions of data. We’ve got a few thousand data sets built into that.”

Wolfram has always been on the side of computational technology, with McLoone, who describes himself as a ‘lifelong computation person’, having been with the company for almost 32 years of its 36-year history. When it comes to AI, Wolfram therefore sits on the symbolic side of the fence, which suits logical reasoning use cases, rather than statistical AI, which suits pattern recognition and object classification.

The two systems appear directly opposed, but with more commonality than you may think. “Where I see it, [approaches to AI] all share something in common, which is all about using the machinery of computation to automate knowledge,” says McLoone. “What’s changed over that time is the concept of at what level you’re automating knowledge.

“The good old fashioned AI world of computation is humans coming up with the rules of behaviour, and then the machine is automating the execution of those rules,” adds McLoone. “So in the same way that the stick extends the caveman’s reach, the computer extends the brain’s ability to do these things, but we’re still solving the problem beforehand.

“With generative AI, it’s no longer saying ‘let’s focus on a problem and discover the rules of the problem.’ We’re now starting to say, ‘let’s just discover the rules for the world’, and then you’ve got a model that you can try and apply to different problems rather than specific ones.

“So as the automation has gone higher up the intellectual spectrum, the things have become more general, but in the end, it’s all just executing rules,” says McLoone.

What’s more, as the differing approaches to AI share a common goal, so do the companies on either side. As OpenAI was building out its plugin architecture, Wolfram was asked to be one of the first providers. “As the LLM revolution started, we started doing a bunch of analysis on what they were really capable of,” explains McLoone. “And then, as we came to this understanding of what the strengths or weaknesses were, it was about that point that OpenAI were starting to work on their plugin architecture.

“They approached us early on, because they had a little bit longer to think about this than us, since they’d seen it coming for two years,” McLoone adds. “They understood exactly this issue themselves already.”

McLoone will be demonstrating the plugin with examples at the upcoming AI & Big Data Expo Global event in London on November 30-December 1, where he is speaking. Yet he is keen to stress that there are more varied use cases out there which can benefit from the combination of ChatGPT’s mastery of unstructured language and Wolfram’s mastery of computational mathematics.

One such example is performing data science on unstructured GP medical records. This ranges from correcting peculiar transcriptions on the LLM side – replacing ‘peacemaker’ with ‘pacemaker’ as one example – to using old-fashioned computation and looking for correlations within the data. “We’re focused on chat, because it’s the most amazing thing at the moment that we can talk to a computer. But the LLM is not just about chat,” says McLoone. “They’re really great with unstructured data.”

How does McLoone see LLMs developing in the coming years? There will be various incremental improvements, and training best practices will see better results, not to mention potentially greater speed with hardware acceleration. “Where the big money goes, the architectures follow,” McLoone notes. A sea-change on the scale of the last 12 months, however, can likely be ruled out. Partly because of crippling compute costs, but also because we may have peaked in terms of training sets. If copyright rulings go against LLM providers, then training sets will shrink going forward.

The reliability problem for LLMs, however, will be forefront in McLoone’s presentation. “Things that are computational are where it’s absolutely at its weakest, it can’t really follow rules beyond really basic things,” he explains. “For anything where you’re synthesising new knowledge, or computing with data-oriented things as opposed to story-oriented things, computation really is the way still to do that.”

Yet while responses may vary – one has to account for ChatGPT’s degree of randomness after all – the combination seems to be working, so long as you give the LLM strong instructions. “I don’t know if I’ve ever seen [an LLM] actually override a fact I’ve given it,” says McLoone. “When you’re putting it in charge of the plugin, it often thinks ‘I don’t think I’ll bother calling Wolfram for this, I know the answer’, and it will make something up.

“So if it’s in charge you have to give really strong prompt engineering,” he adds. “Say ‘always use the tool if it’s anything to do with this, don’t try and go it alone’. But when it’s the other way around – when computation generates the knowledge and injects it into the LLM – I’ve never seen it ignore the facts.

“It’s just like the loudmouth guy at the pub – if you whisper the facts in his ear, he’ll happily take credit for them.”
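
McLoone’s point about ‘really strong prompt engineering’ is easy to picture with a function-calling setup. The sketch below uses OpenAI’s chat completions API; `wolfram_query` is a hypothetical wrapper standing in for the actual Wolfram plugin, which is wired up differently.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical tool definition; the real Wolfram plugin works differently.
tools = [{
    "type": "function",
    "function": {
        "name": "wolfram_query",
        "description": "Evaluate mathematical or data-oriented questions.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # The strong instruction McLoone describes: never 'go it alone'.
        {"role": "system", "content": (
            "Always use the wolfram_query tool for anything mathematical or "
            "data-oriented. Do not attempt the computation yourself.")},
        {"role": "user", "content": "What is the population density of Norway?"},
    ],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message)
```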

Wolfram will be at AI & Big Data Expo Global. Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Wolfram Research: Injecting reliability into generative AI appeared first on AI News.

DHS AI roadmap prioritises cybersecurity and national safety

By: Ryan Daws
15 November 2023 at 10:10

The Department of Homeland Security’s (DHS) Cybersecurity and Infrastructure Security Agency (CISA) has launched its inaugural Roadmap for AI.

Viewed as a crucial step in the broader governmental effort to ensure the secure development and implementation of AI capabilities, the move aligns with President Biden’s recent Executive Order.

“DHS has a broad leadership role in advancing the responsible use of AI and this cybersecurity roadmap is one important element of our work,” said Secretary of Homeland Security Alejandro N. Mayorkas.

“The Biden-Harris Administration is committed to building a secure and resilient digital ecosystem that promotes innovation and technological progress.” 

Following the Executive Order, DHS is mandated to globally promote AI safety standards, safeguard US networks and critical infrastructure, and address risks associated with AI—including potential use “to create weapons of mass destruction”.

“In last month’s Executive Order, the President called on DHS to promote the adoption of AI safety standards globally and help ensure the safe, secure, and responsible use and development of AI,” added Mayorkas.

“CISA’s roadmap lays out the steps that the agency will take as part of our Department’s broader efforts to both leverage AI and mitigate its risks to our critical infrastructure and cyber defenses.”

CISA’s roadmap outlines five strategic lines of effort, providing a blueprint for concrete initiatives and a responsible approach to integrating AI into cybersecurity.

CISA Director Jen Easterly highlighted the dual nature of AI, acknowledging its promise in enhancing cybersecurity while acknowledging the immense risks it poses.

“Artificial Intelligence holds immense promise in enhancing our nation’s cybersecurity, but as the most powerful technology of our lifetimes, it also presents enormous risks,” commented Easterly.

“Our Roadmap for AI – focused at the nexus of AI, cyber defense, and critical infrastructure – sets forth an agency-wide plan to promote the beneficial uses of AI to enhance cybersecurity capabilities; ensure AI systems are protected from cyber-based threats; and deter the malicious use of AI capabilities to threaten the critical infrastructure Americans rely on every day.”

The outlined lines of effort are as follows:

  • Responsibly use AI to support our mission: CISA commits to using AI-enabled tools ethically and responsibly to strengthen cyber defense and support its critical infrastructure mission. The adoption of AI will align with constitutional principles and all relevant laws and policies.
  • Assess and Assure AI systems: CISA will assess and assist in secure AI-based software adoption across various stakeholders, establishing assurance through best practices and guidance for secure and resilient AI development.
  • Protect critical infrastructure from malicious use of AI: CISA will evaluate and recommend mitigation of AI threats to critical infrastructure, collaborating with government agencies and industry partners. The establishment of JCDC.AI aims to facilitate focused collaboration on AI-related threats.
  • Collaborate and communicate on key AI efforts: CISA commits to contributing to interagency efforts, supporting policy approaches for the US government’s national strategy on cybersecurity and AI, and coordinating with international partners to advance global AI security practices.
  • Expand AI expertise in our workforce: CISA will educate its workforce on AI systems and techniques, actively recruiting individuals with AI expertise and ensuring a comprehensive understanding of the legal, ethical, and policy aspects of AI-based software systems.

“This is a step in the right direction. It shows the government is taking the potential threats and benefits of AI seriously. The roadmap outlines a comprehensive strategy for leveraging AI to enhance cybersecurity, protect critical infrastructure, and foster collaboration. It also emphasises the importance of security in AI system design and development,” explains Joseph Thacker, AI and security researcher at AppOmni.

“The roadmap is pretty comprehensive. Nothing stands out as missing initially, although the devil is in the details when it comes to security, and even more so when it comes to a completely new technology. CISA’s ability to keep up may depend on their ability to get talent or train internal folks. Both of those are difficult to accomplish at scale.”

CISA invites stakeholders, partners, and the public to explore the Roadmap for Artificial Intelligence and gain insights into the strategic vision for AI technology and cybersecurity here.

See also: Google expands partnership with Anthropic to enhance AI safety

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post DHS AI roadmap prioritises cybersecurity and national safety appeared first on AI News.

What Is Retrieval-Augmented Generation?

15 November 2023 at 16:00

To understand the latest advance in generative AI, imagine a courtroom.

Judges hear and decide cases based on their general understanding of the law. Sometimes a case — like a malpractice suit or a labor dispute —  requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.

Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.

The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.

The Story of the Name

Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.


“We definitely would have put more thought into the name had we known our work would become so widespread,” Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.

“We always planned to have a nicer sounding name, but when it came time to write the paper, no one had a better idea,” said Lewis, who now leads a RAG team at AI startup Cohere.

So, What Is Retrieval-Augmented Generation?

Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.

In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM’s parameters essentially represent the general patterns of how humans use words to form sentences.

That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.

Combining Internal, External Resources

Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.

The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG “a general-purpose fine-tuning recipe” because it can be used by nearly any LLM to connect with practically any external resource.

Building User Trust

Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.

What’s more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.

Another great advantage of RAG is that it’s relatively easy to implement. A blog by Lewis and three of the paper’s coauthors said developers can implement the process with as few as five lines of code.

That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
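
The ‘five lines’ figure refers to the Hugging Face integration Lewis and his coauthors described. Here is a minimal sketch along those lines, assuming the transformers library’s RAG classes and its demo checkpoint (the retriever additionally needs the datasets and faiss packages installed):

```python
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
# use_dummy_dataset swaps the full Wikipedia index for a tiny demo index.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

inputs = tokenizer("How many people live in Paris?", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```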

How People Are Using Retrieval-Augmented Generation 

With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.

For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.

In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.

The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.

Getting Started With Retrieval-Augmented Generation 

To help users get started, NVIDIA developed a reference architecture for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.

The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.

The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.

Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal — it can deliver a 150x speedup over using a CPU.

Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.

RAG doesn’t require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.

An example application for RAG on a PC.

PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source – whether that be emails, notes or articles – to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.

A recent blog provides an example of RAG accelerated by TensorRT-LLM for Windows to get better results fast.

The History of Retrieval-Augmented Generation 

The roots of the technique go back at least to the early 1970s. That’s when researchers in information retrieval prototyped what they called question-answering systems, apps that use natural language processing (NLP) to access text, initially in narrow topics such as baseball.

The concepts behind this kind of text mining have remained fairly constant over the years. But the machine learning engines driving them have grown significantly, increasing their usefulness and popularity.

In the mid-1990s, the Ask Jeeves service, now Ask.com, popularized question answering with its mascot of a well-dressed valet. IBM’s Watson became a TV celebrity in 2011 when it handily beat two human champions on the Jeopardy! game show.


Today, LLMs are taking question-answering systems to a whole new level.

Insights From a London Lab

The seminal 2020 paper arrived as Lewis was pursuing a doctorate in NLP at University College London and working for Meta at a new London AI lab. The team was searching for ways to pack more knowledge into an LLM’s parameters and using a benchmark it developed to measure its progress.

Building on earlier methods and inspired by a paper from Google researchers, the group “had this compelling vision of a trained system that had a retrieval index in the middle of it, so it could learn and generate any text output you wanted,” Lewis recalled.

The IBM Watson question-answering system became a celebrity when it won big on the TV game show Jeopardy!

When Lewis plugged a promising retrieval system from another Meta team into the work in progress, the first results were unexpectedly impressive.

“I showed my supervisor and he said, ‘Whoa, take the win. This sort of thing doesn’t happen very often,’ because these workflows can be hard to set up correctly the first time,” he said.

Lewis also credits major contributions from team members Ethan Perez and Douwe Kiela, then of New York University and Facebook AI Research, respectively.

When complete, the work, which ran on a cluster of NVIDIA GPUs, showed how to make generative AI models more authoritative and trustworthy. It’s since been cited by hundreds of papers that amplified and extended the concepts in what continues to be an active area of research.

How Retrieval-Augmented Generation Works

At a high level, here’s how an NVIDIA technical brief describes the RAG process.

When users ask an LLM a question, the AI model sends the query to another model that converts it into a numeric format so machines can read it. The numeric version of the query is sometimes called an embedding or a vector.

Retrieval-augmented generation combines LLMs with embedding models and vector databases.

The embedding model then compares these numeric values to vectors in a machine-readable index of an available knowledge base. When it finds a match or multiple matches, it retrieves the related data, converts it to human-readable words and passes it back to the LLM.

Finally, the LLM combines the retrieved words and its own response to the query into a final answer it presents to the user, potentially citing sources the embedding model found.
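
For illustration, here is a minimal sketch of that loop in Python. It assumes the open-source sentence-transformers and FAISS libraries stand in for the embedding model and vector index; the documents, query and final LLM call are placeholders, not part of NVIDIA's reference design.

    import faiss
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model

    # A toy knowledge base: each document becomes a vector in the index.
    docs = ["The GH200 Superchip pairs a Grace CPU with a Hopper GPU.",
            "TensorRT-LLM accelerates LLM inference on NVIDIA GPUs."]
    doc_vecs = embedder.encode(docs).astype("float32")

    index = faiss.IndexFlatL2(doc_vecs.shape[1])  # machine-readable index
    index.add(doc_vecs)

    # Embed the user's question and retrieve the closest matching document.
    query = "What does the GH200 combine?"
    q_vec = embedder.encode([query]).astype("float32")
    _, hits = index.search(q_vec, 1)
    context = docs[hits[0][0]]

    # Augment the prompt with the retrieved text before calling any LLM.
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
    # answer = llm.generate(prompt)  # hypothetical call to the LLM of your choice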

Keeping Sources Current

In the background, the embedding model continuously creates and updates machine-readable indices, sometimes called vector databases, for new and updated knowledge bases as they become available.
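
Continuing the sketch above, keeping a knowledge base current amounts to embedding new material as it arrives and appending it to the index; the helper below is illustrative.

    def refresh_index(index, embedder, new_docs, doc_store):
        """Embed newly arrived documents and append them to the vector index."""
        vecs = embedder.encode(new_docs).astype("float32")
        index.add(vecs)
        doc_store.extend(new_docs)  # keep the raw text alongside its vectors

    refresh_index(index, embedder,
                  ["NeMo Retriever is part of NVIDIA AI Enterprise."], docs)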

A RAG process as described by LangChain.

Many developers find LangChain, an open-source library, particularly useful for chaining together LLMs, embedding models and knowledge bases. NVIDIA uses LangChain in its reference architecture for retrieval-augmented generation.

The LangChain community provides its own description of a RAG process.
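
For illustration only, a few lines using LangChain's retrieval chain (the API as of late 2023; the embedding model, LLM and documents are placeholders) show how the pieces snap together:

    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import OpenAI
    from langchain.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    # Build a vector store from raw text, then wire it to an LLM as a retriever.
    store = FAISS.from_texts(["Grace Hopper pairs a CPU and a GPU."],
                             HuggingFaceEmbeddings())
    qa = RetrievalQA.from_chain_type(llm=OpenAI(),
                                     retriever=store.as_retriever())
    print(qa.run("What does Grace Hopper pair?"))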

Looking forward, the future of generative AI lies in creatively chaining all sorts of LLMs and knowledge bases together to create new kinds of assistants that deliver authoritative results users can verify.

Get hands-on experience with retrieval-augmented generation using an AI chatbot in this NVIDIA LaunchPad lab.


Igniting the Future: TensorRT-LLM Release Accelerates AI Inference Performance, Adds Support for New Models Running on RTX-Powered Windows 11 PCs

15 November 2023 at 16:00

Artificial intelligence on Windows 11 PCs marks a pivotal moment in tech history, revolutionizing experiences for gamers, creators, streamers, office workers, students and even casual PC users.

It offers unprecedented opportunities to enhance productivity for users of the more than 100 million Windows PCs and workstations that are powered by RTX GPUs. And NVIDIA RTX technology is making it even easier for developers to create AI applications to change the way people use computers.

New optimizations, models and resources announced at Microsoft Ignite will help developers deliver new end-user experiences more quickly.

An upcoming update to TensorRT-LLM — open-source software that increases AI inference performance — will add support for new large language models and make demanding AI workloads more accessible on desktops and laptops with RTX GPUs starting at 8GB of VRAM.

TensorRT-LLM for Windows will soon be compatible with OpenAI’s popular Chat API through a new wrapper. This will enable hundreds of developer projects and applications to run locally on a PC with RTX, instead of in the cloud — so users can keep private and proprietary data on Windows 11 PCs.

Developing and maintaining custom generative AI projects takes time and energy. The process can become incredibly complex and time-consuming, especially when collaborating and deploying across multiple environments and platforms.

AI Workbench is a unified, easy-to-use toolkit that allows developers to quickly create, test and customize pretrained generative AI models and LLMs on a PC or workstation. It provides developers a single platform to organize their AI projects and tune models to specific use cases.

This enables seamless collaboration and deployment for developers to create cost-effective, scalable generative AI models quickly. Join the early access list to be among the first to gain access to this growing initiative and to receive future updates.

To support AI developers, NVIDIA and Microsoft will release DirectML enhancements to accelerate one of the most popular foundational AI models, Llama 2. Developers now have more options for cross-vendor deployment, in addition to setting a new standard for performance.

Portable AI

Last month, NVIDIA announced TensorRT-LLM for Windows, a library for accelerating LLM inference.

The next TensorRT-LLM release, v0.6.0 coming later this month, will bring improved inference performance — up to 5x faster — and enable support for additional popular LLMs, including the new Mistral 7B and Nemotron-3 8B. Versions of these LLMs will run on any GeForce RTX 30 Series and 40 Series GPU with 8GB of VRAM or more, making fast, accurate, local LLM capabilities accessible even on some of the most portable Windows devices.

Up to 5x performance with the new TensorRT-LLM v0.6.0.

The new release of TensorRT-LLM will be available for install on the /NVIDIA/TensorRT-LLM GitHub repo. New optimized models will be available on ngc.nvidia.com.

Conversing With Confidence 

Developers and enthusiasts worldwide use OpenAI’s Chat API for a wide range of applications — from summarizing web content and drafting documents and emails to analyzing and visualizing data and creating presentations.

One challenge with such cloud-based AIs is that they require users to upload their input data, making them impractical for private or proprietary data or for working with large datasets.

To address this challenge, NVIDIA will soon enable TensorRT-LLM for Windows to offer an API interface similar to OpenAI’s widely popular Chat API through a new wrapper, giving developers the same workflow whether they are designing models and applications to run locally on a PC with RTX or in the cloud. By changing just one or two lines of code, hundreds of AI-powered developer projects and applications can benefit from fast, local AI. Users can keep their data on their PCs and not worry about uploading datasets to the cloud.
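
In practice, the "one or two lines" would look something like the snippet below, which redirects the standard openai Python client (pre-1.0 API) to a local server. The endpoint URL and model name are hypothetical, since the wrapper had not yet been released at the time of writing.

    import openai

    # Point the client at a local OpenAI-compatible endpoint instead of the cloud.
    openai.api_base = "http://localhost:8000/v1"  # hypothetical local wrapper URL
    openai.api_key = "not-needed-for-local-use"

    response = openai.ChatCompletion.create(
        model="llama-2-13b-chat",  # any LLM optimized for TensorRT-LLM
        messages=[{"role": "user", "content": "Summarize this document for me."}],
    )
    print(response["choices"][0]["message"]["content"])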

Perhaps the best part is that many of these projects and applications are open source, making it easy for developers to leverage and extend their capabilities to fuel the adoption of generative AI on Windows, powered by RTX.

The wrapper will work with any LLM that’s been optimized for TensorRT-LLM (for example, Llama 2, Mistral and NV LLM) and is being released as a reference project on GitHub, alongside other developer resources for working with LLMs on RTX.

Model Acceleration

Developers can now leverage cutting-edge AI models and deploy with a cross-vendor API. As part of an ongoing commitment to empower developers, NVIDIA and Microsoft have been working together to accelerate Llama on RTX via the DirectML API.

Building on the record inference performance for these models announced last month, this new option for cross-vendor deployment makes it easier than ever to bring AI capabilities to the PC.

Developers and enthusiasts can experience the latest optimizations by downloading the latest ONNX runtime and following the installation instructions from Microsoft, and installing the latest driver from NVIDIA, which will be available on Nov. 21.

These new optimizations, models and resources will accelerate the development and deployment of AI features and applications to the 100 million RTX PCs worldwide, joining the more than 400 partners shipping AI-powered apps and games already accelerated by RTX GPUs.

As models become even more accessible and developers bring more generative AI-powered functionality to RTX-powered Windows PCs, RTX GPUs will be critical for enabling users to take advantage of this powerful technology.

New Class of Accelerated, Efficient AI Systems Mark the Next Era of Supercomputing

13 November 2023 at 20:00

NVIDIA today unveiled at SC23 the next wave of technologies that will lift scientific and industrial research centers worldwide to new levels of performance and energy efficiency.

“NVIDIA hardware and software innovations are creating a new class of AI supercomputers,” said Ian Buck, vice president of the company’s high performance computing and hyperscale data center business, in a special address at the conference.

Some of the systems will pack memory-enhanced NVIDIA Hopper accelerators, others the new NVIDIA Grace Hopper system architecture. All will use expanded parallelism to run a full stack of accelerated software for generative AI, HPC and hybrid quantum computing.

Buck described the new NVIDIA HGX H200 as “the world’s leading AI computing platform.”

NVIDIA H200 Tensor Core GPUs pack HBM3e memory to run growing generative AI models.

It packs up to 141GB of HBM3e, making the H200 the first AI accelerator to use the ultrafast memory technology. Running models like GPT-3, NVIDIA H200 Tensor Core GPUs provide an 18x performance increase over prior-generation accelerators.

Among other generative AI benchmarks, they zip through 12,000 tokens per second on a Llama2-13B large language model (LLM).

Buck also revealed a server platform that links four NVIDIA GH200 Grace Hopper Superchips on an NVIDIA NVLink interconnect. The quad configuration puts in a single compute node a whopping 288 Arm Neoverse cores and 16 petaflops of AI performance with up to 2.3 terabytes of high-speed memory.

Server nodes based on four GH200 Superchips will deliver 16 petaflops of AI performance.

Demonstrating its efficiency, one GH200 Superchip using the NVIDIA TensorRT-LLM open-source library is 100x faster than a dual-socket x86 CPU system and nearly 2x more energy efficient than an x86 + H100 GPU server.

“Accelerated computing is sustainable computing,” Buck said. “By harnessing the power of accelerated computing and generative AI, together we can drive innovation across industries while reducing our impact on the environment.”

NVIDIA Powers 38 of 49 New TOP500 Systems

The latest TOP500 list of the world’s fastest supercomputers reflects the shift toward accelerated, energy-efficient supercomputing.

Thanks to new systems powered by NVIDIA H100 Tensor Core GPUs, NVIDIA now delivers more than 2.5 exaflops of HPC performance across these world-leading systems, up from 1.6 exaflops in the May rankings. NVIDIA’s contribution on the top 10 alone reaches nearly an exaflop of HPC and 72 exaflops of AI performance.

The new list contains the highest number of systems ever using NVIDIA technologies, 379 vs. 372 in May, including 38 of 49 new supercomputers on the list.

Microsoft Azure leads the newcomers with its Eagle system using H100 GPUs in NDv5 instances to hit No. 3 with 561 petaflops. MareNostrum 5 in Barcelona ranked No. 8, and NVIDIA Eos — which recently set new AI training records on the MLPerf benchmarks — came in at No. 9.

Showing their energy efficiency, NVIDIA GPUs power 23 of the top 30 systems on the Green500. And they retained the No. 1 spot with the H100 GPU-based Henri system, which delivers 65.09 gigaflops per watt for the Flatiron Institute in New York.

Gen AI Explores COVID

Showing what’s possible, the Argonne National Laboratory used NVIDIA BioNeMo, a generative AI platform for biomolecular LLMs, to develop GenSLMs, a model that can generate gene sequences that closely resemble real-world variants of the coronavirus. Using NVIDIA GPUs and data from 1.5 million COVID genome sequences, it can also rapidly identify new virus variants.

The work won the Gordon Bell special prize last year and was trained on supercomputers, including Argonne’s Polaris system, the U.S. Department of Energy’s Perlmutter and NVIDIA’s Selene.

It’s “just the tip of the iceberg — the future is brimming with possibilities, as generative AI continues to redefine the landscape of scientific exploration,” said Kimberly Powell, vice president of healthcare at NVIDIA, in the special address.

Saving Time, Money and Energy

Using the latest technologies, accelerated workloads can see an order-of-magnitude reduction in system cost and energy used, Buck said.

For example, Siemens teamed with Mercedes to analyze aerodynamics and related acoustics for its new electric EQE vehicles. The simulations that took weeks on CPU clusters ran significantly faster using the latest NVIDIA H100 GPUs. In addition, Hopper GPUs let them reduce costs by 3x and reduce energy consumption by 4x (below).

Chart showing the performance and energy efficiency of H100 GPUs

Switching on 200 Exaflops Beginning Next Year

Scientific and industrial advances will come from every corner of the globe where the latest systems are being deployed.

“We already see a combined 200 exaflops of AI on Grace Hopper supercomputers going into production in 2024,” Buck said.

They include the massive JUPITER supercomputer at Germany’s Jülich center. It can deliver 93 exaflops of performance for AI training and 1 exaflop for HPC applications, while consuming only 18.2 megawatts of power.

Research centers are poised to switch on a tsunami of GH200 performance.

Based on Eviden’s BullSequana XH3000 liquid-cooled system, JUPITER will use the NVIDIA quad GH200 system architecture and NVIDIA Quantum-2 InfiniBand networking for climate and weather predictions, drug discovery, hybrid quantum computing and digital twins. JUPITER quad GH200 nodes will be configured with 864GB of high-speed memory.

It’s one of several new supercomputers using Grace Hopper that NVIDIA announced at SC23.

The HPE Cray EX2500 system from Hewlett Packard Enterprise will use the quad GH200 to power many AI supercomputers coming online next year.

For example, HPE uses the quad GH200 to power OFP-II, an advanced HPC system in Japan shared by the University of Tsukuba and the University of Tokyo, as well as the DeltaAI system, which will triple computing capacity for the U.S. National Center for Supercomputing Applications.

HPE is also building the Venado system for the Los Alamos National Laboratory, the first GH200 to be deployed in the U.S. In addition, HPE is building GH200 supercomputers in the Middle East, Switzerland and the U.K.

Grace Hopper in Texas and Beyond

At the Texas Advanced Computing Center (TACC), Dell Technologies is building the Vista supercomputer with NVIDIA Grace Hopper and Grace CPU Superchips.

More than 100 global enterprises and organizations, including NASA Ames Research Center and Total Energies, have already purchased Grace Hopper early-access systems, Buck said.

They join previously announced GH200 users such as SoftBank and the University of Bristol, as well as the massive Leonardo system with 14,000 NVIDIA A100 GPUs that delivers 10 exaflops of AI performance for Italy’s Cineca consortium.

The View From Supercomputing Centers

Leaders from supercomputing centers around the world shared their plans and work in progress with the latest systems.

“We’ve been collaborating with MeteoSwiss and ECMWF, as well as scientists from ETH EXCLAIM and NVIDIA’s Earth-2 project, to create an infrastructure that will push the envelope in all dimensions of big data analytics and extreme scale computing,” said Thomas Schulthess, director of the Swiss National Supercomputing Centre, of work on the Alps supercomputer.

“There’s really impressive energy-efficiency gains across our stacks,” Dan Stanzione, executive director of TACC, said of Vista.

It’s “really the stepping stone to move users from the kinds of systems we’ve done in the past to looking at this new Grace Arm CPU and Hopper GPU tightly coupled combination and … we’re looking to scale out by probably a factor of 10 or 15 from what we are deploying with Vista when we deploy Horizon in a couple years,” he said.

Accelerating the Quantum Journey

Researchers are also using today’s accelerated systems to pioneer a path to tomorrow’s supercomputers.

In Germany, JUPITER “will revolutionize scientific research across climate, materials, drug discovery and quantum computing,” said Kristel Michelson, who leads Jülich’s research group on quantum information processing.

“JUPITER’s architecture also allows for the seamless integration of quantum algorithms with parallel HPC algorithms, and this is mandatory for effective quantum HPC hybrid simulations,” she said.

CUDA Quantum Drives Progress

The special address also showed how NVIDIA CUDA Quantum — a platform for programming CPUs, GPUs and quantum computers also known as QPUs — is advancing research in quantum computing.

For example, researchers at BASF, the world’s largest chemical company, pioneered a new hybrid quantum-classical method for simulating chemicals that can shield humans against harmful metals. They join researchers at Brookhaven National Laboratory and HPE who are separately pushing the frontiers of science with CUDA Quantum.

NVIDIA also announced a collaboration with Classiq, a developer of quantum programming tools, to create a life sciences research center at the Tel Aviv Sourasky Medical Center, Israel’s largest teaching hospital. The center will use Classiq’s software and CUDA Quantum running on an NVIDIA DGX H100 system.

Separately, Quantum Machines will deploy the first NVIDIA DGX Quantum, a system using Grace Hopper Superchips, at the Israel National Quantum Center that aims to drive advances across scientific fields. The DGX system will be connected to a superconducting QPU by Quantware and a photonic QPU from ORCA Computing, both powered by CUDA Quantum.

Logos of NVIDIA CUDA Quantum partners

“In just two years, our NVIDIA quantum computing platform has amassed over 120 partners [above], a testament to its open, innovative platform,” Buck said.

Overall, the work across many fields of discovery reveals a new trend that combines accelerated computing at data center scale with NVIDIA’s full-stack innovation.

“Accelerated computing is paving the path for sustainable computing with advancements that provide not just amazing technology but a more sustainable and impactful future,” he concluded.

Watch NVIDIA’s SC23 special address below.



Wolfram Research: Injecting reliability into generative AI

15 November 2023 at 10:40

The hype surrounding generative AI and the potential of large language models (LLMs), spearheaded by OpenAI’s ChatGPT, appeared at one stage to be practically insurmountable. It was certainly inescapable. More than one in four dollars invested in US startups this year went to an AI-related company, while OpenAI revealed at its recent developer conference that ChatGPT continues to be one of the fastest-growing services of all time.

Yet something continues to be amiss. Or rather, something amiss continues to be added in.

One of the biggest issues with LLMs is their tendency to hallucinate. In other words, they make things up. Figures vary, but one frequently cited rate is 15%-20%, and one Google system notched up 27%. This would not be so bad if LLMs did not come across so assertively while doing so. Jon McLoone, Director of Technical Communication and Strategy at Wolfram Research, likens it to the ‘loudmouth know-it-all you meet in the pub.’ “He’ll say anything that will make him seem clever,” McLoone tells AI News. “It doesn’t have to be right.”

The truth is, however, that such hallucinations are an inevitability when dealing with LLMs. As McLoone explains, it is all a question of purpose. “I think one of the things people forget, in this idea of the ‘thinking machine’, is that all of these tools are designed with a purpose in mind, and the machinery executes on that purpose,” says McLoone. “And the purpose was not to know the facts.

“The purpose that drove its creation was to be fluid; to say the kinds of things that you would expect a human to say; to be plausible,” McLoone adds. “Saying the right answer, saying the truth, is a very plausible thing, but it’s not a requirement of plausibility.

“So you get these fun things where you can say ‘explain why zebras like to eat cacti’ – and it’s doing its plausibility job,” says McLoone. “It says the kinds of things that might sound right, but of course it’s all nonsense, because it’s just being asked to sound plausible.”

What is needed, therefore, is a kind of intermediary which is able to inject a little objectivity into proceedings – and this is where Wolfram comes in. In March, the company released a ChatGPT plugin, which aims to ‘make ChatGPT smarter by giving it access to powerful computation, accurate math[s], curated knowledge, real-time data and visualisation’. Alongside being a general extension to ChatGPT, the Wolfram plugin can also synthesise code.

“It teaches the LLM to recognise the kinds of things that Wolfram|Alpha might know – our knowledge engine,” McLoone explains. “Our approach on that is completely different. We don’t scrape the web. We have human curators who give the data meaning and structure, and we lay computation on that to synthesise new knowledge, so you can ask questions of data. We’ve got a few thousand data sets built into that.”

Wolfram has always been on the side of computational technology, with McLoone, who describes himself as a ‘lifelong computation person’, having been with the company for almost 32 years of its 36-year history. When it comes to AI, Wolfram therefore sits on the symbolic side of the fence, which suits logical reasoning use cases, rather than statistical AI, which suits pattern recognition and object classification.

The two approaches appear directly opposed, but they have more in common than you might think. “Where I see it, [approaches to AI] all share something in common, which is all about using the machinery of computation to automate knowledge,” says McLoone. “What’s changed over that time is the concept of at what level you’re automating knowledge.

“The good old fashioned AI world of computation is humans coming up with the rules of behaviour, and then the machine is automating the execution of those rules,” adds McLoone. “So in the same way that the stick extends the caveman’s reach, the computer extends the brain’s ability to do these things, but we’re still solving the problem beforehand.

“With generative AI, it’s no longer saying ‘let’s focus on a problem and discover the rules of the problem.’ We’re now starting to say, ‘let’s just discover the rules for the world’, and then you’ve got a model that you can try and apply to different problems rather than specific ones.

“So as the automation has gone higher up the intellectual spectrum, the things have become more general, but in the end, it’s all just executing rules,” says McLoone.

What’s more, as the differing approaches to AI share a common goal, so do the companies on either side. As OpenAI was building out its plugin architecture, Wolfram was asked to be one of the first providers. “As the LLM revolution started, we started doing a bunch of analysis on what they were really capable of,” explains McLoone. “And then, as we came to this understanding of what the strengths or weaknesses were, it was about that point that OpenAI were starting to work on their plugin architecture.

“They approached us early on, because they had a little bit longer to think about this than us, since they’d seen it coming for two years,” McLoone adds. “They understood exactly this issue themselves already.”

McLoone will be demonstrating the plugin with examples at the upcoming AI & Big Data Expo Global event in London on November 30-December 1, where he is speaking. Yet he is keen to stress that there are more varied use cases out there which can benefit from the combination of ChatGPT’s mastery of unstructured language and Wolfram’s mastery of computational mathematics.

One such example is performing data science on unstructured GP medical records. This ranges from correcting peculiar transcriptions on the LLM side – replacing ‘peacemaker’ with ‘pacemaker’ as one example – to using old-fashioned computation and looking for correlations within the data. “We’re focused on chat, because it’s the most amazing thing at the moment that we can talk to a computer. But the LLM is not just about chat,” says McLoone. “They’re really great with unstructured data.”

How does McLoone see LLMs developing in the coming years? There will be various incremental improvements, and training best practices will see better results, not to mention potentially greater speed with hardware acceleration. “Where the big money goes, the architectures follow,” McLoone notes. A sea-change on the scale of the last 12 months, however, can likely be ruled out, partly because of crippling compute costs, but also because we may have peaked in terms of training sets. If copyright rulings go against LLM providers, then training sets will shrink going forward.

The reliability problem for LLMs, however, will be forefront in McLoone’s presentation. “Things that are computational are where it’s absolutely at its weakest, it can’t really follow rules beyond really basic things,” he explains. “For anything where you’re synthesising new knowledge, or computing with data-oriented things as opposed to story-oriented things, computation really is the way still to do that.”

Yet while responses may vary – one has to account for ChatGPT’s degree of randomness after all – the combination seems to be working, so long as you give the LLM strong instructions. “I don’t know if I’ve ever seen [an LLM] actually override a fact I’ve given it,” says McLoone. “When you’re putting it in charge of the plugin, it often thinks ‘I don’t think I’ll bother calling Wolfram for this, I know the answer’, and it will make something up.

“So if it’s in charge you have to give really strong prompt engineering,” he adds. “Say ‘always use the tool if it’s anything to do with this, don’t try and go it alone’. But when it’s the other way around – when computation generates the knowledge and injects it into the LLM – I’ve never seen it ignore the facts.

“It’s just like the loudmouth guy at the pub – if you whisper the facts in his ear, he’ll happily take credit for them.”
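
The "strong prompt engineering" McLoone describes can be as blunt as a system message that forbids the model from going it alone. A hypothetical example in Python (the wording is illustrative, not Wolfram's actual prompt):

    messages = [
        {"role": "system", "content": (
            "You have access to a Wolfram computation tool. For anything "
            "mathematical, numerical or data-related, ALWAYS call the tool. "
            "Never compute the answer yourself or make one up.")},
        {"role": "user", "content": "What is the population density of Norway?"},
    ]
    # response = openai.ChatCompletion.create(model="gpt-4", messages=messages,
    #                                         functions=tool_schemas)  # tool_schemas is hypothetical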

Wolfram will be at the AI & Big Data Expo. Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Wolfram Research: Injecting reliability into generative AI appeared first on AI News.

DHS AI roadmap prioritises cybersecurity and national safety

By: Ryan Daws
15 November 2023 at 09:52

The Department of Homeland Security’s (DHS) Cybersecurity and Infrastructure Security Agency (CISA) has launched its inaugural Roadmap for AI.

Viewed as a crucial step in the broader governmental effort to ensure the secure development and implementation of AI capabilities, the move aligns with President Biden’s recent Executive Order.

“DHS has a broad leadership role in advancing the responsible use of AI and this cybersecurity roadmap is one important element of our work,” said Secretary of Homeland Security Alejandro N. Mayorkas.

“The Biden-Harris Administration is committed to building a secure and resilient digital ecosystem that promotes innovation and technological progress.” 

Following the Executive Order, DHS is mandated to globally promote AI safety standards, safeguard US networks and critical infrastructure, and address risks associated with AI—including potential use “to create weapons of mass destruction”.

“In last month’s Executive Order, the President called on DHS to promote the adoption of AI safety standards globally and help ensure the safe, secure, and responsible use and development of AI,” added Mayorkas.

“CISA’s roadmap lays out the steps that the agency will take as part of our Department’s broader efforts to both leverage AI and mitigate its risks to our critical infrastructure and cyber defenses.”

CISA’s roadmap outlines five strategic lines of effort, providing a blueprint for concrete initiatives and a responsible approach to integrating AI into cybersecurity.

CISA Director Jen Easterly highlighted the dual nature of AI, acknowledging its promise in enhancing cybersecurity while recognising the immense risks it poses.

“Artificial Intelligence holds immense promise in enhancing our nation’s cybersecurity, but as the most powerful technology of our lifetimes, it also presents enormous risks,” commented Easterly.

“Our Roadmap for AI – focused at the nexus of AI, cyber defense, and critical infrastructure – sets forth an agency-wide plan to promote the beneficial uses of AI to enhance cybersecurity capabilities; ensure AI systems are protected from cyber-based threats; and deter the malicious use of AI capabilities to threaten the critical infrastructure Americans rely on every day.”

The outlined lines of effort are as follows:

  • Responsibly use AI to support our mission: CISA commits to using AI-enabled tools ethically and responsibly to strengthen cyber defense and support its critical infrastructure mission. The adoption of AI will align with constitutional principles and all relevant laws and policies.
  • Assess and Assure AI systems: CISA will assess and assist in secure AI-based software adoption across various stakeholders, establishing assurance through best practices and guidance for secure and resilient AI development.
  • Protect critical infrastructure from malicious use of AI: CISA will evaluate and recommend mitigation of AI threats to critical infrastructure, collaborating with government agencies and industry partners. The establishment of JCDC.AI aims to facilitate focused collaboration on AI-related threats.
  • Collaborate and communicate on key AI efforts: CISA commits to contributing to interagency efforts, supporting policy approaches for the US government’s national strategy on cybersecurity and AI, and coordinating with international partners to advance global AI security practices.
  • Expand AI expertise in our workforce: CISA will educate its workforce on AI systems and techniques, actively recruiting individuals with AI expertise and ensuring a comprehensive understanding of the legal, ethical, and policy aspects of AI-based software systems.

“This is a step in the right direction. It shows the government is taking the potential threats and benefits of AI seriously. The roadmap outlines a comprehensive strategy for leveraging AI to enhance cybersecurity, protect critical infrastructure, and foster collaboration. It also emphasises the importance of security in AI system design and development,” explains Joseph Thacker, AI and security researcher at AppOmni.

“The roadmap is pretty comprehensive. Nothing stands out as missing initially, although the devil is in the details when it comes to security, and even more so when it comes to a completely new technology. CISA’s ability to keep up may depend on their ability to get talent or train internal folks. Both of those are difficult to accomplish at scale.”

CISA invites stakeholders, partners, and the public to explore the Roadmap for Artificial Intelligence and gain insights into the strategic vision for AI technology and cybersecurity here.

(Photo by Oliver Günther on Unsplash)

See also: Google expands partnership with Anthropic to enhance AI safety

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post DHS AI roadmap prioritises cybersecurity and national safety appeared first on AI News.

Quantum AI represents a ‘transformative advancement’

By: Ryan Daws
14 November 2023 at 16:29

Quantum AI is the next frontier in the evolution of artificial intelligence, harnessing the power of quantum mechanics to propel capabilities beyond current limits.

GlobalData highlights a 14 percent compound annual growth rate (CAGR) in related patent filings from 2020 to 2022, underscoring the vast influence and potential of quantum AI across industries.

Adarsh Jain, Director of Financial Markets at GlobalData, emphasises the transformative nature of Quantum AI:

“Quantum AI represents a transformative advancement in technology. As we integrate quantum principles into AI algorithms, the potential for speed and efficiency in processing complex data sets grows exponentially. This not only enhances current AI applications but also opens new possibilities across various industries. 

The surge in patent filings is a testament to its growing importance and the pivotal role it will play in the future of AI-driven solutions.”

Kiran Raj, Practice Head of Disruptive Tech at GlobalData, highlights that while AI thrives on data and computational power, the inner workings of the technology often remain unclear. Quantum computing not only promises increased power but also potentially provides greater insights into these workings, paving the way for AI to transcend its current capabilities.

GlobalData’s Disruptor Intelligence Center analysis reveals significant synergy between quantum computing and AI innovations, leading to revolutionary impacts in various industries. Notable collaborations include HSBC and IBM in finance, Menten AI’s healthcare advancements, Volkswagen’s partnership with Xanadu for battery simulation, Intel’s Quantum SDK, and Zapata’s collaboration with BMW.

Raj concludes with a note of caution: “Quantum AI offers the potential for smarter, faster AI systems, but its adoption is complex and demands caution. The technology is still in its early stages, requiring significant investment and expertise.

“Key challenges include the need for advanced cybersecurity measures and ensuring ethical AI practices as we navigate this promising yet intricate landscape.”

(Photo by Anton Maksimov 5642.su on Unsplash)

See also: Google expands partnership with Anthropic to enhance AI safety

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Quantum AI represents a ‘transformative advancement’ appeared first on AI News.

GitLab’s new AI capabilities empower DevSecOps

By: Ryan Daws
13 November 2023 at 17:27

GitLab is empowering DevSecOps with new AI-powered capabilities as part of its latest releases.

The recent GitLab 16.6 November release includes the beta launch of GitLab Duo Chat, a natural-language AI assistant. Additionally, the GitLab 16.7 December release sees the general availability of GitLab Duo Code Suggestions.

David DeSanto, Chief Product Officer at GitLab, said: “To realise AI’s full potential, it needs to be embedded across the software development lifecycle, allowing DevSecOps teams to benefit from boosts to security, efficiency, and collaboration.”

GitLab Duo Chat – arguably the star of the show – provides users with invaluable insights, guidance, and suggestions. Beyond code analysis, it supports planning, security issue comprehension and resolution, troubleshooting CI/CD pipeline failures, aiding in merge requests, and more.

As part of GitLab’s commitment to providing a comprehensive AI-powered experience, Duo Chat joins Code Suggestions as the primary interface into GitLab’s AI suite within its DevSecOps platform.

GitLab Duo comprises a suite of 14 AI capabilities:

  • Suggested Reviewers
  • Code Suggestions
  • Chat
  • Vulnerability Summary
  • Code Explanation
  • Planning Discussions Summary
  • Merge Request Summary
  • Merge Request Template Population
  • Code Review Summary
  • Test Generation
  • Git Suggestions
  • Root Cause Analysis
  • Planning Description Generation
  • Value Stream Forecasting

In response to the evolving needs of development, security, and operations teams, Code Suggestions is now generally available. This feature assists in creating and updating code, reducing cognitive load, enhancing efficiency, and accelerating secure software development.

GitLab’s commitment to privacy and transparency stands out in the AI space. According to the GitLab report, 83 percent of DevSecOps professionals consider implementing AI in their processes essential, with 95 percent prioritising privacy and intellectual property protection in AI tool selection.

The State of AI in Software Development report by GitLab reveals that developers spend just 25 percent of their time writing code. The Duo suite aims to address this by reducing toolchain sprawl—enabling 7x faster cycle times, heightened developer productivity, and reduced software spend.

Kate Holterhoff, Industry Analyst at Redmonk, commented: “The developers we speak with at RedMonk are keenly interested in the productivity and efficiency gains that code assistants promise.

“GitLab’s Duo Code Suggestions is a welcome player in this space, expanding the available options for enabling an AI-enhanced software development lifecycle.”

(Photo by Pankaj Patel on Unsplash)

See also: OpenAI battles DDoS against its API and ChatGPT services

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Cyber Security & Cloud Expo and Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post GitLab’s new AI capabilities empower DevSecOps appeared first on AI News.

Gen AI for the Genome: LLM Predicts Characteristics of COVID Variants

13 November 2023 at 14:00

A widely acclaimed large language model for genomic data has demonstrated its ability to generate gene sequences that closely resemble real-world variants of SARS-CoV-2, the virus behind COVID-19.

Called GenSLMs, the model, which last year won the Gordon Bell special prize for high performance computing-based COVID-19 research, was trained on a dataset of nucleotide sequences — the building blocks of DNA and RNA. It was developed by researchers from Argonne National Laboratory, NVIDIA, the University of Chicago and a score of other academic and commercial collaborators.

When the researchers looked back at the nucleotide sequences generated by GenSLMs, they discovered that specific characteristics of the AI-generated sequences closely matched the real-world Eris and Pirola subvariants that have been prevalent this year — even though the AI was only trained on COVID-19 virus genomes from the first year of the pandemic.

“Our model’s generative process is extremely naive, lacking any specific information or constraints around what a new COVID variant should look like,” said Arvind Ramanathan, lead researcher on the project and a computational biologist at Argonne. “The AI’s ability to predict the kinds of gene mutations present in recent COVID strains — despite having only seen the Alpha and Beta variants during training — is a strong validation of its capabilities.”

In addition to generating its own sequences, GenSLMs can also classify and cluster different COVID genome sequences by distinguishing between variants. In a demo available on NGC, NVIDIA’s hub for accelerated software, users can explore visualizations of GenSLMs’ analysis of the evolutionary patterns of various proteins within the COVID viral genome.


Reading Between the Lines, Uncovering Evolutionary Patterns

A key feature of GenSLMs is its ability to interpret long strings of nucleotides — represented with sequences of the letters A, T, G and C in DNA, or A, U, G and C in RNA — in the same way an LLM trained on English text would interpret a sentence. This capability enables the model to understand the relationship between different areas of the genome, which in coronaviruses consists of around 30,000 nucleotides.
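
GenSLMs' exact tokenization scheme is not detailed here, but a common way to give a language model "words" to work with is to split a sequence into overlapping k-mers, as in this illustrative sketch (k=3 matches the length of a codon):

    def kmer_tokens(sequence: str, k: int = 3):
        """Split a nucleotide string into overlapping k-mers, the genomic
        analogue of words in a sentence."""
        return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

    print(kmer_tokens("ATGGTGCAC"))
    # ['ATG', 'TGG', 'GGT', 'GTG', 'TGC', 'GCA', 'CAC']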

In the NGC demo, users can choose from among eight different COVID variants to understand how the AI model tracks mutations across various proteins of the viral genome. The visualization depicts evolutionary couplings across the viral proteins — highlighting which snippets of the genome are likely to be seen in a given variant.

“Understanding how different parts of the genome are co-evolving gives us clues about how the virus may develop new vulnerabilities or new forms of resistance,” Ramanathan said. “Looking at the model’s understanding of which mutations are particularly strong in a variant may help scientists with downstream tasks like determining how a specific strain can evade the human immune system.”


GenSLMs was trained on more than 110 million prokaryotic genome sequences and fine-tuned with a global dataset of around 1.5 million COVID viral sequences using open-source data from the Bacterial and Viral Bioinformatics Resource Center. In the future, the model could be fine-tuned on the genomes of other viruses or bacteria, enabling new research applications.

To train the model, the researchers used NVIDIA A100 Tensor Core GPU-powered supercomputers, including Argonne’s Polaris system, the U.S. Department of Energy’s Perlmutter and NVIDIA’s Selene.

The GenSLMs research team’s Gordon Bell special prize was awarded at last year’s SC22 supercomputing conference. At this week’s SC23, in Denver, NVIDIA is sharing a new range of groundbreaking work in the field of accelerated computing. View the full schedule and catch the replay of NVIDIA’s special address below.

NVIDIA Research comprises hundreds of scientists and engineers worldwide, with teams focused on topics including AI, computer graphics, computer vision, self-driving cars and robotics. Learn more about NVIDIA Research and subscribe to NVIDIA healthcare news.

Main image courtesy of Argonne National Laboratory’s Bharat Kale. 

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. DOE Office of Science and the National Nuclear Security Administration. Research was supported by the DOE through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on response to COVID-19, with funding from the Coronavirus CARES Act.
