The Critical Need for Businesses to Strengthen Cyber Security in the Age of AI – Technologist

The UK government’s newly established AI Safety Institute (AISI) has released a report that uncovers significant vulnerabilities in large language models (LLMs). This discovery underscores the urgent need for businesses to tighten their cyber security measures, particularly as AI technology becomes increasingly integrated into operations.

The AISI’s findings demonstrate that these AI systems are alarmingly prone to basic jailbreaks. Some models generated harmful outputs even without any attempt to circumvent their safeguards. This vulnerability poses a severe risk to businesses relying on AI for sensitive and critical functions.

Publicly Available LLMs at Risk

Publicly available LLMs typically incorporate mechanisms to prevent generating harmful or illegal responses. However, “jailbreaking” refers to tricking the model into ignoring these safety protocols. 

Using both standardised and in-house-developed prompts, the AISI found that the tested models sometimes responded to harmful queries without any jailbreak at all. When subjected to relatively simple attacks, all models answered between 98 and 100 per cent of harmful questions.

Measuring Harmful Information Compliance

AISI’s evaluation measured the success of these attacks in eliciting harmful information, focusing on two key metrics: compliance and correctness. Compliance indicates whether the model obeys or refuses a dangerous request, while correctness assesses the accuracy of the model’s responses post-attack.
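In code, these two metrics might be computed roughly as follows. This is a minimal sketch for illustration only: the `Graded` record and the boolean labels are assumptions, and the grading step itself (deciding whether a given response complied and was accurate) is not shown and is not AISI's actual tooling.

```python
# Minimal sketch: computing compliance and correctness rates over a set
# of graded model responses. The Graded record is an illustrative
# assumption, not AISI's actual evaluation format.
from dataclasses import dataclass

@dataclass
class Graded:
    complied: bool   # did the model obey the dangerous request?
    correct: bool    # if it complied, was the answer accurate?

def compliance_rate(results: list[Graded]) -> float:
    """Fraction of requests the model obeyed rather than refused."""
    return sum(r.complied for r in results) / len(results)

def correctness_rate(results: list[Graded]) -> float:
    """Accuracy of responses, measured only over compliant answers."""
    compliant = [r for r in results if r.complied]
    if not compliant:
        return 0.0
    return sum(r.correct for r in compliant) / len(compliant)

# Example: four graded responses, two compliant, one of those correct.
sample = [Graded(True, True), Graded(True, False),
          Graded(False, False), Graded(False, False)]
print(compliance_rate(sample))   # 0.5
print(correctness_rate(sample))  # 0.5
```

Separating the two metrics matters: a model that complies but answers inaccurately is a different (and lesser) hazard than one that complies with accurate harmful information.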

The study included two scenarios: asking explicitly harmful questions directly (“No attack”) and using developed attacks to elicit information the model is trained to withhold (“AISI in-house attack”).

Basic attacks embedded harmful questions into a prompt template or used a simple multi-step procedure. Each model was subjected to a distinct attack, optimised on a training set of queries and validated on a separate held-out set.
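The evaluation loop described above can be sketched as follows. Everything here is an illustrative assumption rather than AISI's actual harness: the template is a neutral placeholder (a real attack would wrap the question in adversarial text, which is deliberately not reproduced), and `query_model` stands in for whatever API call reaches the model under test.

```python
# Sketch of the evaluation loop: embed each query in an attack template,
# send it to the model, and split queries so the attack can be tuned on
# one set and validated on a held-out set. Template, query_model, and
# the split ratio are illustrative assumptions, not AISI's actual setup.
import random

ATTACK_TEMPLATE = "{question}"  # placeholder; a real attack wraps the question

def run_attack(questions: list[str], query_model) -> dict[str, str]:
    """Apply the prompt template to each question and collect responses."""
    return {q: query_model(ATTACK_TEMPLATE.format(question=q))
            for q in questions}

def train_validation_split(questions: list[str], train_frac: float = 0.8,
                           seed: int = 0) -> tuple[list[str], list[str]]:
    """Shuffle queries and split them into a tuning set and a held-out
    validation set, mirroring the report's methodology."""
    rng = random.Random(seed)
    shuffled = questions[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, val = train_validation_split([f"q{i}" for i in range(10)])
print(len(train), len(val))  # 8 2
```

The held-out split is the important design point: an attack tuned and scored on the same queries would overstate how well it generalises to new harmful questions.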

The Need for Robust Cyber Security Measures

The report highlights that while compliance rates for harmful questions were relatively low without attacks, they could reach up to 28% for some models on private harmful questions. Under AISI’s in-house attacks, all models complied at least once out of five attempts for nearly every question.
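The "at least once out of five attempts" figure corresponds to an any-compliance-at-k metric: a question counts as broken if any one of k repeated attempts elicits compliance. A minimal sketch, with per-attempt data invented purely for illustration:

```python
# Sketch of an "any compliance in k attempts" metric: a question counts
# as broken if the model complied on at least one of k repeated attempts.
# The attempt data below is invented for illustration.

def any_compliance_rate(attempts_per_question: list[list[bool]]) -> float:
    """Fraction of questions with at least one compliant response."""
    return (sum(any(a) for a in attempts_per_question)
            / len(attempts_per_question))

# Three questions, five attempts each; two questions elicit at least
# one compliant answer, one never does.
attempts = [
    [False, True, False, False, False],
    [True, True, False, True, False],
    [False, False, False, False, False],
]
print(round(any_compliance_rate(attempts), 3))  # 0.667
```

This is a strict metric from a defender's point of view: even one compliant response in five tries means the safeguard failed for that question.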

“This vulnerability indicates that current AI models, despite their safeguards, can be easily manipulated to produce harmful outputs,” the report noted. The institute emphasises the need for continued testing and development of more robust evaluation metrics to improve AI safety and reliability.

Implications for Businesses

This report is a wake-up call for businesses to enhance cyber security measures. Despite their sophistication, AI systems can be easily manipulated to generate harmful responses. This vulnerability can lead to significant risks, including data breaches, misinformation, and legal liabilities.

Businesses must invest in cyber security frameworks to protect their AI systems from exploitation. This includes regular security assessments, implementation of advanced threat detection mechanisms, and continuous updates to safeguard protocols. By doing so, businesses can ensure the safe and reliable deployment of AI technologies, thereby protecting their assets and maintaining trust with their stakeholders.

Future Steps

AISI plans to extend its testing to other AI models and is developing comprehensive evaluations and metrics to address various areas of concern. With a growing team and plans to open a new office in San Francisco, the institute aims to collaborate with leading AI companies to enhance the safety and reliability of AI systems worldwide.

As AI continues to evolve, businesses must stay ahead of potential threats by adopting stringent cyber security measures, ensuring that the integration of AI into their operations is both secure and beneficial.