/dq/media/media_files/2025/01/23/rKAKcjGn3HOOCuruIlo9.jpg)
Philippa-Cogswell1 Photograph: (Philippa-Cogswell1)
Also, spoiler alert. Your body-guard is not your body double. Leave the couch. In an age where Cyber-insurers also look at security maturity before they undertake new enterprises, where security fatigue is a bitter reality, where the grey-matter of LLMs can easily be conned by benign chatter, and where AI is turning as much into a Robin as into a Harley Quinn – and with equal speed and probability – whose side is AI on? Philippa Cogswell, Managing Partner, Unit 42, Asia Pacific & Japan, Palo Alto Networks helps us unravel these complex plot lines.
If unsafe content is embedded alongside benign topics, it is possible to trick the LLM into inadvertently generating harmful content.
Now that AI is being used for both offence and defence – and Quantum is also coming in soon- what do you make of these new threat contours?
We look at trends every year and we have seen a rise in speed, scale and sophistication of threats. I have been in this industry for 20 years and we have talked of this narrative a lot, but today the ‘time to react’ and ‘time to respond’ have changed drastically. In 2021 data, the median time from compromise to exfiltration of data was nine days, by 2023 it was reduced to two days. When we see more AI and automation ahead, this time window can be reduced even more. Historically, phishing has been the number one vector. In the last 12-18 months, there has been a shift in public-facing vulnerabilities- and now what is exposed and what is leveraged are getting interesting. We are seeing a lot of sophistication- criminal and ransomware groups understand how security and IT teams are functioning- and unfortunately, they are in the mix of communication amidst third-party responders.
Please zoom in on AI here.
It’s quite interesting and attackers are not labelling ‘powered by AI’ yet so we do not necessarily know for sure if they are using AI. But in some instances, it’s quite clear. Some examples we are researching and reporting on, corroborate this, and even our red teams and ethical hacking teams – show the use of AI.
Also, we are researching how easy it is to create malware with Gen AI. At this stage, from our own testing and trials- not a lot of high-quality code is being produced with AI. May be parts of code but not a whole piece of code. But with a simple piece of malware – that allows threat hackers to copy, upload, download files etc. – we found that it can replicate similar user interface and some functionalities. Also, at the end of October, we published an article on ‘Deceptive Delight’ – where our team discovered a big technique to bypass safety techniques. It covered a multi-turn technique that engages Large Language Models (LLM) in an interactive conversation, gradually bypassing their safety guardrails and eliciting them to generate unsafe or harmful content.
So seemingly harmless chat works to jailbreak too? Please elaborate on what you found?
The team tested 8,000 cases across eight models. We found that it achieves an average attack success rate of 65 per cent – that too within just three interaction turns with the target model. Basically- Deceptive Delight operates by embedding unsafe or restricted topics among benign ones, all presented in a positive and harmless context, leading LLMs to overlook the unsafe portion and generate responses containing unsafe content. So now there can be a scenario plausible where benign language can be wrapped around nefarious language. Iteration after iteration, detection was becoming difficult.
How?
In the first turn, the attacker requests the model to create a narrative that logically connects both the benign and unsafe topics. In the second turn, they ask the model to elaborate on each topic. During the second turn, the target model often generates unsafe content while discussing benign topics. At the third turn, the LLM allowed the team to bypass safety controls. We tested Deceptive Delight on eight state-of-the-art open-source and proprietary AI models. There is a lot of ongoing research by us and others- about what attackers can and might do. We will continue to publish such research.
Why does this kind of malicious intent work in models created for harmless purposes and in so neatly-obscure a way?
It’s hard to ignore that LLMs have a limited ‘attention span’ and that creates vulnerability to distraction when processing texts with complex logic. If unsafe content is embedded alongside benign topics, it is possible to trick the model into inadvertently generating harmful content while focusing on the benign parts. LLMs mirror humans here - just as humans can only hold a certain amount of information in their working memory at any given time, the ability of LLMs is restricted when it comes to maintaining contextual awareness as they generate responses. This is precisely where they can overlook critical details, especially when it is presented with a mix of safe and unsafe information. More so with prompts that blend harmless content with potentially dangerous or harmful material. In complex or lengthy passages, the benign aspects can be prioritised while glossing over or misinterpreting the unsafe ones. Just the way a person might skim over important but subtle warnings in a detailed report if their attention is divided.
So if AI can be so easily and swiftly used- what about laymen, consumers and SMBs who may not be aware or equipped to deal with such situations? Especially with all the currently-hot cloning and digital frauds?
Yes, it can be quite difficult to deal with such scams. Education and awareness are critical now. Everything we can do to highlight and educate about such scams – we are pursuing. We are getting better at questioning these things. Deep fakes and audio clones are new challenges.
So are AI and Quantum only adversaries? What about the good guys who want to use them?
Yes, we have been using AI in many security products and approaches for some time. We have increased the use of AI – especially precision AI, ML, Gen AI in some network layer, cloud layer, SecOps products. We use them to understand anomalies, and to deal with large volumes of data (I was a security analyst myself so I can tell you that in the security industry we deal with really heavy volumes of data). We will continue to explore more. AI is going to be a great tool from a defence standpoint too – it’s a missed opportunity if organisations are not leaning towards AI from a security standpoint- especially with talent-related challenges of the industry.
Did Operation Cronos help? Does that mean collaboration across global bodies and enforcement bodies works to fight the bad actors?
Yes, we have seen a lot of traction on that front. I think it helps. We have seen a slowdown in some ransomware activities. From 2022 to 2023 there was a YoY increase which was not seen this year. However, we are seeing some groups effectively dissolve, some say they have retired, some are resorting to hactivism, some are emerging as new handshakes- so a lot of shift is happening. There is a rise of dedicated link sites – with more data about victims and scale. I feel like it is an interesting thing to look at next. Smaller groups and dedicated links are some specific trends in India. They also offer penetration testing services- perhaps, they are looking at other avenues for expanding victim coverage and market targeting.
With cyber-insurance premiums slated to go up to $23 billion by 2026, is Cyber-insurance a strong trend- are we spending more on the band-aid than on locks? From being completely-unaware to big spends- has insurance picked up?
I have a slightly-different view here. We have previously seen cases where pay-outs have not happened because malware was attributed to cyber-terrorism and not covered by policies. Insurers also want to understand and do due diligence on an organisation’s maturity.
all enterprises using AI- if we can be more transparent about what data sources and what training we are using, it will help. We cannot continue to keep looking at AI as a Black Box.
In light of some recent headlines on VPN and firewall breaches around Palo Alto Networks and SonicWall- any advice or observations on these vulnerabilities?
The scale and very-deliberate targeting of public-facing infrastructure is a general trend – as I mentioned earlier- and that explains trends of breaches and attempts to exploit those kind of products under this overall pattern.
So who is the next sitting duck- last year we saw sports clubs, Children hospitals- and all kinds of new verticals. What’s next?
Ultimately, there is a range of groups that are opportunistic. Business disruption has been a big trend in the last six to 12 months. Sectors more exposed – show up- as healthcare, manufacturing etc. And even legal & professional services (in India) are more heavily targeted. Manufacturing has not been that regulated as financial services- and that’s why the latter has invested more in controls. Less mature and more exposed from a disruption/business productivity angle- those are the kinds of industries that will be on the radar of threat vectors. Specially if being offline is going to cost you million dollars a day, threat vectors probably are more suspectible to resort to use that aspect in negotiations and extortions.
How much of a problem ‘fatigue’ is? We are always running and mostly always an inch behind the attackers.
We probably do not report so much about success in solving an attack. Headlines do not cover that so much. There are more instances of stopping and prevention of attacks. Security teams are stopping far-far more incidents on a daily basis than those that cause damage. Security is a challenging space- but most people I know in this industry are incredibly passionate about it.
Last year- what caught your eye?
We cannot escape talking about AI. How organisations have started interacting with AI. At Palo Alto Networks, we have been using it for a decade. It’s interesting to see the number and kind of use-cases that have come up. AI is astounding in the way it has picked up so fast, and is so accessible and so flexible.
What’s 2025 going to be like- who would we be fighting?
We will see continuation of critical infrastructure, and with increased adoption of Cloud environments, focus will be heavy on that side too. Deep fakes, unfortunately, will witness more use- not just in APAC but across the world. One more big focus area would be transparency around AI. There is a lot of talk around ethics and regulation in AI. But for all enterprises using AI- if we can be more transparent about what data sources and what training we are using, it will help. We cannot continue to keep looking at AI as a Black Box.
Philippa Cogswell
Managing Partner, Unit 42, Asia Pacific & Japan, Palo Alto Networks
By Pratima H
pratimah@cybermedia.co.in