Data poisoning, the sly and silent monster, is already around. If the glimpses that a few incidents have given us are not worrying enough, let’s steal a peek at the sheer power of data poisoning to ruin companies, public systems, countries and human lives with the push of a button.
Imagine you enter your office and see signs of a failed robbery. Till yesterday, you could let out a sigh of relief. But today, if there are signs of a break-in and the gold in the locker is still sitting peacefully, that is a bigger alarm to pay attention to. A botched burglary is a horror movie in the making.
Chances are that the burglar sneaked in with a different purpose. There could be poison mixed into the water cooler or the pudding in the fridge.
Alas, while a household can quickly empty the refrigerator, companies and government bodies cannot throw away all their data that easily. It’s tricky. And yet, a ticking bomb.
A threat that’s not sitting far away. A big train accident out of nowhere. A dangerous brake failure in every car of a new model’s batch. A fatal drug that is rolled out without the company’s knowledge and is suddenly sitting on many counters. A mess-up in the stock market that crashes the financial pulse of an entire nation. A DDoS attack that causes a huge outage, with dominoes falling for everyone who uses that cloud.
Cyanide or Kool-Aid. It’s hard to tell anymore when we look at a bottle of data. Data poisoning is the next big threat vector coming upon us. And it’s neither an accident nor a small bug. To twist the knife further, it’s going to get very easy and cheap for the bad guys. And very tough and high-stakes for the good guys.
Snakes Dressed as Snails
As per media reports, as recently as early 2021, a hacker broke into a Florida water treatment plant and raised the sodium hydroxide, or lye, in the water to an unsafe level. The operator quickly spotted the danger. But is it always going to be that easy and effective? Especially as AI, ML and algorithms (that feed on nothing but data) start running our world in massive ways? What if someone brainwashes them? Or poisons them?
Bringing critical infrastructure down with slithering, silent attacks is not a movie script anymore. Microsoft’s investigation into Volt Typhoon’s malicious activities reveals how critical infrastructure can be compromised using Living-off-the-Land (LotL) cyberattacks, avers Sean Duca, VP and Regional Chief Security Officer for Asia Pacific & Japan at Palo Alto Networks. “This technique involves attackers leveraging existing tools and utilities on a compromised system for malicious activities. The tools may include PowerShell, WMI, command-line interfaces, and batch files.”
Data tampering and poisoning are serious threats to the integrity and security of enterprise data, affirms Dattaraj Rao, Chief Data Scientist, Persistent Systems. “They can occur in various ways, including hacking, malware, and social engineering. When data is tampered with or poisoned, it can lead to several problems, like financial losses, identity theft, and damage to reputation. In addition, tampering with data can result in incorrect or false information being disseminated, which can have detrimental outcomes.”
Data poisoning usually refers to situations where the training data used in ML models is intentionally corrupted by a hacker, explains cybersecurity expert Prof. Lawrence A. Gordon, EY Alumni Professor of Managerial Accounting and Information Assurance, Robert H. Smith School of Business, University of Maryland. “Thus, in terms of the CIA (Confidentiality, Integrity, and Availability) triad considered in cybersecurity, data poisoning is a form of data Integrity cyber breach. Given the growing importance of ML models (which fall under the umbrella of AI), it seems (at least to me) that data poisoning should be a serious concern to organizations.”
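To see why corrupted training data counts as an integrity breach, here is a minimal sketch in Python. It is a toy illustration, not any expert's method: the nearest-centroid "model", the transaction values, and the labels are all hypothetical. An attacker slips a few large transactions labelled "benign" into the training set, and the model quietly starts waving through a value it used to flag.

```python
from statistics import mean

def fit_centroids(samples):
    """Return {label: mean value} for (value, label) training pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Pick the label whose centroid lies closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

# Hypothetical clean training data: small values benign, large values fraud.
clean = [(1, "benign"), (2, "benign"), (3, "benign"), (4, "benign"),
         (10, "fraud"), (11, "fraud"), (12, "fraud"), (13, "fraud")]

# Hypothetical poison: large values deliberately mislabelled as benign.
poison = [(12, "benign"), (13, "benign"), (14, "benign"), (12, "benign")]

clean_model = fit_centroids(clean)
poisoned_model = fit_centroids(clean + poison)

print(predict(clean_model, 9))     # model trained on clean data
print(predict(poisoned_model, 9))  # model trained on poisoned data
```

Nothing in the code was hacked; only the data changed. That is what makes this kind of attack hard to spot with tools that watch software rather than data.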
How Does That Cuttlefish Get Inside?
Duca explains how data tampering and data poisoning work. “A typical LotL attack consists of three phases. First, in reconnaissance, the attacker gathers information about the compromised system, including architecture, software versions, network configuration, and user privileges. This helps identify strengths, weaknesses, and potential exploitation avenues. Second, during the initial access phase, the breach occurs due to vulnerabilities in network devices or unsafe user actions like visiting malicious websites, opening phishing emails, or using infected USB drives. These contain the attack kit with a fileless script.”
Third, as Duca explains, the execution of malicious activity involves escalating privileges, exfiltrating data, and modifying system configurations. Do note that achieving malicious goals while flying under the radar is vital to this operation. Unless the fish camouflages itself well, the attack will not work.
Aaron Bugal, Field Chief Technology Officer – Asia Pacific and Japan, Sophos, highlights a big danger that lurks deep behind many cyber-attacks. “Far too often when conducting security event incident response functions, it’s found that not only was the integrity of ransomed data broken, backup systems storing copies of the data were also destroyed and or deleted – making it very hard, if not impossible, to validate if data remaining is reputable/original.” It’s ironic but true. Data tampering can only be attributed to a security event when a verifiable and immutable copy of the original exists to compare against.
Zombie Ants: Conning AI Is Not Tough, Eh!
Prasenjit Saha, Executive Vice President and Global Cyber Security Business Head at LTIMindtree, points out that while AI has become a great enabler in managing cyber risks (be it contextualizing and augmenting detection capabilities, predictive threat intelligence, or threat intel-led hunting and mitigation), it has also increasingly become a serious source of mishandling and threats. “Malicious actors are using AI prompts and other invasive codes to bypass security systems and exploit vulnerabilities in AI models leading to unauthorized access to sensitive data, theft of intellectual property, and even misuse of autonomous systems.”
So would jail-broken AI be a practical threat?
Yes, agrees Rao, who dissects how the approach usually differs from traditional software jail-breaking. “Adversarial attacks on AI systems are deliberate and malicious attempts to deceive or manipulate AI algorithms by feeding them misleading or malicious data. Adversarial attacks exploit the vulnerabilities in AI systems and can cause them to make incorrect decisions or predictions, leading to potentially harmful consequences. Today, with the advent of large language models (LLM) like ChatGPT, jail-breaking often includes ways to inject malicious prompts to make the LLM respond by violating ethical principles.”
What’s worrisome is that today, with a prompt injection, some LLMs may be manipulated into generating biased content, writing phishing emails, and so on. If enterprises are exposing their models via public endpoints, the vulnerability is even higher.
It’s much more prudent to understand how AI is being used within your own organization, weighs in Bugal. “Although offensively-tasked AI will most likely become a threat in the future, organizations should be looking to understand how generative AI is being used by employees on company-issued devices and what data is being uploaded to them. This is an example of an insider threat, accidental intent, but the loss and or misplacing of information into a cloud service could be an infraction of self-governance and or regulatory requirements.”
We can’t talk about data poisoning without talking about ‘dangerous data’, which Gareth Herschel, VP Analyst, Research Engagement Services, Gartner, defines very simply. “Your organization is collecting data as a necessary part of its normal operations that has potentially negative consequences. Even worse, there is often no automatically “correct” way of responding to these consequences as both taking an action, and failing to take that action could be considered inappropriate depending upon the priorities you attach to different outcomes.”
Hedwig and Ethan Hunt, Where Art Thou?
Thankfully, anyone can be a Tom Cruise in this seemingly-impossible mission of spotting and disarming the dangerous guys. It’s not going to be as easy as catching ransomware, but there’s some place to start from – like network defense, pre-emptive boxing, and Ninja-speed pushbacks. And being brave and playing in a team.
Corporations, governments, and critical infrastructure providers must revise their cybersecurity strategies to address increasingly sophisticated threats, integrating host and network-based defences, Duca suggests. “For example, relying solely on endpoint monitoring may allow attackers to evade detection, but network-based defences scrutinise traffic patterns and unexpected communications. As a result, the most effective strategies employ endpoint and network-based defences in tandem, using insights from one system to enhance the other and work together to protect an organisation better.”
Let’s touch upon the part about being brave and working with others. Often, the problem with data tampering attacks is the lack of transparency in reporting, which makes it difficult to share knowledge and learn from past mistakes, Rao warns. “Enterprises should implement strong security measures such as firewalls, encryption, and access controls to prevent unauthorized access to their systems and data. Zero-trust strategy is recommended with logged account activities rather than common admin accounts with shared credentials.”
Staple, boring-sounding measures like backups and proactivity are widely underestimated, but they can be just the clincher here.
“Making immutable copies of mission critical data and committing them to an off-line backup (remembering the 3-2-1 data backup rule) is integral to a sound incident response plan,” stresses Bugal. “Without these offline backups and verified data, there’s no way to establish legitimacy and trust of collateral data post a breach.”
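One way to picture how a verified copy helps establish trust after a breach is a checksum manifest: record a fingerprint of every file while the data is known to be good, store that record offline, and compare against it later. The sketch below is only an illustration under that assumption; the function names `build_manifest` and `find_tampered` are hypothetical, and a real deployment would also need to protect the manifest itself from tampering.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file under root (by relative path) to its digest."""
    root = Path(root)
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def find_tampered(root, manifest):
    """List files that changed, vanished, or appeared since the manifest."""
    current = build_manifest(root)
    changed_or_missing = [name for name, digest in manifest.items()
                          if current.get(name) != digest]
    added = [name for name in current if name not in manifest]
    return sorted(changed_or_missing + added)
```

A manifest built on a clean day and kept alongside the offline backup gives incident responders something to check the surviving data against. It says which files differ, not who changed them or why.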
At the end-user level, implementing application whitelisting ensures that only approved and trusted applications can run on the network, adds Duca. “This proactive measure restricts the execution of unauthorised programs or scripts, mitigating the risk of LotL attacks. LotL attackers also exploit known vulnerabilities in outdated software to gain unauthorised access. Therefore, automated scanning and updating systems across the network are essential to decrease risk.”
Watch For Those Mushrooms
It took a clever and sharp-nosed John Snow to sniff out that the cause of a horrific cholera outbreak was sitting in a water pump. We are scampering at breakneck speed towards AI, and as we start drinking more and more from these digital-age water pumps, we have to become more and more alert to the venom that can easily get inside.
As we look forward to an AI-led future that will redefine how we live, work and play, we need to bring a security-first approach to AI to make it more trusted, fair, ethical, robust, accountable and transparent, as Saha puts it.
Indeed! There’s no room for another pandemic in this world. And definitely not for one brought upon by something as ‘cool’ as AI.
Burglars can be dressed in tuxedos. But they are still burglars.
BOX: The Antidote Is Here
Source: As told by various experts
- Regularly monitor and audit systems to detect any anomalies or suspicious activities that could indicate data tampering
- Invest in Responsible AI strategy to ensure ML models follow principles of accountability, transparency, reproducibility, security, and privacy
- Inputs to models, intermediate data, and outputs must be logged and monitored to identify any potential attacks and action should be taken to counter these
- Invest in training data filtering, robust learning, and using auxiliary tools
- Back up pure/verified data, ruthlessly and consistently
- Train employees on detecting and reporting any suspicious activities and provide them with cybersecurity awareness training
- Work with cybersecurity experts to investigate the incident, determine the extent of the damage, and develop a plan for recovery
- Use AI-enabled advanced access management solutions. This way, security professionals can let intelligence and automation manage information and events, thus enabling near real-time detection and response
- Redefine the accountability model, adopt adaptive and contextual control mechanisms, and follow industry-leading cybersecurity, data protection and privacy frameworks
- Communicate transparently with stakeholders, including clients, partners, and investors, about the incident and the steps being taken to mitigate the damage and prevent future incidents
- Have a response plan that outlines the steps to be taken in the event of a data-tampering incident. The plan should include steps for containing the incident, notifying authorities and stakeholders, and restoring systems and data
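As a small taste of what "training data filtering" in the list above might look like, here is a minimal sketch that flags extreme values before they reach a model. It assumes a single numeric feature and uses the conventional modified z-score (median and median absolute deviation, with the customary 0.6745 constant and 3.5 cut-off). The function name and the sensor readings are hypothetical, and a filter this simple catches only crude outliers, not carefully disguised poison.

```python
from statistics import median

def filter_outliers(values, threshold=3.5):
    """Split values into (kept, flagged) using the modified z-score."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # No spread to measure against, so nothing can be flagged.
        return list(values), []
    kept, flagged = [], []
    for v in values:
        score = 0.6745 * abs(v - centre) / mad
        (flagged if score > threshold else kept).append(v)
    return kept, flagged

# Hypothetical sensor readings with one implausible spike.
readings = [7.0, 7.1, 6.9, 7.2, 7.0, 6.8, 111.0]
kept, flagged = filter_outliers(readings)
print(flagged)
```

The median is used instead of the mean because a single poisoned value can drag the mean towards itself, while the median barely moves.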
–Byline Pratima H