
Homomorphic encryption: The key enabler of digital security

When data is encrypted using traditional techniques, it becomes impossible to do any meaningful computation on it. Homomorphic encryption removes that limitation, which is why it is a key enabler of digital security.


Neural networks consist of thousands or even millions of artificial "brain cells", or computational units, that behave and learn in a way remarkably similar to the human brain. Also known as Artificial Neural Networks (ANNs), neural networks are generally software simulations that behave as though millions of brain cells were interconnected and working in parallel to solve problems, make decisions and recognize patterns just like a human brain can.


An artificial neural network processes information in two modes: when it is being trained it is in learning mode, and when it puts what it has learned into practice it is in operating mode. For neural networks to learn, they must be told when they do something right or wrong. This feedback process, often called back-propagation, allows the network to modify its behavior so that the output is as intended. In other words, the network is trained with many learning examples and eventually learns how to reach the correct output every time, even when it is presented with a new range or set of inputs. Just like a human, an artificial neural network can use past experience to reach the right conclusion.

Artificial neural networks (ANNs), or simply neural networks, are computational algorithms intended to simulate the behaviour of biological systems composed of “neurons”. ANNs are computational models inspired by an animal's central nervous system, and they are capable of machine learning as well as pattern recognition.

Artificial neural networks are characterized by adaptive weights along the paths between neurons, which are tuned by a learning algorithm that learns from observed data in order to improve the model. The cost function is what is used to learn the optimal solution to the problem being solved. This involves determining the best values for all of the tunable model parameters, with the neuron path weights being the primary target, along with algorithm tuning parameters such as the learning rate. It is usually done through optimization techniques such as gradient descent or stochastic gradient descent.
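As a concrete illustration, here is a minimal sketch (in Python/NumPy) of gradient descent tuning the adaptive weights of a single linear unit against a mean-squared-error cost function. The data, learning rate and number of epochs are illustrative choices, not values from the article.

```python
import numpy as np

# A minimal sketch of gradient descent on a single linear unit.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 observations, 3 input features
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)                          # adaptive weights to be tuned
lr = 0.1                                 # learning rate (a tuning parameter)

for epoch in range(200):
    error = X @ w - y
    cost = np.mean(error ** 2)           # mean-squared-error cost function
    grad = 2 * X.T @ error / len(y)      # gradient of the cost w.r.t. the weights
    w -= lr * grad                       # step along the negative gradient

print("learned weights:", w)             # approaches [1.5, -2.0, 0.5]
```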


These optimization techniques essentially try to make the ANN solution as close as possible to the optimal solution; when this succeeds, the ANN is able to solve the intended problem with high performance.

An artificial neural network is modelled using layers of artificial neurons, or computational units able to receive input and apply an activation function along with a threshold to determine if messages are passed along.

In a simple model, the first layer is the input layer, followed by one hidden layer, and lastly by an output layer. Each layer can contain one or more neurons.
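As a minimal sketch (using PyTorch, with purely illustrative layer sizes), such a model might look like this:

```python
import torch.nn as nn

# One input layer (4 features), one hidden layer (8 neurons), one output layer (3 classes).
model = nn.Sequential(
    nn.Linear(4, 8),   # input layer -> hidden layer
    nn.ReLU(),         # activation function applied by the hidden neurons
    nn.Linear(8, 3),   # hidden layer -> output layer
)
```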


Models can become increasingly complex, with increased abstraction and problem-solving capability, by increasing the number of hidden layers, the number of neurons in any given layer, and/or the number of paths between neurons. Note that increased model complexity also raises the chance of over-fitting.


Types of Neural Networks


Feedforward Neural Network: The feedforward neural network is the simplest of all varieties. Information moves in one direction only and is sent from input nodes directly to output nodes. There are no loops or cycles in this network.

Recurrent Neural Network: Unlike its feedforward cousin, the recurrent neural network allows data to flow bidirectionally. This type of network is a popular choice for pattern recognition applications such as speech recognition and handwriting recognition.

Modular Neural Network: A modular neural network is made up of independent neural networks. Each is given a set of inputs, and they work together to complete sub-tasks. The final output of the modular neural network is managed by an intermediary that collects data from the individual networks.


Convolutional Neural Network: Convolutional neural networks are primarily used to classify images. For example, they are able to cluster similar photos and identify specific objects within a scene, including faces, street signs and individuals, as in the sketch below.
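A minimal sketch of such a convolutional image classifier in PyTorch; the channel counts, the 32x32 input size and the 10 output classes are illustrative assumptions, not taken from the article:

```python
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learn local image features
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # scores for 10 classes
)
```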

Neural networks are becoming more and more common and they are often implemented without much consideration of their potential security flaws. 

Neural networks are used very commonly by most of us. Asking Siri a question, self-driving cars, face-recognition cameras, face identification within social media applications, Alexa and so on all rely on neural networks. These are just the tangible applications; there are plenty of intangible applications of neural networks that people use every day. Whether it is the software program you use at work or just a search for a place to go to dinner that evening, you are likely using some form of neural network.


Neural networks can be rather sensitive to the inputs that you give them. Therefore, it is relatively easy to fool the network if you know the right buttons to push. By manipulating certain parts of an image it is easy to activate neurons associated with particular features, which can make the network give spurious outputs.

Let’s say that you use Alexa to buy things on a regular basis. One day, a particularly smart hacker sits outside of your house and hacks your Wi-Fi (say, using aircrack-ng) because you have not secured your router properly or still have a default password. The hacker now has access to Alexa, which has security privileges to make transactions on your behalf given your verbal approval. If the hacker is smart enough, it is conceivably possible that they could fool Alexa into giving away all of your money, just by pushing the right buttons on the neural network.

AI is a huge ecosystem of tools, languages and frameworks, with functions ranging from data ingestion and pre-processing to modeling, model evaluation, integration, visualization, and packaging & deployment.


Multiple stakeholder groups, ranging from traditional software engineers to machine learning engineers, data scientists and statisticians, are involved in any project.

The complete working solution usually involves multiple systems coupled together at various interfaces.


Huge amounts of business data are involved, often millions or even billions of records. Depending upon the problem domain, this data may have a tremendous amount of value or sensitivity. Moreover, the preference is to consider all attributes/aspects/dimensions of the data.

One can’t use ‘dummy data’ for the testing/pre-production stages in AI & ML. At all points in the ecosystem, various stakeholders and various systems handle precious ‘real’ data. The many iterations and requests for a variety of data can disrupt any existing data governance we may have in place.

Adversarial Attacks

The typical purpose of an adversarial attack is to add natural-looking noise to an image so that the target model misclassifies the sample while it is still correctly classified by the human eye.

Non-targeted adversarial attacks are developed to fool a machine learning classifier by modifying the source image. Unlike targeted attacks, the neural network is not steered towards one particular class; the output can be any class other than the original one.

Targeted adversarial attacks are designed to misclassify an image as a specified target class by modifying the source image, so the output of the neural network is one particular, chosen class. Impersonation is an example of this type of attack, because an adversarial image can disguise a face as an admin user.

Defenses against adversarial attacks aim to build a classifier robust enough to still identify adversarial images correctly.

Essentially, attacks on neural networks involve the introduction of strategically placed noise designed to fool the network by falsely stimulating activation potentials that are important for producing certain outcomes. To see what ‘strategically placed noise’ means, consider the image developed by Google Brain showing how the same effect can fool humans: are the pictures on the left and the right both cats? Or dogs? Or one of each?

[Image: Google Brain adversarial example — two nearly identical cat/dog photos]

[Image: an illustration of how the network is corrupted by the introduction of strategic noise]

White Box & Black Box Attacks

A white box attack occurs when someone has access to the underlying network and as a result knows the architecture of the network. This is analogous to a white box penetration test of a company’s IT network: once a hacker understands how your IT network is structured, it becomes much easier to sabotage. Knowing the structure of the network helps the attacker select the most damaging attacks to perform and also helps to unveil weaknesses relevant to the network structure.

A black box attack occurs when the attacker knows nothing about the underlying network. In the context of neural networks, the architecture is treated as a black box.

Presuming that we are able to test as many samples as we like on the network, we can develop an inferred network by passing a bunch of training samples into the network and obtaining the output. We can then use these labeled training samples as our training data and train a new model to obtain the same output as the original model.

Two classifications:

1) The adversary has access to the training environment and knowledge of the training algorithm and hyperparameters. It knows the neural network architecture of the target policy network, but not its random initialization. This model is referred to as transferability across policies.

2) The adversary additionally has no knowledge of the training algorithm or hyperparameters. This model is referred to as transferability across algorithms.

Once we have our new network, we can develop adversarial examples for our inferred network and then use these to perform adversarial attacks on the original model.
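A minimal sketch of this substitute-model idea in PyTorch. The helper query_target stands in for whatever query access the attacker has to the original model (submit inputs, observe outputs); it, the hidden-layer size and the epoch count are hypothetical, not from the article.

```python
import torch
import torch.nn as nn

def train_substitute(query_target, inputs, epochs=10):
    # Label our own inputs by querying the black-box target model.
    with torch.no_grad():
        labels = query_target(inputs).argmax(dim=1)

    # Train a model of our own choosing to imitate the target's behaviour.
    substitute = nn.Sequential(
        nn.Linear(inputs.shape[1], 64),
        nn.ReLU(),
        nn.Linear(64, int(labels.max()) + 1),
    )
    optimizer = torch.optim.Adam(substitute.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(substitute(inputs), labels)
        loss.backward()
        optimizer.step()

    # Adversarial examples crafted against `substitute` often transfer to the
    # original black-box model.
    return substitute
```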


This approach does not depend on knowing the architecture of the network, although knowing it would make the attack easier to perform.


Clearly, this presents a potential problem for the mass adoption of self-driving cars. Nobody would want their car to ignore a stop sign and continue driving into another car, or a building, or a person. 


Evasion and Poison Attacks

Attacks that involve ‘fooling’ a system are evasion attacks.

An example would be fooling a spam detector that guards email accounts so that you are able to get your spam emails into someone’s inbox. Spam detectors often use some form of machine learning model for word filtering: if an email contains too many ‘buzzwords’ typically associated with spam email (given your email history as the training data), it will be classified as spam. Now, if I know these words, I can deliberately change them to make it less likely that the detector will consider my email spam, and I will be able to fool the system.

Another good example is in computer security, where machine learning algorithms are often implemented in intrusion detection systems (IDSs) or intrusion prevention systems (IPSs). When a network packet reaches my computer that has the characteristic signature of a piece of malware, the algorithm kicks in and stops the packet before it can do anything malicious. However, a hacker can use misleading code to ‘confuse’ the system so that it does not flag up a problem.

Poisoning attacks involve compromising the learning process of an algorithm, but they only work on models that participate in online learning, i.e. models that learn on the job and retrain themselves as new data becomes available to them.

Considering the above IDS example, such systems are constantly updated using online learning, since new viruses are always being developed. If one wishes to prevent a zero-day attack, it is necessary to give these systems the capability of online learning through an integrated online big data pool that utilizes data analytics. In a poisoning attack, the attacker poisons the training data by injecting designed samples to eventually compromise the whole learning process. This makes the IDS useless, and you are at much greater risk from potential viruses and will likely not even realize it. Poisoning may thus be regarded as adversarial contamination of the training data. A similar attack could be mounted against the spam detector example.
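A minimal sketch of how such a poisoning injection might look in NumPy; the function name, the label-flipping strategy and the 5% poisoning fraction are illustrative assumptions, not taken from the article.

```python
import numpy as np

def poison_training_set(X, y, attack_class, benign_label, fraction=0.05, seed=0):
    """Inject near-copies of `attack_class` samples, mislabeled as benign."""
    rng = np.random.default_rng(seed)
    candidates = np.where(y == attack_class)[0]
    idx = rng.choice(candidates, size=int(fraction * len(y)), replace=True)
    X_poison = X[idx] + rng.normal(scale=0.01, size=X[idx].shape)  # near-copies
    y_poison = np.full(len(idx), benign_label)                     # wrong labels
    # The online learner retrains on this contaminated set.
    return np.vstack([X, X_poison]), np.concatenate([y, y_poison])
```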

Fast Gradient Sign Method (FGSM): this manipulates the sharp decision boundaries used by the classifier through the introduction of strategic noise, as we have been discussing up to now.
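A minimal sketch of FGSM in PyTorch; `model` is any differentiable classifier and the epsilon value (how much noise is added) is illustrative.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, true_label, epsilon=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x), true_label)
    loss.backward()
    # Step in the direction that *increases* the loss, scaled by epsilon.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()    # keep pixel values in a valid range
```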


Some methods of defense

There are a number of methods that have been developed to defend neural networks from the various types of attack vectors.

Adversarial Training

The best way of defending against adversarial attacks is through adversarial training: actively generate adversarial examples, adjust their labels, add them to the training set, and then train the network on this updated training set. This helps make the network more robust to adversarial examples.
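A minimal sketch of one adversarial-training step, reusing the fgsm_attack function from the FGSM sketch above; `model`, `optimizer`, `loss_fn` and the data batch are assumed to exist and are illustrative.

```python
def adversarial_training_step(model, optimizer, loss_fn, x, y, epsilon=0.03):
    x_adv = fgsm_attack(model, x, y, epsilon)       # generate adversarial examples
    optimizer.zero_grad()
    # Train on clean and adversarial samples, both with their correct labels.
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```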


Smooth decision boundaries

Regularization acts to smooth the decision boundaries between classes, making it harder to manipulate the network’s classification using strategic noise injection.
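One common way to encourage smoother boundaries is L2 regularization (weight decay); a minimal PyTorch sketch, with an illustrative decay value:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
# weight_decay adds an L2 penalty on the weights, discouraging sharp boundaries.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```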


Mixup

Mixup involves mixing two training examples by some factor λ between zero and one, and assigning non-integer classification values to these mixed samples. This augments the training set and reduces the over-confident classification tendencies of networks. It essentially diffuses and smooths the boundaries between classes and reduces the reliance of classification on a small number of neuron activation potentials.
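A minimal NumPy sketch of mixup; drawing λ from a Beta(α, α) distribution is a common choice, and the α value here is an illustrative assumption.

```python
import numpy as np

def mixup(x1, y1_onehot, x2, y2_onehot, alpha=0.2, rng=np.random.default_rng()):
    lam = rng.beta(alpha, alpha)                       # lambda in (0, 1)
    x_mixed = lam * x1 + (1 - lam) * x2                # blended input
    y_mixed = lam * y1_onehot + (1 - lam) * y2_onehot  # non-integer label
    return x_mixed, y_mixed
```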

CleverHans

The essential story is of a horse that supposedly was able to do basic arithmetic by stamping its feet a given number of times. However, it was later discovered that the horse was actually cheating and responding to the verbal and visual cues of the surrounding crowd.


From that name, CleverHans is a Python library that has been developed to benchmark machine learning systems’ vulnerability to adversarial examples. If you are developing a neural network and want to see how robust it is, test it out with CleverHans and you will get an idea of its level of vulnerability.

In such an adversarial example, the convolutional neural network (CNN) has a much more ‘distributed’ opinion about what the picture shows. In fact, some part of the network even thinks it might be a horse!


Adversarial machine learning exploits the above ideas to make the learning and/or prediction mechanisms in an AI/ML system do the wrong things. This is done either by impacting the learning process, when an attacker may be able to feed data while the model is being trained, or during prediction/operation, when an attacker may get the model to do something wrong or unexpected for a given input chosen by the attacker. Attacks may be performed in a ‘white-box’ manner, when the attacker knows most things about the internals of the model, its hyper-parameters, etc., or in a ‘black-box’ manner, when the attacker can only explore the system ‘from the outside’ by observing its decisions for certain inputs, like any other end user.

In adversarial attacks, the attacker tries to disturb the inputs just enough so that the probability distribution of ‘what it is?’ changes in a manner that is favorable to the attacker. In the CNN example above, the adversary may try to tweak certain parts of the image so that the probability of the network thinking that it is a ‘horse’ goes up significantly and ‘horse’ gets voted as the ‘correct’ classification.

Another example of an adversarial attack is Tay Bot, an AI chatbot created by the Microsoft AI research team and launched on Twitter to have conversations with other users and learn to interact with people ‘along the way’. However, as an outcome of coordinated attacks to ‘mis-train’ it, Tay rapidly became offensive and started posting all sorts of inflammatory tweets. It had to be brought down within just 16 hours of its launch!

How to secure & maintain privacy?

Start with Awareness: Ensure that all members/stakeholders have a good basic understanding of security and privacy: data classification, data protection techniques, authentication/authorization, privacy principles, applicable regulatory requirements, and so on. The goal should be to ensure that all stakeholders have a role-appropriate understanding of security and privacy, that everyone uses the same terminology, and that everyone knows and understands the relevant policies and standards.

Data Governance & Privilege Management: Ownership and accountability should be clear for various stakeholders as data changes hands at different stages of each workflow. This is particularly important given the wide circulation of data that is inevitable in ML & AI projects. The right level of privileged access management (PAM) and authentication is needed: AI needs to be accompanied by the same strong access barriers one would encounter through a web or mobile interface. This virtual barrier could include passwords, biometrics or multi-factor authentication.

Diligent threat modeling of solutions: At the component level as well as from an end-to-end perspective. This will ensure that security is ‘built in’ to the design and that applicable security requirements are met at every point in the end-to-end system. Attention should be paid to boundaries and interfaces between the different sub-systems, and assumptions made by either side at those interfaces should be clearly verified. Also, because production data is involved everywhere, all workflows must be exhaustively covered in the threat models (from the earliest experiments and proofs of concept to the fully operational system as it would be deployed in production). All threats/risks identified during threat modeling must be fixed through a combination of feature security testing and penetration assessments.

Monitoring hygiene & IR plan: Keep all software components at their latest security patch level, conduct periodic access reviews, rotate keys/certificates, etc., and embed a strong incident response plan to deal with a calamity if one does happen.

Inference control is the ability to share extracts from large scale datasets for various studies/research projects without revealing privacy sensitive information about individuals in the dataset.

Three common inference-control techniques are ‘k-anonymity’, ‘l-diversity’ and ‘t-closeness’.

K-anonymity is used to provide a guarantee that any arbitrary query on a large dataset will not reveal information that can help narrow a group down below a threshold of ‘k’ individuals. The technique provides an assurance that there will remain an ambiguity of at least k records for anyone mining for privacy-sensitive attributes in a dataset. Attacks on k-anonymity can happen if the results from different subsets of the dataset are unsorted; a mitigation is to randomize the order of each released subset.
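A minimal pandas sketch of checking k-anonymity over a released extract; the column names, the quasi-identifiers and the value of k are illustrative assumptions.

```python
import pandas as pd

def is_k_anonymous(df, quasi_identifiers, k=5):
    # Every combination of quasi-identifier values must occur at least k times.
    group_sizes = df.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

released = pd.DataFrame({
    "age_band": ["30-39", "30-39", "30-39", "40-49", "40-49"],
    "postcode": ["110*", "110*", "110*", "122*", "122*"],
    "diagnosis": ["A", "B", "A", "C", "A"],
})
print(is_k_anonymous(released, ["age_band", "postcode"], k=2))  # True
```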

Another class of attacks comes into play if there is not enough diversity in the records containing a sensitive attribute within each equivalence group. In that case, an attacker can use background information on individuals to infer sensitive data about them. L-diversity tries to address this by ensuring that equivalence groups have ‘attribute diversity’: it ensures that subsets of the dataset that share the same quasi-identifier values have sufficient diversity of the sensitive attribute. L-diversity works hand in hand with k-anonymity, adding ‘attribute inference’ protection to a dataset that k-anonymity already protects against ‘membership inference’.

T-Closeness mitigates these weaknesses by consciously keeping the distribution of each sensitive attribute in an equivalence group ‘close’ to its distribution in the complete dataset. In ‘t-closeness’, the distance between the distribution of a sensitive attribute in an equivalence group and the distribution of that attribute in the whole table is no more than a threshold ‘t’.

Differential privacy provides a mathematical framework that can be used to understand the extent to which a machine learning algorithm ‘remembers’ information about individuals that it shouldn’t, therefore offering the ability to evaluate ML algorithms for the privacy guarantees they can provide. This is invaluable because we require models to learn general concepts from a dataset (e.g., people with a salary higher than X are 90% more likely to purchase drones than people with a salary less than Y) but not specific attributes that can reveal the identity or sensitive data of the individuals that make up the dataset (e.g., Atul’s salary is X). Differential privacy adds a controlled amount of ‘noise’ during processing so as to generate enough ambiguity downstream that privacy-impacting inferences cannot be made based on predictions from the system.
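A minimal sketch of the Laplace mechanism, a classic way of adding such calibrated noise to a counting query; the epsilon value and the data are illustrative.

```python
import numpy as np

def private_count(values, predicate, epsilon=0.5, rng=np.random.default_rng()):
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1   # one person joining/leaving changes a count by at most 1
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

salaries = [40_000, 85_000, 120_000, 95_000, 60_000]
# Smaller epsilon -> stronger privacy guarantee but noisier answer.
print(private_count(salaries, lambda s: s > 80_000, epsilon=0.5))
```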


PATE Framework

The Private Aggregation of Teacher Ensembles (PATE) Framework applies differential privacy to provide an overall privacy guarantee on the model being trained from user data. The key intuition in the PATE framework is that “if two models trained on separate data agree on some outcome then it is less likely that sharing that outcome to the consumer will leak any sensitive data about a specific user”.

The framework divides the private data into subsets and independently trains different models (called ‘teachers’) on each of the subsets. The overall prediction is generated by combining the individual predictions of this ‘ensemble’ of teacher models. First, noise is added when combining the outcomes of individual teachers so that the combined result is a ‘noisy aggregation’ of individual teacher predictions. Second, these noisy predictions from the teacher ensemble are used as ‘labeled training data’ to train a downstream ‘student’ model. It is this student model that is exposed to end users for consumption.
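A minimal NumPy sketch of the noisy aggregation step; the noise scale and the vote data are illustrative.

```python
import numpy as np

def noisy_aggregate(teacher_votes, num_classes, noise_scale=1.0,
                    rng=np.random.default_rng()):
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=noise_scale, size=num_classes)  # noisy aggregation
    return int(np.argmax(counts))   # noisy label used to train the student model

votes = np.array([2, 2, 2, 1, 2, 0, 2, 2, 1, 2])   # 10 teachers, 3 classes
print(noisy_aggregate(votes, num_classes=3))
```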


Federated Learning

Federated Learning takes a somewhat different approach to preserve privacy in learning scenarios. The key idea is not to bring all data together and instead devise ways in which we can learn from subsets of data and then effectively aggregate learnings.

For instance, a group of hospitals may be interested in applying ML techniques to improve healthcare of patients but (a) individual hospitals may not have sufficient data to do so by themselves and (b) they may not want to risk releasing their data for central aggregation and analysis. This is an ideal scenario for applying federated learning.
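A minimal sketch of the federated-averaging idea in NumPy; local_train is a hypothetical helper standing in for whatever on-site training each hospital runs, and weighting by dataset size is one common choice.

```python
import numpy as np

def federated_average(global_weights, hospital_datasets, local_train):
    updates, sizes = [], []
    for data in hospital_datasets:
        # Training happens on-site; only the resulting weights leave the hospital.
        updates.append(local_train(global_weights, data))
        sizes.append(len(data))
    sizes = np.array(sizes, dtype=float)
    # New global model = average of local models, weighted by local data size.
    return sum(w * (n / sizes.sum()) for w, n in zip(updates, sizes))
```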

Homomorphic Encryption: when data is encrypted using traditional techniques, it becomes impossible to do any meaningful computation on it in its encrypted form. With the widespread adoption of cloud computing, one often encounters scenarios where a party possessing sensitive data wants to outsource some computation on that data to a third party which it does not trust with the plaintext data. Homomorphic encryption provides the ability to perform meaningful operations on encrypted data without having direct access to the encryption keys or the plaintext data itself. Using homomorphic encryption, the service can perform the requested computation on the encrypted data and return the encrypted result to the client. The client can then use the encryption key (which was never shared with the service) to decrypt the returned data and obtain the actual result.
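A minimal sketch using the python-paillier library (`pip install phe`), whose Paillier scheme is additively homomorphic: an untrusted service can add encrypted numbers without ever seeing the plaintext or the keys. The salary figures are illustrative.

```python
from phe import paillier

# Client side: generate keys and encrypt the sensitive values.
public_key, private_key = paillier.generate_paillier_keypair()
salaries = [40_000, 85_000, 120_000]
encrypted = [public_key.encrypt(s) for s in salaries]

# Untrusted service: computes on ciphertexts only.
encrypted_total = sum(encrypted[1:], encrypted[0])

# Client side: decrypt the returned result with the private key.
print(private_key.decrypt(encrypted_total))   # 245000
```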

Archie Jackson

The article has been written by Archie Jackson, Senior Director, Head of Special Initiatives, IT & Security and CEO Office, Incedo Inc
