A group of researchers from US and Swiss universities, in collaboration with Google and its subsidiary DeepMind, has published a paper explaining how data can leak from image-generation platforms built on generative AI models such as DALL-E, Imagen, or Stable Diffusion.
They all work in the same way: the user enters a text prompt, such as “avocado-shaped chair”, and receives an image generated from that text within a few seconds.
The generative AI models behind these platforms have been trained on a very large number of images, each paired with a text description. The idea is that, after processing this huge amount of training data, the neural network can generate new and unique images.
However, the new research shows that these images are not always unique. In some cases, the neural network can reproduce an image that almost exactly matches one used in training, which means neural networks can inadvertently reveal private information.
This study challenges the notion that image-generation models do not retain their training data, and that training data stays private as long as it is not published.
A bit more detail
The results of deep learning systems can amaze non-specialists, who may think they are magical. In fact, there is no magic here: all neural networks work on the same principle, learning from a large dataset with precise descriptions of each picture, for example a series of images labeled “cat” or “dog”.
After training, the neural network is shown a new image and asked to decide whether it is a cat or a dog. From this modest starting point, the developers of these models move on to more complex scenarios, such as creating an image of a non-existent pet with an algorithm trained on many images of cats. These experiments are carried out not only with images, but also with text, video, and even sound.
The starting point for all neural networks is the training dataset. Neural networks cannot create new objects out of thin air. For example, to create an image of a cat, the algorithm must examine thousands of real photos or drawings of cats.
Big Efforts to Keep Datasets Confidential
In their work, the researchers pay special attention to diffusion-style machine learning models, which work as follows: the training data (images of people, cars, houses, and so on) is distorted by adding noise, and the neural network is then trained to restore these images to their original state.
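The noise-then-restore idea can be sketched in a few lines. The toy below is an illustrative assumption, not the actual diffusion architecture: it corrupts flat vectors (standing in for pixel data) with Gaussian noise and fits the simplest possible “denoiser”, a linear map, by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images": flat vectors standing in for pixel data.
images = rng.normal(size=(256, 16))

# Forward step: corrupt each image with Gaussian noise.
noise = rng.normal(scale=0.5, size=images.shape)
noisy = images + noise

# "Denoiser": a linear map fitted by least squares to send the
# noisy inputs back toward the clean originals. Real diffusion
# models use a deep network and many noise levels instead.
W, *_ = np.linalg.lstsq(noisy, images, rcond=None)
restored = noisy @ W

# The fitted map reduces the corruption on the training data.
err_before = np.mean((noisy - images) ** 2)
err_after = np.mean((restored - images) ** 2)
print(err_after < err_before)  # → True
```

The key point for what follows is that the model is optimized to reproduce its training images, which is exactly what makes memorization possible.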
This method generates images of acceptable quality, but a potential disadvantage, compared with, for example, generative adversarial networks, is a greater tendency to leak data. The raw training data can be extracted from such a model in at least three ways:
– Specific prompts can force the neural network to output a particular source image rather than something unique generated from thousands of images.
– The original image can be reconstructed even if only part of it is available.
– It is possible to determine whether a particular image was included in the training data.
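The third attack, membership inference, can be illustrated with the same toy linear “denoiser” idea (an assumption for illustration, not the researchers’ actual method): a model reconstructs the exact inputs it was trained on better than images it never saw, so a simple error threshold separates members from non-members.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy members-only training set and a disjoint set the model never saw.
members = rng.normal(size=(60, 20))
nonmembers = rng.normal(size=(60, 20))

# Fit the toy linear "denoiser" on noised members only.
noise = rng.normal(scale=0.5, size=members.shape)
W, *_ = np.linalg.lstsq(members + noise, members, rcond=None)

# Reconstruction error on the exact training inputs vs. fresh inputs.
train_err = np.mean(((members + noise) @ W - members) ** 2)
fresh = rng.normal(scale=0.5, size=nonmembers.shape)
test_err = np.mean(((nonmembers + fresh) @ W - nonmembers) ** 2)

# Members are reconstructed better on average, so a threshold on this
# error reveals whether a given image was part of the training data.
print(train_err < test_err)
```

Real membership-inference attacks on diffusion models use the model’s denoising loss in the same role as this reconstruction error.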
Neural networks are often lazy: instead of creating a new image, they produce something from the training set when it contains multiple duplicates of the same picture. If an image is repeated more than a hundred times in the training set, there is a very high chance it will leak in close-to-original form.
However, the researchers also showed ways to extract training images that appeared only once in the original set. This approach is far less efficient: of the 500 images the researchers tested, the algorithm recreated only three.
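Checking whether a generated image is a close-to-original leak boils down to a nearest-neighbor search over the training set. The sketch below is a minimal version of that idea; the distance threshold is an illustrative assumption (real pipelines tune it per dataset and often use perceptual rather than pixel distances).

```python
import numpy as np

rng = np.random.default_rng(2)

train = rng.normal(size=(1000, 64))  # toy training "images"
# Simulate a leak: the model output is a near-copy of training image 42.
generated = train[42] + rng.normal(scale=0.01, size=64)

# Distance from the generated sample to every training image.
dists = np.linalg.norm(train - generated, axis=1)
nearest = int(np.argmin(dists))

# A small nearest-neighbor distance flags a likely memorized image.
THRESHOLD = 1.0  # illustrative; unrelated images here sit around ~11
print(nearest, dists[nearest] < THRESHOLD)  # → 42 True
```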
From whom did you steal?
In January 2023, three artists sued AI image-generation platforms for using their online images to train models without any regard for copyright.
A neural network can indeed copy an artist’s style, depriving them of income. The paper notes that in some cases algorithms can, for various reasons, engage in outright plagiarism, generating drawings, photographs, and other images that are almost identical to the work of real people.
The researchers therefore made recommendations for strengthening the privacy of the original training set:
1- Eliminate duplicates in training sets.
2- Re-process training images, for example by adding noise or changing the brightness; this reduces the chance of data leakage.
3- Test the algorithm with special training images, then check that it does not inadvertently reproduce them exactly.
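The first recommendation, deduplication, is straightforward to sketch for exact duplicates: hash each image’s raw bytes and keep only the first copy. This is a minimal illustration (real pipelines also hunt near-duplicates with perceptual hashes); the `dedup` helper and toy data are assumptions, not the researchers’ code.

```python
import hashlib
import numpy as np

rng = np.random.default_rng(3)

def dedup(images):
    """Drop exact duplicate images by hashing their raw bytes."""
    seen, unique = set(), []
    for img in images:
        key = hashlib.sha256(img.tobytes()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(img)
    return unique

# Three distinct toy images, one of them duplicated 100 times --
# exactly the situation that makes leakage most likely.
base = [rng.integers(0, 256, size=(8, 8), dtype=np.uint8) for _ in range(3)]
dataset = base + [base[0].copy()] * 100

print(len(dedup(dataset)))  # → 3
```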
What’s next?
Generative art platforms have certainly sparked an interesting debate lately, one in which a balance needs to be found between artists and technology developers. On the one hand, copyright must be respected; on the other, is AI-generated art really so different from human art?
But let’s talk about security. The paper presents a specific set of facts about only one machine learning model. Extending the concept to all such algorithms leads to an interesting situation. It is not hard to imagine a scenario in which a mobile operator’s intelligent assistant hands over confidential company information in response to a user query, or in which fraudsters instruct a public neural network to create a copy of someone’s passport. However, the researchers emphasize that such problems are still theoretical.
But there are other, very real problems we face right now, as text-generation models like ChatGPT are already being used to write actual malicious code.
And GitHub Copilot helps programmers write code using a huge amount of open-source software as training input. The tool does not always respect the copyright and privacy of the authors whose code ended up in that very extensive training dataset.
As neural networks develop, attacks on them will appear, with consequences that no one yet fully understands.