A recent study shows that large language models trained on public data can be prompted to provide sensitive information if fed the right words and phrases. And the study is just the tip of the iceberg.
It has already been established that artificial intelligence carries bias: specific words are associated with women ("smile") versus men ("official"), negative connotations often appear in references to people of color versus their Caucasian counterparts, and so on. That is not new and definitely needs to be addressed, but a recent study shows that large language models trained on public data can also memorize and leak private information. The two problems go hand in hand, and as AI becomes more prominent and widely used, both need to be addressed.
The collaboration looked specifically at GPT-2, whose smallest version has 124 million parameters, compared with the more popular GPT-3's 175 billion. The problem is, the research also found that larger language models memorize training data more readily than their smaller counterparts. For instance, one experiment showed that GPT-2 XL (1.5 billion parameters) memorized 10 times more information than the smallest GPT-2 model. The implications for the largest models are huge: GPT-3 is publicly available through an API.
Microsoft's Turing Natural Language Generation Model is used in several Azure services and contains 17 billion parameters. Facebook's translation model has over 12 billion parameters. These are the kinds of larger language models the study references: because they memorize their training data, they can leak sensitive information.
“Language models continue to demonstrate great utility and flexibility — yet, like all innovations, they can also pose risks. Developing them responsibly means proactively identifying those risks and developing ways to mitigate them,” Google research scientist Nicholas Carlini wrote in a blog post. “Given that the research community has already trained models 10 to 100 times larger, this means that as time goes by, more work will be required to monitor and mitigate this problem in increasingly large language models … The fact that these attacks are possible has important consequences for the future of machine learning research using these types of models.”
Not only do these models require intense monitoring and mitigation protocols, but there also needs to be tight security around anything they touch. Since they are trained on public data, it is important for businesses and consumers to limit the information they share publicly. These models are trained on billions of web-based examples (e-books, social media posts and more) and use that information to complete sentences or paragraphs. This is the information they memorize, and the information that can be "leaked" when prompted correctly.
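The completion-and-leakage mechanism can be illustrated with a deliberately tiny toy. The sketch below is not the study's methodology and uses no real model; it trains a trivial trigram table on text that happens to contain an invented email address, then shows that greedy completion from the right two-word prefix regurgitates the memorized string verbatim. All names and strings are hypothetical.

```python
# Toy illustration of memorization leakage (NOT the study's method):
# a trigram "language model" built from a small corpus that contains
# a private-looking string. A well-chosen prompt makes greedy
# completion emit that string verbatim.
from collections import defaultdict

corpus = (
    "the weather today is sunny and warm . "
    "contact jane doe at jane.doe@example.com for details . "  # invented "private" line
    "the weather tomorrow is cloudy and cool . "
)
tokens = corpus.split()

# Trigram counts: (word1, word2) -> {next_word: frequency}
counts = defaultdict(lambda: defaultdict(int))
for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
    counts[(a, b)][c] += 1

def complete(prompt, max_new=6):
    """Greedily extend a two-word prompt using the trigram table."""
    out = prompt.split()
    for _ in range(max_new):
        nxt = counts.get((out[-2], out[-1]))
        if not nxt:
            break
        out.append(max(nxt, key=nxt.get))  # most frequent continuation
    return " ".join(out)

# The "attack": prompting with the memorized prefix leaks the email.
print(complete("contact jane"))
# The unrelated "weather" prefix stays innocuous.
print(complete("the weather"))
```

Real attacks against large neural models are far more sophisticated, but the principle is the same: a model that has stored a training sequence can be steered into reproducing it by supplying part of that sequence as a prompt.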
Imagine the implications of a bad actor using this technology to guess credentials of employees at a company, credentials they will first test on personal accounts to find exact matches before attempting them at the employees' place of work. If the credentials match, as they often do because so many people reuse passwords, the bad actor is now inside a business system and no one knows they aren't supposed to be there. They can wreak havoc: implanting malware to steal information, hijacking compute to mine cryptocurrency, holding systems hostage or simply bringing the business to a standstill.
With security already on the minds of everyone in business, this news should put organizations on high alert. These language models power many applications we use every day, and hackers now have another avenue to exploit, if they haven't started already. If you haven't done a security review recently, now is the time. Bring in an expert, consult your team and start fixing weak spots now. The more secure you make your business in this ever-expanding digital realm, the more likely you are to survive when an attack inevitably winds up at your door.