Large Language Models Can Leak Private Data

A recent study shows that large language models trained on public data can be prompted to provide sensitive information if fed the right words and phrases. And the study is just the tip of the iceberg.

It has already been established that there is bias in artificial intelligence. Specific words are associated with women (smile) versus men (official), and negative connotations appear more often in reference to people of color than to their Caucasian counterparts. That’s not new and definitely needs to be addressed, but a recent study shows that large language models trained on public data can also memorize and leak private information. The two problems go hand in hand, and as AI becomes more prominent and widely used, both need to be addressed.

The study on data leakage was conducted by Google, Apple, Stanford University, OpenAI, the University of California, Berkeley, and Northeastern University. According to VentureBeat, “The researchers report that, of 1,800 snippets from GPT-2, they extracted more than 600 that were memorized from the training data. The examples covered a range of content including news headlines, log messages, JavaScript code, personally identifiable information, and more. Many appeared only infrequently in the training dataset, but the model learned them anyway, perhaps because the originating documents contained multiple instances of the examples.”
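To make the idea of a "memorized snippet" concrete, here is a minimal Python sketch of one way to flag generated text that repeats training data word for word. It is not the researchers' procedure, which this article does not describe. The function names, the five-word threshold, and the sample data are all assumptions made up for illustration.

```python
def word_ngrams(text, n):
    """Return the set of all n-word sequences in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def find_memorized(samples, training_docs, n=5):
    """Return the samples that share at least one n-word sequence with the training data."""
    seen = set()
    for doc in training_docs:
        seen |= word_ngrams(doc, n)
    return [s for s in samples if word_ngrams(s, n) & seen]


# Hypothetical data, invented for this example only.
training_docs = [
    "contact jane doe at 555 0100 for support with your order",
    "the quick brown fox jumps over the lazy dog",
]
samples = [
    "please contact jane doe at 555 0100 today",  # repeats five words from the first document
    "a completely novel sentence the model made up itself",
]

memorized = find_memorized(samples, training_docs)
print(memorized)
```

A real check would need a far larger corpus and a more careful definition of overlap, but the principle is the same: output that reproduces long runs of training text verbatim is a sign of memorization rather than generalization.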

The collaboration looked specifically at GPT-2, whose smallest version has 124 million parameters, compared with the more popular GPT-3, which has 175 billion. The problem is that the research also found that larger language models memorize training data more easily than their smaller counterparts. For instance, an experiment showed that GPT-2 XL (1.5 billion parameters) memorized 10 times more information than the smallest GPT-2. The implications for larger models are huge, and GPT-3 is publicly available through an API.

Microsoft’s Turing Natural Language Generation Model is used in several Azure services and contains 17 billion parameters. Facebook’s translation model has over 12 billion parameters. These are the larger language models the study references. Such models not only memorize their training data; because they do, they can also leak sensitive information.

“Language models continue to demonstrate great utility and flexibility — yet, like all innovations, they can also pose risks. Developing them responsibly means proactively identifying those risks and developing ways to mitigate them,” Google research scientist Nicholas Carlini wrote in a blog post. “Given that the research community has already trained models 10 to 100 times larger, this means that as time goes by, more work will be required to monitor and mitigate this problem in increasingly large language models … The fact that these attacks are possible has important consequences for the future of machine learning research using these types of models.”

Not only do these models require intense monitoring and mitigation protocols, but there also needs to be tight security around anything they touch. Because they are trained on public data, it is important for businesses and consumers to limit the information they share publicly. These models are trained on billions of web-based examples, such as ebooks, social media posts and more. Using this information, they learn to complete sentences or paragraphs. This is the information they memorize, and it is the information that can be “leaked” when the model is prompted correctly.
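The completion behavior described above can be imitated with a toy example. The Python sketch below is not a neural network and has nothing to do with how GPT-2 is built; it is a simple word-level lookup table, assumed here purely to show how a model that learns to continue text can hand back a rare training string when given the right prompt. The corpus, the "secret," and the function names are all hypothetical.

```python
from collections import Counter, defaultdict


def train(corpus, context=2):
    """Count which word follows each `context`-word sequence in the corpus."""
    table = defaultdict(Counter)
    for line in corpus:
        words = line.split()
        for i in range(len(words) - context):
            table[tuple(words[i:i + context])][words[i + context]] += 1
    return table


def complete(table, prompt, max_words=10, context=2):
    """Greedily extend the prompt with the most frequent next word, if one is known."""
    words = prompt.split()
    for _ in range(max_words):
        followers = table.get(tuple(words[-context:]))
        if not followers:
            break  # the model has never seen this context
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical "public" training text; the second line contains an invented secret.
corpus = [
    "the weather today is sunny and warm",
    "my password is hunter2 please keep it secret",
    "the weather tomorrow is cloudy",
]

table = train(corpus)
leaked = complete(table, "my password is")
print(leaked)
```

The secret appears only once in the training text, yet the right three-word prompt is enough to retrieve it. That is the pattern the study warns about, at a vastly smaller scale.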

Imagine the implications of a bad actor using this technology to guess the credentials of employees at a company, testing them first on personal accounts to find exact matches and then trying them at the employees’ place of work. If the credentials match, as they often do because many people reuse passwords, a bad actor is now inside a business system and no one knows they aren’t supposed to be there. They can wreak havoc by implanting malware to steal information, cryptomine on company compute, hold systems hostage or simply bring the business to a standstill.

With security on the minds of everyone in business, this news puts everyone on high alert. These language models are used in many applications we use every day, and hackers now have another avenue to exploit, if they haven’t started already. If you haven’t done a security review recently, now is the time. Bring in an expert, consult your team and start fixing weak spots now. The more secure you make your business in this ever-expanding digital realm, the more likely you are to survive when an attack inevitably winds up at your door.

About the Author

Pieter VanIperen, Managing Partner of PWV Consultants, leads a boutique group of industry leaders and influencers from the digital tech, security and design industries that acts as trusted technical partners for many Fortune 500 companies, high-visibility startups, universities, defense agencies, and NGOs. He is a 20-year software engineering veteran who has founded or co-founded several companies. He acts as a trusted advisor and mentor to numerous early-stage startups, and has held the titles of software and software security executive, consultant and professor. His expert consulting and advisory work spans several industries in finance, media, medical tech, and defense contracting. He has also authored the highly influential precursor HAZL (jADE) programming language.
