Computers Are Hard: AI Fooled By Written Words

Artificial Intelligence is emerging as the next big thing for businesses. But computers are hard, and AI is far from perfect.

Computers are hard. Consumers use technology every day, but very few of them actually consider what it takes to make that technology work. It’s layers upon layers of hardware and software, and in today’s world, that hardware has become increasingly small. Built inside those layers of hardware are layers and layers of code. Infrastructure, functionality, design, stability, security: it’s all coded into the system to ensure smooth and easy use of the product. But computers are something we can now produce by rote. Innovations in speed, video quality, photo quality, size and clarity will continue, but computers are already part of our daily lives. Technology advances daily, building on the previous day’s discoveries. That is exactly how Artificial Intelligence and Machine Learning work: by building on what they already know. The problem is that limitations continue to arise.

AI and ML are incredibly important to the future of technology. As humans, we enjoy life’s conveniences. Want a pizza? There’s an app for that. Stuck on the side of the road? There’s an app for that. Want to see the inside of the Louvre? There’s virtual reality for that. All of those innovations have led to the further development of AI and ML. ML is taught using data found on social media and various other websites. It learns from human interaction to determine what humans really want. Autocorrect is a form of AI. Google uses prediction methods in its search bar based on your previous searches. All of these are advances in AI and ML.

In the past, we discussed how AI has limitations based on the data used to train it. Studies have shown that AI can be biased against people of color and women, and Apple recently came under fire for blocking any URL containing the word “Asian” when parental controls are in use. This is evidently not the end of AI’s limitations, and we may only be beginning to discover them. Most recently, research has shown that AI can become confused when written words are placed over a recognizable object.

Let’s back up a little. The CLIP neural network, OpenAI’s system for allowing computers to recognize the world around them, is a machine learning system. Neural networks use a network of interconnected nodes and can be trained over time to get better at specific tasks. CLIP is designed to identify objects in an image, in ways that aren’t always immediately clear to the system’s developers. OpenAI published research on the system last week concerning multimodal neurons, which “respond to clusters of abstract concepts centered around a common high-level theme, rather than any specific visual feature.”

The research showed that written words can actually confuse the system. For instance, the team wrote that CLIP has a multimodal “Spider-Man” neuron, which fires upon seeing an image of a spider, the word “spider” or any image or drawing of the Marvel superhero. These multimodal neurons are what make the system easy to fool, as evidenced by OpenAI’s research. The team tricked CLIP into thinking an apple (the actual fruit) was an iPod, simply by taping a piece of paper that says “iPod” to the apple. In fact, CLIP was more certain that the labeled apple was an iPod (99.7%) than it was that the unlabeled fruit was an apple (85.6%).

The research team is dubbing this a “typographic attack,” and it would not be difficult for anyone to exploit deliberately.

“We believe attacks such as those described above are far from simply an academic concern. By exploiting the model’s ability to read text robustly, we find that even photographs of hand-written text can often fool the model.

[…] We also believe that these attacks may also take a more subtle, less conspicuous form. An image, given to CLIP, is abstracted in many subtle and sophisticated ways, and these abstractions may over-abstract common patterns—oversimplifying and, by virtue of that, overgeneralizing.”
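
The apple and iPod result can be illustrated with a toy sketch. CLIP scores an image by comparing its embedding against the embeddings of candidate text labels and turning those similarities into probabilities. The code below is not CLIP itself: the three-dimensional vectors, the `classify` helper and the `scale` value are all made up for illustration. It only shows how a strong written-word signal in the image embedding can outweigh the visual evidence.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image_vec, label_vecs, scale=10.0):
    """Score an image embedding against each label embedding.

    `scale` is an assumed sharpening factor, not a value from the research.
    """
    labels = list(label_vecs)
    scores = [scale * cosine(image_vec, label_vecs[name]) for name in labels]
    return dict(zip(labels, softmax(scores)))

# Hypothetical embeddings: [fruit-like, gadget-like, contains the text "iPod"]
LABELS = {
    "apple": [1.0, 0.0, 0.0],
    "iPod": [0.0, 0.6, 0.8],
}

plain_apple = [1.0, 0.1, 0.0]    # looks like fruit, no writing on it
labeled_apple = [1.0, 0.1, 2.0]  # same fruit with a paper "iPod" label

print(classify(plain_apple, LABELS))
print(classify(labeled_apple, LABELS))
```

In this toy setup the plain apple is scored as an apple, while the labeled apple is scored as an iPod, because the written-word component dominates the comparison. Real CLIP embeddings have hundreds of dimensions and are learned, not hand-written, so the numbers here should not be read as a reproduction of OpenAI’s figures.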


While this may seem like a failing of CLIP, the truth is that it isn’t. It’s really an illustration of how incredibly complicated the associations it has built over time are. OpenAI’s researchers have said that the associations CLIP builds are similar to the functioning of a human brain, which we all know is incredibly complicated. A child learning shapes and colors cannot always tell you where the red square is, instead pointing to the red circle because it’s red, or pointing to the yellow square because it’s a square.

The point is, the human brain is complex. We want AI to function like a human brain, or at least as closely as it can. In order to do that, we not only have to give its “brain” time to grow and learn, but we also have to understand that the data we feed it should be healthy and clean, just like the food we feed our children. Computers are hard. AI is even harder. Most people don’t understand just how far we’ve come in this area over the last decade and think we should be further along than we are. But the truth is, we’re leaping over obstacles every day to maintain our pace of technology innovation. All things take time to perfect, and AI is no different.

About the Author

Pieter VanIperen, Managing Partner of PWV Consultants, leads a boutique group of industry leaders and influencers from the digital tech, security and design industries that acts as trusted technical partners for many Fortune 500 companies, high-visibility startups, universities, defense agencies, and NGOs. He is a 20-year software engineering veteran who has founded or co-founded several companies. He acts as a trusted advisor and mentor to numerous early stage startups, and has held the titles of software and software security executive, consultant and professor. His expert consulting and advisory work spans several industries in finance, media, medical tech, and defense contracting. He has also authored the highly influential precursor HAZL (jADE) programming language.
