Can A Potato Chip Bag Spy On You? MIT Researchers Show How Object Vibrations Can Be Used To Reconstruct Conversations [VIDEO]

An object as benign as a potato chip bag can be used to reconstruct and listen to conversations, say MIT researchers. Find out more about this fascinating project in sound vibration analysis. (Photo: Reuters)

It might be hard to fathom that an object as benign as a potato chip bag could compromise your privacy. But, according to MIT researchers' latest "Visual Microphone" presentation, conversations taking place nearby can be extracted or "listened to" simply by studying the vibrations of an object -- in this case, a chip bag.

Using a new algorithm, researchers at MIT discovered that they can reconstruct sound, and indeed human speech, by analyzing the tiny vibrations that sound causes in an object, as captured in a video recording.

Anyone who took science in elementary school likely remembers that sound is created by a vibrating object, which sends a sound wave through the air and can, in turn, cause vibrations in other objects.

Different types of sounds cause different types of vibrations in an object, and figuring out these distinctions is what MIT's algorithm does -- and incredibly well. Some of the recorded vibrations are so minuscule that they move the object by less than a thousandth of a pixel!
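How can a camera register motion smaller than a pixel? The sketch below is one common way to illustrate the idea, not the MIT team's algorithm. It assumes a hypothetical one-dimensional "image" of a soft edge and uses a first-order, gradient-based estimate: a tiny shift changes each pixel's brightness slightly, and comparing that change against the brightness slope recovers the shift. The function names and the edge shape are assumptions made for this example.

```python
import math


def edge_profile(num_pixels, shift=0.0, center=32.0, width=3.0):
    """Brightness of a soft edge sampled at integer pixel positions.

    The sigmoid-shaped edge is a made-up stand-in for a real image row.
    """
    return [
        1.0 / (1.0 + math.exp(-((x - shift) - center) / width))
        for x in range(num_pixels)
    ]


def estimate_shift(before, after):
    """Least-squares estimate of a small 1-D shift from brightness changes.

    For a small shift d, after[x] - before[x] is roughly -d * gradient[x],
    so d can be solved for across all interior pixels at once.
    """
    numerator = 0.0
    denominator = 0.0
    for x in range(1, len(before) - 1):
        gradient = (before[x + 1] - before[x - 1]) / 2.0  # central difference
        change = after[x] - before[x]
        numerator += -change * gradient
        denominator += gradient * gradient
    return numerator / denominator


before = edge_profile(64)
after = edge_profile(64, shift=0.001)  # move the edge by 1/1000 of a pixel
estimated = estimate_shift(before, after)
print(f"true shift: 0.001 px, estimated shift: {estimated:.6f} px")
```

The example is noise-free, which is why a single pair of frames is enough here. Real footage is noisy, so many such measurements would have to be combined.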

"There's this very subtle signal that's telling you what the sound passing through is," said Abe Davis, in a white paper about the Visual Microphone research. Davis is a graduate student in electrical engineering and computer science at MIT and first author of the research. According to Davis, the vibrations recorded may be incredibly small, but when averaged together, an observable sound can be extracted.
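Davis's point about averaging can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the researchers' method: it pretends that each of many pixels reports the same tiny vibration (0.001 pixel in amplitude) buried in random noise ten times larger, then averages the readings across pixels. The noise level, tone, frame rate, and function names are all hypothetical.

```python
import math
import random


def simulate_pixel_readings(num_pixels, num_frames, amplitude=0.001,
                            noise_std=0.01, tone_hz=440.0, fps=2000.0, seed=0):
    """Return the true vibration and one noisy copy of it per pixel."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_signal = [
        amplitude * math.sin(2 * math.pi * tone_hz * frame / fps)
        for frame in range(num_frames)
    ]
    readings = [
        [value + rng.gauss(0.0, noise_std) for value in true_signal]
        for _ in range(num_pixels)
    ]
    return true_signal, readings


def average_across_pixels(readings):
    """Average all pixels' readings frame by frame."""
    num_pixels = len(readings)
    return [sum(frame_values) / num_pixels for frame_values in zip(*readings)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true signal."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))


true_signal, readings = simulate_pixel_readings(num_pixels=2000, num_frames=200)
single_error = rms_error(readings[0], true_signal)
averaged_error = rms_error(average_across_pixels(readings), true_signal)
print(f"error using one pixel:     {single_error:.5f} px")
print(f"error averaging 2000 px:   {averaged_error:.5f} px")
```

With independent noise, averaging N readings shrinks the noise by roughly the square root of N, which is how a signal far below the noise of any single pixel can still become measurable.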

To prove the concept is actually usable, the researchers compiled a video recording of the Visual Microphone in action. The results are pretty amazing, albeit slightly creepy. In the clip, researchers filmed a bag of potato chips from 15 feet away through soundproof glass. Upon analyzing the recording with the algorithm, researchers were able to reconstruct audio of someone reciting "Mary Had a Little Lamb."

In describing the kind of technology needed to extract words from vibrations recorded on video, researchers said a high-speed camera was most suitable.

"Reconstructing audio from video requires that the frequency of the video samples - the number of frames of video captured per second - be higher than the frequency of the audio signal," an MIT blog post on the project stated.

In some experiments, a high-speed camera capturing 2,000 to 6,000 frames per second was used, which is much faster than the 60 frames per second available on high-end smartphones. That figure is still well below the frame rates of the best commercial high-speed cameras, which top out at 100,000 frames per second.
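The frame-rate requirement can be made concrete with a little arithmetic. The sketch below relies on the standard sampling theorem, which is slightly stricter than the quoted wording: to capture a tone without distortion, the sampling rate has to exceed twice the tone's frequency. The helper names and the 440 Hz test tone are assumptions chosen for illustration, and the frame rates are the ones mentioned above.

```python
def nyquist_limit_hz(frames_per_second):
    """Highest frequency that a given frame rate can represent faithfully."""
    return frames_per_second / 2.0


def aliased_frequency_hz(tone_hz, frames_per_second):
    """Apparent frequency of a pure tone after sampling at the frame rate.

    Tones above the Nyquist limit "fold" back into the range 0..fps/2.
    """
    remainder = tone_hz % frames_per_second
    return min(remainder, frames_per_second - remainder)


tone_hz = 440.0  # arbitrary example tone
for fps in (60, 2000, 6000, 100000):
    limit = nyquist_limit_hz(fps)
    apparent = aliased_frequency_hz(tone_hz, fps)
    status = "captured" if tone_hz < limit else "aliased"
    print(f"{fps:>6} fps: limit {limit:>8.1f} Hz, "
          f"{tone_hz:.0f} Hz tone appears at {apparent:.1f} Hz ({status})")
```

Under this simple model, a 60 frames-per-second recording cannot represent the tone directly, while the high-speed frame rates capture it unchanged.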

However, when the same process was tested with an ordinary digital camera, researchers found that information could still be extracted, though not nearly as clearly or consistently. With a standard digital camera, researchers found the recordings good enough to identify the gender of a speaker in a room, the number of speakers, and in some cases, individual speakers' identities.

While the MIT team's Visual Microphone research may strike some readers as both fascinating and terrifying, for the average Joe, it's nothing to be terribly concerned about. Indeed, the researchers themselves have been very open about the limitations of the tool. They state that it is not necessarily a better form of sound reconstruction than other methods already in use, but that the research could make it possible to recover sound in situations where it couldn't be recovered before.

"It's just adding one more tool for those forensic applications," Davis shared.

When iDigitalTimes spoke with forensic scientist Jonathan Zdziarski, he agreed that the Visual Microphone could have real implications for national security, as it could signal a need for physical design changes to secure rooms where classified conversations might take place -- rooms such as the Situation Room in the White House.

"The MIT research is outstanding, and I think could revolutionize the next generation of surveillance and counter-surveillance tech. In my experience, a number of defense contractors have played it loose with their secure rooms for classified discussions; this research demonstrates potential threats associated with having windows in a room, for example, as you can even likely get vibrations off of thin blinds."

To that, Zdziarski added some speculation on where the research could possibly go.

"With the proper grant money, this research could possibly go outside the visible light spectrum and into thermal, infrared, or other areas. If this algorithm could work with existing technologies like that, then it may someday become possible to eavesdrop on conversations in completely closed rooms, or - to more of an extreme - possibly even via satellite. Of course, this is all speculative, and probably a decade or more away, but the research at the very least demonstrates the importance of good OPSEC in addition to taking all of the existing counter-surveillance best practices more seriously."

The Visual Microphone: Passive Recovery of Sound from Video [VIDEO]

Cammy Harbison

Writer/Reporter For iDigitalTimes
