By Armando Orozco
Be wary the next time you enter your passcode into your iPhone on the bus – someone could be shoulder surfing. In fact, a team of researchers from the University of North Carolina has developed a system to watch you pecking out characters on your phone, analyse the video, and produce a pretty accurate guess of what you were typing.
When people talk about key loggers, they’re usually thinking about malware that sits on a computer and surreptitiously monitors what keys people are pressing. But these university researchers are applying an entirely different approach to key logging. Instead of putting software on computers, they are investigating ways to monitor the text that people input into their mobile phones. They do it by taking video of your phone, either directly (over your shoulder or from the side), or simply by reading the reflections of your phone’s screen in your glasses.
The researchers developed a mechanism for looking at mobile phone screens using cheap, mobile video cameras. The cameras record video of people typing on ‘soft’ keyboards, such as those used by Apple’s iPhone. These keyboards commonly use ‘pop-out’ animations, in which the key being pressed briefly enlarges to confirm to the user that they have selected the right letter. The pop-out animation also makes it easier to see in the video which keys are being pressed.
Mobile cameras have increased dramatically in quality lately, making them far more capable of capturing reflected keyboard images. These cameras are embedded in smartphones, of course, or if you wanted to get even techier, you could buy one of these.
The researchers’ mechanism involves identifying the phone within the video image, so that it can follow the phone as it moves around. Their process then compares images of the phone from the video with a reference image, mapping features on the phone to features on the reference image, which helps to compensate for distortion (especially if the phone is reflected in a pair of concave sunglasses, for example). This creates a stable enough image of the phone in the video to let the automated process detect key presses on the soft keyboard, mapping their position to tell which keys have been pressed.
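To make the alignment idea concrete, here is a minimal Python sketch. It is not the researchers' code: it assumes a simple affine transform fitted from three matched landmarks, which can only undo shifts, scaling, rotation and skew, not the curved distortion of a reflection in a lens. The landmark coordinates, the three-key layout and all function names are hypothetical, chosen only for illustration.

```python
from math import hypot


def solve_affine(src, dst):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three point pairs.

    Uses Cramer's rule on the 3x3 system; returns (a, b, c, d, e, f).
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source landmarks are collinear")

    def solve(v1, v2, v3):
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    xs = [p[0] for p in dst]
    ys = [p[1] for p in dst]
    return solve(*xs) + solve(*ys)


def apply_affine(transform, point):
    """Map a point from video coordinates into reference coordinates."""
    a, b, c, d, e, f = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def nearest_key(layout, point):
    """Return the key whose centre in the reference image is closest."""
    return min(
        layout,
        key=lambda k: hypot(layout[k][0] - point[0], layout[k][1] - point[1]),
    )


# Hypothetical data: three phone landmarks as seen in one video frame,
# and the same landmarks in the reference image of the phone.
video_landmarks = [(10, 10), (110, 10), (10, 60)]
reference_landmarks = [(0, 0), (200, 0), (0, 100)]

# Hypothetical key centres in reference-image coordinates.
reference_layout = {"q": (20, 20), "w": (60, 20), "e": (100, 20)}

transform = solve_affine(video_landmarks, reference_landmarks)
press_in_video = (40, 20)  # where a pop-out was spotted in the frame
press_in_reference = apply_affine(transform, press_in_video)
print(nearest_key(reference_layout, press_in_reference))
```

Once every frame is mapped into the same reference coordinates, working out which key was pressed becomes a simple nearest-neighbour lookup, however the phone moved in the shot. A real system would need many more matched features and a richer distortion model.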
Their keyboard-sniffing methodology then runs the data through a language parsing system that matches the key presses detected in the video against dictionary models, to produce estimates of what was actually typed. The results were surprisingly accurate, with scores rated ‘good’ or better, and every test scored above the threshold for an understandable result.
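The dictionary-matching step can be illustrated with a toy Python sketch. This is not the study's language model: it assumes the video stage outputs a probability for each candidate character at each position, and it scores every same-length dictionary word by its prior frequency multiplied by those probabilities. The word list, the frequencies and the probabilities are all invented for the example.

```python
import math


def best_word(observations, dictionary, floor=1e-3):
    """Pick the dictionary word that best explains the noisy key guesses.

    observations: one dict per typed character, mapping candidate
                  characters to the detector's probability for them.
    dictionary:   maps each known word to a prior relative frequency.
    floor:        small probability for characters the detector never
                  proposed, so a single miss does not zero out a word.
    """
    best, best_score = None, -math.inf
    for word, freq in dictionary.items():
        if len(word) != len(observations):
            continue
        # Work in log space to avoid underflow on long inputs.
        score = math.log(freq)
        for ch, probs in zip(word, observations):
            score += math.log(probs.get(ch, floor))
        if score > best_score:
            best, best_score = word, score
    return best


# Hypothetical detector output for a five-letter word. Neighbouring
# keys (j/h, w/e, k/l, p/o) are easy to confuse in a blurry video.
observations = [
    {"j": 0.55, "h": 0.45},
    {"w": 0.60, "e": 0.40},
    {"l": 0.90, "k": 0.10},
    {"l": 0.90, "k": 0.10},
    {"o": 0.80, "p": 0.20},
]
dictionary = {"hello": 0.6, "jelly": 0.3, "cello": 0.1}

# Taking the single most likely character at each position gives gibberish.
raw_guess = "".join(max(p, key=p.get) for p in observations)
corrected = best_word(observations, dictionary)
print(raw_guess, "->", corrected)
```

Here the raw per-character guess is ‘jwllo’, but the dictionary pulls it back to ‘hello’, because that is the only known word consistent with most of the evidence.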
So, this system could be used to sniff passwords that you enter on your phone. But what if your password is deliberate gibberish? It turns out that even single-character recognition (not run through a parsing engine to find likely dictionary words) was remarkably accurate. Over half of gibberish passwords would have been recognised, according to the researchers’ tests.
What does all this mean? This was theoretical work – it’s unlikely that anyone is going to be recording reflections of your phone as you type in your password on the bus. Nevertheless, it shows how insecure mobile systems can be, with the right exploit. If nothing else, it’s a fascinating insight into human ingenuity. You can read the study here.