Could a hand-knit sweater contain a secret message? During World War II, the Office of Censorship in Britain thought so, and banned knitting patterns from being sent abroad.
Hiding secret messages inside non-secret information is called steganography. From invisible inks on paper to Morse code knit into sweaters, steganography continues to evolve.
With today’s mobile technology, a digital picture may be worth a thousand words, or it could be hiding them. Jennifer Newman is laying the groundwork to find them.
A hidden issue
Newman, associate professor of mathematics, leads a project through the Center for Statistics and Applications in Forensic Evidence (CSAFE) at Iowa State related to steganalysis, the detection of the messages and information hidden using steganography. The project is a collaborative effort with two other universities and funded through the National Institute of Standards and Technology (NIST).
Steganography is of interest to forensic scientists due to its potential use in corporate espionage, terrorism and other criminal activity. For example, Al Qaeda is alleged to have used steganography since at least the early 2000s. The potential economic threat of steganography is clear when you consider the Commission on the Theft of American Intellectual Property’s 2017 report estimate: theft of trade secrets costs the U.S. economy between $180 and $540 billion annually.
Data on the use of steganography is elusive because it is so difficult to detect.
“If you look at a picture, how can you tell if it's a stego image? You can't,” said Newman. “If you look at it, there's no difference. It's not like you can see it's weird-looking, or the text embosses out or anything. It's very subtle changes.”
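To illustrate why such changes are invisible to the eye, here is a minimal sketch of one classic textbook technique, least-significant-bit (LSB) embedding. It is offered only as an illustration: the article does not say which embedding methods the apps in Newman's study use. The function names and the toy list of grayscale pixel values are hypothetical.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the lowest bit of successive pixel values.

    `pixels` is a flat list of 0-255 grayscale values (a toy stand-in
    for a real image); `message` is a bytes object.
    """
    # Unpack each byte into 8 bits, most significant bit first.
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover image")
    stego = list(pixels)
    for idx, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[idx] = (stego[idx] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back out of the lowest bits of the pixel values."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[b * 8 + i] & 1)
        out.append(byte)
    return bytes(out)


# Toy cover "image": 64 made-up grayscale values, not real photo data.
cover = [(37 * i) % 256 for i in range(64)]
stego = embed_lsb(cover, b"hi")

# No pixel moves by more than one brightness level out of 256.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
recovered = extract_lsb(stego, 2)
```

In this sketch each pixel changes by at most one brightness level, which is why a stego image can look identical to the original.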
Current methods for analyzing suspected steganography are expensive, rely on artificial intelligence models that must be trained on tens of thousands of images, and still perform poorly in realistic scenarios. This is what Newman’s research aims to fix.
A drive for solving problems
Newman’s research has long focused on studying images using mathematics. Much of her work has been in image processing, which has applications that are integral to many consumer products today.
“I liked image processing from graduate school,” she said. “I'm just a very visual person. I'm not very artsy — I can't even tell you what's a good picture or not, but I’m very interested in the underlying mechanisms of the physics of the image acquisition.”
Newman is continuously driven by the challenge of analyzing technical material. Her doctorate is in mathematics, but she also has a diverse background that includes signal processing, computer science, statistics and physics. Her interest in steganography began in the 1990s, when research with a graduate student drew her to embedding algorithms that hide information, the type of technique she now works to help detect.
“I like to know how things work,” she said. “These are fun problems to solve.”
Since she began her research in steganography, data processing improvements in phones and computers have made hiding information in photos easy for anyone to do. A simple search in Google Play, for example, turns up dozens of applications to download, many of them free, that allow a user to hide information in images with no special training or skills. The number of steganography applications continues to grow, making steganography available to the general public for storing passwords, exchanging messages and other personal uses.
Creating a statistical foundation
Unfortunately, with the increased availability comes an increased possibility of steganography being used for crime. Newman’s current research puts her problem-solving skills to work in detecting steganography. Her research team is creating the first database of images from cell phones, some of them with steganography, that can be used to improve the detection methods currently available.
“There isn't a vetted database of mobile phone data available,” Newman said. “There has to be a scientific method in place that the standards are well-known and well-used and accepted by the scientific community such that the processing that was applied to this image can be admitted into court by an expert witness.”
To create the database, she is cataloging tens of thousands of photos from numerous phone models using multiple steganography apps. The database can be used to develop and test steganalysis algorithms — those that find steganography in an image — and to develop a scientific standard behind steganalysis methods.
“I'm really excited about having forensic datasets available,” said Barbara Guttman, leader of the Software Quality Group at NIST. “I have found for years and years that when you have challenging datasets available, the research community responds by either improving or building better tools.”
Newman’s group catalogs the information related to each photo they take and each stego image they create for the database — for example, what model of phone it was taken on, the application used to create the steganography, the time it was taken, the size of the image and the exposure settings of the camera. They use statistical models to verify their database covers the variety of images needed to create the best algorithms.
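As a rough illustration of what such a catalog entry might look like, the sketch below models one record per image and runs two simple coverage checks. It is an assumption-laden sketch, not the group's actual schema or statistical model: the class name, field names, helper functions and sample values are all hypothetical placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageRecord:
    """One catalog entry; every field name here is a hypothetical choice."""
    filename: str
    phone_model: str
    stego_app: Optional[str]   # None for an unmodified cover image
    captured_at: str           # ISO 8601 timestamp
    width: int
    height: int
    exposure_time_s: float
    iso: int

    @property
    def is_stego(self) -> bool:
        return self.stego_app is not None


def coverage_by_model(records):
    """Count images per (phone model, is_stego) pair to expose gaps."""
    return Counter((r.phone_model, r.is_stego) for r in records)


def exposure_range(records):
    """Return the shortest and longest exposure time in the catalog."""
    times = [r.exposure_time_s for r in records]
    return min(times), max(times)


# Placeholder entries for illustration only, not data from the project.
records = [
    ImageRecord("a.jpg", "PhoneA", None, "2018-01-01T10:00:00", 4032, 3024, 1 / 500, 100),
    ImageRecord("a_s.png", "PhoneA", "AppX", "2018-01-01T10:00:00", 4032, 3024, 1 / 500, 100),
    ImageRecord("b.jpg", "PhoneB", None, "2018-01-02T21:30:00", 4000, 3000, 1 / 15, 800),
]

counts = coverage_by_model(records)
shortest, longest = exposure_range(records)
```

A tally like `counts` makes it easy to spot a phone model with no stego images, and the exposure range gives a first, crude look at how much variability the catalog covers.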
“The role of statistics is variability,” Newman said. “If you don't have variability represented in your database, then you may not be able to observe or detect phenomenon that you're trying to use the database for.”
The importance of variability
Newman’s group has already found that the variability of the exposure settings makes a difference in detecting steganography. Without enough variability, the detection algorithm was less likely to correctly identify whether an image contained steganography. A broader range of exposure settings in the pictures helped the algorithm identify steganography more accurately. As a result, the group uses manual exposure settings to get a wider range of exposures in the photos that are added to the database.
“It's certainly not going to make the photos look very nice sometimes, but even though that's the case, we've shown that with this factor of statistical variety in the exposure settings, it can provide you a better way to detect stego images,” Newman said.
Newman’s work of building and validating the database does not extend to creating or improving steganalysis algorithms. She knows her research is the first important step in a long process of creating a scientific standard. Once that standard is in place, methods or software developed using images in the database could be used by police and forensic units, or by companies interested in protecting their intellectual property.
“My goal is to get better tools in the hands of law enforcement. To do that, the research, academic and vendor communities have to have the background, the materials on-hand so they can build these better tools and know they work,” Guttman said. “Everybody can use this data to improve their tools.”