Every minute, users upload tens of thousands of images to the internet; on Facebook alone, the figure exceeds one hundred thousand. In doing so, users frequently reveal far more than they intend. To counter this, computer scientists at the CISPA Helmholtz Center and the Max Planck Institute for Informatics in Saarbrücken have developed a digital assistant. Using machine learning and neural networks, the researchers have given their “Visual Privacy Advisor” tool the knowledge it needs to warn users before they post images that pose a risk. From 11 June, visitors to the CEBIT computer expo in Hannover will be able to test its latest capabilities in hall 27, stand F68.
Every smartphone lets its user decide whether their current location or private contact list should be shared, and online platforms such as Facebook offer similar mechanisms to protect users’ privacy. Until now, however, no such controls existed for uploaded photos. “Photos pose an underestimated risk. Many people are completely unaware of how much information they contain”, explains Mario Fritz, who has been conducting research at the new CISPA Helmholtz Center on the Saarland Informatics Campus in Saarbrücken since June this year.
Together with Tribhuvanesh Orekondy and Bernt Schiele of the Max Planck Institute for Informatics, Fritz has therefore developed methods for automatically analysing images and assessing the risk to an individual’s privacy if they are posted online. In doing so, the scientists have laid the foundations for a visionary type of software, whose deployment scenario Fritz describes as follows: “You want to send a photo using your phone, but our app warns you that this photo is risky, violates the rules you have set and therefore should not be shared.”
To provide privacy controls comparable to those that already exist for location and contact data, the researchers first needed a classification scheme for images. They identified 68 characteristics that users should be able to protect on request, including tattoos, fingerprints, sex/gender and e-mail addresses. The scientists from Saarbrücken then compiled a pool of 22,000 images and annotated each one with keywords from this classification; on average, each image contained around five of the characteristics. They used 10,000 of these annotated images to train convolutional neural networks (CNNs), a type of deep learning architecture.
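In machine-learning terms, this is a multi-label classification task: each image can carry several of the 68 characteristics at once. The following is a minimal sketch of how such a network could be set up, assuming PyTorch and a standard ResNet backbone; it illustrates the general technique, not the researchers’ actual code.

    # Minimal sketch (hypothetical, not the researchers' actual code):
    # a standard CNN backbone whose classifier is replaced by a 68-way
    # head for multi-label attribute prediction.
    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_ATTRIBUTES = 68  # privacy-relevant characteristics

    model = models.resnet50(weights=None)
    model.fc = nn.Linear(model.fc.in_features, NUM_ATTRIBUTES)

    # Multi-label loss: each image can carry several attributes at once
    # (around five on average, according to the article).
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    def train_step(images, labels):
        """One optimisation step.
        images: float tensor of shape (batch, 3, 224, 224)
        labels: binary tensor of shape (batch, 68), 1 where an
                attribute is present in the image.
        """
        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels.float())
        loss.backward()
        optimizer.step()
        return loss.item()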
In parallel, they carried out studies with several hundred people to discover how strongly the participants feel their privacy is violated by each characteristic and how easily the characteristics can actually be identified in images. The researchers then fed these findings into their neural networks. The networks were not only able to identify the characteristics, but could also, using the computational model thus developed, estimate how uneasy a particular person would feel if the information were made public. With this knowledge, the software then analysed a further 8,000 images.
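The article does not describe the exact computational model, but one plausible way to combine the network’s attribute predictions with the study’s findings is to weight each predicted characteristic by how uneasy participants said its disclosure would make them. The names and the weighting scheme below are assumptions for illustration.

    # Hypothetical sketch of turning attribute predictions into a
    # privacy risk score; the 'discomfort' weights stand in for the
    # user-study ratings, one value per attribute.
    import torch

    def privacy_risk(logits, discomfort):
        """logits:     tensor of shape (68,), raw CNN outputs for one image
        discomfort: tensor of shape (68,), mean discomfort rating per
                    attribute from the user study (e.g. on a 1-5 scale)
        """
        probs = torch.sigmoid(logits)   # per-attribute presence probability
        scores = probs * discomfort     # expected discomfort per attribute
        return scores.sum(), scores     # overall risk plus breakdown

Such a per-attribute breakdown is also what would let an app explain which characteristics triggered a warning, rather than issuing a blanket refusal.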
The result: based on the computational model, the neural networks evaluate the risk to privacy better than a human test group, identifying more risky photos and analysing them more precisely. The researchers are now working on a function that automatically blacks out the compromising areas of an image.
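In the simplest case, such a blackout function could work as in the sketch below, which assumes a detector has already located the sensitive regions; the detector and the box format are hypothetical.

    # Sketch of a possible redaction step: fill rectangular regions
    # flagged as privacy-sensitive with solid black. The detector that
    # produces the boxes and the (left, top, right, bottom) format are
    # hypothetical.
    from PIL import Image, ImageDraw

    def redact(image_path, boxes, out_path):
        img = Image.open(image_path).convert("RGB")
        draw = ImageDraw.Draw(img)
        for box in boxes:  # box = (left, top, right, bottom)
            draw.rectangle(box, fill=(0, 0, 0))
        img.save(out_path)

    # Example: black out two regions a detector might have flagged.
    # redact("photo.jpg", [(40, 300, 220, 360), (500, 80, 620, 210)], "safe.jpg")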
The research has been financed in part by Collaborative Research Center 1223, “Methods and Instruments for Understanding and Monitoring Data Protection”, which is funded by the German Research Foundation.