Website operators take security more seriously than data protection
When Internet users visit a website via a web browser, data is usually exchanged. For example, website operators often record the IP address from which their website is accessed. Personal data is also often provided by the customers themselves, for example when they purchase products online and have them delivered to their homes. The General Data Protection Regulation (GDPR), which has been applicable since 2018, is the first Europe-wide standardized set of rules for processing personal data. It aims to protect users from excessive data storage. Storing an IP address, for example, is already considered storage of personal data. The GDPR applies to all websites hosted or accessible in the EU. Website operators are responsible for implementing the regulation, while the national data protection authorities are responsible for supervision.
In 2019, CISPA researcher Christine Utz and her colleague Martin Degeling from Ruhr University Bochum investigated how websites had changed after the introduction of the GDPR. "Our main finding was that while there had been little change in the actual practice of tracking, there had been an increase in transparency efforts by websites, for example, via the provision of privacy statements as well as the introduction of cookie banners," Utz recounts. This was one of the starting points for their current study. CISPA Faculty Dr. Ben Stock, in whose group Utz is now a researcher, had conducted an earlier study on how e-mail campaigns could be used to inform website operators about security vulnerabilities. "This led to the idea of investigating whether website operators could also be made aware of a lack of data protection with the help of such a campaign," Utz continues.
Study design and approach
After extensive preliminary research, the study itself was carried out on a set of approximately 160,000 websites. The criterion for including a website in the sample was the existence of a data protection problem such as the absence of a data protection statement, the absence or delayed display of a cookie banner, or the existence of input fields for personal data without HTTPS protection. As a comparison criterion, unsecured access to a so-called Git repository was included in the study. A Git repository stores a project's files together with their version history; if it is left publicly accessible on a web server, the website's source code can be exposed. In early November 2021, the website operators were automatically contacted by e-mail and informed about the problems. Over a period of two months, Utz then observed whether the problems on the sites were fixed, both among those who had been contacted and among a control group. In order to gain deeper insights into why actions were taken or not, the researchers also attached a questionnaire to their e-mails and examined the e-mail communications with the website operators.
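To make one of these criteria concrete, the following Python sketch flags forms that appear to collect personal data and would be submitted without HTTPS. It is an illustration only, not the tooling used in the study: the hint lists, the class FormScanner, and the function insecure_personal_forms are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed heuristics (not from the study): input types and name fragments
# that suggest a field carries personal data.
PERSONAL_TYPES = {"email", "password", "tel"}
PERSONAL_NAME_HINTS = ("name", "mail", "phone", "address")


class FormScanner(HTMLParser):
    """Collects each form's action and whether it has a personal-data input."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "", "personal": False}
        elif tag == "input" and self._current is not None:
            input_type = (attrs.get("type") or "text").lower()
            input_name = (attrs.get("name") or "").lower()
            if input_type in PERSONAL_TYPES or any(
                hint in input_name for hint in PERSONAL_NAME_HINTS
            ):
                self._current["personal"] = True

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def insecure_personal_forms(page_url, html):
    """Return submit targets of personal-data forms that do not use HTTPS."""
    scanner = FormScanner()
    scanner.feed(html)
    scanner.close()
    flagged = []
    for form in scanner.forms:
        # An empty action submits to the page itself.
        target = urljoin(page_url, form["action"])
        if form["personal"] and urlparse(target).scheme != "https":
            flagged.append(target)
    return flagged
```

A page served over plain HTTP with an e-mail field would be flagged, while the same form on an HTTPS page would not. A real crawler would also have to handle redirects, forms generated by JavaScript, and fields whose purpose cannot be guessed from their names.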
Challenges during implementation
A study with such a large sample entails a number of challenges, some of which result from the automation of work steps. One risk concerns false positives. Automated tools used to search HTML source code, for example, may fail to detect existing privacy statements due to inconsistent naming. Another hurdle is the selection of e-mail addresses. Previous studies have shown that the use of generic addresses such as info@- or webmaster@- has disadvantages. For this reason, wherever possible, e-mails were sent to specific e-mail addresses found on the website. "The biggest difficulty, however, was to prevent our e-mails from being classified as spam by the recipients," explains Utz. To this end, Utz and her colleagues took a number of precautions. They used an external server for hosting and signed the e-mails. The external server was also intended to prevent all e-mails coming from CISPA from being classified as spam, which could have caused damage to the center.
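The false-positive risk can be illustrated with a minimal Python sketch of a keyword-based search for links to a privacy statement. The keyword list and the names LinkCollector and find_privacy_links are assumptions for this example and do not describe the detection tool used in the study.

```python
from html.parser import HTMLParser

# Assumed keyword list; real sites use many more variants and languages.
KEYWORDS = ("privacy", "data protection", "datenschutz")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for all anchor elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_privacy_links(html, keywords=KEYWORDS):
    """Return hrefs whose URL or link text contains one of the keywords."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    hits = []
    for href, text in collector.links:
        haystack = f"{href} {text}".lower()
        if any(keyword in haystack for keyword in keywords):
            hits.append(href)
    return hits
```

A link labeled "Privacy" is found, but a site that publishes its privacy statement under a link such as "Legal notice" yields no match and would wrongly be reported as lacking one. This is the kind of naming inconsistency that leads to false positives.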
Pitfalls of large-scale notification campaigns
The most important finding of the study was that, in principle, it is possible to use large-scale e-mail campaigns to alert website operators to data privacy problems. Nevertheless, given the immense resources required to conduct such a study, success in terms of problem resolution is quite limited. This is particularly evident from the fact that only a very small proportion of those informed responded to the e-mails. The share of websites on which changes were made during the observation period was in the low single-digit percentage range. A comparative analysis also showed that security vulnerabilities are more likely to be remedied than data protection problems. One reason for this, Utz believes, is that security vulnerabilities can often be addressed with less effort.
The qualitative examination of the questionnaires and e-mail communications revealed further reasons for the limited campaign success within the observation period. Utz found that website operators were less open to notifications about privacy issues than to notifications about security issues. Further obstacles to the implementation of changes were also identified. These included language barriers due to a lack of English language skills on the part of the e-mail recipients as well as the classification of the notification e-mails as spam. Interestingly, the GDPR reference itself also proved to be an obstacle. Some operators doubted whether their own website even fell within the scope of the GDPR, and others simply rejected the reference to a lack of data protection as inapplicable. The participants would have liked more, and more detailed, information on data protection issues.
The goal is to cooperate with data protection authorities
Utz wants her research to increase the enforceability of the GDPR. "Data protection authorities often don't have the capacity to detect inadequate GDPR implementations on websites and to point these out to website operators," she says. "But we as researchers could support the authorities in doing this." Importantly, researchers and the institutions behind them have the necessary technical and human resources for such projects. Conversely, researchers could benefit from the standing of public authorities. "Data protection authorities can communicate more effectively why the GDPR is important," Utz says. Cooperation would therefore be a win-win situation for all involved. However, according to Utz, it is essential that broad-based notification campaigns via e-mail are accompanied by other measures in the future, such as information campaigns about the scope of the GDPR. She also suggests introducing a new standard for how website operators can be reached, so that e-mail notification campaigns lead to greater success. This could be done, for example, by means of a privacy.txt file stored on all websites, which contains information on how the operators can be contacted in the event of data protection-related questions.
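How such a file might be read by an automated notification tool can be sketched in a few lines of Python. The article does not specify a format for privacy.txt, so the "Field: value" layout below, the field name Contact, and the function parse_privacy_txt are purely hypothetical, loosely modeled on the existing security.txt convention.

```python
def parse_privacy_txt(text):
    """Parse a hypothetical privacy.txt made of 'Field: value' lines.

    Blank lines and lines starting with '#' are ignored. Field names are
    lowercased, and repeated fields are collected in a list.
    """
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        name, separator, value = line.partition(":")
        if not separator:
            continue  # not a 'Field: value' line
        fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields


# Example content in the assumed format (addresses are placeholders).
example = """\
# Contact for data protection questions
Contact: mailto:privacy@example.org
Contact: https://example.org/privacy-contact
"""

contacts = parse_privacy_txt(example).get("contact", [])
```

With a fixed location and a machine-readable contact field, a campaign would no longer have to guess between generic addresses and addresses scraped from the page.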