
Access to platform data for researchers

For the first time, Article 40 of the EU Digital Services Act grants access to data from online platforms for research purposes. This is an important development for researchers who investigate the systemic risks of social media and depend on platform data to do so. The open question now is how this data access will be designed in practice.


The Digital Services Act (DSA) covers all digital services allowing EU consumers to access services, content and goods. In addition to internet access services, these include search engines, online trading platforms and online platforms such as social networks. The central goals of the DSA are to prevent the exchange of illegal online content and to protect citizens’ fundamental rights on the internet more comprehensively. Manipulative algorithms and the spread of disinformation are to be effectively curbed.

Data generated by online platforms are not only an essential basis for research projects that deal with online usage behaviour and opinion trends. They also serve to research the so-called systemic risks of online platforms: These concern, in particular, the influence of social media on society and democracy and thus problem areas such as the spread of hate speech and fake news, the abuse of opinion power or election campaign manipulation in social networks.

Since access to platform data has been extremely difficult for researchers in the past, the relevant research has so far concentrated primarily on Twitter. Most recently, the Twitter data interface, which had previously been free of charge for academic purposes, was closed. However, paid data access is generally not affordable in research projects.

The DSA, which came into force at the end of 2022, promises to improve the conditions for research on online platforms: Article 40 of the DSA grants researchers access to data from very large online platforms and very large online search engines (VLOPs and VLOSEs), meaning services with more than 45 million monthly active users in the EU. VLOPs and VLOSEs are already subject to strict requirements regarding transparency in online advertising, the selection of recommendation systems and the removal of illegal content.

Legal requirements for the introduction of the DSA

However, Article 40 currently represents no more than a declaration of intent that needs to be fleshed out in a delegated act. A delegated act is an instrument through which the Commission, authorised by the European Parliament and the Council, supplements non-essential parts of a legal act, in this case the DSA.

The concrete design of data access is now in the hands of the European Commission's Directorate-General for Communications Networks, Content and Technology (DG Connect), which is currently in charge of preparing the Delegated Regulation. This involves defining the purposes for which the platform data may be used and the conditions, procedures and technical requirements under which the data will be passed on.

Researchers with an interest in platform data, including those working on relevant research projects, have actively contributed to shaping the delegated act in recent weeks through an EU consultation. On the one hand, the researchers are concerned with defining exactly which types of data may be made accessible, for example, user connections, “viewed” content per user or the existing connections between users. On the other hand, they are concerned with practicable procedures for submitting access requests to the so-called coordinator for digital services. The DSA provides for one such central handling point per member state, which will also be responsible for research data access and will act as an intermediary between the requesting research institutions and the platforms. In Germany, the role of coordinator for digital services will most likely be assigned to the Federal Network Agency. Proposals include a tiered approach based on the sensitivity and access method of the requested data, as well as the possibility of opening data access to requests from other researchers with similar research goals after a specific lead time.

The privacy concerns of users must always be weighed against scientific interests. To address these concerns, researchers advocate high data protection standards: Data passed on for research purposes must be anonymised and may not contain any personal information about the users. Private chat transcripts will therefore not be part of the accessible data. Furthermore, it must be ensured that the data is used only for research purposes and is not misused for commercial purposes. This can be achieved, for example, within the framework of institutional data handling plans.
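To make the data protection requirement more tangible, the following is a minimal sketch of how a data set could be filtered before it is passed on: private messages are excluded entirely and personal fields are removed from the remaining records. All field names and record types are hypothetical and do not reflect any platform's actual export format or any procedure laid down in the DSA. Removing fields alone does not guarantee anonymity, since free text can still contain personal information.

```python
# Hypothetical field names; no real platform schema is implied.
PERSONAL_FIELDS = {"user_id", "username", "email", "ip_address"}


def prepare_for_research(records):
    """Drop private messages and strip personal fields from all other records."""
    cleaned = []
    for record in records:
        # Private chat transcripts are never part of the accessible data.
        if record.get("type") == "private_message":
            continue
        # Keep only the fields that are not listed as personal.
        cleaned.append(
            {key: value for key, value in record.items() if key not in PERSONAL_FIELDS}
        )
    return cleaned


sample = [
    {"type": "public_post", "user_id": 17, "username": "alice", "text": "Hello", "likes": 3},
    {"type": "private_message", "user_id": 17, "text": "secret"},
]

result = prepare_for_research(sample)
print(result)
```

In this sketch, the private message is discarded and the public post keeps only its type, text and like count.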

The research community interested in platform data is therefore calling for a practicable and science-friendly design of data access that maintains high data protection standards. The current target date for the adoption of the Delegated Regulation by the EU Commission is the first quarter of 2024.

The blog posts published by bidt reflect the views of the authors; they do not reflect the position of the Institute as a whole.