
Discipline

Political science

ESG data value chain

Reading time: 7 min.

In order to offer sustainable financial products, financial institutions need information on the environmental, social and governance (ESG) aspects of the companies and products in which they invest. Early forms of sustainable investing in the 1970s and 1980s relied on detailed company profiles prepared by analysts. Today, however, specialised financial service providers offer extensive data sets with numerous sustainability metrics and ratings for thousands of companies. And while analysts in the past often found it difficult to obtain data, companies today publish digital sustainability reports that sometimes run to several hundred pages.

Accordingly, the generation and processing of large amounts of data and the automation of data processing and analyses are playing an increasingly important role in relation to ESG indicators and assessments. In order to create “ESG end products” such as ratings, which bring together the various aspects of a company’s sustainability, data from various unstructured sources of information must be brought together and processed. This step-by-step “production” and processing of information can be described as an ESG data value chain. There are three types of data in this value chain, which differ according to the degree of processing. The first type is raw data, which comes from sources such as public statistics, newspaper articles or reports from civil society organisations. Thanks to technical advances in the automatic processing of large amounts of data, social media data and remote sensing data produced by satellites are also increasingly being used as raw data.

However, the most important source of raw data is the companies themselves. Laws such as the EU’s Corporate Sustainability Reporting Directive (CSRD) oblige companies to provide information on certain sustainability issues and key figures. They do so in annual sustainability reports as well as in brochures and factsheets, which can be viewed on the companies’ websites. One common feature of the raw data is that it is usually unstructured. For example, company reports often run to hundreds of pages and differ in how they present information and in how much detail they give. Public statistics and data from civil society organisations, on the other hand, often cover only partial aspects of corporate sustainability and are limited to individual countries.

In the next step of the value chain, ESG data providers process the unstructured raw data into structured data sets. This involves combining data from various sources (e.g. reports from different companies) into a single data set. While analysts used to carry out this task manually, AI-supported systems are increasingly being used to enter the data. Beyond data entry, ESG data providers carry out various processes for data cleansing and quality assurance. For example, the sustainability data of subsidiaries and individual production sites are assigned to their parent companies, and missing or implausible data is replaced by modelled values. The data providers also sort and classify the companies (e.g. by sector or size) in order to increase comparability. In this way, ESG data providers create structured data sets that assign hundreds to thousands of sustainability indicators to thousands or even tens of thousands of companies.
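The cleansing steps described above can be illustrated with a minimal sketch. It is not any provider's actual method: the company names, the field names and the choice of a sector median as the "modelled value" are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical raw records, one per reporting entity (subsidiary or site).
# A co2_t of None stands for a value missing from the reports.
raw_records = [
    {"entity": "Alpha Textiles", "parent": "Alpha Group", "sector": "textiles", "co2_t": 120.0},
    {"entity": "Alpha Dyeing", "parent": "Alpha Group", "sector": "textiles", "co2_t": 80.0},
    {"entity": "Beta Wear", "parent": "Beta Wear", "sector": "textiles", "co2_t": 300.0},
    {"entity": "Gamma Cloth", "parent": "Gamma Cloth", "sector": "textiles", "co2_t": None},
    {"entity": "Delta Power", "parent": "Delta Power", "sector": "energy", "co2_t": 5000.0},
]


def build_structured_dataset(records):
    """Roll entities up to parent companies and fill gaps with sector medians."""
    companies = {}

    # Step 1: assign subsidiaries and sites to their parent and sum reported values.
    for rec in records:
        row = companies.setdefault(
            rec["parent"], {"sector": rec["sector"], "co2_t": None, "modelled": False}
        )
        if rec["co2_t"] is not None:
            row["co2_t"] = (row["co2_t"] or 0.0) + rec["co2_t"]

    # Step 2: compute the median of reported values per sector.
    reported_by_sector = {}
    for row in companies.values():
        if row["co2_t"] is not None:
            reported_by_sector.setdefault(row["sector"], []).append(row["co2_t"])
    sector_median = {s: median(v) for s, v in reported_by_sector.items()}

    # Step 3: replace missing values with the sector median and flag them as modelled.
    for row in companies.values():
        if row["co2_t"] is None and row["sector"] in sector_median:
            row["co2_t"] = sector_median[row["sector"]]
            row["modelled"] = True

    return companies


dataset = build_structured_dataset(raw_records)
for name, row in dataset.items():
    print(name, row)
```

In this sketch the modelled value for the company without reported data is flagged explicitly. Whether and how such flags are passed on to users in practice is not something this example can show.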

Within the ESG data value chain, the structured data sets take on the role of an intermediate good. Data providers sometimes sell these products to downstream ESG analysis companies or to end users such as banks, insurance companies and asset managers (e.g. BlackRock or DWS) that have the statistical and modelling expertise for further data processing. However, many users buy further processed end products that can be used directly to assess and compare the sustainability of investments. The best-known end products include ESG ratings, which summarise all sustainability aspects of a company on a common scale (e.g. from 0 to 100). To produce an ESG rating, ESG data providers select certain variables from the structured data set and aggregate them. Which variables are selected and how they are weighted depends on the type of company and sector. For example, labour conditions in the supply chain generally receive more weight for companies in the textile industry, while greenhouse gases play a greater role for energy companies. The selection of variables is usually based on an assessment of which sustainability issues could pose risks for the company (e.g. due to regulation). In addition to ESG ratings, data providers are also developing other products, such as metrics that measure how many emissions an equity portfolio is responsible for or how high financial losses due to sustainability risks could be. Despite their different orientations, the various end products are largely based on the same intermediate product, the structured data sets. Moreover, the methods used to create new indicators and analysis products are proprietary to the ESG data providers, which makes it considerably more difficult to trace the path of data within the value chain.
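The aggregation step can be sketched as a sector-specific weighted average. The indicator names, weights and scores below are hypothetical, and actual rating methods are proprietary and likely more involved. The sketch only shows how the same indicator scores can yield different ratings depending on the sector weighting.

```python
# Hypothetical sector-specific weights (assumed for illustration only).
SECTOR_WEIGHTS = {
    "textiles": {"supply_chain_labour": 0.5, "ghg_emissions": 0.2, "governance": 0.3},
    "energy": {"supply_chain_labour": 0.1, "ghg_emissions": 0.6, "governance": 0.3},
}


def esg_rating(indicator_scores, sector):
    """Combine indicator scores (each on a 0 to 100 scale) into one rating.

    The result is a weighted average, so it stays on the 0 to 100 scale.
    """
    weights = SECTOR_WEIGHTS[sector]
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * indicator_scores[name] for name in weights)
    return round(weighted_sum / total_weight, 1)


# The same hypothetical indicator scores, rated under two sector weightings.
scores = {"supply_chain_labour": 40.0, "ghg_emissions": 80.0, "governance": 60.0}
textile_rating = esg_rating(scores, "textiles")
energy_rating = esg_rating(scores, "energy")
print(textile_rating, energy_rating)
```

With these assumed numbers, the weak labour score pulls the textile rating down, while the strong emissions score lifts the energy rating. Neither rating on its own reveals which underlying indicator drove the result.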

Comparability with analogue phenomena

The ESG data value chain has similarities with the physical value chains of consumer goods such as coffee. The coffee bean is first grown and harvested by farmers. The dried beans are then transported via various intermediaries and transport companies to roasting plants, which produce the end product, i.e. the coffee that can be bought in shops. Unlike coffee, of course, ESG data is not physically harvested or shipped. Nevertheless, both value chains have similar characteristics. In both cases, the raw materials or raw data are cheap or freely available, while the end product is sold at a higher price. For example, the farmers only receive one eighth of what the roasting plant earns from a cup of coffee. Similarly, ESG data providers sell information that is often freely available in unstructured form, frequently for several tens of thousands of euros per year. Depending on the scope of the products subscribed to, prices range from several thousand euros to millions per provider and year. At the same time, one prominent provider (Sustainalytics-Morningstar) makes its aggregated ESG risk assessments freely available.

Another common feature is that the processed end product often no longer allows any conclusions to be drawn about the attributes of the original raw materials, or at least makes this difficult. Roasted and ground coffee usually consists of beans from different growing regions and intermediaries. Tracing them back to the exact locations of the plants from which they came is almost impossible. The situation is similar with ESG data, even if there is no physical transformation of the product. For example, there are so many data transformations between the chimney from which carbon dioxide is emitted into the earth’s atmosphere and the “financed emissions” of an equity fund that tracing the figure back to its source resembles the attempt to trace a coffee bean.
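Why a portfolio-level figure is hard to trace back can be illustrated with a simplified sketch of "financed emissions". It assumes that a fund is attributed a share of each company's emissions in proportion to the share of the company it holds. The attribution rule, field names and numbers are assumptions for illustration, not a description of any provider's method.

```python
def financed_emissions(holdings):
    """Sum the emissions attributed to a portfolio.

    Assumption: each holding is attributed the company's emissions
    multiplied by the ownership share (holding value / company value).
    """
    total = 0.0
    for holding in holdings:
        ownership_share = holding["holding_value"] / holding["company_value"]
        total += ownership_share * holding["company_emissions_t"]
    return total


# Hypothetical equity fund with two holdings (values in million euros,
# emissions in tonnes of CO2).
portfolio = [
    {"company": "Delta Power", "holding_value": 10.0,
     "company_value": 1000.0, "company_emissions_t": 5000.0},
    {"company": "Alpha Group", "holding_value": 5.0,
     "company_value": 100.0, "company_emissions_t": 200.0},
]

fund_emissions = financed_emissions(portfolio)
print(round(fund_emissions, 2))
```

The result is a single number for the whole fund. Even in this two-company example it no longer shows which company, let alone which production site, contributed which share, and in practice each company figure may itself be aggregated or modelled.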

Social relevance

The ESG data value chain makes it possible to extract signals on the sustainability of companies and financial instruments from an unmanageable mass of unstructured information. Through the generation, selection and transformation of data, actors in the value chain such as reporting companies or ESG data providers influence how sustainability is measured and perceived. These definitions in turn influence the cash flows of sustainable funds, which often invest in companies with high ESG ratings. However, as such ratings are based on a variety of methodological decisions and integrate and weigh up different aspects of sustainability against each other, they can easily be misunderstood by users. One example of this is the recent controversy surrounding sustainable funds: if, despite their focus on sustainability, they include companies that make money from coal-fired power generation, this is interpreted as a sign of greenwashing. To increase transparency and trust in ratings and other ESG end products, regulators from the EU, India, Singapore and the UK are currently working on proposals to regulate ESG data providers. Beyond the focus on transparency, however, other social issues also arise in light of the increasing digitalisation and relevance of the value chain. For example, as most unstructured raw data is publicly available while structured data sets have to be purchased, the question arises as to whether different types of ESG data should be seen as public or private goods. Tapping into and analysing alternative data sources such as social media or satellite data also opens up opportunities to link and validate existing data from companies or ESG data providers. Finally, the increasing use of AI systems raises questions about the explainability of data products.