Yet ESG ratings vary substantially depending on which provider issues them, to the point where a company can be rated highly by one rater and receive a very low score from another. The pairwise correlations between ESG ratings from the six raters my colleagues and I looked at ranged from 0.38 to 0.71, on a scale from minus 1 (meaning total disagreement) to 1 (meaning full agreement). In other words, the six raters were never in full agreement on a company’s ESG rating, and in most cases there was little agreement among them.
That means investors need to dive deep into the details of the different methodologies of ESG raters when the same company has dramatically different ratings.
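To make the correlation figures concrete, here is a minimal sketch of how agreement between raters can be computed as a pairwise correlation. The rater names and scores are invented for illustration and are not data from our study; the Pearson formula is the standard one and is only an assumption about how such a comparison might be set up.

```python
from itertools import combinations
from math import sqrt

# Hypothetical scores (0-100) from three made-up raters for the same five companies.
ratings = {
    "rater_a": [80, 65, 40, 55, 90],
    "rater_b": [70, 72, 35, 60, 75],
    "rater_c": [50, 85, 45, 30, 65],
}


def pearson(xs, ys):
    """Pearson correlation: -1 is total disagreement, 1 is full agreement."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One correlation per pair of raters.
pairwise = {
    (a, b): pearson(ratings[a], ratings[b])
    for a, b in combinations(ratings, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: {r:.2f}")
```

With six raters there are 15 such pairs, so a range like 0.38 to 0.71 describes the weakest and the strongest agreement among them.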
To get at the reasons for the divergence in ESG ratings, Julian Kölbel, Roberto Rigobon, and I analyzed the differences between ESG ratings from six raters. We identified three sources of the divergence: differences in which indicators are included in the ratings, in the weights given to each of those indicators, and in how they are measured. Together, those three factors define each rating company’s methodology.
Here is what we found—and potential ways to address the divergence of ESG ratings to make them a better tool for investors.
Different data, different weights
To see how much the underlying data can differ from rater to rater, consider this: In our sample of six different raters, the number of indicators that feed into the final ESG rating ranges from 38 for one rating company to 282 for another. This indicates substantial differences in what ESG raters think is important.
Weight divergence happens when rating companies have different views of the relative importance of various issues. For instance, occupational health and safety are commonly measured by looking at injury rates in factories. Some raters might give more weight to how companies perform on this score than, for example, the companies’ lobbying practices. But other raters think that lobbying practices are much more important, as companies might try to reduce accidents in their own factories but at the same time lobby against regulation aimed at making all factories safer—which could add up to more injuries nationwide.
Investors need to know if the weightings align with their own personal concerns. For instance, for many people—and for many raters—diversity and climate change have taken on more importance in recent years. But there remain major differences in the weights raters give them.
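A small sketch can show how weights alone change the outcome. In the example below, two hypothetical raters accept exactly the same indicator scores for two made-up companies but weight them differently; every name and number is an assumption chosen for illustration, not a real rater's methodology.

```python
# Hypothetical indicator scores (0-100) for two made-up companies.
scores = {
    "company_x": {"workplace_safety": 90, "lobbying": 30},
    "company_y": {"workplace_safety": 55, "lobbying": 80},
}

# Two hypothetical raters that agree on the scores but not on what matters most.
weights = {
    "safety_focused": {"workplace_safety": 0.8, "lobbying": 0.2},
    "lobbying_focused": {"workplace_safety": 0.2, "lobbying": 0.8},
}


def weighted_rating(indicator_scores, indicator_weights):
    """Weighted sum of indicator scores; the weights are assumed to sum to 1."""
    return sum(indicator_scores[name] * w for name, w in indicator_weights.items())


def top_company(rater):
    """Return the company that the given rater scores highest."""
    return max(scores, key=lambda c: weighted_rating(scores[c], weights[rater]))


for rater in weights:
    for company in scores:
        rating = weighted_rating(scores[company], weights[rater])
        print(f"{rater}: {company} = {rating:.1f}")
    print(f"{rater} ranks {top_company(rater)} first")
```

The two raters end up ranking the companies in opposite order even though they never disagree about a single underlying score, which is why investors need to check whether a rater's weights match their own priorities.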
Different yardsticks
One of the trickiest sources of ratings divergence arises when raters look at the same data (or lack of data) and come up with different measurements. For example, only a few companies disclose carbon-dioxide data for their supply chain. For companies that don’t, ESG raters often estimate those supply-chain emissions themselves. Each rater has its own way of doing that, so each arrives at a different conclusion.
We also found that a rater’s overall view of a company appears to influence the measurement of specific indicators. That is, a company receiving a high score for one indicator is more likely to receive high scores for all other indicators from that same rater. Think of a high-school student who has five different classes with the same teacher. In four classes, she excels; but in the last class, she doesn’t show any interest. The teacher might give her the benefit of the doubt and, maybe subconsciously, raise her grade in that one class because of her performance in the other four classes.
This is what we call the rater effect. It shows that ESG measurement, like any assessment, can be subject to all sorts of human biases. Furthermore, some raters use artificial-intelligence technology in their assessments. These models can be biased, too, and contribute to the measurement divergence.
Seeking out solutions
We looked at each of the causes of ratings divergence and found that measurement differences were responsible for 56% of the overall divergence in ESG ratings. The use of different indicators was responsible for 38%, and differences in weightings for 6%.
One way to improve measurement would be regulation requiring all companies to disclose certain ESG-related data, as the information reported by companies is the main source of data for ratings. Currently, some companies follow disclosure standards developed by the Global Reporting Initiative or the Sustainability Accounting Standards Board. But such disclosure is optional. If regulators enforced mandatory reporting in compliance with a uniform standard, every company’s performance on those ESG measures could be measured more easily.
If regulators want to go even further, they could impose mandatory auditing of ESG data, so that companies’ disclosures would be approved by auditors in a way similar to what is already happening with financial disclosure. This would give ESG raters reliable data to use in their assessments of each indicator.
However, while there might be less divergence in ESG ratings if companies reported a uniform set of data, there will always be some divergence resulting from different rating methodologies. Enhanced competition among rating companies with different methodologies could encourage innovation that would continually improve ratings so that they provide a fuller picture of companies’ ESG efforts, beyond whatever data those companies might be required to disclose.
For instance, if accidents on the job were one mandatory measure of labor practices, a company might realize that it could manipulate the data by pressuring doctors to play down the severity of such incidents. In a competitive market, ESG rating companies could find other ways to measure labor practices, in addition to the required data.
Regulation could encourage such innovation. Regulators could, for example, force ESG raters to be more transparent about their methods. That would increase competition because investors would be able to make more informed decisions about which ratings to use. In addition, academics, nongovernmental organizations, the media and the companies being rated would be able to criticize the practices of ESG raters in a constructive way.
For now, ESG ratings divergence doesn’t mean that measuring ESG performance is a futile exercise. It’s clearly better than nothing. But it highlights the reality that measuring ESG performance is, to say the least, challenging—for the raters and for the investors who rely on them. Until those ratings contain fewer discrepancies, investors who care about ESG performance are going to have to dig deeper to make sure their money is going where their values are.