What Machine Learning Models for Climate Impacts Can Teach Us About How to Deal Responsibly With AI
Markus Pössel
It comes as no surprise that machine learning and artificial intelligence were a recurring topic during this year’s Heidelberg Laureate Forum – from the laureate talks to the workshop program to the brief glimpse of SAP’s use of AI at the Friday session on the SAP campus. An example that I found particularly interesting came up in the “Hot Topic” discussion on climate change, as part of the work of one of the panelists, Jakob Zscheischler from the Helmholtz Centre for Environmental Research. Zscheischler studies “compound weather and climate events” – in short, negative impacts such as forest mortality, crop failure or particularly large wildfires that arise from a combination of factors, including climate change. So how do those various contributions, those multiple drivers, influence the eventual impact?
The central issue of "so, can we apply AI to that problem?" is the same as for any use of AI, or of machine learning. The technique itself introduces something that, at the start, is a black box. The computer is trained on a certain data set, forms the connections of its inner neural network(s), and once training is complete, the resulting system is applied to new data. One can test how good the system is, for instance at extrapolating from the given data, or at deducing something specific from it, by setting aside some known data that was not in the training set and using it for testing.
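As a minimal sketch of that workflow, assuming a generic regression task (the data, feature count and model choice here are invented for illustration and have nothing to do with Zscheischler's actual models):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Hypothetical data: rows are observations, columns are environmental
# drivers (say, temperature, soil moisture, wind); y is an impact measure.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=500)

# Set aside 20% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out set only: performance here, not on the
# training data, tells us how well the black box generalises.
print("held-out R^2:", r2_score(y_test, model.predict(X_test)))
```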
Looking into Black Boxes
But is that more than a practical heuristic tool? Can it be part of the scientific process, where it is crucial that we understand what is going on, and where "this part of our argument is, um, a black box" is unacceptable? Such a step is not fundamentally different from the famous Sidney Harris cartoon in which a mathematical proof has "And then, a miracle occurs" as its step 2. We don't want the research equivalent of an AI-generated mushroom guide book.
This is where Interpretable Machine Learning (IML) comes into play. There, the model that has learned to, say, link environmental conditions with negative impacts is treated as the opposite of a black box. The key is to work out how the model operates, and to understand the connections it has made, during its learning phase, between the different effects and the outcome in terms of impact. At the simplest level, IML is the generalisation of a research situation that is much older than machine learning: finding a linear correlation between two research-relevant quantities and trying to elucidate the connection in terms of the physical mechanisms that lead to it.
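A toy version of that pre-machine-learning baseline might look as follows (the two quantities and their linear relationship are made up purely for illustration):

```python
import numpy as np

# Invented example: two quantities with a roughly linear relationship,
# e.g. a summer temperature anomaly versus some measure of crop yield loss.
rng = np.random.default_rng(1)
temperature_anomaly = rng.normal(size=200)
yield_loss = 0.8 * temperature_anomaly + rng.normal(scale=0.3, size=200)

# The correlation coefficient and the least-squares slope: these are the
# numbers a mechanistic, physical explanation would then have to account for.
r = np.corrcoef(temperature_anomaly, yield_loss)[0, 1]
slope, intercept = np.polyfit(temperature_anomaly, yield_loss, 1)
print(f"Pearson r = {r:.2f}, fitted slope = {slope:.2f}")
```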
The advantage of machine-learning models is that once a model has completed its learning phase, it is there to be prodded, analysed and, more generally, experimented upon. A number of analysis methods in IML focus on just that kind of virtual experimentation: varying the input parameters a tiny bit and observing how this changes the output. The resulting map of change-relations (gradients, in technical terms) forms the basis of interpretations. Other methods try to look "under the hood" of a model such as a neural network: What happens there between input and output? What activation patterns can be discerned, and what could they stand for in terms of the underlying physical description the model is meant to encode?
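One possible sketch of that kind of virtual experimentation: nudge each input of a trained model by a small amount and record how the prediction shifts, which amounts to a finite-difference estimate of the model's local gradients. The small network and the toy target function below are assumptions made for illustration, not any particular published method:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Train a small, smooth neural network on toy data, then probe it.
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] * X[:, 2]

model = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                     max_iter=2000, random_state=0).fit(X, y)

def sensitivities(model, x, eps=1e-4):
    """Finite-difference estimate of d(prediction)/d(input_i) at point x."""
    base = model.predict(x.reshape(1, -1))[0]
    return np.array([
        (model.predict((x + eps * np.eye(x.size)[i]).reshape(1, -1))[0] - base) / eps
        for i in range(x.size)
    ])

x0 = np.array([0.5, -1.0, 0.3])
print("local gradients:", sensitivities(model, x0))
# For this toy target, the true gradient at x0 would be
# (2.0, -1.5 * 0.3, -1.5 * (-1.0)) = (2.0, -0.45, 1.5).
```

Interpretation then starts from such maps of sensitivities: which drivers the model's output responds to, where, and how strongly.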
…as Should We All
Taking a step back to look at the big picture: the approach that, in this case, helps us understand the dependence of negative impacts on environmental factors (including climate change) is a "constructively skeptical" stance towards the black boxes that machine-learning models typically present. We would do well to take a similar stance in other applications of machine learning, and of AI more generally. The models in question, whether they have learned to link environmental causes with outcomes or to extrapolate texts in the manner of large language models, are first and foremost tools. But out of the box, they are not tools that provide their reasoning, their arguments, or any insight into how they reach their results.
Whenever reliability, insight and checking up on the results are important, it is up to us to provide that extra work, whether in the context of IML or in simpler settings: checking an automatic translation (such as the one that will produce the German version of this blog post), or an automatically generated text, or, god forbid, a guide to edible mushrooms. Whether we do so, or whether we rely blindly (and possibly techno-superstitiously) on the output of such models, will go a long way towards determining whether these new tools end up doing, on average, more harm or more good.