How We Model the Climate

Deep Sky Research: Background Methodology

Deep Sky Research analyzes the latest climate data and applies novel modeling techniques to represent our changing climate, including the extremes we are now experiencing, as accurately as possible.


Our models use data sources that are publicly available through organizations such as the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Oceanic and Atmospheric Administration (NOAA), and the National Aeronautics and Space Administration (NASA). These organizations collect and provide access to historical climate datasets as well as seasonal projections informed by climate models from the Intergovernmental Panel on Climate Change (IPCC).

One of the primary data sources is ERA5 reanalysis data. Reanalysis combines model data and historical observations to create a globally complete and consistent dataset based on the laws of physics. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. ERA5 provides a historical record for a large number of climate variables, from temperature to precipitation to wave patterns. We use this data to identify historical trends and to check the validity and accuracy of forward-looking projections. An example of an ERA5 reanalysis dataset is available here.

A second crucial data source is seasonal forecasts. Seasonal forecasts are projections of the climate system over a few weeks or months. They differ from weather forecasts in both precision and time horizon: a seasonal forecast may provide a projection of mean temperature for the next 3 months, whereas a weather forecast will tell you tomorrow’s daily high temperature. The longer time horizon means seasonal forecasts also carry greater uncertainty. To quantify that uncertainty, long-range forecasts use ensembles, which pool the predictions of several climate models. These climate models include representations of the atmosphere, ocean, and land surface and are grounded in the laws of physics. An example of a seasonal forecast dataset is available here.


The UNSEEN approach (UNprecedented Simulated Extremes using ENsembles) (Thompson, V. et al. High risk of unprecedented UK rainfall in the current climate. 2017) offers a solution to a fundamental problem we face when modeling a changing climate: modeling rare events that have no precedent. Statistical modeling relies on large numbers of observations to calculate probabilities and assess risk. Assessing the risk of an extreme heat wave, for example, for which there are very few precedents, is therefore nearly impossible. Without enough observations, our statistical tools break down. Even if the probability of such an event has grown substantially, our models will fail to measure the risk accurately because the data is too sparse.

UNSEEN modeling offers an innovative solution to this problem. It uses ensembles of seasonal forecasts to create thousands of unrealized but probable climate outcomes from which we can make statistical inferences. It expands the set of observations by supplementing the ERA5 reanalysis data with seasonal forecast data. It includes built-in guard rails that hold the forecasts to a high bar on several statistical measures (validity, reliability, accuracy, independence, and others) to confirm that the forecasts are in fact probable under the relevant climate conditions. In other words, UNSEEN models examine not only what did happen but also what could have happened, in order to predict what is likely to happen in the future. This is a powerful tool as our climate continues to produce catastrophic outlier events that have no precedent.
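The pooling idea can be illustrated with a minimal Python sketch. It is not Deep Sky Research's implementation and not the published UNSEEN framework: the data is synthetic, the ensemble dimensions (35 years, 25 members, 4 lead times) and the function name pooled_exceedance_probability are hypothetical, and the fidelity and independence checks described above are omitted.

```python
import random

def pooled_exceedance_probability(ensemble, threshold):
    """Pool every simulated season and return (exceedance fraction, pool size)."""
    pooled = [value for year in ensemble for value in year]
    exceedances = sum(1 for value in pooled if value >= threshold)
    return exceedances / len(pooled), len(pooled)

# Synthetic stand-in data (assumption): seasonal-mean temperatures in deg C.
rng = random.Random(42)
n_years, n_members, n_leads = 35, 25, 4  # hypothetical ensemble dimensions

# One observed value per year, as a reanalysis record would provide.
observed = [rng.gauss(30.0, 2.0) for _ in range(n_years)]

# Each year also has n_members * n_leads simulated alternatives.
ensemble = [
    [rng.gauss(30.0, 2.0) for _ in range(n_members * n_leads)]
    for _ in range(n_years)
]

# An "unprecedented" threshold: 1 deg C above the observed record.
threshold = max(observed) + 1.0

# The observed record alone contains no exceedances, so it cannot
# estimate this probability directly.
observed_exceedances = sum(1 for value in observed if value >= threshold)

prob, n_pooled = pooled_exceedance_probability(ensemble, threshold)
print(f"observed exceedances: {observed_exceedances} of {n_years} years")
print(f"pooled sample size:   {n_pooled}")
print(f"pooled exceedance probability: {prob:.4f}")
```

The point of the sketch is the sample size: 35 observed years become 3,500 simulated seasons, which is what makes a tail probability estimable at all.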

A large and growing number of peer-reviewed papers implement this approach and there is an open source UNSEEN framework published by Timo Kelder.

Extreme Value Statistics

Deep Sky Research is focused primarily on evaluating the risk of extreme events caused by climate change. Beyond UNSEEN, this requires specifically modeling the extreme ends of the probability distribution, rather than just predicting the mean. Deep Sky Research makes use of well-established extreme value analysis. Extreme value theory establishes the distribution families to which the maximum of a sample of independent and identically distributed random variables can converge after proper renormalization.
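For block maxima such as annual maximum temperature, the limiting family is the generalized extreme value (GEV) distribution. The following Python sketch shows the GEV cumulative distribution function and the return level it implies. The parameter values (location 30, scale 2, shape -0.1) are hypothetical and chosen only for illustration; they are not fitted to any dataset. The sign convention used here is the one common in the climate literature, where a positive shape means a heavy upper tail; some libraries, SciPy's genextreme among them, use the opposite sign.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function; xi == 0 is the Gumbel case."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(return_period, mu, sigma, xi):
    """Value exceeded with probability 1/return_period in any one block."""
    y = -math.log(1.0 - 1.0 / return_period)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)

# Hypothetical parameters for annual maximum temperature in deg C.
mu, sigma, xi = 30.0, 2.0, -0.1

for period in (10, 100, 1000):
    level = gev_return_level(period, mu, sigma, xi)
    print(f"{period:>5}-year return level: {level:.2f}")
```

By construction, evaluating gev_cdf at the T-year return level gives 1 - 1/T, which is a useful sanity check on any fitted model.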

The extRemes R package is among the most capable statistical software packages for this kind of analysis and is frequently used by Deep Sky Research for extreme value modeling. The exact model parameters depend on the data. Following statistical best practice, we compare candidate models on measures of model performance and choose the simplest model that fits the data well.
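One common way to formalize "the simplest model that fits the data well" is an information criterion. The sketch below uses the Akaike information criterion (AIC) in Python purely as an illustration. The text above does not say which measures Deep Sky Research uses, so AIC is an assumption, and the log-likelihood values are invented for the example rather than taken from any fit.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits (assumption): name -> (maximized log-likelihood, parameter count).
candidates = {
    "gumbel": (-120.4, 2),  # location, scale
    "gev": (-119.9, 3),     # location, scale, shape
}

scores = {name: aic(loglik, k) for name, (loglik, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print(f"selected model: {best}")
```

In this made-up case the extra shape parameter improves the log-likelihood only slightly, so the penalty for the additional parameter tips the choice toward the simpler Gumbel model.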

Return Periods

Deep Sky Research frequently expresses the probability of extreme climate events in terms of return periods. A return period is the estimated average time between events of a given magnitude. In climate science it is common to hear about, for example, the “100-year storm.” This is a storm with a return period of 100 years: it happens, on average, once every 100 years. In other words, the probability of the storm occurring in any given year is 1/100.

If what used to be a 100-year storm now has a return period of 10 years, the probability of the storm occurring in any given year is 1/10, a tenfold increase. There are many common misconceptions about return periods. One is that if 99 years have gone by without an occurrence of the 100-year storm, we should expect it next year. In fact, a return period of 100 years only means that the storm’s probability is 1/100 in each year. That probability does not go up with each year that the storm does not occur. The probability of the storm occurring next year changes only if the return period itself has changed.
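The arithmetic behind return periods can be written out in a few lines of Python. This is a minimal sketch that assumes each year is independent and that the return period stays constant over the window considered; the function names are illustrative.

```python
def annual_probability(return_period_years):
    """Probability of the event occurring in any single year."""
    return 1.0 / return_period_years

def prob_at_least_once(return_period_years, n_years):
    """Probability of at least one occurrence within n_years."""
    p = annual_probability(return_period_years)
    return 1.0 - (1.0 - p) ** n_years

# A 100-year storm is far from certain within any given 100-year window.
print(f"100-year storm, 100 years: {prob_at_least_once(100, 100):.3f}")

# Over a 30-year horizon, compare the old and new return periods.
print(f"100-year storm, 30 years:  {prob_at_least_once(100, 30):.3f}")
print(f"10-year storm, 30 years:   {prob_at_least_once(10, 30):.3f}")

# After 99 storm-free years, next year's probability is unchanged:
# under independence it is still the annual probability.
print(f"next-year probability:     {annual_probability(100):.2f}")
```

Under these assumptions, the chance of seeing a 100-year storm at least once in 100 years is about 63 percent, not 100 percent. Over 30 years, shortening the return period from 100 years to 10 raises the chance of at least one occurrence from about 26 percent to about 96 percent.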