Medium | Data Science Insights: Spatio-temporal trends in carbon dioxide emissions (1970 to 2019)

In this white paper, we introduce new frameworks to study spatio-temporal patterns among 50 countries in carbon dioxide emissions, demographic trends and economic patterns over the past 50 years. Our analysis is broken up into 3 sections.

First, we do a decade-by-decade study of carbon dioxide emissions trajectories. There, we demonstrate notable changes in cluster structures in each time period. This highlights shifts between country emissions profiles over time.

Next, we introduce a new method to classify countries into one of three characteristic emissions classes: one segment, two segments, or three segments. Here, we highlight that most countries are best represented by a piecewise linear model with one change point. This suggests that most countries experienced two periods of characteristic emissions during our analysis window.

Finally, we use GDP, population and carbon dioxide data, and apply dimensionality reduction and clustering to group countries based on similarity in their real and carbon economies. This technique offers a new way of viewing similarity between countries, drawing on 50 years of economic and emissions data.

Decade-by-decade emissions trajectories:

In this section, we investigate country emissions trends on a decade-by-decade basis, exploring how the structure of country behaviours evolves. Both the number of clusters and their constituents vary as we move forward in time.
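As a minimal sketch of how a decade-level clustering of this kind could be set up, the Python snippet below slices each trajectory to one decade, normalises it, and builds a hierarchical clustering tree. The normalisation by each country's total, the L1 (cityblock) distance, the average linkage, the helper name `cluster_decade` and the synthetic trajectories are all illustrative assumptions; they are not taken from the analysis itself.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_decade(emissions, years, start, end, n_clusters):
    """Hierarchically cluster countries on the shape of their emissions in [start, end]."""
    mask = (years >= start) & (years <= end)
    window = emissions[:, mask]
    # Assumption: divide by each country's decade total so shape matters, not scale.
    shapes = window / window.sum(axis=1, keepdims=True)
    # Assumption: L1 distance between normalised trajectories, average linkage.
    dists = pdist(shapes, metric="cityblock")
    tree = linkage(dists, method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return tree, labels


# Synthetic example: two growing and two declining trajectories (not real data).
years = np.arange(1970, 1990)
emissions = np.vstack([
    np.linspace(10, 20, 20),
    np.linspace(100, 210, 20),
    np.linspace(20, 10, 20),
    np.linspace(300, 140, 20),
])
tree, labels = cluster_decade(emissions, years, 1970, 1979, n_clusters=2)
```

The returned `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a figure of the same kind as the dendrograms discussed in this section.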

First, we take the period 1970–1979, where we observe a predominant cluster, consisting of 3 sub-clusters, and a small collection of outlier countries (primarily displayed as separate sub-clusters within the predominant cluster). The dendrogram corresponding to this period is shown in Figure 1a. All sub-clusters within the predominant cluster produce increasing trends of carbon dioxide emissions over the decade. The first sub-cluster consists of countries such as Russia and Ukraine. These countries are characterised by very strong growth and accelerating emissions trends. The second sub-cluster, consisting of Italy, Canada and Argentina, exhibits moderate growth in carbon dioxide emissions. The final sub-cluster, which includes countries such as Spain and Brazil, also displays relatively steady growth. Countries such as Vietnam and Kuwait exhibit declining carbon dioxide emissions, which is anomalous with respect to the rest of the collection.

Next, we turn to 1980–1989, shown in Figure 1b. This period produces a dendrogram consisting of 3 distinct clusters. The first cluster contains countries such as India and China, which display continued growth throughout the period. The second cluster consists of a variety of countries but displays pronounced similarities between former Soviet countries such as Kazakhstan, Russia and Ukraine. These countries all experienced very strong growth in emissions, peaking around 1990. The final cluster consists of France, Belgium and Nigeria. These countries display erratic emissions behaviours, with all trajectories displaying limited trends and substantial volatility.

In the 1990–1999 period, shown in Figure 1c, countries form one primary cluster, with a small collection of outlier countries in a separate, significantly smaller cluster. The primary cluster consists of countries that displayed consistent growth over the prior decade (to varying extents). The small outlier cluster, consisting of countries such as Ukraine, Russia and Kazakhstan, experienced precipitous drops in emissions at the beginning of the period, and this behaviour continued throughout the remainder of the period. The latent phenomenon captured by this cluster formation is the fall of the Iron Curtain.

The 2000–2009 period produces one primary trajectory cluster (consisting of two similarly sized sub-clusters) and a collection of outlier countries. The hierarchical clustering results are displayed in Figure 1d. The clear bifurcation in the large cluster is indicative of contrasting trends in emissions behaviours between various countries. The first sub-cluster consists of countries such as the Netherlands, Italy, Germany, USA, Canada, Japan and Belgium. Most countries within this sub-cluster are more developed and have taken a stronger stance in introducing policies to reduce emissions. Such countries exhibit either flat or declining emissions trajectories throughout the decade. The second sub-cluster, consisting of countries such as India, Iran and Turkey, produces sustained growth in emissions during this period. The significantly smaller second cluster consists of Oman, China, Vietnam and Qatar, whose emissions profiles more closely resemble those of the second sub-cluster.

Finally, we turn to the most recent decade of analysis, 2010–2019, which is shown in Figure 1e. This period produces 3 characteristic classes of emissions trajectories. The first cluster consists of countries with lower HDI (human development index) levels: Vietnam, Bangladesh, the Philippines, Iraq and Pakistan. HDI is a summary metric representing average achievement in various areas of human life. These countries produce emissions trajectories that increase significantly throughout the period.

The second cluster consists of Venezuela, Ukraine, the UK, Italy and UAE. These countries mostly produce declining trajectories, which may represent a greater collective focus on reducing carbon dioxide emissions at the national level. The final cluster consists of two primary sub-clusters. The first contains countries such as Japan, the Netherlands and Belgium. These countries are primarily characterised by erratic emissions output, with a declining trend overall. The second sub-cluster consists of countries such as China, Thailand and Chile. Although these countries produce a positive trend over the entire decade, the rate of increase has slowed, and in some cases, overall emissions have begun to trend downward.

Characteristic emissions classes:

In this section, we introduce a new method to classify countries based on their emissions trends over time. Having noticed that many countries exhibit a piecewise linear trend in their CO2 emissions, we introduce a new framework to determine the most appropriate model for each country.
We assume that each country’s emissions behaviour over time is best represented by one of the following three models:

  • Model 0 (M_0): No change points (one linear component).
  • Model 1 (M_1): 1 change point (two piecewise linear components).
  • Model 2 (M_2): 2 change points (three piecewise linear components).

In our algorithmic procedure, we optimise with respect to the number and placement of (up to two) change points, such that the model’s average R² is maximised. The average R² is the arithmetic mean of the R² values of all segments. Because we prefer a simpler model when a more complex one explains a similar (or only marginally greater) share of the variance, we introduce a slight penalty for model complexity.
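The selection procedure described above can be sketched as an exhaustive search over change-point placements. The following Python example is a minimal illustration under stated assumptions: the penalty form (a fixed amount subtracted per change point), its value of 0.02, the minimum segment length of 5 observations, the function names and the synthetic series are all hypothetical choices and are not specified in our analysis.

```python
import itertools

import numpy as np


def r_squared(x, y):
    """R² of an ordinary least-squares line fitted to (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    centred = y - y.mean()
    ss_tot = float(centred @ centred)
    if ss_tot == 0.0:
        return 1.0  # a constant segment is fitted exactly
    return 1.0 - float(residual @ residual) / ss_tot


def best_model(x, y, max_changes=2, min_len=5, penalty=0.02):
    """Return (n_change_points, change_indices, score) with the best penalised average R²."""
    n = len(x)
    best = (0, (), -np.inf)
    for k in range(max_changes + 1):
        for cuts in itertools.combinations(range(min_len, n - min_len + 1), k):
            bounds = (0, *cuts, n)
            segments = list(zip(bounds, bounds[1:]))
            if any(b - a < min_len for a, b in segments):
                continue  # skip placements that leave a segment too short
            avg_r2 = np.mean([r_squared(x[a:b], y[a:b]) for a, b in segments])
            # Assumption: complexity penalty is linear in the number of change points.
            score = avg_r2 - penalty * k
            if score > best[2]:
                best = (k, cuts, score)
    return best


# Synthetic series (not real data): slow growth, then a jump and faster growth.
t = np.arange(50.0)
y = np.where(t < 25, 10.0 + 0.1 * t, 13.0 + 1.0 * (t - 25))
n_changes, change_idx, score = best_model(t, y)
```

On this synthetic series the search selects a single change point, which corresponds to Model 1 (M_1) in the list above. Each entry of `change_idx` is the index at which a new segment begins.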

The most frequently occurring model is M_1, followed by M_2 and then M_0. That is, in the vast majority of cases, carbon dioxide emissions are best modelled with the existence of either one or two change points.

In Figure 2, we display two representative countries for each model. One can see that in some cases, such as Algeria (Figure 2a) and Germany (Figure 2b), there is one persistent trend over the past 50 years. For Algeria, there is a consistently positive trend in emissions, while for Germany, emissions are best modelled by a linear decline over the past 50 years.

In the case of Egypt (Figure 2c) and Morocco (Figure 2d), both countries produce a more strongly positive linear slope beyond the late 1980s and early 1990s respectively. Thus, these phenomena are best modelled with one change point. In the case of the Netherlands and India, seen in Figures 2e and 2f respectively, both emissions patterns require two change points to model the dynamics.

For the Netherlands, there are obvious discontinuities and abrupt changes in total emissions in the early 1980s and early 2000s. For India, the early 1980s and mid-2000s display obvious increases in the slope of carbon dioxide emissions.

Identifying similarity in real and carbon economies:

In this section, we introduce a new method to identify countries that share similar economic, demographic and emissions histories. First, we gather GDP, population and CO2 emissions data over the past 50 years, and generate distance matrices between all countries for each respective measurement. We then generate an aggregate distance matrix, where each metric is scaled by a constant so that the metrics contribute on a comparable scale. We then apply multi-dimensional scaling to the aggregate distance matrix, projecting the countries into a lower-dimensional space, and apply K-means clustering to the resulting projection. The number of clusters, K, is determined by optimising the silhouette score. For the sake of interpretability, we accompany these K-means clustering results with a hierarchical clustering dendrogram.
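The pipeline described above can be sketched in a few lines of Python. This is an illustrative sketch only: the L1 distance between time series, the choice of scaling constant (each matrix divided by its own total), the use of classical (Torgerson) multi-dimensional scaling rather than any particular MDS variant, the candidate range for K, the function names and the synthetic data are all assumptions that are not stated in our analysis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def pairwise_l1(series):
    """L1 distance between every pair of rows (countries) of an (n, T) array."""
    return np.abs(series[:, None, :] - series[None, :, :]).sum(axis=2)


def aggregate_distance(*series_list):
    """Sum the per-metric distance matrices, each scaled by its own total (assumption)."""
    total = 0.0
    for series in series_list:
        d = pairwise_l1(series)
        total = total + d / d.sum()
    return total


def classical_mds(dist, n_components=2):
    """Embed a distance matrix using classical (Torgerson) MDS."""
    n = dist.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (dist ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues that arise when distances are not exactly Euclidean.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


def cluster_countries(dist, k_values=range(2, 6), seed=0):
    """Return (best_k, labels, coords), choosing K by the highest silhouette score."""
    coords = classical_mds(dist)
    best = None
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(coords)
        score = silhouette_score(coords, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2], coords


# Synthetic data (not real): nine hypothetical countries in three groups of three.
rng = np.random.default_rng(0)
t = np.arange(50)
levels = np.repeat([1.0, 5.0, 10.0], 3)


def toy_metric():
    scale = levels + rng.normal(0.0, 0.05, size=levels.size)
    return scale[:, None] * (1.0 + 0.02 * t)


gdp, population, co2 = toy_metric(), toy_metric(), toy_metric()
dist = aggregate_distance(gdp, population, co2)
best_k, labels, coords = cluster_countries(dist)
```

Scaling each distance matrix before summing prevents the metric with the largest raw magnitude (for instance GDP in dollars) from dominating the aggregate distance.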

When we apply this analysis to all 50 countries, the USA exhibits highly anomalous behaviour. In fact, we must sequentially remove the USA, China, India, Russia and Japan to identify the general structures without these outlier countries. In the following analysis, we remove these 5 countries and study the structural patterns among the remaining countries.

First, we explore the similarity between countries with respect to each metric individually. Figure 3 displays the three distance matrices for GDP, population and CO2 emissions. In Figure 3a, one can see that Germany, France and the United Kingdom are the most anomalous countries with respect to GDP, based on their significant economic output over the past 50 years. In Figure 3b, we see the distance between country populations, where Indonesia, Brazil and Bangladesh exhibit the greatest dissimilarity to the rest of the collection. In Figure 3c, which shows CO2 emissions, Germany exhibits the greatest dissimilarity to the remainder of the collection of countries.

For the sake of exposition, we present the hierarchical clustering results of our distance matrix capturing similarity in countries’ real and carbon economies. Figure 4 highlights the existence of one prominent cluster, a smaller second cluster and an outlier.

The primary cluster consists of three sub-clusters. The first contains countries such as Chile, Iraq, Vietnam, Romania and Bangladesh. Most countries in this sub-cluster are characterised by developing economies, growth in population, and significant growth in carbon dioxide emissions over time.

The second sub-cluster consists of countries such as the Netherlands, Argentina, Nigeria and Taiwan. These countries mostly displayed consistent growth in GDP and population, and lower levels of carbon dioxide emissions over time. The final sub-cluster consists of Spain, South Korea, Indonesia, Mexico, Canada and Turkey. Most of these countries displayed moderate growth in GDP and population, and growth in carbon dioxide emissions, exhibiting reasonable variability over time.

In particular, many of these countries experience a flattening in emissions over the final 5–10 years of the analysis window. The second cluster includes the UK, France, Italy and Brazil. These countries are characterised by increasing emissions trajectories until the later parts of the analysis window, where a flattening or decline in emissions occurs. Germany is identified as an outlier, which is due to its steady decline in emissions over time and the strong GDP growth over the entire analysis window.

This analysis may prompt further interesting studies. First, one could explore the key industries and sectors of the economy that drive emissions. These may vary significantly between countries, or perhaps a select few industries drive the majority of emissions.

Next, one could explicitly study the correlation between population growth and emissions, and identify countries that have done the best job in controlling emissions levels relative to their population growth. Figure 5 displays the emissions profiles of 5 major countries: China, the USA, India, Russia and Japan. Basic visual inspection confirms the pronounced similarity between China and India’s emissions profiles, while the USA and Japan share broadly similar emissions histories. Russia’s sharp decline in emissions in the early 1990s corresponds to the fall of the Iron Curtain. One could speculate that these emissions are highly related to GDP, and studying this relationship explicitly could provide salient insights.


We apply recently introduced and new methods in spatio-temporal data analysis to identify structural similarity and evolutionary patterns in emissions behaviours over time. First, we demonstrate pronounced heterogeneity in cluster number, size and constituency among emissions trajectories on a decade-by-decade basis. This section highlights the dynamic nature of this problem and the need for constant monitoring of country behaviours. In the second section of this white paper, we demonstrate that most countries are well modelled by a piecewise linear data generating process. The consistency of this finding is surprising and would be interesting to monitor moving forward. Should countries more actively police their emissions, one could expect to see further propagation of change points in the near term.

Finally, we introduce a framework to group countries based on their real and carbon economies. Our methodology captures GDP, population and emissions data over the past 50 years, and includes dimensionality reduction and clustering. This framework could be applied to other problems, or one could generalise some of the questions explored in this paper using different economic and demographic metrics.
