Checking a database

Occasionally, the data reported to an IAM database is not internally consistent. Here we show how to use pyam’s tools to check a database for such inconsistencies.

In [1]:
import time
from pprint import pprint

import pyam
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

We start with the tutorial data. It contains only a fraction of the AR5 data, so it is not internally consistent, which makes it a perfect dataset to start with.

In [2]:
df = pyam.IamDataFrame(data='tutorial_AR5_data.csv', encoding='utf-8')
INFO:root:Reading `tutorial_AR5_data.csv`
In [3]:
df.head()
Out[3]:
    model            scenario        region  variable       unit       year  value
0   AIM-Enduse 12.1  EMF27-450-Conv  ASIA    Emissions|CO2  Mt CO2/yr  2005  10540.74
3   AIM-Enduse 12.1  EMF27-450-Conv  LAM     Emissions|CO2  Mt CO2/yr  2005   3285.00
6   AIM-Enduse 12.1  EMF27-450-Conv  MAF     Emissions|CO2  Mt CO2/yr  2005   4302.21
9   AIM-Enduse 12.1  EMF27-450-Conv  OECD90  Emissions|CO2  Mt CO2/yr  2005  12085.85
12  AIM-Enduse 12.1  EMF27-450-Conv  REF     Emissions|CO2  Mt CO2/yr  2005   3306.95

Summary

With the pyam.IamDataFrame.check_internal_consistency method, we can check the internal consistency of a database. If this method returns None, the database is internally consistent (i.e. the total variables are the sum of their sectoral breakdowns and the regional totals are the sum of their regional breakdowns).

In the rest of this tutorial, we give you a chance to better understand this method. We go through what it is actually doing and show you the kind of output you can expect.

Checking variables are the sum of their components

We are going to use the check_aggregate method of IamDataFrame to check that the components of a variable sum to its total. This method takes np.isclose arguments as keyword arguments; we show our recommended settings here.

In [4]:
np_isclose_args = {
    "equal_nan": True,
    "rtol": 1e-03,
    "atol": 1e-05,
}
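
As a rough guide to what these settings mean, the following is a pure-Python sketch of the element-wise test documented for np.isclose, in which the allowed difference is atol plus rtol times the magnitude of the second (reference) value. The function name is_close is our own, and the sketch ignores the special handling of infinities. It is only meant to illustrate the tolerances, not to reproduce how pyam calls NumPy internally.

```python
import math


def is_close(a, b, rtol=1e-03, atol=1e-05, equal_nan=True):
    """Pure-Python sketch of the element-wise test applied by np.isclose."""
    if math.isnan(a) or math.isnan(b):
        # Two NaNs only count as equal when equal_nan is set.
        return equal_nan and math.isnan(a) and math.isnan(b)
    # The tolerance scales with the second argument (the reference value).
    return abs(a - b) <= atol + rtol * abs(b)


# Difference of 7.95 against a tolerance of roughly 34.5.
print(is_close(34492.05, 34500.00))
# Difference of 1.0 against a tolerance of roughly 0.1.
print(is_close(100.0, 101.0))
# Missing values on both sides are treated as matching.
print(is_close(float("nan"), float("nan")))
```

With rtol set to 1e-03, totals in the tens of thousands may differ from the sum of their components by a few tens of units before they are flagged, while atol only matters for values close to zero.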

Using check_aggregate on the IamDataFrame allows us to quickly check if a single variable is equal to the sum of its sectoral components (e.g. is Emissions|CO2 equal to Emissions|CO2|Transport plus Emissions|CO2|Solvents plus Emissions|CO2|Energy etc.). A returned DataFrame will show us where the aggregate is not equal to the sum of components.

In [5]:
df.check_aggregate(
    "Emissions|CO2",
    **np_isclose_args
)
INFO:root:Emissions|CO2 - 1368 of 1522 data points are not aggregates of components
Out[5]:
2005 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100
variable model scenario region
Emissions|CO2 AIM-Enduse 12.1 EMF27-450-Conv ASIA 10540.74 13160.18 11899.38 9545.81 7355.07 6119.50 NaN NaN NaN NaN NaN
LAM 3285.00 3294.54 3367.62 2856.65 2207.36 1537.72 NaN NaN NaN NaN NaN
MAF 4302.21 4487.54 4238.91 3956.19 3490.81 2082.24 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.33 11646.37 8272.30 4457.91 1625.18 NaN NaN NaN NaN NaN
REF 3306.95 3604.42 3325.20 2991.24 1889.38 960.75 NaN NaN NaN NaN NaN
World 34492.05 38321.78 35588.66 28531.68 20287.46 13367.27 NaN NaN NaN NaN NaN
EMF27-450-NoCCS ASIA 10540.74 13160.11 11893.80 9478.33 7367.07 5513.79 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3362.61 2837.11 1889.89 899.63 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4239.03 3619.25 2787.47 1671.29 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 11659.29 8708.81 5488.86 3355.22 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3322.95 3076.67 1977.78 1181.73 NaN NaN NaN NaN NaN
World 34492.05 38313.59 35588.85 28629.65 20458.10 13660.19 NaN NaN NaN NaN NaN
EMF27-550-LimBio ASIA 10540.74 13160.11 14124.17 14218.08 13187.66 10019.56 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3445.63 3496.62 2986.08 1790.49 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4368.48 4519.64 4294.83 2733.76 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 12607.17 11752.01 9749.33 6501.31 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3826.80 3615.47 3258.31 3076.27 NaN NaN NaN NaN NaN
World 34492.05 38313.59 39531.61 38815.54 34676.38 25295.31 NaN NaN NaN NaN NaN
EMF27-Base-FullTech ASIA 10540.74 13160.11 14149.89 16559.14 19658.68 23071.34 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3449.84 3660.68 3850.44 3866.20 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4371.98 4751.63 5389.48 6082.37 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 12642.70 13332.29 13742.93 14150.35 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3838.82 4220.97 4866.31 5615.39 NaN NaN NaN NaN NaN
World 34492.05 38313.59 39612.60 43835.49 49027.80 54552.86 NaN NaN NaN NaN NaN
EMF27-G8-EERE ASIA 10540.74 13152.56 13415.94 10147.89 7637.61 4435.80 NaN NaN NaN NaN NaN
LAM 3285.00 3286.52 3106.39 2825.27 1784.31 899.06 NaN NaN NaN NaN NaN
MAF 4302.21 4487.02 4091.19 3977.50 3659.80 3336.85 NaN NaN NaN NaN NaN
OECD90 12085.85 12750.81 10276.06 8833.95 5845.24 3473.56 NaN NaN NaN NaN NaN
REF 3306.95 3596.74 3453.29 3468.73 3376.25 3058.68 NaN NaN NaN NaN NaN
World 34492.05 38304.41 35425.96 30395.43 23536.71 16487.83 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ...
REMIND 1.5 EMF27-450-NoCCS OECD90 15111.39 15254.16 8082.87 2864.75 369.53 328.11 299.06 266.24 255.25 245.07 226.35
World 33837.41 38224.94 25524.60 7358.64 1691.05 1663.77 1616.52 1555.47 1553.00 1665.20 1883.11
EMF27-550-LimBio ASIA 10193.98 13239.55 14218.37 11920.79 8135.32 5963.84 4486.53 3100.11 2246.06 1843.16 1570.02
LAM 2926.60 3478.79 4413.41 1831.96 1357.42 934.84 712.03 523.57 418.39 359.64 107.38
MAF 4035.32 4381.03 4504.49 3368.89 3582.70 3883.52 3663.91 3349.07 3064.56 2919.43 3153.50
OECD90 15111.39 15241.56 13016.52 10555.13 7238.06 4454.98 2745.80 1531.01 766.12 275.60 -82.62
World 33837.41 37970.11 37657.41 28699.50 20936.83 15389.20 11536.73 8368.71 6360.33 5299.36 4644.20
EMF27-Base-FullTech ASIA 10193.98 13478.78 20256.01 24006.74 NaN NaN NaN NaN 34529.79 28622.83 23400.42
LAM 2926.60 3508.40 5067.35 5464.43 4402.98 5424.51 5869.57 5988.95 6096.94 5152.73 4074.45
MAF 4035.32 4381.09 5364.84 5862.75 NaN NaN NaN NaN NaN NaN NaN
OECD90 15111.39 15234.63 NaN NaN 16610.24 16943.56 16515.90 15922.43 14587.22 11864.62 9683.61
World 33837.41 38293.08 48134.42 53343.82 59836.10 70077.89 77941.21 82914.15 84109.23 75995.09 68004.38
WITCH_EMF27 EMF27-450-Conv ASIA 9895.45 13210.18 13914.12 12004.49 10538.51 8767.49 7410.94 6299.16 3794.59 2865.46 2437.77
LAM 4660.57 4644.17 1851.46 1537.68 1421.58 658.62 -161.02 -1398.20 -1659.55 -1631.41 -1586.79
MAF 2508.31 2673.95 2224.44 1932.65 1907.50 1703.00 1588.40 1493.36 1413.90 1303.02 1118.21
OECD90 12644.40 12597.55 9780.38 NaN 4755.20 3257.79 2240.72 1399.83 795.67 359.17 224.28
REF 3870.58 4035.17 2381.75 2055.81 1733.11 1347.51 1064.51 815.36 611.13 444.37 383.41
World 33579.32 37161.03 30152.15 24091.32 20355.91 15734.40 12143.56 8609.50 4955.74 3340.62 2576.88
EMF27-550-LimBio ASIA 9895.98 13341.76 17280.06 18745.41 16414.52 12419.72 10012.01 9373.38 8937.92 9270.47 9214.63
LAM 4660.84 4612.38 2729.37 2515.92 2286.12 1309.50 1048.52 677.15 470.62 -66.27 -210.35
MAF 2508.81 2621.97 2773.63 2885.03 2775.24 2552.54 2579.22 2758.68 2928.88 3067.60 3139.20
OECD90 12645.78 12542.87 11852.60 NaN NaN 5310.74 4461.18 4405.41 4236.36 4016.45 3860.45
REF 3871.04 4062.72 3764.23 3380.51 2617.25 1957.66 1686.23 1658.31 1628.48 1550.83 1510.95
World 33582.45 37181.70 38399.88 37802.01 32002.02 23550.17 19787.16 18872.93 18202.25 17839.07 17514.87
EMF27-Base-FullTech ASIA 9893.46 13378.34 20016.55 26248.47 30889.38 34562.46 37566.05 40325.64 42647.52 44874.72 46657.52
LAM 4659.58 4623.98 4524.39 4644.99 4937.36 5250.67 5698.25 6117.40 6522.66 6945.51 7358.61
MAF 2506.45 2642.28 3291.19 4063.34 5028.41 6038.17 7017.40 8032.94 8851.50 9680.49 10373.40
OECD90 12639.28 12598.84 13097.95 13835.62 14969.12 15784.59 16540.18 17249.21 17924.86 18566.23 19180.64
REF 3868.87 4077.28 4636.23 5039.14 5412.35 5886.80 6279.44 6439.80 6722.19 7040.23 7284.21
World 33567.64 37320.72 45566.30 53831.56 61236.62 67522.70 73101.32 78164.98 82668.74 87107.17 90854.38

140 rows × 11 columns

As we are missing most of the sectoral data in this subset of AR5, the total variables are mostly not equal to their components. The data table above shows us which model-scenario-region combinations this is the case for. As a user, we would then have to examine which sectors we have for each of these model-scenario-region combinations in order to determine what is missing.
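
To make the idea concrete, here is a minimal sketch of the comparison for a single model, scenario, region and year, written in plain Python rather than with pyam. It assumes that only variables exactly one level below the total count as components, which is our reading of the default behaviour. The helper direct_components and all of the numbers are hypothetical.

```python
def direct_components(variable, reported):
    """Return the reported variables exactly one level below `variable`."""
    prefix = variable + "|"
    return [
        v for v in reported
        if v.startswith(prefix) and "|" not in v[len(prefix):]
    ]


# Hypothetical values for a single model, scenario, region and year.
reported = {
    "Emissions|CO2": 100.0,
    "Emissions|CO2|Energy": 70.0,
    "Emissions|CO2|Transport": 25.0,
    "Emissions|CO2|Transport|Cars": 20.0,  # two levels down, not summed
}

components = direct_components("Emissions|CO2", reported)
component_sum = sum(reported[v] for v in components)
gap = reported["Emissions|CO2"] - component_sum

print(components)
print(f"total={reported['Emissions|CO2']}, sum={component_sum}, gap={gap}")
```

A non-zero gap like this one is what we would expect when a sector has not been reported, so listing the components that are present is a natural first step in working out what is missing.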

Checking multiple variables

We can then wrap this all together to check all, or a subset, of the variables in an IamDataFrame.

In [6]:
for variable in df.filter(level=1).variables():
    diff = df.check_aggregate(
        variable,
        **np_isclose_args
    )
    # you could then make whatever summary you wanted
    # with diff
INFO:root:Emissions|CO2 - 1368 of 1522 data points are not aggregates of components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Price|Carbon - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check aggregate because it has no components

The output tells us where there are issues as well as where it is not possible to actually check sums because no components have been reported.

Checking that regions sum to aggregate regions

Similarly to checking that the sum of a variable’s components gives the declared total, we can check that summing regions gives the intended total.

To do this, we use the check_aggregate_regions method of IamDataFrame. By default, this method checks that all the regions in the dataframe sum to World.

Using check_aggregate_regions on the IamDataFrame allows us to quickly check if a regional total for a single variable is equal to the sum of its regional contributors. A returned DataFrame will show us where the aggregate is not equal to the sum of components.
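
The regional check can be sketched in the same spirit. The snippet below is plain Python with hypothetical numbers for one model, scenario, variable and year. The helper regional_gap is our own name, and using the World value as the reference for the relative tolerance is an assumption rather than a statement about pyam's internals.

```python
# Hypothetical Emissions|CO2 values for one model, scenario and year.
by_region = {"World": 100.0, "ASIA": 40.0, "LAM": 10.0, "OECD90": 30.0}


def regional_gap(values, total_region="World"):
    """Return the declared total minus the sum of all other regions."""
    parts = [v for region, v in values.items() if region != total_region]
    return values[total_region] - sum(parts)


gap = regional_gap(by_region)

# Same tolerances as np_isclose_args, with World as the reference value.
tolerance = 1e-05 + 1e-03 * abs(by_region["World"])
consistent = abs(gap) <= tolerance

print(f"gap={gap}, tolerance={tolerance}, consistent={consistent}")
```

Here the regions fall well short of the World total, so this data point would be reported as inconsistent.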

In [7]:
df.check_aggregate_regions(
    "Emissions|CO2",
    **np_isclose_args
)
INFO:root:Emissions|CO2 - 503 of 503 data points are not aggregates of regional components
Out[7]:
2005 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100
region model scenario variable
World AIM-Enduse 12.1 EMF27-450-Conv Emissions|CO2 34492.05 38321.78 35588.66 28531.68 20287.46 13367.27 NaN NaN NaN NaN NaN
EMF27-450-NoCCS Emissions|CO2 34492.05 38313.59 35588.85 28629.65 20458.10 13660.19 NaN NaN NaN NaN NaN
EMF27-550-LimBio Emissions|CO2 34492.05 38313.59 39531.61 38815.54 34676.38 25295.31 NaN NaN NaN NaN NaN
EMF27-Base-FullTech Emissions|CO2 34492.05 38313.59 39612.60 43835.49 49027.80 54552.86 NaN NaN NaN NaN NaN
EMF27-G8-EERE Emissions|CO2 34492.05 38304.41 35425.96 30395.43 23536.71 16487.83 NaN NaN NaN NaN NaN
GCAM 3.0 AMPERE3-450 Emissions|CO2 31473.40 31678.13 38660.77 45110.97 44768.14 34990.09 19397.62 1208.73 -17387.30 -37099.22 -57844.17
AMPERE3-450P-CE Emissions|CO2 31473.40 31678.14 38603.90 46071.54 43844.74 34636.41 19108.64 1129.57 -17399.41 -37076.52 -57817.45
AMPERE3-450P-EU Emissions|CO2 31473.40 31678.14 39487.87 47419.61 45118.79 35081.54 19182.91 1166.07 -17384.21 -37079.51 -57832.34
AMPERE3-550 Emissions|CO2 31473.40 31678.13 39660.52 47541.01 50744.18 46992.91 34172.78 17064.62 -2639.86 -21628.98 -42437.11
AMPERE3-Base-EUback Emissions|CO2 31473.40 31678.13 41826.41 52214.95 63459.29 75453.50 81730.83 86384.17 89308.29 92285.81 96090.28
AMPERE3-CF450P-EU Emissions|CO2 31473.40 31678.13 41826.41 52214.95 49660.44 36881.63 19617.99 995.01 -17475.49 -37146.77 -57853.52
AMPERE3-RefPol Emissions|CO2 31473.40 31678.14 39787.42 48131.28 52770.92 54537.50 54976.51 54792.56 53594.22 51287.98 49551.47
EMF27-450-Conv Emissions|CO2 31473.62 34700.97 12161.32 11121.98 6976.38 5488.54 3289.90 1023.41 -1751.60 -3619.36 -6129.59
EMF27-450-NoCCS Emissions|CO2 31473.62 34685.29 11238.06 13109.25 12112.07 10342.93 6317.67 2467.32 -1664.16 -4832.87 -8236.19
EMF27-550-LimBio Emissions|CO2 31473.62 34685.29 19169.88 23277.01 21186.04 15170.25 10413.05 8098.03 6605.72 6446.21 5829.69
EMF27-Base-FullTech Emissions|CO2 31473.62 34685.29 43318.48 51233.36 57308.36 63987.64 68381.59 72124.98 75247.84 77206.61 77589.17
IMAGE 2.4 AMPERE3-450 Emissions|CO2 34111.29 35344.27 36059.37 32234.66 19608.88 15150.28 6668.57 -677.70 -3156.89 -6273.64 -7112.29
AMPERE3-450P-CE Emissions|CO2 34111.14 35343.93 38844.85 40453.35 31255.38 21628.42 13883.06 5169.80 -2754.38 -6409.44 -8174.45
AMPERE3-450P-EU Emissions|CO2 34111.14 35343.93 40612.22 46400.38 37347.97 25392.08 17060.16 7214.90 -2789.63 -6821.32 -7928.68
AMPERE3-550 Emissions|CO2 34111.29 35318.50 37365.86 37226.82 32352.89 30493.37 22154.07 12673.92 7180.83 909.23 -2504.16
AMPERE3-RefPol Emissions|CO2 34124.32 35746.98 40855.54 46771.34 48448.44 51487.30 48906.51 43724.96 40676.94 36602.42 32884.37
EMF27-550-LimBio Emissions|CO2 34259.27 35594.34 37292.93 36467.66 28910.72 24195.19 19530.54 13307.58 10956.33 8196.68 5984.36
EMF27-Base-FullTech Emissions|CO2 34391.39 35246.31 40384.52 44757.28 48671.22 55920.66 61887.17 65748.83 72569.20 78338.48 82774.78
MERGE_EMF27 EMF27-450-Conv Emissions|CO2 28501.22 33303.16 30506.60 22718.89 13174.43 4174.12 -628.46 -2784.94 -4736.63 -5526.81 -6008.83
EMF27-550-LimBio Emissions|CO2 28501.22 33303.16 36019.89 34031.54 26705.70 17872.93 10598.82 7675.84 4542.86 6515.47 5797.86
EMF27-Base-FullTech Emissions|CO2 28501.22 33303.16 43021.90 54681.24 64214.82 73116.83 81405.99 90072.94 98476.95 108882.84 120493.31
EMF27-G8-EERE Emissions|CO2 28501.22 33303.16 31636.36 27381.84 19962.34 14278.68 10533.13 9599.27 9211.55 8061.24 8393.79
MESSAGE V.4 AMPERE3-450 Emissions|CO2 34474.59 36035.69 36941.46 35238.71 26747.96 15173.32 4329.48 -1304.69 -5447.10 -8728.73 -11209.52
AMPERE3-450P-EU Emissions|CO2 34474.49 36036.02 40821.65 46438.23 38929.54 27622.31 12469.82 2786.95 -2585.87 -6268.49 -8535.53
AMPERE3-550 Emissions|CO2 34474.51 36035.96 39222.41 42988.07 40487.80 34363.56 20847.26 10424.52 2479.89 -2730.61 -6474.82
AMPERE3-RefPol Emissions|CO2 34474.46 36035.98 40886.23 47008.04 51379.09 53497.39 50990.57 46103.64 41339.83 34733.89 27562.64
EMF27-550-LimBio Emissions|CO2 34491.02 36087.09 34675.76 30326.54 22000.71 11312.70 9846.46 8570.05 7230.63 5834.56 4500.82
EMF27-Base-FullTech Emissions|CO2 34491.02 36087.09 42809.72 48375.99 55957.77 64431.68 71728.32 75668.28 77297.82 77283.35 75904.56
REMIND 1.5 AMPERE3-450 Emissions|CO2 33841.49 37365.91 35255.18 31679.69 25439.84 16908.36 6524.72 -910.93 -5015.05 -8196.08 -10192.88
AMPERE3-450P-CE Emissions|CO2 33841.49 37357.58 39260.04 43283.25 34864.17 20741.40 7005.88 -622.25 -4786.58 -8275.44 -10291.58
AMPERE3-450P-EU Emissions|CO2 33841.49 37356.37 41565.70 47902.12 37928.16 21615.59 6958.30 -563.45 -4783.46 -8277.63 -10290.40
AMPERE3-550 Emissions|CO2 33841.49 37366.12 38324.51 39015.27 36963.66 31733.51 22831.66 13927.89 5749.59 -327.77 -3755.24
AMPERE3-550P-EU Emissions|CO2 33841.49 37360.58 41568.63 47908.67 46792.96 37445.25 24413.62 14572.43 6804.20 86.14 -3693.90
AMPERE3-Base-EUback Emissions|CO2 33841.49 37371.87 44020.67 50296.87 58575.08 71744.59 82786.79 87993.16 85663.03 75402.53 66716.49
AMPERE3-CF450P-EU Emissions|CO2 33841.49 37365.98 44028.80 50295.90 39726.70 22501.37 6855.42 -602.89 -4769.69 -8268.81 -10300.93
AMPERE3-RefPol Emissions|CO2 33841.49 37372.53 41615.44 48455.22 55203.05 61590.57 64595.04 64737.59 62246.20 56447.99 51261.41
EMF27-450-Conv Emissions|CO2 33837.41 37977.31 29314.49 13503.92 6281.74 3040.79 787.74 -526.27 -1744.61 -1641.29 -1413.97
EMF27-450-NoCCS Emissions|CO2 33837.41 38224.94 25524.60 7358.64 1691.05 1663.77 1616.52 1555.47 1553.00 1665.20 1883.11
EMF27-550-LimBio Emissions|CO2 33837.41 37970.11 37657.41 28699.50 20936.83 15389.20 11536.73 8368.71 6360.33 5299.36 4644.20
EMF27-Base-FullTech Emissions|CO2 33837.41 38293.08 48134.42 53343.82 59836.10 70077.89 77941.21 82914.15 84109.23 75995.09 68004.38
WITCH_EMF27 EMF27-450-Conv Emissions|CO2 33579.32 37161.03 30152.15 24091.32 20355.91 15734.40 12143.56 8609.50 4955.74 3340.62 2576.88
EMF27-550-LimBio Emissions|CO2 33582.45 37181.70 38399.88 37802.01 32002.02 23550.17 19787.16 18872.93 18202.25 17839.07 17514.87
EMF27-Base-FullTech Emissions|CO2 33567.64 37320.72 45566.30 53831.56 61236.62 67522.70 73101.32 78164.98 82668.74 87107.17 90854.38

Again, as the AR5 snapshot is incomplete, none of the World totals is equal to the sum of the regions provided.

Once again, we can repeat this analysis over all the variables of interest in an IamDataFrame.

In [8]:
for variable in df.variables():
    diff = df.check_aggregate_regions(
        variable,
        **np_isclose_args
    )
    # you could then make whatever summary you wanted
    # with diff
    if diff is not None:
        eg = diff

eg.head(20)
INFO:root:Emissions|CO2 - 503 of 503 data points are not aggregates of regional components
INFO:root:Emissions|CO2|Fossil Fuels and Industry - 239 of 239 data points are not aggregates of regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply|Electricity - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Price|Carbon - cannot check regional aggregate because it has no regional components
INFO:root:Primary Energy - 502 of 503 data points are not aggregates of regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Fossil|w/ CCS - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Temperature|Global Mean|MAGICC6|MED - cannot check regional aggregate because it has no regional components
Out[8]:
2005 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100
region model scenario variable
World AIM-Enduse 12.1 EMF27-450-Conv Primary Energy 458.20 518.89 500.15 521.23 569.53 581.44 NaN NaN NaN NaN NaN
EMF27-450-NoCCS Primary Energy 458.20 518.81 500.24 493.64 583.82 614.23 NaN NaN NaN NaN NaN
EMF27-550-LimBio Primary Energy 458.20 518.81 544.28 592.53 639.70 679.98 NaN NaN NaN NaN NaN
EMF27-Base-FullTech Primary Energy 458.20 518.81 545.24 619.43 715.12 816.88 NaN NaN NaN NaN NaN
EMF27-G8-EERE Primary Energy 458.20 518.64 487.22 463.48 499.48 555.22 NaN NaN NaN NaN NaN
GCAM 3.0 AMPERE3-450 Primary Energy 460.41 504.35 618.51 743.09 848.71 935.92 1001.75 1091.21 1177.40 1281.61 1418.91
AMPERE3-450P-CE Primary Energy 460.41 504.35 618.74 753.34 849.79 942.16 1005.56 1092.56 1177.72 1281.71 1418.59
AMPERE3-450P-EU Primary Energy 460.41 504.35 624.50 769.43 857.29 943.23 1002.71 1089.48 1177.26 1282.19 1420.23
AMPERE3-550 Primary Energy 458.03 501.89 622.66 751.83 863.50 956.22 1007.66 1064.33 1146.19 1221.76 1331.16
AMPERE3-Base-EUback Primary Energy 457.76 501.61 632.27 788.66 934.72 1073.33 1174.84 1261.38 1328.94 1395.49 1470.71
AMPERE3-CF450P-EU Primary Energy 460.41 504.35 635.96 793.04 870.95 942.83 1001.18 1087.41 1176.79 1282.69 1420.85
AMPERE3-RefPol Primary Energy 457.76 501.61 622.68 764.56 884.66 995.11 1080.88 1155.01 1214.59 1275.17 1325.32
EMF27-450-Conv Primary Energy 458.82 504.59 556.01 630.79 694.18 749.23 791.67 824.53 834.82 822.03 794.94
EMF27-450-NoCCS Primary Energy 458.82 504.45 546.22 599.28 628.39 654.74 684.87 733.00 787.27 834.91 NaN
EMF27-550-LimBio Primary Energy 458.82 504.45 576.89 663.16 729.41 777.90 817.73 865.14 911.34 939.80 964.83
EMF27-Base-FullTech Primary Energy 458.82 504.45 613.10 730.03 841.52 952.45 1049.69 1139.91 1218.93 1276.80 1319.59
IMAGE 2.4 AMPERE3-450 Primary Energy 441.25 473.91 544.13 577.42 638.17 685.99 751.05 763.70 778.29 821.22 863.04
AMPERE3-450P-CE Primary Energy 441.25 473.91 580.13 654.46 676.13 695.56 750.32 766.26 792.33 837.14 879.08
AMPERE3-450P-EU Primary Energy 441.25 473.91 598.26 696.60 692.53 699.38 747.23 763.69 791.18 836.82 879.48
AMPERE3-550 Primary Energy 441.25 473.79 559.87 603.66 689.49 758.18 815.96 844.50 862.31 905.22 942.64

An internally consistent database

If we have an internally consistent database, the checking methods will always return None rather than a DataFrame.

Repeating the same analysis as above can then confirm that all is well with the database as well as give us some insight into which variables do not have regional or sectoral breakdowns reported.

In [9]:
consistent_df = pyam.IamDataFrame(data="tutorial_check_database.csv", encoding='utf-8')
INFO:root:Reading `tutorial_check_database.csv`
In [10]:
for variable in consistent_df.filter(level=1).variables():
    diff = consistent_df.check_aggregate(
        variable,
        **np_isclose_args
    )
    assert diff is None
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Gas - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CF4 - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CH4 - cannot check aggregate because it has no components
In [11]:
for variable in consistent_df.filter(level=1).variables():
    diff = consistent_df.check_aggregate_regions(
        variable,
        **np_isclose_args
    )
    assert diff is None
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6 - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CF4 - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!

Putting it all together

Finally, we provide the check_internal_consistency method which does all the above for you and returns a dictionary with all of the dataframes which document the errors.

Note: at the moment, this method’s regional checking is limited to checking that all the regions sum to the World region. We cannot make this more automatic unless we start to store how the regions relate, see this issue.

In [12]:
# if all is good, None is returned
print("Checking consistent data"); time.sleep(0.5)
assert consistent_df.check_internal_consistency() is None

# otherwise we get a dict back
print("Checking AR5 subset"); time.sleep(0.5)
errors = df.check_internal_consistency()
Checking consistent data
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Cars - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Power - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Gas - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6 - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6|Industry - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6|Industry - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6|Solvents - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|C2F6|Solvents - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CF4 - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CF4 - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CH4 - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Aggregate Agg - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Aggregate Agg - cannot check regional aggregate because it has no regional components
Checking AR5 subset
INFO:root:Emissions|CO2 - 1390 of 1522 data points are not aggregates of components
INFO:root:Emissions|CO2 - 503 of 503 data points are not aggregates of regional components
INFO:root:Emissions|CO2|Fossil Fuels and Industry - 1258 of 1258 data points are not aggregates of components
INFO:root:Emissions|CO2|Fossil Fuels and Industry - 239 of 239 data points are not aggregates of regional components
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply - 239 of 239 data points are not aggregates of components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply|Electricity - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Emissions|CO2|Fossil Fuels and Industry|Energy Supply|Electricity - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Price|Carbon - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Price|Carbon - cannot check regional aggregate because it has no regional components
INFO:root:Primary Energy - 1522 of 1522 data points are not aggregates of components
INFO:root:Primary Energy - 503 of 503 data points are not aggregates of regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Coal - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Fossil|w/ CCS - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Primary Energy|Fossil|w/ CCS - cannot check regional aggregate because it has no regional components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Temperature|Global Mean|MAGICC6|MED - cannot check aggregate because it has no components
WARNING:root:Filtered IamDataFrame is empty!
INFO:root:Temperature|Global Mean|MAGICC6|MED - cannot check regional aggregate because it has no regional components
In [13]:
pprint(list(errors))
['Emissions|CO2-aggregate',
 'Emissions|CO2-regional',
 'Emissions|CO2|Fossil Fuels and Industry-aggregate',
 'Emissions|CO2|Fossil Fuels and Industry-regional',
 'Emissions|CO2|Fossil Fuels and Industry|Energy Supply-aggregate',
 'Primary Energy-aggregate',
 'Primary Energy-regional']
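
Each key combines a variable name with the kind of check that failed. As one possible way of summarising this, the sketch below groups the keys by variable using only the standard library. The list error_keys is copied from the output above (the DataFrame values are left out), and the variable-then-suffix key layout is inferred from that output rather than from a documented guarantee.

```python
from collections import defaultdict

# Keys copied from the output above; the DataFrame values are omitted.
error_keys = [
    "Emissions|CO2-aggregate",
    "Emissions|CO2-regional",
    "Emissions|CO2|Fossil Fuels and Industry-aggregate",
    "Emissions|CO2|Fossil Fuels and Industry-regional",
    "Emissions|CO2|Fossil Fuels and Industry|Energy Supply-aggregate",
    "Primary Energy-aggregate",
    "Primary Energy-regional",
]

failed = defaultdict(list)
for key in error_keys:
    # Split on the last "-" only, in case a variable name contains one.
    variable, check = key.rsplit("-", 1)
    failed[variable].append(check)

for variable, checks in failed.items():
    print(f"{variable}: {', '.join(checks)}")
```

In a live session, the same loop could iterate over the errors dictionary directly instead of the hard-coded list.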
In [14]:
errors["Emissions|CO2-aggregate"]
Out[14]:
2005 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100
variable model scenario region
Emissions|CO2 AIM-Enduse 12.1 EMF27-450-Conv ASIA 10540.74 13160.18 11899.38 9545.81 7355.07 6119.50 NaN NaN NaN NaN NaN
LAM 3285.00 3294.54 3367.62 2856.65 2207.36 1537.72 NaN NaN NaN NaN NaN
MAF 4302.21 4487.54 4238.91 3956.19 3490.81 2082.24 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.33 11646.37 8272.30 4457.91 1625.18 NaN NaN NaN NaN NaN
REF 3306.95 3604.42 3325.20 2991.24 1889.38 960.75 NaN NaN NaN NaN NaN
World 34492.05 38321.78 35588.66 28531.68 20287.46 13367.27 NaN NaN NaN NaN NaN
EMF27-450-NoCCS ASIA 10540.74 13160.11 11893.80 9478.33 7367.07 5513.79 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3362.61 2837.11 1889.89 899.63 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4239.03 3619.25 2787.47 1671.29 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 11659.29 8708.81 5488.86 3355.22 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3322.95 3076.67 1977.78 1181.73 NaN NaN NaN NaN NaN
World 34492.05 38313.59 35588.85 28629.65 20458.10 13660.19 NaN NaN NaN NaN NaN
EMF27-550-LimBio ASIA 10540.74 13160.11 14124.17 14218.08 13187.66 10019.56 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3445.63 3496.62 2986.08 1790.49 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4368.48 4519.64 4294.83 2733.76 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 12607.17 11752.01 9749.33 6501.31 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3826.80 3615.47 3258.31 3076.27 NaN NaN NaN NaN NaN
World 34492.05 38313.59 39531.61 38815.54 34676.38 25295.31 NaN NaN NaN NaN NaN
EMF27-Base-FullTech ASIA 10540.74 13160.11 14149.89 16559.14 19658.68 23071.34 NaN NaN NaN NaN NaN
LAM 3285.00 3286.68 3449.84 3660.68 3850.44 3866.20 NaN NaN NaN NaN NaN
MAF 4302.21 4487.49 4371.98 4751.63 5389.48 6082.37 NaN NaN NaN NaN NaN
OECD90 12085.85 12744.16 12642.70 13332.29 13742.93 14150.35 NaN NaN NaN NaN NaN
REF 3306.95 3604.39 3838.82 4220.97 4866.31 5615.39 NaN NaN NaN NaN NaN
World 34492.05 38313.59 39612.60 43835.49 49027.80 54552.86 NaN NaN NaN NaN NaN
EMF27-G8-EERE ASIA 10540.74 13152.56 13415.94 10147.89 7637.61 4435.80 NaN NaN NaN NaN NaN
LAM 3285.00 3286.52 3106.39 2825.27 1784.31 899.06 NaN NaN NaN NaN NaN
MAF 4302.21 4487.02 4091.19 3977.50 3659.80 3336.85 NaN NaN NaN NaN NaN
OECD90 12085.85 12750.81 10276.06 8833.95 5845.24 3473.56 NaN NaN NaN NaN NaN
REF 3306.95 3596.74 3453.29 3468.73 3376.25 3058.68 NaN NaN NaN NaN NaN
World 34492.05 38304.41 35425.96 30395.43 23536.71 16487.83 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ...
REMIND 1.5 EMF27-450-NoCCS OECD90 15111.39 15254.16 8082.87 2864.75 369.53 328.11 299.06 266.24 255.25 245.07 226.35
World 33837.41 38224.94 25524.60 7358.64 1691.05 1663.77 1616.52 1555.47 1553.00 1665.20 1883.11
EMF27-550-LimBio ASIA 10193.98 13239.55 14218.37 11920.79 8135.32 5963.84 4486.53 3100.11 2246.06 1843.16 1570.02
LAM 2926.60 3478.79 4413.41 1831.96 1357.42 934.84 712.03 523.57 418.39 359.64 107.38
MAF 4035.32 4381.03 4504.49 3368.89 3582.70 3883.52 3663.91 3349.07 3064.56 2919.43 3153.50
OECD90 15111.39 15241.56 13016.52 10555.13 7238.06 4454.98 2745.80 1531.01 766.12 275.60 -82.62
World 33837.41 37970.11 37657.41 28699.50 20936.83 15389.20 11536.73 8368.71 6360.33 5299.36 4644.20
EMF27-Base-FullTech ASIA 10193.98 13478.78 20256.01 24006.74 28404.78 33016.66 35977.35 36397.92 34529.79 28622.83 23400.42
LAM 2926.60 3508.40 5067.35 5464.43 4402.98 5424.51 5869.57 5988.95 6096.94 5152.73 4074.45
MAF 4035.32 4381.09 5364.84 5862.75 8659.61 12865.82 17680.13 22674.41 27242.60 29150.29 29981.48
OECD90 15111.39 15234.63 15486.57 16326.59 16610.24 16943.56 16515.90 15922.43 14587.22 11864.62 9683.61
World 33837.41 38293.08 48134.42 53343.82 59836.10 70077.89 77941.21 82914.15 84109.23 75995.09 68004.38
WITCH_EMF27 EMF27-450-Conv ASIA 9895.45 13210.18 13914.12 12004.49 10538.51 8767.49 7410.94 6299.16 3794.59 2865.46 2437.77
LAM 4660.57 4644.17 1851.46 1537.68 1421.58 658.62 -161.02 -1398.20 -1659.55 -1631.41 -1586.79
MAF 2508.31 2673.95 2224.44 1932.65 1907.50 1703.00 1588.40 1493.36 1413.90 1303.02 1118.21
OECD90 12644.40 12597.55 9780.38 6560.69 4755.20 3257.79 2240.72 1399.83 795.67 359.17 224.28
REF 3870.58 4035.17 2381.75 2055.81 1733.11 1347.51 1064.51 815.36 611.13 444.37 383.41
World 33579.32 37161.03 30152.15 24091.32 20355.91 15734.40 12143.56 8609.50 4955.74 3340.62 2576.88
EMF27-550-LimBio ASIA 9895.98 13341.76 17280.06 18745.41 16414.52 12419.72 10012.01 9373.38 8937.92 9270.47 9214.63
LAM 4660.84 4612.38 2729.37 2515.92 2286.12 1309.50 1048.52 677.15 470.62 -66.27 -210.35
MAF 2508.81 2621.97 2773.63 2885.03 2775.24 2552.54 2579.22 2758.68 2928.88 3067.60 3139.20
OECD90 12645.78 12542.87 11852.60 10275.13 7908.88 5310.74 4461.18 4405.41 4236.36 4016.45 3860.45
REF 3871.04 4062.72 3764.23 3380.51 2617.25 1957.66 1686.23 1658.31 1628.48 1550.83 1510.95
World 33582.45 37181.70 38399.88 37802.01 32002.02 23550.17 19787.16 18872.93 18202.25 17839.07 17514.87
EMF27-Base-FullTech ASIA 9893.46 13378.34 20016.55 26248.47 30889.38 34562.46 37566.05 40325.64 42647.52 44874.72 46657.52
LAM 4659.58 4623.98 4524.39 4644.99 4937.36 5250.67 5698.25 6117.40 6522.66 6945.51 7358.61
MAF 2506.45 2642.28 3291.19 4063.34 5028.41 6038.17 7017.40 8032.94 8851.50 9680.49 10373.40
OECD90 12639.28 12598.84 13097.95 13835.62 14969.12 15784.59 16540.18 17249.21 17924.86 18566.23 19180.64
REF 3868.87 4077.28 4636.23 5039.14 5412.35 5886.80 6279.44 6439.80 6722.19 7040.23 7284.21
World 33567.64 37320.72 45566.30 53831.56 61236.62 67522.70 73101.32 78164.98 82668.74 87107.17 90854.38

140 rows × 11 columns