pyam: an open-source Python package for IAM scenario analysis and visualization

Overview and scope

The pyam package provides a range of diagnostic tools and functions for analyzing and visualizing scenario data in the IAMC timeseries format.

Features:
  • Summary of models, scenarios, variables, and regions included in a snapshot.
  • Display of timeseries data as pandas.DataFrame with IAMC-specific filtering options.
  • Advanced visualization and plotting functions.
  • Diagnostic checks for non-reported variables or timeseries values to analyze and validate scenario data.
  • Categorization of scenarios according to timeseries data or metadata for further analysis.

The package can be used with data that follows the data template convention of the Integrated Assessment Modeling Consortium (IAMC). An illustrative example is shown below; see https://data.ene.iiasa.ac.at/database for more information.

Model        Scenario      Region  Variable        Unit  2005   2010   2015
MESSAGE V.4  AMPERE3-Base  World   Primary Energy  EJ/y  454.5  479.6

License, source code, documentation

The pyam package is licensed under the Apache 2.0 open-source license. See the LICENSE file included in this repository for the full text.

The source code is available at https://github.com/IAMconsortium/pyam. The full documentation of the latest release is available at https://software.ene.iiasa.ac.at/pyam.

The pyam data model

Timeseries data

A pyam.IamDataFrame is a wrapper for two pandas.DataFrame instances:

  • data: The data table is a dataframe containing the timeseries data in “long format”. It has the columns pyam.LONG_IDX = ['model', 'scenario', 'region', 'variable', 'unit', 'year', 'value'].
  • meta: The meta table is a dataframe containing categorization and descriptive indicators. It has the index pyam.META_IDX = ['model', 'scenario'].

The standard output format is the IAMC-style “wide format” (see the example above). This format can be accessed using pyam.IamDataFrame.timeseries(), which returns a pandas.DataFrame with the index pyam.IAMC_IDX = ['model', 'scenario', 'region', 'variable', 'unit'] and the years as columns.
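
The relation between the long and wide layouts can be sketched with plain pandas. The snippet below is a minimal illustration that does not call pyam and is not meant to show how timeseries() is implemented; it takes the two values from the example row above and reshapes them from one row per year into one row per timeseries. The names long_data, wide_data and iamc_idx are chosen here for illustration only.

```python
import pandas as pd

# Long format: one row per (timeseries, year) pair.
# The two values are taken from the example row shown earlier.
long_data = pd.DataFrame(
    [
        ["MESSAGE V.4", "AMPERE3-Base", "World", "Primary Energy", "EJ/y", 2005, 454.5],
        ["MESSAGE V.4", "AMPERE3-Base", "World", "Primary Energy", "EJ/y", 2010, 479.6],
    ],
    columns=["model", "scenario", "region", "variable", "unit", "year", "value"],
)

# Index columns of the wide layout (same names as listed for the wide format).
iamc_idx = ["model", "scenario", "region", "variable", "unit"]

# Wide format: move the years from rows to columns.
wide_data = long_data.set_index(iamc_idx + ["year"])["value"].unstack("year")

print(wide_data)
```

Each row of wide_data is one timeseries, identified by the five index levels, with one column per year.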

Filtering

The pyam package provides two methods for filtering timeseries data:

An existing IamDataFrame can be filtered using pyam.IamDataFrame.filter(col=…), where col can be any column of the data table (i.e., ['model', 'scenario', 'region', 'variable', 'unit', 'year']) or any column of the meta table. The returned object is a new pyam.IamDataFrame instance.
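
To make the two kinds of filter criteria concrete, here is a pandas-only sketch of what filtering by a data column and by a meta column amounts to conceptually. It does not use pyam, and the table contents (model_a, scen_1, scen_2, the category column, and all values) are made up for illustration.

```python
import pandas as pd

# Illustrative data table in long format; all values are made up.
data = pd.DataFrame(
    [
        ["model_a", "scen_1", "World", "Primary Energy", "EJ/y", 2010, 1.0],
        ["model_a", "scen_1", "World", "Emissions|CO2", "Mt CO2/yr", 2010, 2.0],
        ["model_a", "scen_2", "World", "Primary Energy", "EJ/y", 2010, 3.0],
    ],
    columns=["model", "scenario", "region", "variable", "unit", "year", "value"],
)

# Illustrative meta table, indexed by model and scenario.
meta = pd.DataFrame(
    {"category": ["baseline", "policy"]},
    index=pd.MultiIndex.from_tuples(
        [("model_a", "scen_1"), ("model_a", "scen_2")],
        names=["model", "scenario"],
    ),
)

# Criterion on a data column: keep the rows of one variable.
by_variable = data[data["variable"] == "Primary Energy"]

# Criterion on a meta column: find the matching (model, scenario) pairs,
# then keep the data rows that belong to those pairs.
keep = meta.index[meta["category"] == "policy"]
row_keys = pd.MultiIndex.from_frame(data[["model", "scenario"]])
by_category = data[row_keys.isin(keep)]
```

In this sketch a meta criterion never looks at the timeseries values themselves; it only decides which model-scenario pairs remain.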

A pandas.DataFrame with columns or index ['model', 'scenario'] can be filtered by any meta columns from a pyam.IamDataFrame using pyam.filter_by_meta(data, df, col=…, join_meta=False). The returned object is a pandas.DataFrame downselected to those model-scenario combinations where the meta column satisfies the criteria given by col=… . Optionally (join_meta=True), the meta columns are joined to the returned dataframe.
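
The downselect-and-join behaviour described above can be approximated in plain pandas as follows. This is a rough sketch under stated assumptions, not pyam's implementation: downselect_by_meta is a hypothetical helper, it takes the meta table directly rather than an IamDataFrame, it assumes 'model' and 'scenario' are columns (not the index) of the input, and it only supports equality criteria.

```python
import pandas as pd


def downselect_by_meta(data, meta, join_meta=False, **criteria):
    """Hypothetical helper (not part of pyam): keep the rows of `data`
    whose (model, scenario) pair satisfies all `criteria` in `meta`."""
    mask = pd.Series(True, index=meta.index)
    for col, value in criteria.items():
        mask &= meta[col] == value
    selected = meta[mask]
    # An inner join on the shared keys drops all non-matching scenarios.
    result = data.merge(
        selected.reset_index(), on=["model", "scenario"], how="inner"
    )
    if not join_meta:
        # Return only the original columns, without the meta columns.
        result = result[list(data.columns)]
    return result


# Made-up example tables.
data = pd.DataFrame(
    {
        "model": ["model_a", "model_a"],
        "scenario": ["scen_1", "scen_2"],
        "value": [1.0, 3.0],
    }
)
meta = pd.DataFrame(
    {"category": ["baseline", "policy"]},
    index=pd.MultiIndex.from_tuples(
        [("model_a", "scen_1"), ("model_a", "scen_2")],
        names=["model", "scenario"],
    ),
)

policy_only = downselect_by_meta(data, meta, category="policy")
policy_with_meta = downselect_by_meta(data, meta, join_meta=True, category="policy")
```

With join_meta left at False the result has the same columns as the input; with join_meta=True the category column from the meta table is appended.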

pyam documentation

See this guide for guidelines on NumPy/SciPy Documentation conventions.