Class IamDataFrame

class pyam.IamDataFrame(data, **kwargs)[source]

This class is a wrapper for dataframes following the IAMC format. It provides diagnostic features (such as validating data and checking the completeness of the variables provided) as well as visualization and plotting tools.

Methods

append(other[, inplace]) Import or read timeseries data and append to IamDataFrame
as_pandas([with_metadata]) Return this as a pd.DataFrame
bar_plot(*args, **kwargs) Plot timeseries bars of existing data
categorize(name, value, criteria[, color, …]) Assign scenarios to a category according to specific criteria
check_aggregate(variable[, components, …]) Check whether the timeseries data match the aggregation
col_apply(col, func, *args, **kwargs) Apply a function to a column
convert_unit(conversion_mapping[, inplace]) Converts units based on provided unit conversion factors
export_metadata(path) Export metadata to Excel
filter([filters, keep, inplace]) Return a filtered IamDataFrame (i.e., a subset of current data)
head(*args, **kwargs) Identical to pd.DataFrame.head() operating on data
interpolate(year) Interpolate missing values in timeseries (linear interpolation)
line_plot([x, y]) Plot timeseries lines of existing data
load_metadata(path, *args, **kwargs) Load metadata exported from pyam.IamDataFrame instance
map_regions(map_col[, agg, copy_col, fname, …]) Map the regions of existing data to a new set of regions, optionally aggregating the data
models() Get a list of models
pie_plot(*args, **kwargs) Plot a pie chart
pivot_table(index, columns[, values, …]) Returns a pivot table
region_plot(**kwargs) Plot regional data for a single model, scenario, variable, and year
regions() Get a list of regions
rename(mapping[, inplace]) Rename and aggregate column entries using groupby.sum() on values.
require_variable(variable[, unit, year, …]) Check whether all scenarios have a required variable
reset_exclude() Reset exclusion assignment for all scenarios to exclude: False
scenarios() Get a list of scenarios
set_meta(meta[, name, index]) Add metadata columns as pd.Series, list or value (int/float/str)
stack_plot(*args, **kwargs) Plot timeseries stacks of existing data
tail(*args, **kwargs) Identical to pd.DataFrame.tail() operating on data
timeseries() Returns a dataframe in the standard IAMC format
to_csv(path[, index]) Write data to a csv file
to_excel([path, writer, sheet_name, index]) Write timeseries data to Excel using the IAMC template convention
validate([criteria, exclude_on_fail]) Validate scenarios using criteria on timeseries values
variables([include_units]) Get a list of variables
append(other, inplace=False, **kwargs)[source]

Import or read timeseries data and append to IamDataFrame

Parameters:
  • other (pyam.IamDataFrame, ixmp.TimeSeries, ixmp.Scenario, pd.DataFrame or data file) – an IamDataFrame, TimeSeries or Scenario (requires ixmp), or pd.DataFrame or data file with IAMC-format data columns
  • inplace (bool, default False) – if True, do operation inplace and return None
as_pandas(with_metadata=False)[source]

Return this as a pd.DataFrame

Parameters: with_metadata (bool, default False) – if True, join data with existing metadata
bar_plot(*args, **kwargs)[source]

Plot timeseries bars of existing data

see pyam.plotting.bar_plot() for all available options

categorize(name, value, criteria, color=None, marker=None, linestyle=None)[source]

Assign scenarios to a category according to specific criteria or display the category assignment

Parameters:
  • name (str) – category column name
  • value (str) – category identifier
  • criteria (dict) – dictionary with variables mapped to applicable checks (‘up’ and ‘lo’ for respective bounds, ‘year’ for years - optional)
  • color (str) – assign a color to this category for plotting
  • marker (str) – assign a marker to this category for plotting
  • linestyle (str) – assign a linestyle to this category for plotting
check_aggregate(variable, components=None, units=None, exclude_on_fail=False, multiplier=1, **kwargs)[source]

Check whether the timeseries data match the aggregation of components or sub-categories

Parameters:
  • variable (str) – variable to be checked for matching aggregation of sub-categories
  • components (list of str, default None) – list of variables, defaults to all sub-categories of variable
  • units (str or list of str, default None) – filter variable and components for given unit(s)
  • exclude_on_fail (boolean, default False) – flag scenarios failing validation as exclude: True
  • multiplier (number, default 1) – factor when comparing variable and sum of components
  • kwargs – passed to np.isclose()
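
The aggregation check described above can be sketched in plain Python. This is only an illustration of the idea, not pyam's implementation: the helper `aggregate_matches` and the sample values are hypothetical, `math.isclose` stands in for `np.isclose()` (the two use different default tolerances), and both the choice of direct sub-categories as default components and the application of `multiplier` to the component sum are assumptions.

```python
import math

# Hypothetical values for one model, scenario, region and year.
values = {
    "Primary Energy": 10.0,
    "Primary Energy|Coal": 6.0,
    "Primary Energy|Gas": 3.0,
    "Primary Energy|Wind": 1.0,
}


def aggregate_matches(values, variable, components=None, multiplier=1, **kwargs):
    """Compare a variable with the (scaled) sum of its components."""
    if components is None:
        # Assumption: default to sub-categories one level below `variable`.
        prefix = variable + "|"
        components = [
            name for name in values
            if name.startswith(prefix) and "|" not in name[len(prefix):]
        ]
    total = sum(values[name] for name in components)
    # Assumption: the multiplier scales the sum of the components.
    return math.isclose(values[variable], multiplier * total, **kwargs)


all_components_ok = aggregate_matches(values, "Primary Energy")
partial = ["Primary Energy|Coal", "Primary Energy|Gas"]
partial_ok = aggregate_matches(values, "Primary Energy", components=partial)
partial_ok_loose = aggregate_matches(
    values, "Primary Energy", components=partial, rel_tol=0.2
)
```

Extra keyword arguments are forwarded to the closeness test, which mirrors how `kwargs` are described as being passed on to `np.isclose()`.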
col_apply(col, func, *args, **kwargs)[source]

Apply a function to a column

Parameters:
  • col (string) – column in either data or metadata
  • func (function) – function to apply
convert_unit(conversion_mapping, inplace=False)[source]

Converts units based on provided unit conversion factors

Parameters:
  • conversion_mapping (dict) – for each unit for which a conversion should be carried out, provide current unit and target unit and conversion factor {<current unit>: [<target unit>, <conversion factor>]}
  • inplace (bool, default False) – if True, do operation inplace and return None
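
To make the mapping format concrete, here is a minimal sketch in plain Python. It does not call pyam; the helper `convert` and the sample rows are hypothetical, and it assumes that values are multiplied by the conversion factor.

```python
# Mapping format from the description:
# {<current unit>: [<target unit>, <conversion factor>]}
conversion_mapping = {"EJ/yr": ["PJ/yr", 1000.0]}

# Hypothetical sample rows.
rows = [
    {"variable": "Primary Energy", "unit": "EJ/yr", "value": 1.5},
    {"variable": "Emissions|CO2", "unit": "Mt CO2/yr", "value": 40.0},
]


def convert(row, mapping):
    """Return a converted copy of the row; unmapped units stay unchanged."""
    if row["unit"] not in mapping:
        return dict(row)
    target_unit, factor = mapping[row["unit"]]
    # Assumption: the value is multiplied by the conversion factor.
    return {**row, "unit": target_unit, "value": row["value"] * factor}


converted = [convert(row, conversion_mapping) for row in rows]
```

Returning copies rather than modifying the rows corresponds to the `inplace=False` default described above.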
export_metadata(path)[source]

Export metadata to Excel

Parameters: path (string) – path/filename for xlsx file of metadata export
filter(filters=None, keep=True, inplace=False, **kwargs)[source]

Return a filtered IamDataFrame (i.e., a subset of current data)

Parameters:
  • keep (bool, default True) – keep all scenarios satisfying the filters (if True) or the inverse
  • inplace (bool, default False) – if True, do operation inplace and return None
  • filters (by kwargs or dict (deprecated)) –
    The following columns are available for filtering:
    • metadata columns: filter by category assignment in metadata
    • ‘model’, ‘scenario’, ‘region’, ‘variable’, ‘unit’: string or list of strings, where * can be used as a wildcard
    • ‘level’: the maximum “depth” of IAM variables (number of ‘|’), excluding the strings given in the ‘variable’ argument
    • ‘year’: an integer, a list of integers or a range; the last year of a range is not included, so range(2010, 2015) is interpreted as [2010, ..., 2014]
    • ‘regexp=True’: overrides the pseudo-regexp syntax in pattern_match()
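
The filtering rules above can be illustrated with a small plain-Python sketch. It is not pyam's implementation: the helpers `depth` and `matches` and the sample rows are hypothetical, `fnmatch.fnmatchcase` stands in for pyam's own `pattern_match()` (it also treats `?` and `[...]` as special, which pyam may not), and the level is counted from the top of the variable tree rather than relative to the ‘variable’ argument.

```python
from fnmatch import fnmatchcase

# Hypothetical sample rows in a long IAMC-style layout.
rows = [
    {"variable": "Primary Energy", "year": 2010, "value": 1.0},
    {"variable": "Primary Energy|Coal", "year": 2014, "value": 0.5},
    {"variable": "Primary Energy|Coal", "year": 2015, "value": 0.6},
    {"variable": "Emissions|CO2", "year": 2012, "value": 3.0},
]


def depth(variable):
    """Depth of an IAM variable, counted as the number of '|' separators."""
    return variable.count("|")


def matches(row, variable="*", year=None, level=None):
    """Return True if the row satisfies every filter that was given."""
    if not fnmatchcase(row["variable"], variable):
        return False
    if year is not None and row["year"] not in year:
        return False
    if level is not None and depth(row["variable"]) > level:
        return False
    return True


# range(2010, 2015) covers 2010 to 2014, so the 2015 row is dropped.
subset = [r for r in rows if matches(r, "Primary Energy*", range(2010, 2015))]
# level=0 keeps only variables without any '|' separator.
top_level = [r for r in rows if matches(r, level=0)]
```

Passing `range(2010, 2016)` instead would also keep the 2015 row.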
head(*args, **kwargs)[source]

Identical to pd.DataFrame.head() operating on data

interpolate(year)[source]

Interpolate missing values in timeseries (linear interpolation)

Parameters: year (int) – year to be interpolated
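
As a reminder of what linear interpolation computes for a single timeseries, here is a minimal sketch in plain Python. The function below is a hypothetical stand-alone helper, not the pyam method, and it assumes that the requested year lies between two years that already have data.

```python
def interpolate(series, year):
    """Linearly interpolate a value for `year` from its nearest neighbours."""
    if year in series:
        return series[year]
    # Nearest years with data on either side of the requested year.
    earlier = max(y for y in series if y < year)
    later = min(y for y in series if y > year)
    weight = (year - earlier) / (later - earlier)
    return series[earlier] + weight * (series[later] - series[earlier])


# Hypothetical timeseries: year -> value.
series = {2010: 10.0, 2020: 20.0}
value_2015 = interpolate(series, 2015)
```

A year outside the range of existing data raises a `ValueError` in this sketch, since there is no neighbour to interpolate from.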
line_plot(x='year', y='value', **kwargs)[source]

Plot timeseries lines of existing data

see pyam.plotting.line_plot() for all available options

load_metadata(path, *args, **kwargs)[source]

Load metadata exported from pyam.IamDataFrame instance

Parameters: path (string) – xlsx file with metadata exported from pyam.IamDataFrame instance
map_regions(map_col, agg=None, copy_col=None, fname=None, region_col=None, inplace=False)[source]

Map the regions of existing data to a new set of regions, optionally aggregating the data

Parameters:
  • map_col (string) – The column used to map new regions to. Common examples include iso and 5_region.
  • agg (string, optional) – Perform a data aggregation. Options include: sum.
  • copy_col (string, optional) – Copy the existing region data into a new column for later use.
  • fname (string, optional) – Use a non-default region mapping file
  • region_col (string, optional) – Use a non-default column name for regions to map from.
  • inplace (bool, default False) – if True, do operation inplace and return None
models()[source]

Get a list of models

pie_plot(*args, **kwargs)[source]

Plot a pie chart

see pyam.plotting.pie_plot() for all available options

pivot_table(index, columns, values='value', aggfunc='count', fill_value=None, style=None)[source]

Returns a pivot table

Parameters:
  • index (str or list of strings) – rows for Pivot table
  • columns (str or list of strings) – columns for Pivot table
  • values (str, default 'value') – dataframe column to aggregate or count
  • aggfunc (str or function, default 'count') – function used for aggregation, accepts ‘count’, ‘mean’, and ‘sum’
  • fill_value (scalar, default None) – value to replace missing values with
  • style (str, default None) – output style for pivot table formatting; accepts ‘highlight_not_max’ and ‘heatmap’
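
The effect of the ‘count’, ‘mean’ and ‘sum’ options for aggfunc can be sketched without pandas. The helper `pivot` and the sample rows are hypothetical, and the sketch accepts only a single column name for index and columns, whereas the method above also accepts lists.

```python
from collections import defaultdict

# Hypothetical sample rows.
rows = [
    {"model": "model_a", "region": "World", "value": 1.0},
    {"model": "model_a", "region": "World", "value": 2.0},
    {"model": "model_a", "region": "Asia", "value": 3.0},
    {"model": "model_b", "region": "World", "value": 4.0},
]

REDUCERS = {
    "count": len,
    "sum": sum,
    "mean": lambda cell: sum(cell) / len(cell),
}


def pivot(rows, index, columns, values="value", aggfunc="count"):
    """Group values by (index, columns) and reduce each cell with aggfunc."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[columns])].append(row[values])
    return {key: REDUCERS[aggfunc](cell) for key, cell in cells.items()}


counts = pivot(rows, "model", "region")
sums = pivot(rows, "model", "region", aggfunc="sum")
means = pivot(rows, "model", "region", aggfunc="mean")
```

With the default ‘count’, each cell reports how many data points exist for that combination, which is a quick way to spot gaps in the data.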
region_plot(**kwargs)[source]

Plot regional data for a single model, scenario, variable, and year

see pyam.plotting.region_plot() for all available options

regions()[source]

Get a list of regions

rename(mapping, inplace=False)[source]

Rename and aggregate column entries using groupby.sum() on values. When renaming models or scenarios, the uniqueness of the index must be maintained, and the function will raise an error otherwise.

Parameters:
  • mapping (dict) – for each column where entries should be renamed, provide current name and target name: {<column name>: {<current_name_1>: <target_name_1>, <current_name_2>: <target_name_2>}}
  • inplace (bool, default False) – if True, do operation inplace and return None
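
The combination of renaming and summing can be sketched in plain Python. This only illustrates the nested mapping format and the aggregation idea; it is not pyam's implementation, and the sample rows and region names are hypothetical.

```python
from collections import defaultdict

# Nested mapping format: {<column name>: {<current name>: <target name>}}
mapping = {"region": {"Germany": "Europe", "France": "Europe"}}

# Hypothetical sample rows.
rows = [
    {"region": "Germany", "variable": "Population", "year": 2020, "value": 83.0},
    {"region": "France", "variable": "Population", "year": 2020, "value": 67.0},
    {"region": "Japan", "variable": "Population", "year": 2020, "value": 126.0},
]

totals = defaultdict(float)
for row in rows:
    renamed = dict(row)
    for column, names in mapping.items():
        # Entries that are not in the mapping keep their current name.
        renamed[column] = names.get(row[column], row[column])
    key = (renamed["region"], renamed["variable"], renamed["year"])
    # Rows that end up with the same key are summed, like groupby.sum().
    totals[key] += renamed["value"]
```

Because both renamed regions map to the same target, their values are merged into a single entry.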
require_variable(variable, unit=None, year=None, exclude_on_fail=False)[source]

Check whether all scenarios have a required variable

Parameters:
  • variable (str) – required variable
  • unit (str, default None) – name of unit (optional)
  • year (int or list, default None) – years (optional)
  • exclude_on_fail (bool, default False) – flag scenarios missing the required variables as exclude: True
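
The check can be pictured as a set difference between all scenarios and those that provide the variable. The sketch below is plain Python with a hypothetical helper `scenarios_missing` and hypothetical sample rows; it does not reproduce pyam's return value or its exclusion flag.

```python
# Hypothetical sample rows.
rows = [
    {"model": "model_a", "scenario": "scen_1", "variable": "Primary Energy",
     "unit": "EJ/yr", "year": 2030},
    {"model": "model_a", "scenario": "scen_1", "variable": "Emissions|CO2",
     "unit": "Mt CO2/yr", "year": 2030},
    {"model": "model_a", "scenario": "scen_2", "variable": "Primary Energy",
     "unit": "EJ/yr", "year": 2030},
]


def scenarios_missing(rows, variable, unit=None, year=None):
    """Return (model, scenario) pairs without data for the variable."""
    if year is None:
        years = None
    else:
        years = {year} if isinstance(year, int) else set(year)
    everyone = {(r["model"], r["scenario"]) for r in rows}
    present = {
        (r["model"], r["scenario"])
        for r in rows
        if r["variable"] == variable
        and (unit is None or r["unit"] == unit)
        and (years is None or r["year"] in years)
    }
    return sorted(everyone - present)


missing_co2 = scenarios_missing(rows, "Emissions|CO2")
missing_2050 = scenarios_missing(rows, "Primary Energy", year=2050)
```

Scenarios returned by such a check are the ones that would be flagged when exclusion on failure is requested.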
reset_exclude()[source]

Reset exclusion assignment for all scenarios to exclude: False

scenarios()[source]

Get a list of scenarios

set_meta(meta, name=None, index=None)[source]

Add metadata columns as pd.Series, list or value (int/float/str)

Parameters:
  • meta (pd.Series, list, int, float or str) – column to be added to metadata (by [‘model’, ‘scenario’] index if possible)
  • name (str, optional) – meta column name (defaults to meta pd.Series.name)
  • index (pyam.IamDataFrame, pd.DataFrame or pd.MultiIndex, optional) – index to be used for setting meta column ([‘model’, ‘scenario’])
stack_plot(*args, **kwargs)[source]

Plot timeseries stacks of existing data

see pyam.plotting.stack_plot() for all available options

tail(*args, **kwargs)[source]

Identical to pd.DataFrame.tail() operating on data

timeseries()[source]

Returns a dataframe in the standard IAMC format

to_csv(path, index=False, **kwargs)[source]

Write data to a csv file

Parameters:
  • path (string) – path/filename of the csv file
  • index (boolean, default False) – write row names (index)
to_excel(path=None, writer=None, sheet_name='data', index=False, **kwargs)[source]

Write timeseries data to Excel using the IAMC template convention (wrapper for pd.DataFrame.to_excel())

Parameters:
  • path (string) – file path
  • writer (ExcelWriter object) – existing ExcelWriter
  • sheet_name (string, default 'data') – name of the sheet that will contain the (filtered) IamDataFrame
  • index (boolean, default False) – write row names (index)
validate(criteria={}, exclude_on_fail=False)[source]

Validate scenarios using criteria on timeseries values

Parameters:
  • criteria (dict) – dictionary with variable keys and check values (‘up’ and ‘lo’ for respective bounds, ‘year’ for years)
  • exclude_on_fail (bool, default False) – flag scenarios failing validation as exclude: True
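
To show how a criteria dictionary of this shape might be read, here is a minimal plain-Python sketch. The helper `violates` and the sample rows are hypothetical; treating the bounds as inclusive and ‘year’ as a single integer are assumptions, not confirmed pyam behavior.

```python
# Criteria format: {<variable>: {'up': ..., 'lo': ..., 'year': ...}}
criteria = {"Emissions|CO2": {"up": 35.0, "year": 2030}}

# Hypothetical sample rows.
rows = [
    {"scenario": "scen_a", "variable": "Emissions|CO2", "year": 2030, "value": 30.0},
    {"scenario": "scen_b", "variable": "Emissions|CO2", "year": 2030, "value": 40.0},
    {"scenario": "scen_b", "variable": "Emissions|CO2", "year": 2050, "value": 20.0},
]


def violates(row, criteria):
    """Return True if the row breaks a bound given for its variable."""
    check = criteria.get(row["variable"])
    if check is None:
        return False
    # Assumption: 'year' is a single integer restricting the check.
    if "year" in check and row["year"] != check["year"]:
        return False
    # Assumption: a value exactly on a bound still passes.
    too_high = "up" in check and row["value"] > check["up"]
    too_low = "lo" in check and row["value"] < check["lo"]
    return too_high or too_low


failing = [row for row in rows if violates(row, criteria)]
failing_scenarios = sorted({row["scenario"] for row in failing})
```

The scenarios collected in `failing_scenarios` correspond to those that would be marked as excluded when `exclude_on_fail` is set.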
variables(include_units=False)[source]

Get a list of variables

Parameters: include_units (boolean, default False) – include the units