API reference

This page provides an auto-generated summary of pandas-indexing’s API. For more details and examples, refer to the relevant chapters in the main part of the documentation.

core

Core module.

add_zeros_like(data: T, reference: MultiIndex | DataFrame | Series, *, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str]) T

Add explicit levels to data as 0 values.

Remaining levels in data not found in levels or derive are taken from reference (or its index).

Parameters:
  • data (Data) – Series or DataFrame to extend with zeros

  • reference (Index) – expected level labels (like model, scenario combinations)

  • derive (dict) – derive labels in a level from a multiindex with allowed combinations

  • **levels ([str]) – which labels should be added to df

Returns:

unsorted data with additional zero data

Return type:

DataFrame

aggregatelevel(data: T, agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]]) T

Aggregate labels on one or multiple levels together.

Parameters:
  • data (Data) – Series or DataFrame to aggregate

  • agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.

  • axis (Axis, optional) – Axis on which to aggregate, default 0

  • dropna (bool, optional) – Whether to drop or preserve NANs in the index, default True

  • mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data

  • **levels – Mapping for one or multiple levels that defines which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.

Returns:

Aggregated data

Return type:

Data

Notes

If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
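
As a rough illustration of the groupby route mentioned in this note, the following plain-pandas sketch aggregates over a complete label mapping. The data, the level names and the mapping are made up for the example, and it does not show how aggregatelevel itself is implemented.

```python
import pandas as pd

# Hypothetical data indexed by region and year
s = pd.Series(
    [1.0, 2.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [("sdn", 2020), ("ssd", 2020), ("ken", 2020)],
        names=["region", "year"],
    ),
)

# Complete mapping from every existing label to its aggregate label
mapping = {"sdn": "sdn_ssd", "ssd": "sdn_ssd", "ken": "ken"}

# Translate the region labels, keep the year labels unchanged
regions = s.index.get_level_values("region").map(mapping).rename("region")
years = s.index.get_level_values("year")

# Group by the translated labels and sum within each group
aggregated = s.groupby([regions, years]).sum()
```

Every label has to appear in the mapping here; labels that map to missing values would be dropped by groupby unless dropna=False is passed.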

antijoin(index_or_data: S, other: Index | Series | DataFrame, *, level: str | int | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0) S

Antijoin index_or_data with index other.

i.e., remove all occurrences of other from data.

Parameters:
  • index_or_data (Index or DataFrame or Series) – Data to be filtered

  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series
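
The effect of an antijoin on all levels can be approximated in plain pandas with MultiIndex.isin. This is only a sketch of the semantics described above with invented data, not the library's implementation, and it does not cover the level argument.

```python
import pandas as pd

# Hypothetical data with two index levels
s = pd.Series(
    [1, 2, 3],
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("a", "y"), ("b", "x")],
        names=["model", "scenario"],
    ),
)

# Index entries that should be removed from the data
other = pd.MultiIndex.from_tuples([("a", "y")], names=["model", "scenario"])

# Keep only the rows whose index entry does not occur in `other`
result = s[~s.index.isin(other)]
```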

assignlevel(df: T, frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) T

Add or overwrite levels on a multiindex.

Parameters:
  • df (DataFrame, Series or Index) – Index, Series or DataFrame of which to change index levels

  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df
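
To make "add or overwrite levels" concrete, here is a plain-pandas sketch that appends a constant unit level to an index by going through a frame of labels. The level names and the label "Mt CO2" are illustrative, and assignlevel may work differently internally.

```python
import pandas as pd

# Hypothetical two-level index
idx = pd.MultiIndex.from_tuples(
    [("m1", "s1"), ("m1", "s2")], names=["model", "scenario"]
)

# Turn the levels into columns, add (or overwrite) a column, rebuild the index
labels = idx.to_frame(index=False).assign(unit="Mt CO2")
new_idx = pd.MultiIndex.from_frame(labels)
```

Assigning to a column name that already exists overwrites that level instead of appending a new one.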

concat(objs: Iterable[T] | Mapping[str, T], order: Sequence[str] | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, keys: None | str | Index | Sequence = None, copy: bool = False, **concat_kwds) T

Concatenate pandas objects along a particular axis.

In addition to the functionality provided by pd.concat, if an axis has a multiindex then the level order is reordered consistently.

Parameters:
  • objs (a sequence or mapping of Series, DataFrame or Index objects) – If a mapping is passed the keys will be used as a new index level (with the name of the keys argument).

  • order (a sequence of str, default None) – The order of level names in which to concatenate

  • axis (Axis) – Axis along which to concatenate

  • keys (str or list-like of str) – If objs is a mapping, a string-like value will be used as name of the new level, otherwise it is passed on to pd.concat.

  • copy (bool, default False) – Whether to copy the underlying data

  • **concat_kwds – Other arguments accepted by pd.concat

Return type:

Concatenated data or index

Raises:

ValueError – If the level names of objs do not match

See also

pandas.concat
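
The level reordering described above can be imitated by hand with pd.concat and reorder_levels. The sketch below uses invented data and assumes that the first object defines the target order. It illustrates the idea only and is not the library's code.

```python
import pandas as pd

# Two hypothetical series with the same levels in a different order
a = pd.Series(
    [1],
    index=pd.MultiIndex.from_tuples([("m1", "s1")], names=["model", "scenario"]),
)
b = pd.Series(
    [2],
    index=pd.MultiIndex.from_tuples([("s2", "m1")], names=["scenario", "model"]),
)

# Use the level order of the first object for all objects
order = list(a.index.names)
combined = pd.concat([a, b.reorder_levels(order)])
```

Without the reorder_levels call, plain pd.concat would combine the tuples positionally, so "s2" would end up in the model level.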

describelevel(index_or_data: DataFrame | Series | Index, n: int = 80, as_str: bool = False) str | None

Describe index levels.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame of which to describe index levels

  • n (int, default 80) – The maximum line length

  • as_str (bool, default False) – Whether to return the description as a string instead of printing it

Returns:

description – if as_str is True

Return type:

str, optional

dropnalevel(index_or_data: T, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) T

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame of which to drop rows or columns

  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
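
The difference between how="any" and how="all" can be seen in a plain-pandas sketch that builds the row mask from the index levels. The data is invented and the code only mirrors the behaviour described above.

```python
import pandas as pd

# Hypothetical index with missing labels in one or both levels
idx = pd.MultiIndex.from_tuples(
    [("a", "x"), ("b", None), (None, None)], names=["model", "scenario"]
)
s = pd.Series([1, 2, 3], index=idx)

# One boolean column per level, True where the label is missing
missing = s.index.to_frame(index=False).isna()

# how="any": drop an entry if at least one level is missing
any_dropped = s[~missing.any(axis=1).to_numpy()]

# how="all": drop an entry only if every level is missing
all_dropped = s[~missing.all(axis=1).to_numpy()]
```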

ensure_multiindex(s: T) T
extractlevel(index_or_data: T, template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, drop: bool | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) T

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of the given levels are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Pattern names in the optional argument can be missing (including a leading | character) and are replaced by the string "Total" then.

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Data to modify

  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop the non-matching levels

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel
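
One way to think about templates is as shorthand for regular expressions with named groups. The standard-library sketch below converts a template into such a regex and applies it to the labels from the examples. The helper template_to_regex is hypothetical and only illustrates the matching idea. The library's actual translation may differ, for instance in how optional patterns are handled.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Translate "SE|{type}|{fuel}" into a regex with named groups (hypothetical helper)."""
    # Splitting on a capturing group keeps the pattern names:
    # even positions hold literal text, odd positions hold pattern names
    parts = re.split(r"\{(\w+)\}", template)
    pieces = []
    for position, part in enumerate(parts):
        if position % 2 == 0:
            pieces.append(re.escape(part))
        else:
            pieces.append(f"(?P<{part}>.*?)")
    return re.compile("".join(pieces))


regex = template_to_regex("SE|{type}|{fuel}")

labels = ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"]
extracted = {}
for label in labels:
    match = regex.fullmatch(label)
    if match is not None:  # non-matching labels are skipped, like dropna=True
        extracted[label] = match.groupdict()
```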

fixindexna(index_or_data: T, axis: Literal[0, 1, 'index', 'columns'] = 0) T

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:
  • index_or_data (Index, Series or DataFrame) – Data

  • axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

formatlevel(index_or_data: T, drop: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) T

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Data to modify

  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to non-existent levels
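
Formatting is the reverse of extraction: existing level labels are filled into a template. The plain-pandas sketch below builds variable labels with str.format from an invented index. It shows the idea only and leaves out the drop and optional arguments.

```python
import pandas as pd

# Hypothetical index whose levels are referenced by the template
idx = pd.MultiIndex.from_tuples(
    [("Elec", "Bio"), ("Elec", "Coal")], names=["type", "fuel"]
)

template = "SE|{type}|{fuel}"

# Fill the labels of each index entry into the template
rows = idx.to_frame(index=False).to_dict("records")
variable = [template.format(**row) for row in rows]
```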

index_names(s, raise_on_index=False)
isna(index_or_data: Index | Series | DataFrame, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
notna(index_or_data: Index | Series | DataFrame, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
projectlevel(index_or_data: T, levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) T

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame to project

  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
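
Projecting onto a set of levels amounts to dropping every other level. A plain-pandas sketch with invented data, not the library's implementation:

```python
import pandas as pd

# Hypothetical series with three index levels
s = pd.Series(
    [1, 2],
    index=pd.MultiIndex.from_tuples(
        [("m1", "s1", "GWh"), ("m1", "s2", "GWh")],
        names=["model", "scenario", "unit"],
    ),
)

# Levels to keep; everything else is dropped
keep = ["model", "scenario"]
to_drop = [name for name in s.index.names if name not in keep]
projected = s.droplevel(to_drop)
```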

semijoin(frame_or_series: S, other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) S

Semijoin frame_or_series by index other.

Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.

Parameters:
  • frame_or_series (DataFrame or Series) – Data to be filtered

  • other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.

  • how ({'left', 'right', 'inner', 'outer'}) – Join method to use

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • sort (bool, optional) – Whether to sort the index

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

  • fill_value – Value for filling gaps introduced by right or outer joins

  • fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed

Return type:

DataFrame or Series

Raises:
  • ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other

  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if frame_or_series does not derive from DataFrame or Series
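
The two steps named in the description (join the indexes, then reindex the data) can be sketched in plain pandas for a single-level index. The data and the fill value are invented, and the sketch ignores the level, sort and fail_on_reorder arguments.

```python
import pandas as pd

# Hypothetical data and a second index to join with
s = pd.Series([1.0, 2.0], index=pd.Index(["a", "b"], name="region"))
other = pd.Index(["b", "c"], name="region")

# Inner join: keep only the entries present in both indexes
inner = s.reindex(s.index.join(other, how="inner"))

# Outer join: union of both indexes, gaps are filled with fill_value
outer = s.reindex(s.index.join(other, how="outer"), fill_value=0.0)
```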

summarylevel(index_or_data: DataFrame | Series | Index, n: int = 80, as_str: bool = False) str | None

Describe index levels.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame of which to describe index levels

  • n (int, default 80) – The maximum line length

  • as_str (bool, default False) – Whether to return the description as a string instead of printing it

Returns:

description – if as_str is True

Return type:

str, optional

to_tidy(data: Series | DataFrame, meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year') DataFrame

Convert multi-indexed time-series dataframe to tidy dataframe.

Parameters:
  • data (Data) – Data in time-series representation with years on columns

  • meta (DataFrame, optional) – Meta data that is joined before tidying up

  • value_name (str, optional) – Column name for the values; default “value”. Use None to not change the name.

  • columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to not change the name.

Returns:

Tidy dataframe without index

Return type:

DataFrame
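
For orientation, a comparable wide-to-tidy conversion can be written in plain pandas with melt. The frame below is invented, the sketch leaves out the meta argument, and the row order may differ from what to_tidy returns.

```python
import pandas as pd

# Hypothetical time series with years on the columns
df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("m1", "s1"), ("m1", "s2")], names=["model", "scenario"]
    ),
    columns=[2020, 2030],
)

# Move the index levels into columns, then unpivot the year columns
tidy = df.reset_index().melt(
    id_vars=["model", "scenario"], var_name="year", value_name="value"
)
```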

uniquelevel(index_or_data: DataFrame | Series | Index, levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame of which to describe index levels

  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index
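
In plain pandas, unique labels of one level or unique combinations of several levels can be obtained as sketched below. The index is invented and the code only illustrates the kind of result described above.

```python
import pandas as pd

# Hypothetical three-level index with repeated model/scenario pairs
idx = pd.MultiIndex.from_tuples(
    [("m1", "s1", "GWh"), ("m1", "s1", "EJ"), ("m2", "s1", "GWh")],
    names=["model", "scenario", "unit"],
)

# Unique labels of a single level
models = idx.get_level_values("model").unique()

# Unique combinations of several levels: drop the others, then deduplicate
pairs = idx.droplevel("unit").unique()
```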

accessors

Registers convenience accessors into the pix namespace of each pandas object.

Examples

>>> df.pix.project(["model", "scenario"])
>>> df.index.pix.assign(unit="Mt CO2")
>>> df.pix.multiply(other, how="left")
class DataFrameIdxAccessor(*args, **kwargs)

Bases: _DataPixAccessor

Deprecated since version 0.2.9: Use the new name df.pix of the accessor

add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])

Add explicit levels to data as 0 values.

Remaining levels in data not found in levels or derive are taken from reference (or its index).

Parameters:
  • reference (Index) – expected level labels (like model, scenario combinations)

  • derive (dict) – derive labels in a level from a multiindex with allowed combinations

  • **levels ([str]) – which labels should be added to df

Returns:

unsorted data with additional zero data

Return type:

DataFrame

aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])

Aggregate labels on one or multiple levels together.

Parameters:
  • agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.

  • axis (Axis, optional) – Axis on which to aggregate, default 0

  • dropna (bool, optional) – Whether to drop or preserve NANs in the index, default True

  • mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data

  • **levels – Mapping for one or multiple levels that defines which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.

Returns:

Aggregated data

Return type:

Data

Notes

If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

i.e., remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df

convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)

Converts units in a dataframe or series.

Parameters:
  • unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)

  • level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}

  • axis (Axis, default 0) – Axis of unit level

Returns:

DataFrame or Series with converted units

Return type:

Data

Examples

>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64

Notes

Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, quantify, dequantify

dequantify(level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of the given levels are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Pattern names in the optional argument can be missing (including a leading | character) and are replaced by the string "Total" then.

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop the non-matching levels

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to non-existent levels

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)

Convert columns in data to pint extension types to handle units.

pint-pandas can only represent a single unit per column and is somewhat brittle.

Parameters:
  • unit (str, optional) – If given, assumes data is currently in this unit.

  • level (str, optional) – Level of which to use the unit, by default “unit”

  • axis (Axis, optional) – Axis from which to pop the level, by default 0

  • copy (bool, optional) – Whether data should be copied, by default False

Returns:

Data with internalized unit which stays with arithmetics

Return type:

Data

Raises:

ValueError – If level contains more than one unit

Examples

>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]

Notes

pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, dequantify, convert_unit

radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series

Semijoin frame_or_series by index other.

Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.

Parameters:
  • other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.

  • how ({'left', 'right', 'inner', 'outer'}) – Join method to use

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • sort (bool, optional) – Whether to sort the index

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

  • fill_value – Value for filling gaps introduced by right or outer joins

  • fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed

Return type:

DataFrame or Series

Raises:
  • ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other

  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if frame_or_series does not derive from DataFrame or Series

sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')

Convert multi-indexed time-series dataframe to tidy dataframe.

Parameters:
  • meta (DataFrame, optional) – Meta data that is joined before tidying up

  • value_name (str, optional) – Column name for the values; default “value”. Use None to not change the name.

  • columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to not change the name.

Returns:

Tidy dataframe without index

Return type:

DataFrame

truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index

unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsubtract(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unittruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
class DataFramePixAccessor(pandas_obj)

Bases: _DataPixAccessor

add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])

Add explicit levels to data as 0 values.

Remaining levels in data not found in levels or derive are taken from reference (or its index).

Parameters:
  • reference (Index) – expected level labels (like model, scenario combinations)

  • derive (dict) – derive labels in a level from a multiindex with allowed combinations

  • **levels ([str]) – which labels should be added to df

Returns:

unsorted data with additional zero data

Return type:

DataFrame

aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])

Aggregate labels on one or multiple levels together.

Parameters:
  • agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.

  • axis (Axis, optional) – Axis on which to aggregate, default 0

  • dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True

  • mode ({"replace", "append", "return"}) – Whether to replace the individual labels, to append to them, or to return only the aggregated data

  • **levels – Mapping for one or multiple levels that specifies which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions into a new “sdn_ssd” region.

Returns:

Aggregated data

Return type:

Data

Notes

If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
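
The note above can be illustrated with plain pandas. The following is a minimal sketch, assuming made-up data and a hypothetical complete country-to-region mapping named region_of; it uses only groupby and does not call pandas-indexing.

```python
import pandas as pd

# Hypothetical data indexed by (country, year); not taken from the library docs
s = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [("sdn", 2020), ("ssd", 2020), ("sdn", 2030), ("ssd", 2030)],
        names=["country", "year"],
    ),
    name="value",
)

# Hypothetical complete mapping from every country label to its region
region_of = {"sdn": "sdn_ssd", "ssd": "sdn_ssd"}

# Move the index levels into columns, translate countries to regions,
# then sum the values within each (region, year) group
df = s.reset_index()
df["region"] = df["country"].map(region_of)
aggregated = df.groupby(["region", "year"])["value"].sum()
```

Because every country is covered by the mapping, a single groupby pass is enough here and no per-label aggregation step is needed.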

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

That is, remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series
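
The effect of an antijoin can be sketched with plain pandas. This is only an approximation under the assumption that data and other share exactly the same index levels; the data below is made up and the sketch does not call pandas-indexing.

```python
import pandas as pd

# Hypothetical data with a two-level index
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [("a", 2020), ("a", 2030), ("b", 2020)], names=["model", "year"]
    ),
)
# Index entries that should be removed from the data
other = pd.MultiIndex.from_tuples([("a", 2030)], names=["model", "year"])

# Keep only the rows whose index entry does not occur in `other`
remaining = df.loc[~df.index.isin(other)]
```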

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df

convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)

Converts units in a dataframe or series.

Parameters:
  • unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)

  • level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}

  • axis (Axis, default 0) – Axis of unit level

Returns:

DataFrame or Series with converted units

Return type:

Data

Examples

>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64

Notes

Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, quantify, dequantify

dequantify(level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry if any single level is NA (“any”) or only if all levels are NA (“all”)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
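
The difference between the two how settings can be sketched with plain pandas. This is a minimal illustration on a made-up index, not the library's implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical index with missing labels on one or both levels
idx = pd.MultiIndex.from_tuples(
    [("m1", "s1"), ("m1", np.nan), (np.nan, np.nan)],
    names=["model", "scenario"],
)

# One boolean column per level, True where the label is missing
na = idx.to_frame(index=False).isna()

# "any": drop an entry as soon as one level is missing
kept_any = idx[~na.any(axis=1).to_numpy()]
# "all": drop an entry only if every level is missing
kept_all = idx[~na.all(axis=1).to_numpy()]
```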

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of each named level are matched against its template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Patterns named in the optional argument may be absent from a label (together with their leading | character); they are then filled with the string "Total".

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop entries that do not match the template

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel
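
To make the template matching more concrete, the following sketch translates a template into a regular expression with named captures using only the standard library. The helper template_to_regex is hypothetical and written for illustration; it is not claimed to be how pandas-indexing implements extraction.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Hypothetical helper: turn "SE|{type}|{fuel}" into a named-group regex."""
    # re.split with a capturing group alternates literal text (even positions)
    # and pattern names (odd positions)
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.*?)"
        for i, part in enumerate(parts)
    )
    return re.compile(pattern)


regex = template_to_regex("SE|{type}|{fuel}")
labels = ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"]

# Only labels that match the whole template contribute new level values
extracted = {
    label: match.groupdict()
    for label in labels
    if (match := regex.fullmatch(label)) is not None
}
```

In this sketch "PE|Coal" and "SE|Elec" do not match the template, which mirrors the entries that are dropped in the examples above when dropna is left at its default.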

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to nonexistent levels
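
The idea of building a label from other levels can be sketched with Python's built-in str.format. The level values and the template below are made up for illustration, and the sketch does not call pandas-indexing.

```python
# Hypothetical level values for three index entries
levels = {
    "type": ["Elec", "Elec", "Heat"],
    "fuel": ["Bio", "Coal", "Gas"],
}
template = "SE|{type}|{fuel}"

# Build one formatted label per entry from the values of the other levels
n_entries = len(levels["type"])
variable = [
    template.format(**{name: values[i] for name, values in levels.items()})
    for i in range(n_entries)
]
```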

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
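
A projection can be approximated in plain pandas by dropping every level that is not requested. This is a minimal sketch on a made-up index; it keeps the original level order and does not call pandas-indexing.

```python
import pandas as pd

# Hypothetical three-level index
idx = pd.MultiIndex.from_tuples(
    [("m1", "s1", "World"), ("m1", "s2", "World")],
    names=["model", "scenario", "region"],
)

# Levels to project on (to keep); all other levels are dropped
keep = ["model", "scenario"]
projected = idx.droplevel([name for name in idx.names if name not in keep])
```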

quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)

Convert columns in data to pint extension types to handle units.

pint-pandas can only represent a single unit per column and is somewhat brittle.

Parameters:
  • unit (str, optional) – If given, assumes data is currently in this unit.

  • level (str, optional) – Level from which to take the unit, by default “unit”

  • axis (Axis, optional) – Axis from which to pop the level, by default 0

  • copy (bool, optional) – Whether data should be copied, by default False

Returns:

Data with an internalized unit that is preserved through arithmetic operations

Return type:

Data

Raises:

ValueError – If level contains more than one unit

Examples

>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]

Notes

pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, dequantify, convert_unit

radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series

Semijoin frame_or_series by index other.

Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.

Parameters:
  • other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.

  • how ({'left', 'right', 'inner', 'outer'}) – Join method to use

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • sort (bool, optional) – Whether to sort the index

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

  • fill_value – Value for filling gaps introduced by right or outer joins

  • fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed

Return type:

DataFrame or Series

Raises:
  • ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other

  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if frame_or_series does not derive from DataFrame or Series
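
The two steps described above (join the indexes, then reindex the data with the joined index) can be sketched in plain pandas. The data and the fill value are made up, and the sketch covers only a single-level outer join; it does not call pandas-indexing.

```python
import pandas as pd

# Hypothetical data and a second index to join with
s = pd.Series([1.0, 2.0], index=pd.Index(["a", "b"], name="region"))
other = pd.Index(["b", "c"], name="region")

# Step 1: join both indexes; an outer join keeps the labels of either side
joined = s.index.join(other, how="outer")

# Step 2: reindex the data with the joined index, filling the gaps
result = s.reindex(joined, fill_value=0.0)
```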

sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')

Convert multi-indexed time-series dataframe to tidy dataframe.

Parameters:
  • meta (DataFrame, optional) – Meta data that is joined before tidying up

  • value_name (str, optional) – Column name for the values; default “value”. Use None to leave the name unchanged.

  • columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to leave the name unchanged.

Returns:

Tidy dataframe without index

Return type:

DataFrame
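
The shape of a tidy result can be approximated with plain pandas. This is a minimal sketch on a made-up wide time-series frame using reset_index and melt; it does not call pandas-indexing and ignores the meta argument.

```python
import pandas as pd

# Hypothetical wide time series: one row per (model, variable), one column per year
df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("m1", "Emi|CO2"), ("m1", "Emi|CH4")], names=["model", "variable"]
    ),
    columns=[2020, 2030],
)

# Turn the index levels into columns, then unpivot the year columns
tidy = df.reset_index().melt(
    id_vars=list(df.index.names), var_name="year", value_name="value"
)
```

Each row of tidy then holds a single observation, with the former index levels, the year and the value as ordinary columns.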

truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index

unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsubtract(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unittruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
class IndexIdxAccessor(*args, **kwargs)

Bases: _PixAccessor

Deprecated since version 0.2.9: Use the accessor under its new name, df.pix, instead.

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

That is, remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df

dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry if any single level is NA (“any”) or only if all levels are NA (“all”)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of each named level are matched against its template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Patterns named in the optional argument may be absent from a label (together with their leading | character); they are then filled with the string "Total".

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop entries that do not match the template

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to nonexistent levels

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index

class IndexPixAccessor(pandas_obj)

Bases: _PixAccessor

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

That is, remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df

dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry if any single level is NA (“any”) or only if all levels are NA (“all”)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of each named level are matched against its template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Patterns named in the optional argument may be absent from a label (together with their leading | character); they are then filled with the string "Total".

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop entries that do not match the template

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel
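
As a rough illustration of the template matching described above, the following standalone sketch translates a template into a regular expression with named groups and applies it to single labels. The helpers template_to_regex and extract_labels are hypothetical names introduced here; they are not part of pandas-indexing, and the library's actual implementation may differ.

```python
import re


def template_to_regex(template, optional=()):
    """Hypothetical helper: turn "SE|{type}|{fuel}" into a named-group regex."""
    # Splitting on {name} yields: literal, name, literal, name, ..., literal
    tokens = re.split(r"\{(\w+)\}", template)
    pieces = []
    for i, token in enumerate(tokens):
        if i % 2 == 0:
            # Literal text; a trailing "|" moves into the next group if that
            # pattern is optional, so that it can be missing together with it
            next_name = tokens[i + 1] if i + 1 < len(tokens) else None
            if next_name in optional and token.endswith("|"):
                token = token[:-1]
            pieces.append(re.escape(token))
        else:
            group = f"(?P<{token}>.*?)"
            if token in optional and tokens[i - 1].endswith("|"):
                group = rf"(?:\|{group})?"
            pieces.append(group)
    return re.compile("".join(pieces))


def extract_labels(label, template, optional=()):
    """Return the extracted patterns of one label, or None if it does not match."""
    match = template_to_regex(template, optional).fullmatch(label)
    if match is None:
        return None
    # Missing optional patterns are filled with "Total", as described above
    return {k: "Total" if v is None else v for k, v in match.groupdict().items()}


full = extract_labels("SE|Elec|Bio", "SE|{type}|{fuel}")
short = extract_labels("SE|Elec", "SE|{type}|{fuel}", optional=["fuel"])
```

Under these assumptions, "SE|Elec" only matches once fuel is marked as optional, which mirrors the optional=["fuel"] example above.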

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to non-existent levels
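
The idea of formatting a level from other levels can be sketched with plain pandas and str.format. The helper format_level is a hypothetical name used for illustration only; it keeps the levels used in the template, and it is not the library's implementation.

```python
import pandas as pd


def format_level(index: pd.MultiIndex, name: str, template: str) -> pd.MultiIndex:
    """Hypothetical helper: add level `name` by filling `template` per entry."""
    frame = index.to_frame(index=False)
    # Each record maps level names to the labels of one index entry
    values = [template.format(**record) for record in frame.to_dict("records")]
    return pd.MultiIndex.from_frame(frame.assign(**{name: values}))


idx = pd.MultiIndex.from_tuples(
    [("Elec", "Bio"), ("Elec", "Coal")], names=["type", "fuel"]
)
formatted = format_level(idx, "variable", "SE|{type}|{fuel}")
```

In this sketch a template that names a level not present in the index fails with a KeyError raised by str.format.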

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
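
A minimal plain-pandas sketch of the projection described above, assuming that projecting amounts to dropping every level that is not listed. The helper project_levels is a hypothetical name and not part of pandas-indexing.

```python
import pandas as pd


def project_levels(index: pd.MultiIndex, levels):
    """Hypothetical helper: keep only the named levels of a MultiIndex."""
    to_drop = [name for name in index.names if name not in levels]
    return index.droplevel(to_drop)


idx = pd.MultiIndex.from_tuples(
    [("m1", "GDP", "USD"), ("m1", "Pop", "million")],
    names=["model", "variable", "unit"],
)
projected = project_levels(idx, ["variable", "unit"])
```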

unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index
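
For comparison, unique labels of one or several levels can be obtained with plain pandas roughly as follows. This is only a sketch of the concept using standard pandas calls, not the accessor's implementation.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("m1", "GDP", "USD"), ("m1", "GDP|PPP", "USD"), ("m2", "GDP", "USD")],
    names=["model", "variable", "unit"],
)

# Unique labels of a single level
unique_models = idx.get_level_values("model").unique()

# Unique combinations of several levels: drop the others, then deduplicate
unique_pairs = idx.droplevel("variable").unique()
```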

class SeriesIdxAccessor(*args, **kwargs)

Bases: _DataPixAccessor

Deprecated since version 0.2.9: Use the new name df.pix of the accessor

add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])

Add explicit levels to data as 0 values.

Remaining levels in data not found in levels or derive are taken from reference (or its index).

Parameters:
  • reference (Index) – expected level labels (like model, scenario combinations)

  • derive (dict) – derive labels in a level from a multiindex with allowed combinations

  • **levels ([str]) – which labels should be added to df

Returns:

unsorted data with additional zero data

Return type:

DataFrame

aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])

Aggregate labels on one or multiple levels together.

Parameters:
  • agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.

  • axis (Axis, optional) – Axis on which to aggregate, default 0

  • dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True

  • mode ({"replace", "append", "return"}) – Whether to replace the individual labels, append to them, or return only the aggregated data

  • **levels – Mapping for one or multiple levels that specifies which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.

Returns:

Aggregated data

Return type:

Data

Notes

If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

I.e., remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join, if not given join on all

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df

convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)

Converts units in a dataframe or series.

Parameters:
  • unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)

  • level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}

  • axis (Axis, default 0) – Axis of unit level

Returns:

DataFrame or Series with converted units

Return type:

Data

Examples

>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64

Notes

Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, quantify, dequantify

dequantify(level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry as soon as any level is NA ("any") or only if all levels are NA ("all")

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of each named level are matched against its template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Patterns named in the optional argument may be absent from a label (together with their leading | character); the corresponding new level is then set to the string "Total".

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop entries whose labels do not match the template

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to non-existent levels

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame

quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)

Convert columns in data to pint extension types to handle units.

pint-pandas can only represent a single unit per column and is somewhat brittle.

Parameters:
  • unit (str, optional) – If given, assumes data is currently in this unit.

  • level (str, optional) – Level from which to take the unit, by default “unit”

  • axis (Axis, optional) – Axis from which to pop the level, by default 0

  • copy (bool, optional) – Whether data should be copied, by default False

Returns:

Data with internalized units that are carried through arithmetic operations

Return type:

Data

Raises:

ValueError – If level contains more than one unit

Examples

>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]

Notes

pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, dequantify, convert_unit

radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series

Semijoin frame_or_series by index other.

Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.

Parameters:
  • other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.

  • how ({'left', 'right', 'inner', 'outer'}) – Join method to use

  • level (None or str or int) – Single level on which to join, if not given join on all

  • sort (bool, optional) – Whether to sort the index

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

  • fill_value – Value for filling gaps introduced by right or outer joins

  • fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed

Return type:

DataFrame or Series

Raises:
  • ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other

  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if frame_or_series does not derive from DataFrame or Series
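
The two steps named in the description (join the indexes, then reindex the data with the joined index) can be sketched with plain pandas. This is a simplified illustration on a single-level index using only standard pandas calls; it is not the library's implementation and ignores the level, sort and fail_on_reorder options.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], index=pd.Index(["a", "b"], name="region"))
other = pd.Index(["b", "c"], name="region")

# Step 1: join the two indexes with the requested method
joined_right = s.index.join(other, how="right")
joined_inner = s.index.join(other, how="inner")

# Step 2: reindex the data, filling the gaps a right join introduces
right = s.reindex(joined_right, fill_value=0.0)
inner = s.reindex(joined_inner)
```

With a right join the label "c" has no data in s, so it receives the fill value; an inner join keeps only "b".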

sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')

Convert multi-indexed time-series dataframe to tidy dataframe.

Parameters:
  • meta (DataFrame, optional) – Meta data that is joined before tidying up

  • value_name (str, optional) – Column name for the values; default “value”. Use None to not change the name.

  • columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to not change the name.

Returns:

Tidy dataframe without index

Return type:

DataFrame
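
To show what a tidy result looks like, the following sketch reshapes a small multi-indexed time-series frame with plain pandas melt. It only illustrates the target layout under the default names "year" and "value"; the optional meta join is left out, and the row order produced by the accessor may differ.

```python
import pandas as pd

df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("m1", "GDP"), ("m1", "Pop")], names=["model", "variable"]
    ),
    columns=[2020, 2030],
)

# Move the index levels into columns, then turn the year columns into rows
tidy = df.reset_index().melt(
    id_vars=list(df.index.names), var_name="year", value_name="value"
)
```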

truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index

unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsubtract(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unittruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
class SeriesPixAccessor(pandas_obj)

Bases: _DataPixAccessor

add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])

Add explicit levels to data as 0 values.

Remaining levels in data not found in levels or derive are taken from reference (or its index).

Parameters:
  • reference (Index) – expected level labels (like model, scenario combinations)

  • derive (dict) – derive labels in a level from a multiindex with allowed combinations

  • **levels ([str]) – which labels should be added to df

Returns:

unsorted data with additional zero data

Return type:

DataFrame

aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])

Aggregate labels on one or multiple levels together.

Parameters:
  • agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.

  • axis (Axis, optional) – Axis on which to aggregate, default 0

  • dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True

  • mode ({"replace", "append", "return"}) – Whether to replace the individual labels, append to them, or return only the aggregated data

  • **levels – Mapping for one or multiple levels that specifies which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.

Returns:

Aggregated data

Return type:

Data

Notes

If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
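
A plain-pandas sketch of what aggregating the “sdn” and “ssd” regions into “sdn_ssd” could look like, assuming the default “sum” and an append-style result. The data and variable names are made up for illustration, and the accessor's own implementation may work differently.

```python
import pandas as pd

s = pd.Series(
    [1, 2, 3],
    index=pd.MultiIndex.from_tuples(
        [("sdn", "gdp"), ("ssd", "gdp"), ("ken", "gdp")],
        names=["region", "variable"],
    ),
)

members = ["sdn", "ssd"]
selected = s[s.index.get_level_values("region").isin(members)]

# Sum the selected labels by grouping on the remaining level
aggregated = selected.groupby(level="variable").sum()
aggregated.index = pd.MultiIndex.from_product(
    [["sdn_ssd"], aggregated.index], names=["region", "variable"]
)

# Append-style result: original labels plus the new aggregate
appended = pd.concat([s, aggregated])
```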

antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)

Antijoin index_or_data with index other.

I.e., remove all occurrences of other from data.

Parameters:
  • other (Index) – Other index to join with

  • level (None or str or int) – Single level on which to join, if not given join on all

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

Return type:

Index or DataFrame or Series

Raises:
  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – if index_or_data does not derive from DataFrame or Series
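
For the simple case where data and other share the same levels, removing all occurrences of other can be sketched with a boolean mask in plain pandas. This is an illustration of the concept only, not the library's implementation.

```python
import pandas as pd

s = pd.Series(
    [0, 1, 2],
    index=pd.MultiIndex.from_tuples(
        [("a", 1), ("b", 2), ("c", 3)], names=["x", "y"]
    ),
)
other = pd.MultiIndex.from_tuples([("b", 2)], names=["x", "y"])

# Keep only the entries whose index tuple does not occur in other
remaining = s[~s.index.isin(list(other))]
```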

assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex

Add or overwrite levels on a multiindex.

Parameters:
  • frame (Series or DataFrame, optional) – Additional labels

  • order (list of str, optional) – Level names in desired order or False, by default False

  • axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex

  • ignore_index (bool, optional) – If true, dataframes or series are not index aligned

  • **labels – Labels for each new index level

Returns:

Series or DataFrame with changed index or new MultiIndex

Return type:

df
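
Adding or overwriting levels can be pictured as editing the index as a table and rebuilding it. The helper assign_levels below is a hypothetical plain-pandas sketch of that idea and does not cover the frame, order or ignore_index arguments.

```python
import pandas as pd


def assign_levels(index: pd.MultiIndex, **labels) -> pd.MultiIndex:
    """Hypothetical helper: add new levels or overwrite existing ones."""
    frame = index.to_frame(index=False).assign(**labels)
    return pd.MultiIndex.from_frame(frame)


idx = pd.MultiIndex.from_tuples(
    [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
)
with_model = assign_levels(idx, model="m1")  # new level
in_km = assign_levels(idx, unit="km")  # overwritten level
```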

convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)

Converts units in a dataframe or series.

Parameters:
  • unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)

  • level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}

  • axis (Axis, default 0) – Axis of unit level

Returns:

DataFrame or Series with converted units

Return type:

Data

Examples

>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64

Notes

Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, quantify, dequantify

dequantify(level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Remove missing index values.

Drops all index entries for which any or all (how) levels are undefined.

Parameters:
  • subset (Sequence[str], optional) – Names of levels on which to check for NA values

  • how ({"any", "all"}) – Whether to remove an entry as soon as any level is NA ("any") or only if all levels are NA ("all")

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
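
The effect of subset and how can be sketched with plain pandas by checking the index levels for missing labels. The helper drop_missing_index is a hypothetical name used for illustration; it is not the accessor's implementation.

```python
import pandas as pd


def drop_missing_index(data, subset=None, how="any"):
    """Hypothetical helper: drop entries with missing index labels."""
    levels = data.index.to_frame(index=False)
    if subset is not None:
        levels = levels[list(subset)]
    missing = levels.isna()
    # "any": one missing level suffices; "all": every checked level is missing
    drop = missing.any(axis=1) if how == "any" else missing.all(axis=1)
    return data[~drop.to_numpy()]


s = pd.Series(
    [0, 1, 2],
    index=pd.MultiIndex.from_arrays(
        [["a", "b", None], ["x", None, None]], names=["var", "unit"]
    ),
)
kept_any = drop_missing_index(s)
kept_all = drop_missing_index(s, how="all")
kept_var = drop_missing_index(s, subset=["var"])
```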

extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Extract new index levels with templates matched against any index level.

The **templates argument defines pairs of level names and templates. The labels of each named level are matched against its template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.

Patterns named in the optional argument may be absent from a label (together with their leading | character); the corresponding new level is then set to the string "Total".

Changed in version 0.5.3: Added optional patterns.

Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.

Parameters:
  • template (str, optional) – Extraction template for a single level

  • keep (bool, default False) – Whether to keep the split dimension

  • dropna (bool, default True) – Whether to drop entries whose labels do not match the template

  • regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from

  • drop (bool, optional) – Deprecated argument, use keep instead

  • optional ([str] or None, optional) – Marks templates as optional

  • **templates (str) – Templates for splitting one or multiple levels

Return type:

Index, Series or DataFrame

Raises:
  • ValueError – If dim is not a dimension of index_or_series

  • ValueError – If template is given, while index has more than one level

Examples

>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  Total   3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64

See also

formatlevel

fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Fix broken MultiIndex NA representation from .groupby(…, dropna=False).

Refer to https://github.com/coroa/pandas-indexing/issues/25 for details

Parameters:

axis (Axis, optional) – Axis to fix, by default 0

Return type:

index_or_data

floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index

Format index levels based on a template referring to other levels.

Changed in version 0.5.3: Added optional patterns.

Parameters:
  • drop (bool, default False) – Whether to drop the used index levels

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify

  • optional ([str], optional) – Marks levels as optional (including a leading | character)

  • **templates (str) – Format templates for one or multiple levels

Return type:

Index, Series or DataFrame

Raises:

ValueError – If templates refer to non-existent levels

isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index

Project multiindex to given levels.

Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.

Parameters:
  • levels (sequence of str) – Names of levels to project on (to keep)

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project

Returns:

index_or_data

Return type:

Index|MultiIndex|Series|DataFrame
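
In plain pandas, projecting onto a set of levels amounts to dropping every level that is not listed. The sketch below shows this on a small made-up index; it approximates the behaviour described above and makes no claim about how the library handles level order or duplicates.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("m1", "s1", "World"), ("m1", "s2", "World"), ("m2", "s1", "EU")],
    names=["model", "scenario", "region"],
)

keep = ["model", "region"]

# Drop all levels that are not explicitly requested.
projected = index.droplevel([name for name in index.names if name not in keep])
```

Note that `droplevel` keeps one entry per original row, so `projected` still contains ("m1", "World") twice.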

quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)

Convert columns in data to pint extension types to handle units.

pint-pandas can only represent a single unit per column and is somewhat brittle.

Parameters:
  • unit (str, optional) – If given, assumes data is currently in this unit.

  • level (str, optional) – Level from which to take the unit, by default “unit”

  • axis (Axis, optional) – Axis from which to pop the level, by default 0

  • copy (bool, optional) – Whether data should be copied, by default False

Returns:

Data with an internalized unit that is carried through arithmetic operations

Return type:

Data

Raises:

ValueError – If level contains more than one unit

Examples

>>> s = Series(
...     [7e-6, 8e-3],
...     MultiIndex.from_tuples([("foo", "km"), ("bar", "km")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]

Notes

pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

See also

set_openscm_registry_as_default, dequantify, convert_unit

radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
semijoin(other: ~pandas.core.indexes.base.Index | ~pandas.core.series.Series | ~pandas.core.frame.DataFrame, *, how: ~typing.Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: ~typing.Literal[0, 1, 'index', 'columns'] = 0, fill_value: ~typing.Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series

Semijoin frame_or_series by index other.

Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.

Parameters:
  • other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.

  • how ({'left', 'right', 'inner', 'outer'}) – Join method to use

  • level (None or str or int) – Single level on which to join; if not given, join on all levels

  • sort (bool, optional) – Whether to sort the index

  • axis ({0, 1, "index", "columns"}) – Axis on which to join

  • fill_value – Value for filling gaps introduced by right or outer joins

  • fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed

Return type:

DataFrame or Series

Raises:
  • ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other

  • ValueError – If axis is not 0, “index” or 1, “columns”

  • TypeError – If frame_or_series does not derive from DataFrame or Series
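
The two steps described above (join the indexes, then reindex the data with the joined index) can be sketched in plain pandas as follows. This is an illustration of the idea on a single-level index with made-up labels, not the library's code.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], index=pd.Index(["a", "b"], name="region"))
other = pd.Index(["b", "c"], name="region")

# Step 1: join the two indexes; a right join follows the labels of `other`.
joined = s.index.join(other, how="right")

# Step 2: reindex the data, filling the gap introduced for label "c".
result = s.reindex(joined, fill_value=0.0)
```

With `how="inner"` the joined index would contain only "b", so no fill value would be needed.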

sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')

Convert multi-indexed time-series dataframe to tidy dataframe.

Parameters:
  • meta (DataFrame, optional) – Meta data that is joined before tidying up

  • value_name (str, optional) – Column name for the values; default “value”. Use None to not change the name.

  • columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to not change the name.

Returns:

Tidy dataframe without index

Return type:

DataFrame
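
For orientation, a comparable tidy layout can be produced with plain pandas by naming the columns axis, resetting the index and melting the year columns into rows. The sketch below uses made-up data and omits the optional meta join; it is an approximation, not the implementation of to_tidy.

```python
import pandas as pd

wide = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("m1", "World"), ("m1", "EU")], names=["model", "region"]
    ),
    columns=[2020, 2030],
)

# Move the index levels into columns, then turn each year column into rows.
tidy = (
    wide.rename_axis(columns="year")
    .reset_index()
    .melt(id_vars=["model", "region"], var_name="year", value_name="value")
)
```

The result has one row per (model, region, year) combination and a default integer index.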

truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index

Return unique index levels.

Parameters:
  • levels (str or Sequence[str], optional) – Names of levels to get unique values of

  • axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on

Returns:

unique_index

Return type:

Index
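
pandas' own `MultiIndex.unique` accepts only a single level, so unique combinations over several levels need an extra step. A plain-pandas sketch of that step, on made-up data, is to drop the unwanted levels first and then de-duplicate:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("m1", "s1", "World"), ("m1", "s2", "World"), ("m2", "s1", "EU")],
    names=["model", "scenario", "region"],
)

levels = ["model", "region"]

# Keep only the requested levels, then remove repeated combinations.
unique_pairs = index.droplevel(
    [name for name in index.names if name not in levels]
).unique()
```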

unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsubtract(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unittruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
forward_binop(self, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
forward_unitbinop(self, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)

arithmetics

Provide aligned basic arithmetic ops.

Simple arithmetic operations add(), divide(), multiply() and subtract() which allow overriding the how="outer" alignment that pandas uses by default.

In practice, this means that if dataframes do not share the same axes, one can choose to get results only for the items existing in both indices (how="inner"), or to prefer the axis of the first (how="left") or the second (how="right") operand.
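
The effect of the alignment mode can be reproduced in plain pandas by aligning the operands explicitly before the operation. The sketch below uses `Series.align`, where the mode is spelled `join`; how the functions in this module forward their `**align_kwargs` is not shown here, so treat the snippet as an illustration of the semantics only.

```python
import pandas as pd

a = pd.Series([1.0, 2.0], index=pd.Index(["x", "y"], name="key"))
b = pd.Series([10.0, 20.0], index=pd.Index(["y", "z"], name="key"))

# pandas default: outer alignment, NaN wherever a label is missing on one side.
outer = a + b

# Inner alignment: only labels present in both operands survive.
a_inner, b_inner = a.align(b, join="inner")
inner = a_inner + b_inner

# Left alignment: keep the labels of the first operand.
a_left, b_left = a.align(b, join="left")
left = a_left + b_left
```

Only the label "y" exists in both series, so `inner` has a single entry, while `outer` and `left` carry NaN for the unmatched labels.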

add(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
binop(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
div(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divide(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
divmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
floordiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
mul(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
multiply(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
pow(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
radd(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rdivmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rfloordiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rmul(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rpow(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rsub(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
rtruediv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
sub(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
subtract(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
truediv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitadd(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitbinop(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdiv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitdivide(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmod(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmul(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitmultiply(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitradd(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrdiv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrmul(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrsub(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitrtruediv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsub(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unitsubtract(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
unittruediv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)

selectors

Selectors improve .loc[] indexing for multi-index pandas data.

class AllSelector

Bases: Special

class And(a: Selector, b: Selector)

Bases: BinOp

a: Selector
b: Selector
class BinOp(a: Selector, b: Selector)

Bases: Selector

a: Selector
b: Selector
class Const(val: Any)

Bases: Selector

val: Any
class Isin(filters: Mapping[str, Any], ignore_missing_levels: bool = False)

Bases: Selector

filters: Mapping[str, Any]
ignore_missing_levels: bool
class IsinIndex(index: MultiIndex, ignore_missing_levels: bool = False)

Bases: Selector

ignore_missing_levels: bool
index: MultiIndex
class Ismatch(filters: Mapping[str, Any], regex: bool = False, ignore_missing_levels: bool = False)

Bases: Selector

filters: Mapping[str, Any]
ignore_missing_levels: bool
index_match(index, patterns)
multiindex_match(index, patterns, level)
regex: bool
class NoneSelector

Bases: Special

class Not(a: Selector)

Bases: Selector

a: Selector
class Or(a: Selector, b: Selector)

Bases: BinOp

a: Selector
b: Selector
class Selector

Bases: object

class Special

Bases: Selector

isin(df: Series | DataFrame | Index | None = None, index: Index | None = None, /, ignore_missing_levels: bool = False, **filters: Any) Isin | IsinIndex | Series

Constructs a MultiIndex selector.

Parameters:
  • df (Data, optional) – Data on which to match, if missing an Isin object is returned

  • index (Index, optional) – Filter based on common levels given in index. Can also be passed as the df argument. Cannot be combined with filters.

  • ignore_missing_levels (bool, default False) – If set, levels missing in data index will be ignored

  • **filters – Filter to apply on given levels (lists are or-ed, levels are and-ed). Callables are evaluated on the index level values.

Return type:

Isin, IsinIndex or Series

Example

>>> df.loc[isin(region="World", gas=["CO2", "N2O"])]

or with explicit df to get a boolean mask

>>> isin(df, region="World", gas=["CO2", "N2O"])

For selecting across multiple levels, a multiindex is passed as positional argument:

>>> index = pd.MultiIndex.from_tuples(
...     [("World", "CO2"), ("R10_EUROPE", "N2O")],
...     names=["region", "gas"],
... )
>>> df.loc[isin(index)]
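
The rule that lists are or-ed and levels are and-ed corresponds to the following boolean mask in plain pandas. The data is made up, and the snippet shows the selection logic only, not the Isin selector class itself.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("World", "CO2"), ("World", "CH4"), ("R10_EUROPE", "N2O"), ("World", "N2O")],
    names=["region", "gas"],
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Within a level, any listed label matches (or); across levels, all must match (and).
region_ok = index.get_level_values("region").isin(["World"])
gas_ok = index.get_level_values("gas").isin(["CO2", "N2O"])
mask = region_ok & gas_ok

selected = df.loc[mask]
```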
ismatch(df: None | Index | DataFrame | Series | str = None, singlefilter: str | None = None, /, regex: bool = False, ignore_missing_levels: bool = False, **filters) Ismatch | Series

Constructs an Index or MultiIndex selector based on pattern matching.

Parameters:
  • df (Data, optional) – Data on which to match; if missing, an Ismatch object is returned.

  • singlefilter (str, optional) – Filter to apply on a non-multiindex index (can also be handed into the df argument)

  • regex (bool, default False) – If set, filters are interpreted as plain regex strings, otherwise (by default) a glob-like syntax is used

  • ignore_missing_levels (bool, default False) – If set, levels missing in data index will be ignored

  • **filters – Filter to apply on given levels (lists are or-ed, levels are and-ed)

Return type:

Ismatch or Series

Example

for a multiindex:

>>> df.loc[ismatch(variable="Emissions|*|Fossil Fuel and Industry")]

for a single index:

>>> df.loc[ismatch("*bla*")]
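
To give a feel for glob-like matching, the standard-library sketch below treats `*` as "any run of characters" and everything else as literal text. `glob_to_regex` is a hypothetical helper for this illustration, and the assumption that `*` is the only wildcard may not cover everything the library's syntax supports.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Escape the pattern literally, then let '*' match any run of characters."""
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))


labels = [
    "Emissions|CO2|Fossil Fuel and Industry",
    "Emissions|CO2|AFOLU",
    "Emissions|CH4|Fossil Fuel and Industry",
]

regex = glob_to_regex("Emissions|*|Fossil Fuel and Industry")

# Keep only the labels that match the whole pattern.
selected = [label for label in labels if regex.fullmatch(label)]
```

Escaping first matters here because `|` would otherwise act as regex alternation.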
maybe_const(x)

units

Unit handling in pandas data.

Enables unit conversions based on pint’s application registry (see also Notes).

By default units are expected – as in the IAMC default format – on a unit level on each row, but a column-wise unit level is also supported.

Units can be handled in one of two flavours:

  1. convert_unit() converts manually to a new unit, like convert_unit(s, "km")

  2. quantify() converts data to a pint-pandas array which tracks units implicitly through arithmetic until dequantify() extracts the tracked unit back into the multiindex level.

    While this is in theory the simpler approach, the underlying library pint-pandas [1] is brittle and breaks from time to time.

Notes

The pint application registry is set by pint.set_application_registry() or with set_openscm_registry_as_default(). The latter sets the IAMC-based openscm-units registry [2].

Examples

>>> from pandas import MultiIndex, Series
>>> import pandas_indexing as pi
>>> pi.set_openscm_registry_as_default()
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples([("foo", "mm"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s = pi.convert_unit(s, "km")
>>> s
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> pi.quantify(s)
var
bar    0.008
foo    7e-06
dtype: pint[kilometer]

References

See also

pint.set_application_registry

convert_unit(data: Series | DataFrame, unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)

Converts units in a dataframe or series.

Parameters:
  • data (DataFrame or Series) – DataFrame or Series with a “unit” level

  • unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)

  • level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}

  • axis (Axis, default 0) – Axis of unit level

Returns:

DataFrame or Series with converted units

Return type:

Data

Examples

>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> convert_unit(s, "km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> convert_unit(s, {"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64

Notes

Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
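
Conceptually, converting to a single target unit scales each row by a factor that depends on its current unit and then relabels the unit level. The sketch below mimics this with a small hard-coded factor table in place of pint; `TO_METRE` and `convert_length` are hypothetical names for this illustration, and unlike the example output above the sketch keeps the original row order.

```python
import pandas as pd

# Hypothetical factor table (unit -> metres) standing in for pint's registry.
TO_METRE = {"mm": 1e-3, "m": 1.0, "km": 1e3}


def convert_length(s: pd.Series, target: str, level: str = "unit") -> pd.Series:
    """Scale each value by the factor from its current unit to the target unit."""
    units = s.index.get_level_values(level)
    factors = [TO_METRE[unit] / TO_METRE[target] for unit in units]
    converted = s * factors

    # Relabel the unit level so every row now carries the target unit.
    frame = s.index.to_frame(index=False)
    frame[level] = target
    converted.index = pd.MultiIndex.from_frame(frame)
    return converted


s = pd.Series(
    [7, 8],
    index=pd.MultiIndex.from_tuples(
        [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
    ),
)
result = convert_length(s, "km")
```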

dequantify(data: Series | DataFrame, level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
format_dtype(dtype)
get_openscm_registry(add_co2e: bool = True)
is_unit(unit: str) bool
quantify(data: Series | DataFrame, level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False) Series | DataFrame

Convert columns in data to pint extension types to handle units.

pint-pandas can only represent a single unit per column and is somewhat brittle.

Parameters:
  • data (DataFrame or Series) – DataFrame or Series to quantify

  • unit (str, optional) – If given, assumes data is currently in this unit.

  • level (str, optional) – Level from which to take the unit, by default “unit”

  • axis (Axis, optional) – Axis from which to pop the level, by default 0

  • copy (bool, optional) – Whether data should be copied, by default False

Returns:

Data with an internalized unit that is carried through arithmetic operations

Return type:

Data

Raises:

ValueError – If level contains more than one unit

Examples

>>> s = Series(
...     [7e-6, 8e-3],
...     MultiIndex.from_tuples([("foo", "km"), ("bar", "km")], names=["var", "unit"]),
... )
>>> quantify(s)
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]

Notes

pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().

set_openscm_registry_as_default(add_co2e: bool = True)