API reference
This page provides an auto-generated summary of pandas-indexing’s API. For more details and examples, refer to the relevant chapters in the main part of the documentation.
core
Core module.
- add_zeros_like(data: T, reference: MultiIndex | DataFrame | Series, *, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str]) T
Add explicit levels to data as 0 values.
Remaining levels in data not found in levels or derive are taken from reference (or its index).
- Parameters:
- Returns:
unsorted data with additional zero data
- Return type:
DataFrame
- aggregatelevel(data: T, agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]]) T
Aggregate labels on one or multiple levels together.
- Parameters:
data (Data) – Series or DataFrame to aggregate
agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.
axis (Axis, optional) – Axis on which to aggregate, default 0
dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True
mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data
**levels – Mapping for one or multiple levels, which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.
- Returns:
Aggregated data
- Return type:
Data
Notes
If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
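The three modes can be pictured with a plain dictionary keyed by index tuples. This is a minimal pure-Python sketch of the semantics as read from the parameter descriptions above, not the library's implementation; the helper name aggregate_labels and the sample data are hypothetical.

```python
from collections import defaultdict


def aggregate_labels(data, level, mapping, mode="replace"):
    """Sum entries whose label on `level` is listed in `mapping`.

    data    -- dict mapping index tuples to numbers
    level   -- position of the level inside each index tuple
    mapping -- {new_label: [old_label, ...]}
    """
    lookup = {old: new for new, olds in mapping.items() for old in olds}
    aggregated = defaultdict(float)
    for key, value in data.items():
        if key[level] in lookup:
            new_key = key[:level] + (lookup[key[level]],) + key[level + 1:]
            aggregated[new_key] += value
    if mode == "return":
        # Only the newly aggregated entries
        return dict(aggregated)
    if mode == "append":
        # Individual labels are kept next to the aggregate
        return {**data, **aggregated}
    # "replace": drop the individual labels, keep everything else
    kept = {k: v for k, v in data.items() if k[level] not in lookup}
    return {**kept, **aggregated}


data = {("sdn", 2020): 1.0, ("ssd", 2020): 2.0, ("ken", 2020): 4.0}
mapping = {"sdn_ssd": ["sdn", "ssd"]}
replaced = aggregate_labels(data, level=0, mapping=mapping)
appended = aggregate_labels(data, level=0, mapping=mapping, mode="append")
returned = aggregate_labels(data, level=0, mapping=mapping, mode="return")
print(replaced)
```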
- antijoin(index_or_data: S, other: Index | Series | DataFrame, *, level: str | int | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0) S
Antijoin index_or_data with index other.
That is, remove all occurrences of other from the data.
- Parameters:
index_or_data (Index or DataFrame or Series) – Data to be filtered
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all levels
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
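As a rough illustration of the antijoin semantics, the following pure-Python sketch filters a dictionary keyed by index tuples. It does not use pandas or pandas-indexing; antijoin_sketch and the sample data are hypothetical names chosen for this example.

```python
def antijoin_sketch(data, other):
    """Keep only the entries of `data` whose key does not occur in `other`."""
    other = set(other)
    return {key: value for key, value in data.items() if key not in other}


data = {("a", 1): 10, ("a", 2): 20, ("b", 1): 30}
other = [("a", 2), ("c", 3)]
result = antijoin_sketch(data, other)
print(result)
```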
- assignlevel(df: T, frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) T
Add or overwrite levels on a multiindex.
- Parameters:
df (DataFrame, Series or Index) – Index, Series or DataFrame of which to change index levels
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- concat(objs: Iterable[T] | Mapping[str, T], order: Sequence[str] | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, keys: None | str | Index | Sequence = None, copy: bool = False, **concat_kwds) T
Concatenate pandas objects along a particular axis.
In addition to the functionality provided by pd.concat, if the concat axis has a multiindex then the level order is reordered consistently.
- Parameters:
objs (a sequence or mapping of Series, DataFrame or Index objects) – If a mapping is passed the keys will be used as a new index level (with the name of the keys argument).
order (a sequence of str, default None) – The order of level names in which to concatenate
axis (Axis) – Axis along which to concatenate
keys (str or list-like of str) – If objs is a mapping, a string-like value will be used as name of the new level, otherwise it is passed on to pd.concat.
copy (bool, default False) – Whether to copy the underlying data
**concat_kwds – Other arguments accepted by pd.concat
- Return type:
Concatenated data or index
- Raises:
ValueError – If the level names of objs do not match
- describelevel(index_or_data: DataFrame | Series | Index, n: int = 80, as_str: bool = False) str | None
Describe index levels.
- Parameters:
- Returns:
description – if as_str is True
- Return type:
str, optional
- dropnalevel(index_or_data: T, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) T
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame of which to drop rows or columns
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
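The difference between how="any" and how="all" can be sketched on a list of index tuples in which None stands for a missing label. This is an illustration under those assumptions, not the library's code; dropna_keys is a hypothetical helper and levels are addressed by position rather than by name.

```python
def dropna_keys(keys, subset=None, how="any"):
    """Filter index tuples with missing (None) labels.

    keys   -- list of index tuples
    subset -- positions of the levels to check (all levels if None)
    how    -- "any": drop if at least one checked level is missing
              "all": drop only if every checked level is missing
    """
    check = any if how == "any" else all
    kept = []
    for key in keys:
        positions = range(len(key)) if subset is None else subset
        if not check(key[i] is None for i in positions):
            kept.append(key)
    return kept


keys = [("a", "x"), ("b", None), (None, None)]
kept_any = dropna_keys(keys)
kept_all = dropna_keys(keys, how="all")
kept_first = dropna_keys(keys, subset=[0])
print(kept_any, kept_all, kept_first)
```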
- ensure_multiindex(s: T) T
- extractlevel(index_or_data: T, template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, drop: bool | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) T
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are then replaced by the string "Total".
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
index_or_data (DataFrame, Series or Index) – Data to modify
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64

>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64

>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
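One way to picture how a template is matched is to translate it into a regular expression with named groups. The sketch below uses only the standard library and is not the library's implementation. It assumes that each pattern matches any run of characters other than "|", and template_to_regex and extract_label are hypothetical helper names.

```python
import re


def template_to_regex(template, optional=()):
    """Translate a "{name}" template into a compiled regex with named groups."""
    # re.split with a capture group alternates literal text and pattern names
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i in range(0, len(parts), 2):
        literal = parts[i]
        name = parts[i + 1] if i + 1 < len(parts) else None
        if name in optional and literal.endswith("|"):
            # An optional pattern may be missing together with its leading "|"
            pattern += re.escape(literal[:-1]) + rf"(?:\|(?P<{name}>[^|]*))?"
        elif name is not None:
            pattern += re.escape(literal) + rf"(?P<{name}>[^|]*)"
        else:
            pattern += re.escape(literal)
    return re.compile(pattern)


def extract_label(label, template, optional=()):
    """Return the extracted patterns for one label, or None if it does not match."""
    match = template_to_regex(template, optional).fullmatch(label)
    if match is None:
        return None
    # Missing optional patterns are reported as "Total"
    return {k: "Total" if v is None else v for k, v in match.groupdict().items()}


full = extract_label("SE|Elec|Bio", "SE|{type}|{fuel}")
missing = extract_label("SE|Elec", "SE|{type}|{fuel}")
total = extract_label("SE|Elec", "SE|{type}|{fuel}", optional=["fuel"])
print(full, missing, total)
```

Labels that do not match yield None here, which corresponds to the entries dropped under dropna=True in the examples above.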
- fixindexna(index_or_data: T, axis: Literal[0, 1, 'index', 'columns'] = 0) T
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
index_or_data (Index, Series or DataFrame) – Data
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- formatlevel(index_or_data: T, drop: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) T
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
index_or_data (DataFrame, Series or Index) – Data to modify
drop (bool, default False) – Whether to drop the used index levels
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to modify
optional ([str], optional) – Marks levels as optional (including a leading | character)
**templates (str) – Format templates for one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
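Formatting is the inverse of extraction: each template is filled in from the labels of the other levels. A minimal sketch using str.format on plain index tuples is given here. It assumes the formatted level is new rather than overwriting an existing one, it ignores the optional argument, and format_keys is a hypothetical helper.

```python
def format_keys(keys, names, **templates):
    """Append one new level per template, formatted from the existing levels."""
    new_names = list(names) + list(templates)
    new_keys = []
    for key in keys:
        labels = dict(zip(names, key))
        formatted = tuple(t.format(**labels) for t in templates.values())
        new_keys.append(key + formatted)
    return new_names, new_keys


names = ["gas", "sector"]
keys = [("CO2", "Energy"), ("CH4", "Waste")]
new_names, new_keys = format_keys(keys, names, variable="Emi|{gas}|{sector}")
print(new_names, new_keys)
```

In this sketch a template that refers to an unknown level raises KeyError from str.format, whereas formatlevel is documented to raise ValueError.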
- index_names(s, raise_on_index=False)
- isna(index_or_data: Index | Series | DataFrame, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- notna(index_or_data: Index | Series | DataFrame, subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- projectlevel(index_or_data: T, levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) T
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
index_or_data (DataFrame, Series or Index) – Index, Series or DataFrame to project
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
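Projection can be sketched on plain index tuples: every level not listed is dropped. The following is an illustration only; project_keys is a hypothetical helper, and returning the levels in the order given by levels is an assumption of this sketch.

```python
def project_keys(keys, names, levels):
    """Keep only the requested levels of each index tuple."""
    positions = [names.index(level) for level in levels]
    return [tuple(key[i] for i in positions) for key in keys]


names = ["model", "scenario", "region"]
keys = [("m1", "s1", "World"), ("m2", "s1", "World")]
projected = project_keys(keys, names, ["model", "scenario"])
print(projected)
```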
- semijoin(frame_or_series: S, other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) S
Semijoin frame_or_series by index other.
Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.
- Parameters:
frame_or_series (DataFrame or Series) – Data to be filtered
other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.
how ({'left', 'right', 'inner', 'outer'}) – Join method to use
level (None or str or int) – Single level on which to join; if not given, join on all levels
sort (bool, optional) – Whether to sort the index
axis ({0, 1, "index", "columns"}) – Axis on which to join
fill_value – Value for filling gaps introduced by right or outer joins
fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed
- Return type:
DataFrame or Series
- Raises:
ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if frame_or_series does not derive from DataFrame or Series
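The join-then-reindex behaviour described above can be sketched with a dictionary keyed by index tuples. This is a simplified pure-Python illustration that assumes both inputs share the same levels; semijoin_sketch is a hypothetical helper and does not cover the level, sort or fail_on_reorder arguments.

```python
def semijoin_sketch(data, other, how="left", fill_value=None):
    """Reindex `data` (a dict keyed by index tuples) by its keys joined with `other`."""
    other = list(other)
    other_set = set(other)
    if how == "left":
        keys = list(data)
    elif how == "right":
        keys = other
    elif how == "inner":
        keys = [key for key in data if key in other_set]
    elif how == "outer":
        keys = list(data) + [key for key in other if key not in data]
    else:
        raise ValueError(f"unknown join method: {how!r}")
    # Keys that are new to `data` receive the fill value
    return {key: data.get(key, fill_value) for key in keys}


data = {("a", 1): 10, ("b", 1): 30}
other = [("b", 1), ("c", 1)]
inner = semijoin_sketch(data, other, how="inner")
right = semijoin_sketch(data, other, how="right", fill_value=0)
outer = semijoin_sketch(data, other, how="outer", fill_value=0)
print(inner, right, outer)
```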
- summarylevel(index_or_data: DataFrame | Series | Index, n: int = 80, as_str: bool = False) str | None
Describe index levels.
- Parameters:
- Returns:
description – if as_str is True
- Return type:
str, optional
- to_tidy(data: Series | DataFrame, meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year') DataFrame
Convert multi-indexed time-series dataframe to tidy dataframe.
- Parameters:
data (Data) – Data in time-series representation with years on columns
meta (DataFrame, optional) – Meta data that is joined before tidying up
value_name (str, optional) – Column name for the values; default “value”. Use None to not change the name.
columns (str, optional) – Name for the level on the columns axis; default “year”. Use None to not change the name.
- Returns:
Tidy dataframe without index
- Return type:
DataFrame
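The reshaping from a time-series layout (years on columns) to a tidy layout (one record per value) can be sketched without pandas. This only illustrates the idea; to_tidy_sketch and the nested-dictionary input format are assumptions made for this example, and the meta argument is not covered.

```python
def to_tidy_sketch(wide, index_names, value_name="value", columns="year"):
    """Flatten {index_tuple: {column_label: value}} into a list of records."""
    records = []
    for key, row in wide.items():
        for column, value in row.items():
            record = dict(zip(index_names, key))
            record[columns] = column
            record[value_name] = value
            records.append(record)
    return records


wide = {("m1", "World"): {2020: 1.0, 2030: 2.0}}
tidy = to_tidy_sketch(wide, ["model", "region"])
print(tidy)
```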
accessors
Registers convenience accessors into the pix namespace of each pandas object.
Examples
>>> df.pix.project(["model", "scenario"])
>>> df.index.pix.assign(unit="Mt CO2")
>>> df.pix.multiply(other, how="left")
- class DataFrameIdxAccessor(*args, **kwargs)
Bases: _DataPixAccessor
Deprecated since version 0.2.9: Use the new name df.pix of the accessor
- add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])
Add explicit levels to data as 0 values.
Remaining levels in data not found in levels or derive are taken from reference (or its index).
- Parameters:
- Returns:
unsorted data with additional zero data
- Return type:
DataFrame
- aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])
Aggregate labels on one or multiple levels together.
- Parameters:
agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.
axis (Axis, optional) – Axis on which to aggregate, default 0
dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True
mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data
**levels – Mapping for one or multiple levels, which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.
- Returns:
Aggregated data
- Return type:
Data
Notes
If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
That is, remove all occurrences of other from the data.
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all levels
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)
Converts units in a dataframe or series.
- Parameters:
unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)
level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}
axis (Axis, default 0) – Axis of unit level
- Returns:
DataFrame or Series with converted units
- Return type:
Data
Examples
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64

>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64
Notes
Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, quantify, dequantify
- div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are then replaced by the string "Total".
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64

>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64

>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64

>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
Convert columns in data to pint extension types to handle units.
pint-pandas can only represent a single unit per column and is somewhat brittle.
- Parameters:
- Returns:
Data with internalized unit which stays with arithmetics
- Return type:
Data
- Raises:
ValueError – If level contains more than one unit
Examples
>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]
Notes
pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, dequantify, convert_unit
- radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series
Semijoin frame_or_series by index other.
Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.
- Parameters:
other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.
how ({'left', 'right', 'inner', 'outer'}) – Join method to use
level (None or str or int) – Single level on which to join; if not given, join on all levels
sort (bool, optional) – Whether to sort the index
axis ({0, 1, "index", "columns"}) – Axis on which to join
fill_value – Value for filling gaps introduced by right or outer joins
fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed
- Return type:
DataFrame or Series
- Raises:
ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if frame_or_series does not derive from DataFrame or Series
- sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')
Convert multi-indexed time-series dataframe to tidy dataframe.
- Parameters:
- Returns:
Tidy dataframe without index
- Return type:
DataFrame
- truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index
Return unique index levels.
- Parameters:
- Returns:
unique_index
- Return type:
Index
- unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- class DataFramePixAccessor(pandas_obj)
Bases: _DataPixAccessor
- add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])
Add explicit levels to data as 0 values.
Remaining levels in data not found in levels or derive are taken from reference (or its index).
- Parameters:
- Returns:
unsorted data with additional zero data
- Return type:
DataFrame
- aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])
Aggregate labels on one or multiple levels together.
- Parameters:
agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”.
axis (Axis, optional) – Axis on which to aggregate, default 0
dropna (bool, optional) – Whether to drop or preserve NaNs in the index, default True
mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data
**levels – Mapping for one or multiple levels, which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.
- Returns:
Aggregated data
- Return type:
Data
Notes
If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
That is, remove all occurrences of other from the data.
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all levels
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)
Converts units in a dataframe or series.
- Parameters:
unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)
level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}
axis (Axis, default 0) – Axis of unit level
- Returns:
DataFrame or Series with converted units
- Return type:
Data
Examples
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64
Notes
Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, quantify, dequantify
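The difference between a single target unit and a mapping can be sketched without pint. Everything here is an assumption for illustration: `TO_METRE` is a hand-written factor table standing in for a unit registry, and `convert_units` is a hypothetical helper working on a dictionary keyed by (var, unit).

```python
from typing import Callable, Dict, Mapping, Optional, Tuple, Union

# Hypothetical conversion factors to metres, standing in for a pint registry.
TO_METRE = {"mm": 1e-3, "m": 1.0, "km": 1e3}

UnitSpec = Union[str, Mapping[str, Optional[str]], Callable[[str], Optional[str]]]


def convert_units(
    data: Dict[Tuple[str, str], float], unit: UnitSpec
) -> Dict[Tuple[str, str], float]:
    """Convert values and relabel the unit part of each key (illustrative only)."""
    converted = {}
    for (var, old), value in data.items():
        if isinstance(unit, str):
            new = unit  # single target unit for every entry
        elif callable(unit):
            new = unit(old) or old  # a return value of None keeps the old unit
        else:
            new = unit.get(old) or old  # a unit missing from the mapping is kept
        if new != old:
            value = value * TO_METRE[old] / TO_METRE[new]
        converted[(var, new)] = value
    return converted


data = {("foo", "mm"): 7.0, ("bar", "m"): 8.0}
print(convert_units(data, "km"))
print(convert_units(data, {"m": "km"}))
```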
- div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
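How subset and how interact can be sketched in plain Python, with index entries as tuples and None standing in for a missing label. `dropna_rows` and its `names` argument are hypothetical and only mirror the description above.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[str], ...]


def dropna_rows(
    rows: Sequence[Row],
    names: Sequence[str],
    subset: Optional[Sequence[str]] = None,
    how: str = "any",
) -> List[Row]:
    """Drop entries with missing labels on the checked levels (illustrative only)."""
    if how not in ("any", "all"):
        raise ValueError(f"how must be 'any' or 'all', got {how!r}")
    # Positions of the levels to check; all levels when no subset is given.
    positions = [i for i, name in enumerate(names) if subset is None or name in subset]
    check = any if how == "any" else all
    return [row for row in rows if not check(row[i] is None for i in positions)]


names = ("var", "unit")
rows = [("a", "x"), ("b", None), (None, None)]
print(dropna_rows(rows, names, how="any"))
print(dropna_rows(rows, names, how="all"))
```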
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are then replaced by the string "Total".
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
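One way to picture template matching is to translate a template into a regular expression with named groups. The sketch below uses only the standard library; `template_to_regex` and `extract_one` are hypothetical helpers, and restricting each pattern to text without a | character is an assumption made here, not something the documentation states.

```python
import re
from typing import Dict, Optional, Sequence


def template_to_regex(template: str, optional: Sequence[str] = ()) -> re.Pattern:
    """Translate e.g. "SE|{type}|{fuel}" into a regex with named groups."""
    # Splitting on {name} yields alternating literal text and pattern names.
    pieces = re.split(r"\{(\w+)\}", template)
    literals, names = pieces[0::2], pieces[1::2]
    pattern = ""
    for literal, name in zip(literals, names):
        group = f"(?P<{name}>[^|]+)"  # assumption: patterns do not span "|"
        if name in optional and literal.endswith("|"):
            # An optional pattern may be missing together with its leading "|".
            pattern += re.escape(literal[:-1]) + rf"(?:\|{group})?"
        else:
            pattern += re.escape(literal) + group
    pattern += re.escape(literals[-1])  # literal text after the last pattern
    return re.compile(pattern)


def extract_one(
    label: str, template: str, optional: Sequence[str] = ()
) -> Optional[Dict[str, str]]:
    """Return the extracted patterns for one label, or None if it does not match."""
    match = template_to_regex(template, optional).fullmatch(label)
    if match is None:
        return None
    # Missing optional patterns are filled with "Total", as described above.
    return {
        name: value if value is not None else "Total"
        for name, value in match.groupdict().items()
    }


for label in ["SE|Elec|Bio", "SE|Elec", "PE|Coal"]:
    print(label, extract_one(label, "SE|{type}|{fuel}", optional=["fuel"]))
```

Labels for which `extract_one` returns None correspond to the non-matching entries that dropna=True removes in the examples above.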
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
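Formatting is the reverse of extraction: a template is filled in from the labels of other levels. The sketch below models each index entry as a dictionary and relies on `str.format`; `format_rows` is a hypothetical helper, and whether the real method keeps or drops the levels consumed by a template is not covered here.

```python
from typing import Dict, List


def format_rows(rows: List[Dict[str, str]], **templates: str) -> List[Dict[str, str]]:
    """Build new level labels from templates that refer to other levels."""
    formatted = []
    for row in rows:
        new_row = dict(row)
        for level, template in templates.items():
            try:
                # Every {name} in the template is looked up among the row's levels.
                new_row[level] = template.format(**row)
            except KeyError as err:
                raise ValueError(f"template refers to non-existent level {err}") from None
        formatted.append(new_row)
    return formatted


rows = [{"type": "Elec", "fuel": "Bio", "unit": "GWh"}]
print(format_rows(rows, variable="SE|{type}|{fuel}"))
```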
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
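Projection can be pictured as selecting tuple positions by level name. In this sketch `project_rows` and `names` are hypothetical, and returning the levels in the order they were requested is an assumption.

```python
from typing import List, Sequence, Tuple


def project_rows(
    rows: Sequence[Tuple[str, ...]], names: Sequence[str], levels: Sequence[str]
) -> List[Tuple[str, ...]]:
    """Keep only the requested levels of each index entry (illustrative only)."""
    # Look up the position of every requested level once.
    positions = [list(names).index(level) for level in levels]
    return [tuple(row[i] for i in positions) for row in rows]


names = ("model", "scenario", "unit")
rows = [("m1", "s1", "GWh"), ("m1", "s2", "EJ")]
print(project_rows(rows, names, ["model", "unit"]))
```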
- quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
Convert columns in data to pint extension types to handle units.
pint-pandas can only represent a single unit per column and is somewhat brittle.
- Parameters:
- Returns:
Data with internalized unit which stays with arithmetics
- Return type:
Data
- Raises:
ValueError – If level contains more than one unit
Examples
>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]
Notes
pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, dequantify, convert_unit
- radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series
Semijoin frame_or_series by index other.
Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.
- Parameters:
other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.
how ({'left', 'right', 'inner', 'outer'}) – Join method to use
level (None or str or int) – Single level on which to join; if not given, join on all
sort (bool, optional) – Whether to sort the index
axis ({0, 1, "index", "columns"}) – Axis on which to join
fill_value – Value for filling gaps introduced by right or outer joins
fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed
- Return type:
DataFrame or Series
- Raises:
ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if frame_or_series does not derive from DataFrame or Series
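The join-then-reindex behaviour can be sketched on a plain dictionary, assuming both indexes share the same levels so that each entry is a single key. `semijoin_dict` is a hypothetical helper; it ignores level, sort and fail_on_reorder and only shows how the join method selects keys and where fill_value comes in.

```python
from typing import Any, Dict, Hashable, List, Sequence


def semijoin_dict(
    data: Dict[Hashable, Any],
    other: Sequence[Hashable],
    how: str = "left",
    fill_value: Any = None,
) -> Dict[Hashable, Any]:
    """Join the keys of data with other, then reindex data (illustrative only)."""
    if how == "left":
        keys: List[Hashable] = list(data)
    elif how == "right":
        keys = list(other)
    elif how == "inner":
        other_set = set(other)
        keys = [key for key in data if key in other_set]
    elif how == "outer":
        keys = list(data) + [key for key in other if key not in data]
    else:
        raise ValueError(f"unknown join method: {how!r}")
    # Keys introduced by right or outer joins have no data and get fill_value.
    return {key: data.get(key, fill_value) for key in keys}


data = {"a": 1, "b": 2}
other = ["b", "c"]
print(semijoin_dict(data, other, how="outer", fill_value=0))
```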
See also
- sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')
Convert multi-indexed time-series dataframe to tidy dataframe.
- Parameters:
- Returns:
Tidy dataframe without index
- Return type:
DataFrame
- truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index
Return unique index levels.
- Parameters:
- Returns:
unique_index
- Return type:
Index
See also
- unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- class IndexIdxAccessor(*args, **kwargs)
Bases: _PixAccessor
Deprecated since version 0.2.9: Use the new name df.pix of the accessor
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
That is, remove all occurrences of other from data.
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
See also
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are then replaced by the string "Total".
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- class IndexPixAccessor(pandas_obj)
Bases: _PixAccessor
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
That is, remove all occurrences of other from data.
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
See also
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, e.g. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are then replaced by the string "Total".
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- class SeriesIdxAccessor(*args, **kwargs)
Bases: _DataPixAccessor
Deprecated since version 0.2.9: Use the new name df.pix of the accessor
- add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])
Add explicit levels to data as 0 values.
Remaining levels in data not found in levels or derive are taken from reference (or its index).
- Parameters:
- Returns:
unsorted data with additional zero data
- Return type:
DataFrame
- aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])
Aggregate labels on one or multiple levels together.
- Parameters:
agg_func (str, optional) – Function for aggregating values, default “sum” Other sensible options are “mean” or “first”
axis (Axis, optional) – Axis on which to aggregate, default 0
dropna (bool, optional) – Whether to drop or preserve NANs in the index, default True
mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data
**levels – Mapping for one or multiple levels, which labels to aggregate under a common name, e.g. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.
- Returns:
Aggregated data
- Return type:
Data
Notes
If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
See also
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
That is, remove all occurrences of other from data.
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join; if not given, join on all
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
See also
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)
Converts units in a dataframe or series.
- Parameters:
unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)
level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}
axis (Axis, default 0) – Axis of unit level
- Returns:
DataFrame or Series with converted units
- Return type:
Data
Examples
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64
Notes
Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, quantify, dequantify
- div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, f.ex. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are replaced by the string "Total" then.
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
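How a template relates to a regular expression with named captures can be sketched with the standard library alone. The helper names template_to_regex and extract_fields below are hypothetical and only illustrate the idea; the library's own translation may differ, for instance in how optional patterns are handled.

```python
import re


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Translate a template like "SE|{type}|{fuel}" into a regex (illustrative helper)."""
    # re.split with a capture group alternates literal text and pattern names
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)  # literal text, "|" gets escaped
        else:
            pattern += f"(?P<{part}>.*?)"  # named capture for the pattern
    return re.compile(pattern)


def extract_fields(label: str, template: str):
    """Return the captured fields as a dict, or None if the label does not match."""
    match = template_to_regex(template).fullmatch(label)
    return None if match is None else match.groupdict()


matched = extract_fields("SE|Elec|Bio", "SE|{type}|{fuel}")
unmatched = extract_fields("PE|Coal", "SE|{type}|{fuel}")
```

Labels for which no match is found correspond to the entries that are dropped by default or kept with NaN levels when dropna=False.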
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
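Formatting is the inverse of extraction: a template is filled in from the values of other levels. The following is a rough sketch with plain pandas and str.format, assuming the template refers to the levels type and fuel; it only builds the new labels and does not reproduce how the accessor attaches them to the index.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("Elec", "Bio", "GWh"), ("Elec", "Coal", "GWh")],
    names=["type", "fuel", "unit"],
)
template = "SE|{type}|{fuel}"

# Fill the template once per index entry; level names act as keyword arguments
# (unused levels such as "unit" are simply ignored by str.format)
variable = pd.Index(
    [template.format(**dict(zip(idx.names, row))) for row in idx],
    name="variable",
)
```

A template that names a level missing from the index raises a KeyError in this sketch, which mirrors the ValueError documented above.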
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
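Projection can be approximated in plain pandas by dropping every level that is not requested. The function project_sketch is a hypothetical stand-in for illustration; whether the real method also reorders the remaining levels is not covered here.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("m1", "r1", "GWh"), ("m1", "r2", "GWh")],
    names=["model", "region", "unit"],
)
s = pd.Series([1, 2], index=idx)


def project_sketch(data, levels):
    """Keep only the given index levels by dropping all others (illustrative)."""
    drop = [name for name in data.index.names if name not in levels]
    return data.droplevel(drop)


projected = project_sketch(s, ["region", "unit"])
```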
- quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
Convert columns in data to pint extension types to handle units.
pint-pandas can only represent a single unit per column and is somewhat brittle.
- Parameters:
- Returns:
Data with internalized unit that is preserved through arithmetic operations
- Return type:
Data
- Raises:
ValueError – If level contains more than one unit
Examples
>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]
Notes
pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, dequantify, convert_unit
- radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series
Semijoin frame_or_series by index other.
Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.
- Parameters:
other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.
how ({'left', 'right', 'inner', 'outer'}) – Join method to use
level (None or str or int) – Single level on which to join, if not given join on all
sort (bool, optional) – Whether to sort the index
axis ({0, 1, "index", "columns"}) – Axis on which to join
fill_value – Value for filling gaps introduced by right or outer joins
fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed
- Return type:
DataFrame or Series
- Raises:
ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if frame_or_series does not derive from DataFrame or Series
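The two steps described above, joining the indexes and then reindexing the data, can be sketched with plain pandas. This is an illustration of the idea for a single-level index only, not the library's implementation.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=pd.Index(["a", "b", "c"], name="k"))
other = pd.Index(["b", "c", "d"], name="k")

# Step 1: join the two indexes; step 2: reindex the data with the result
inner = s.reindex(s.index.join(other, how="inner"))

# An outer join introduces the label "d", which is filled with fill_value
outer = s.reindex(s.index.join(other, how="outer"), fill_value=0.0)
```

With how="inner" the data is filtered to the labels present in other, while right or outer joins can introduce new labels that need a fill value.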
See also
- sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')
Convert multi-indexed time-series dataframe to tidy dataframe.
- Parameters:
- Returns:
Tidy dataframe without index
- Return type:
DataFrame
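The wide-to-tidy reshaping can be approximated in plain pandas with reset_index and melt. The level names region and variable are made up for the example, and the handling of the meta argument is not shown; treat this as a sketch of the result shape only.

```python
import pandas as pd

# Wide time-series frame: index levels describe the series, columns are years
df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("b", "x")], names=["region", "variable"]
    ),
    columns=[2020, 2030],
)

# One row per (index entry, year) pair, with all index levels as plain columns
tidy = df.reset_index().melt(
    id_vars=["region", "variable"], var_name="year", value_name="value"
)
```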
- truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index
Return unique index levels.
- Parameters:
- Returns:
unique_index
- Return type:
Index
See also
- unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- class SeriesPixAccessor(pandas_obj)
Bases: _DataPixAccessor
- add(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- add_zeros_like(reference: MultiIndex | DataFrame | Series, /, derive: Dict[str, MultiIndex] | None = None, **levels: Sequence[str])
Add explicit levels to data as 0 values.
Remaining levels in data not found in levels or derive are taken from reference (or its index).
- Parameters:
- Returns:
unsorted data with additional zero data
- Return type:
DataFrame
- aggregate(agg_func: str = 'sum', axis: Literal[0, 1, 'index', 'columns'] = 0, dropna: bool = True, mode: Literal['replace', 'append', 'return'] = 'replace', **levels: Dict[str, Sequence[Any]])
Aggregate labels on one or multiple levels together.
- Parameters:
agg_func (str, optional) – Function for aggregating values, default “sum”. Other sensible options are “mean” or “first”
axis (Axis, optional) – Axis on which to aggregate, default 0
dropna (bool, optional) – Whether to drop or preserve NANs in the index, default True
mode ({"replace", "append", "return"}) – Whether to replace or to append to the individual labels or return the aggregated data
**levels – Mapping for one or multiple levels, which labels to aggregate under a common name, f.ex. region={"sdn_ssd": ["sdn", "ssd"]} aggregates the “sdn” and “ssd” regions to a new “sdn_ssd” region.
- Returns:
Aggregated data
- Return type:
Data
Notes
If you already have a complete mapping from country to region, then prefer to use groupby directly instead of relying on this relatively slow method.
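The groupby route mentioned in the note could look roughly as follows. The data and the mapping are invented for illustration; the sketch relabels the region level with plain pandas and then sums over identical labels.

```python
import pandas as pd

s = pd.Series(
    [1.0, 2.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [("sdn", "co2"), ("ssd", "co2"), ("deu", "co2")],
        names=["region", "gas"],
    ),
)

# Labels missing from the mapping (here "deu") are left untouched by rename
mapping = {"sdn": "sdn_ssd", "ssd": "sdn_ssd"}
relabelled = s.rename(mapping, level="region")

# Summing over all levels merges the entries that now share a label
aggregated = relabelled.groupby(level=["region", "gas"]).sum()
```

This corresponds to the "replace" behaviour in spirit, since the individual "sdn" and "ssd" labels no longer appear in the result.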
See also
- antijoin(other: Index | Series | DataFrame, *, axis: Literal[0, 1, 'index', 'columns'] = 0)
Antijoin index_or_data with index other.
i.e. remove all occurrences of other from data
- Parameters:
other (Index) – Other index to join with
level (None or str or int) – Single level on which to join, if not given join on all
axis ({0, 1, "index", "columns"}) – Axis on which to join
- Return type:
Index or DataFrame or Series
- Raises:
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if index_or_data does not derive from DataFrame or Series
See also
- assign(frame: Series | DataFrame | None = None, order: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, ignore_index: bool = False, **labels: Any) DataFrame | Series | MultiIndex
Add or overwrite levels on a multiindex.
- Parameters:
frame (Series or DataFrame, optional) – Additional labels
order (list of str, optional) – Level names in desired order or False, by default False
axis ({0, 1, "index", "columns"}, default 0) – Axis where to update multiindex
ignore_index (bool, optional) – If true, dataframes or series are not index aligned
**labels – Labels for each new index level
- Returns:
Series or DataFrame with changed index or new MultiIndex
- Return type:
df
- convert_unit(unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)
Converts units in a dataframe or series.
- Parameters:
unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)
level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}
axis (Axis, default 0) – Axis of unit level
- Returns:
DataFrame or Series with converted units
- Return type:
Data
Examples
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> s.pix.convert_unit("km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> s.pix.convert_unit({"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64
Notes
Uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, quantify, dequantify
- div(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divide(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- dropna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Remove missing index values.
Drops all index entries for which any or all (how) levels are undefined.
- Parameters:
subset (Sequence[str], optional) – Names of levels on which to check for NA values
how ({"any", "all"}) – Whether to remove an entry if all levels are NA or only a single one
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to check on
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- extract(template: str | None = None, *, keep: bool = False, dropna: bool = True, regex: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, drop: bool | None = None, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Extract new index levels with templates matched against any index level.
The **templates argument defines pairs of level names and templates. Given level names are matched against the template, f.ex. "Emi|{gas}|{sector}". Patterns ({gas} or {sector}) appearing in the template are extracted from the successful matches and added as new levels.
Pattern names in the optional argument can be missing (including a leading | character) and are replaced by the string "Total" then.
Changed in version 0.5.3: Added optional patterns.
Changed in version 0.5.0: drop replaced by keep and default changed to not keep. regex added.
- Parameters:
template (str, optional) – Extraction template for a single level
keep (bool, default False) – Whether to keep the split dimension
dropna (bool, default True) – Whether to drop the non-matching levels
regex (bool, default False) – Whether templates are given as regular expressions (regexes must use named captures)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to extract from
drop (bool, optional) – Deprecated argument, use keep instead
optional ([str] or None, optional) – Marks templates as optional
**templates (str) – Templates for splitting one or multiple levels
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If dim is not a dimension of index_or_series
ValueError – If template is given, while index has more than one level
Examples
>>> s = Series(
...     range(4),
...     MultiIndex.from_arrays(
...         [
...             ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal", "SE|Elec"],
...             ["GWh", "GWh", "EJ", "GWh"],
...         ],
...         names=["variable", "unit"],
...     ),
... )
>>> s
variable      unit
SE|Elec|Bio   GWh     0
SE|Elec|Coal  GWh     1
PE|Coal       EJ      2
SE|Elec       GWh     3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}")
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", optional=["fuel"])
unit  type  fuel
GWh   Elec  Bio      0
GWh   Elec  Coal     1
GWh   Elec  Total    3
dtype: int64
>>> extractlevel(s, variable="SE|{type}|{fuel}", keep=True, dropna=False)
variable      unit  type  fuel
SE|Elec|Bio   GWh   Elec  Bio     0
SE|Elec|Coal  GWh   Elec  Coal    1
PE|Coal       EJ    NaN   NaN     2
SE|Elec       GWh   NaN   NaN     3
dtype: int64
>>> extractlevel(s, variable=r"SE\|(?P<type>.*?)(?:\|(?P<fuel>.*?))?", regex=True)
unit  type  fuel
GWh   Elec  Bio     0
GWh   Elec  Coal    1
GWh   Elec  NaN     3
dtype: int64
>>> s = Series(range(3), ["SE|Elec|Bio", "SE|Elec|Coal", "PE|Coal"])
>>> extractlevel(s, "SE|{type}|{fuel}")
type  fuel
Elec  Bio     0
      Coal    1
dtype: int64
See also
formatlevel
- fixna(axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Fix broken MultiIndex NA representation from .groupby(…, dropna=False)
Refer to https://github.com/coroa/pandas-indexing/issues/25 for details
- Parameters:
axis (Axis, optional) – Axis to fix, by default 0
- Return type:
index_or_data
- floordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- format(axis: Literal[0, 1, 'index', 'columns'] = 0, optional: Sequence[str] | None = None, **templates: str) DataFrame | Series | Index
Format index levels based on a template which can refer to other levels.
Changed in version 0.5.3: Added optional patterns.
- Parameters:
- Return type:
Index, Series or DataFrame
- Raises:
ValueError – If templates refer to non-existent levels
- isna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- mod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- multiply(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- notna(subset: Sequence[str] | None = None, how: Literal['any', 'all'] = 'any', axis: Literal[0, 1, 'index', 'columns'] = 0)
- pow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- project(levels: Sequence[str], axis: Literal[0, 1, 'index', 'columns'] = 0) DataFrame | Series | Index
Project multiindex to given levels.
Drops all levels except the ones explicitly mentioned from a given multiindex or an axis of a series or a dataframe.
- Parameters:
levels (sequence of str) – Names of levels to project on (to keep)
axis ({0, 1, "index", "columns"}, default 0) – Axis of DataFrame to project
- Returns:
index_or_data
- Return type:
Index|MultiIndex|Series|DataFrame
- quantify(level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
Convert columns in data to pint extension types to handle units.
pint-pandas can only represent a single unit per column and is somewhat brittle.
- Parameters:
- Returns:
Data with internalized unit that is preserved through arithmetic operations
- Return type:
Data
- Raises:
ValueError – If level contains more than one unit
Examples
>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s.pix.quantify()
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]
Notes
pint-pandas uses the pint application registry, which can be set with pint.set_application_registry() or set_openscm_registry_as_default().
See also
set_openscm_registry_as_default, dequantify, convert_unit
- radd(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdivmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rfloordiv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmod(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmul(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rpow(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rsub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rtruediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- semijoin(other: Index | Series | DataFrame, *, how: Literal['left', 'right', 'inner', 'outer'] = 'left', level: str | int | None = None, sort: bool = False, axis: Literal[0, 1, 'index', 'columns'] = 0, fill_value: Any = <no_default>, fail_on_reorder: bool = False) DataFrame | Series
Semijoin frame_or_series by index other.
Joins indexes of both inputs and then reindexes the primary data input with the resulting joined index allowing for filling values.
- Parameters:
other (Index or Data) – Other index to join with, if a DataFrame or Series is provided its axis is extracted.
how ({'left', 'right', 'inner', 'outer'}) – Join method to use
level (None or str or int) – Single level on which to join, if not given join on all
sort (bool, optional) – Whether to sort the index
axis ({0, 1, "index", "columns"}) – Axis on which to join
fill_value – Value for filling gaps introduced by right or outer joins
fail_on_reorder (bool, default False) – Raise ValueError if index order cannot be guaranteed
- Return type:
DataFrame or Series
- Raises:
ValueError – If fail_on_reorder is True and the new index order does not correspond to the order of other
ValueError – If axis is not 0, “index” or 1, “columns”
TypeError – if frame_or_series does not derive from DataFrame or Series
See also
- sub(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- subtract(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- to_tidy(meta: DataFrame | None = None, value_name: str | None = 'value', columns: str | None = 'year')
Convert multi-indexed time-series dataframe to tidy dataframe.
- Parameters:
- Returns:
Tidy dataframe without index
- Return type:
DataFrame
- truediv(other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unique(levels: str | Sequence[str] | None, axis: Literal[0, 1, 'index', 'columns'] = 0) Index
Return unique index levels.
- Parameters:
- Returns:
unique_index
- Return type:
Index
See also
- unitadd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdivide(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmod(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmultiply(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitradd(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrdiv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrmul(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrtruediv(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitsub(other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
arithmetics
Provide aligned basic arithmetic ops.
Simple arithmetic operations add(), divide(), multiply() and
subtract() which allow changing the standard how="outer" alignment that pandas
uses by default.
In practice, this means that if dataframes do not share the same axes, one can choose to get
the results only for the items existing in both indices (how="inner"), or
to prefer the axis of the first (how="left") or the second (how="right")
operand.
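The effect of the alignment choice can be reproduced with plain pandas by aligning explicitly before the operation. Note that pandas' own align method names this option join; the sketch below only illustrates the resulting indexes and makes no claim about how the functions in this module are implemented.

```python
import pandas as pd

a = pd.Series([1.0, 2.0], index=pd.Index(["x", "y"], name="k"))
b = pd.Series([10.0, 20.0], index=pd.Index(["y", "z"], name="k"))

# pandas default: outer alignment, unmatched labels become NaN
default = a + b

# inner alignment: only labels present in both operands survive
a_inner, b_inner = a.align(b, join="inner")
inner = a_inner + b_inner

# left alignment: the index of the first operand is kept
a_left, b_left = a.align(b, join="left")
left = a_left + b_left
```

Here default has the labels x, y and z with NaN for x and z, inner contains only y, and left keeps x (as NaN) and y.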
See also
- add(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- binop(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- div(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divide(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- divmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- floordiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- mul(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- multiply(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- pow(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- radd(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rdivmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rfloordiv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmod(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rmul(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rpow(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rsub(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- rtruediv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- sub(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- subtract(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- truediv(df: Series | DataFrame, other: Series | DataFrame, assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitadd(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitbinop(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdiv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitdivide(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmod(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmul(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitmultiply(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitradd(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrdiv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrmul(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrsub(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitrtruediv(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
- unitsub(df: Series | DataFrame, other: Series | DataFrame, level: str = 'unit', assign: Dict[str, Any] | None = None, axis: Literal[0, 1, 'index', 'columns'] | None = None, **align_kwargs: Any)
selectors
Selectors improve .loc[] indexing for multi-index pandas data.
- class Ismatch(filters: Mapping[str, Any], regex: bool = False, ignore_missing_levels: bool = False)
Bases: Selector
- index_match(index, patterns)
- multiindex_match(index, patterns, level)
- isin(df: Series | DataFrame | None = None, ignore_missing_levels: bool = False, **filters: Any) Isin | Series
Constructs a MultiIndex selector.
- Parameters:
df (Data, optional) – Data on which to match; if missing, an Isin object is returned
ignore_missing_levels (bool, default False) – If set, levels missing in the data index are ignored
**filters – Filters to apply on the given levels (lists are ORed, levels are ANDed). Callables are evaluated on the index level values.
- Return type:
Isin or Series
Example
>>> df.loc[isin(region="World", gas=["CO2", "N2O"])]
or with explicit df to get a boolean mask
>>> isin(df, region="World", gas=["CO2", "N2O"])
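The filter semantics described above (values in a list are ORed, separate levels are ANDed) can be sketched with plain pandas calls. This is a minimal illustration, not the library's implementation: `isin_mask` is a hypothetical helper, the sample data is made up, and callable filters and `ignore_missing_levels` are left out.

```python
import numpy as np
import pandas as pd

# Made-up sample data with two index levels.
index = pd.MultiIndex.from_tuples(
    [("World", "CO2"), ("World", "CH4"), ("Europe", "CO2"), ("World", "N2O")],
    names=["region", "gas"],
)
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)


def isin_mask(data, **filters):
    """Hypothetical helper: boolean mask that ANDs levels and ORs list entries."""
    mask = np.ones(len(data), dtype=bool)
    for level, allowed in filters.items():
        # A single string counts as a one-element list.
        allowed = [allowed] if isinstance(allowed, str) else list(allowed)
        # Within one level any listed value may match (OR) ...
        level_hits = data.index.get_level_values(level).isin(allowed)
        # ... while every level has to match (AND).
        mask &= level_hits
    return pd.Series(mask, index=data.index)


mask = isin_mask(s, region="World", gas=["CO2", "N2O"])
selected = s[mask]
```

Only the rows that are in region "World" and carry either "CO2" or "N2O" remain in `selected`.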
- ismatch(df: None | Index | DataFrame | Series | str = None, singlefilter: str | None = None, regex: bool = False, ignore_missing_levels: bool = False, **filters) Ismatch | Series
Constructs an Index or MultiIndex selector based on pattern matching.
- Parameters:
df (Data, optional) – Data on which to match; if missing, an Ismatch object is returned
singlefilter (str, optional) – Filter to apply on a non-multiindex index (can also be passed as the df argument)
regex (bool, default False) – If set, filters are interpreted as plain regex strings; otherwise (by default) a glob-like syntax is used
ignore_missing_levels (bool, default False) – If set, levels missing in the data index are ignored
**filters – Filters to apply on the given levels (lists are ORed, levels are ANDed)
- Return type:
Ismatch or Series
Example
for a multiindex:
>>> df.loc[ismatch(variable="Emissions|*|Fossil Fuel and Industry")]
for a single index:
>>> df.loc[ismatch("*bla*")]
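The difference between the glob-like default and `regex=True` can be illustrated with the standard library alone. In this sketch Python's `fnmatch` stands in for the library's glob-like syntax, whose exact rules may differ, and `re.fullmatch` is an assumption about how a regex filter would be applied. The variable names are made-up sample data.

```python
import re
from fnmatch import fnmatchcase

# Made-up IAMC-style variable names.
variables = [
    "Emissions|CO2|Fossil Fuel and Industry",
    "Emissions|CH4|Fossil Fuel and Industry",
    "Emissions|CO2|AFOLU",
]

# Glob-like: "|" is a literal character and "*" is a wildcard.
glob_pattern = "Emissions|*|Fossil Fuel and Industry"
glob_hits = [v for v in variables if fnmatchcase(v, glob_pattern)]

# Regex: "|" means alternation, so it has to be escaped to match literally.
regex_pattern = r"Emissions\|[^|]*\|Fossil Fuel and Industry"
regex_hits = [v for v in variables if re.fullmatch(regex_pattern, v)]

# Pitfall: an unescaped "|" turns the pattern into three alternatives,
# and the middle one (".*") matches every string.
naive_pattern = r"Emissions|.*|Fossil Fuel and Industry"
naive_hits = [v for v in variables if re.fullmatch(naive_pattern, v)]
```

Since IAMC-style variable names use "|" as a separator, the glob-like default is usually the more convenient choice. A regex filter needs every separator escaped.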
- maybe_const(x)
units
Unit handling in pandas data.
Enables unit conversions based on pint’s application registry (see also Notes).
By default units are expected – as in the IAMC default format – on a unit level on
each row, but a column-wise unit level is also supported.
Units can be handled in one of two flavours:
- convert_unit() converts manually to a new unit, like convert_unit(s, "km")
- quantify() converts data to a pint-pandas array, which tracks units implicitly through arithmetic until dequantify() extracts the tracked unit back into the multiindex level. While this is in theory the simpler approach, the underlying library pint-pandas [1] is brittle and breaks from time to time.
Notes
The pint application registry is set with pint.set_application_registry() or
with set_openscm_registry_as_default(). The latter installs the IAMC-based openscm-units registry [2].
Examples
>>> from pandas import MultiIndex, Series
>>> import pandas_indexing as pi
>>> pi.set_openscm_registry_as_default()
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples([("foo", "mm"), ("bar", "m")], names=["var", "unit"]),
... )
>>> s = pi.convert_unit(s, "km")
>>> s
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> pi.quantify(s)
var
bar    0.008
foo    7e-06
dtype: pint[kilometer]
References
See also
pint.set_application_registry
- convert_unit(data: Series | DataFrame, unit: str | Mapping[str, str] | Callable[[str], str], level: str | None = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0)
Converts units in a dataframe or series.
- Parameters:
data (DataFrame or Series) – DataFrame or Series with a “unit” level
unit (str or dict or function from old to new unit) – Either a single target unit or a mapping from old unit to target unit (a unit missing from the mapping or with a return value of None is kept)
level (str|None, default "unit") – Level name on axis. If None, then unit needs to be a mapping like {from_unit: to_unit}
axis (Axis, default 0) – Axis of unit level
- Returns:
DataFrame or Series with converted units
- Return type:
Data
Examples
>>> s = Series(
...     [7, 8],
...     MultiIndex.from_tuples(
...         [("foo", "mm"), ("bar", "m")], names=["var", "unit"]
...     ),
... )
>>> convert_unit(s, "km")
var  unit
bar  km      0.008000
foo  km      0.000007
dtype: float64
>>> convert_unit(s, {"m": "km"})
var  unit
bar  km      0.008
foo  mm      7.000
dtype: float64
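The three accepted forms of the unit argument (a single string, a mapping, or a function) and the rule that unmapped units are kept can be sketched in plain Python. `resolve_target_unit` is a hypothetical helper written for illustration, not part of pandas-indexing. It only picks the target unit and performs no conversion.

```python
from typing import Callable, Mapping, Optional, Union

UnitSpec = Union[str, Mapping[str, str], Callable[[str], Optional[str]]]


def resolve_target_unit(old_unit: str, unit: UnitSpec) -> str:
    """Hypothetical helper: choose the target unit for one existing unit."""
    if isinstance(unit, str):
        # A single target unit applies to every row.
        return unit
    if callable(unit):
        new_unit = unit(old_unit)
    else:
        # Mapping from old unit to target unit; missing keys yield None.
        new_unit = unit.get(old_unit)
    # A missing entry or a None return value keeps the old unit.
    return old_unit if new_unit is None else new_unit


single = [resolve_target_unit(u, "km") for u in ("mm", "m")]
mapped = [resolve_target_unit(u, {"m": "km"}) for u in ("mm", "m")]
by_func = [
    resolve_target_unit(u, lambda old: "km" if old == "m" else None)
    for u in ("mm", "m")
]
```

With the mapping `{"m": "km"}`, "mm" stays untouched, which matches the second example above where the "foo" row keeps its "mm" unit.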
Notes
Uses the pint application registry, which can be set with
pint.set_application_registry() or set_openscm_registry_as_default().
See also
- dequantify(data: Series | DataFrame, level: str = 'unit', axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False)
- format_dtype(dtype)
- quantify(data: Series | DataFrame, level: str = 'unit', unit: str | None = None, axis: Literal[0, 1, 'index', 'columns'] = 0, copy: bool = False) Series | DataFrame
Convert columns in data to pint extension types to handle units.
pint-pandas can only represent a single unit per column and is somewhat brittle.
- Parameters:
data (DataFrame or Series) – DataFrame or Series to quantify
unit (str, optional) – If given, assumes data is currently in this unit.
level (str, optional) – Level from which to take the unit, by default “unit”
axis (Axis, optional) – Axis from which to pop the level, by default 0
copy (bool, optional) – Whether data should be copied, by default False
- Returns:
Data with internalized units that persist through arithmetic
- Return type:
Data
- Raises:
ValueError – If level contains more than one unit
Examples
>>> s = Series(
...     [7e-3, 8],
...     MultiIndex.from_tuples([("foo", "m"), ("bar", "m")], names=["var", "unit"]),
... )
>>> quantify(s)
var
foo    7e-06
bar    0.008
dtype: pint[kilometer]
Notes
pint-pandas uses the pint application registry, which can be set with
pint.set_application_registry() or set_openscm_registry_as_default().