Changelog

v0.6.2 (2025-02-26)

  • Make semijoin() and antijoin() accept a Series or DataFrame argument to join against PR64

  • Add py.typed file to signal available type hints to downstream packages PR63

Special thanks to @znichollscr for his first contribution.
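
The semi-join and anti-join idea behind this release can be sketched in plain pandas. This is only an illustration of the concept under the assumption that rows are matched on identical index tuples; it does not call semijoin() or antijoin(), and the toy data is made up.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x")], names=["model", "scenario"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=index)

# The object to join against is a Series here; only its index matters.
other = pd.Series(
    [10.0],
    index=pd.MultiIndex.from_tuples([("a", "x")], names=["model", "scenario"]),
)

in_other = df.index.isin(other.index)
semi = df[in_other]   # rows whose index also occurs in `other`
anti = df[~in_other]  # rows whose index does not occur in `other`
```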

v0.6.1 (2024-12-01)

  • Fix iamc_aggregate() to return NaN instead of 0 if no sub-sectors exist.
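
The difference between the two results can be reproduced in plain pandas. The sketch below assumes that "no sub-sectors" corresponds to summing an empty selection; it is not the implementation of iamc_aggregate().

```python
import pandas as pd

# An empty Series stands in for "no sub-sectors exist".
empty = pd.Series([], dtype=float)

default_total = empty.sum()         # pandas default: an empty sum is 0.0
nan_total = empty.sum(min_count=1)  # requires at least one value, otherwise NaN
```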

v0.6.0 (2024-10-24)

  • Add resolver module with the class Resolver that supports consolidating IAMC-style scenario data with non-homogeneous variable coverage. Documentation is unfortunately still missing.

  • Add support for so-called optional patterns to extractlevel() and formatlevel(), for instance: df.pix.extract(variable="Emissions|{gas}|{sector}", optional=["sector"]) decomposes Emissions|CO2 into {"gas": "CO2", "sector": "Total"}.
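
To make the optional-pattern behaviour concrete, here is a minimal standard-library sketch. The regular expression and the helper extract_optional() are hypothetical and written by hand for this one template; the library builds its own pattern from the template string.

```python
import re

# Hand-written regex for the template "Emissions|{gas}|{sector}" with an
# optional "sector" part (an assumption for illustration only).
PATTERN = re.compile(r"Emissions\|(?P<gas>[^|]+)(?:\|(?P<sector>.+))?")


def extract_optional(label, default="Total"):
    """Return the captured fields, filling a missing optional field with `default`."""
    match = PATTERN.fullmatch(label)
    if match is None:
        return None  # the label does not fit the template at all
    return {
        name: value if value is not None else default
        for name, value in match.groupdict().items()
    }
```

With this sketch, extract_optional("Emissions|CO2") yields the same dictionary as the entry above, while a label with a sector part keeps its own sector.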

v0.5.2 (2024-08-24)

  • Bump minimum Python version to 3.9 (which is close to EOL, anyway)

  • Improve selectors to interact arbitrarily with boolean Series, numpy arrays and callables, i.e.

    1. pd.Series([True, False]) & isin(model="a") produces the same result as isin(model="a") & pd.Series([True, False]) did earlier.

    2. isin(model="a") & (lambda s: s > 2) is now supported as well.

  • Fix a testing incompatibility introduced by a recent attrs update (24.1.0)

  • Load pint and pint-pandas packages only on first use
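
The selector composition mentioned in this release can be illustrated with a small stand-in. The Selector class and level_isin() below are hypothetical and only cover the selector-on-the-left case; they are not the library's classes, and how the library handles a pandas Series on the left-hand side is not shown here.

```python
import numpy as np
import pandas as pd


def _as_mask(obj, data):
    """Evaluate a selector, callable, Series or array into a boolean numpy array."""
    if callable(obj):
        obj = obj(data)
    return np.asarray(obj, dtype=bool)


class Selector:
    """Hypothetical composable selector wrapping a function data -> boolean mask."""

    def __init__(self, func):
        self.func = func

    def __call__(self, data):
        return _as_mask(self.func, data)

    def __and__(self, other):
        # `other` may be another selector, a callable, a Series or an array.
        return Selector(lambda data: self(data) & _as_mask(other, data))


def level_isin(**filters):
    """Select rows whose index levels take one of the given labels."""

    def func(data):
        mask = np.ones(len(data), dtype=bool)
        for level, values in filters.items():
            labels = data.index.get_level_values(level)
            mask &= labels.isin(np.atleast_1d(values))
        return mask

    return Selector(func)


index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x")], names=["model", "scenario"]
)
data = pd.Series([1, 3, 5], index=index)

with_callable = level_isin(model="a") & (lambda s: s > 2)
with_series = level_isin(model="a") & pd.Series([True, False, True])
```

Calling a composed selector on the data returns a boolean array that can be used for indexing, for example data[with_callable(data)].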

v0.5.1 (2024-05-20)

  • Fix ismatch() to not interpret brackets as regex symbols

  • Fix isin() to always return a series (instead of a numpy array)

v0.5.0 (2024-04-09)

  • BREAKING: Change extractlevel() to drop split levels by default and accordingly rename the governing argument from drop=False to keep=False PR53.

  • Add regex=True argument to extractlevel() to use templates as manual extraction regex, e.g. df.pix.extract(variable=r"Emissions\|(?P<gas>.*?)(?:\|(?P<sector>.*?))?", regex=True) also splits Emissions|CO2 into gas = "CO2" and sector = NaN, while df.pix.extract(variable="Emissions|{gas}|{sector}") would have dropped it.

  • Update projectlevel() to raise KeyError for wrong level names PR52.
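
The effect of the regex from the regex=True entry can be reproduced with pandas' own string methods. This sketch uses Series.str.extract with explicit ^ and $ anchors, which is an assumption about how the pattern is applied; it does not call extractlevel().

```python
import pandas as pd

labels = pd.Series(["Emissions|CO2", "Emissions|CO2|Energy"])

# Named groups become column names; an unmatched optional group yields NaN.
parts = labels.str.extract(r"^Emissions\|(?P<gas>.*?)(?:\|(?P<sector>.*?))?$")
```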

v0.4.2 (2024-04-03)

v0.4.1 (2024-03-20)

  • Add antijoin() for performing anti-joins PR48

  • Update usage guide to cover antijoin() and to put more focus on extractlevel()

v0.4.0 (2023-12-12)

  • BREAKING: The accessors module is now imported implicitly. User code no longer needs to import it PR45

  • Add continuous testing for python 3.12 PR47

  • Add All and None_ completing selector group PR46

  • Fix type hints of function aggregatelevel() PR44

  • Switch from black to ruff for formatting and update pre-commit versions PR43

v0.3.1 (2023-09-18)

  • The new assignlevel() argument ignore_index=True prevents the dataframe and series alignment which became the default in v0.3 (yesterday), since there are valid use cases of the old behaviour PR41

v0.3 (2023-09-17)

v0.2.10 (2023-08-31)

  • Add mode="append" and mode="return" arguments to aggregatelevel(), which extend the dataframe with the aggregated data or return it PR39

  • Add fail_on_reorder argument to semijoin() to raise a ValueError if the resulting data is not in the order of the provided index (helpful in conjunction with assignlevel()) PR37

  • Enhance concat() to also concatenate Index and MultiIndex objects PR37
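
What aggregating level labels and appending the result means can be sketched in plain pandas. The mapping of labels and the toy data are assumptions for illustration; the sketch does not call aggregatelevel() and may differ from it in details such as ordering.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("coal", "DE"), ("gas", "DE"), ("coal", "FR")], names=["fuel", "region"]
)
data = pd.Series([1.0, 2.0, 4.0], index=index)

# Relabel the members of the aggregate, then sum rows that now share an index.
renamed = data.rename(index={"coal": "fossil", "gas": "fossil"}, level="fuel")
aggregated = renamed.groupby(level=["fuel", "region"]).sum()

# "Append"-style result: the original rows plus the aggregated ones.
appended = pd.concat([data, aggregated])
```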

v0.2.10-b1 (2023-07-26)

  • Revise arithmetics module:

    • Add all standard binary ops: add, sub, mul, pow, mod, floordiv, truediv, divmod, radd, rsub, rmul, rpow, rmod, rfloordiv, rtruediv, rdivmod

    • Support in-call assignment of individual levels using assign argument, like div(generation, capacity, assign=dict(variable="capacity_factor"))

    • Add a unit-aware variant for each binary op, like unitadd(), or unitmul(), which updates homogeneous units automatically with the calculation

  • Add fill_value argument to semijoin() for filling joining gaps

  • Add aggregatelevel() for aggregating individual level labels; in PR32

  • Fix formatlevel() to create a simple single-level index, if only a single index remains PR29

  • Add to_tidy() for converting a time-series data-frame to tidy format, as expected by plotting libraries like seaborn or plotly express; in PR31.
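
For readers unfamiliar with the tidy format mentioned for to_tidy(), the following plain-pandas sketch shows the reshaping for a frame with years as columns. The column names "year" and "value" are assumptions, and the sketch does not call to_tidy().

```python
import pandas as pd

wide = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("b", "x")], names=["model", "scenario"]
    ),
    columns=pd.Index([2020, 2030], name="year"),
)

# One row per (model, scenario, year) observation.
tidy = wide.reset_index().melt(
    id_vars=["model", "scenario"], var_name="year", value_name="value"
)
```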

v0.2.9 (2023-07-11)

  • Rename pandas accessor to .pix (.idx is as of now deprecated, but available for the time being) in PR27.

  • Fix projectlevel() on columns of a DataFrame PR28

v0.2.8 (2023-06-24)

  • Units can be converted with convert_unit(), for example convert_unit(df, "km / h"), or with convert_unit(df, {"m / s": "km / h"}) to convert only data with the m / s unit

  • If the openscm-units registry is registered as the pint application registry, then emission conversions between gas species are possible under the correct contexts:

    from pandas_indexing import set_openscm_registry_as_default, convert_unit

    ur = set_openscm_registry_as_default()
    with ur.context("AR6GWP100"):
        df = convert_unit(df, "Mt CO2e/yr")  # or df = df.idx.convert_unit("Mt CO2e/yr")

  • To use unit conversion, you should install with pip install "pandas-indexing[units]" to pull in the optional pint and openscm-units dependencies

  • For more information about unit handling, refer to the units documentation or check the code added in PR17

  • Documentation fixes: MyST notebook rendering from PR20 and new docs for extractlevel() in PR21.

  • Bug fixes: semijoin(), concat() and ismatch() are working again as advertised PR21 and PR24.
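
The dictionary form of the unit conversion, which touches only rows carrying a given unit, can be illustrated without pint. The FACTORS table and convert_selected_units() are hypothetical stand-ins; the library delegates the actual conversion to pint.

```python
import pandas as pd

# Hypothetical conversion factors used instead of a pint registry.
FACTORS = {("m / s", "km / h"): 3.6}


def convert_selected_units(data, mapping, level="unit"):
    """Convert only rows whose unit is a key of `mapping`; leave the rest untouched."""
    frame = data.index.to_frame(index=False)
    units = frame[level]
    factors = [FACTORS[(u, mapping[u])] if u in mapping else 1.0 for u in units]
    frame[level] = units.map(lambda u: mapping.get(u, u))  # relabel converted rows
    converted = data * factors
    converted.index = pd.MultiIndex.from_frame(frame)
    return converted


index = pd.MultiIndex.from_tuples(
    [("speed", "m / s"), ("limit", "km / h")], names=["variable", "unit"]
)
data = pd.Series([10.0, 50.0], index=index)
result = convert_selected_units(data, {"m / s": "km / h"})
```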

v0.2.7 (2023-05-26)

  • Compatibility release to re-include Python 3.8 support and fix CI testing

  • extract() gains single-level index support

  • Minimal doc improvements

v0.2.6 (2023-05-25)

  • extractlevel() can be used on a non-MultiIndex, for example extractlevel(df, "{sector}|{gas}") PR18

  • isin() accepts callable filters PR16, e.g. df.loc[isin(year=lambda s: s>2000)]

  • New function concat() makes concatenation level aware PR14
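
Why level awareness matters for concatenation is easiest to see when two objects carry the same levels in a different order. The sketch below aligns the level order by hand in plain pandas; it is an assumption about what "level aware" covers and does not call the library's concat().

```python
import pandas as pd

a = pd.Series(
    [1.0],
    index=pd.MultiIndex.from_tuples([("a", "x")], names=["model", "scenario"]),
)
b = pd.Series(
    [2.0],
    index=pd.MultiIndex.from_tuples([("y", "b")], names=["scenario", "model"]),
)

# Bring the levels of `b` into the order used by `a` before concatenating.
b_aligned = b.reorder_levels(list(a.index.names))
combined = pd.concat([a, b_aligned])
```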

v0.2.5 (2023-05-04)

v0.2.4 (2023-05-03)

  • Paper-bag release: Fix new accessors unique() and __repr__() and improve tests to catch trivial errors like these earlier PR10

v0.2.3 (2023-05-03)

  • uniquelevel() or .idx.unique returns the unique values of one or multiple levels. PR8

  • summarylevel() creates a string summarizing the index levels and their values. Can also be accessed as df.idx or index.idx PR9

v0.2.2 (2023-05-02)

v0.2.1 (2023-04-08)

  • Restore compatibility with python 3.8

  • Improve typing and add tests for isin() and ismatch()

v0.2 (2023-04-07)

  • isin() and ismatch() are now callable objects, which can be composed with the standard ~, & and | operators into more complex queries

  • add(), subtract(), multiply() and divide() in the new arithmetics module extend the standard pandas operations with join and other arguments known from pandas.DataFrame.align(). They are also available from the idx accessor.

  • Both additions were introduced in PR3
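
The role of a join argument in arithmetic can be sketched with pandas.Series.align followed by an ordinary operator. The data is made up, and the sketch does not call the arithmetics module itself.

```python
import pandas as pd

generation = pd.Series([10.0, 20.0], index=pd.Index(["DE", "FR"], name="region"))
capacity = pd.Series([5.0, 4.0], index=pd.Index(["FR", "IT"], name="region"))

# Plain division aligns on the union of labels and fills gaps with NaN.
outer_ratio = generation / capacity

# Aligning with join="inner" first keeps only labels present in both operands.
left, right = generation.align(capacity, join="inner")
inner_ratio = left / right
```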

v0.1.2 (2023-02-27)

  • Add usage guide to documentation

  • Fix semijoin() method

v0.1.1 (2023-02-27)

v0.1 (2023-02-23)

  • Initial release