Here we are giving a brief introduction in working with IAMC-styled data with pandas and pandas-indexing.
import pandas as pd
Test data set
For experimenting and easy testing pandas-indexing brings along the power sector generation and capacity of the HighRE illustrative modelling pathway from the IPCC AR6 scenario database in IAMC format.
from pandas_indexing.datasets import remindhighre_power
df = remindhighre_power()
df.head()
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | variable | unit | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Capacity|Electricity|Biomass | GW | 42.9843 | 62.4717 | 85.6698 | 108.8834 | 128.5653 | 137.8452 | 132.8316 | 122.7382 | 106.3772 | 86.0265 | 65.6548 | 45.7591 | 14.5047 | 3.9628 | 0.8198 | 0.2150 |
| Capacity|Electricity|Coal | GW | 1238.8766 | 1525.9497 | 1876.5954 | 2075.2321 | 1225.4509 | 182.0149 | 1.3532 | 1.0729 | 0.7763 | 0.5029 | 0.3111 | 0.1836 | 0.0783 | 0.0918 | 0.0965 | 0.0866 | |||
| Capacity|Electricity|Gas | GW | 1096.6736 | 1343.1077 | 1603.1686 | 1852.6485 | 1761.8057 | 1584.4033 | 1423.5393 | 1289.4777 | 863.8716 | 562.8482 | 393.7209 | 349.9091 | 306.3883 | 213.5970 | 52.6045 | 0.2717 | |||
| Capacity|Electricity|Geothermal | GW | 8.1129 | 12.9985 | 24.1612 | 46.6573 | 70.6469 | 83.4506 | 84.4893 | 82.9688 | 79.1452 | 69.6628 | 58.2021 | 48.4007 | 39.2795 | 33.4793 | 27.6530 | 26.0198 | |||
| Capacity|Electricity|Hydro | GW | 806.5684 | 963.3523 | 1239.6174 | 1500.4508 | 1669.3565 | 1786.0468 | 1858.7821 | 1902.6347 | 1925.6535 | 1933.5199 | 1931.1096 | 1922.5689 | 1893.6410 | 1860.8611 | 1846.1020 | 1805.1398 |
Usage styles
pandas-indexing defines two different usage styles:
functions that can be imported from the toplevel module, like
from pandas_indexing import assignlevel assignlevel(df, unit="Mt CO2e/yr")
convenience accessors that are hooking into pandas as extensions, like
df.pix.assign(unit="Mt CO2e/yr)
Most of the functionality is available with both styles under slightly different names. I’ll present the functional style here first (and add the alternative as comments)
from pandas_indexing.core import describelevel
describelevel(df) # or: df.pix
Index:
* model : REMIND-MAgPIE 2.1-4.3 (1)
* scenario : DeepElec_SSP2_HighRE_Budg900 (1)
* region : World (1)
* variable : Capacity|Electricity|Biomass, ... (20)
* unit : GW, GWh/yr (2)
Columns:
* <unnamed> : 2005, 2010, 2015, 2020, 2025, 2030, 2035, 2040, ... 2100 (16)
df.pix
Index:
* model : REMIND-MAgPIE 2.1-4.3 (1)
* scenario : DeepElec_SSP2_HighRE_Budg900 (1)
* region : World (1)
* variable : Capacity|Electricity|Biomass, ... (20)
* unit : GW, GWh/yr (2)
Columns:
* <unnamed> : 2005, 2010, 2015, 2020, 2025, 2030, 2035, 2040, ... 2100 (16)
As one can see the IAMC format is defined by five index levels: model, scenario, variable, unit and region. In this data subset, we have a single model, scenario combination for one region and with several capacity variables starting with Capacity|Electricity| and generation variables starting with Secondary Energy|Electricity|.
The data comes with two different units: GW and GWh/yr, (hopefully) for capacity and generation, respectively.
Selecting data
For using pandas indexes effectively for computations, it makes sense to split the hierarchically variable index out into separate python variables: generation and capacity. The standard pandas tools for this job are pd.DataFrame.loc in conjunction with pd.IndexSlice or pd.DataFrame.query.
pandas_indexing brings ismatch and isin to make this job as easy as possible.
from pandas_indexing import isin, ismatch # no .idx equivalents
df.loc[ismatch(variable="Capacity|**"), 2030]
model scenario region variable unit
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World Capacity|Electricity|Biomass GW 137.8452
Capacity|Electricity|Coal GW 182.0149
Capacity|Electricity|Gas GW 1584.4033
Capacity|Electricity|Geothermal GW 83.4506
Capacity|Electricity|Hydro GW 1786.0468
Capacity|Electricity|Nuclear GW 275.5920
Capacity|Electricity|Oil GW 90.2329
Capacity|Electricity|Other GW 469.5231
Capacity|Electricity|Solar GW 11094.4847
Capacity|Electricity|Wind GW 3522.8061
Name: 2030, dtype: float64
ismatch allows using a glob-like pattern to subset into one or multiple named levels, together with the standard rename method we can get cleaned up capacity and generation data easily:
generation = df.loc[ismatch(variable="Secondary Energy|**")].rename(
index=lambda s: s.removeprefix("Secondary Energy|Electricity|")
)
generation
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | variable | unit | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Biomass | GWh/yr | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 |
| Coal | GWh/yr | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | |||
| Gas | GWh/yr | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | |||
| Geothermal | GWh/yr | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | |||
| Hydro | GWh/yr | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | |||
| Nuclear | GWh/yr | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | |||
| Oil | GWh/yr | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | |||
| Other | GWh/yr | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | |||
| Solar | GWh/yr | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | |||
| Wind | GWh/yr | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
Since this extraction of data is relatively common, extractlevel simplifies this by matching against a format-like template string:
from pandas_indexing import extractlevel, formatlevel
generation = extractlevel(df, variable="Secondary Energy|{carrier}|{fuel}", drop=True)
capacity = extractlevel(df, variable="Capacity|{carrier}|{fuel}", drop=True)
# or: df.pix.extract(variable="Secondary Energy|{carrier}|{fuel}")
generation
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | unit | carrier | fuel | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | GWh/yr | Electricity | Biomass | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 |
| Coal | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | |||||
| Gas | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | |||||
| Geothermal | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | |||||
| Hydro | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | |||||
| Nuclear | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | |||||
| Oil | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | |||||
| Other | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | |||||
| Solar | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | |||||
| Wind | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
The inverse operation is to combine strings back together with formatlevel:
formatlevel(generation, variable="Secondary Energy|{carrier}|{fuel}", drop=True)
# or: df.pix.format(variable="Secondary Energy|{carrier}|{fuel}")
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | unit | variable | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | GWh/yr | Secondary Energy|Electricity|Biomass | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 |
| Secondary Energy|Electricity|Coal | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | ||||
| Secondary Energy|Electricity|Gas | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | ||||
| Secondary Energy|Electricity|Geothermal | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | ||||
| Secondary Energy|Electricity|Hydro | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | ||||
| Secondary Energy|Electricity|Nuclear | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | ||||
| Secondary Energy|Electricity|Oil | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | ||||
| Secondary Energy|Electricity|Other | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | ||||
| Secondary Energy|Electricity|Solar | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | ||||
| Secondary Energy|Electricity|Wind | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
With generation and capacity conveniently split into separate variables, we can calculate capacity factors (ratios of generation and capacity) directly, as long as we take care of removing the conflicting unit level. Similarly to ismatch, isin can be provided as an argument to .loc[] to select on named index levels with the difference that only exact matches are considered.
capacity_factor = generation.droplevel("unit") / 8760 / capacity.droplevel("unit")
capacity_factor.loc[isin(fuel=["Solar", "Wind", "Hydro", "Geothermal"]), 2030:2051]
| 2030 | 2035 | 2040 | 2045 | 2050 | |||||
|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | carrier | fuel | |||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Electricity | Geothermal | 0.849984 | 0.850005 | 0.849989 | 0.849986 | 0.850021 |
| Hydro | 0.394597 | 0.394474 | 0.396240 | 0.398217 | 0.399716 | ||||
| Solar | 0.161156 | 0.154237 | 0.151994 | 0.150698 | 0.149422 | ||||
| Wind | 0.263820 | 0.275781 | 0.289083 | 0.302018 | 0.298163 |
Instead of dropping the unit level, there is also a set of unit-aware calculation functions, so that this full capacity factor calculation can be performed in very few steps (the unit aware calculation realizes correctly that the capacity factor is unit-less):
generation = extractlevel(df, variable="Secondary Energy|{carrier}|{fuel}", drop=True)
capacity = extractlevel(df, variable="Capacity|{carrier}|{fuel}", drop=True)
generation.pix.unitdiv(capacity)
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | unit | carrier | fuel | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Electricity | Biomass | 0.624410 | 0.622179 | 0.621779 | 0.622625 | 0.640563 | 0.703139 | 0.846429 | 1.091001 | 1.458054 | 1.950832 | 2.628978 | 3.851051 | 12.305194 | 44.837349 | 214.719844 | 835.312733 | |
| Coal | 0.683221 | 0.616954 | 0.562079 | 0.591347 | 0.562819 | 0.547376 | 0.543278 | 0.534583 | 0.542898 | 0.548193 | 0.570406 | 0.586816 | 0.647522 | 0.655854 | 0.656748 | 0.658644 | |||||
| Gas | 0.396947 | 0.428965 | 0.424722 | 0.434726 | 0.418885 | 0.327710 | 0.232976 | 0.230218 | 0.231365 | 0.246935 | 0.189458 | 0.172202 | 0.090228 | 0.090036 | 0.090358 | 0.163281 | |||||
| Geothermal | 0.849531 | 0.849339 | 0.849477 | 0.849434 | 0.849404 | 0.849402 | 0.849423 | 0.849407 | 0.849404 | 0.849440 | 0.849395 | 0.849412 | 0.849410 | 0.849387 | 0.849469 | 0.849447 | |||||
| Hydro | 0.421045 | 0.421212 | 0.403690 | 0.403108 | 0.397248 | 0.394327 | 0.394204 | 0.395969 | 0.397945 | 0.399443 | 0.400637 | 0.402054 | 0.404988 | 0.408025 | 0.409803 | 0.412868 | |||||
| Nuclear | 0.799454 | 0.745660 | 0.736736 | 0.731658 | 0.756045 | 0.776634 | 0.786077 | 0.799452 | 0.799453 | 0.799455 | 0.799442 | 0.799441 | 0.799475 | 0.799417 | 0.800720 | 0.833897 | |||||
| Oil | 0.242962 | 0.250517 | 0.256895 | 0.249993 | 0.254847 | 0.247864 | 0.249823 | 0.249887 | 0.249750 | 0.249769 | 0.250253 | 0.264067 | 0.264067 | 0.264067 | 0.261885 | 0.251493 | |||||
| Other | NaN | NaN | 0.149296 | 0.149237 | 0.125370 | 0.119997 | 0.109662 | 0.100468 | 0.095265 | 0.093251 | 0.093126 | 0.093108 | 0.093314 | 0.093045 | 0.092409 | 0.092235 | |||||
| Solar | 0.173656 | 0.191417 | 0.135744 | 0.162338 | 0.165758 | 0.161046 | 0.154131 | 0.151890 | 0.150595 | 0.149319 | 0.147843 | 0.146626 | 0.143060 | 0.140284 | 0.138511 | 0.136580 | |||||
| Wind | 0.385930 | 0.374443 | 0.230212 | 0.253129 | 0.260839 | 0.263639 | 0.275592 | 0.288885 | 0.301811 | 0.297959 | 0.295101 | 0.291983 | 0.283467 | 0.279135 | 0.273659 | 0.270820 |
Under the hood isin and ismatch generate Selector objects. They can be composed into complex queries intuitively, which are kept as a hierarchical structure of objects.
query = isin(fuel=["Coal", "Gas", "Nuclear"], unit="GW") & ~ismatch(fuel="S*")
query
And(a=Isin(filters={'fuel': ['Coal', 'Gas', 'Nuclear'], 'unit': 'GW'}), b=Not(a=Ismatch(filters={'fuel': 'S*'}, regex=False)))
For evaluating such a query one needs to pass in a data object to produce a boolean mask. Since pandas .loc indexer does exactly that, these queries work as expected.
query(capacity)
model scenario region unit carrier fuel
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Biomass False
Coal True
Gas True
Geothermal False
Hydro False
Nuclear True
Oil False
Other False
Solar False
Wind False
dtype: bool
Note
It is only possible from version 0.5.2 to use a pandas boolean series in front of a selector; ie.
(capacity[2030] > 250) & isin(variable=["Coal", "Gas", "Nuclear"], unit="GW")
works, as you would expect it, in the same way as
isin(variable=["Coal", "Gas", "Nuclear"], unit="GW") & (capacity[2030] > 250)
high_capacity_fossil = capacity.loc[
isin(fuel=["Coal", "Gas", "Nuclear"], unit="GW") & (capacity[2030] > 250),
:2041,
]
high_capacity_fossil
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | unit | carrier | fuel | ||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | GW | Electricity | Gas | 1096.6736 | 1343.1077 | 1603.1686 | 1852.6485 | 1761.8057 | 1584.4033 | 1423.5393 | 1289.4777 |
| Nuclear | 408.3700 | 404.3094 | 379.1375 | 345.2239 | 310.1729 | 275.5920 | 234.7028 | 214.4376 |
The simple fact that this is an operation on [], means that we can also use it to modify values in-place:
high_capacity_fossil.loc[isin(fuel="Gas"), 2030:] = 1000.0
high_capacity_fossil
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | unit | carrier | fuel | ||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | GW | Electricity | Gas | 1096.6736 | 1343.1077 | 1603.1686 | 1852.6485 | 1761.8057 | 1000.000 | 1000.0000 | 1000.0000 |
| Nuclear | 408.3700 | 404.3094 | 379.1375 | 345.2239 | 310.1729 | 275.592 | 234.7028 | 214.4376 |
Most methods in pandas_indexing do not care whether they are run on an index, a series or a dataframe, but will transiently take care of handing them down to the appropriate level:
fossil_series = (
capacity.loc[isin(fuel=["Coal", "Gas", "Nuclear"]), [2030, 2040, 2050, 2060]]
.rename_axis(columns="year")
.stack()
)
fossil_series
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
dtype: float64
fossil_series.loc[isin(year=[2030, 2050])]
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2030 182.0149
2050 0.5029
Gas 2030 1584.4033
2050 562.8482
Nuclear 2030 275.5920
2050 156.7766
dtype: float64
isin(fossil_series.index, fuel="Nuclear")
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2030 False
2040 False
2050 False
2060 False
Gas 2030 False
2040 False
2050 False
2060 False
Nuclear 2030 True
2040 True
2050 True
2060 True
dtype: bool
Selecting based on a multi-index
If we need pairs of data like Coal in 2030 and Gas in 2035 and Nuclear in 2040 and 2050, then we can pass a multiindex to isin:
idx = pd.MultiIndex.from_tuples(
[("Coal", 2030), ("Gas", 2035), ("Nuclear", 2040), ("Nuclear", 2050)],
names=["fuel", "year"],
)
idx
MultiIndex([( 'Coal', 2030),
( 'Gas', 2035),
('Nuclear', 2040),
('Nuclear', 2050)],
names=['fuel', 'year'])
fossil_series.loc[isin(idx)]
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2030 182.0149
Nuclear 2040 214.4376
2050 156.7766
dtype: float64
Since ("Gas", 2035) is not part of the original fossil_series it is silently ignored, just like with other uses of isin.
Alternatively, the same result can be retrieved with the more powerful semijoin using an "inner" join:
from pandas_indexing import semijoin
semijoin(
fossil_series, idx, how="inner"
) # or: fossil_series.pix.semijoin(idx, how="inner")
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2030 182.0149
Nuclear 2040 214.4376
2050 156.7766
dtype: float64
A "right"-join on the other hand will follow the order and keep all elements of the provided idx. Since ("Gas", 2035) is not part of the original fossil_series it shows up as NaNs here:
semijoin(fossil_series, idx, how="right")
fuel year model scenario region unit carrier
Coal 2030 REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity 182.0149
Gas 2035 NaN NaN NaN NaN NaN NaN
Nuclear 2040 REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity 214.4376
2050 REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity 156.7766
dtype: float64
It is also possible to get the inverted result, with only the not matching rows, with an antijoin
from pandas_indexing import antijoin
antijoin(fossil_series, idx)
# or: fossil_series.pix.antijoin(idx)
model scenario region unit carrier fuel year
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 World GW Electricity Coal 2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2060 92.0667
dtype: float64
Projecting levels
Often after selecting the right subsets, ie the interesting model or scenario it makes sense to consolidate the data to a given set of levels. That is what projectlevel is used for:
from pandas_indexing import projectlevel
simple_fossil_series = projectlevel(fossil_series, ["fuel", "year"])
# or: fossil_series.pix.project(["fuel", "year"])
simple_fossil_series
fuel year
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
dtype: float64
projectlevel reduces the levels attached to a multiindex to the ones explicitly named. It is basically the complement to droplevel which removes the listed names
projectlevel(fossil_series, ["model", "scenario"]) == fossil_series.droplevel(
["carrier", "fuel", "unit", "region", "year"]
)
model scenario
REMIND-MAgPIE 2.1-4.3 DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
DeepElec_SSP2_HighRE_Budg900 True
dtype: bool
Assigning to levels
assignlevel allows to modify individual values with helpful keyword arguments,
from pandas_indexing import assignlevel
assignlevel(df, variable="Updated|" + projectlevel(df.index, "variable"), unit="bla")
# or: df.pix.assign(variable="Updated|" + df.index.pix.project("variable"), unit="bla")
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | variable | unit | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Updated|Capacity|Electricity|Biomass | bla | 4.298430e+01 | 6.247170e+01 | 8.566980e+01 | 1.088834e+02 | 1.285653e+02 | 1.378452e+02 | 1.328316e+02 | 1.227382e+02 | 1.063772e+02 | 8.602650e+01 | 6.565480e+01 | 4.575910e+01 | 1.450470e+01 | 3.962800e+00 | 8.198000e-01 | 2.150000e-01 |
| Updated|Capacity|Electricity|Coal | bla | 1.238877e+03 | 1.525950e+03 | 1.876595e+03 | 2.075232e+03 | 1.225451e+03 | 1.820149e+02 | 1.353200e+00 | 1.072900e+00 | 7.763000e-01 | 5.029000e-01 | 3.111000e-01 | 1.836000e-01 | 7.830000e-02 | 9.180000e-02 | 9.650000e-02 | 8.660000e-02 | |||
| Updated|Capacity|Electricity|Gas | bla | 1.096674e+03 | 1.343108e+03 | 1.603169e+03 | 1.852649e+03 | 1.761806e+03 | 1.584403e+03 | 1.423539e+03 | 1.289478e+03 | 8.638716e+02 | 5.628482e+02 | 3.937209e+02 | 3.499091e+02 | 3.063883e+02 | 2.135970e+02 | 5.260450e+01 | 2.717000e-01 | |||
| Updated|Capacity|Electricity|Geothermal | bla | 8.112900e+00 | 1.299850e+01 | 2.416120e+01 | 4.665730e+01 | 7.064690e+01 | 8.345060e+01 | 8.448930e+01 | 8.296880e+01 | 7.914520e+01 | 6.966280e+01 | 5.820210e+01 | 4.840070e+01 | 3.927950e+01 | 3.347930e+01 | 2.765300e+01 | 2.601980e+01 | |||
| Updated|Capacity|Electricity|Hydro | bla | 8.065684e+02 | 9.633523e+02 | 1.239617e+03 | 1.500451e+03 | 1.669357e+03 | 1.786047e+03 | 1.858782e+03 | 1.902635e+03 | 1.925653e+03 | 1.933520e+03 | 1.931110e+03 | 1.922569e+03 | 1.893641e+03 | 1.860861e+03 | 1.846102e+03 | 1.805140e+03 | |||
| Updated|Capacity|Electricity|Nuclear | bla | 4.083700e+02 | 4.043094e+02 | 3.791375e+02 | 3.452239e+02 | 3.101729e+02 | 2.755920e+02 | 2.347028e+02 | 2.144376e+02 | 1.860849e+02 | 1.567766e+02 | 1.239472e+02 | 9.206670e+01 | 3.877600e+01 | 8.455000e+00 | 5.580000e-01 | 1.900000e-02 | |||
| Updated|Capacity|Electricity|Oil | bla | 5.112499e+02 | 4.630961e+02 | 3.709638e+02 | 2.817401e+02 | 1.932768e+02 | 9.023290e+01 | 3.159640e+01 | 2.190000e+01 | 1.450230e+01 | 5.341200e+00 | 3.064300e+00 | 1.200000e-02 | 1.200000e-02 | 1.200000e-02 | 1.210000e-02 | 1.260000e-02 | |||
| Updated|Capacity|Electricity|Other | bla | 0.000000e+00 | 0.000000e+00 | 3.396000e-01 | 1.656200e+00 | 8.110980e+01 | 4.695231e+02 | 1.253078e+03 | 2.254017e+03 | 3.341699e+03 | 4.267072e+03 | 4.925403e+03 | 5.433398e+03 | 6.432168e+03 | 7.478326e+03 | 8.321384e+03 | 9.334424e+03 | |||
| Updated|Capacity|Electricity|Solar | bla | 2.682400e+00 | 4.961380e+01 | 2.198533e+02 | 6.509654e+02 | 3.785689e+03 | 1.109448e+04 | 1.906378e+04 | 2.556775e+04 | 3.097003e+04 | 3.608447e+04 | 4.110399e+04 | 4.527457e+04 | 5.443399e+04 | 6.305545e+04 | 6.884808e+04 | 7.743017e+04 | |||
| Updated|Capacity|Electricity|Wind | bla | 3.226040e+01 | 1.811785e+02 | 4.124751e+02 | 6.682517e+02 | 1.616334e+03 | 3.522806e+03 | 5.459596e+03 | 6.821745e+03 | 7.986305e+03 | 9.064963e+03 | 9.934779e+03 | 1.082886e+04 | 1.320022e+04 | 1.459313e+04 | 1.601071e+04 | 1.736882e+04 | |||
| Updated|Secondary Energy|Electricity|Biomass | bla | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 | |||
| Updated|Secondary Energy|Electricity|Coal | bla | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | |||
| Updated|Secondary Energy|Electricity|Gas | bla | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | |||
| Updated|Secondary Energy|Electricity|Geothermal | bla | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | |||
| Updated|Secondary Energy|Electricity|Hydro | bla | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | |||
| Updated|Secondary Energy|Electricity|Nuclear | bla | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | |||
| Updated|Secondary Energy|Electricity|Oil | bla | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | |||
| Updated|Secondary Energy|Electricity|Other | bla | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | |||
| Updated|Secondary Energy|Electricity|Solar | bla | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | |||
| Updated|Secondary Energy|Electricity|Wind | bla | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
This particular case is even more clearly handled with formatlevel:
from pandas_indexing import formatlevel
formatlevel(df, variable="Updated|{variable}", unit="bla")
# or: df.pix.format(variable=...)
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | variable | unit | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Updated|Capacity|Electricity|Biomass | bla | 4.298430e+01 | 6.247170e+01 | 8.566980e+01 | 1.088834e+02 | 1.285653e+02 | 1.378452e+02 | 1.328316e+02 | 1.227382e+02 | 1.063772e+02 | 8.602650e+01 | 6.565480e+01 | 4.575910e+01 | 1.450470e+01 | 3.962800e+00 | 8.198000e-01 | 2.150000e-01 |
| Updated|Capacity|Electricity|Coal | bla | 1.238877e+03 | 1.525950e+03 | 1.876595e+03 | 2.075232e+03 | 1.225451e+03 | 1.820149e+02 | 1.353200e+00 | 1.072900e+00 | 7.763000e-01 | 5.029000e-01 | 3.111000e-01 | 1.836000e-01 | 7.830000e-02 | 9.180000e-02 | 9.650000e-02 | 8.660000e-02 | |||
| Updated|Capacity|Electricity|Gas | bla | 1.096674e+03 | 1.343108e+03 | 1.603169e+03 | 1.852649e+03 | 1.761806e+03 | 1.584403e+03 | 1.423539e+03 | 1.289478e+03 | 8.638716e+02 | 5.628482e+02 | 3.937209e+02 | 3.499091e+02 | 3.063883e+02 | 2.135970e+02 | 5.260450e+01 | 2.717000e-01 | |||
| Updated|Capacity|Electricity|Geothermal | bla | 8.112900e+00 | 1.299850e+01 | 2.416120e+01 | 4.665730e+01 | 7.064690e+01 | 8.345060e+01 | 8.448930e+01 | 8.296880e+01 | 7.914520e+01 | 6.966280e+01 | 5.820210e+01 | 4.840070e+01 | 3.927950e+01 | 3.347930e+01 | 2.765300e+01 | 2.601980e+01 | |||
| Updated|Capacity|Electricity|Hydro | bla | 8.065684e+02 | 9.633523e+02 | 1.239617e+03 | 1.500451e+03 | 1.669357e+03 | 1.786047e+03 | 1.858782e+03 | 1.902635e+03 | 1.925653e+03 | 1.933520e+03 | 1.931110e+03 | 1.922569e+03 | 1.893641e+03 | 1.860861e+03 | 1.846102e+03 | 1.805140e+03 | |||
| Updated|Capacity|Electricity|Nuclear | bla | 4.083700e+02 | 4.043094e+02 | 3.791375e+02 | 3.452239e+02 | 3.101729e+02 | 2.755920e+02 | 2.347028e+02 | 2.144376e+02 | 1.860849e+02 | 1.567766e+02 | 1.239472e+02 | 9.206670e+01 | 3.877600e+01 | 8.455000e+00 | 5.580000e-01 | 1.900000e-02 | |||
| Updated|Capacity|Electricity|Oil | bla | 5.112499e+02 | 4.630961e+02 | 3.709638e+02 | 2.817401e+02 | 1.932768e+02 | 9.023290e+01 | 3.159640e+01 | 2.190000e+01 | 1.450230e+01 | 5.341200e+00 | 3.064300e+00 | 1.200000e-02 | 1.200000e-02 | 1.200000e-02 | 1.210000e-02 | 1.260000e-02 | |||
| Updated|Capacity|Electricity|Other | bla | 0.000000e+00 | 0.000000e+00 | 3.396000e-01 | 1.656200e+00 | 8.110980e+01 | 4.695231e+02 | 1.253078e+03 | 2.254017e+03 | 3.341699e+03 | 4.267072e+03 | 4.925403e+03 | 5.433398e+03 | 6.432168e+03 | 7.478326e+03 | 8.321384e+03 | 9.334424e+03 | |||
| Updated|Capacity|Electricity|Solar | bla | 2.682400e+00 | 4.961380e+01 | 2.198533e+02 | 6.509654e+02 | 3.785689e+03 | 1.109448e+04 | 1.906378e+04 | 2.556775e+04 | 3.097003e+04 | 3.608447e+04 | 4.110399e+04 | 4.527457e+04 | 5.443399e+04 | 6.305545e+04 | 6.884808e+04 | 7.743017e+04 | |||
| Updated|Capacity|Electricity|Wind | bla | 3.226040e+01 | 1.811785e+02 | 4.124751e+02 | 6.682517e+02 | 1.616334e+03 | 3.522806e+03 | 5.459596e+03 | 6.821745e+03 | 7.986305e+03 | 9.064963e+03 | 9.934779e+03 | 1.082886e+04 | 1.320022e+04 | 1.459313e+04 | 1.601071e+04 | 1.736882e+04 | |||
| Updated|Secondary Energy|Electricity|Biomass | bla | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 | |||
| Updated|Secondary Energy|Electricity|Coal | bla | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | |||
| Updated|Secondary Energy|Electricity|Gas | bla | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | |||
| Updated|Secondary Energy|Electricity|Geothermal | bla | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | |||
| Updated|Secondary Energy|Electricity|Hydro | bla | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | |||
| Updated|Secondary Energy|Electricity|Nuclear | bla | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | |||
| Updated|Secondary Energy|Electricity|Oil | bla | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | |||
| Updated|Secondary Energy|Electricity|Other | bla | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | |||
| Updated|Secondary Energy|Electricity|Solar | bla | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | |||
| Updated|Secondary Energy|Electricity|Wind | bla | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
Both functions avoid having to rely on reset_index, set_index pairs, which are painful for large data, since set_index is expensive!
df.reset_index().assign(variable="Capacity").set_index(df.index.names)
| 2005 | 2010 | 2015 | 2020 | 2025 | 2030 | 2035 | 2040 | 2045 | 2050 | 2055 | 2060 | 2070 | 2080 | 2090 | 2100 | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| model | scenario | region | variable | unit | ||||||||||||||||
| REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | World | Capacity | GW | 4.298430e+01 | 6.247170e+01 | 8.566980e+01 | 1.088834e+02 | 1.285653e+02 | 1.378452e+02 | 1.328316e+02 | 1.227382e+02 | 1.063772e+02 | 8.602650e+01 | 6.565480e+01 | 4.575910e+01 | 1.450470e+01 | 3.962800e+00 | 8.198000e-01 | 2.150000e-01 |
| GW | 1.238877e+03 | 1.525950e+03 | 1.876595e+03 | 2.075232e+03 | 1.225451e+03 | 1.820149e+02 | 1.353200e+00 | 1.072900e+00 | 7.763000e-01 | 5.029000e-01 | 3.111000e-01 | 1.836000e-01 | 7.830000e-02 | 9.180000e-02 | 9.650000e-02 | 8.660000e-02 | ||||
| GW | 1.096674e+03 | 1.343108e+03 | 1.603169e+03 | 1.852649e+03 | 1.761806e+03 | 1.584403e+03 | 1.423539e+03 | 1.289478e+03 | 8.638716e+02 | 5.628482e+02 | 3.937209e+02 | 3.499091e+02 | 3.063883e+02 | 2.135970e+02 | 5.260450e+01 | 2.717000e-01 | ||||
| GW | 8.112900e+00 | 1.299850e+01 | 2.416120e+01 | 4.665730e+01 | 7.064690e+01 | 8.345060e+01 | 8.448930e+01 | 8.296880e+01 | 7.914520e+01 | 6.966280e+01 | 5.820210e+01 | 4.840070e+01 | 3.927950e+01 | 3.347930e+01 | 2.765300e+01 | 2.601980e+01 | ||||
| GW | 8.065684e+02 | 9.633523e+02 | 1.239617e+03 | 1.500451e+03 | 1.669357e+03 | 1.786047e+03 | 1.858782e+03 | 1.902635e+03 | 1.925653e+03 | 1.933520e+03 | 1.931110e+03 | 1.922569e+03 | 1.893641e+03 | 1.860861e+03 | 1.846102e+03 | 1.805140e+03 | ||||
| GW | 4.083700e+02 | 4.043094e+02 | 3.791375e+02 | 3.452239e+02 | 3.101729e+02 | 2.755920e+02 | 2.347028e+02 | 2.144376e+02 | 1.860849e+02 | 1.567766e+02 | 1.239472e+02 | 9.206670e+01 | 3.877600e+01 | 8.455000e+00 | 5.580000e-01 | 1.900000e-02 | ||||
| GW | 5.112499e+02 | 4.630961e+02 | 3.709638e+02 | 2.817401e+02 | 1.932768e+02 | 9.023290e+01 | 3.159640e+01 | 2.190000e+01 | 1.450230e+01 | 5.341200e+00 | 3.064300e+00 | 1.200000e-02 | 1.200000e-02 | 1.200000e-02 | 1.210000e-02 | 1.260000e-02 | ||||
| GW | 0.000000e+00 | 0.000000e+00 | 3.396000e-01 | 1.656200e+00 | 8.110980e+01 | 4.695231e+02 | 1.253078e+03 | 2.254017e+03 | 3.341699e+03 | 4.267072e+03 | 4.925403e+03 | 5.433398e+03 | 6.432168e+03 | 7.478326e+03 | 8.321384e+03 | 9.334424e+03 | ||||
| GW | 2.682400e+00 | 4.961380e+01 | 2.198533e+02 | 6.509654e+02 | 3.785689e+03 | 1.109448e+04 | 1.906378e+04 | 2.556775e+04 | 3.097003e+04 | 3.608447e+04 | 4.110399e+04 | 4.527457e+04 | 5.443399e+04 | 6.305545e+04 | 6.884808e+04 | 7.743017e+04 | ||||
| GW | 3.226040e+01 | 1.811785e+02 | 4.124751e+02 | 6.682517e+02 | 1.616334e+03 | 3.522806e+03 | 5.459596e+03 | 6.821745e+03 | 7.986305e+03 | 9.064963e+03 | 9.934779e+03 | 1.082886e+04 | 1.320022e+04 | 1.459313e+04 | 1.601071e+04 | 1.736882e+04 | ||||
| GWh/yr | 2.352778e+05 | 3.407222e+05 | 4.669444e+05 | 5.942778e+05 | 7.219167e+05 | 8.496389e+05 | 9.855833e+05 | 1.173833e+06 | 1.359639e+06 | 1.471139e+06 | 1.513056e+06 | 1.544750e+06 | 1.564583e+06 | 1.557556e+06 | 1.543056e+06 | 1.574306e+06 | ||||
| GWh/yr | 7.419778e+06 | 8.252667e+06 | 9.246333e+06 | 1.075747e+07 | 6.045972e+06 | 8.733611e+05 | 6.444444e+03 | 5.027778e+03 | 3.694444e+03 | 2.416667e+03 | 1.555556e+03 | 9.444444e+02 | 4.444444e+02 | 5.277778e+02 | 5.555556e+02 | 5.000000e+02 | ||||
| GWh/yr | 3.816028e+06 | 5.050500e+06 | 5.968778e+06 | 7.060083e+06 | 6.469250e+06 | 4.551528e+06 | 2.907250e+06 | 2.602278e+06 | 1.752056e+06 | 1.218361e+06 | 6.538889e+05 | 5.281944e+05 | 2.423333e+05 | 1.685833e+05 | 4.166667e+04 | 3.888889e+02 | ||||
| GWh/yr | 6.041667e+04 | 9.677778e+04 | 1.799167e+05 | 3.474167e+05 | 5.260278e+05 | 6.213611e+05 | 6.291111e+05 | 6.177778e+05 | 5.893056e+05 | 5.187222e+05 | 4.333611e+05 | 3.603889e+05 | 2.924722e+05 | 2.492778e+05 | 2.059167e+05 | 1.937500e+05 | ||||
| GWh/yr | 2.976944e+06 | 3.557028e+06 | 4.386694e+06 | 5.302056e+06 | 5.813167e+06 | 6.173778e+06 | 6.423194e+06 | 6.604167e+06 | 6.717417e+06 | 6.770250e+06 | 6.782028e+06 | 6.775917e+06 | 6.722667e+06 | 6.655833e+06 | 6.631806e+06 | 6.533167e+06 | ||||
| GWh/yr | 2.861861e+06 | 2.642750e+06 | 2.448556e+06 | 2.214167e+06 | 2.055667e+06 | 1.876222e+06 | 1.617278e+06 | 1.502778e+06 | 1.304083e+06 | 1.098694e+06 | 8.686111e+05 | 6.451944e+05 | 2.717500e+05 | 5.925000e+04 | 3.916667e+03 | 1.388889e+02 | ||||
| GWh/yr | 1.088861e+06 | 1.016972e+06 | 8.353889e+05 | 6.174167e+05 | 4.317778e+05 | 1.960556e+05 | 6.919444e+04 | 4.797222e+04 | 3.175000e+04 | 1.169444e+04 | 6.722222e+03 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | 2.777778e+01 | ||||
| GWh/yr | 0.000000e+00 | 0.000000e+00 | 4.444444e+02 | 2.166667e+03 | 8.913889e+04 | 4.938889e+05 | 1.204583e+06 | 1.985111e+06 | 2.790639e+06 | 3.488083e+06 | 4.020806e+06 | 4.434639e+06 | 5.261472e+06 | 6.099583e+06 | 6.740833e+06 | 7.547194e+06 | ||||
| GWh/yr | 4.083333e+03 | 8.325000e+04 | 2.616111e+05 | 9.263611e+05 | 5.500722e+06 | 1.566242e+07 | 2.575731e+07 | 3.404253e+07 | 4.088394e+07 | 4.723214e+07 | 5.327061e+07 | 5.819236e+07 | 6.826367e+07 | 7.754106e+07 | 8.359456e+07 | 9.270394e+07 | ||||
| GWh/yr | 1.091389e+05 | 5.946944e+05 | 8.323889e+05 | 1.482806e+06 | 3.695778e+06 | 8.141417e+06 | 1.318953e+07 | 1.727517e+07 | 2.112919e+07 | 2.367689e+07 | 2.569986e+07 | 2.771675e+07 | 3.280081e+07 | 3.570786e+07 | 3.840797e+07 | 4.123369e+07 |
Examining level values and level combinations
We already encountered the possibility to get an overview of the available levels and their values with describelevel:
describelevel(df) # or: df.idx
Index:
* model : REMIND-MAgPIE 2.1-4.3 (1)
* scenario : DeepElec_SSP2_HighRE_Budg900 (1)
* region : World (1)
* variable : Capacity|Electricity|Biomass, ... (20)
* unit : GW, GWh/yr (2)
Columns:
* <unnamed> : 2005, 2010, 2015, 2020, 2025, 2030, 2035, 2040, ... 2100 (16)
Often it is necessary to get programmatic access to the unique values of one or more levels:
from pandas_indexing import uniquelevel
uniquelevel(df, "variable")
# or: df.pix.unique("variable")
# or in vanilla pandas: df.index.unique("variable")
Index(['Capacity|Electricity|Biomass', 'Capacity|Electricity|Coal',
'Capacity|Electricity|Gas', 'Capacity|Electricity|Geothermal',
'Capacity|Electricity|Hydro', 'Capacity|Electricity|Nuclear',
'Capacity|Electricity|Oil', 'Capacity|Electricity|Other',
'Capacity|Electricity|Solar', 'Capacity|Electricity|Wind',
'Secondary Energy|Electricity|Biomass',
'Secondary Energy|Electricity|Coal', 'Secondary Energy|Electricity|Gas',
'Secondary Energy|Electricity|Geothermal',
'Secondary Energy|Electricity|Hydro',
'Secondary Energy|Electricity|Nuclear',
'Secondary Energy|Electricity|Oil',
'Secondary Energy|Electricity|Other',
'Secondary Energy|Electricity|Solar',
'Secondary Energy|Electricity|Wind'],
dtype='object', name='variable')
uniquelevel(df, ["variable", "unit"])
MultiIndex([( 'Capacity|Electricity|Biomass', 'GW'),
( 'Capacity|Electricity|Coal', 'GW'),
( 'Capacity|Electricity|Gas', 'GW'),
( 'Capacity|Electricity|Geothermal', 'GW'),
( 'Capacity|Electricity|Hydro', 'GW'),
( 'Capacity|Electricity|Nuclear', 'GW'),
( 'Capacity|Electricity|Oil', 'GW'),
( 'Capacity|Electricity|Other', 'GW'),
( 'Capacity|Electricity|Solar', 'GW'),
( 'Capacity|Electricity|Wind', 'GW'),
( 'Secondary Energy|Electricity|Biomass', 'GWh/yr'),
( 'Secondary Energy|Electricity|Coal', 'GWh/yr'),
( 'Secondary Energy|Electricity|Gas', 'GWh/yr'),
('Secondary Energy|Electricity|Geothermal', 'GWh/yr'),
( 'Secondary Energy|Electricity|Hydro', 'GWh/yr'),
( 'Secondary Energy|Electricity|Nuclear', 'GWh/yr'),
( 'Secondary Energy|Electricity|Oil', 'GWh/yr'),
( 'Secondary Energy|Electricity|Other', 'GWh/yr'),
( 'Secondary Energy|Electricity|Solar', 'GWh/yr'),
( 'Secondary Energy|Electricity|Wind', 'GWh/yr')],
names=['variable', 'unit'])
BEWARE: Pitfalls
concat ignores level order, so make sure to reorder_levels them
pd.concat([simple_fossil_series, simple_fossil_series.swaplevel()])
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
2030 Coal 182.0149
2040 Coal 1.0729
2050 Coal 0.5029
2060 Coal 0.1836
2030 Gas 1584.4033
2040 Gas 1289.4777
2050 Gas 562.8482
2060 Gas 349.9091
2030 Nuclear 275.5920
2040 Nuclear 214.4376
2050 Nuclear 156.7766
2060 Nuclear 92.0667
dtype: float64
pd.concat(
[
simple_fossil_series,
simple_fossil_series.swaplevel().reorder_levels(
simple_fossil_series.index.names
),
]
)
fuel year
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
dtype: float64
Therefore, pandas-indexing brings a variant which does this automatically:
from pandas_indexing import concat
concat([simple_fossil_series, simple_fossil_series.swaplevel()])
fuel year
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
Coal 2030 182.0149
2040 1.0729
2050 0.5029
2060 0.1836
Gas 2030 1584.4033
2040 1289.4777
2050 562.8482
2060 349.9091
Nuclear 2030 275.5920
2040 214.4376
2050 156.7766
2060 92.0667
dtype: float64
Additional helpful multi-index helpers
MultiIndex rendering is often annoying to read, since the important information might get abbreviated away, then converting it into a dataframe is helpful
projectlevel(fossil_series.index, ["model", "scenario", "fuel"])
MultiIndex([('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...),
('REMIND-MAgPIE 2.1-4.3', 'DeepElec_SSP2_HighRE_Budg900', ...)],
names=['model', 'scenario', 'fuel'])
projectlevel(fossil_series.index, ["model", "scenario", "fuel"]).to_frame(index=False)
| model | scenario | fuel | |
|---|---|---|---|
| 0 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Coal |
| 1 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Coal |
| 2 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Coal |
| 3 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Coal |
| 4 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Gas |
| 5 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Gas |
| 6 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Gas |
| 7 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Gas |
| 8 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Nuclear |
| 9 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Nuclear |
| 10 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Nuclear |
| 11 | REMIND-MAgPIE 2.1-4.3 | DeepElec_SSP2_HighRE_Budg900 | Nuclear |