`polars pivot` for dataframe

Pivot a DataFrame from long to wide format.

Flags

--on, -o {any}: Column names for pivoting.
--on-cols, -c {any}: column names used as value columns
--index, -i {any}: Selector or column names for indexes.
--values {any}: Selector or column names used as value columns.
--aggregate, -a {any}: Aggregation to apply when pivoting. The following are supported: first, sum, min, max, mean, median, count, last, or a custom expression.
--separator, -p {string}: Delimiter in generated column names in case of multiple values columns (default '_').
--maintain-order: Maintain Order.
--streamable, -t: Whether or not to use the polars streaming engine. Only valid for lazy dataframes
--stable: Perform a stable pivot.

Input/output types:

input	output
polars_dataframe	polars_dataframe
polars_lazyframe	polars_lazyframe

Examples

Given a set of test scores, reshape so we have one row per student, with different subjects as columns, and their test_1 scores as values

> {
        "name": ["Cady", "Cady", "Karen", "Karen"],
        "subject": ["maths", "physics", "maths", "physics"],
        "test_1": [98, 99, 61, 58],
        "test_2": [100, 100, 60, 60],
    } |
    polars into-df --as-columns |
    polars pivot --on subject --on-cols [maths physics] --index name --values test_1 |
    polars sort-by name maths physics |
    polars collect
╭───┬───────┬───────┬─────────╮
│ # │ name  │ maths │ physics │
├───┼───────┼───────┼─────────┤
│ 0 │ Cady  │    98 │      99 │
│ 1 │ Karen │    61 │      58 │
╰───┴───────┴───────┴─────────╯

Given a set of test scores, reshape so we have one row per student, utilize a selector for the values come to include all test scores

> {
        "name": ["Cady", "Cady", "Karen", "Karen"],
        "subject": ["maths", "physics", "maths", "physics"],
        "test_1": [98, 99, 61, 58],
        "test_2": [100, 100, 60, 60],
    } |
    polars into-df --as-columns |
    polars pivot --on subject --on-cols [maths physics] --index name --values (polars selector starts-with test) |
    polars sort-by name test_1_maths test_1_physics test_2_maths test_2_physics |
    polars collect
╭───┬───────┬──────────────┬────────────────┬──────────────┬────────────────╮
│ # │ name  │ test_1_maths │ test_1_physics │ test_2_maths │ test_2_physics │
├───┼───────┼──────────────┼────────────────┼──────────────┼────────────────┤
│ 0 │ Cady  │           98 │             99 │          100 │            100 │
│ 1 │ Karen │           61 │             58 │           60 │             60 │
╰───┴───────┴──────────────┴────────────────┴──────────────┴────────────────╯

Given a DataFrame with duplicate entries for the pivot columns, use the aggregate flag to specify how to aggregate values for those duplicates. In this example, we sum the foo and bar values for rows with the same ix and col values.

> {
        "ix": [1, 1, 2, 2, 1, 2],
        "col": ["a", "a", "a", "a", "b", "b"],
        "foo": [0, 1, 2, 2, 7, 1],
        "bar": [0, 2, 0, 0, 9, 4],
    } |
    polars into-df --as-columns |
    polars pivot --on col --on-cols [a b] --index ix --aggregate sum |
    polars sort-by ix foo_a foo_b bar_a bar_b |
    polars collect
╭───┬────┬───────┬───────┬───────┬───────╮
│ # │ ix │ foo_a │ foo_b │ bar_a │ bar_b │
├───┼────┼───────┼───────┼───────┼───────┤
│ 0 │  1 │     1 │     7 │     2 │     9 │
│ 1 │  2 │     4 │     1 │     0 │     4 │
╰───┴────┴───────┴───────┴───────┴───────╯

polars pivot for dataframe

Signature

Flags

Input/output types:

Examples

`polars pivot` for dataframe