Nushell
Get Nu!
Getting Started
  • The Nushell Book
  • Command Reference
  • Cookbook
  • Language Reference Guide
  • Contributing Guide
Blog
  • English
  • 中文
  • Deutsch
  • Français
  • Español
  • 日本語
  • Português do Brasil
  • Русский язык
  • 한국어
GitHub
Get Nu!
Getting Started
  • The Nushell Book
  • Command Reference
  • Cookbook
  • Language Reference Guide
  • Contributing Guide
Blog
  • English
  • 中文
  • Deutsch
  • Français
  • Español
  • 日本語
  • Português do Brasil
  • Русский язык
  • 한국어
GitHub
  • Categories

    • Bits
    • Bytes
    • Chart
    • Conversions
    • Core
    • Database
    • Dataframe
    • Dataframe Or Lazyframe
    • Date
    • Debug
    • Default
    • Env
    • Experimental
    • Expression
    • Filesystem
    • Filters
    • Formats
    • Generators
    • Hash
    • History
    • Lazyframe
    • Math
    • Misc
    • Network
    • Path
    • Platform
    • Plugin
    • Prompt
    • Random
    • Removed
    • Shells
    • Strings
    • System
    • Viewers

polars pivot for dataframe

Pivot a DataFrame from long to wide format.

Signature

> polars pivot {flags}

Flags

  • --on, -o {any}: Column names for pivoting.
  • --on-cols, -c {any}: column names used as value columns
  • --index, -i {any}: Selector or column names for indexes.
  • --values {any}: Selector or column names used as value columns.
  • --aggregate, -a {any}: Aggregation to apply when pivoting. The following are supported: first, sum, min, max, mean, median, count, last, or a custom expression.
  • --separator, -p {string}: Delimiter in generated column names in case of multiple values columns (default '_').
  • --maintain-order: Maintain Order.
  • --streamable, -t: Whether or not to use the polars streaming engine. Only valid for lazy dataframes
  • --stable: Perform a stable pivot.

Input/output types:

inputoutput
polars_dataframepolars_dataframe
polars_lazyframepolars_lazyframe

Examples

Given a set of test scores, reshape so we have one row per student, with different subjects as columns, and their test_1 scores as values

> {
        "name": ["Cady", "Cady", "Karen", "Karen"],
        "subject": ["maths", "physics", "maths", "physics"],
        "test_1": [98, 99, 61, 58],
        "test_2": [100, 100, 60, 60],
    } |
    polars into-df --as-columns |
    polars pivot --on subject --on-cols [maths physics] --index name --values test_1 |
    polars sort-by name maths physics |
    polars collect
╭───┬───────┬───────┬─────────╮
│ # │ name  │ maths │ physics │
├───┼───────┼───────┼─────────┤
│ 0 │ Cady  │    98 │      99 │
│ 1 │ Karen │    61 │      58 │
╰───┴───────┴───────┴─────────╯

Given a set of test scores, reshape so we have one row per student, utilize a selector for the values come to include all test scores

> {
        "name": ["Cady", "Cady", "Karen", "Karen"],
        "subject": ["maths", "physics", "maths", "physics"],
        "test_1": [98, 99, 61, 58],
        "test_2": [100, 100, 60, 60],
    } |
    polars into-df --as-columns |
    polars pivot --on subject --on-cols [maths physics] --index name --values (polars selector starts-with test) |
    polars sort-by name test_1_maths test_1_physics test_2_maths test_2_physics |
    polars collect
╭───┬───────┬──────────────┬────────────────┬──────────────┬────────────────╮
│ # │ name  │ test_1_maths │ test_1_physics │ test_2_maths │ test_2_physics │
├───┼───────┼──────────────┼────────────────┼──────────────┼────────────────┤
│ 0 │ Cady  │           98 │             99 │          100 │            100 │
│ 1 │ Karen │           61 │             58 │           60 │             60 │
╰───┴───────┴──────────────┴────────────────┴──────────────┴────────────────╯

Given a DataFrame with duplicate entries for the pivot columns, use the aggregate flag to specify how to aggregate values for those duplicates. In this example, we sum the foo and bar values for rows with the same ix and col values.

> {
        "ix": [1, 1, 2, 2, 1, 2],
        "col": ["a", "a", "a", "a", "b", "b"],
        "foo": [0, 1, 2, 2, 7, 1],
        "bar": [0, 2, 0, 0, 9, 4],
    } |
    polars into-df --as-columns |
    polars pivot --on col --on-cols [a b] --index ix --aggregate sum |
    polars sort-by ix foo_a foo_b bar_a bar_b |
    polars collect
╭───┬────┬───────┬───────┬───────┬───────╮
│ # │ ix │ foo_a │ foo_b │ bar_a │ bar_b │
├───┼────┼───────┼───────┼───────┼───────┤
│ 0 │  1 │     1 │     7 │     2 │     9 │
│ 1 │  2 │     4 │     1 │     0 │     4 │
╰───┴────┴───────┴───────┴───────┴───────╯