bat365 Homepage

unstack

Unstack data from single variable into multiple variables

Syntax

U = unstack(S,vars,ivar)

U = unstack(S,vars,ivar,Name,Value)

[U,is] =
unstack(___)

Description

example

U = unstack(S,vars,ivar) converts the table or timetable, S, to an equivalent table or timetable, U, that is unstacked. vars specifies variables in S, each of which is unstacked into multiple variables in U. In general, U contains more variables, but fewer rows, than S.

The ivar input argument specifies the variable in S that unstack uses as an indicator variable. The values in ivar determine which variables in U contain elements taken from vars after unstacking.

The unstack function treats the remaining variables differently in tables and timetables.

If S is a table, then unstack treats the remaining variables as grouping variables. Each unique combination of values in the grouping variables identifies a group of rows in S that is unstacked into a single row of U.
If S is a timetable, then unstack discards the remaining variables. However, unstack treats the vector of row times as a grouping variable.

You cannot unstack the row names of a table, or the row times of a timetable, or specify either as the indicator variable. You can specify row names or row times as constant variables with the 'ConstantVariables' argument.

U = unstack(S,vars,ivar,Name,Value) converts the table or timetable S with additional options specified by one or more Name,Value pair arguments.

For example, you can specify how unstack converts variables from S to variables in U.

example

[U,is] = unstack(___) also returns an index vector, is, indicating the correspondence between rows in U and rows in S. You can use any of the previous input arguments.

Examples

collapse all

Separate One Variable into Three Variables

Open Live Script

Create a table indicating the amount of snowfall in various towns for various storms. Specify the towns using a categorical array, since there are a fixed set of town names in this table.

Storm = [3;3;1;3;1;1;4;2;4;2;4;2];
Town = categorical({'Natick';'Worcester';'Natick';'Boston';'Boston';'Worcester';...
                    'Boston';'Natick';'Worcester';'Worcester';'Natick';'Boston'});
Snowfall = [0;3;5;5;9;10;12;13;15;16;17;21];

S = table(Storm,Town,Snowfall)

S=12×3 table
    Storm      Town       Snowfall
    _____    _________    ________

      3      Natick           0   
      3      Worcester        3   
      1      Natick           5   
      3      Boston           5   
      1      Boston           9   
      1      Worcester       10   
      4      Boston          12   
      2      Natick          13   
      4      Worcester       15   
      2      Worcester       16   
      4      Natick          17   
      2      Boston          21

S contains three snowfall entries for each storm, one for each town. S is in stacked format, with Town having the categorical data type. Table variables that have the categorical data type are useful indicator variables and grouping variables for unstacking.

Separate the variable Snowfall into three variables, one for each town specified in the variable, Town. The output table, U, is in unstacked format.

U = unstack(S,'Snowfall','Town')

U=4×4 table
    Storm    Boston    Natick    Worcester
    _____    ______    ______    _________

      3         5         0          3    
      1         9         5         10    
      4        12        17         15    
      2        21        13         16

Each row in U contains data from rows in S that have the same value in the grouping variable, Storm. The order of the unique values in Storm determines the order of the data in U.

Apply Aggregation Function to Each Group

Open Live Script

Unstack data and apply an aggregation function to multiple rows in the same group that have the same values in the indicator variable.

Create a timetable containing data on the price of two stocks over two days. To specify the row times, use datetime values. Specify the names of the stocks using a categorical array since this timetable has a fixed set of stock names.

Date = [repmat(datetime('2008-04-12'),6,1);...
        repmat(datetime('2008-04-13'),5,1)];
Stock = categorical({'Stock1';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock1';'Stock2'});
Price = [60.35;27.68;64.19;25.47;28.11;27.98;...
         63.85;27.55;26.43;65.73;25.94];

S = timetable(Date,Stock,Price)

S=11×2 timetable
       Date        Stock     Price
    ___________    ______    _____

    12-Apr-2008    Stock1    60.35
    12-Apr-2008    Stock2    27.68
    12-Apr-2008    Stock1    64.19
    12-Apr-2008    Stock2    25.47
    12-Apr-2008    Stock2    28.11
    12-Apr-2008    Stock2    27.98
    13-Apr-2008    Stock1    63.85
    13-Apr-2008    Stock2    27.55
    13-Apr-2008    Stock2    26.43
    13-Apr-2008    Stock1    65.73
    13-Apr-2008    Stock2    25.94

S contains two prices for Stock1 during the first day and four prices for Stock2 during the first day.

Create a timetable containing separate variables for each stock and one row for each day. Use Date (the vector of row times) as the grouping variable and apply the aggregation function, @mean, to the numeric values from the variable, Price, for each group.

[U,is] = unstack(S,'Price','Stock',...
                 'AggregationFunction',@mean)

U=2×2 timetable
       Date        Stock1    Stock2
    ___________    ______    ______

    12-Apr-2008    62.27     27.31 
    13-Apr-2008    64.79     26.64

U contains the average price for each stock grouped by date.

is identifies the index of the first value for each group of rows in S. The first value for the group with the date April 13, 2008, is in the seventh row of S.

Input Arguments

collapse all

`S` — Input table
table | timetable

Input table, specified as a table or a timetable. S must contain data variables to unstack, vars, and an indicator variable, ivar. The remaining variables in S can be treated as either grouping variables or constant variables.

`vars` — Variables in `S` to unstack
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

Variables in S to unstack, specified as a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

`ivar` — Indicator variable in `S`
positive integer | character vector | string scalar

Indicator variable in S, specified as a positive integer, a character vector, or a string scalar. The values in the variable specified by ivar indicate which variables in U contain elements taken from the variables specified by vars.

The variable specified by ivar can be a numeric vector, logical vector, character array, cell array of character vectors, string array, or categorical vector.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'AggregationFunction',@mean applies the aggregation function @mean to the values in vars.

`GroupingVariables` — Grouping variables in `S` that define groups of rows
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

Grouping variables in S that define groups of rows, specified as the comma-separated pair consisting of 'GroupingVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector. Each group of rows in S becomes one row in U.

S can have row labels along its first dimension. If S is a table, then it can have row names as the labels. If S is a timetable, then it must have row times as the labels. unstack can treat row labels as grouping variables.

If you do not specify 'GroupingVariables', and S is a timetable, then unstack treats the row times as a grouping variable.
If you specify 'GroupingVariables', and S has row names or row times, then unstack does not treat them as grouping variables, unless you include them in the value of 'GroupingVariables'.

`ConstantVariables` — Variables constant within a group
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

Variables constant within a group, specified as the comma-separated pair consisting of 'ConstantVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

The values for these variables in U are taken from the first row in each group in S.

You can include the row names or row times of S when you specify the value of 'ConstantVariables'.

`NewDataVariableNames` — Names for new data variables in `U`
cell array of character vectors | string array

Names for the new data variables in U, specified as the comma-separated pair consisting of 'NewDataVariableNames' and a cell array of character vectors or string array.

If you do not specify 'NewDataVariableNames', then unstack creates names for the new data variables in U based on values in the indicator variable specified by ivar.

`AggregationFunction` — Aggregation function from values in `vars` to single value
`@sum` (numeric data) or `@unique` (nonnumeric data) (default) | function handle

Aggregation function from values in vars to a single value, specified as the comma-separated pair consisting of 'AggregationFunction' and a function handle. unstack applies this function to rows from the same group that have the same value in ivar. The function must aggregate the data values into a single value.

If you do not specify the value of 'AggregationFunction', then unstack uses different default aggregation functions depending on data type.

For numeric data, the default aggregation function is sum.
For nonnumeric data, the default aggregation function is unique.

If there are no data values to aggregate, because there are no data values corresponding to a given indicator value in ivar after unstacking, then unstack must fill an empty element in the unstacked output table. In that case, unstack calls the aggregation function with an empty array as input. The value that unstack fills in depends on what the aggregation function returns when its input is an empty array.

Value Returned by Aggregation Function When No Data to Aggregate	Value Inserted into Empty Elements of Unstacked Table
Aggregation function raises error	`unstack` raises error.
Empty array	Fill value of the appropriate data type (`NaN`, `NaT`, `""`, and so on). Example: If the aggregation function is `unique`, and it returns a 0-by-0 `double` array, then `unstack` inserts a `NaN` into the output table.
Scalar value	Scalar value returned from aggregation function. Example: If the aggregation function is `numel` and it returns `0`, then `unstack` inserts a `0` into the output table.
Vector, matrix, or multidimensional array	`unstack` raises error.

`VariableNamingRule` — Rule for naming variables in `U`
`'modify'` (default) | `'preserve'`

Rule for naming variables in U, specified as the comma-separated pair consisting of 'VariableNamingRule' and either the value 'modify' or 'preserve'.

The values of 'VariableNamingRule' specify the following rules for naming variable in the output table or timetable.

Value of `'VariableNamingRule'`	Rule
`'modify'` (default)	Modify names taken from the input table or timetable so that the corresponding variable names in the output are also valid MATLAB^® identifiers.
`'preserve'`	Preserve original names taken from the input table or timetable. The corresponding variable names in the output can have any Unicode^® characters, including spaces and non-ASCII characters. Note: In some cases, `unstack` must modify original names even when `'preserve'` is the rule. Such cases include: Duplicate names Names that conflict with table dimension names Names that conflict with a reserved name. Names whose lengths exceed the value of `namelengthmax`.

Value of 'VariableNamingRule'

Rule

'modify' (default)

Modify names taken from the input table or timetable so that the corresponding variable names in the output are also valid MATLAB^® identifiers.

'preserve'

Preserve original names taken from the input table or timetable. The corresponding variable names in the output can have any Unicode^® characters, including spaces and non-ASCII characters.

Note: In some cases, unstack must modify original names even when 'preserve' is the rule. Such cases include:

Duplicate names
Names that conflict with table dimension names
Names that conflict with a reserved name.
Names whose lengths exceed the value of namelengthmax.

Output Arguments

collapse all

`U` — Output table
table | timetable

Output table, returned as a table or a timetable. U contains the unstacked data variables, the grouping variables, and the first value of each group from any constant variables.

The order of the data in U is based on the order of the unique values in the grouping variables.

You can store additional metadata such as descriptions, variable units, variable names, and row names in U. For more information, see the Properties sections of table or timetable.

`is` — Index to `S`
column vector

Index to S, returned as a column vector. For each row in U, the index vector, is, identifies the index of the first value in the corresponding group of rows in S.

More About

collapse all

Grouping Variables

Grouping variables are utility variables used to group, or categorize, data. Grouping variables are useful for summarizing or visualizing data by group. You can define groups in your table by specifying one or more grouping variables.

A grouping variable can be any of the following:

Categorical vector
Cell array of character vectors
String array
Character array
Numeric vector, typically containing positive integers
Logical vector

Rows that have the same grouping variable value belong to the same group. If you use multiple grouping variables, rows that have the same combination of grouping variable values belong to the same group.

Tips

You can specify more than one data variable in S, and each variable becomes a set of unstacked data variables in U. Use a vector of positive integers, a cell array or string array containing multiple variable names, or a logical vector to specify vars. The one indicator variable, specified by the input argument, ivar, applies to all data variables specifies by vars.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

The 'NewDataVariableNames' name-value argument must be specified. Its value must be constant.
The vars and ivars input arguments (data variables and indicator variables) must be constant.
If you specify grouping variables and constant variables, then they must be constant.
If you specify an aggregation function, then it must be constant.
If the input is a timetable with regular row times and you specify grouping variables that do not include the row times, then the output timetable might have irregular row times. Even though the intervals between output row times might look the same, the output timetable considers the vector of row times to be irregular.
If a variable of the input table or timetable is a cell array of character vectors, then unstack fills empty cells in the corresponding output variable with 1-by-0 character arrays in the generated code. In MATLAB, unstack fills such gaps with 0-by-0 character arrays.
The unstack function does not support code generation when the input table or timetable has a variable that is a heterogeneous cell array that cannot be converted to a homogeneous cell array.
- If the input has a variable that is a homogeneous cell array, or that can be converted to one, then the 'AggregationFunction' name-value argument must be specified. The default value of 'AggregationFunction' is 'unique'. But the unique function does not support cell arrays.
The vars input argument and the values of the 'GroupingVariables' and 'ConstantVariables' name-value arguments do not support pattern expressions.

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

Version History

Introduced in R2013b

expand all

R2020a: Default aggregation function for nonnumeric data

In R2020a, if you do not specify the 'AggregationFunction' name-value pair argument, then the default aggregation function for nonnumeric data is the unique function. In previous releases, there was no default aggregation function for nonnumeric data, so unstack would raise an error.

R2020a: Behavior changes when the aggregation function has no data to aggregate

In R2020a, there are behavior changes when the aggregation function has no data to aggregate. This situation can occur when there are no data values that correspond to values in the indicator variable after unstacking. In such cases, unstack essentially calls the aggregation function on an empty array.

Value Returned by Aggregation Function When No Data to Aggregate	Behavior in R2020a	Behavior in Previous Releases
Data variable is numeric and the aggregation function raises an error.	`unstack` raises an error.	`unstack` filled output table element with fill value that is appropriate for the data type (such as `NaN` for `double` arrays).
Data variable is nonnumeric and the aggregation function returns an empty array.	`unstack` fills output table element with fill value that is appropriate for the data type (such as `""` for string arrays).	`unstack` raised an error.
Data variable is numeric and the aggregation function returns a scalar value (for example, `numel` returns `0`).	`unstack` inserts value returned by the aggregation function. For example, if `numel` returns `0`, then `unstack` inserts `0` into corresponding table element.	`unstack` fills output table element with fill value that is appropriate for the data type (such as `NaN` for `double` arrays).
Data variable is numeric and the aggregation function returns a vector, matrix, or multidimensional array.	`unstack` raises an error.	`unstack` fills output table element with fill value that is appropriate for the data type (such as `NaN` for `double` arrays).

unstack

Syntax

Description

Examples

Separate One Variable into Three Variables

Apply Aggregation Function to Each Group

Input Arguments

`S` — Input table
table | timetable

`vars` — Variables in `S` to unstack
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`ivar` — Indicator variable in `S`
positive integer | character vector | string scalar

Name-Value Arguments

`GroupingVariables` — Grouping variables in `S` that define groups of rows
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`ConstantVariables` — Variables constant within a group
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`NewDataVariableNames` — Names for new data variables in `U`
cell array of character vectors | string array

`AggregationFunction` — Aggregation function from values in `vars` to single value
`@sum` (numeric data) or `@unique` (nonnumeric data) (default) | function handle

`VariableNamingRule` — Rule for naming variables in `U`
`'modify'` (default) | `'preserve'`

Output Arguments

`U` — Output table
table | timetable

`is` — Index to `S`
column vector

More About

Grouping Variables

Tips

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

Version History

R2020a: Default aggregation function for nonnumeric data

R2020a: Behavior changes when the aggregation function has no data to aggregate

See Also

Functions

Live Editor Tasks

unstack

Syntax

Description

Examples

Separate One Variable into Three Variables

Apply Aggregation Function to Each Group

Input Arguments

S — Input table table | timetable

vars — Variables in S to unstack positive integer | vector of positive integers | string array | character vector | cell array of character vectors | pattern scalar | logical vector

ivar — Indicator variable in S positive integer | character vector | string scalar

Name-Value Arguments

GroupingVariables — Grouping variables in S that define groups of rows positive integer | vector of positive integers | string array | character vector | cell array of character vectors | pattern scalar | logical vector

ConstantVariables — Variables constant within a group positive integer | vector of positive integers | string array | character vector | cell array of character vectors | pattern scalar | logical vector

NewDataVariableNames — Names for new data variables in U cell array of character vectors | string array

AggregationFunction — Aggregation function from values in vars to single value @sum (numeric data) or @unique (nonnumeric data) (default) | function handle

VariableNamingRule — Rule for naming variables in U 'modify' (default) | 'preserve'

Output Arguments

U — Output table table | timetable

is — Index to S column vector

More About

Grouping Variables

Tips

Extended Capabilities

C/C++ Code Generation Generate C and C++ code using MATLAB® Coder™.

Thread-Based Environment Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.

Version History

R2020a: Default aggregation function for nonnumeric data

R2020a: Behavior changes when the aggregation function has no data to aggregate

See Also

Functions

Live Editor Tasks

`S` — Input table
table | timetable

`vars` — Variables in `S` to unstack
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`ivar` — Indicator variable in `S`
positive integer | character vector | string scalar

`GroupingVariables` — Grouping variables in `S` that define groups of rows
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`ConstantVariables` — Variables constant within a group
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | `pattern` scalar | logical vector

`NewDataVariableNames` — Names for new data variables in `U`
cell array of character vectors | string array

`AggregationFunction` — Aggregation function from values in `vars` to single value
`@sum` (numeric data) or `@unique` (nonnumeric data) (default) | function handle

`VariableNamingRule` — Rule for naming variables in `U`
`'modify'` (default) | `'preserve'`

`U` — Output table
table | timetable

`is` — Index to `S`
column vector

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.