standardizeMissing
Insert standard missing values
Description
B = standardizeMissing(A,indicator) replaces values specified in indicator with standard missing values in A and returns a standardized array or table.
Missing values are defined according to the data type of A:
- NaN: double, single, duration, and calendarDuration
- NaT: datetime
- <missing>: string
- <undefined>: categorical
- {''}: cell of character vectors
If A is a table, then the data type of each variable defines the missing value for that variable.
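As a minimal sketch of the type-to-missing-value mapping above, the following calls apply standardizeMissing to one small array of several of the listed types and then check the result. The sample values and the sentinels used here (-99, 'N/A', and the date 1900-01-01) are illustrative assumptions and do not come from the examples on this page.

```matlab
% Illustrative sentinels: -99, 'N/A', and 1900-01-01 are assumed placeholders.
sentinelDate = datetime(1900,1,1);

d = standardizeMissing([1 -99 3], -99);                          % double      -> NaN
t = standardizeMissing([datetime(2020,1,1) sentinelDate], sentinelDate); % datetime -> NaT
s = standardizeMissing(["a" "N/A"], "N/A");                      % string      -> <missing>
c = standardizeMissing({'a','N/A'}, 'N/A');                      % cellstr     -> ''
k = standardizeMissing(categorical({'a','N/A'}), 'N/A');         % categorical -> <undefined>

% Each result now holds the standard missing value for its type,
% so the type-specific checks all flag the replaced element.
disp(isnan(d))
disp(isnat(t))
disp(ismissing(s))
disp(isempty(c{2}))
disp(isundefined(k))
```

Because every result uses the standard missing value for its type, a single call to ismissing without an indicator list would also detect the replaced elements.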
B = standardizeMissing(___,Name,Value) specifies additional parameters for standardizing missing values using one or more name-value arguments. For example, standardizeMissing(A,indicator,'DataVariables',datavars) standardizes missing values in the variables specified by datavars when A is a table or timetable.
Examples
Nonstandard Missing Numbers
Create a row vector and replace all instances of -99 with the standard missing value for double data types, NaN.
A = [0 1 5 -99 8 3 4 -99 16];
B = standardizeMissing(A,-99)
B = 1×9
0 1 5 NaN 8 3 4 NaN 16
Replace All Instances of Specified Values
Create a table containing Inf and 'N/A' to represent missing values.
dblVar = [NaN;3;Inf;7;9];
cellstrVar = {'one';'three';'';'N/A';'nine'};
charVar = ['A';'C';'E';' ';'I'];
categoryVar = categorical({'red';'yellow';'blue';'violet';''});
A = table(dblVar,cellstrVar,charVar,categoryVar)
A=5×4 table
    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________
     NaN      {'one'   }       A       red
       3      {'three' }       C       yellow
     Inf      {0x0 char}       E       blue
       7      {'N/A'   }               violet
       9      {'nine'  }       I       <undefined>
Replace all instances of Inf with NaN and replace all instances of 'N/A' with the empty character vector, ''.
B = standardizeMissing(A,{Inf,'N/A'})
B=5×4 table
    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________
     NaN      {'one'   }       A       red
       3      {'three' }       C       yellow
     NaN      {0x0 char}       E       blue
       7      {0x0 char}               violet
       9      {'nine'  }       I       <undefined>
Replace Only Values in Specified Variables
Replace instances of Inf and 'N/A' occurring in specified variables of a table with the standard missing value indicators.
Create a table containing Inf and 'N/A' to represent missing values.
a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];
A = table(a,x,y)
A=5×3 table
         a           x      y
    ___________     ___    ____
    {'alpha'  }       1      57
    {'bravo'  }     NaN     732
    {'charlie'}       3      93
    {0x0 char }     Inf    1398
    {'N/A'    }       5     Inf
For the variables a and x, replace instances of Inf with NaN and 'N/A' with the empty character vector, ''.
B = standardizeMissing(A,{Inf,'N/A'},'DataVariables',{'a','x'})
B=5×3 table
         a           x      y
    ___________     ___    ____
    {'alpha'  }       1      57
    {'bravo'  }     NaN     732
    {'charlie'}       3      93
    {0x0 char }     NaN    1398
    {0x0 char }       5     Inf
Inf in the variable y remains unchanged because y is not included in the DataVariables name-value argument.
Input Arguments
A
— Input data
vector | matrix | multidimensional array | table | timetable
Input data, specified as a vector, matrix, multidimensional array, table, or timetable. If A is a timetable, then standardizeMissing operates on the table data only and ignores NaT and NaN values in the vector of row times.
Data Types: double | single | char | string | cell | table | timetable | categorical | datetime | duration
indicator
— Nonstandard missing value indicator
scalar | vector | cell array
Nonstandard missing value indicator, specified as a scalar, vector, or cell array. The elements of indicator define the values that standardizeMissing treats as missing. If A is an array, then indicator must be a vector. If A is a table or timetable, then indicator can also be a cell array with entries of multiple data types.
An indicator entry applies only to entries of A whose data type it matches. In addition to exact type matches, the following matches between the elements of indicator and elements of A apply:
- double indicators match double, single, integer, and logical entries of A.
- string and char indicators match categorical entries of A.
Example: B = standardizeMissing(A,'N/A') replaces the character vector 'N/A' with the empty character vector, ''.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char | string | cell | datetime | duration
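The cross-type matching rules for indicator can be sketched as follows. The sample arrays and the sentinels -99 and "N/A" are assumptions chosen for illustration. A double indicator is applied to single data, and a string indicator is applied to a categorical array.

```matlab
% A double indicator (-99) matches entries of a single array.
x = single([1 -99 3]);
y = standardizeMissing(x, -99);
disp(class(y))     % the output keeps the class of the input
disp(isnan(y))

% A string indicator matches entries of a categorical array.
c = categorical({'low','N/A','high'});
z = standardizeMissing(c, "N/A");
disp(isundefined(z))
```

In both cases the output keeps the data type of A. Only the matching entries change to the standard missing value for that type.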
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: standardizeMissing(T,indicator,'ReplaceValues',false)
DataVariables
— Table variables to operate on
table variable name | scalar | vector | cell array | pattern | function handle | table vartype subscript
Table variables to operate on, specified as one of the options in this table. The DataVariables value indicates which variables of the input table to standardize.
Other variables in the table not specified by DataVariables pass through to the output without being standardized.
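Indexing Scheme | Examples |
---|---|
Variable names: a character vector or string scalar, a cell array of character vectors or string array, or a pattern | 'A' or "A"; {'A','B'} or ["A" "B"]; "Var" + digitsPattern(1) |
Variable index: an index number, a vector of numbers, or a logical vector | 3; [2 3]; [false false true] |
Function handle: a function that takes a table variable as input and returns a logical scalar | @isnumeric |
Variable type: a vartype subscript that selects variables of a specified type | vartype("numeric") |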
Example: standardizeMissing(T,indicator,'DataVariables',["Var1" "Var2" "Var4"])
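The following sketch shows several ways of specifying DataVariables on a small table. The table T, its variable names, and the sentinel -99 are hypothetical. Passing a vartype subscript as the value assumes a release that accepts it for this argument.

```matlab
% Hypothetical table in which -99 marks a missing number.
name = {'a';'b';'c'};
x = [1; -99; 3];
y = [-99; 5; 6];
T = table(name, x, y);

B1 = standardizeMissing(T, -99, 'DataVariables', 'x');                % variable name
B2 = standardizeMissing(T, -99, 'DataVariables', [false true false]); % logical index
B3 = standardizeMissing(T, -99, 'DataVariables', @isnumeric);         % function handle
B4 = standardizeMissing(T, -99, 'DataVariables', vartype('numeric')); % variable type

% B1 and B2 standardize only x, so y still contains -99.
disp(B1.y(1))
% B3 and B4 standardize every numeric variable, so y(1) becomes NaN.
disp(B3.y(1))
```

Selecting by function handle or variable type avoids hard-coding names when the set of numeric variables may change.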
ReplaceValues
— Replace values indicator
true or 1 (default) | false or 0
Replace values indicator, specified as one of these values when A is a table or timetable:
- true or 1: Replace input table variables containing missing entries with standardized table variables.
- false or 0: Append the input table with all table variables that were checked for missing entries. The missing entries in the appended variables are standardized.
For vector, matrix, or multidimensional array input data, ReplaceValues is not supported.
B is the same size as A unless the value of ReplaceValues is false. If the value of ReplaceValues is false, then the width of B is the sum of the input data width and the number of data variables specified.
Example: standardizeMissing(T,indicator,'ReplaceValues',false)
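A short sketch of the width rule described above, assuming a release that supports ReplaceValues (R2022a or later according to the version history). The table T and the sentinels -99 and 'N/A' are hypothetical. With no DataVariables specified, both variables are checked, so two standardized variables are appended.

```matlab
% Hypothetical two-variable table with sentinel values.
T = table([1; -99; 3], {'a'; 'N/A'; 'c'}, 'VariableNames', {'x','s'});

% Append standardized copies instead of replacing the originals.
B = standardizeMissing(T, {-99, 'N/A'}, 'ReplaceValues', false);

disp(width(T))   % number of variables in the input
disp(width(B))   % input width plus the number of checked variables
disp(B.x(2))     % the original variable is left unchanged
```

Appending keeps the raw values available for auditing next to the standardized copies.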
Algorithms
standardizeMissing treats leading and trailing white space differently for cell arrays of character vectors, character arrays, and categorical arrays.
- For cell arrays of character vectors, standardizeMissing does not ignore white space. All character vectors must match exactly a character vector specified in indicator.
- For character arrays, standardizeMissing ignores trailing white space.
- For categorical arrays, standardizeMissing ignores leading and trailing white space.
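The difference between cell arrays of character vectors and categorical arrays can be sketched as follows. The sample values are assumptions chosen for illustration. The categorical call passes an indicator padded with spaces, which the rule above says is ignored.

```matlab
% Cell array of character vectors: only the exact match 'N/A' is replaced.
c = standardizeMissing({'N/A'; 'N/A '; ' N/A'}, 'N/A');
disp(cellfun(@isempty, c)')

% Categorical array: white space around the indicator is ignored
% according to the rule stated above.
k = standardizeMissing(categorical({'ok'; 'N/A'}), ' N/A ');
disp(isundefined(k)')
```

For cell arrays, one option is to apply strtrim to the data before standardizing so that padded entries also match.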
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
This function fully supports tall arrays. For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Nonstandard missing value indicator must be a scalar or vector.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Distributed Arrays
Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™.
This function fully supports distributed arrays. For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).
Version History
Introduced in R2013b
R2022b: Character arrays have no standard missing value
Character arrays have no default definition of a standard missing value. Therefore, standardizeMissing does not replace values in character arrays. For example, standardizeMissing(['ab'; 'NA'],'NA') returns the character array ['ab'; 'NA']. Previously, it returned ['ab'; '  '].
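The R2022b behavior can be checked with a sketch like the one below. It assumes R2022b or later. In earlier releases the second row would be blanked, as described above.

```matlab
% Character array input: values are returned unchanged in R2022b and later.
C = ['ab'; 'NA'];
D = standardizeMissing(C, 'NA');
disp(isequal(C, D))
```

To mark such entries as missing, one option is to convert the data to a string array or a cell array of character vectors first, because both types have a standard missing value.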
R2022a: Append standardized values
You can now append the input table with all table variables that were checked for missing entries. The missing entries in the appended variables are standardized. Append, rather than replace, table variables by setting the ReplaceValues name-value argument to false.
The ReplaceValues name-value argument is supported only for table and timetable input data.