imageInputLayer
Image input layer
Description
An image input layer inputs 2-D images to a neural network and applies data normalization.
For 3-D image input, use image3dInputLayer.
Creation
Description
layer = imageInputLayer(inputSize) returns an image input layer and specifies the InputSize property.
layer = imageInputLayer(inputSize,Name,Value) sets the optional Normalization, NormalizationDimension, Mean, StandardDeviation, Min, Max, SplitComplexInputs, and Name properties using one or more name-value arguments. Enclose the property names in quotes.
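As a minimal sketch of the name-value syntax, the following creates a layer for 224-by-224 RGB images with z-score normalization. The statistics (0.5 and 0.25) and the layer name are placeholder values assumed for illustration, not recommended settings.

```matlab
% Create an image input layer with explicit z-score statistics.
% The values 0.5 and 0.25 are placeholders chosen for illustration.
layer = imageInputLayer([224 224 3], ...
    'Normalization','zscore', ...
    'Mean',0.5, ...
    'StandardDeviation',0.25, ...
    'Name','input');

disp(layer.InputSize)       % size passed as the first argument
disp(layer.Normalization)   % method set through the name-value argument
```

Running this requires Deep Learning Toolbox.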
Properties
Image Input
InputSize
— Size of the input
row vector of integers
This property is read-only.
Size of the input data, specified as a row vector of integers [h w c], where h, w, and c correspond to the height, width, and number of channels, respectively.
For grayscale images, specify a vector with c equal to 1.
For RGB images, specify a vector with c equal to 3.
For multispectral or hyperspectral images, specify a vector with c equal to the number of channels.
For 3-D image or volume input, use image3dInputLayer.
Example: [224 224 3]
Normalization
— Data normalization
'zerocenter' (default) | 'zscore' | 'rescale-symmetric' | 'rescale-zero-one' | 'none' | function handle
This property is read-only.
Data normalization to apply every time data is forward propagated through the input layer, specified as one of the following:
'zerocenter': Subtract the mean specified by Mean.
'zscore': Subtract the mean specified by Mean and divide by StandardDeviation.
'rescale-symmetric': Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.
'rescale-zero-one': Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.
'none': Do not normalize the input data.
function handle: Normalize the data using the specified function. The function must be of the form Y = func(X), where X is the input data and the output Y is the normalized data.
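The arithmetic behind these options can be sketched in base MATLAB. This is an illustration on toy data, not the layer's internal implementation. The rescaling formulas are an assumption (a linear map of [Min, Max] onto the target range), since the text above specifies only the target ranges.

```matlab
% Illustrative arithmetic only; the layer applies normalization internally.
X = single([0 64; 128 255]);   % toy 2-by-2 grayscale image (assumed data)

mu    = mean(X,'all');         % plays the role of Mean
sigma = std(X,0,'all');        % plays the role of StandardDeviation
xmin  = min(X,[],'all');       % plays the role of Min
xmax  = max(X,[],'all');       % plays the role of Max

Yzerocenter = X - mu;                              % 'zerocenter'
Yzscore     = (X - mu) ./ sigma;                   % 'zscore'
Ysymmetric  = 2*(X - xmin) ./ (xmax - xmin) - 1;   % 'rescale-symmetric'
Yzeroone    = (X - xmin) ./ (xmax - xmin);         % 'rescale-zero-one'

% Function-handle form: any function of the form Y = func(X).
func = @(X) X ./ 255;          % hypothetical custom normalization
Ycustom = func(X);
```

A handle such as func could then be passed as the value of the Normalization argument.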
Tip
The software, by default, automatically calculates the normalization statistics when using the trainnet and trainNetwork functions. To save time when training, specify the required statistics for normalization and set the ResetInputNormalization option in trainingOptions to 0 (false).
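A minimal sketch of this tip, assuming the statistics are already known. The values 0.13 and 0.31 and the choice of the 'sgdm' solver are placeholders for illustration.

```matlab
% Precomputed statistics (placeholder values, assumed for illustration).
layer = imageInputLayer([28 28 1], ...
    'Normalization','zscore', ...
    'Mean',0.13, ...
    'StandardDeviation',0.31);

% Keep the statistics above instead of recalculating them at training time.
options = trainingOptions('sgdm', ...
    'ResetInputNormalization',false);
```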
NormalizationDimension
— Normalization dimension
'auto' (default) | 'channel' | 'element' | 'all'
Normalization dimension, specified as one of the following:
'auto': If the ResetInputNormalization training option is false and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.
'channel': Channel-wise normalization.
'element': Element-wise normalization.
'all': Normalize all values using scalar statistics.
Data Types: char | string
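The difference between the options shows up in the shape of the statistics. The following base MATLAB sketch computes a mean in each of the three shapes for a toy batch. The data and variable names are assumptions for illustration.

```matlab
% Toy batch: 4-by-5 images, 3 channels, 10 observations (assumed data).
X = rand(4,5,3,10);

meanChannel = mean(X,[1 2 4]);   % 'channel': one value per channel
meanElement = mean(X,4);         % 'element': one value per pixel and channel
meanAll     = mean(X,'all');     % 'all': a single scalar

disp(size(meanChannel))   % 1 1 3
disp(size(meanElement))   % 4 5 3
disp(size(meanAll))       % 1 1
```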
Mean
— Mean for zero-center and z-score normalization
[] (default) | 3-D array | numeric scalar
Mean for zero-center and z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of means per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and number of channels of the mean, respectively.
If you specify the Mean property, then Normalization must be 'zerocenter' or 'zscore'. If Mean is [], then the trainnet and trainNetwork functions calculate the mean. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Mean property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
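When no training function is available to calculate the mean, one option is to compute it from the training data and pass it to the layer. The sketch below assumes a toy image array named XTrain and uses a per-channel mean.

```matlab
% Toy RGB training images: 32-by-32-by-3-by-N (assumed data).
XTrain = rand(32,32,3,100,'single');

% Per-channel mean as a 1-by-1-by-c array.
meanPerChannel = mean(XTrain,[1 2 4]);

% Supply the mean explicitly so that it does not have to be calculated later.
layer = imageInputLayer([32 32 3], ...
    'Normalization','zerocenter', ...
    'Mean',meanPerChannel);
```

The same pattern applies to the StandardDeviation, Min, and Max properties with their corresponding normalization methods.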
StandardDeviation
— Standard deviation for z-score normalization
[] (default) | 3-D array | numeric scalar
Standard deviation for z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of standard deviations per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and number of channels of the standard deviation, respectively.
If you specify the StandardDeviation property, then Normalization must be 'zscore'. If StandardDeviation is [], then the trainnet and trainNetwork functions calculate the standard deviation. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the StandardDeviation property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Min
— Minimum value for rescaling
[] (default) | 3-D array | numeric scalar
Minimum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of minima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and number of channels of the minima, respectively.
If you specify the Min property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Min is [], then the trainnet and trainNetwork functions calculate the minima. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Min property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Max
— Maximum value for rescaling
[] (default) | 3-D array | numeric scalar
Maximum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of maxima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and number of channels of the maxima, respectively.
If you specify the Max property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Max is [], then the trainnet and trainNetwork functions calculate the maxima. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Max property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
SplitComplexInputs
— Flag to split input data into real and imaginary components
0 (false) (default) | 1 (true)
This property is read-only.
Flag to split input data into real and imaginary components, specified as one of these values:
0 (false): Do not split input data.
1 (true): Split data into real and imaginary components.
When SplitComplexInputs is 1, the layer outputs twice as many channels as the input data. For example, if the input data is complex-valued with numChannels channels, then the layer outputs data with 2*numChannels channels, where channels 1 through numChannels contain the real components of the input data and channels numChannels+1 through 2*numChannels contain the imaginary components of the input data. If the input data is real, then channels numChannels+1 through 2*numChannels are all zero.
To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1.
For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.
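The channel layout described above can be emulated in base MATLAB. This sketch only reproduces the ordering of the output channels on toy data. It is not the layer's implementation, and the array sizes are assumptions.

```matlab
% Toy complex-valued input: 8-by-8 images with 2 channels (assumed data).
numChannels = 2;
X = complex(rand(8,8,numChannels), rand(8,8,numChannels));

% Stack real parts first, then imaginary parts, along the channel dimension.
Y = cat(3, real(X), imag(X));

disp(size(Y,3))   % 2*numChannels
```

Here channels 1 through numChannels of Y hold the real components and the remaining channels hold the imaginary components.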
DataAugmentation
— Data augmentation transforms
'none' (default) | 'randcrop' | 'randfliplr' | cell array of 'randcrop' and 'randfliplr'
This property is read-only.
Note
The DataAugmentation property is not recommended. To preprocess images with cropping, reflection, and other geometric transformations, use augmentedImageDatastore instead.
Data augmentation transforms to use during training, specified as one of the following:
'none': No data augmentation.
'randcrop': Take a random crop from the training image. The random crop has the same size as the input size.
'randfliplr': Randomly flip the input images horizontally with a 50% chance.
Cell array of 'randcrop' and 'randfliplr': The software applies the augmentation in the order specified in the cell array.
Augmentation of image data is another way of reducing overfitting [1], [2].
Data Types: string | char | cell
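As a sketch of the recommended alternative, the following builds an augmentedImageDatastore that applies random horizontal reflection, which is comparable to 'randfliplr'. The image array, the labels, and the variable names are assumptions for illustration.

```matlab
% Toy data: 100 grayscale 28-by-28 images with random labels (assumed data).
X = rand(28,28,1,100);
Y = categorical(randi(10,100,1));

% Random left-right reflection, comparable to 'randfliplr'.
augmenter = imageDataAugmenter('RandXReflection',true);

% Datastore that outputs 28-by-28 images and applies the augmentation.
auimds = augmentedImageDatastore([28 28],X,Y, ...
    'DataAugmentation',augmenter);
```

The datastore can then be passed to a training function in place of the raw image array.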
Layer
Name
— Layer name
""
(default) | character vector | string scalar
Layer name, specified as a character vector or a string scalar.
For Layer array input, the trainnet, trainNetwork, assembleNetwork, layerGraph, and dlnetwork functions automatically assign names to layers with the name "".
The ImageInputLayer object stores this property as a character vector.
Data Types: char | string
NumInputs
— Number of inputs
0 (default)
This property is read-only.
Number of inputs of the layer. The layer has no inputs.
Data Types: double
InputNames
— Input names
{} (default)
This property is read-only.
Input names of the layer. The layer has no inputs.
Data Types: cell
NumOutputs
— Number of outputs
1 (default)
This property is read-only.
Number of outputs from the layer, returned as 1. This layer has a single output only.
Data Types: double
OutputNames
— Output names
{'out'} (default)
This property is read-only.
Output names, returned as {'out'}. This layer has a single output only.
Data Types: cell
Examples
Create Image Input Layer
Create an image input layer for 28-by-28 color images with name 'input'. By default, the layer performs data normalization by subtracting the mean image of the training set from every input image.
    inputlayer = imageInputLayer([28 28 3],'Name','input')

    inputlayer =
      ImageInputLayer with properties:
                          Name: 'input'
                     InputSize: [28 28 3]
            SplitComplexInputs: 0
       Hyperparameters
              DataAugmentation: 'none'
                 Normalization: 'zerocenter'
        NormalizationDimension: 'auto'
                          Mean: []

Include an image input layer in a Layer array.
    layers = [ ...
        imageInputLayer([28 28 1])
        convolution2dLayer(5,20)
        reluLayer
        maxPooling2dLayer(2,'Stride',2)
        fullyConnectedLayer(10)
        softmaxLayer
        classificationLayer]

    layers =
      7x1 Layer array with layers:
         1   ''   Image Input             28x28x1 images with 'zerocenter' normalization
         2   ''   2-D Convolution         20 5x5 convolutions with stride [1 1] and padding [0 0 0 0]
         3   ''   ReLU                    ReLU
         4   ''   2-D Max Pooling         2x2 max pooling with stride [2 2] and padding [0 0 0 0]
         5   ''   Fully Connected         10 fully connected layer
         6   ''   Softmax                 softmax
         7   ''   Classification Output   crossentropyex
Algorithms
Layer Output Formats
Layers in a layer array or layer graph pass data to subsequent layers as formatted dlarray objects. The format of a dlarray object is a string of characters, in which each character describes the corresponding dimension of the data. The format consists of one or more of these characters:
"S": Spatial
"C": Channel
"B": Batch
"T": Time
"U": Unspecified
For example, 2-D image data represented as a 4-D array, where the first two dimensions correspond to the spatial dimensions of the images, the third dimension corresponds to the channels of the images, and the fourth dimension corresponds to the batch dimension, can be described as having the format "SSCB" (spatial, spatial, channel, batch).
The input layer of a network specifies the layout of the data that the network expects. If you have data in a different layout, then specify the layout using the InputDataFormats training option.
The layer inputs h-by-w-by-c-by-N arrays into the network, where h, w, and c are the height, width, and number of channels of the images, respectively, and N is the number of images. Data in this layout has the data format "SSCB" (spatial, spatial, channel, batch).
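To make the "SSCB" layout concrete, the following sketch labels a toy image batch as a formatted dlarray object and queries its format. The batch size and image size are assumptions for illustration.

```matlab
% Batch of 16 RGB images of size 28-by-28 (assumed data).
X = rand(28,28,3,16);

% Label the dimensions: spatial, spatial, channel, batch.
dlX = dlarray(X,"SSCB");

disp(dims(dlX))   % SSCB
disp(size(dlX))   % 28 28 3 16
```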
References
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet Classification with Deep Convolutional Neural Networks." Communications of the ACM 60, no. 6 (May 24, 2017): 84–90. https://doi.org/10.1145/3065386
[2] Cireşan, D., U. Meier, and J. Schmidhuber. "Multi-column Deep Neural Networks for Image Classification." IEEE Conference on Computer Vision and Pattern Recognition, 2012.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
Code generation does not support 'Normalization' specified using a function handle.
Code generation does not support complex input and does not support the 'SplitComplexInputs' option.
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
Code generation does not support 'Normalization' specified using a function handle.
Code generation does not support complex input and does not support the 'SplitComplexInputs' option.
Version History
Introduced in R2016a
R2019b: AverageImage property will be removed
AverageImage will be removed. Use Mean instead. To update your code, replace all instances of AverageImage with Mean. There are no differences between the properties that require additional updates to your code.
R2019b: imageInputLayer and image3dInputLayer, by default, use channel-wise normalization
Starting in R2019b, imageInputLayer and image3dInputLayer, by default, use channel-wise normalization. In previous versions, these layers use element-wise normalization. To reproduce this behavior, set the NormalizationDimension option of these layers to 'element'.