Using Simulink to Generate Fault Data
This example shows how to use a Simulink® model to generate fault and healthy data. The fault and healthy data is used to develop a condition monitoring algorithm. The example uses a transmission system and models a gear tooth fault, a sensor drift fault and shaft wear fault.
Transmission System Model
The transmission casing model uses Simscape™ Driveline™ blocks to model a simple transmission system. The transmission system consists of a torque drive, drive shaft, clutch, and high and low gears connected to an output shaft.
mdl = 'pdmTransmissionCasing';
open_system(mdl)
The transmission system includes a vibration sensor that is monitoring casing vibrations. The casing model translates the shaft angular displacement to a linear displacement on the casing. The casing is modelled as a mass spring damper system and the vibration (casing acceleration) is measured from the casing.
open_system([mdl '/Casing'])
Fault Modelling
The transmission system includes fault models for vibration sensor drift, gear tooth fault, and shaft wear. The sensor drift fault is easily modeled by introducing an offset in the sensor model. The offset is controlled by the model variable SDrift
, note that SDrift=0
implies no sensor fault.
open_system([mdl '/Vibration sensor with drift'])
The shaft wear fault is modeled by a variant subsystem. In this case the subsystem variants change the shaft damping but the variant subsystem could be used to completely change the shaft model implementation. The selected variant is controlled by the model variable ShaftWear
, note that ShaftWear=0
implies no shaft fault.
open_system([mdl '/Shaft'])
open_system([mdl,'/Shaft/Output Shaft'])
The gear tooth fault is modelled by injecting a disturbance torque at a fixed position in the rotation of the drive shaft. The shaft position is measured in radians and when the shaft position is within a small window around 0 a disturbance force is applied to the shaft. The magnitude of the disturbance is controlled by the model variable ToothFaultGain
, note that ToothFaultGain=0
implies no gear tooth fault.
Simulating Fault and Healthy Data
The transmission model is configured with variables that control the presence and severity of the three different fault types, sensor drift, gear tooth, and shaft wear. By varying the model variables, SDrift
, ToothFaultGain
, and ShaftWear
, you can create vibration data for the different fault types. Use an array of Simulink.SimulationInput
objects to define a number of different simulation scenarios. For example, choose an array of values for each of the model variables and then use the ndgrid
function to create a Simulink.SimulationInput
for each combination of model variable values.
toothFaultArray = -2:2/10:0; % Tooth fault gain values sensorDriftArray = -1:0.5:1; % Sensor drift offset values shaftWearArray = [0 -1]; % Variants available for drive shaft conditions % Create an n-dimensional array with combinations of all values [toothFaultValues,sensorDriftValues,shaftWearValues] = ... ndgrid(toothFaultArray,sensorDriftArray,shaftWearArray); for ct = numel(toothFaultValues):-1:1 % Create a Simulink.SimulationInput for each combination of values siminput = Simulink.SimulationInput(mdl); % Modify model parameters siminput = setVariable(siminput,'ToothFaultGain',toothFaultValues(ct)); siminput = setVariable(siminput,'SDrift',sensorDriftValues(ct)); siminput = setVariable(siminput,'ShaftWear',shaftWearValues(ct)); % Collect the simulation input in an array gridSimulationInput(ct) = siminput; end
Similarly create combinations of random values for each model variable. Make sure to include the 0
value so that there are combinations where only subsets of the three fault types are represented.
rng('default'); % Reset random seed for reproducibility toothFaultArray = [0 -rand(1,6)]; % Tooth fault gain values sensorDriftArray = [0 randn(1,6)/8]; % Sensor drift offset values shaftWearArray = [0 -1]; % Variants available for drive shaft conditions %Create an n-dimensional array with combinations of all values [toothFaultValues,sensorDriftValues,shaftWearValues] = ... ndgrid(toothFaultArray,sensorDriftArray,shaftWearArray); for ct=numel(toothFaultValues):-1:1 % Create a Simulink.SimulationInput for each combination of values siminput = Simulink.SimulationInput(mdl); % Modify model parameters siminput = setVariable(siminput,'ToothFaultGain',toothFaultValues(ct)); siminput = setVariable(siminput,'SDrift',sensorDriftValues(ct)); siminput = setVariable(siminput,'ShaftWear',shaftWearValues(ct)); % Collect the simulation input in an array randomSimulationInput(ct) = siminput; end
With the array of Simulink.SimulationInput
objects defined use the generateSimulationEnsemble
function to run the simulations. The generateSimulationEnsemble
function configures the model to save logged data to file, use the timetable format for signal logging and store the Simulink.SimulationInput
objects in the saved file. The generateSimulationEnsemble
function returns a status flag indicating whether the simulation completed successfully.
The code above created 110 simulation inputs from the gridded variable values and 98
simulation inputs from the random variable values giving 208 total simulations. Running these 208 simulations in parallel can take as much as two hours or more on a standard desktop and generates around 10GB of data. An option to only run the first 10 simulations is provided for convenience.
% Run the simulations and create an ensemble to manage the simulation results if ~exist(fullfile(pwd,'Data'),'dir') mkdir(fullfile(pwd,'Data')) % Create directory to store results end runAll = true; if runAll [ok,e] = generateSimulationEnsemble([gridSimulationInput, randomSimulationInput], ... fullfile(pwd,'Data'),'UseParallel', true, 'ShowProgress', false); else [ok,e] = generateSimulationEnsemble(gridSimulationInput(1:10), fullfile(pwd,'Data'), 'ShowProgress', false); %#ok<*UNRCH> end
Starting parallel pool (parpool) using the 'Processes' profile ... Preserving jobs with IDs: 1 because they contain crash dump files. You can use 'delete(myCluster.Jobs)' to remove all jobs created with profile Processes. To create 'myCluster' use 'myCluster = parcluster('Processes')'. Connected to parallel pool with 6 workers.
The generateSimulationEnsemble
ran and logged the simulation results. Create a simulation ensemble to process and analyze the simulation results using the simulationEnsembleDatastore
command.
ens = simulationEnsembleDatastore(fullfile(pwd,'Data'));
Processing the Simulation Results
The simulationEnsembleDatastore
command created an ensemble object that points to the simulation results. Use the ensemble object to prepare and analyze the data in each member of the ensemble. The ensemble object lists the data variables in the ensemble and by default all the variables are selected for reading.
ens
ens = simulationEnsembleDatastore with properties: DataVariables: [6×1 string] IndependentVariables: [0×0 string] ConditionVariables: [0×0 string] SelectedVariables: [6×1 string] ReadSize: 1 NumMembers: 208 LastMemberRead: [0×0 string] Files: [208×1 string]
ens.SelectedVariables
ans = 6×1 string
"SimulationInput"
"SimulationMetadata"
"Tacho"
"Vibration"
"xFinal"
"xout"
For analysis only read the Vibration
and Tacho
signals and the Simulink.SimulationInput
. The Simulink.SimulationInput
has the model variable values used for simulation and is used to create fault labels for the ensemble members. Use the ensemble read
command to get the ensemble member data.
ens.SelectedVariables = ["Vibration" "Tacho" "SimulationInput"]; data = read(ens)
data=1×3 table
Vibration Tacho SimulationInput
___________________ ___________________ ______________________________
{40272×1 timetable} {40272×1 timetable} {1×1 Simulink.SimulationInput}
Extract the vibration signal from the returned data and plot it.
vibration = data.Vibration{1}; plot(vibration.Time,vibration.Data) title('Vibration') ylabel('Acceleration')
The first 10 seconds of the simulation contains data where the transmission system is starting up; for analysis discard this data.
idx = vibration.Time >= seconds(10); vibration = vibration(idx,:); vibration.Time = vibration.Time - vibration.Time(1);
The Tacho
signal contains pulses for the rotations of the drive and load shafts but analysis, and specifically time synchronous averaging, requires the times of shaft rotations. The following code discards the first 10 seconds of the Tacho
data and finds the shaft rotation times in tachoPulses
.
tacho = data.Tacho{1}; idx = tacho.Time >= seconds(10); tacho = tacho(idx,:); plot(tacho.Time,tacho.Data) title('Tacho pulses') legend('Drive shaft','Load shaft') % Load shaft rotates more slowly than drive shaft
idx = diff(tacho.Data(:,2)) > 0.5; tachoPulses = tacho.Time(find(idx)+1)-tacho.Time(1)
tachoPulses = 8×1 duration
2.8543 sec
6.6508 sec
10.447 sec
14.244 sec
18.04 sec
21.837 sec
25.634 sec
29.43 sec
The Simulink.SimulationInput.Variables
property contains the values of the fault parameters used for the simulation, these values allow us to create fault labels for each ensemble member.
vars = data.SimulationInput{1}.Variables; idx = strcmp({vars.Name},'SDrift'); if any(idx) sF = abs(vars(idx).Value) > 0.01; % Small drift values are not faults else sF = false; end idx = strcmp({vars.Name},'ShaftWear'); if any(idx) sV = vars(idx).Value < 0; else sV = false; end if any(idx) idx = strcmp({vars.Name},'ToothFaultGain'); sT = abs(vars(idx).Value) < 0.1; % Small tooth fault values are not faults else sT = false end faultCode = sF + 2*sV + 4*sT; % A fault code to capture different fault conditions
The processed vibration and tacho signals and the fault labels are added to the ensemble to be used later.
sdata = table({vibration},{tachoPulses},sF,sV,sT,faultCode, ... 'VariableNames',{'Vibration','TachoPulses','SensorDrift','ShaftWear','ToothFault','FaultCode'})
sdata=1×6 table
Vibration TachoPulses SensorDrift ShaftWear ToothFault FaultCode
___________________ ______________ ___________ _________ __________ _________
{30106×1 timetable} {8×1 duration} true false false 1
ens.DataVariables = [ens.DataVariables; "TachoPulses"];
The ensemble ConditionVariables
property can be used to identify the variables in the ensemble that contain condition or fault label data. Set the property to contain the newly created fault labels.
ens.ConditionVariables = ["SensorDrift","ShaftWear","ToothFault","FaultCode"];
The code above was used to process one member of the ensemble. To process all the ensemble members the code above was converted to the function prepareData
and using the ensemble hasdata
command a loop is used to apply prepareData
to all the ensemble members. The ensemble members can be processed in parallel by partitioning the ensemble and processing the ensemble partitions in parallel.
reset(ens) runLocal = false; if runLocal % Process each member in the ensemble while hasdata(ens) data = read(ens); addData = prepareData(data); writeToLastMemberRead(ens,addData) end else % Split the ensemble into partitions and process each partition in parallel n = numpartitions(ens,gcp); parfor ct = 1:n subens = partition(ens,n,ct); while hasdata(subens) data = read(subens); addData = prepareData(data); writeToLastMemberRead(subens,addData) end end end
Plot the vibration signal of every 10th member of the ensemble using the hasdata
and read
commands to extract the vibration signal.
reset(ens) ens.SelectedVariables = "Vibration"; figure, ct = 1; while hasdata(ens) data = read(ens); if mod(ct,10) == 0 vibration = data.Vibration{1}; plot(vibration.Time,vibration.Data) hold on end ct = ct + 1; end hold off title('Vibration signals') ylabel('Acceleration')
Analyzing the Simulation Data
Now that the data has been cleaned and pre-processed the data is ready for extracting features to determine the features to use to classify the different faults types. First configure the ensemble so that it only returns the processed data.
ens.SelectedVariables = ["Vibration","TachoPulses"];
For each member in the ensemble compute a number of time and spectrum based features. These include signal statistics such as signal mean, variance, peak to peak, non-linear signal characteristics such as approximate entropy and Lyapunov exponent, and spectral features such as the peak frequency of the time synchronous average of the vibration signal, and the power of the time synchronous average envelope signal. The analyzeData
function contains a full list of the extracted features. By way of example consider computing the spectrum of the time synchronous averaged vibration signal.
reset(ens) data = read(ens)
data=1×2 table
Vibration TachoPulses
___________________ ______________
{30106×1 timetable} {8×1 duration}
vibration = data.Vibration{1}; % Interpolate the Vibration signal onto periodic time base suitable for fft analysis np = 2^floor(log(height(vibration))/log(2)); dt = vibration.Time(end)/(np-1); tv = 0:dt:vibration.Time(end); y = retime(vibration,tv,'linear'); % Compute the time synchronous average of the vibration signal tp = seconds(data.TachoPulses{1}); vibrationTSA = tsa(y,tp); figure plot(vibrationTSA.Time,vibrationTSA.tsa) title('Vibration time synchronous average') ylabel('Acceleration')
% Compute the spectrum of the time synchronous average np = numel(vibrationTSA); f = fft(vibrationTSA.tsa.*hamming(np))/np; frTSA = f(1:floor(np/2)+1); % TSA frequency response wTSA = (0:np/2)/np*(2*pi/seconds(dt)); % TSA spectrum frequencies mTSA = abs(frTSA); % TSA spectrum magnitudes figure semilogx(wTSA,20*log10(mTSA)) title('Vibration spectrum') xlabel('rad/s')
The frequency that corresponds to the peak magnitude could turn out to be a feature that is useful for classifying the different fault types. The code below computes the features mentioned above for all the ensemble members (running this analysis can take up to an hour on a standard desktop. Optional code to run the analysis in parallel using the ensemble partition
command is provided.) The names of the features are added to the ensemble data variables property before calling writeToLastMemberRead to add the computed features to each ensemble member.
reset(ens) ens.DataVariables = [ens.DataVariables; ... "SigMean";"SigMedian";"SigRMS";"SigVar";"SigPeak";"SigPeak2Peak";"SigSkewness"; ... "SigKurtosis";"SigCrestFactor";"SigMAD";"SigRangeCumSum";"SigCorrDimension";"SigApproxEntropy"; ... "SigLyapExponent";"PeakFreq";"HighFreqPower";"EnvPower";"PeakSpecKurtosis"]; if runLocal while hasdata(ens) data = read(ens); addData = analyzeData(data); writeToLastMemberRead(ens,addData); end else % Split the ensemble into partitions and analyze each partition in parallel n = numpartitions(ens,gcp); parfor ct = 1:n subens = partition(ens,n,ct); while hasdata(subens) data = read(subens); addData = analyzeData(data); writeToLastMemberRead(subens,addData) end end end
Selecting Features for Fault Classification
The features computed above are used to build a classifier to classify the different fault conditions. First configure the ensemble to read only the derived features and the fault labels.
featureVariables = analyzeData('GetFeatureNames');
ens.DataVariables = [ens.DataVariables; featureVariables];
ens.SelectedVariables = [featureVariables; ens.ConditionVariables];
reset(ens)
Gather the feature data for all the ensemble members into one table.
featureData = gather(tall(ens))
Evaluating tall expression using the Parallel Pool 'Processes': - Pass 1 of 1: Completed in 14 sec Evaluation completed in 14 sec
featureData=208×22 table
SigMean SigMedian SigRMS SigVar SigPeak SigPeak2Peak SigSkewness SigKurtosis SigCrestFactor SigMAD SigRangeCumSum SigCorrDimension SigApproxEntropy SigLyapExponent PeakFreq HighFreqPower EnvPower PeakSpecKurtosis SensorDrift ShaftWear ToothFault FaultCode
________ _________ _______ _______ _______ ____________ ___________ ___________ ______________ _______ ______________ ________________ ________________ _______________ ________ _____________ __________ ________________ ___________ _________ __________ _________
-0.94876 -0.9722 1.3726 0.98387 0.81571 3.6314 -0.041525 2.2666 2.0514 0.8081 28562 1.1427 0.031581 79.531 0 6.7528e-06 3.2349e-07 162.13 true false false 1
-0.97537 -0.98958 1.3937 0.99105 0.81571 3.6314 -0.023777 2.2598 2.0203 0.81017 29418 1.1362 0.037835 70.339 0 5.0788e-08 9.1568e-08 226.12 true false false 1
1.0502 1.0267 1.4449 0.98491 2.8157 3.6314 -0.04162 2.2658 1.9487 0.80853 31710 1.1479 0.031565 125.18 0 6.7416e-06 3.1343e-07 162.13 true true false 3
1.0227 1.0045 1.4288 0.99553 2.8157 3.6314 -0.016356 2.2483 1.9707 0.81324 30984 1.1472 0.032088 112.52 0 4.9939e-06 2.6719e-07 162.13 true true false 3
1.0123 1.0024 1.4202 0.99233 2.8157 3.6314 -0.014701 2.2542 1.9826 0.81156 30661 1.147 0.03287 109.02 0 3.6182e-06 2.1964e-07 230.39 true true false 3
1.0275 1.0102 1.4338 1.0001 2.8157 3.6314 -0.02659 2.2439 1.9638 0.81589 31102 1.0975 0.033427 64.498 0 2.5493e-06 1.9224e-07 230.39 true true false 3
1.0464 1.0275 1.4477 1.0011 2.8157 3.6314 -0.042849 2.2455 1.9449 0.81595 31665 1.1417 0.034159 98.759 0 1.7313e-06 1.6263e-07 230.39 true true false 3
1.0459 1.0257 1.4402 0.98047 2.8157 3.6314 -0.035405 2.2757 1.955 0.80583 31554 1.1345 0.0353 44.304 0 1.1115e-06 1.2807e-07 230.39 true true false 3
1.0263 1.0068 1.4271 0.98341 2.8157 3.6314 -0.0165 2.2726 1.973 0.80624 30951 1.1515 0.035897 125.28 0 6.5947e-07 1.208e-07 162.13 true true false 3
1.0103 1.0014 1.4183 0.99091 2.8157 3.6314 -0.011667 2.2614 1.9853 0.80987 30465 1.0619 0.036489 17.093 0 5.2297e-07 1.0704e-07 230.39 true true false 3
1.0129 1.0023 1.419 0.98764 2.8157 3.6314 -0.010969 2.266 1.9843 0.80866 30523 1.1371 0.037209 84.57 0 2.1605e-07 8.5562e-08 230.39 true true false 3
1.0251 1.0104 1.4291 0.9917 2.8157 3.6314 -0.023609 2.2588 1.9702 0.81048 30896 1.1261 0.037811 98.462 0 5.0275e-08 9.0495e-08 230.39 true true false 3
-0.97301 -0.99243 1.3928 0.99326 0.81571 3.6314 -0.012569 2.2589 2.0216 0.81095 29351 1.1277 0.038481 42.887 0 1.1383e-11 8.3005e-08 230.39 true false true 5
1.0277 1.0084 1.4315 0.99314 2.8157 3.6314 -0.013542 2.2598 1.9669 0.81084 30963 1.1486 0.038499 99.426 0 1.1346e-11 8.353e-08 226.12 true true true 7
0.026994 0.0075709 0.99697 0.99326 1.8157 3.6314 -0.012569 2.2589 1.8212 0.81095 1083.8 1.1277 0.038481 44.448 9.998 4.9172e-12 8.3005e-08 230.39 false false true 4
0.026943 0.0084639 0.99297 0.98531 1.8157 3.6314 -0.018182 2.2686 1.8286 0.80732 1466.6 1.1368 0.035799 93.95 1.8628 6.8862e-07 1.1702e-07 230.39 false false false 0
⋮
Consider the sensor drift fault. Use the fscnca
command with all the features computed above as predictors and the sensor drift fault label (a true false value) as the response. The fscnca
command returns weights for each feature and features with higher weights have higher importance in predicting the response. For the sensor drift fault the weights indicate that two features are significant predictors (the signal cumulative sum range and the peak frequency of the spectral kurtosis) and the remaining features have little impact.
idxResponse = strcmp(featureData.Properties.VariableNames,'SensorDrift'); idxLastFeature = find(idxResponse)-1; % Index of last feature to use as a potential predictor featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.SensorDrift); featureAnalysis.FeatureWeights
ans = 18×1
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
⋮
idxSelectedFeature = featureAnalysis.FeatureWeights > 0.1; classifySD = [featureData(:,idxSelectedFeature), featureData(:,idxResponse)]
classifySD=208×3 table
SigRangeCumSum PeakSpecKurtosis SensorDrift
______________ ________________ ___________
28562 162.13 true
29418 226.12 true
31710 162.13 true
30984 162.13 true
30661 230.39 true
31102 230.39 true
31665 230.39 true
31554 230.39 true
30951 162.13 true
30465 230.39 true
30523 230.39 true
30896 230.39 true
29351 230.39 true
30963 226.12 true
1083.8 230.39 false
1466.6 230.39 false
⋮
A grouped histogram of the cumulative sum range gives us insight into why this feature is a significant predictor for the sensor drift fault.
figure histogram(classifySD.SigRangeCumSum(classifySD.SensorDrift),'BinWidth',5e3) xlabel('Signal cumulative sum range') ylabel('Count') hold on histogram(classifySD.SigRangeCumSum(~classifySD.SensorDrift),'BinWidth',5e3) hold off legend('Sensor drift fault','No sensor drift fault')
The histogram plot shows that the signal cumulative sum range is a good featured for detecting sensor drift faults though an additional feature is probably needed as there may be false positives when the signal cumulative sum range is below 0.5 if just the signal cumulative sum range is used to classify sensor drift.
Consider the shaft wear fault. In this case the fscnca
function indicates that there are 3 features that are significant predictors for the fault (the signal Lyapunov exponent, peak frequency, and the peak frequency in the spectral kurtosis), choose these to classify the shaft wear fault.
idxResponse = strcmp(featureData.Properties.VariableNames,'ShaftWear');
featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.ShaftWear);
featureAnalysis.FeatureWeights
ans = 18×1
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
⋮
idxSelectedFeature = featureAnalysis.FeatureWeights > 0.1; classifySW = [featureData(:,idxSelectedFeature), featureData(:,idxResponse)]
classifySW=208×4 table
SigLyapExponent PeakFreq PeakSpecKurtosis ShaftWear
_______________ ________ ________________ _________
79.531 0 162.13 false
70.339 0 226.12 false
125.18 0 162.13 true
112.52 0 162.13 true
109.02 0 230.39 true
64.498 0 230.39 true
98.759 0 230.39 true
44.304 0 230.39 true
125.28 0 162.13 true
17.093 0 230.39 true
84.57 0 230.39 true
98.462 0 230.39 true
42.887 0 230.39 false
99.426 0 226.12 true
44.448 9.998 230.39 false
93.95 1.8628 230.39 false
⋮
The grouped histogram for the signal Lyapunov exponent shows why that feature alone is not a good predictor.
figure histogram(classifySW.SigLyapExponent(classifySW.ShaftWear)) xlabel('Signal lyapunov exponent') ylabel('Count') hold on histogram(classifySW.SigLyapExponent(~classifySW.ShaftWear)) hold off legend('Shaft wear fault','No shaft wear fault')
The shaft wear feature selection indicates multiple features are needed to classify the shaft wear fault, the grouped histogram confirms this as the most significant feature (in this case the Lyapunov exponent) has a similar distribution for both faulty and fault free scenarios indicating that more features are needed to correctly classify this fault.
Finally consider the tooth fault, the fscnca
function indicates that there are 3 features primary that are significant predictors for the fault (the signal cumulative sum range, the signal Lyapunov exponent and the peak frequency in the spectral kurtosis). Choosing those 3 features to classify the tooth fault results in a classifier that has poor performance. Instead, use the 6 most important features.
idxResponse = strcmp(featureData.Properties.VariableNames,'ToothFault');
featureAnalysis = fscnca(featureData{:,1:idxLastFeature},featureData.ToothFault);
[~,idxSelectedFeature] = sort(featureAnalysis.FeatureWeights);
classifyTF = [featureData(:,idxSelectedFeature(end-5:end)), featureData(:,idxResponse)]
classifyTF=208×7 table
SigPeak PeakFreq SigCorrDimension SigLyapExponent PeakSpecKurtosis SigRangeCumSum ToothFault
_______ ________ ________________ _______________ ________________ ______________ __________
0.81571 0 1.1427 79.531 162.13 28562 false
0.81571 0 1.1362 70.339 226.12 29418 false
2.8157 0 1.1479 125.18 162.13 31710 false
2.8157 0 1.1472 112.52 162.13 30984 false
2.8157 0 1.147 109.02 230.39 30661 false
2.8157 0 1.0975 64.498 230.39 31102 false
2.8157 0 1.1417 98.759 230.39 31665 false
2.8157 0 1.1345 44.304 230.39 31554 false
2.8157 0 1.1515 125.28 162.13 30951 false
2.8157 0 1.0619 17.093 230.39 30465 false
2.8157 0 1.1371 84.57 230.39 30523 false
2.8157 0 1.1261 98.462 230.39 30896 false
0.81571 0 1.1277 42.887 230.39 29351 true
2.8157 0 1.1486 99.426 226.12 30963 true
1.8157 9.998 1.1277 44.448 230.39 1083.8 true
1.8157 1.8628 1.1368 93.95 230.39 1466.6 false
⋮
figure histogram(classifyTF.SigRangeCumSum(classifyTF.ToothFault)) xlabel('Signal cumulative sum range') ylabel('Count') hold on histogram(classifyTF.SigRangeCumSum(~classifyTF.ToothFault)) hold off legend('Gear tooth fault','No gear tooth fault')
Using the above results a polynomial svm to classify gear tooth faults. Split the feature table into members that are used for training and members for testing and validation. Use the training members to create a svm classifier using the fitcsvm
command.
rng('default') % For reproducibility cvp = cvpartition(size(classifyTF,1),'KFold',5); % Randomly partition the data for training and validation classifierTF = fitcsvm(... classifyTF(cvp.training(1),1:end-1), ... classifyTF(cvp.training(1),end), ... 'KernelFunction','polynomial', ... 'PolynomialOrder',2, ... 'KernelScale','auto', ... 'BoxConstraint',1, ... 'Standardize',true, ... 'ClassNames',[false; true]);
Use the classifier to classify the test points using the predict
command and check the performance of the predictions using a confusion matrix.
% Use the classifier on the test validation data to evaluate performance actualValue = classifyTF{cvp.test(1),end}; predictedValue = predict(classifierTF, classifyTF(cvp.test(1),1:end-1)); % Check performance by computing and plotting the confusion matrix confdata = confusionmat(actualValue,predictedValue); h = heatmap(confdata, ... 'YLabel', 'Actual gear tooth fault', ... 'YDisplayLabels', {'False','True'}, ... 'XLabel', 'Predicted gear tooth fault', ... 'XDisplayLabels', {'False','True'}, ... 'ColorbarVisible','off');
The confusion matrix indicates that the classifier correctly classifies all non-fault conditions but incorrectly classifies one expected fault condition as not being a fault. Increasing the number of features used in the classifier can help improve the performance further.
Summary
This example walked through the workflow for generating fault data from Simulink, using a simulation ensemble to clean up the simulation data and extract features. The extracted features were then used to build classifiers for the different fault types.