Classification of Induction Machine Faults using Time Frequency Representation and Particle Swarm Optimization

  • Medoued, A. (Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda) ;
  • Lebaroud, A. (Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda) ;
  • Laifa, A. (Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda) ;
  • Sayad, D. (Dept. of Electrical Engineering, University of 20 Aout 1955-Skikda)
  • Received : 2012.12.26
  • Accepted : 2013.08.03
  • Published : 2014.01.01


This paper presents a new method for classifying induction machine faults using Time-Frequency Representation (TFR), Particle Swarm Optimization (PSO) and an artificial neural network. The essence of the feature extraction is to project the faulty machine signal onto a low-dimensional time-frequency representation deliberately designed to maximize the separability between classes; a distinct TFR is designed for each class. The size of the feature vectors is optimized using the Particle Swarm Optimization method (PSO), and the classifier is designed using an artificial neural network. This method allows accurate classification independently of the load level. The introduction of PSO into the classification procedure has given good results with the reduced feature-vector size obtained by the optimization process. These results are validated on a 5.5-kW induction motor test bench.

1. Introduction

Today’s industry strives to improve performance and profitability while maintaining and improving safety. The challenges include the reliable and safe operation of electric motors in industrial processes. Thus, very expensive scheduled maintenance is performed in order to detect machine problems before they result in catastrophic failure [1-2]. Nowadays, maintenance cost reduction is the number one priority for electrical drives, to prevent unscheduled downtimes and to increase operational effectiveness. Recent advances in signal processing techniques, such as artificial neural networks [3-8] and wavelets [9], have provided more powerful tools for fault diagnosis.

The problem with most diagnosis systems is that they use signals in either the time or the frequency domain. In our approach, instead of a time-only or frequency-only analysis, it is potentially more informative to use both time and frequency. Time-frequency analysis of the motor current makes signal properties related to fault detection more evident in the transform domain [10].

Traditionally, the objective of time-frequency research is to create a function that describes the energy density of a signal simultaneously in time and frequency. For classification, however, it is not necessarily desirable to represent the energy distribution of a signal accurately in time and frequency. In fact, such a representation may conflict with the goal of classification, which is to generate a TFR that maximizes the separability between TFRs from different classes. It may therefore be advantageous to design TFRs that specifically highlight differences between classes [11-14].

Since all TFRs can be derived from the ambiguity plane, no a priori assumption is made about the smoothing required for accurate classification. Thus, the smoothed quadratic TFRs retain only the information that is essential for classification.

This classification allows us to apply an optimization routine based on the particle swarm technique to find the appropriate size of the feature vectors, in order to reduce calculation time while keeping only the relevant signal information within the vectors.

In this paper, we propose a classification algorithm based on the design of an optimized TFR from a time-frequency ambiguity plane in order to extract the feature vector. The optimal size of the feature vectors is obtained by the PSO algorithm. The PSO technique can generate high-quality solutions with shorter calculation times and more stable convergence characteristics than other stochastic methods [15-17].

Finally, a neural network-based decision criterion is used for classification. The goal of this work is the realization of an accurate classification system for motor faults such as bearing faults, stator faults and broken rotor bars, independently of the load level.


2. Classification Algorithm

The classification algorithm consists of three parts: feature extraction, feature-vector optimization and decision making. In the training stage, three optimal kernels are designed to separate four classes [18]:

1) Class of healthy motor;
2) Class of bearing fault;
3) Class of stator fault;
4) Class of broken bars.

The kernel design process selects, for each class, a number of locations from the time-frequency ambiguity plane. In the decision-making stage, we propose an ANN classifier with the Levenberg-Marquardt algorithm. The details of each step are described in the following sections.


3. Feature Extraction

3.1 Optimal TFR

For further details, we refer the reader to our previous works [19] and [20].

The expression of the TFR is given by:

TFRi(t, f) = Σ_η Σ_τ A(η, τ) φ(η, τ) e^(−j2π(ηt + τf))

The characteristic function of each TFR is A(η, τ)φ(η, τ), where η represents the discrete frequency shift and τ represents the discrete time delay. This means that the optimal-classification representation TFRi can be obtained by smoothing the ambiguity plane A(η, τ) with an appropriate kernel φopt, which is an optimal classification kernel. The problem of designing TFRi thus becomes equivalent to designing the optimal classification kernel φopt(η, τ). This method, used to design kernels (and thus TFRs), optimizes the discrimination between predefined sets of classes.

Features can be extracted directly from A(η, τ)φopt(η, τ) instead of from the optimal classification TFRi. This shortcut reduces the computational complexity of the feature extraction.
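This shortcut can be sketched in a few lines of Python. The following is a minimal illustration, assuming a simple narrowband discrete ambiguity function; the normalization and FFT-based acceleration used in [19-20] may differ:

```python
import cmath

def ambiguity_plane(x):
    """Discrete narrowband ambiguity function A(eta, tau) of a complex
    signal x. A naive O(N^3) sketch; practical code uses FFTs."""
    N = len(x)
    A = [[0j] * N for _ in range(N)]
    for eta in range(N):            # discrete frequency shift
        for tau in range(N):        # discrete time delay
            s = 0j
            for n in range(N - tau):
                s += x[n + tau] * x[n].conjugate() * cmath.exp(-2j * cmath.pi * eta * n / N)
            A[eta][tau] = s
    return A

def extract_features(A, locations):
    """Read features directly off A(eta, tau) at the kernel's selected
    Doppler-delay locations, skipping the full TFR computation."""
    return [abs(A[eta][tau]) for (eta, tau) in locations]
```

Because the kernel is nonzero only at the selected locations, evaluating A at those points is equivalent to smoothing and then sampling the full TFR, which is where the computational saving comes from.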

3.2 Design of classification kernels

The kernel φopt(η, τ) is designed for each specific classification task. We determine N locations in the ambiguity plane such that the values at these locations are very similar for signals from the same class but vary significantly for signals from different classes. In our design, we use Fisher’s discriminant ratio (FDR) [19-20] to select these N locations.

In our classification procedure, C−1 kernels must be designed for a C-class classification system. In order to avoid unnecessary computation in separating the classes, we have proposed the principle of remaining classes [11]: the discrimination between the different classes is made by separating class i from all the remaining classes {i+1, …, C}. In this case, the stator fault kernel is designed to discriminate the stator fault class from the other classes (rotor fault, bearing fault and healthy motor). The rotor fault kernel is designed to discriminate the rotor fault class from the remaining classes (bearing fault and healthy motor). The bearing fault kernel is designed to discriminate the bearing fault class from the healthy motor class. The advantage of the method lies in the optimum separation between the different classes.
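The location-selection step can be sketched as follows. This is an illustrative version using the two-class Fisher ratio on magnitude values; the exact FDR formulation in [19-20] may differ:

```python
from statistics import mean, pvariance

def fisher_ratio(a, b):
    """Fisher discriminant ratio between two sets of values observed at
    one ambiguity-plane location (small epsilon guards division by zero)."""
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)

def select_locations(class_planes, rest_planes, n_locations):
    """Rank every (eta, tau) point by the FDR of its magnitude across
    training signals of the target class vs. the remaining classes,
    and keep the top N locations as the kernel support.
    Planes are equally-shaped 2-D lists of |A(eta, tau)| values."""
    rows, cols = len(class_planes[0]), len(class_planes[0][0])
    scored = []
    for eta in range(rows):
        for tau in range(cols):
            a = [p[eta][tau] for p in class_planes]
            b = [p[eta][tau] for p in rest_planes]
            scored.append((fisher_ratio(a, b), (eta, tau)))
    scored.sort(reverse=True)
    return [loc for _, loc in scored[:n_locations]]
```

Pooling all remaining classes into `rest_planes` mirrors the remaining-classes principle above: each kernel only has to separate class i from the union of the classes that follow it.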


4. Feature Vector Optimization

One objective of our approach is to reduce the signal to a feature vector of very small size without losing relevant information. Hence, the search for an optimal size of this vector provides a good compromise between the relevance of the information and the computational cost.

4.1 Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO), introduced by Eberhart and Kennedy [21], is based on the analogy of bird flocking and fish schooling. In PSO, each individual, called a particle, makes its decision using its own experience together with the other individuals’ experience. Two different definitions are used: the individual best and the global best. As a particle moves through the search space, it compares its fitness value at the current position to the best fitness value it has ever attained previously. The best position associated with the best fitness encountered so far is called the individual best, or pbest. The global best, or gbest, is the best position among all of the individual best positions achieved so far (Fig. 1).

Fig. 1.Particle swarm method principle

Using the gbest and the pbest, the velocity of the ith particle is updated according to the following equation [22]:

v_i^(k+1) = w·v_i^(k) + c1·rand1·(pbest_i − x_i^(k)) + c2·rand2·(gbest − x_i^(k))    (3)

Based on the updated velocities, each particle changes its position according to the equation:

x_i^(k+1) = x_i^(k) + v_i^(k+1)    (4)

where w is a weighting function, c_j are acceleration factors and rand_j is a random number between 0 and 1.

The following weighting function is usually utilized:

w = wmax − ((wmax − wmin)/itermax)·iter

where wmax is the initial weight, wmin the final weight, itermax is the maximum iteration number, and iter is the current iteration number.

The parameters used in this work are taken as follows [22-26]:

c1=c2=2.05; wmin =0.1; wmax =0.9.

Selection of maximum velocity:

At each iteration step, the algorithm proceeds by adjusting the distance (velocity) that each particle moves in every dimension of the problem hyperspace. The velocity of the particle is a stochastic variable and is, therefore, subject to creating an uncontrolled trajectory, making the particle follow wider cycles in the problem space. In order to damp these oscillations, upper and lower limits can be defined for the velocity vi :

Most of the time, the value of vmax is selected empirically, according to the characteristics of the problem. It is important to note that if the value of this parameter is too large, then the particles may move erratically, going beyond a good solution; on the other hand, if vmax is too small, then the particle’s movement is limited and the optimal solution may not be reached.

Fan and Shi [27] have shown that an appropriate, dynamically changing vmax can improve the PSO algorithm performance. To ensure uniform velocities, we fixed vmax based on numerous test runs.

Integer PSO formulation:

In the case where integer variables are included in the optimization problem, such as the size of the feature vector, the PSO algorithm can be reformulated by rounding off the particle’s position to the nearest integer. Mathematically, (3) and (4) are still valid, but once the new particle position is determined in real-number space, it must be converted to the integer space.

4.2 Fitness function

To search for an optimized feature-vector size with the PSO algorithm, a fitness function is needed. In this work, we take the variance computed for each feature-vector size as the fitness of that size, and the goal is to optimize this fitness.
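Equations (3) and (4), the decreasing inertia weight, the velocity limit and the integer rounding can be combined into one short sketch. This is a generic single-variable version; the fitness callback here is a stand-in for the paper's variance-based fitness:

```python
import random

def pso_integer(fitness, lo, hi, n_particles=10, iters=60,
                c1=2.05, c2=2.05, w_max=0.9, w_min=0.1, v_max=4.0):
    """Integer PSO sketch for a single decision variable (here, the
    feature-vector size). Positions evolve in real space with a clamped
    velocity and are rounded to the nearest integer before evaluation."""
    pos = [random.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest, pbest_f = pos[:], [fitness(round(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g], pbest_f[g]
    for it in range(iters):
        w = w_max - (w_max - w_min) * it / iters      # decreasing inertia weight
        for i in range(n_particles):
            vel[i] = (w * vel[i]                       # Eq. (3)
                      + c1 * random.random() * (pbest[i] - pos[i])
                      + c2 * random.random() * (gbest - pos[i]))
            vel[i] = max(-v_max, min(v_max, vel[i]))   # velocity limit
            pos[i] = max(lo, min(hi, pos[i] + vel[i])) # Eq. (4), kept in bounds
            f = fitness(round(pos[i]))                 # integer-space evaluation
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i], f
                if f > gbest_f:
                    gbest, gbest_f = pos[i], f
    return round(gbest)
```

For example, `pso_integer(lambda n: -(n - 10) ** 2, 1, 50)` maximizes a hypothetical unimodal fitness peaked at a vector size of 10.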


5. Classification Using Neural Networks

In most cases, an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. The learning procedure tries to find a set of connection weights w that yields a mapping that fits the training set well.

Furthermore, neural networks can be viewed as highly nonlinear functions with the basic form:

y = F(x, w)

where x is the input vector presented to the network, w are the weights of the network, and y is the corresponding output vector approximated or predicted by the network. The weight vector w is commonly ordered first by layer, then by neuron, and finally by the weights of each neuron plus its bias.
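The view of the network as a parameterized function F(x, w) can be made concrete with a deliberately tiny, hypothetical topology (2 inputs, 2 tanh hidden neurons, 1 linear output), with w flattened in the layer/neuron/weights-plus-bias order described above:

```python
import math

def forward(x, w):
    """y = F(x, w) for a minimal 2-input, 2-hidden-neuron, 1-output net.
    w = (w11, w12, b1, w21, w22, b2, v1, v2, bo): each hidden neuron's
    weights followed by its bias, then the output neuron's weights and bias."""
    w11, w12, b1, w21, w22, b2, v1, v2, bo = w
    h1 = math.tanh(w11 * x[0] + w12 * x[1] + b1)
    h2 = math.tanh(w21 * x[0] + w22 * x[1] + b2)
    return v1 * h1 + v2 * h2 + bo
```

Training then amounts to searching this flat w vector, which is exactly what lets standard function-optimization methods such as Levenberg-Marquardt be applied.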

This view of network as a parameterized function will be the basis for applying standard function optimization methods to solve the problem of neural network training.

5.1 Network training as a function optimization problem

As mentioned previously, neural networks can be viewed as highly non-linear functions. From this perspective, the training problem can be considered a general function optimization problem, with the adjustable parameters being the weights and biases of the network, and the Levenberg-Marquardt algorithm can be applied straightforwardly in this case.

5.2 Levenberg-Marquardt algorithm

Basically, it consists in solving the equation:

(J^T J + λI) δ = J^T E    (11)

where J is the Jacobian matrix (Eq. 8), λ is Levenberg's damping factor, δ is the desired weight update vector, and E is the error vector containing the output errors for each input vector used in training the network. The δ tells us by how much we should change the network weights to achieve a (possibly) better solution. The matrix J^T J is also known as the approximated Hessian. The damping factor λ is adjusted at each iteration and guides the optimization process: if the reduction of E is rapid, a smaller value can be used, bringing the algorithm closer to the Gauss-Newton algorithm, whereas if an iteration gives an insufficient reduction in the residual, λ can be increased, giving a step closer to the gradient descent direction.

5.3 Computing the Jacobian

The Jacobian is an N-by-M matrix of all first-order partial derivatives of a vector-valued function, where N is the number of entries in our training set and M is the total number of parameters (weights + biases) of the network. It can be created by taking the partial derivative of each output with respect to each weight, and has the form:

J_ij = ∂F(x_i, w) / ∂w_j    (8)

where F(x_i, w) is the network function evaluated for the ith input vector of the training set using the weight vector w, and w_j is the jth element of the weight vector w of the network.

5.4 General Levenberg-Marquardt algorithm

As stated earlier, the Levenberg-Marquardt algorithm basically consists in solving (11) with different values of λ until the sum of squared errors decreases. So, each learning iteration (epoch) consists of the following basic steps:

1. Compute the Jacobian.
2. Compute the error gradient: g = J^T E.
3. Approximate the Hessian: H = J^T J.
4. Solve (H + λI)δ = g to find δ.
5. Update the network weights w using δ.
6. Recalculate the sum of squared errors.
7. If the sum of squared errors has not decreased, discard the new weights, increase λ using v and go to step 4.
8. Otherwise, decrease λ using v and stop.

Variations of the algorithm may use different values of v, one for decreasing λ and another for increasing it. Others may solve (H + λ·diag(H))δ = g instead of (H + λI)δ = g, while others may select the initial λ according to the size of the elements of H, by setting λ0 = t·max(diag(H)), where t is a chosen value.

We can see that a problem arises if the error does not decrease after some iterations. In this case, the algorithm also stops when λ becomes too large [28-29].
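The steps above can be sketched on a deliberately scalar, hypothetical example: fitting the one-parameter model y = w·x, so that J, H and g are scalars and solving (H + λI)δ = g reduces to a division. The same λ schedule applies unchanged to the multi-weight network case:

```python
def levenberg_marquardt(xs, ys, w=0.0, lam=1e-3, v=10.0, epochs=100):
    """LM sketch for the one-parameter model y = w*x."""
    def sse(w):
        return sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    err = sse(w)
    for _ in range(epochs):
        J = list(xs)                             # dF/dw for each sample
        E = [y - w * x for x, y in zip(xs, ys)]  # residuals
        H = sum(j * j for j in J)                # approximated Hessian J^T J
        g = sum(j * e for j, e in zip(J, E))     # error gradient J^T E
        delta = g / (H + lam)                    # solve (H + lam I) delta = g
        new_err = sse(w + delta)
        if new_err < err:                        # accept: move toward Gauss-Newton
            w, err, lam = w + delta, new_err, lam / v
        else:                                    # reject: move toward gradient descent
            lam *= v
            if lam > 1e10:                       # stop when lambda becomes too large
                break
    return w
```

With xs = [1, 2, 3] and ys = [2, 4, 6], the fitted w converges to 2 in a few accepted steps.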


6. Experiment Results

The experimental data are collected in the Ampère Laboratory, University of Lyon. The experimental bench consists of a three-phase squirrel-cage asynchronous motor (Leroy Somer LS 132S, IP 55, Class F, standard ambient temperature 40 °C). The motor is loaded by a powder brake; its maximum torque (100 N·m) is reached at rated speed.

This brake is sized to dissipate a maximum power of 5 kW. Fig. 2 shows the motor bench. The wear obtained on the bearings is real (Fig. 3). For the rotor fault, a bar of the squirrel cage has been broken by drilling (Fig. 4). To simulate the stator imbalance fault, an imbalanced supply is obtained with a variable auto-transformer placed on one phase of the network (Fig. 2).

Fig. 2.The 5.5 kW motor coupled with load (powder brake).

Fig. 3.Accelerated wear of the bearings by immersion in acid.

Fig. 4.Rotor with broken bars

The acquisition system used to measure these signals consists of eight 14-bit differential inputs that can sample currents at up to 20 MHz.

The current signals are sampled at 20 kHz. The number of samples per signal amounts to N = 100000 over an acquisition period of 5 s. The data acquisition set consists of 15 examples of stator current recorded at different load levels (0%, 25%, 50%, 75% and 100%). Different operating conditions of the machine were considered, namely healthy, bearing fault, stator fault and rotor fault. The training set is built from the first ten current examples; the last five are used to test the classification.

Each signal is passed through a lowpass filter and resampled with a downsampling rate of 50. Only the range of required frequencies is preserved; the lowpass filter avoids aliasing during downsampling. The dimension of the ambiguity plane is 200×200 = 40000 points; by exploiting symmetry about the origin, we retain only a quarter of the ambiguity plane, which corresponds to N = 10000. We designed three kernels: a stator fault kernel, a rotor fault kernel and a bearing fault kernel [18]. The Fisher point locations in the Doppler-delay plane are arranged in the feature vectors {FV1, …, FVN} that form the training database of the neural network. In a neural network, if there are too few neurons in the hidden layer, the network may not have sufficient degrees of freedom to form a representation; if too many neurons are defined, the network might become overtrained. Therefore, an optimal design of the number of neurons is required. In this work, we used one hidden layer with different numbers of neurons to determine the most suitable network. As a stopping criterion we set a goal of 10^-12, which defines the convergence of the algorithm. The goal is reached in a minimum number of epochs, 16 and 24, as shown in Figs. 5 and 6 respectively.

Fig. 5.Training diagrams for optimum case of 5 hidden neurons in kernel 1.

Fig. 6.Training diagrams for optimum case of 5 hidden neurons in kernel 2

The training algorithm gives the best performance with 5 neurons in the hidden layer for the three kernels (Table 1).

Table 1.Misclassification results

Fig. 7 shows that, for 15 test vectors in the case of kernel 1, 14 were correctly classified, which indicates that the classification error is acceptable. This is also true for the two other kernels. Furthermore, increasing the size of the feature vector significantly reduces this error. However, the classification error is minimized when we increase the number of training vectors to 35 (10 vectors of stator currents at 0% of rated load, 5 at 25%, 5 at 50%, 5 at 75% and 10 at 100%). Fig. 8 clearly shows a marked improvement in the classification process.

Fig. 7.Classification of test vectors for 20 training vectors

Fig. 8.Classification of test vectors for 35 training vectors

The objective of introducing PSO is the optimization of the feature-vector size. By considering the variance as the fitness function, the optimal size of the feature vectors was found to be 10. This means that the 10 elements with the largest variance values are the most relevant (Figs. 9, 10). It is important to note that the number of training vectors strongly correlates with the number of correctly classified vectors, as can be seen in Fig. 11.

Fig. 9.Feature vectors size optimization by PSO (class 1)

Fig. 10.Feature vectors size optimization by PSO (class 2)

Fig. 11.Classification of test vectors versus training vectors


7. Conclusion

In this paper, we have proposed a new induction machine fault classification algorithm based on TFR and ANN. We introduced the PSO algorithm to optimize the size of the feature vectors. Our classification is based on the Doppler-delay ambiguity plane, from which all TFRs can be derived by a suitable choice of kernel; each type of fault was characterized by a specific kernel. The classification algorithm was tested against experimental data collected from stator current measurements at different load levels. The assignment of signals was made by an ANN classifier. The results show that the new algorithm, with the neural network classifier as the decision criterion and PSO as the optimization technique, is able to detect and diagnose faults with acceptable accuracy and lower computational cost than the case without PSO optimization, independently of the load condition and the fault type.


  1. P. J. Tavner, B. G. Gaydon, and D. M. Ward, "Monitoring Generators and Large Motors," Proc. Inst. Elect. Eng. - B, vol. 133, no. 3, pp. 169-180, May 1986.
  2. P. Vas, "Parameter Estimation, Condition Monitoring and Diagnosis of Electrical Machines". Oxford, U.K.: Clarendon, 1993.
  3. Bouzid, M.; Champenois, G.; Bellaaj, N. M.; Signac, L.; Jelassi, K., "An Effective Neural Approach for the Automatic Location of Stator Interturn Faults in Induction Motor," IEEE Transactions on Industrial Electronics, vol. 55, no. 12, pp. 4277-4289, 2008.
  4. Lebaroud Abdesselam, Clerc Guy "Study of Rotor Asymmetry Effects of an Induction Machine by Finite Element Method", JEET, Journal of Electrical Engineering & Technology, Vol. 6, No. 3, pp. 342-349, 2011.
  5. Cupertino, F.; Giordano, V.; Mininno, E.; Salvatore, L., "Application of Supervised and Unsupervised Neural Networks for Broken Rotor Bar Detection in Induction Motors," IEEE International Conference on Electric Machines and Drives, pp. 1895-1901, 2005.
  6. Ammar Medoued, Abdesselem Lebaroud, Ahcene Boukadoum and Guy Clerc, "On-line Faults Signature Monitoring Tool for, Induction Motor Diagnosis", Journal of Electrical Engineering & Technology Vol. 5, No. 1, pp. 140-145, 2010.
  7. Chow, M.-Y.; Mangum, P. M.; Yee, S. O., "A neural network approach to real-time condition monitoring of induction motors," IEEE Transactions on Industrial Electronics, vol. 38, no. 6, pp. 448-453, 1991.
  8. H. Su and K. T. Chong, "Induction Machine Condition Monitoring Using Neural Network Modeling," IEEE Trans. Ind. Electron., vol. 54, no. 1, pp. 241-249, Feb. 2007.
  9. A. Ordaz-Moreno, R. de Jesus Romero-Troncoso, J. A. Vite-Frias, J. R. Rivera-Gillen, and A. Garcia-Perez, "Automatic Online Diagnosis Algorithm for Broken-Bar Detection on Induction Motors Based on Discrete Wavelet Transform for FPGA Implementation", IEEE Trans. Ind. Electron., vol. 55, no. 5, pp. 2193-2202, May 2008.
  10. B. Yazıcı and G. B. Kliman, "An Adaptive Statistical Time-Frequency Method For Detection of Broken Bars and Bearing Faults in Motors Using Stator current," IEEE Trans. Ind. Appl., vol. 35, no. 2, pp. 442-452, Mar./Apr. 1999.
  11. M. Wang, G. I. Rowe, and A. V. Mamishev, "Classification of Power Quality Events Using Optimal Time-Frequency Representations - Part 2: Application," IEEE Trans. Power Del., vol. 19, no. 3, pp. 1496-1503, Jul. 2004.
  12. M. Davy and C. Doncarli, "Optimal kernels of time-frequency representations for signal classification," in Proc. IEEE-SP Int. Symp. Time-Freq. Time-Scale Anal., pp. 581-584, 1998.
  13. C. Heitz, "Optimum Time-Frequency Representations for the Classification and Detection of Signals," Appl. Signal Process., vol. 2, no. 3, pp. 124-143, 1995.
  14. B. W. Gillespie and L. Atlas, "Optimizing Time-Frequency Kernels for Classification," IEEE Trans. Signal Process., vol. 49, no. 3, pp. 485-496, Mar. 2001.
  15. K. P. Wong and J. Yuryevich, Evolutionary Programming Based Algorithm for Environmentally Constrained Economic Dispatch, IEEE Trans. Power Syst., Vol.13, No.2, pp. 301, May 1998.
  16. P. J. Angeline, Using Selection to Improve Particle Swarm Optimization, in Proc. IEEE International Conference on Evolutionary. Computations, pp. 84-89, May 1998.
  17. J. Kennedy and R. Eberhart, Particle swarm optimization, Proc. IEEE Int. Conf. Neural Networks, Vol. IV, pp. 1942-1948, 1995.
  18. A.Medoued, A.Lebaroud, A.Boukadoum, T.Boukra, G. Clerc, "Back Propagation Neural Network for Classification of Induction Machine Faults," 8th SDEMPED, IEEE Symposium on Diagnostics for Electrical Machines, Power Electronics & Drives September 5-8, 2011, Bologna, Italy, pp 525-528, 2011.
  19. A. Lebaroud and G. Clerc, "Classification of Induction Machine Faults by Optimal Time frequency Representations," IEEE Trans. on Industrial Electronics, vol. 55, no. 12, december 2008.
  20. A. Lebaroud and G. Clerc, "Accurate Diagnosis of Induction Machine Faults Using Optimal Time-Frequency Representations" Engineering Applications of Artificial Intelligence. Vol. 22, Issues 4-5, June 2009, Pages 815-822.
  21. J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE Int. Conf. Neural Netw., vol. 4, Nov. 1995, pp. 1942-1948.
  22. V. Rashtchi, R. Aghmasheh "A New Method for Identifying Broken Rotor Bars in Squirrel Cage Induction Motor Based on Particle Swarm Optimization Method," World Academy of Science, Engineering and Technology Vol. 67, pp. 694-698, 2010.
  23. R. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proc. Cong. Evol.Comput, Vol. 1, pp. 81-86, 2001.
  24. J. Kennedy and R. Mendes, "Neighborhood topologies in fully informed and best-of-neighborhood particle swarms," Proc. of the IEEE International Workshop, pp. 45-50, June 2003.
  25. M'hamed, B., "Using Two PSO-Structures Approaches to Estimate Induction Machine Parameters," 13th European Conference on Power Electronics and Applications, pp. 1-8, 8-10 Sept. 2009.
  26. Hamid, R. H. A.; Amin, A. M. A.; Ahmed, R. S.; El-Gammal, A., "New Technique for Maximum Efficiency and Minimum Operating Cost of Induction Motors Based on Particle Swarm Optimization (PSO)," IEEE International Symposium on Industrial Electronics, vol. 3, pp. 2176-2181, 2006.
  27. H. Fan and Y. Shi, "Study on Vmax of particle swarm optimization," in Proc. Workshop on Particle Swarm Optimization, Purdue School of Engineering and Technology, Indianapolis, IN, Apr. 2001.
  28. Fausett L. ''Fundamentals of neural networks architectures, algorithms, and applications.'' Englewood Cliffs, NJ: Prentice Hall; 1994.
  29. Haykin, S. ''Neural networks: a comprehensive foundation''. 2nd ed. New York: Macmillan; 1998.
