Understanding the Discrete Fourier Transform and the FFT
Brian Douglas
The discrete Fourier transform (DFT) transforms discrete time-domain signals into the frequency domain. The most efficient way to compute the DFT is using a fast Fourier transform (FFT) algorithm. This tech talk answers a few common questions that are often asked about the DFT and the FFT. It covers an overview of the algorithm where you’ll be walked through an understanding of why you might look at the absolute value of the FFT, how bin width is calculated, and what the difference is between one-sided and two-sided FFTs.
Published: 15 Nov 2023
Imagine you’re running a vibration analysis on some piece of hardware that you’re developing. The hardware is on a shaker table which applies random vibrations to the hardware and you measure how the hardware responds with accelerometers. This measurement is captured with a digital computer and therefore, what you get out is a finite amount of data that is sampled at a regular interval. Now, in the case of vibrations as well as many other applications, it’s often helpful to look at the spectrum of the signal, that is to separate the time domain signal into the frequency components that make it up.
Once we have frequency information, we might choose to look at the amplitude spectrum, or the power spectrum, or the power spectral density. Each of these provide some insight into the signal that we can’t get from the time domain alone.
Now, with finite discrete data like we have here, the first step to getting to any one of these representations is the discrete Fourier transform, or DFT. And the most efficient way to compute the DFT is using a Fast Fourier Transform algorithm, or an FFT.
For this video, what I want to do is answer a few common questions that you might have regarding the DFT and the FFT. I think this will be pretty interesting and useful, so I hope you stick around for it. I’m Brian, and welcome to a MATLAB Tech Talk
All right, so to begin, our first question we want to ask is why are we using the discrete Fourier transform? Well, the DFT transforms a signal from the time domain, or a spatial domain like distance, into the frequency domain. And one of the main reasons for making this transformation is because the features of a signal that we’re interested in are not always obvious in the time or spatial domain.
For example, here I have two time domain signals that are sampled at 200 Hz and the question I have for you is which of these has a significant 60 Hz component? It’s not terribly obvious, right? They both look pretty similar. But, if we transform them into the frequency domain, it's much easier to answer. I mean, it’s clearly signal one, because there is this large 60 Hz peak. So, we can use the frequency domain to understand the frequency make up of a signal.
I want to be clear that there are more reasons to go to the frequency domain than just looking at the spectrum, but I wanted to illustrate at least this one so you understand why we need tools like the DFT and to justify why we’re spending this time trying to understand them.
Ok, to understand how the DFT is doing this transformation from discrete time to discrete frequency, we should look at the equation.
I know there are lot’s of symbols here but it’s actually pretty straight forward. This blue variable, Xn is the discrete time domain signal that we’re starting with and we’re going to transform it into the frequency domain signal, X of k. To do this, we multiply the time signal with this yellow part, which is e raised to a complex number. And if you aren’t familiar with complex exponentials, this is equal to the cosine of the exponent plus i times the sine of the exponent. So, essentially what the DFT is doing is multiplying the time signal by a sine and cosine signal of a particular frequency, given by the variable k, and summing the result. And we do this summation for different values of k.
I think we can get a little intuition into how this works with a simple graphical demonstration.
Alright, I made little MATLAB app here to show how the DFT works. Here, the time signal, Xn is just a pure sine wave with 10 samples. So, per the definition of the DFT, we multiply this signal with E raised to an imaginary exponent. Which I’m calling the correlating signal and that name will hopefully make sense shortly.
If we set k=0, then our correlating signal is just e ^0 which produces a constant real value of 1 and an imaginary value of 0. And when we multiply this with our time signal and sum the result, we get a value that is really low. And mathematically this is because when we multiply the time signal by 1, we get you know the same signal back and so essentially we’re really just summing the values in our time signal. And there are equal positive values, that’s these 4, as there are negative values, these 4, and so they cancel each other out to near zero. And the way we can think about this low value is that our time domain signal is not correlated very well with a signal with zero frequency. That is, there is no DC offset in our time signal, and we can kind of see that it is in fact centered around 0.
Now, let’s move to a new set of frequencies, say k=1. Our correlating signal now consists of a sine and cosine wave with a period equal to the length of the time signal. If we multiply these with Xn and sum the result then the value is larger. And we can visually see that the correlating signal is near the same frequency as our time signal and so, at least for the imaginary component which is perfectly out of phase with our time signal, the product of the two is always negative and the summation then is also negative. So, there is a strong correlation between the two at k = 1.
And if we move to k = 2, the correlation drops again.
And this is all the DFT is doing. It’s going through N different frequencies where k goes from 0 to N -1, and calculates the correlation between it and the time signal.
That’s pretty cool right? That we can think of the DFT as producing the correlation between our time signal and bunch of different frequencies. But there is another cool way we can think about the DFT, and that’s as a rotation with matrix multiplication.
If we go back to the equation, we can rewrite it as a matrix multiplication. Xn is just a 1 by N vector of discrete time data. And the yellow exponential is an N by N matrix of complex numbers. The first column corresponds to the sines and cosines associated with k=0, the second column is k = 1, and so on all the way up to K = N-1.
Now when you perform this multiplication, for the first component of X of K, we get the inner product between the time signal and the frequencies at k = 0, the second component is the inner product with k=1 and so on. So, in this form, it’s a bit easier to see that the DFT is performing this rotation between one set of basis functions, this xn into another, Xk. This N by N matrix is what is achieving that rotation from time into complex exponentials.
And this is actually a good segue into a fast Fourier Transform. This matrix multiplication is easy to do if the length of the signal is short. You know, if there are only 10 samples then this is a 10x10 matrix. But if you’re working with a signal that has thousands or 10’s of thousands of samples then performing this matrix multiplication could because computationally costly.
But it turns out that due to various symmetries in this multiplication, a lot of the operations are duplicated. And so you can perform a calculation once and then populate the result in several locations. And FFT algorithms the advantage of duplicate operations to reduce the number of calculations. They produce the exact same result as the DFT, just in a more efficient way.
For a good explanation of the FFT algorithm and how these computational efficiencies actually made the DFT a viable option for science and engineering, check out Veritasium’s excellent video on the topic. Link is in the description below.
Alright, now that we know that the FFT is just an efficient way to calculate the DFT, I want to use the last bit of this video to answer a few practical questions and dive a little deeper into how we use and interpret the results. And the first question is why do we some times just look at the absolute value of the FFT?
Well, recall that the FFT produces a complex result. And the way we can interpret this is that of the real part of the FFT is how well the time signal correlates to a cosine wave of a given frequency and the imaginary part is how well it correlates to a sine wave of the same frequency. Know both of these is necessary if you want to know the phase of the frequency in your signal or if you want to be able to reconstruct the time signal exactly from frequency data. But if all you’re interested in is the magnitude of the frequency, that is how much of a particular frequency there is in your signal regardless of phase, then you just have to look at the absolute value.
So, for many signal processing applications, like answering our 60 Hz question, you don’t need a real and imaginary component to determine that, you just need the magnitude. So, we take the absolute value to get the magnitude.
Alright, for the next question, what’s the difference between a one-sided and two-sided FFT?
The answer involves understanding that the FFT returns both the positive and the negative frequencies. So, two sides. And if you take the FFT starting at k=0 and go up to k=N-1, then the positive frequencies are on the left and the negative frequencies are on the right, and the Nyquist frequency is the boundary between the two. This is based on the Nyquist Sampling theorem which states that we can only know signal information up to one half of the sampling rate.
Now sometimes it’s helpful to shift the FFT such that the negative signals are on the left, but it’s not necessary as long as you understand which are negative and which are positive.
Alright, back to the question. If you’re looking at the entire range of the FFT, this is a two-sided FFT. Whereas with a one sided FFT you’re just looking at the positive frequencies. Just one side of the spectrum. Why would we do this? Why would we throw away half of the information?
Well, first off if Xn is a real signal, that is it’s not a complex, and if we take the absolute value of the FFT of Xn, then the resulting spectrum is mirrored between the positive and negative frequencies.
And we can see why that is the case here. let me hide the time signal so we can just see the correlating signal. We know that when k = 0, this corresponds to 0 frequency, it’s neither positive nor negative. For k = 1 the frequency has a period of the length of the time signal. Going to k = 2, the frequency has a period one over two of the length of the time signal, and this continues as K increases, 1 over 3, 1 over 4 up to the Nyquist frequency.
Now I know this is a little hard to see, but if we increase k beyond this, the frequency starts to decrease. Not just that, but the frequency is also negative. And we can see that a little bit easier here. Notice that between k=9, the lowest negative frequency, and k=1 the lowest positive frequency, the correlating signal is the exact same but just negative. Now the real component doesn’t change sign since cosine is a even function, but the imaginary component does change sign since sine, a sinusoid, is an odd function.
This means that between k=1 and k=9, the magnitude of the FFT stays the same, and it’s just the sign that changes. And the same with k = 2 and 8, and 3 and 7 and so on until we reach the Nyquist frequency.
Now k=0 and k=Nyquist are both unique values that aren’t duplicated anywhere else, so we want to keep both of those frequencies in a one sided FFT, along with the rest of the positive frequencies. So, when there is an even number of time samples, like we have here with N=10, for a one sided FFT we need to keep N divided by 2 + 1 points in our FFT. That +1 accounts for the Nyquist frequency. However, if we have an odd number of samples, say N = 11, then we don’t actually get the Nyquist frequency in the FFT because it sort gets jumped over, in this case between k = 5 and k = 6. And therefore, with an odd number we take N divided by 2, which leave us with a half left over, and we round that up. For 11 samples, the one-sided FFT would have 6 points in it. So it’s important to know if you have an even or odd number of samples in your time signal to determine if you need that +1 or if you just have to round up.
Now, often you’ll see that with a one-sided FFT you have to scale the results to account for the information you’re ignoring, but if your goal is to just understand what frequencies make up your time signal then the scaling isn’t important. It becomes important if you want to understand like how much power is in that frequency band but we’re going to cover power spectrums in a future video and we’ll talk about scaling there.
Alright, we keep talking about positive and negative frequencies but I haven’t explained how the variable k corresponds to the exact spectrum frequency. Well, we know that k = 0 corresponds to 0 Hz, and I mentioned earlier that k = 1 corresponds to a frequency with a period equal to the length of the time signal, right? Well, k = 2 produces two waves in that same period, and so on.
So, k up to the Nyquist frequency refers to the number of cycles within a time period equal to the length of the time domain signal.
So if we want our spectrum frequency in hertz, or in cycles per second, then it’s just k cycles, divided by the number of seconds in the signal. Or we can write this as k times the sampling frequency divided by the number of samples. So, we can plot our FFT against frequency using this conversion.
But you might also hear the term, bin width. That is the width of each frequency bin in the FFT. So if we go back to the spectrum that we calculated here, bin width is just the width in frequency between each of the samples. So, how far is it between sample 0 and sample 1, and then sample 1 to sample 2, and so on. And we already know this answer. We just set k = 1 in this equation and we get the width between two samples.
This means that if you want to have a different bin width for your FFT, you can either adjust the sampling frequency of your time signal or adjust the number of samples in your time signal.
For example, I have this blue time signal that has a dominant 60 Hz frequency and I have 0.1 seconds of data at 200 Hz. The FFT is on the right and we can see we have a bin width of 10 Hz and there is small peak at 60 Hz. Now, if we want a narrower bin width it’s as easy as using a longer period of data - more samples. Watch as we increase the signal length the bin width gets smaller and we can see the 60 Hz peak really start to become obvious. This is because with more data, we are also increasing the frequency resolution. We’re getting more signal per frequency bin.
However, we can also adjust the bin width without additional data by padding the signal with zeros. For example, if we start with the same 0.1 seconds of time data, and add zeros to the end of the signal, then we can see that once again the bin width is reduced. but, as you can see, this will change the bin width, and improve visualization, but it does not increase frequency resolution. We’re essentially just interpolating the frequency signal and filling it in with more dense sampling. Adding zeros does’t change the overall frequency resolution since we’re not adding any additional signal. Alright, I hope all that bin width stuff made sense.
Ok, so we’ve talk a bunch about different aspects of the FFT, I think to help everything sink in a bit and to make approaching the FFT a little less daunting, let’s walk through writing the code for a one sided FFT in a MATLAB live script.
Alright to start, I have this time domain signal. It’s a pure 3 Hz sine wave and there are 40 samples, sampled at 40 Hz. So we have 1 second of data. Now let’s build up the one-sided FFT.
And first off, let’s just take the FFT of the signal and plot the result. This might be how someone new might start, right? Hey I took the FFT, let me plot it and see what it looks like. But since the result is a complex vector, when you plot it, it shows both the real and imaginary components on a single graph and it doesn’t look like much. So, instead we could plot the real and imaginary components on two separate graphs, however, for this example we just want to look at the magnitude of the response, which we can get by taking the absolute value.
And check this out. We have our two peaks, mirrored about the Nyquist frequency as expected. Now to get the one-sided FFT, we just look at half of this spectrum. But of course since I have an even number of time samples, we need that +1 so we capture the Nyquist frequency as well.
Ok, so far so good. Except now, we want to plot this against frequency in Hz. So, we have k going from 0 to N/2, to account for the 21 samples in our one-sided FFT, and we convert it to frequency by multiplying it by the sample frequency divided by the number of samples.
And finally, we can plot this and as expected, the peak is right at 3 Hz. Now, you might often see the one-sided FFT is scaled to account for the information in the negative frequencies that we excluded, but if your goal is to just understand which frequencies make up your time signal then the scaling isn’t necessary. We can see here that there is a 3 Hz component and it doesn’t matter that the value of 20 at 3 Hz doesn’t relate to anything specific. Scaling the one-sided FFT will be important if you want to understand a quantity like how much power is in that frequency band. But we’re going to talk about the power spectrum in a future video and we’ll cover sailing there.
Ok, hopefully, from what we covered in this video, each of these steps to get the one-sided FFT makes sense. And if you want to try this out yourself, or check out some resources where you can learn more, I’ve left links to everything down below.
Don’t forget to subscribe to this channel so you don’t miss it. Also you can find all of the Tech Talk videos, across many different topics, nicely organized at bat365 And if you liked this video then you might be interested to learn about the Fourier transform in our video on the Z-transform. Alright, thanks for watching, and I’ll see you next time.