A Thinking Person's Guide to Fourier Analysis


Fourier Analysis

The extraction of the frequency from the data is straightforward. For our original signal (set the Noise Amplitude to 0 and the Efficiency to 100% in the above plot), you see a clear sinusoid. In general, we immediately think "Fourier analysis", which takes a function of time $h(t)$ and transforms it into a function of frequency, in order to understand the spectral frequency distribution (i.e., what frequencies are present).


[Interactive figure: drag to rotate the coordinate frame; the coordinates displayed are $(x,y)$ in the rotated frame.]

As you can see above, a vector is a real object that has a length, and points from one location to another. How you represent the vector is up to you, and depends on the coordinate frame you pick. In the figure above, you can drag the frame around and change the angle of the $x$ and $y$ axes relative to horizontal and vertical, and that will change the amount of $x$ and $y$ needed to describe the point. Specifically, we can represent any vector (2 points in space determine a length and a direction) by specifying how much along $x$ and how much along $y$. If we use the rule for vector addition, which says that to add 2 vectors you place the tail of the 2nd at the head of the 1st, then you can represent any real vector $\vec{v}$ as $$\vec v = a_x\ihat + a_y\jhat\nonumber$$ where $\ihat$ points along the $x$ axis and $\jhat$ points along the $y$ axis, and $a_x$ and $a_y$ are the lengths along those axes respectively.

So, to represent a vector in some coordinate frame, all we need are the lengths $a_x$ and $a_y$. To get those, we use the fact that direction vectors $\ihat$ and $\jhat$ have a special property: $$\ihat\cdot\jhat = 0\nonumber$$ and that the lengths of $\ihat$ and $\jhat$ are 1 (unit vectors): $$\ihat\cdot\ihat = \jhat\cdot\jhat = 1\nonumber$$ A nice way to represent both properties is to use a general description, and define $\widehat k_1 \equiv\ihat$ and $\widehat k_2 \equiv\jhat$, and use the Kronecker delta: $$\widehat k_i\cdot\widehat k_j = \delta_{ij}\nonumber$$ where

$$\delta_{ij} = \begin{cases} 1 & i = j\\ 0 & i \ne j\end{cases}\nonumber$$
To say that $\ihat$ and $\jhat$ have these properties is to say that $\ihat$ and $\jhat$ are "orthonormal" - that is, orthogonal, and normalized to 1. Note that we are using the scalar product of two vectors when we write $\vec{a}\cdot\vec{b}$.

Now that we have these orthonormal vectors, we can use this property to calculate $a_x$. Take the scalar product of $\vec{v}$ and $\ihat$: $$\vec{v}\cdot\ihat = (a_x\ihat+a_y\jhat)\cdot\ihat=a_x\nonumber$$

Doing the same for $a_y$, we can summarize: $$a_x = \vec{v}\cdot\ihat\nonumber$$ $$a_y = \vec{v}\cdot\jhat\nonumber$$

This is pretty interesting - it says that you can define two directions in space that are perpendicular ($\ihat$ and $\jhat$), and represent any vector $\vec{v}$ in that space using the "inner product" $\vec{v}\cdot\ihat$ and $\vec{v}\cdot\jhat$ to find "how much" along those 2 independent directions. Note that the orthogonality principle guarantees that you cannot represent $\ihat$ in terms of $\jhat$ and vice versa. Another way to say this is that we can expand any vector $\vec v$ as a sum over an orthonormal set $(\widehat k_1,\widehat k_2)$ with some coefficients ($a_1,a_2$): $$\vec v = \sum_{i=1}^2 a_i\widehat k_i,\qquad a_i = \langle\vec v,\widehat k_i\rangle\nonumber$$ where the inner product for vectors is defined as $\langle\vec{a},\vec{b}\rangle=\vec{a}\cdot\vec{b}$ and here $\widehat k_1=\ihat$ and $\widehat k_2=\jhat$.
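If you want to play with this numerically, here is a minimal sketch in Python (using numpy, with an arbitrary rotation angle and test vector) that builds a rotated orthonormal basis, checks the $\delta_{ij}$ property, and recovers the coefficients via inner products:

import numpy as np

phi = np.radians(30.0)       # rotation angle of the frame (arbitrary choice)

# Orthonormal basis vectors i-hat and j-hat in the rotated frame.
ihat = np.array([np.cos(phi), np.sin(phi)])
jhat = np.array([-np.sin(phi), np.cos(phi)])

# Check orthonormality: k_i . k_j = delta_ij.
assert np.isclose(ihat @ jhat, 0.0)
assert np.isclose(ihat @ ihat, 1.0) and np.isclose(jhat @ jhat, 1.0)

# Expand an arbitrary vector v: the coefficients are the inner products.
v = np.array([2.0, 1.0])
a_x, a_y = v @ ihat, v @ jhat

# Reconstruct v from the expansion v = a_x*ihat + a_y*jhat.
assert np.allclose(v, a_x * ihat + a_y * jhat)
print(f"a_x = {a_x:.4f}, a_y = {a_y:.4f}")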

Fourier Analysis of Continuous Data

This property of vectors can be extended to mathematical functions of continuous variables ($\theta$) in the following way: we define a set of "orthogonal functions" $g_i(\theta)$ analogous to $\widehat k_i$, except here there are infinitely many of them. Then we can define an inner product $\langle g_i(\theta),g_j(\theta)\rangle$ as an integral over all $\theta$: $$\langle a(\theta),b(\theta)\rangle \propto \int_{-\infty}^\infty a(\theta)b(\theta)d\theta\nonumber$$ (we use $\propto$ to be general, but from dimensional considerations, $\propto\to =$) and perform the expansion of some function $f(\theta)$: $$f(\theta) = \sum_{n=0}^\infty a_ng_n(\theta)\nonumber$$ and the coefficients $a_n$ are given by $$a_n = \langle f(\theta),g_n(\theta)\rangle\nonumber$$

Note that we use the variable $\theta$ here, but it is just a dummy variable.

Now, if the function $f(\theta)$ is periodic in $\theta$ with period $T=2\pi$, then it's natural to use the periodic trig functions $\sin$ and $\cos$, and which one would depend on the symmetry ($\cos$ is symmetric with respect to $\theta\to -\theta$ and $\sin$ is antisymmetric). So we can define $$g_n(\theta) = A_n\cos(n\theta)\label{basisg}$$ $$h_n(\theta) = B_n\sin(n\theta)\label{basish}$$ and expand: $$f(\theta)=\sum_{n=0}^\infty \big(a_ng_n(\theta) + b_nh_n(\theta)\big) =\sum_{n=0}^\infty a_nA_n\cos(n\theta) + \sum_{n=0}^\infty b_nB_n\sin(n\theta)\label{efour1}$$ Then we can find the coefficients $a_n$ and $b_n$ as the inner product of $f(\theta)$ with the basis functions $g$ and $h$ as defined in equations $\ref{basisg}$ and $\ref{basish}$: $$a_n = \langle f(\theta),g_n(\theta)\rangle = A_n \langle f(\theta),\cos(n\theta)\rangle\nonumber$$ $$b_n = \langle f(\theta),h_n(\theta)\rangle = B_n \langle f(\theta),\sin(n\theta)\rangle\nonumber$$ Assuming $f(\theta)$ is periodic in $\theta$, we only need to find the coefficients $a_n$ and $b_n$ using the inner product over 1 period. This gives us $$a_n = A_n\int_0^{2\pi}f(\theta)\cos(n\theta)d\theta\label{eanfour}$$ $$b_n = B_n\int_0^{2\pi}f(\theta)\sin(n\theta)d\theta\label{ebnfour}$$ We can determine the constants $A_n$ and $B_n$ from the "orthonormal" conditions $$\langle g_n(\theta),g_m(\theta)\rangle = A_nA_m\langle \cos(n\theta),\cos(m\theta)\rangle = \delta_{nm}\nonumber$$ $$\langle h_n(\theta),h_m(\theta)\rangle = B_nB_m\langle \sin(n\theta),\sin(m\theta)\rangle = \delta_{nm}\nonumber$$ Note that when using trig functions, especially functions such as $\sin(n\theta)$, you have to treat $n=0$ separately. That means we need to determine $A_0$, $A_n$, $B_0$ and $B_n$ where here we mean $n\gt 0$: since $\int_0^{2\pi}\cos^2(n\theta)d\theta = \int_0^{2\pi}\sin^2(n\theta)d\theta = \pi$ for $n\gt 0$, while $\int_0^{2\pi}\cos^2(0)d\theta = 2\pi$ (and $\sin(0)=0$, so $h_0$ drops out entirely), the orthonormal conditions give $A_0 = 1/\sqrt{2\pi}$ and $A_n = B_n = 1/\sqrt{\pi}$ for $n\gt 0$.
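You can verify these orthogonality conditions numerically. The following sketch (Python with scipy; the range of $n$ and $m$ is an arbitrary choice) evaluates the inner products of the trig functions over one period, showing the $\pi$ normalization for $n=m\gt 0$ and $2\pi$ for $n=m=0$ that forces the choices of $A_n$ and $B_n$ above:

import numpy as np
from scipy.integrate import quad

def inner(a, b):
    # Inner product over one period: <a,b> = int_0^{2pi} a(theta) b(theta) dtheta.
    val, _ = quad(lambda th: a(th) * b(th), 0, 2 * np.pi)
    return val

# <cos(n.th),cos(m.th)> is 0 for n != m, pi for n = m > 0, and 2*pi for n = m = 0,
# which is why A_0 = 1/sqrt(2*pi) while A_n = B_n = 1/sqrt(pi) for n > 0.
for n in range(3):
    for m in range(3):
        cc = inner(lambda th: np.cos(n * th), lambda th: np.cos(m * th))
        ss = inner(lambda th: np.sin(n * th), lambda th: np.sin(m * th))
        cs = inner(lambda th: np.cos(n * th), lambda th: np.sin(m * th))
        print(f"n={n} m={m}: <cos,cos>={cc:8.4f}  <sin,sin>={ss:8.4f}  <cos,sin>={cs:8.4f}")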

So we can therefore write: $$f(\theta) = \frac{a_0}{\sqrt{2\pi}} + \frac{1}{\sqrt\pi}\sum_{n=1}^\infty a_n\cos(n\theta) + b_n\sin(n\theta)\label{fourier}$$ and the coefficients are given by $$a_0=\frac{1}{\sqrt{2\pi}}\int_0^{2\pi}f(\theta)d\theta\label{fouriera0}$$ $$a_n=\frac{1}{\sqrt{ \pi}}\int_0^{2\pi}f(\theta)\cos(n\theta)d\theta\label{fourieran}$$ $$b_n=\frac{1}{\sqrt{ \pi}}\int_0^{2\pi}f(\theta)\sin(n\theta)d\theta\label{fourierbn}$$ Equations $\ref{fourier}-\ref{fourierbn}$ constitute the full Fourier expansion of a periodic function of a continuous variable.

Note that often in the literature, you will see factors of $\sqrt{2\pi}$ "absorbed" into the coefficients. To see how this is done, define $$\alpha_n\equiv a_n/\sqrt{\pi}\nonumber$$ $$\beta_n\equiv b_n/\sqrt{\pi}\nonumber$$ for $n\gt 0$, and $\alpha_0\equiv\sqrt{2/\pi}\,a_0$ (the extra factor of $\sqrt{2}$ for $n=0$ is what makes the leading term come out as $\half\alpha_0$). This gives the following for the Fourier expansion: $$f(\theta) = \half \alpha_0 + \sum_{n=1}^\infty \alpha_n\cos(n\theta)+\beta_n\sin(n\theta) \label{fourier_standard}$$ Equation $\ref{fourier_standard}$ is the more common form of the Fourier expansion, with all coefficients given by the uniform formulas $\alpha_n = \frac{1}{\pi}\int_0^{2\pi}f(\theta)\cos(n\theta)d\theta$ and $\beta_n = \frac{1}{\pi}\int_0^{2\pi}f(\theta)\sin(n\theta)d\theta$ for all $n\ge 0$.

As an example, let's say we have a periodic square wave with period $T=2\pi$, and write:

$$f(\theta) = \begin{cases} A & 0\le\theta\lt \pi\\ 0 & \pi\le\theta\lt 2\pi\end{cases}\nonumber$$
The function looks like this, where the horizontal axis is the coordinate $\theta$, the function repeats indefinitely, and the values $\theta=\pi$ and $\theta=2\pi$ are marked:

The coefficients are determined via: $$a_0 = \frac{1}{\sqrt{2\pi}}\int_0^{2\pi}f(\theta)d\theta = \frac{1}{\sqrt{2\pi}}\int_0^{\pi}Ad\theta = A\sqrt{\frac{\pi}{2}}\nonumber$$ $$a_n = \frac{1}{\sqrt{\pi}}\int_0^{\pi}A\cos(n\theta)d\theta = \frac{A}{\sqrt{\pi}}\frac{\sin(n\theta)}{n}\Big\vert_0^\pi = 0 \nonumber$$ $$b_n = \frac{1}{\sqrt{\pi}}\int_0^{\pi}A\sin(n\theta)d\theta = -\frac{A}{\sqrt{\pi}}\frac{\cos(n\theta)}{n}\Big\vert_0^\pi = \frac{A}{\sqrt{\pi}}\frac{1-(-1)^n}{n} \nonumber$$ where in the calculation of $b_n$, we use the fact that $\cos(n\theta)$ evaluated between the limits yields $1-\cos(n\pi)=1-(-1)^n$. When $n$ is even, this vanishes, and when $n$ is odd, it equals $2$, so $b_n = 2A/(n\sqrt{\pi})$ for odd $n$ (1, 3, 5, ...) and $b_n=0$ for even $n$.

Putting this all together gives us the following expansion: $$f(\theta) = \frac{A}{2}\Big( 1+\frac{4}{\pi}\sum_{n=1,3,5...}^\infty \frac{\sin(n\theta)}{n}\Big) \label{foursquare}$$

[Interactive figure: Fourier expansion of the square wave $f(\theta)=A$ between $0$ and $\pi$; the arrows set the highest term in the expansion, starting at 0.]

As you can see in the figure above, the expansion starts out with a single term, $n=0$. This is the baseline of the periodic wave, here $\half A$. You can click the arrows to add terms in the expansion; remember that the even terms have no effect ($n$ is always odd for this square wave).
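If you want to reproduce this numerically, here is a minimal sketch (in Python with numpy; the amplitude and grid are arbitrary choices) that evaluates the partial sums of equation $\ref{foursquare}$ and compares them to the exact square wave:

import numpy as np

A = 1.0                                    # square wave amplitude (arbitrary choice)
theta = np.linspace(0, 4 * np.pi, 1000)    # two periods

def square_partial_sum(theta, n_max, A=1.0):
    # Partial sum of f = (A/2) * (1 + (4/pi) * sum over odd n of sin(n*theta)/n)
    total = np.zeros_like(theta)
    for n in range(1, n_max + 1, 2):       # odd n only: 1, 3, 5, ...
        total += np.sin(n * theta) / n
    return (A / 2) * (1 + (4 / np.pi) * total)

# Exact square wave: A on [0, pi), 0 on [pi, 2*pi), repeating.
f_exact = np.where(np.mod(theta, 2 * np.pi) < np.pi, A, 0.0)

# The agreement improves everywhere except right at the transitions.
for n_max in (1, 5, 21, 101):
    approx = square_partial_sum(theta, n_max, A)
    print(f"n_max = {n_max:3d}, max |residual| = {np.max(np.abs(f_exact - approx)):.3f}")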

As you increase the highest term you will see an interesting pattern - the expansion becomes a better and better approximation, except at the edges where the wave transitions (where the derivative is infinite). There are a few things to note about this. Probably the most important point is that in the real world, derivatives are not infinite, and square waves don't have discontinuities. But the sharper the transition, the more terms you need to add to the Fourier expansion, which means you have to keep adding higher frequencies. This is a general rule - sharp transitions mean higher frequencies. So to turn it around, if you want to use a Fourier expansion to synthesize a square wave, you have to decide what kind of transition is acceptable, and generate enough higher frequency components to make it work. Note that we usually associate high frequency with high current in electronic devices, so generating "perfect" square waves from a Fourier composition is going to cost you!

In the plot below, we generate 1000 points between $0$ and $4\pi$, or 2 periods. You can vary the total number of terms, and the graph will show you the "residuals", which is the difference between the square wave $f(\theta)$ and the Fourier composition for that number of terms. You can see that the difference between the "true" $f(\theta)$ and the Fourier composition converges to $0$ except at the transition (discontinuity) regions (or equivalently, the edges). These large residuals are called "Gibbs ears", and there's a theorem that as you add more and more terms, the Gibbs ears converge to a fixed fraction of the jump in $f(\theta)$ across the discontinuity (about 9% of it - this is the Gibbs phenomenon). What you can do to make the convergence almost perfect for all points is to also bandwidth limit your function $f(\theta)$, so that it's not a perfect square wave but something close. Then you can generate as many terms in the Fourier composition as needed to get whatever convergence you desire.

[Interactive plot: residuals between the square wave $f(\theta)$ and its Fourier composition; the number of terms is adjustable.]
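Here is a sketch of this residual calculation in the same spirit (the grid size and term counts are arbitrary choices). Rather than the full residual curve, it tracks the overshoot just inside the transition, which settles at roughly 9% of the jump instead of shrinking to zero:

import numpy as np

A = 1.0
theta = np.linspace(0, 2 * np.pi, 20001)   # fine grid, to resolve the overshoot lobe

def square_partial_sum(theta, n_max, A=1.0):
    ns = np.arange(1, n_max + 1, 2)        # odd harmonics only
    return (A / 2) * (1 + (4 / np.pi) * np.sin(np.outer(theta, ns)) @ (1.0 / ns))

# The "Gibbs ear" just inside each transition does not shrink as terms are
# added; it settles at roughly 9% of the jump height (the Gibbs phenomenon).
for n_max in (11, 51, 201):
    overshoot = square_partial_sum(theta, n_max, A).max() - A
    print(f"n_max = {n_max:3d}, overshoot = {overshoot:.4f} * A")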

A common use of Fourier decomposition would be when the input is a function of time, $f(t)$, and you want to know the frequency spectrum. For $f(t)$ periodic over a period $T$, the phase angle argument for the $\cos$ and $\sin$ functions is given by $\theta=2\pi t/T\equiv\omega t$. Equations $\ref{fourier}-\ref{fourierbn}$ could be rederived in the same way, but it is easier to just recognize that in those equations, wherever you see $2\pi$, it means "period". So we can modify these equations by making the change $2\pi\to T$, changing $\theta$ to $t$, and integrating over $t$ from $0$ to the period $T$. The new equations are: $$f(t) = \frac{a_0}{\sqrt{T}} + \sqrt{\frac{2}{T}}\sum_{n=1}^\infty a_n\cos(\omega_nt) + b_n\sin(\omega_nt)\label{fouriert}$$ $$a_0=\frac{1}{\sqrt{T}}\int_0^Tf(t)dt\label{fourierta0}$$ $$a_n=\sqrt{\frac{2}{T}}\int_0^{T}f(t)\cos(\omega_nt)dt\label{fouriertan}$$ $$b_n=\sqrt{\frac{2}{T}}\int_0^{T}f(t)\sin(\omega_nt)dt\label{fouriertbn}$$ where $\omega_n=n\omega =2\pi n/T$ are the possible Fourier angular frequencies. Note that equations $\ref{fouriert}-\ref{fouriertbn}$ are the same as equations $\ref{fourier}-\ref{fourierbn}$ with $\theta\to\omega_nt=2\pi nt/T$, and with $2\pi$ substituted by $T$ in the coefficients. This should not be a surprise!

Following through for $f(t) = V_0$ ($t\lt\half T$, otherwise $f(t)=0$) in the same way we did above for $f(\theta) = A$ ($\theta\lt\pi$), we have the following expansion for a square wave that is always positive: $$f(t) = \frac{V_0}{2}\Big( 1+\frac{4}{\pi}\sum_{n=1,3,5...}^\infty \frac{\sin(\omega_nt)}{n}\Big) \label{foursquaret}$$ where $\omega_n=n\omega=2\pi n/T$.
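As a check on this, here is a sketch that evaluates the coefficient integrals, equations $\ref{fourierta0}-\ref{fouriertbn}$, by numerical integration ($T$ = 2 ms and $V_0$ = 1 V are arbitrary assumptions) and compares them to the closed forms implied by the expansion, $a_0 = V_0\sqrt{T}/2$ and $b_n = V_0\sqrt{2T}/(\pi n)$ for odd $n$:

import numpy as np
from scipy.integrate import quad

T, V0 = 2.0e-3, 1.0       # assumed values: 2 ms period, 1 V amplitude

def f(t):
    # Square wave: V0 for the first half period, 0 for the second half.
    return V0 if (t % T) < T / 2 else 0.0

a0, _ = quad(lambda t: f(t) / np.sqrt(T), 0, T, points=[T / 2])
print(f"a0 = {a0:.6e}  (closed form V0*sqrt(T)/2 = {V0 * np.sqrt(T) / 2:.6e})")

for n in range(1, 6):
    wn = 2 * np.pi * n / T
    an, _ = quad(lambda t: np.sqrt(2 / T) * f(t) * np.cos(wn * t), 0, T, points=[T / 2])
    bn, _ = quad(lambda t: np.sqrt(2 / T) * f(t) * np.sin(wn * t), 0, T, points=[T / 2])
    bn_closed = V0 * np.sqrt(2 * T) / (np.pi * n) if n % 2 == 1 else 0.0
    print(f"n = {n}: a_n = {an:+.3e}, b_n = {bn:+.3e} (closed form {bn_closed:.3e})")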

It is worth noting that if we define things this way, then the Fourier coefficients $a_n$ and $b_n$ become quantities with dimensions. This is inevitable given the orthonormality requirement coupled with the fact that we are now integrating over a coordinate that has dimensions (here $t$ for time, but it could be any dimension). For example, let's imagine that $f(t)=V(t)$, the voltage across a resistor $R$. If we use the generalized expansion as per equation $\ref{fouriert}$ and equations $\ref{fourierta0}-\ref{fouriertbn}$, then it is clear that the coefficients have to have dimensions of voltage times the square root of time, $V\sqrt{T}$. If we substitute $f=1/T$, we then have coefficients with dimensions of voltage per square root of frequency ($f$). What this is telling you is that if you Fourier decompose $V(t)$, you would get a series of coefficients that tell you "how much voltage per square root of frequency". Hmm, not so obvious what that means.

To see what this means, let's imagine you put this voltage $V(t)$ across some load with resistance $R$. It will generate a current, and it will dissipate power $P=I^2R=V^2/R$. The power depends on time, so you might then want to calculate the total average power $\overline P$ over one period by integrating $V^2(t)/R$ over that period, and dividing by the period $T$: $$\overline P = \frac{1}{T}\int_0^T \frac{V^2(t)}{R} dt\nonumber$$ Now let's substitute the Fourier decomposition of $V(t)$ using equation $\ref{fouriert}$ to get

$$\begin{aligned}
R\overline P &= \frac{1}{T}\int_0^T \Big( \frac{a_0}{\sqrt{T}} + \sqrt{\frac{2}{T}}\sum_{n=1}^\infty a_n\cos(\omega_nt) + b_n\sin(\omega_nt) \Big)^2dt\\
&= \frac{1}{T^2}\int_0^T \Big( a_0 + \sqrt{2}\sum_{n=1}^\infty a_n\cos(\omega_nt) + b_n\sin(\omega_nt) \Big)^2dt\\
&= \frac{1}{T^2}\int_0^T\Big( a_0^2 + 2\sqrt{2}\,a_0\sum_{n=1}^\infty \big(a_n\cos(\omega_nt) + b_n\sin(\omega_nt)\big) + 2\Big(\sum_{n=1}^\infty a_n\cos(\omega_nt) + b_n\sin(\omega_nt)\Big)^2 \Big)dt
\end{aligned}$$
The term involving the constant $a_0^2$ integrates to $a_0^2/T$. The term that is linear in $a_0$ multiplies $\sin$ and $\cos$ functions, and each of those terms in that sum will integrate to $0$ over 1 period. For the quadratic term, we have: $$\frac{2}{T^2}\int_0^T \Big( [\sum a_n\cos(\omega_nt)]^2 + 2\sum a_n\cos(\omega_nt)\sum b_n\sin(\omega_nt) + [\sum b_n\sin(\omega_nt)]^2\Big)dt\nonumber$$

Of these 3 terms, the middle (cross) term has every term in its double sum involving a $\sin$ multiplying a $\cos$, so each of those integrates to $0$ over 1 period. For the 1st and 3rd terms, you get various terms like $a_n\cos(\omega_nt)\times a_m\cos(\omega_mt)$, which vanish unless $n=m$, in which case the term integrates to $T/2$ using the trig identities $\cos^2\theta=\half(1+\cos2\theta)$ and $\sin^2\theta=\half(1-\cos2\theta)$. This gives us the formula: $$\overline P = \frac{1}{RT}\Big(a_0^2 + \sum_{n=1}^\infty (a_n^2+ b_n^2)\Big)\nonumber$$

If the average power $\overline P$ over the period $T$ is some quantity divided by $RT$, then that leads to the interpretation that each coefficient squared is proportional to the total energy at that particular value of $n$. If the waveform $V(t)$ is complex enough to need both $a_n$ and $b_n$ terms, remember that for a given frequency $\omega_n=n\omega$, the power at that frequency will be given by $P_n = (a_n^2 + b_n^2)/(RT)$.

This is one of the most powerful things about Fourier decomposition - it tells you how much of each frequency is present, and how much power there is at that frequency!

For the case of the square wave above, with $V(t)=V_0$ ($t\lt \half T$), we would get $$\overline P = \frac{V_0^2}{R}\Big( \frac{1}{4} + \frac{2}{\pi^2}\sum_{n=odd}^\infty \frac{1}{n^2}\Big) \label{fourpowert}$$ Since $\sum_{odd}1/n^2 = \pi^2/8$, this sums to $\overline P = V_0^2/2R$, exactly what you get by integrating $V^2(t)/R$ directly over one period.
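This is easy to verify numerically. The sketch below (the values of $T$, $V_0$ and $R$ are arbitrary assumptions) computes the average power both ways - directly from $V^2(t)/R$, and from the Fourier coefficients - and both agree with $V_0^2/2R$:

import numpy as np
from scipy.integrate import quad

T, V0, R = 2.0e-3, 1.0, 50.0   # assumed example values

def V(t):
    return V0 if (t % T) < T / 2 else 0.0

# Direct time average of the power: (1/T) int_0^T V^2/R dt.
P_direct, _ = quad(lambda t: V(t)**2 / R, 0, T, points=[T / 2])
P_direct /= T

# Frequency domain: P = (a0^2 + sum(a_n^2 + b_n^2)) / (R*T), using
# a0 = V0*sqrt(T)/2 and b_n = V0*sqrt(2T)/(pi*n) for odd n (a_n = 0).
a0 = V0 * np.sqrt(T) / 2
n = np.arange(1, 2001, 2)      # odd harmonics
bn = V0 * np.sqrt(2 * T) / (np.pi * n)
P_fourier = (a0**2 + np.sum(bn**2)) / (R * T)

print(f"direct: {P_direct:.6f} W, Fourier: {P_fourier:.6f} W, V0^2/2R: {V0**2 / (2 * R):.6f} W")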

Fourier Analysis of Discrete Data (DFT)

In our case, the incoming signal $h(t)$ is sampled at discrete times before being analyzed, which turns it into a series of numbers that we can call $h_k$, $k=0,\ldots,N-1$, where $N$ is the total number of data points. The sampling will be done at constant time intervals defined by $\delta$, the time between successive digitizations (units of time). We still want to Fourier analyze this to determine the frequency content, so we have to modify things for discrete data. The Fourier analysis of discrete data is well understood, and comes under the title "Discrete Fourier Transform", or DFT.

There are 2 important parameters to keep in mind here: the total number of points, $N$, and the sampling time, $\delta$. The total time over which the signal spans would then be $T=N\delta$. Now keep in mind that what we want to do is to figure out the set of signals with various frequencies that make up $h(t)$, using the digitized values $h_k$. So imagine that we had an incoming $h(t)\propto \cos\omega_x t$, where $\omega_x = 2\pi f_x=2\pi/T_x$ is unknown, and something we want to determine. With our sampling time $\delta$, we need to consider the period $T_c$ of the shortest waveform that we could measure. Since we want to see both a peak and a trough for an oscillating signal, any waveform with a half period smaller than $\delta$ would not be measurable, which means $T_c = 2\delta$. In frequency space, $f_c=1/T_c=1/2\delta$ is the largest frequency $f_x$ we could measure: $$f_x \lt f_c = \frac{1}{2\delta}\label{Nyquist}$$ Equation $\ref{Nyquist}$ is often referred to as the "Nyquist condition", and $f_c$ as the "critical frequency", and this is a very important parameter for digital transmission of time dependent signals. For instance, it is usually the case that modern day transmitters will only transmit in a limited range of frequencies, called the "bandwidth". If transmitters are bandwidth limited (usually by intention or due to electronics limitations), then by setting $\delta$ appropriately, you can recover the incoming signal frequencies exactly using DFT algorithms. When you transmit signals with frequencies higher than $f_c$, the DFT algorithms will not be able to see those higher frequencies, and they will show up at lower (beat) frequencies. This is called "aliasing". In video signals, it will manifest as distortions in the signal - you will miss the higher frequency components, and some lower frequency components will have more amplitude than they should. The cure for aliasing is to limit the bandwidth of the incoming signal, or better yet, match $\delta$ to what the transmitter is doing.
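A quick way to see aliasing in action is to sample a sinusoid above the critical frequency. In the sketch below (1 kHz sampling and a 700 Hz signal are arbitrary choices), the samples of the 700 Hz signal are exactly indistinguishable from those of a 300 Hz signal:

import numpy as np

delta = 1.0e-3                     # sampling time: 1 ms, so f_c = 500 Hz
t = np.arange(0, 1.0, delta)       # 1 second of samples

f_x = 700.0                        # signal frequency, deliberately above f_c
samples = np.cos(2 * np.pi * f_x * t)

# At the sample times, cos(2*pi*700*k*delta) = cos(2*pi*(1000-700)*k*delta),
# so the 700 Hz signal is indistinguishable from a 300 Hz one.
f_alias = 1 / delta - f_x          # 300 Hz
aliased = np.cos(2 * np.pi * f_alias * t)
print(f"f_c = {1 / (2 * delta)} Hz")
print(f"max difference between 700 Hz and 300 Hz samples: {np.max(np.abs(samples - aliased)):.2e}")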

Back to DFT. We have discrete data which comes in pairs: $t_k,h_k$. Each particular time $t_k$ is just some integer times the sampling time $\delta$, which means $t_k=k\delta$ with $k=0,\ldots,N-1$ (we start at $k=0$ so that the first frequency will be 0, or the DC level). This means that $$g_n(t)\to g_n(t_k) = \sqrt{2}\cos(\omega_n t_k)\nonumber$$ where we use the $\cos$, assuming real data with even symmetry. This makes $g_n(t_k)$ a matrix with row index $n$ and column index $k$, and suggests that you can think of a DFT as a matrix multiplication of the vector $h_k$ to get the amplitudes $a_n$ corresponding to frequencies $\omega_n$: $$a_n \propto \sum_k h_k\cdot g_n(t_k)\nonumber$$ This reduces our problem to finding the proportionality, frequencies, etc. Obviously, our inner products have to be changed from $\int$ to $\sum$. For continuous data, the integral is over continuous time $t$, and so for DFT, the time differential $dt\to\delta$ (this is natural, since to go from discrete to continuous, $\delta\to dt$). For continuous periodic data, we normalized the inner product over one period, whereas for DFT, the normalization will be over the entire time interval $N\delta$. For the angular frequencies $\omega_n$, remember that the sampling time $\delta$ defines the set of possible frequencies (and the Nyquist frequency $f_c$ defines the largest). The lowest measurable frequency will of course be $f_0=0$. The next will be $f_1 = 1/T_1$, where the period $T_1$ is the full range $T=N\delta$, so $f_1=1/N\delta$. The next highest has period $T_2 = \half N\delta$, giving $f_2 = 2/N\delta$. We can see the pattern, which gives $$f_n = \frac{n}{N\delta}\nonumber$$ The highest frequency here would be $f_{N-1}=(N-1)/N\delta\approx 1/\delta$, which is greater than $f_c$ by a factor of 2. This means that there are only $\half(N-1)$ distinct frequencies below $f_c$, even though we have $N$ data points. However that's ok, since we actually need to measure not only the amplitude at each frequency but also the phase (both $a_n$ and $b_n$), which accounts for the other half of the information. The "extra" frequencies above $f_c$ are not independent - they are aliases of the lower ones - so in order to recover all of the distinct frequencies, we need to limit the value of $n$ to $n=0,\ldots,\half(N-1)$.

We can now write the full DFT matrix multiplication: $$a_n = \frac{\sqrt{2}}{N\delta}\sum_{k=0}^{N-1}h_k\cos(\omega_n k\delta)\delta\nonumber$$ Substituting $\omega_n = 2\pi n/(N\delta)$ gives $$a_n = \frac{\sqrt{2}}{N}\sum_{k=0}^{N-1}h_k\cos(2\pi nk/N)\nonumber$$
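Here is a sketch of this matrix multiplication (with arbitrary $N$ and $\delta$, and a test signal placed at one of the exact Fourier frequencies), showing that it picks out the right coefficient:

import numpy as np

N, delta = 64, 1.0e-3        # number of samples and sampling time (arbitrary)
k = np.arange(N)

# Test signal: a single mode at the Fourier frequency omega_5 = 2*pi*5/(N*delta).
n_true = 5
h = np.cos(2 * np.pi * n_true * k / N)

# DFT as a matrix multiplication: a_n = (sqrt(2)/N) sum_k h_k cos(2*pi*n*k/N),
# for the distinct frequencies n = 0 ... N/2.
n = np.arange(N // 2 + 1)
G = np.cos(2 * np.pi * np.outer(n, k) / N)   # the matrix g_n(t_k) (up to sqrt(2))
a = (np.sqrt(2) / N) * (G @ h)

# Discrete orthogonality makes every coefficient vanish except n = 5,
# where a_5 = sqrt(2)/2 for a unit-amplitude cosine.
print(f"largest |a_n| at n = {np.argmax(np.abs(a))} (expect {n_true}), a_5 = {a[n_true]:.4f}")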



Drew Baden  Last update January 16, 2018