Waves, Superposition, Dispersion

Definitions
Speed of Light
Wave Functions
Principle of Superposition
Beats
Sound Beats
Traveling Wave
Traveling Waves
Group Velocity, 2 Waves
Faster Than Light?
Group Velocity, Many Waves
Dispersion

Definitions

We can define a "wave" as the propagation of a disturbance through a medium. If the direction of propagation is along the z axis, then we can characterize waves where the disturbance is parallel to that direction as "longitudinal" and those where the disturbance is perpendicular as "transverse".

The classic example of longitudinal waves is sound - the propagation of a pressure disturbance that is along the direction of motion. We can model such a wave with springs attached to masses as in the following figure:

As you can see in the diagram, the springs between the 2nd and 3rd masses are compressed, which will cause the spring to expand and compress the next set, and so on. This is how the wave propagates.

For transverse waves, the classic example is waves on the water. If you drop a stone into a pool of water, inertia and gravity push the stone below the water level, and since water is basically incompressible on these scales, water near the stone will be forced upward by buoyant forces. The water forced up will then come down, forcing water near it to go up, and so on.

What these two different kinds of waves have in common is that both propagate in a medium, and the propagation comes from inertia interplaying against a restoring force. Here we will look at what happens when you have more than one wave in a medium.

To start, we define the property of the wave called "period" $T$ as being the time it takes for the wave to repeat itself: $T$ = time per cycle. The frequency $f$ is defined as the number of cycles in a second, so if the number of cycles is 1, then it takes $T$ seconds to repeat, so the frequency will be related to the period as $$f=1/T\nonumber$$ The motion of a wave takes a time $T$ to repeat (go 1 cycle), but you can also define a cycle as being $360\deg$ or $2\pi$ radians. So we can form what is called the "angular frequency", $\omega$, as the angle swept out per second, in units of radians per second. So if it takes $T$ seconds to repeat (go through $2\pi$), then the angular frequency $\omega$ is given by $$\omega = \frac{2\pi}{T}\nonumber$$ or in terms of the frequency, $$\omega = 2\pi f\nonumber$$

If you define the velocity of the wave as the propagation distance per time, then in 1 period, the wave will go a distance $\lambda$ before repeating, so: $$v = \frac{\lambda}{T} = \lambda f\nonumber$$ The wavelength, $\lambda$, defined as the distance the wave propagates before it repeats itself, is always just a function of the medium: if the medium responds "quicker", then the wave propagates a shorter distance before repeating. This is important to understand: the frequency (and period) are determined by the source of the wave, and the wavelength comes from properties of the medium.

Using the definitions above, we can write $$v = \lambda f = \frac{\lambda}{2\pi}\cdot 2\pi f = \frac{\lambda}{2\pi}\cdot\omega\nonumber$$ It turns out to be more convenient (maybe mostly due to how it's used in quantum mechanics) to define the "wave number" $k$ as $$k = \frac{2\pi}{\lambda}\nonumber$$ which gives us $$v = \frac{\omega}{k}\nonumber$$
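As a quick numerical check of these definitions, here is a minimal Python sketch. The function name `wave_quantities` and the sample period and wavelength are illustrative assumptions, not values from the text.

```python
import math

def wave_quantities(period_s, wavelength_m):
    """Return (f, omega, k, v) for a wave with the given period and wavelength."""
    f = 1.0 / period_s                  # frequency, cycles per second
    omega = 2.0 * math.pi / period_s    # angular frequency, radians per second
    k = 2.0 * math.pi / wavelength_m    # wave number, radians per meter
    v = omega / k                       # velocity, which equals wavelength * f
    return f, omega, k, v

# Assumed sample values: a 0.5 s period and a 2 m wavelength.
f, omega, k, v = wave_quantities(0.5, 2.0)
print(f, omega, k, v)
```

The velocity computed as $\omega/k$ should agree with $\lambda f$ for any choice of inputs.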

Speed of Light

For electromagnetic waves, $v=c=3\times 10^{8}m/s$. This looks like an astoundingly large velocity, but that is only because the meter is quite small compared to the distances between planets and stars, and compared to how far light can travel in a second. For instance, it takes light 4.2465 years to get from our sun to Proxima Centauri, the nearest star. Since light travels at a constant speed, that distance can be calculated using $\Delta x = v\Delta t$ which gives us $\Delta x = 4.02\times 10^{16}m$, or $25.0$ trillion miles. If we were to instead use as our unit of distance the "light-year", which is the distance that light travels in a year, we can convert from meters to light-years using the same formula and get that 1 light-year = $9.46\times 10^{15}m$ = $5.88$ trillion miles. In these units, the speed of light will be equal to 1 light-year per year!
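The arithmetic behind these numbers can be reproduced with a short Python sketch. It assumes a year of 365.25 days and uses $c = 2.998\times 10^{8}$ m/s rather than the rounded value; both choices, and all the names in the code, are assumptions made for illustration.

```python
C_M_PER_S = 2.998e8                     # speed of light (assumed 4-digit value)
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed length of a year
METERS_PER_MILE = 1609.344

# Delta x = v * Delta t with Delta t = 1 year
light_year_m = C_M_PER_S * SECONDS_PER_YEAR
light_year_miles = light_year_m / METERS_PER_MILE

# Light travel time to Proxima Centauri quoted in the text
distance_m = 4.2465 * light_year_m
distance_miles = distance_m / METERS_PER_MILE

print(f"1 light-year = {light_year_m:.3e} m = {light_year_miles:.3e} miles")
print(f"Proxima Centauri: {distance_m:.3e} m = {distance_miles:.3e} miles")
```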

So we can think of the speed of light as simply a conversion factor between distance and time, in a given set of units. This can be handy in electronics. For instance, we can write $3\times 10^{8}m/s$ as $0.3m \times 10^{9}/s$ which we can write as $0.3m\times 1GHz$. We can use that to tell us the wavelength of an EM wave that is oscillating at $1GHz$: it has to be $0.3m$! If the frequency is in the FM range, say 100MHz, then the EM wavelength has to be 10 times bigger than $0.3m$, or $3.0m$. And so on. We can also write the speed of light as $0.3m \times 10^{9}/s = 0.3m/10^{-9}s = 0.3m/ns$ where $1ns = 10^{-9}s$ is 1 nanosecond. This says that the speed of light is $0.3m$ per $ns$, which is about 1 foot per ns. That's a convenient conversion when you are dealing with light over short distances.
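The same conversion can be written as a one-line function. This is only a sketch; the name `em_wavelength_m` is hypothetical and $c$ is rounded as in the text.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the text

def em_wavelength_m(frequency_hz):
    """Wavelength of an EM wave in vacuum, from v = lambda * f with v = c."""
    return C / frequency_hz

print(em_wavelength_m(1e9))     # wavelength at 1 GHz, in meters
print(em_wavelength_m(100e6))   # wavelength at 100 MHz (FM range), in meters
print(C * 1e-9)                 # distance light travels in 1 ns, in meters
```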

Wave Functions

Let's next consider a transverse wave that is non localized in space. So it might be a rope that is oscillating with some frequency $f$ and some wavelength $\lambda$. At any given point in time, the amplitude of the disturbance defines the maximum height above the equilibrium line. In the figure below, the horizontal line might be the level of the rope when it is still, and the blue represents the oscillations.

You can represent this wave mathematically using a "wave function", which tells you, at some given position $x$, the height $y$ above the equilibrium line $y=0$: $$y(x) = A\sin(2\pi \frac{x}{\lambda})\nonumber$$ $A$ is the amplitude, and tells you the maximum ($y=A$) and minimum ($y=-A$) displacement from the horizontal. You need the $2\pi$ in the argument to the $\sin$ function because trig functions are always functions of angles. The ratio $x/\lambda$ tells you what fraction of the wavelength you have at some point $x$. So the argument is $2\pi$ times the fraction of a wavelength. And note that the $\sin$ (or any other trig function) is periodic, such that $\sin(x)=\sin(x+2\pi n)$ where $n$ is any integer: it repeats!

Using the definition of $k$ as above, you can write the wave function as $$y(x) = A\sin(kx)\nonumber$$ The argument to the $\sin$ function is sometimes called the "phase", or "phase angle", since the argument is always an angle. The "phase constant" is the "phase" at the initial condition. In the above plot, if we say that the initial condition is at the left most point on the horizontal, then the phase constant will be zero. But it can be anything you like, and in general, we can call this variable $\theta_0$ and write the wave function as $$y(x) = A\sin(kx+\theta_0)\nonumber$$ where for the above figure, $\theta_0=0$.
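A minimal Python sketch of this wave function follows, with assumed values for the amplitude and wavelength. It evaluates the wave at a quarter wavelength and one full wavelength further along to show the periodicity.

```python
import math

def wave(x, amplitude, wavelength, theta0=0.0):
    """y(x) = A sin(k x + theta0), with k = 2 pi / wavelength."""
    k = 2.0 * math.pi / wavelength
    return amplitude * math.sin(k * x + theta0)

A, lam = 2.0, 40.0   # assumed amplitude and wavelength (arbitrary units)

# At a quarter wavelength the phase is pi/2, so the height equals the amplitude.
print(wave(lam / 4.0, A, lam))
# Shifting x by one full wavelength adds 2 pi to the phase and changes nothing.
print(wave(7.0, A, lam, 0.3), wave(7.0 + lam, A, lam, 0.3))
```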

Principle of Superposition

What if you have two waves that have the same wavelength, amplitude, and phase constant? That looks like this:

If you superimpose these waves on the same medium, the principle of superposition says that the wave functions add linearly, so that the total displacement is the sum of the displacement from each wave. For example, if you have a water surface and push down on one edge and create a wave, and push down on an opposite edge and create another wave, the waves will move towards each other and the wave when they overlap will be the linear sum of both waves. So if each wave can be described with a wave function $y_1 = A\sin(kx)$ and $y_2 = A\sin(kx)$, and here the waves have the same amplitude and wave number, and same phase constant, then the resulting wave will be given by $$y_{tot} = y_1 + y_2 = A\sin(kx) + A\sin(kx) = 2A\sin(kx)\nonumber$$ Now let's make 2 waves with different phase constants (let's use the phrase "different phases", to mean the same thing): $y_1 = A\sin(kx)$ and $y_2 = A\sin(kx+\theta_0)$. It will look like this:

In the figure above, the phase difference is $100\deg$. The vertical dashed yellow lines show you the two waves at a constant phase, and the solid yellow line shows you the different positions where the waves have that phase. You can draw the yellow line anywhere along the horizontal, as it is just telling you how much one wave "leads" or "lags" the other.

Note that we can write $kx+\theta_0$ as $k(x+\theta_0/k)=k(x+\lambda\theta_0/2\pi) =k(x+x_0)$ where $x_0\equiv \lambda\theta_0/2\pi$ is the distance between where the waves have the same phase.

The principle of superposition tells us that the total wave function will be $$y_{tot} = A\sin(kx) + A\sin(kx+\theta_0) = A[\sin(kx)+\sin(kx+\theta_0)]\nonumber$$ We could expand the $\sin$ term that has the 2 arguments, and rearrange with a lot of algebra. Before we do this, there's a shortcut that is good to know about because we will use this a lot below. The shortcut has you subtracting some constant phase $\phi_0$ from both waves at the same time. If we do this, we get the 2 wave functions $y_1=A\sin(kx-\phi_0)$ and $y_2=A\sin(kx+\theta_0-\phi_0)$. This might seem like we are getting into some hot water, but we are not, because if we subtract the same phase $\phi_0$ from both waves before adding, then all it does is translate the waves over by the same amount horizontally, so that the relative phase, $\theta_0$, will be the same. The real trick here is to choose $\phi_0 = \half\theta_0$. This gives us the two waves $y_1=A\sin(kx-\half\theta_0)$ and $y_2=A\sin(kx+\half\theta_0)$. Then we add to get: $$y_{tot} = A[\sin(kx-\half\theta_0) + \sin(kx+\half\theta_0)]\nonumber$$ Then we use the formula for the sine or cosine of the sum or difference between two angles: $$\sin(a\pm b)=\sin(a)\cos(b)\pm\cos(a)\sin(b)\nonumber$$ $$\cos(a\pm b)=\cos(a)\cos(b)\mp\sin(a)\sin(b)\nonumber$$ For our sum of the 2 wave functions, we get $$\begin{align} y_{tot} & = A[\sin(kx-\half\theta_0) + \sin(kx+\half\theta_0)]\nonumber\\ & = A[\sin(kx)\cos(\half\theta_0)-\cos(kx)\sin(\half\theta_0)+ \sin(kx)\cos(\half\theta_0)+\cos(kx)\sin(\half\theta_0)]\nonumber\\ & = 2A\cos(\half\theta_0)\sin(kx)\label{add2}\end{align}$$ There are many interesting things about this formula. For starters, you can see that when the phase difference $\theta_0=0$, then $y_{tot}=2A\sin(kx)$. This is called "constructive interference". 
When the phase difference is half a wavelength, or $\theta_0 = \pi$, then $\cos(\half\pi)=0$ and we get what's called "destructive interference", and the sum of the 2 waves is zero.
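The closed form $2A\cos(\half\theta_0)\sin(kx)$ can be checked numerically against the direct sum. The following Python sketch does this for three phase differences; the amplitude, wave number, and function names are assumptions chosen for illustration.

```python
import math

def two_wave_sum(x, amplitude, k, theta0):
    """Direct sum of the two waves after shifting both by theta0 / 2."""
    return (amplitude * math.sin(k * x - 0.5 * theta0)
            + amplitude * math.sin(k * x + 0.5 * theta0))

def closed_form(x, amplitude, k, theta0):
    """The single-term result 2 A cos(theta0 / 2) sin(k x)."""
    return 2.0 * amplitude * math.cos(0.5 * theta0) * math.sin(k * x)

A, k = 1.5, 2.0 * math.pi / 40.0   # assumed amplitude and wave number

for theta0 in (0.0, math.radians(100.0), math.pi):
    # Largest disagreement between the two expressions over two wavelengths
    worst = max(abs(two_wave_sum(x, A, k, theta0) - closed_form(x, A, k, theta0))
                for x in range(80))
    # Largest height of the summed wave over the same range
    peak = max(abs(two_wave_sum(x, A, k, theta0)) for x in range(80))
    print(f"theta0 = {theta0:.3f} rad, mismatch = {worst:.1e}, peak height = {peak:.3f}")
```

The peak height should be $2A$ for $\theta_0=0$ and essentially zero for $\theta_0=\pi$.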

In the figure below, click and drag the red wave along the horizontal to move it around, which gives it an effective phase difference with respect to the blue wave. You can see the sum (purple wave) change accordingly.

So this means that if you take 2 lasers that emit light with the same amplitude and frequency, and superimpose the beams such that the phase difference is half a wavelength ($\theta_0=\pi$), then you should get destructive interference and the beam should vanish. Does this actually happen? Can this actually happen? Can you plug 2 lasers into the wall, draw power out, see the beam from each one, and superimpose such that the beam vanishes? If so, where did the power go? The answer is not so obvious.

Beats

Equation $\ref{add2}$ above shows what happens when you add 2 waves with a difference in phase. We can use that to show what happens when you add 2 waves that don't have a phase difference, but do have a difference in wavelength (or difference in wave number $k=2\pi/\lambda$).

Let's take 2 waves: $$y_1=A\sin(k_1x)\nonumber$$ $$y_2=A\sin(k_2x)\nonumber$$ We can add them together, but first let's make some definitions: $$\Delta k \equiv k_2 - k_1\label{dk}$$ $$\bar{k} \equiv \half(k_2 + k_1)\label{kbar}$$ It is easy to solve for $k_1$ and $k_2$ in terms of $\Delta k$ and $\bar{k}$: $$k_2 = \bar{k} + \half\Delta k\label{k2}$$ $$k_1 = \bar{k} - \half\Delta k\label{k1}$$

Now we can write $$y_1 = A\sin(k_1x) = A\sin(\bar{k}x - \half\Delta k\cdot x)\nonumber$$ $$y_2 = A\sin(k_2x) = A\sin(\bar{k}x + \half\Delta k\cdot x)\nonumber$$ This has the same form as the sum that led to equation $\ref{add2}$, where $kx$ is now $\bar{k}x$ and $\half\theta_0$ is now $\half\Delta k\cdot x$. So when you add them together you should get $$y_{tot} = 2A\cos(\half\Delta k\cdot x)\sin(\bar{k}x)\label{beats}$$
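Here is a small Python sketch that checks the beat formula against the direct sum and computes the spacing between zeros of the slowly varying $\cos(\half\Delta k\cdot x)$ factor. The wavelengths of 40 and 36 units (a 90% ratio) are assumed sample values, and the function names are hypothetical.

```python
import math

def direct_sum(x, amplitude, k1, k2):
    """y1 + y2 for two waves with different wave numbers."""
    return amplitude * math.sin(k1 * x) + amplitude * math.sin(k2 * x)

def beat_pattern(x, amplitude, k1, k2):
    """2 A cos(delta_k x / 2) sin(k_bar x)."""
    k_bar = 0.5 * (k2 + k1)
    delta_k = k2 - k1
    return 2.0 * amplitude * math.cos(0.5 * delta_k * x) * math.sin(k_bar * x)

# Assumed sample wavelengths, in arbitrary units (for example pixels)
lambda_1, lambda_2 = 40.0, 36.0
k1, k2 = 2.0 * math.pi / lambda_1, 2.0 * math.pi / lambda_2

# cos(delta_k x / 2) vanishes whenever delta_k x / 2 = pi/2 + n pi,
# so successive zeros of the envelope are 2 pi / delta_k apart.
envelope_zero_spacing = 2.0 * math.pi / abs(k2 - k1)

worst = max(abs(direct_sum(x, 1.0, k1, k2) - beat_pattern(x, 1.0, k1, k2))
            for x in range(720))
print(envelope_zero_spacing, worst)
```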

In the simulation below, we again plot a blue and red wave with the same wavelength (here it's 40 pixels), and the purple sum. Above it you can increase or decrease the wavelength of the red wave by $\pm 1$ pixel, and it will show you the ratio of the wavelengths. As you change it, you will see some interesting behavior when the wavelengths are different, but close to each other so that $\Delta k\lt\lt \bar{k}$ (maybe $\Delta k \sim 5-10\%$ of $\bar{k}$). When the two wavelengths are close, then the $\Delta k$ term gets close to zero.

When you have the red wavelength at around 90% of the blue, you will see an interesting structure in the amplitude of the purple wave (which is the sum of red and blue). Since the blue and red wavelengths are similar, the quantity $\Delta k$ will be much smaller than the quantity $\bar{k}$. So the first part of equation $\ref{beats}$ - $2A\cos(\half\Delta k\cdot x)$ - is changing much slower than the second part, $\sin(\bar{k}x)$. You can think of this wave as having its oscillatory part ($\sin(\bar{k}x)$) "modulated" by a slowly changing amplitude due to the $\cos(\half\Delta k\cdot x)$ term. This phenomenon is called "beats".

Sound Beats

If you were listening to 2 sounds at slightly different frequencies, each sound wave would be described by a wave function $y_1 = A\sin(\omega_1 t)$ and $y_2 = A\sin(\omega_2 t)$ where here $\omega_1 \sim \omega_2$. Following what we did to get equation $\ref{beats}$ to add the two together gives $$y_{tot} = 2A\cos(\half\Delta \omega\cdot t)\sin(\bar{\omega}t)\label{fbeats}$$ What you would hear would be a sound with angular frequency $\bar{\omega}$ modulated in amplitude, oscillating with a "beat frequency" $\Delta f=\Delta\omega/2\pi$. You will often see people tune their instruments by playing a note on their instrument and the note you want to tune to at the same time and listening for the beat frequency, tuning it away. When the amplitude oscillation is gone, the 2 notes are the same ($\Delta\omega=0$). In an orchestra, before the piece starts, you will see the 1st violinist (called the "concert master") stand up and play the note A440 (the note that has a frequency of 440Hz). The rest of the orchestra will then tune to that. But in fact, sometimes after the 1st violinist plays their A440, the oboist will tune to that and then play the note, and the rest of the orchestra tunes to the oboist. This is because the oboe has a nice sharp tone, making it easy to hear.

In the above simulation, click on the button on the right ("Play Tone"). As you change the red frequency, you will hear the beats: the amplitude of the sum oscillates (is modulated) at the beat frequency $\Delta f = \Delta\omega/2\pi$.
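As a rough numerical illustration, the Python sketch below takes two tones at 440 Hz and 444 Hz (the second value is an assumed example of a slightly mistuned note) and compares the direct sum with the modulated form. The function names are hypothetical.

```python
import math

def beat_frequency(f1, f2):
    """Beat frequency delta_f = delta_omega / (2 pi) = |f2 - f1|."""
    return abs(f2 - f1)

def summed_tone(t, amplitude, f1, f2):
    """Direct sum of the two tones at time t."""
    return amplitude * (math.sin(2.0 * math.pi * f1 * t)
                        + math.sin(2.0 * math.pi * f2 * t))

def modulated_form(t, amplitude, f1, f2):
    """2 A cos(delta_omega t / 2) sin(omega_bar t)."""
    omega_bar = math.pi * (f1 + f2)           # average angular frequency
    delta_omega = 2.0 * math.pi * (f2 - f1)   # difference in angular frequency
    return 2.0 * amplitude * math.cos(0.5 * delta_omega * t) * math.sin(omega_bar * t)

f1, f2 = 440.0, 444.0   # A440 and an assumed slightly sharp second tone
print("beat frequency (Hz):", beat_frequency(f1, f2))

# The loudness follows |cos(delta_omega t / 2)|, which goes quiet once
# every 1 / delta_f seconds; the first quiet moment is at 1 / (2 delta_f).
t_quiet = 1.0 / (2.0 * beat_frequency(f1, f2))
print("first quiet moment (s):", t_quiet)
```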

Traveling Wave

If you drop a rock in some still water, the rock pushes the water down, and since water is incompressible, that causes some water to rise up above the water line somewhere else close by. What goes up comes down, and so you have created a disturbance that propagates outward in a circular pattern: a traveling wave. Another example would be if you bang a hammer on a piece of metal and you hear the longitudinal sound wave traveling outward. What we want to do next is to describe the wave function.

In the figure below, you will see a pulse on the left, and a yellow dot in the middle. If you hit "Start", the pulse travels to the right. This is an example of a traveling wave where the disturbance is transverse to the direction of motion. Like a water wave. The yellow dot could be a boat sitting in the water. As the wave comes by, you can see the motion of the boat - it is vertical, along the transverse direction. This shows clearly that what is actually traveling is the disturbance - there is no water moving with any velocity along the direction of the wave, the water motion is always in the vertical direction.

To derive the wave function for this pulse, let's first imagine that you were moving with the pulse (say, a pulse in the water and you are floating above it) at the same velocity as the pulse. Call this the $O'$ frame, and call the frame of the water the $O$ frame. The pulse is moving with some velocity $v$ in the $O$ frame, so the relative velocity between the two frames is also $v$.

In the simulation below, you can see two reference frames, $O$ and $O'$. Hit the "Start" button, and $O'$ will move with velocity $v$ relative to $O$ along the horizontal $x$ direction ($x$ measures the distance along the horizontal in $O$, and $x'$ measures the distance along the horizontal in the $O'$ frame). Any point will have both an $x$ coordinate, and an $x'$ coordinate, and these are related, as you can see, by the relation $$x = x' + vt\label{gal1}$$ This is just saying that any distance from the origin in $O$ is related to the distance from the origin in $O'$ by how far $O'$ has moved in $O$ in some time $t$. This is called the "Galilean transformation". Importantly, note that $y = y'$ since heights are the same in both frames. This is a general rule of such kinds of transformations: the components of any vector parallel to the motion transform as in equation $\ref{gal1}$, and components perpendicular are the same ($y=y'$ and $z=z'$ if motion is along $x$ and $x'$).

In the simulation below, you can see the two frames $O$ and $O'$ outlined in yellow. $O'$ moves with a velocity $v$ in $O$. This illustrates the Galilean transformation.

In $O'$, the pulse is stationary along the horizontal $x'$ direction. We can define a wave function that tells you the height $y'$ of the pulse as $y'=f(x')$. The function $f$ could be anything - here it is a gaussian. We can write the same function in frame $O$ as $y = y' = f(x') = f(x-vt)$: $$y = f(x-vt)\nonumber$$ So whatever wave function describes the pulse in the reference frame where the pulse isn't moving, the wave function in any other frame is given by the same function of $x-vt$. This describes a pulse moving to the right. If the pulse is moving to the left with the same speed, then its velocity is $-v$ and the wave function will be given by $$y = f(x+vt)\nonumber$$

So if in the $O'$ frame the wave function is given by $$y'(t) = A\sin(kx')\nonumber$$ then in the $O$ frame the wave function will be given by $$\begin{align} y(t) &= A\sin(k[x-vt])\nonumber\\ &= A\sin(kx-kvt)\nonumber\\ &= A\sin(kx-\omega t)\nonumber \end{align}\nonumber\\$$ where we have used the fact that the velocity is given by $$\begin{align} v &=\lambda f\nonumber\\ &= \frac{\lambda}{2\pi}\cdot 2\pi f\nonumber\\ &= \frac{\omega}{k}\label{vp} \end{align}\nonumber\\$$ and $\omega = kv$. Also note that $y'(t)=y(t)$ since both measure the distance perpendicular to the direction of motion along the $x$ axis.
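To see the Galilean substitution at work, here is a minimal Python sketch that builds a gaussian pulse $f(x-vt)$ and tracks where its peak sits at two times. The pulse width, the speed, the grid, and the function names are all assumed for illustration.

```python
import math

def gaussian(x_prime, width=1.0):
    """Pulse shape in the frame O' where the pulse is at rest."""
    return math.exp(-(x_prime / width) ** 2)

def pulse(x, t, v):
    """Height in frame O: the same shape evaluated at x' = x - v t."""
    return gaussian(x - v * t)

def peak_position(t, v, xs):
    """Grid point where the pulse is highest at time t."""
    return max(xs, key=lambda x: pulse(x, t, v))

xs = [i * 0.01 for i in range(-500, 2001)]   # grid from -5 to 20
v = 3.0                                      # assumed pulse speed

x0 = peak_position(0.0, v, xs)
x2 = peak_position(2.0, v, xs)
print("peak moved", x2 - x0, "in 2 time units")
```

The peak should move a distance $v\Delta t$, which is the sense in which $f(x-vt)$ describes a pulse traveling to the right at speed $v$.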

The simulation above (showing a wave pulse) has the shape of a pulse that has a definite beginning and ending. But you can have a traveling wave that is continuous. For instance, you have an infinitely long rope that is attached at one end. You hold your end and shake it up and down. This sends a traveling wave down the rope. Hit the "Start" button below to show a continuous wave moving to the right. This is called a "traveling wave". The yellow dot is a point at constant $x$, oscillating due to the transverse wave, to illustrate that the wave moves to the right, but the displacement is at a constant horizontal coordinate $x$.

Traveling Waves (more than 1 traveling wave)

Let's see what happens when we have 2 traveling waves, superimposed. As an example, imagine dropping 2 stones at 2 different places on a water surface. The resulting waves from each will spread towards each other and superimpose. Below, you will see 2 waves: a blue wave moving to the right, and a red wave with the same amplitude, wavelength, and frequency moving to the left. The 3rd purple wave is the sum of the 2 traveling waves. The yellow dot is a tagged position, responding to the sum of the two traveling waves at that point. So it might correspond to a piece of wood in the water, when 2 traveling waves going in opposite directions come by and interfere. You can use the arrows to change the tagged position and you should be able to find a position where the dot doesn't move at all. This position is called a "node", which is where the 2 waves interfere destructively.

The purple wave is called a "standing" wave, and as you can see, it's easy to make with two equal and opposite waves. The nodes are those positions where the waves are always interfering destructively, and have all kinds of interesting applications. For example, atomic physicists take two lasers and shine the beams towards each other, creating a standing wave between the lasers. If you put an atom at the node, the atom will stay there (in 2 dimensions). This is how people make neutral atom traps (and there are other ways as well).
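The node positions can be checked numerically. The sketch below adds a right-moving and a left-moving wave and evaluates the sum at $x = n\lambda/2$, where $\sin(kx)=0$. That these are the nodes follows from the angle-sum formula (the sum is $2A\sin(kx)\cos(\omega t)$), a step not written out in the text. The wavelength, period, and names are assumed.

```python
import math

def standing_wave(x, t, amplitude, k, omega):
    """Sum of two equal waves traveling in opposite directions."""
    right = amplitude * math.sin(k * x - omega * t)   # moving toward +x
    left = amplitude * math.sin(k * x + omega * t)    # moving toward -x
    return right + left

def node_positions(wavelength, count):
    """Positions x = n * wavelength / 2, where sin(k x) = 0."""
    return [n * wavelength / 2.0 for n in range(count)]

wavelength, period = 40.0, 2.0   # assumed sample values
k = 2.0 * math.pi / wavelength
omega = 2.0 * math.pi / period

times = [0.1 * n for n in range(40)]
for x in node_positions(wavelength, 4):
    # Largest displacement seen at this position over two periods
    largest = max(abs(standing_wave(x, t, 1.0, k, omega)) for t in times)
    print(f"x = {x:5.1f}: largest displacement {largest:.1e}")

# Halfway between two nodes the displacement swings between -2A and +2A.
print(max(abs(standing_wave(wavelength / 4.0, t, 1.0, k, omega)) for t in times))
```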

You can also change the wavelength of the red wave relative to the blue wave by clicking on the bottom row of buttons labeled "Change red wavelength:". Each click changes the red wavelength by 1 pixel relative to the blue. What you can see happen is that the amplitude of the interference wave is no longer constant, but seems to be moving along the horizontal. This is an example of what is called the "group velocity", described next.

Group Velocity, 2 Waves

Now let's see what happens when you add 2 waves together that have different wavelengths and different frequencies, and are moving in the same direction. Note: we are referring here to 2 waves each of which is a function of position, and is traveling in some direction so it is also a function of time. Such a wave is always represented as a $\sin$ and/or $\cos$ function (or equivalently, using exponential notation) and these waves exist for all values of $x$: they are non-localized. A wave that is localized is a more complicated thing that needs Fourier analysis to analyze, and we will save that for a later time.

So, we will need 2 wave functions: $$\begin{align} y_1 &= A\sin({k_1x-\omega_1t})\nonumber\\ y_2 &= A\sin({k_2x-\omega_2t})\nonumber\\ \end{align}\nonumber\\$$ To add these we first define $\Delta k \equiv k_2 - k_1$ and $\bar k \equiv \half(k_2+k_1)$ and solve for $k_1$ and $k_2$ to get $$\begin{align} k_1 &= \bar k - \half\Delta k\nonumber\\ k_2 &= \bar k + \half\Delta k\nonumber\\ \end{align}\nonumber\\$$ Similarly, we define $\Delta \omega \equiv \omega_2 - \omega_1$ and $\bar \omega \equiv \half(\omega_2+\omega_1)$ and solve for $\omega_1$ and $\omega_2$ to get $$\begin{align} \omega_1 &= \bar \omega - \half\Delta \omega\nonumber\\ \omega_2 &= \bar \omega + \half\Delta \omega\nonumber\\ \end{align}\nonumber\\$$

Then we can rewrite the 2 wave functions as $$\begin{align} y_1 &= A\sin({k_1x-\omega_1t})\nonumber\\ &= A\sin([\bar k-\half\Delta k]x-[\bar\omega-\half\Delta\omega]t)\nonumber\\ &= A\sin([\bar kx-\bar\omega t] - \half[\Delta k\cdot x-\Delta\omega\cdot t])\nonumber\\ \end{align}\nonumber\\$$ Similarly, we have $$\begin{align} y_2 &= A\sin({k_2x-\omega_2t})\nonumber\\ &= A\sin([\bar k+\half\Delta k]x-[\bar\omega+\half\Delta\omega]t)\nonumber\\ &= A\sin([\bar kx-\bar\omega t] + \half[\Delta k\cdot x-\Delta\omega\cdot t])\nonumber\\ \end{align}\nonumber\\$$ To make it easy to see how to add, let's define $\alpha \equiv \bar kx-\bar\omega t$ and $\beta \equiv \half(\Delta kx-\Delta\omega t)$ and write the 2 wave functions as $$\begin{align} y_1 &= A\sin(\alpha-\beta)\nonumber\\ y_2 &= A\sin(\alpha+\beta)\nonumber\\ \end{align}\nonumber\\$$ Adding them together gives $$\begin{align} y &= y_1+y_2=2A\cos(\beta)\sin(\alpha)\nonumber\\ &= 2A\cos(\half[\Delta k\cdot x-\Delta\omega\cdot t])\sin(\bar kx-\bar\omega t)\nonumber\\ &= 2A\cos(\half\Delta k[x-\frac{\Delta\omega}{\Delta k}t])\sin(\bar kx-\bar\omega t)\label{beat2}\\ \end{align}\nonumber\\$$ This has the same form as the result we got when we added 2 waves with 2 different wavelengths: the resulting wave oscillates at the average wave number, with an amplitude modulated at $\half$ the difference. But here, since these are 2 traveling waves going in the same direction, the resulting wave propagates with the average wave number and angular frequency, and the amplitude modulation also moves, at a rate set by the differences in wave number and angular frequency. 
This modulated amplitude is usually called the "group", and it has a velocity (called the "group velocity") which can be seen clearly by rewriting equation $\ref{beat2}$ as $$y= 2A\cos(\half\Delta k[x-v_gt])\sin(\bar kx-\bar\omega t)\nonumber$$ where we have defined $$v_g \equiv \frac{\Delta\omega}{\Delta k}\label{vg}$$ So in summary, each wave has a velocity given by equation $\ref{vp}$, $v=\omega/k$, and we call this the "phase velocity" because it is the velocity of a point of constant phase on the wave. The amplitude envelope of the sum of the 2 waves is called the group, with a group velocity given by equation $\ref{vg}$.
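A short Python sketch can confirm both the phase and group velocities and the product form of the sum. The wave numbers and angular frequencies below are assumed sample values picked to give round numbers, and the function names are hypothetical.

```python
import math

def phase_velocity(omega, k):
    """v_p = omega / k for a single wave."""
    return omega / k

def group_velocity(omega1, k1, omega2, k2):
    """v_g = delta_omega / delta_k for the sum of two waves."""
    return (omega2 - omega1) / (k2 - k1)

def two_wave_sum(x, t, amplitude, k1, omega1, k2, omega2):
    """Direct sum of the two traveling waves."""
    return (amplitude * math.sin(k1 * x - omega1 * t)
            + amplitude * math.sin(k2 * x - omega2 * t))

def group_form(x, t, amplitude, k1, omega1, k2, omega2):
    """2 A cos(delta_k [x - v_g t] / 2) sin(k_bar x - omega_bar t)."""
    k_bar, omega_bar = 0.5 * (k1 + k2), 0.5 * (omega1 + omega2)
    delta_k = k2 - k1
    v_g = group_velocity(omega1, k1, omega2, k2)
    return (2.0 * amplitude * math.cos(0.5 * delta_k * (x - v_g * t))
            * math.sin(k_bar * x - omega_bar * t))

# Assumed sample values
k1, omega1 = 10.0, 20.0
k2, omega2 = 11.0, 23.0

print("v_p1 =", phase_velocity(omega1, k1))
print("v_p2 =", phase_velocity(omega2, k2))
print("v_g  =", group_velocity(omega1, k1, omega2, k2))

worst = max(abs(two_wave_sum(0.1 * i, 0.05 * j, 1.0, k1, omega1, k2, omega2)
                - group_form(0.1 * i, 0.05 * j, 1.0, k1, omega1, k2, omega2))
            for i in range(50) for j in range(50))
print("largest mismatch:", worst)
```

With these numbers the group moves faster than either individual wave, which matches what the text says can happen when $\Delta\omega/\Delta k$ is large.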

Below you can see 2 waves drawn in blue with (slightly) different angular frequency and wavelengths, so each wave will have a different phase velocity given by equation $\ref{vp}$. The yellow dot on each wave represents some constant point in order to make it easy to see the different phase velocities.

Below the 2 blue waves is the sum of both waves, in red, which shows the "beat" pattern (the group), with a wavelength given by the average of the 2 blue waves, and the amplitude has the beat modulation.

Hit the "Start" button to start the simulation. You will see the 2 blue waves start traveling at constant phase velocities, and you will see the amplitude modulation (the group) also traveling as per equation $\ref{beat2}$. This group travels with a group velocity given by equation $\ref{vg}$. The up and down arrow buttons allow you to change the relative wavelength and period of the waves to see how that affects the group. For instance, if you make $\Delta\lambda$ small or $\Delta T$ large, then you can get an arbitrarily large group velocity $v_g$ and this will be clear when you run the simulation.

Faster than light?

Non-localized waves

Note that so far here we have been considering waves that are described by a wave function $y(t)$ such as $$y(t) = A\sin(kx-\omega t)\nonumber$$ where $v_p=\omega/k$.

This wave function is defined for all space, and has an infinite extent (all $x$). This is called a "de-localized" wave.

Can the phase velocity $v_p=\omega/k$ exceed that for light? Special relativity requires that nothing can exceed the speed of light, $c$, but put another way, it says that no signals can travel faster than $c$. If we define the signal velocity $v_s$, then special relativity says $v_s\le c$. So the question we are really asking here is whether the phase velocity can also be the signal velocity, and we are asking it in the context of a non-localized wave.

So how would you send a signal (aka information) in a traveling wave moving with phase velocity $v_p$? Because if you can send information in a traveling wave at the velocity $v_p$, then $v_p=v_s\le c$ by special relativity.

To address this, imagine you were looking at this wave coming in, using some kind of detector. What you would be measuring could be the amplitude $A$, the wavelength $\lambda$, the frequency $f$ (or period $T$), or the phase constant. And what you would see is that all of these values are constant. And that means that no information is being sent other than the first moment when someone turned it on and you measured the first values. After that, there's no change in the wave, so no information, consistent with the rules for information theory: a constant value has no information, because you can predict the next value, and a random input would have maximum information because you could not predict the next value from looking at all the previous values. So as a rule, an incoming de-localized wave with some phase velocity $v_p$ carries no information, so $v_p\ne v_s$. Since there's no constraint on $v_p$ due to special relativity, it could in fact be greater than the speed of light, $c$.

However, for non-localized light waves in a vacuum, $v_p=\omega/k=c$. If we add 2 light waves with different frequencies together, then each wave has $\omega=v_pk$ and we can write $$\Delta\omega = \omega_2-\omega_1 = v_pk_2-v_pk_1 = v_p\Delta k\nonumber$$ so using equation $\ref{vg}$ the group velocity will be $$v_g = \frac{\Delta\omega}{\Delta k}=\frac{v_p\Delta k}{\Delta k}=v_p\nonumber$$ and so the phase and group velocity are equal for non-localized light waves in a vacuum, whatever their frequencies. However, in dispersive media, where the index of refraction (and hence the velocity) is a function of the incoming frequency, we will have different phase, group, and even signal velocities.
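The vacuum case is easy to check numerically: with $\omega = ck$, the ratio $\Delta\omega/\Delta k$ returns $c$ for any pair of wave numbers. The wave numbers in this Python sketch are assumed sample values and the function names are hypothetical.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the text

def omega_vacuum(k):
    """Angular frequency for light in a vacuum: omega = c k."""
    return C * k

def group_velocity(k1, k2):
    """v_g = delta_omega / delta_k for two vacuum light waves."""
    return (omega_vacuum(k2) - omega_vacuum(k1)) / (k2 - k1)

# Assumed pairs of wave numbers in rad/m (roughly visible-light scale)
for k1, k2 in [(1.0e7, 1.1e7), (1.0e7, 1.5e7), (0.9e7, 1.6e7)]:
    print(k1, k2, group_velocity(k1, k2))
```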

Localized waves

This is an entirely different ball of wax. For localized waves, we need to use the mathematics of Fourier analysis.

As you will notice from the above, it's possible to make the group velocity $v_g$ greater than the phase velocity $v_p$ of either wave, and additionally, as $\Delta k\to 0$, we can make $v_g$ arbitrarily large, even larger than the speed of light.

Group Velocity, Many Waves

With more than 2 waves, things get more complicated, and so we can make use of Fourier analysis. So imagine that we have a bunch of waves, with wave functions $\psi_i$ where $i$ is the index that goes from $1$ to the number of waves, $N$. Then when you add up all of the waves, you get a function $f(x,t)$ given by $$f(x,t) = \sum_{i=1}^N \psi_i\nonumber$$ As $N$ gets large, we can replace the sum by an integral, and write the function $f(x,t)$ evaluated at $t=0$ as a Fourier integral $$f(x,0) = \int_{-\infty}^{\infty} A(k)e^{ikx}dk\nonumber$$ By the principle of superposition, the wave function $f(x,t)$ will be given by $$f(x,t) = \int_{-\infty}^{\infty} A(k)e^{i(kx-\omega t)}dk\label{fourier1}$$ where $\omega=\omega(k)$ is some function of the wave number. For light in a vacuum, $c=\omega/k$ so $\omega(k)=ck$.

Equation $\ref{fourier1}$ describes a "wave packet", $f(x,t)$, that is moving to the right (along increasing $x$). We don't know what $\omega(k)$ is, but if the wave packet is "peaked" around some central value $k_0$, then we can define $\omega_0=\omega(k_0)$ and expand $\omega(k)$ using a Taylor expansion: $$\begin{align} \omega(k) &\to \omega(k_0) + (k-k_0)\frac{\partial\omega}{\partial k}\rvert_{k_0}\nonumber \\ &= \omega_0 + (k-k_0)\omega'(k_0)\label{taylor1}\end{align}$$ where $\omega'(k_0)= \frac{\partial \omega}{\partial k}\rvert_{k_0}$

Substituting this into equation $\ref{fourier1}$ gives $$\begin{align} f(x,t) &= \int_{-\infty}^{\infty} A(k)e^{i(kx-[\omega_0+(k-k_0)\omega']t)}dk\nonumber\\ &=\int_{-\infty}^{\infty} A(k)e^{ikx}e^{-i\omega_0t}e^{-ik\omega't}e^{ik_0\omega't}dk\nonumber\\ &=\int_{-\infty}^{\infty} A(k)e^{ikx}e^{ik_0x}e^{-ik_0x}e^{-i\omega_0t}e^{-ik\omega't}e^{ik_0\omega't}dk\nonumber\\ &=e^{i(k_0x-\omega_0t)}\int_{-\infty}^{\infty} A(k)e^{i(k-k_0)x}e^{-i(k-k_0)\omega't}dk\nonumber\\ &=e^{i(k_0x-\omega_0t)}\int_{-\infty}^{\infty} A(k)e^{i(k-k_0)(x-\omega't)}dk\label{fourier2}\\ \end{align}$$ The first part of $f(x,t)$ describes a wave propagating with phase velocity given by $v_p=\omega_0/k_0$. The 2nd part depends on $x$ and $t$ only through the combination $x-\omega't$, so it describes an envelope propagating along the $x$ direction with a group velocity given by $$v_g = \frac{\partial x}{\partial t} =\omega'=\frac{\partial\omega}{\partial k}\rvert_{k_0}\label{groupexp}$$ Note: this works if we have $N$ waves that are grouped around a central wave number $k_0$, and if the function $\omega(k)$ is mostly linear (call this the "Linear" region) so that we can ignore the 2nd derivative $\partial^2\omega/\partial k^2$.
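The result can be illustrated with a discrete version of the Fourier integral. The Python sketch below sums 81 waves with a gaussian $A(k)$ centered on $k_0$ and a linear $\omega(k)$, then tracks the peak of $|f(x,t)|$. Every numerical value, the gaussian weighting, and all names are assumptions made for illustration, and the finite sum only approximates the integral.

```python
import cmath
import math

K0 = 10.0       # central wave number (assumed)
OMEGA0 = 20.0   # omega(k0), giving v_p = omega0 / k0 = 2 (assumed)
SLOPE = 3.0     # d(omega)/dk at k0, the expected group velocity (assumed)
SIGMA_K = 0.5   # spread of the packet in k (assumed)

def omega(k):
    """Linear dispersion relation: the first-order Taylor expansion about k0."""
    return OMEGA0 + (k - K0) * SLOPE

def amplitude(k):
    """Gaussian weight A(k) peaked at k0."""
    return math.exp(-((k - K0) / SIGMA_K) ** 2)

KS = [K0 + 0.05 * j for j in range(-40, 41)]   # wave numbers from 8 to 12

def packet(x, t):
    """Discrete stand-in for the integral of A(k) exp(i (k x - omega t)) dk."""
    return sum(amplitude(k) * cmath.exp(1j * (k * x - omega(k) * t)) for k in KS)

def envelope_peak(t, xs):
    """Grid position where |f(x, t)| is largest."""
    return max(xs, key=lambda x: abs(packet(x, t)))

XS = [0.05 * i for i in range(-200, 401)]   # positions from -10 to 20

x_start = envelope_peak(0.0, XS)
x_end = envelope_peak(4.0, XS)
measured_vg = (x_end - x_start) / 4.0
print("measured group velocity:", measured_vg)
print("phase velocity omega0/k0:", OMEGA0 / K0)
```

The measured envelope speed should match the slope of $\omega(k)$ rather than the phase velocity $\omega_0/k_0$.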

In the next simulation we will add more than 2 waves. Hit the "START" button to start the simulation. The yellow dot shows the motion of each wave. You can click on either "Linear" (the default), or "NonLinear" to change $\omega(k)$. In "Linear" mode, the red wave shows a clear grouping, with each group moving along at the group velocity given by equation $\ref{groupexp}$. Here the group moves faster than the phase velocity of the individual waves, but that's only a function of the slope of $\omega(k)$, seen in the chart below.

If you click on "NonLinear", it changes $\omega(k)$ to be nonlinear, so that the next term in the expansion (equation $\ref{taylor1}$) is non-zero, and in this case large. You can see the effect of the grouping in the red wave: the groups are changing as the wave propagates, and there's no real group velocity because there's no real group. This means that in order to get a consistent and continuous group propagation, so that you can send information, you need the angular frequency $\omega$ to be linear in the wave number. This is something that is guaranteed when you form a wave packet from a Fourier sum.

In the plot below you can see $\omega$ vs $k$. The group velocity is defined as the slope in that plane, given by $\delta\omega/\delta k = \partial\omega/\partial k$ in the Linear region.

Dispersion

As seen above, for light in a vacuum, $v_p=\omega/k$, and $v_p=c$ which is constant. In a medium such as glass or water, the index of refraction $n$ characterizes how much light slows down, and this is the source of refraction. The velocity of light in the medium, $v_n$, is given by $v_n=c/n$ and is often treated as independent of wavelength and frequency. For most media this is only an approximation, though a good one because the change in index of refraction is small. For instance, in water, the index of refraction goes from 1.342 for violet (410nm) light to 1.331 for red (660nm), a change of 0.83% in 250nm which covers most of the visible light spectrum. For flint glass, the index changes from 1.698 for violet to 1.662 for red, a change of 2.2% over that range. The plot below shows the change in the index of refraction as a function of wavelength, relative to what it is at 660nm (red).
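Using the water indices quoted above, a few lines of Python give the speed of each color in the medium and the fractional change in the index. The dictionary and function names are hypothetical and $c$ is rounded.

```python
C = 3.0e8  # speed of light in vacuum, m/s (rounded)

# Indices of refraction for water quoted in the text
WATER_INDEX = {"violet (410nm)": 1.342, "red (660nm)": 1.331}

def speed_in_medium(n):
    """v_n = c / n."""
    return C / n

for color, n in WATER_INDEX.items():
    print(f"{color}: n = {n}, v_n = {speed_in_medium(n):.4e} m/s")

n_violet = WATER_INDEX["violet (410nm)"]
n_red = WATER_INDEX["red (660nm)"]
percent_change = 100.0 * (n_violet - n_red) / n_red
print(f"index change relative to red: {percent_change:.2f}%")
```

Since violet has the larger index, it travels slightly slower in water than red does.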

Using Snell's law $n_1\sin\theta_1=n_2\sin\theta_2$, we can see that the refraction angle depends on the index of refraction, and if the index of refraction is a function of wavelength, then the angle will also change with wavelength. So if you have a material like flint glass or diamond whose index of refraction is a strong function of wavelength, then white light will "spread out", or be dispersed, as it goes through the medium.
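A minimal Python sketch of Snell's law shows the size of the effect for water. It assumes light arriving from air ($n_1 = 1$) at $45\deg$ incidence, neither of which is specified in the text, and the function name is hypothetical.

```python
import math

def refraction_angle_deg(incidence_deg, n1, n2):
    """Solve n1 sin(theta1) = n2 sin(theta2) for theta2, in degrees.

    Assumes n1 sin(theta1) / n2 <= 1, i.e. no total internal reflection.
    """
    sin_theta2 = n1 * math.sin(math.radians(incidence_deg)) / n2
    return math.degrees(math.asin(sin_theta2))

N_AIR = 1.0          # assumed index for the incoming side
INCIDENCE = 45.0     # assumed angle of incidence, in degrees

theta_violet = refraction_angle_deg(INCIDENCE, N_AIR, 1.342)   # water, 410nm
theta_red = refraction_angle_deg(INCIDENCE, N_AIR, 1.331)      # water, 660nm

print("violet refracts to", theta_violet, "degrees")
print("red refracts to   ", theta_red, "degrees")
print("angular spread    ", theta_red - theta_violet, "degrees")
```

The larger index for violet bends it closer to the normal than red, and the difference between the two angles is the angular spread of the colors.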

This is how a prism takes advantage of dispersion to separate the frequencies of white light into components.

So in general, dispersion is when the index of refraction is a function of wavelength: $n=n(\lambda)$. Or equivalently, since the speed of the wave is a function of the index of refraction (by definition), then $v_p=v_p(\lambda)$ or you could also say $v_p=v_p(\omega)$ since $\lambda$ and $\omega$ are related through the velocity.

Drew Baden Last update May 7, 2024 All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.