That is not what this article is about; see below.
In signal sampling and in design of experiments, aliasing occurs when several continuous signals which are different become indistinguishable (or aliases of one another) when sampled. Consequently the signal cannot be uniquely reconstructed from the sampled signal.
For example, if one would observe the position of the sun in the sky every 23 hours, it would look as if the sun moves in opposite direction, and much more slowly than it actually moves. If the observation is made every 25 hours the proper direction is observed, but again it would seem to move much slower.
A similar temporal aliasing effect may occur filming a spoked wheel.
Aliasing can also occur in spatial sampling; the "jaggies" seen in poorlysampled raster images are a common spatial aliasing phenomenon.
The term "aliasing" derives from the usage in radio engineering, where a radio signal could be picked up at two different positions on the radio dial in a superheterodyne radio: one where the local oscillator was above the radio frequency, and one where it was below. This is analogous to the frequencyspace "wrapround" that is one way of understanding aliasing. However, there is a deeper way of understanding aliasing, based on function approximation[?], which is outlined below as an introduction.

First we will introduce a formal notion of "continuous signal". Since there are more than one possible choices (depending on the subject at hand), we will give some general outline, but fix our attention on a specific example for the purpose of this article. Second, we will give a notion of similarity of signals. Again, this precise notion depends on the underlying physical problem, but we will provide a common example for the sake of discussion. Third, we will give the most common sampling method as an example, and fourth we will show its failings. Fifth, we will give an improved sampling method that is more intune with the similarity notion introduced in the second section.
In engineering, the method introduced in the third section is called sampling, while a method such as that introduced in the fifth section is called filtering. This discussion may be viewed as a theoretical introduction to the ideas of antialiasing.
A signal is usually a real or complex valued function whose domain could be the real line (which we might understand as a time variable), the plane (for computer graphics) or some other set. More specific details depend on the nature of the signals we're studying, but an example of a mathematically precise space of interest might be
the set of functions whose square is integrable.
These signals are in fact not necessarily continuous functions; the adjective "continuous" refers to the domain <math>[0,1]</math> as opposed to a discrete or even finite set.
The signal could arise from a variety of physical processes. For instance, one could measure the seismic movement of the ground with a seismograph. The output of a seismograph is a strip of paper known as a seismogram. This strip of paper can be interpreted as the graph of a function. This function will be in L^{2} as defined above, and thus we obtain a mathematical signal from a physical process.
We need to find a way of judging whether two such signals are similar. We will quantify this, usually with a norm. Such a simple system may not be appropriate; for an example of a more sophisticated approach, see the psychoacoustic model. For the sake of discussion, we will use the root mean square norm (see L_{p} spaces for some details).
An obvious way to go from a continuous function of <math>[0,1]</math> to an ndimensional vector (a sampled signal) is the following "point sampling" method:
That is, the function is sampled at the points <math>1/n, 2/n, ..., 1</math>. The obvious advantage of this scheme is its simplicity. However, the map <math>S_0</math> has some unfortunate properties.
Note that <math>S_0</math> is a linear map: if f and g are any two functions for which <math>S_0f</math> and <math>S_0g</math> are defined, and if a is any scalar, then <math>S_0(af+g)=aS_0(f)+S_0(g)</math>. The domain of <math>S_0</math> includes at least all continuous functions of <math>[0,1]</math>. On the other hand, for technical reasons, it is not clear how to extend <math>S_0</math> to all of <math>L^2</math>. In particular (and perhaps more telling) is that <math>S_0</math> is not continuous as a function on <math>L^2</math>.
Indeed, define <math>f_k</math> by
Then, the norm <math>f_k</math> in <math>L^2</math> is <math>1/\sqrt k</math> and so <math>f_k</math> converges to zero. However, for any <math>k>n</math>, the vector <math>S_0f_k</math> is <math>(0,...,0,1)</math> and so <math>S_0f_k</math> does not converge to zero. Hence <math>S_0</math> is not continuous.
Therefore, the sampling function <math>S_0</math> very poorly represents our notion of closeness in our signal space <math>L^2</math>.
First, note that <math>L^2</math> is contained in <math>L^1([0,1])</math> (see Lp spaces.) Hence, we can define a new "interval averaging" sampling method by
This sampling method, unlike <math>S_0</math>, is defined over all of <math>L^2</math>. Also, by the CauchySchwarz inequality (for instance,) <math>S_1</math> is also continuous in the root mean square norm. Hence, signals which alias to the same sampled vector will be related as far as the root mean square norm is concerned.
We continue with <math>L^2</math> but now we will use its Hilbert space structure. Let F be any sampling method (a linear map from <math>L^2</math> to <math>\Bbb C^n</math>.)
In our example, the vector space of sampled signals <math>\Bbb C^n</math> is ndimensional complex space. Any proposed inverse R of F (reconstruction formula, in the lingo) would have to map <math>\Bbb C^n</math> to some subset of <math>L^2</math>. We could choose this subset arbitrarily, but if we're going to want a reconstruction formula R that is also a linear map, then we have to choose an ndimensional linear subspace of <math>L^2</math>.
This fact that the dimensions have to agree is related to the NyquistShannon sampling theorem.
The elementary linear algebra approach works here. Let <math>d_k:=(0,...,0,1,0,...,0)</math> (all entries zero, except for the kth entry, which is a one) or some other basis of <math>\Bbb C^n</math>. To define an inverse for F, simply choose, for each k, an <math>e_k \in L^2</math> so that <math>F(e_k)=d_k</math>. This uniquely defines the (pseudo)inverse of F.
Of course, one can choose some reconstruction formula first, then either compute some sampling algorithm from the reconstruction formula, or analyze the behavior of a given sampling algorithm with respect to the given formula.
Perhaps the most widely used reconstruction formula is as follows. Let <math>\{ e_k \}</math> is a basis of <math>L^2</math> in the Hilbert space sense; for instance, one could use the canonical
although other choices are certainly possible. Note that here the index k can be any integer, even negative.
Then we can define a linear map R by
for each <math>k=\lfloor n/2 \rfloor,...,\lfloor (n1)/2 \rfloor</math>, where <math>(d_k)</math> is the basis of <math>\Bbb C^n</math> given by
(This is the usual discrete Fourier basis.)
The choice of range <math>k=\lfloor n/2 \rfloor,...,\lfloor (n1)/2 \rfloor</math> is somewhat arbitrary, although it satisfies the dimensionality requirement and reflects the usual notion that the most important information is contained in the low frequencies. In some cases, this is incorrect, so a different reconstruction formula needs to be chosen.
A similar approach can be obtained by using wavelets instead of Hilbert bases. For many applications, the best approach is still not clear today.
We note here that there is an efficient algorithm, known as the Fast Fourier transform to convert vectors between the canonical basis of <math>\Bbb C^n</math> and the Fourier basis <math>(d_k)</math>. This algorithm is significantly faster than the matrix multiplication required in the general case of change of basis. On the other hand, wavelets are often defined so that the change of basis matrix is sparse, and so again the change of basis algorithm is efficient.
For many applications, the reconstruction algorithm R suggested above is useful. We can return to <math>S_0</math> and <math>S_1</math> and analyze them in relation to R.
In the case of <math>S_0</math>, for any <math>\lfloor n/2 \rfloor \le j \le \lfloor (n1)/2 \rfloor</math>, at least, <math>R(S_0(e_j))=e_j</math>, and so the reconstruction formula is a reasonable match of the sampling method. However, it is fairly easy to see that, for any <math>j</math>, <math>S_0(e_j)</math> is nonzero, in fact, these quantities behave periodically in <math>j</math>. Hence, any high frequencies which can not be represented by R will get "folded into" the low frequencies.
In the case of <math>S_1</math>, one can analyze, via convolutions, to which extend the high frequencies are "folded into" the low frequencies. They still are, but to a somewhat lesser extent.
This "folding" of frequencies is often considered to be undesirable when R is chosen properly. For instance, it is possible to choose a reconstruction formula based on the Haar basis (see wavelets) in which case <math>S_1</math> does not fold any high frequencies into the lower frequencies. However, this reconstruction formula (or the Haar basis) are inappropriate to most problems.
If one is giving a reconstruction formula in terms of Hilbert bases, as is our case, then one can give a "perfect" filter, which does not fold any frequencies at all, in terms of convolutions.
See also:
Search Encyclopedia
