GRADUATE COURSE — QUANTIFYING MIXING

Contents
1. Introduction
1.1. Aim of a mixing process
1.2. Mixing is intuitive
1.3. Mixing by stretching and folding
1.4. Mixing via chaotic advection
1.5. A mixer based on a chaotic dynamical system
1.6. Mixing via MHD
1.7. Oceanographic Mixing
1.8. Duct flows
1.9. Mixing through turbulence
1.10. Mixing via dynamos
1.11. A good definition
1.12. How to quantify all this...
2. Mathematical background
2.1. The space M .
2.2. Describing sets of points in M .
2.3. A and µ (Measuring the ‘size’ of sets).
2.4. The transformation f
3. Fundamental results for measure-preserving dynamical systems
4. Ergodicity
5. Mixing
6. Transfer Operators
6.1. Connection with mixing
7. Example systems
7.1. Baker’s Map
7.2. Cat Map
7.3. Standard Map
8. Segregation
8.1. Scale of segregation
8.2. Intensity of segregation
9. Transport matrices
9.1. Perron-Frobenius Theory
9.2. Eigenvalues of the transport matrix
References
1. Introduction
1.1. Aim of a mixing process. The aim of a mixing process is, of course, to mix,
effectively, efficiently and quickly:
1. trans. a. To put together or combine (two or more substances or
things) so that the constituents or particles of each are interspersed or
diffused more or less evenly among those of the rest; to unite (one or more
substances or things) in this manner with another or others; to make a
mixture of, to mingle, blend. (OED)
To produce a mixture:
Substances that are mixed, but not chemically combined. Mixtures are
nonhomogeneous, and may be separated mechanically. (Hackh’s Chemical
Dictionary)
1.2. Mixing is intuitive. Mixing is an intuitive procedure, and many simple devices
can be found in the home which achieve this aim.
Figure 1: Everyday mixing devices and procedures.
1.3. Mixing by stretching and folding. At the core of many mechanical mixing processes is the idea of stretching fluid elements, and then returning stretched elements by
folding or cutting. This has been understood for many years, and indeed, “None other
than Osborne Reynolds advocated in a 1894 lecture demonstration that, when stripped of
details, mixing was essentially stretching and folding and went on to propose experiments
to visualize internal motions of flows.” [Ottino et al.(1994)].
1.4. Mixing via chaotic advection. These ideas were given mathematical rigour by
establishing a connection between stretching and folding of fluid elements and chaotic
dynamics, and in particular with the ideas of horseshoes [Ottino(1989a), Smale(1967)].
Figure 2 illustrates the state of a chaotic mixing device. This type of pattern, consisting
of a highly striated structure, is typical, and a natural question is: how do we quantify
the quality of mixing?
Figure 2: Mixing of two highly viscous fluids between eccentric cylinders [Ottino(1989b)]
1.5. A mixer based on a chaotic dynamical system. The Kenics® Mixer is an
industrial mixing device based on the well-known Baker’s transformation.
Figure 3: The Kenics® mixer [Galaktionov et al.(2003)]
Excerpt from [Galaktionov et al.(2003)] (section, figure, footnote and reference numbers are those of the excerpt):

    Figure 1: Examples of different Kenics designs: (a): a “standard” right-left layout with 180° twist of the blades (RL-180); (b): right-right layout with blades of the same direction of twist (RR-180); (c): (RL-120) right-left layout with 120° blade twist.

    [...] degrees) to specify a particular mixer geometry. Thus, for example RL-180 stands for the mixer, combining the blades twisted 180° in both directions, figure 1a, as analyzed in [3]. Figure 1b shows the RR-180 configuration as was considered by Hobbs and Muzzio [8], while figure 1c illustrates the RL-120 geometry, which was suggested as more energy efficient in [9].

    1.2 Principle of Kenics operation

    The Kenics mixer in general is intended to mimic to a possible extent the “bakers transformation” (see [1]): repetitive stretching, cutting and stacking. To illustrate the principles of the Kenics static mixer a series of the concentration profiles inside the first elements of the “standard” (RL-180) are presented in figure 2.

    Figure 2: How the Kenics mixer works: the frames show the evolution of concentration patterns within the first four blades of the RL-180 mixer.

    All these concentration distributions are obtained using the mapping approach. The first image shows the initial pattern at the beginning of the first element: each channel is filled partly by black (c = 1) and partly by white (c = 0) fluid, with the interface perpendicular to the blade. The flux of both components is equal. The images in figure 2b-e show the evolution of the concentration distribution along the first blade, the thin dashed line in figure 2e denotes the leading edge of the next blade. From the point of view of mimicking the bakers transformation it seems that the RL-180 mixer has a too large blade twist: the created layers do not have (even roughly) equal thickness. The configuration achieved 1/4 blade twist earlier (figure 2d) seems to be much more preferable. The next frame, figure 2f, shows the mixture patterns just 10° into the second, oppositely twisted, blade. The striations, created by the preceding blade are cut and dislocated at the blade. As a result, at the end of the second blade (figure 2g) the number of striations is doubled. After four mixing elements, figure 2h, sixteen striations are found in each channel. The Kenics mixer roughly doubles the number of striations with each blade, although some striations may not stretch across the whole channel width. Note, that the images in figure 2 show the actual spatial orientation of the striations and mixer blades. In all further figures the patterns are transformed to the same orientation: the (trailing edge of the) blade is positioned horizontally. This simplifies the comparison of self-similar distributions.

    1.3 Existing approaches to Kenics mixer characterization

    The widespread use of the Kenics mixer prompted the attention to the kinematics of its operation and attempts to find ways to improve its performance. Khakhar et al. [10] considered the so-called partitioned pipe mixer, designed to mimic the operation of Kenics. The analogy is incomplete, since the partitioned pipe mixer is actually a dynamic device, consisting of rotating pipe around a number of straight, fixed, perpendicular placed, rectangular plates¹. This device, however, gives the possibility to control its efficiency by changing the rotation speed of the pipe (which may be considered to be analogous to the twist of the blades in a Kenics mixer) and allowed relatively simple mathematical modeling using an approximate analytical expression for the velocity field. The expression for the velocity field (and, consequently, the numerical simulations) was improved by Meleshko et al. [11], achieving even better agreement with experimental results of [10]. However, these studies were dealing with a simplified model, which fails to catch the details of the real flow in a Kenics static mixer.

    The increasing computational power allowed different researchers to perform direct simulations of the three-dimensional flow in Kenics mixers [3, 8, 12–16]. The last paper considers even flows with higher Reynolds numbers up to Re = 100. These studies analyzed only certain particular flows and, unlike [10], did not allow for the optimization of the mixer geometry, due to high cost of 3D simulations.

    More systematic efforts on exploring the efficiency of the Kenics mixer were made by [9], who suggested a more energy efficient design with a total blade twist of 120°. They explored different mixer configurations, but, since the velocity field had to be re-computed every time, the scope was limited: only seven values of the blade twist angle were analyzed. The aim of the current work is to study numerically the dependence of the mixer performance on the geometrical parameter (blade twist angle) and to determine the optimal configuration within the imposed limitations. Since it was shown in [9] that the blade pitch has rather minor effect on mixer performance, it is fixed in the current work.

    The Kenics static mixer was also considered as a tool to enhance the heat exchange through the pipe walls [17]. They found that the Kenics mixer may offer a moderate improvement in heat transfer, but its applicability in this function is limited by difficulty of i.e. wall cleaning. However, only mixers with the “standard” 180° blade twist were considered. In the current work we also analyse the influence of the blade twist angle on refreshing of material on the tube surface. Recently, Fourcade et al. [16] addressed the efficiency of striation thinning by the Kenics mixer both numerically and experimentally, using the so-called “striation thinning parameter” that describes the exponential thinning rate of material striations. This was done by inserting a large number of “feed circles” and numerically tracking markers along the mixer. Their method allows to characterize the efficiency of the static mixer. However, adjusting the geometry would necessitate repetition of all particle tracking computations. Optimization of the mixer geometry calls for a special tool that allows to re-use the results of tedious, extensive computations in order to compare different mixer layouts. A good candidate for such a tool is the mapping technique.

    ¹ Note that the partitioned pipe mixer is actually a simplified model of a RR type of Kenics mixer.

1.6. Mixing via MHD.
Figure 4: A magnetohydrodynamic chaotic stirrer, [Yi et al.(2002)]
1.7. Oceanographic Mixing.
Figure 5: Plankton bloom at the Shetland islands. [NASA]
1.8. Duct flows. Schematic view of a duct flow with concatenated mixing elements. Red
and blue blobs of fluid mix well under a small number of applications. Changing only the
position of the centres of rotation can have a marked effect on the quality of mixing.
Figure 6: Duct flow
1.9. Mixing through turbulence.
Figure 7: Scalar concentration distribution from a high resolution numerical simulation of
a turbulent flow in a two-dimensional plane for a Schmidt number of 144 and a Reynolds
number of 22. (Courtesy of G. Brethouwer and F. Nieuwstadt)
1.10. Mixing via dynamos.
Figure 8: Magnetic field generated by inductive processes by the motions of a highly
conducting fluid. The prescribed velocity is of a type known to be a fast dynamo, i.e.,
capable of field amplification in the limit of infinite conductivity (Cattaneo et al. 1995).
Figure 9: stuff (annotations in the figure: “How big?”, “How wide?”, “How does this compare with this?”)
1.11. A good definition.
THOMASINA:
When you stir your rice pudding, Septimus, the spoonful of jam spreads
itself round making red trails like the picture of a meteor in my astronomical atlas. But if you stir backward, the jam will not come together again.
Indeed, the pudding does not notice and continues to turn pink just as
before. Do you think this odd?
SEPTIMUS:
No.
THOMASINA:
Well, I do. You cannot stir things apart.
SEPTIMUS:
No more you can, time must needs run backward, and since it will not, we
must stir our way onward mixing as we go, disorder out of disorder into
disorder until pink is complete, unchanging and unchangeable, and we are
done with it for ever.
Arcadia, Tom Stoppard
1.12. How to quantify all this... Danckwerts (1952): “...two distinct parameters are
required to characterize the ‘goodness of mixing’...the scale of segregation...and the intensity of segregation” [Denbigh, 1986]
2. Mathematical background
Mathematically, a mixing process can be described as a transformation of a space into itself¹. The space might represent a container of fluid, the ocean, the atmosphere, or something
numerical, while the transformation represents whatever procedure induces the mixing,
for example stirring, shaking, turbulence, MHD etc. A transformation of a space into itself is a dynamical system, and to express it rigorously requires a quadruple (M, A, f, µ).
We begin with some details about each of these four concepts.
¹ Most of this section is taken from [Sturman et al.(2006)] and references therein.
2.1. The space M . M is the space (the flow domain) and will have some structure,
including:
• metric space. A metric space, M , is a set of points for which a rule is given to
describe the “distance” between points. The rule is defined by a function defined
on pairs of points such that the value of the function evaluated on two points
gives the distance between the points. If x, y denote two points in M , then d(x, y)
denotes the distance between x and y and this distance function, or metric satisfies
the following three properties:
(1) d(x, y) = d(y, x) (i.e., the distance between x and y is the same as the distance
between y and x),
(2) d(x, y) = 0 ⇐⇒ x = y ( i.e., the distance between a point and itself is zero,
and if the distance between two points is zero, the two points are identical),
(3) d(x, y) + d(y, z) ≥ d(x, z) (the “triangle inequality”).
The most familiar metric space is the Euclidean space, $\mathbb{R}^n$. Here the distance between two points $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$ is given by the
Euclidean metric $d(x, y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$.
• vector space. This is a space whose set of elements is closed under vector addition
and scalar multiplication. Again Euclidean space Rn is the standard example of
a vector space, in which vectors are a list of n real numbers (coordinates), scalars
are real numbers, vector addition is the familiar component-wise vector addition,
and scalar multiplication is multiplication on each component in turn.
• normed vector space. To give a vector space some useful extra structure and be
able to discuss the length of vectors, we endow it with a norm, which is closely
related to the idea of a metric. A norm gives the length of each vector in V .
Definition 1 (Norm). A norm is a function $\|\cdot\| : V \to \mathbb{R}$ which satisfies:
(1) $\|v\| \geq 0$ for all $v \in V$ and $\|v\| = 0$ if and only if $v = 0$ (positive definiteness)
(2) $\|\lambda v\| = |\lambda| \, \|v\|$ for all $v \in V$ and all scalars $\lambda$
(3) $\|v + w\| \leq \|v\| + \|w\|$ for all $v, w \in V$ (the triangle inequality)
It is easy to see the link between a norm and a metric. For example, the norm
of a vector v can be regarded as the distance from the origin to the endpoint of v.
More formally, a norm $\|\cdot\|$ gives a metric induced by the norm, $d(u, v) = \|u - v\|$.
Whilst the Euclidean metric is the most well-known, other norms and metrics
are sometimes more appropriate to a particular situation. A family of norms called
the $L^p$-norms are frequently used, and are defined as follows:
$L^1$-norm: $\|x\|_1 = |x_1| + |x_2| + \dots + |x_n|$
$L^2$-norm: $\|x\|_2 = \left(|x_1|^2 + |x_2|^2 + \dots + |x_n|^2\right)^{1/2}$
$L^p$-norm: $\|x\|_p = \left(|x_1|^p + |x_2|^p + \dots + |x_n|^p\right)^{1/p}$
$L^\infty$-norm: $\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$
Here the $L^2$-norm induces the standard Euclidean metric discussed above. The
$L^1$-norm induces a metric known as the Manhattan or Taxicab metric, as it gives
the distance travelled between two points in a city consisting of a grid of horizontal
and vertical streets. The limit of the $L^p$-norms, the $L^\infty$-norm, is simply equal to
the modulus of the largest component.
• inner product space. An inner product space is simply a vector space V endowed
with an inner product. An inner product is a function ⟨·, ·⟩ : V × V → R. As
usual, the inner product on Euclidean space Rn is familiar, and is called the dot
product, or scalar product. For two vectors v = (v1 , . . . , vn ) and w = (w1 , . . . , wn )
this is given by ⟨v, w⟩ = v · w = v1 w1 + . . . + vn wn . On other vector spaces the
inner product is a generalization of the Euclidean dot product. An inner product
adds the concept of angle to the concept of length provided by the norm discussed
above.
• topological space. Endowing a space with some topology formalizes the notions of
continuity and connectedness.
Definition 2. A topological space is a set X together with a set T containing
subsets of X, satisfying:
(1) The empty set ∅ and X itself belong to T
(2) the intersection of two sets in T is in T
(3) the union of any collection of sets in T is in T
The sets in T are open sets, which are the fundamental elements in a topological
space. We give a definition for open sets in a metric space in definition 4. The
family of all open sets in a metric space forms a topology on that space, and
so every metric space is automatically a topological space. In particular, since
Euclidean space Rn is a metric space, it is also a topological space. (However, the
reverse is not true, and there are topological spaces which are not metric spaces.)
• manifold. The importance of Euclidean space can be seen in the definition of a
manifold. This is a technical object to define formally, but we give the standard
heuristic definition, that a manifold is a topological space that looks locally like
Euclidean space Rn . Of course, Euclidean space itself gives a straightforward
example of a manifold. Another example is a surface like a sphere (such as the
Earth), which looks like a flat plane to a small enough observer (producing the impression
that the Earth is flat). The same could be said of other sufficiently well-behaved
surfaces, such as the torus.
• smooth, infinitely differentiable manifold. The formal definition of a manifold
involves local coordinate systems, or charts, to make precise the notion of “looks
locally like”. If these charts possess some regularity with respect to each other,
we may have the notion of differentiability on the manifold. In particular, with
sufficient regularity, a manifold is said to be a smooth, or infinitely differentiable
manifold.
• Riemannian. On a smooth manifold M one can define the tangent
space. At each point x ∈ M we associate a vector space (called the tangent space,
and written Tx M ) which contains all the directions in which it is possible to pass
through x. Elements in Tx M are called tangent vectors, and these formalise the
idea of directional derivatives. We will frequently have to work with the tangent
space to describe the rate at which points on a manifold are separated. However,
throughout the book our manifolds will be well-behaved two-dimensional surfaces,
and so tangent space will simply be expressed in the usual Cartesian or polar
coordinates.
Finally, if a differentiable manifold is such that all tangent spaces are equipped
with an inner product then the manifold is said to be Riemannian. This allows
a variety of notions, such as length, angle, volume, curvature, gradient and divergence.
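The norms and induced metrics introduced in the list above can be tried out numerically. The following is a minimal Python sketch, assuming vectors in Rn are stored as plain lists of floats; the function names lp_norm and induced_metric are illustrative choices and do not come from these notes.

```python
import math

def lp_norm(x, p):
    """Return the L^p norm of the vector x (a list of floats); p may be math.inf."""
    if p == math.inf:
        # L-infinity norm: modulus of the largest component.
        return max(abs(xi) for xi in x)
    if p < 1:
        raise ValueError("p must satisfy p >= 1 for the triangle inequality to hold")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def induced_metric(x, y, p):
    """Distance d(x, y) = ||x - y||_p induced by the L^p norm."""
    return lp_norm([xi - yi for xi, yi in zip(x, y)], p)

x = [3.0, -4.0]
origin = [0.0, 0.0]

taxicab = induced_metric(x, origin, 1)         # |3| + |-4|
euclid = induced_metric(x, origin, 2)          # sqrt(3^2 + 4^2)
largest = induced_metric(x, origin, math.inf)  # max(|3|, |-4|)

# For large p the L^p norm approaches the L-infinity norm.
near_limit = lp_norm(x, 50)

print(taxicab, euclid, largest, near_limit)
```

The same point is a different distance from the origin in each metric, which is why the choice of norm matters when a notion of "closeness" is later used to describe mixing.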
2.2. Describing sets of points in M . Once a metric is defined on a space (i.e., set of
points), then it can be used to characterize other types of sets in the space, for example
the sets of points ‘close enough’ to a given point:
Definition 3 (Open ε-Ball). The set
B(x, ε) = {y ∈ M | d(x, y) < ε},
is called the open ε-ball around x.
Intuitively, such a set is regarded as open, as although it does not contain points y a
distance of exactly ε away from x, we can always find another point in the set (slightly)
further away than any point already in the set.
With this definition we can now define the notion of an open set.
Definition 4 (Open Set). A set U ⊂ M is said to be open if for every x ∈ U there exists
an ε > 0 such that B(x, ε) ⊂ U .
Thus, open sets have the property that all points in the set are surrounded by points
that are in the set. The family of open sets gives the required topology for M to be a
topological space. The notion of a neighborhood of a point is similar to that of open set.
Definition 5 (Neighborhood of a Point). If x ∈ M and U is an open set containing x,
then U is said to be a neighborhood of x.
Definition 6 (Limit Point). Let V ⊂ M , and consider a point p ∈ M . We say that p is
a limit point of V if every neighborhood of p contains a point q ≠ p such that q ∈ V .
The notion of a boundary point of a set will also be useful.
Definition 7 (Boundary Point of a Set). Let V ⊂ M . A point x ∈ M is said to be a
boundary point of V if for every neighborhood U of x we have U ∩ V ≠ ∅ and U \V ≠ ∅
(where U \V means “the set of points in U that are not in V ”).
So a boundary point of a set is not surrounded by points in the set, in the sense that
you cannot find a neighborhood of a boundary point that is entirely contained in the
set.
Definition 8 (Boundary of a Set). The set of boundary points of a set V is called the
boundary of V , and is denoted ∂V .
It is natural to define the interior of a set as the set that you obtain after removing the
boundary. This is made precise in the following definition.
Definition 9 (Interior of a Set). For a set V ⊂ M , the interior of V , denoted Int V, is
the union of all open sets that are contained in V . Equivalently, it is the set of all x ∈ V
having the property that B(x, ε) ⊂ V , for some ε > 0. Equivalently, Int V = V \∂V .
Definition 10 (Closure of a Set). For a set V ⊂ M , the closure of V , denoted V̄ , is the
set of x ∈ M such that B(x, ε) ∩ V ≠ ∅ for all ε > 0.
So the closure of a set V may contain points that are not part of V . This leads us to the
next definition.
Definition 11 (Closed Set). A set V ⊂ M is said to be closed if V = V̄ .
In the above definitions the notion of the complement of a set arose naturally. We give
a formal definition of this notion.
Definition 12 (Complement of a Set). Consider a set V ⊂ M . The complement of V ,
denoted M \V (or V^c , or M − V ) is the set of all points p ∈ M such that p ∉ V .
Given a “blob” (i.e., set) in our flow domain (M ), we will want to develop ways of
quantifying how it “fills out” the domain. We begin with some very primitive notions.
Definition 13 (Dense Set). A set V ⊂ M is said to be dense in M if V̄ = M .
Intuitively, while a dense set V may not contain all points of M , it does contain ‘enough’
points to be close to all points in M .
Definition 14 (Nowhere Dense Set). A set V ⊂ M is said to be nowhere dense if V̄ has
empty interior, i.e., it contains no (nonempty) open sets.
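A short worked example may help fix the definitions of this section. It is only an illustration and is not taken from the notes: it assumes M is the real line with the metric d(x, y) = |x − y|, and takes V to be the half-open interval [0, 1).

```latex
% Worked example in M = \mathbb{R} with d(x, y) = |x - y| and V = [0, 1).
\begin{align*}
B(0, \varepsilon) &= (-\varepsilon, \varepsilon) \not\subset V \text{ for every } \varepsilon > 0
  && \Rightarrow\ V \text{ is not open, and } 0 \text{ is a boundary point of } V, \\
\operatorname{Int} V &= (0, 1), \\
\overline{V} &= [0, 1] \neq V
  && \Rightarrow\ V \text{ is not closed, since } 1 \in \overline{V} \setminus V, \\
\overline{\mathbb{Q}} &= \mathbb{R}
  && \Rightarrow\ \mathbb{Q} \text{ is dense in } \mathbb{R}, \\
\overline{\mathbb{Z}} &= \mathbb{Z}, \quad \operatorname{Int} \mathbb{Z} = \emptyset
  && \Rightarrow\ \mathbb{Z} \text{ is nowhere dense in } \mathbb{R}.
\end{align*}
```

The rationals and the integers are both "thin" subsets of the line, yet only the rationals come arbitrarily close to every point, which is the distinction the last two definitions are designed to capture.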
2.3. A and µ (Measuring the ‘size’ of sets). A measure is a function that assigns
a number to a given set. The assigned number can be thought of as a size, probability
or volume of the set. Indeed a measure is often regarded as a generalization of the idea
of the volume (or area) of a set. Every definition of integration is based on a particular
measure, and a measure could also be thought of as a type of “weighting” for integration.
Definition 15 (Measure). A measure µ is a real-valued function defined on a σ-algebra
satisfying the following properties:
(1) µ(∅) = 0,
(2) µ(A) ≥ 0,
(3) for a countable collection of disjoint sets $\{A_n\}$, $\mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n)$.
These properties are easily understood in the context of the most familiar of measures.
In two dimensions, area (or volume in three dimensions) intuitively has the following
properties: the area of an empty set is zero; the area of any set is non-negative; the
area of the union of disjoint sets is equal to the sum of the area of the constituent sets.
The measure which formalises the concept of area or volume in Euclidean space is called
Lebesgue measure.
The collection of subsets of M on which the measure is defined is called a σ-algebra
over M . Briefly, a σ-algebra over M is a collection of subsets that is closed under the
formation of countable unions of sets and the formation of complements of sets. More
precisely, we have the following definition.
Definition 16 (σ-algebra over M ). A σ-algebra, $\mathcal{A}$, is a collection of subsets of M such
that:
(1) $M \in \mathcal{A}$,
(2) $M \setminus A \in \mathcal{A}$ for $A \in \mathcal{A}$,
(3) $\bigcup_{n \geq 0} A_n \in \mathcal{A}$ for all $A_n \in \mathcal{A}$ forming a finite or infinite sequence $\{A_n\}$ of subsets
of M .
In other words, a σ-algebra contains the space M itself, and sets created under countable
set operations. These are, roughly, the sets which can be measured.
If µ is always finite, we can normalise to µ(M ) = 1. In this case there is an analogy
with probability theory that is often useful to exploit, and µ is referred to as a probability
measure. A set equipped with a σ-algebra is called a measurable space. If it is also
equipped with a measure, then it is called a measure space.
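On a finite space the definitions of σ-algebra and measure can be checked by brute force. The sketch below is a toy illustration under stated assumptions: M is a four-point set, the measure is given by hypothetical point weights, and the helper names generate_sigma_algebra and measure are invented here. Because M is finite, closure under pairwise unions is enough to give closure under countable unions.

```python
from fractions import Fraction
from itertools import combinations

def generate_sigma_algebra(space, generators):
    """Smallest collection of subsets of a finite space that contains the
    generators and the space itself and is closed under complements and unions."""
    space = frozenset(space)
    sets = {space} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        # Close under complements.
        for a in list(sets):
            complement = space - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        # Close under pairwise unions (sufficient on a finite space).
        for a, b in combinations(list(sets), 2):
            union = a | b
            if union not in sets:
                sets.add(union)
                changed = True
    return sets

def measure(subset, weights):
    """Measure of a subset as the sum of the (assumed) point weights."""
    return sum((weights[p] for p in subset), Fraction(0))

M = {1, 2, 3, 4}
sigma = generate_sigma_algebra(M, [{1}, {2}])

# Hypothetical weights, normalised so that the measure of M is 1.
weights = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(3, 10), 4: Fraction(4, 10)}

for a in sorted(sigma, key=lambda s: (len(s), sorted(s))):
    print(sorted(a), measure(a, weights))
```

In this sketch the points 3 and 4 are never separated by the generators, so the set {3} is not measurable even though it is a perfectly good subset of M. This is the sense in which the σ-algebra lists "the sets which can be measured".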
There are a number of measure-theoretic concepts we will encounter in later chapters.
The most common is perhaps the idea of a set of zero measure. We will repeatedly be
required to prove that points in a set possess a certain property. In fact what we actually
prove is not that every point in a set possesses that property, but that almost every point
possesses the property. The exceptional points which fail to satisfy the property form a
set of measure zero, and such a set is, in a measure-theoretic sense, negligible. Naturally,
a subset U ⊂ M has measure zero if µ(U ) = 0. Strictly speaking, we should state that U
has measure zero with respect to the measure µ. Moreover, if U ∉ A, the σ-algebra, then it
is not measurable and we must replace the definition by: a subset U ⊂ M has µ-measure
zero if there exists a set A ∈ A such that U ⊂ A and µ(A) = 0. However, in this book,
and in the applications concerned, we will allow ourselves to talk about sets of measure
zero, assuming that all such sets are measurable, and that the measure is understood.
Note that a set of zero measure is frequently referred to as a null set.
From the point of view of measure theory two sets are considered to be “the same”
if they “differ by a set of measure zero.” This sounds straightforward, but to make this
idea mathematically precise requires some effort. The mathematical notion that we need
is the symmetric difference of two sets. This is the set of points that belong to exactly
one of the two sets. Suppose U1 , U2 ⊂ M ; then the symmetric difference of U1 and U2 ,
denoted U1 △ U2 , is defined as U1 △ U2 ≡ (U1 \U2 ) ∪ (U2 \U1 ). We say that U1 and U2 are
equivalent (mod 0) if their symmetric difference has measure zero.
This allows us to define precisely the notions of sets of full measure and positive measure.
Suppose U ⊂ M , then U is said to have full measure if U and M are equivalent (mod 0).
Intuitively, a set U has full measure in M if µ(U ) = 1 (assuming µ(M ) = 1). A set of
positive measure is intuitively understood as a set V ⊂ M such that µ(V ) > 0, that is,
strictly greater than zero. The support of a measure µ on a metric space M is the set of
all points x ∈ M such that every open neighbourhood of x has positive measure.
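The notions of symmetric difference, equivalence (mod 0) and full measure can be illustrated on a small finite example. The following Python sketch assumes a three-point space with hypothetical weights, one of which is zero; the names mu and equivalent_mod_zero are illustrative only.

```python
from fractions import Fraction

# Hypothetical probability measure on M = {"a", "b", "c"}; the point "c" is a null set.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
M = set(weights)

def mu(subset):
    """Measure of a subset as the sum of its point weights."""
    return sum((weights[p] for p in subset), Fraction(0))

def equivalent_mod_zero(u1, u2):
    """True if the symmetric difference of u1 and u2 has measure zero."""
    # For Python sets, u1 ^ u2 is (u1 - u2) | (u2 - u1).
    return mu(u1 ^ u2) == 0

U1 = {"a"}
U2 = {"a", "c"}   # differs from U1 only by the null set {"c"}
U3 = {"a", "b"}   # differs from M only by the null set {"c"}

print(equivalent_mod_zero(U1, U2))  # U1 and U2 are "the same" for the measure
print(equivalent_mod_zero(U1, U3))  # U1 and U3 differ on a set of positive measure
print(equivalent_mod_zero(U3, M))   # U3 has full measure
```

Here U3 is a proper subset of M, but it has full measure because the only missing point carries no weight.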
Finally in this section we mention the notion of absolute continuity of a measure. If µ
and ν are two measures on the same measurable space M then ν is absolutely continuous
with respect to µ, written ν ≪ µ, if ν(A) = 0 for every measurable A ⊂ M for which
µ(A) = 0. Although absolute continuity is not something which we will have to work
with directly, it does form the basis of many arguments in ergodic theory. Its importance
stems from the fact that for physical relevance we would like properties to hold on sets
of positive Lebesgue measure, since Lebesgue measure corresponds to volume. Suppose
however that we can only prove the existence of desirable properties for a different measure
ν. We would then like to show that ν is absolutely continuous with respect to Lebesgue
measure, as the definition of absolute continuity would guarantee that if ν(A) > 0 for a
measurable set A, then the Lebesgue measure of A would also be strictly positive. In other
words, any property exhibited on a significant set with respect to ν would also manifest
itself on a significant set with respect to Lebesgue measure.
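Two standard examples, added here for illustration and not drawn from the notes, show what absolute continuity does and does not allow. Both assume M = [0, 1] with Lebesgue measure λ; the density 2x and the point mass δ₀ are arbitrary choices.

```latex
% M = [0, 1] with Lebesgue measure \lambda (an illustrative choice).
\begin{align*}
\nu(A) &= \int_A 2x \, d\lambda(x)
  && \Rightarrow\ \lambda(A) = 0 \text{ implies } \nu(A) = 0, \text{ so } \nu \ll \lambda, \\
\delta_0(A) &= \begin{cases} 1, & 0 \in A, \\ 0, & 0 \notin A, \end{cases}
  && \Rightarrow\ \lambda(\{0\}) = 0 \text{ but } \delta_0(\{0\}) = 1, \text{ so } \delta_0 \not\ll \lambda.
\end{align*}
```

A property that holds on a set of positive ν-measure in the first case is guaranteed to be visible on a set of positive volume. In the second case a property could hold "almost everywhere" with respect to δ₀ while being confined to a single point.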
2.4. The transformation f . In reading the dynamical systems or ergodic theory literature one encounters a plethora of terms describing transformations, e.g. isomorphisms,
automorphisms, endomorphisms, homeomorphisms, diffeomorphisms, etc. In some cases,
depending on the structure of the space on which the map is defined, some of these terms
may be synonyms. Here we will provide a guide for this terminology, as well as describe
what is essential for our specific needs.
First, we start very basically. Let A and B be arbitrary sets, and consider a map,
mapping, function, or transformation (these terms are often used synonymously), f :
A → B. The key defining property of a function is that for each point a ∈ A, it has only
one image under f , i.e., f (a) is a unique point in B. Now f is said to be one-to-one if any
two different points are not mapped to the same point, i.e., a ≠ a′ ⇒ f (a) ≠ f (a′), and it
is said to be onto if every point b ∈ B has a preimage in A, i.e., for any b ∈ B there is at
least one a ∈ A such that f (a) = b. These two properties of maps are important because
a necessary and sufficient condition for a map to have an inverse² f⁻¹ is that it be one-to-one and onto. There is synonymous terminology for these properties. A mapping that is
one-to-one is said to be injective (and may be referred to as an injection), a mapping that
is onto is said to be surjective (and may be referred to as a surjection), and a mapping
that is both one-to-one and onto is said to be bijective (and may be referred to as a
bijection).
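For maps between finite sets these properties can be tested directly. The sketch below is an illustration only: the helper names and the two example maps on {0, 1, 2, 3, 4} are assumptions made here, and the surjectivity test assumes that f really does map the domain into the stated codomain.

```python
def is_injective(f, domain):
    """One-to-one: distinct points of the domain have distinct images."""
    images = [f(a) for a in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """Onto: every point of the codomain has at least one preimage."""
    return {f(a) for a in domain} == set(codomain)

def inverse_of(f, domain, codomain):
    """Return the inverse as a lookup table; it exists only for a bijection."""
    if not (is_injective(f, domain) and is_surjective(f, domain, codomain)):
        raise ValueError("f has no inverse: it is not both one-to-one and onto")
    return {f(a): a for a in domain}

A = list(range(5))
B = list(range(5))

def shift(a):
    """Cyclic shift on {0, ..., 4}: a bijection."""
    return (a + 2) % 5

def square(a):
    """Squaring modulo 5: neither one-to-one nor onto."""
    return (a * a) % 5

print(is_injective(shift, A), is_surjective(shift, A, B))
print(is_injective(square, A), is_surjective(square, A, B))

shift_inverse = inverse_of(shift, A, B)
print(all(shift_inverse[shift(a)] == a for a in A))
```

Calling inverse_of(square, A, B) raises an error, reflecting the statement above that a map has an inverse exactly when it is both one-to-one and onto.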
So far we have talked about properties of the mapping alone, with no mention of the
properties of the sets A and B. In applications, additional properties are essential for
discussing basic properties such as continuity and differentiability. In turn, when we
endow A and B with the types of structure discussed in the previous section, it then
becomes natural to require the map to respect this structure, in a precise mathematical
sense. In particular, if A and B are equipped with algebraic structures, then a bijective
mapping from A to B that preserves the algebraic structures in A and B is referred to
as an isomorphism (if A = B then it is referred to as an automorphism). If A and B
are equipped with a topological structure, then a bijective mapping that preserves the
topological structure is referred to as a homeomorphism. Equivalently, a homeomorphism
is a map f that is continuous and invertible with a continuous inverse. If A and B
are equipped with a differentiable structure, then a bijective mapping that preserves the
differentiable structure is referred to as a diffeomorphism. Equivalently, a diffeomorphism
is a map that is differentiable and invertible with a differentiable inverse.
The notion of measurability of a map follows a similar line of reasoning. We equip A with
a σ-algebra $\mathcal{A}$ and B with a σ-algebra $\mathcal{A}'$. Then a map f : A → B is said to be measurable
(with respect to $\mathcal{A}$ and $\mathcal{A}'$) if $f^{-1}(A') \in \mathcal{A}$ for every $A' \in \mathcal{A}'$. In the framework of using
ergodic theory to describe fluid mixing, it is natural to consider a measure space M with
the Borel σ-algebra. It is shown in most analysis courses following the approach of measure
theory that continuous functions are measurable. Hence, a diffeomorphism f : M → M is
certainly also measurable. However, in considering properties of functions in the context
of a measure space, it is usual to disregard, to an extent, sets of zero measure. To
be more precise, many properties of interest (e.g. nonzero Lyapunov exponents, or f
being at least two times continuously differentiable) may fail on certain exceptional sets.
These exceptional sets will have zero measure, and so throughout, transformations will be
sufficiently well-behaved, including being measurable and sufficiently differentiable.
² The inverse of a function f (x) is written f⁻¹(x) and is defined by f (f⁻¹(x)) = f⁻¹(f (x)) = x.
Measure Preserving Transformations. Next we can define the notion of a measure preserving transformation.
Definition 17 (Measure Preserving Transformation). A transformation f is measure-preserving if
$\mu(f^{-1}(A)) = \mu(A)$ for all $A \in \mathcal{A}$.
This is equivalent to calling the measure µ f -invariant (or simply invariant). If the
transformation f is invertible (that is, f −1 exists), as in all the examples that we will
consider, this definition can be replaced by the more intuitive definition.
Definition 18. An invertible transformation f is measure-preserving if
$\mu(f(A)) = \mu(A)$ for all $A \in \mathcal{A}$.
For those working in applications the notation f −1 (A) may seem a bit strange when
at the same time we state that it applies in the case when f is not invertible. However, it is important to understand f −1 (A) from the point of view of its set-theoretic
meaning: literally, it is the set of points that map to A under f . This does not require
f to be invertible (and it could consist of disconnected pieces). We have said nothing
so far about whether such an invariant measure µ might exist for a given transformation f , but a standard theorem, called the Kryloff-Bogoliouboff theorem (see for example
[Katok & Hasselblatt(1995)], or [Kryloff & Bogoliouboff(1937)] for the original) guarantees that if f is continuous and M is a compact metric space then an invariant Borel
probability measure does indeed exist.
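The difference between $\mu(f^{-1}(A))$ and $\mu(f(A))$ for a non-invertible map can be checked by hand. The following Python sketch uses the doubling map f(x) = 2x (mod 1) on [0, 1) with length as the measure. This map is not one of the examples in these notes and is assumed here purely for illustration; the helper names are hypothetical.

```python
from fractions import Fraction as F

def preimage_of_interval(a, b):
    """Preimage of [a, b) under f(x) = 2x (mod 1): two intervals, each half as long."""
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]

def image_of_interval(a, b):
    """Image of [a, b) under f, assuming b <= 1/2 so that no wrap-around occurs."""
    return [(2 * a, 2 * b)]

def total_length(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)

a, b = F(0), F(1, 4)
mu_A = b - a
mu_preimage = total_length(preimage_of_interval(a, b))  # equals mu_A
mu_image = total_length(image_of_interval(a, b))        # equals 2 * mu_A
```

The preimage consists of two disconnected pieces whose lengths add up to that of A, while the image is twice as long, which is why Definition 17 is stated in terms of $f^{-1}(A)$.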
In many of the examples that we will consider the measure of interest will be the area,
i.e., the function that assigns the area to a chosen set. The fluid flow will preserve this
measure as a consequence of the flow being incompressible. Finally, we end this section
by pulling together the crucial concepts above into one definition.
Definition 19 (Measure-Preserving Dynamical System). A measure-preserving dynamical system is a quadruple $(M, \mathcal{A}, f, \mu)$ consisting of a metric space M, a σ-algebra $\mathcal{A}$ over
M, a transformation f of M into M, and an f-invariant measure µ.
3. Fundamental results for measure-preserving dynamical systems
In this section we give two classical and fundamental results for dynamical
systems which preserve an invariant measure. The ideas are a foundation of much of the
theory which follows in later chapters. We begin with a theorem about recurrence.
Theorem 1. (Poincaré Recurrence Theorem) Let $(M, \mathcal{A}, f, \mu)$ be a measure-preserving
dynamical system, and let $A \in \mathcal{A}$ be an arbitrary measurable set with µ(A) > 0. Then for
almost every x ∈ A, there exists n ∈ N such that $f^n(x) \in A$, and moreover, there exist
infinitely many k ∈ N such that $f^k(x) \in A$.
Proof: Let B be the set of points in A which never return to A,
$$B = \{x \in A \mid f^n(x) \notin A \text{ for all } n > 0\}.$$
We could also write
$$B = A \setminus \bigcup_{n=1}^{\infty} f^{-n}(A).$$
First note that since $B \subseteq A$, if x ∈ B then $f^n(x) \notin B$ for all n > 0, by the definition of B. Hence
$B \cap f^{-n}(B) = \emptyset$ for all n > 0 (if not, then applying $f^n$ contradicts the previous sentence).
We also have $f^{-k}(B) \cap f^{-(n+k)}(B) = \emptyset$ for all n > 0, k ≥ 0 (else a point in $f^{-k}(B) \cap f^{-(n+k)}(B)$
would have to map under $f^k$ into both B and $f^{-n}(B)$, and we have just seen that these
are disjoint). Therefore the sets $B, f^{-1}(B), f^{-2}(B), \ldots$ are pairwise disjoint. Moreover,
because f is measure-preserving, $\mu(B) = \mu(f^{-1}(B)) = \mu(f^{-2}(B)) = \ldots$. Now we have a
collection of infinitely many pairwise disjoint sets of equal measure in M, and since
µ(M) = 1 we must have µ(B) = 0, and so for almost every x ∈ A we have $f^n(x) \in A$ for
some n > 0. To prove that the orbit of x returns to A infinitely many times, we note that
we can simply repeat the above argument starting at the point $f^n(x) \in A$ to find $n' > n$
such that $f^{n'}(x) \in A$ for almost every x ∈ A, and continue in this fashion.
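Recurrence is easy to observe numerically. The sketch below assumes, for illustration only, the circle rotation f(x) = x + ω (mod 1) with ω the golden mean conjugate (this rotation preserves length) and the set A = [0, 0.1). The function name and parameter values are hypothetical choices.

```python
import math

OMEGA = (math.sqrt(5) - 1) / 2  # irrational rotation number (assumed for illustration)

def return_times(x0, a, b, n_max):
    """Iterates n in 1..n_max at which the orbit of x0 under x -> x + OMEGA (mod 1) lies in [a, b)."""
    times = []
    for n in range(1, n_max + 1):
        x = (x0 + n * OMEGA) % 1.0   # n-th iterate, computed directly to avoid error build-up
        if a <= x < b:
            times.append(n)
    return times

n_max = 10_000
times = return_times(x0=0.05, a=0.0, b=0.1, n_max=n_max)
first_return = times[0]                      # the orbit does come back to A
fraction_of_time_in_A = len(times) / n_max   # close to mu(A) = 0.1 for this map
```

The orbit re-enters A again and again, as the theorem predicts. That the fraction of time spent in A is close to µ(A) is a stronger statement, related to ergodicity rather than to recurrence alone.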
One of the most important results concerning measure-preserving dynamical systems
is the Birkhoff Ergodic Theorem, which tells us that for typical initial conditions, we can
compute time averages of functions along an orbit. Such functions ϕ on an orbit are known
as observables, and in practice might typically be a physical quantity to be measured,
such as concentration of a fluid. On the theoretical side, it is crucial to specify the class
of functions to which ϕ belongs. For example we might insist that ϕ be measurable,
integrable, continuous or differentiable.
We give the theorem without proof, but the reader could consult, for example, [Birkhoff(1931)],
[Katznelson & Weiss(1982)], [Katok & Hasselblatt(1995)] or [Pollicott & Yuri(1998)] for
further discussion and proofs.
Theorem 2. (Birkhoff Ergodic Theorem) Let $(M, \mathcal{A}, f, \mu)$ be a measure-preserving
dynamical system, and let $\varphi \in L^1$ (i.e., the set of functions on M such that $\int_M |\varphi| \, d\mu$ is
finite) be an observable function. Then the forward time average $\varphi^+(x)$ given by
$$\varphi^+(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) \qquad (3.0.1)$$
exists for µ-almost every x ∈ M. Moreover, the time average $\varphi^+$ satisfies
$$\int_M \varphi^+(x) \, d\mu = \int_M \varphi(x) \, d\mu.$$
This theorem can be restated for negative time to show that the backward time average
$$\varphi^-(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^{-i}(x))$$
also exists for µ-almost every x ∈ M. A simple argument reveals that forward time
averages equal backward time averages almost everywhere.
Lemma 1. Let $(M, \mathcal{A}, f, \mu)$ be a measure-preserving dynamical system, and let $\varphi \in L^1$ be
an observable function. Then
$$\varphi^+(x) = \varphi^-(x)$$
for almost every x ∈ M; that is, the functions $\varphi^+$ and $\varphi^-$ coincide almost everywhere.
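Proof: Let $A^+ = \{x \in M \mid \varphi^+(x) > \varphi^-(x)\}$. By definition $A^+$ is an invariant set, since
$\varphi^+(x) = \varphi^+(f(x))$ and $\varphi^-(x) = \varphi^-(f(x))$. Thus applying the Birkhoff Ergodic Theorem to the transformation
f restricted to the set $A^+$ we have
$$\int_{A^+} (\varphi^+(x) - \varphi^-(x)) \, d\mu = \int_{A^+} \varphi^+(x) \, d\mu - \int_{A^+} \varphi^-(x) \, d\mu = \int_{A^+} \varphi(x) \, d\mu - \int_{A^+} \varphi(x) \, d\mu = 0.$$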
Then since the integrand in the first integral is strictly positive by definition of A+ we
must have µ(A+ ) =0, and so ϕ+ (x) ≤ ϕ− (x) for almost every x ∈ M . Similarly, the same
argument applied to the set A− = {x ∈ M |ϕ− (x) > ϕ+ (x)} implies that ϕ− (x) ≤ ϕ+ (x)
for almost every x ∈ M , and so we conclude that ϕ+ (x) = ϕ− (x) for almost every x ∈ M .
The Birkhoff Ergodic Theorem tells us that forward time averages and backward time
averages exist, providing we have an invariant measure. It also says that the spatial
average of a time average of an integrable function ϕ is equal to the spatial average of ϕ.
Note that it does not say that the time average of ϕ is equal to the spatial average of ϕ.
For this to be the case, we require ergodicity.
4. Ergodicity
In this section we describe the notion of ergodicity, but first we emphasize an important
point. Since we are assuming that M is a compact metric space the measure of M is finite.
Therefore all quantities of interest can be rescaled by µ(M ). In this way, without loss
of generality, we can take µ(M ) = 1. In the ergodic theory literature, this is usually
stated from the start, and all definitions are given with this assumption. In order to
make contact with this literature, we will follow this convention. However, in order to
get meaningful estimates in applications one usually needs to take into account the size
of the domain (i.e. µ(M )). We will address this point when it is necessary.
There are many equivalent definitions of ergodicity. The basic idea is one of indecomposability. Suppose a transformation f on a space M was such that two sets of positive
measure, A and B, were invariant under f . Then we would be justified in studying f
restricted to A and B separately, as the invariance of A and B would guarantee that no
interaction between the two sets occurred. For an ergodic transformation this cannot happen — that is, M cannot be broken down into two (or more) sets of positive measure on
which the transformation may be studied separately. This need for the lack of non-trivial
invariant sets motivates the definition of ergodicity.
Definition 20 (Ergodicity). A measure preserving dynamical system
(M, A, f, µ) is ergodic if µ(A) = 0 or µ(A) = 1 for all A ∈ A such that f (A) = A.
We sometimes say that f is an ergodic transformation, or that µ is an ergodic invariant
measure.
Ergodicity is a measure-theoretic concept, and is sometimes referred to as metrical transitivity. This evokes the related idea from topological dynamics of topological transitivity
(which is sometimes referred to as topological ergodicity).
Definition 21 (Topological transitivity). A (topological) dynamical system f : X → X
is topologically transitive if for every pair of open sets U, V ⊂ X there exists an integer
n such that $f^n(U) \cap V \neq \emptyset$.
A topologically transitive dynamical system is often defined as a system such that the
forward orbit of some point is dense in M . These two definitions are in fact equivalent (for
homeomorphisms on a compact metric space), a result given by the Birkhoff Transitivity
Theorem (see for example, [Robinson(1998)]).
Another common, heuristic way to think of ergodicity is that orbits of typical initial
conditions come arbitrarily close to every point in M , i.e. typical orbits are dense in M .
“Typical” means that the only trajectories not behaving in this way form a set of measure
zero. More mathematically, this means the only invariant sets are trivial ones, consisting
of sets of either full or zero measure. However ergodicity is a stronger property than the
existence of dense orbits.
The importance of the concept of ergodicity to fluid mixing is clear. An invariant set
by definition will not ‘mix’ with any other points in the domain, so it is vital that the only
invariant sets either consist of negligibly small amounts of points, or comprise the whole
domain itself (except for negligibly small amounts of points). The standard example of
ergodicity in dynamical systems is the map consisting of rigid rotations of the circle.
Example 1. Let M = S 1 , and f (x) = x + ω (mod 1). Then if
ω is a rational number, f is not ergodic,
ω is an irrational number, f is ergodic.
A rigorous proof can be found in any book on ergodic theory, for example [Petersen(1983)].
We note here that the irrational rotation on a circle is an example of an even more special
type of system. The infinite non-repeating decimal part of ω means that x can never return
to its initial value, and so no periodic orbits are possible.
There are a number of common ways to reformulate the definition of ergodicity. Indeed,
these are often quoted as definitions. We give three of the most common here, as they
express notions of ergodicity which will be useful in later chapters. The first two are based
on the behaviour of observable functions for an ergodic system.
Definition 22 (Ergodicity — equivalent). A measure preserving dynamical system (M, A, f, µ)
is ergodic if and only if every invariant measurable (observable) function ϕ on M is constant almost everywhere.
This is simply a reformulation of the definition in functional language, and it is not hard
to see that this is equivalent to the earlier definition (see for example [Katok & Hasselblatt(1995)]
or [Brin & Stuck(2002)] for a proof). We will make use of this equivalent definition later.
Perhaps a more physically oriented notion of ergodicity comes from Boltzmann’s development of statistical mechanics and is succinctly stated as “time averages of observables
equal space averages.” In other words, the long term time average of a function (“observable”) along a single “typical” trajectory should equal the average of that function over
all possible initial conditions. We state this more precisely below.
Definition 23 (Ergodicity — equivalent). A measure preserving dynamical system
$(M, \mathcal{A}, f, \mu)$ is ergodic if and only if for all $\varphi \in L^1$ (i.e., the set of functions on M such that $\int_M |\varphi| \, d\mu$
is finite), we have
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k(x)) = \int_M \varphi \, d\mu$$
for µ-almost every x ∈ M.
This definition deserves a few moments thought. The right hand side is clearly just a
constant, the spatial average of the function ϕ. It might appear that the left hand side,
the time average of ϕ along a trajectory depends on the given trajectory. But that would
be inconsistent with the right hand side of the equation being a constant. Therefore the
time averages of typical trajectories are all equal, and are equal to the spatial average,
for an ergodic system.
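The contrast between an ergodic and a non-ergodic system can be seen with the circle rotation of Example 1. The sketch below is an illustration under assumed choices: the observable ϕ(x) = x (space average 1/2), the starting point 0.1, and the two rotation numbers are all arbitrary picks, not taken from the notes.

```python
import math

def time_average(phi, omega, x0, n):
    """Birkhoff average (1/n) * sum_{k<n} phi(f^k(x0)) for f(x) = x + omega (mod 1)."""
    return sum(phi((x0 + k * omega) % 1.0) for k in range(n)) / n

phi = lambda x: x        # observable; its space average over [0, 1) is 1/2
space_average = 0.5

# Irrational rotation (ergodic): the time average approaches the space average.
irrational = time_average(phi, (math.sqrt(5) - 1) / 2, x0=0.1, n=10_000)

# Rational rotation (not ergodic): the orbit is {0.1, 0.6}, so the time
# average is 0.35 and depends on the starting point.
rational = time_average(phi, 0.5, x0=0.1, n=10_000)
```

For the rational rotation the time average exists, as the Birkhoff Ergodic Theorem guarantees, but it differs from the space average and changes with the initial condition.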
We have yet another definition of ergodicity that will “look” very much like the definitions of mixing that we introduce in the next section.
Definition 24 (Ergodicity — equivalent). The measure preserving dynamical system f
is ergodic if and only if for all A, B ∈ A,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(f^k(A) \cap B) = \mu(A)\mu(B).$$
It can be shown (see, e.g., [Petersen(1983)]) that these definitions are all equivalent
(indeed, there are even more equivalent definitions that can be given). A variety of
equivalent definitions is useful because verifying ergodicity for a specific dynamical system
is notoriously difficult, and the form of certain definitions may make them easier to apply
for certain dynamical systems.
5. Mixing
We now discuss ergodic theory notions of mixing, and contrast them with ergodicity.
In the ergodic theory literature the term mixing is encompassed in a wide variety of
definitions that describe different strengths or degrees of mixing. Frequently the difference
between these definitions is only tangible in a theoretical framework. We give the most
important definitions for applications (thus far) below.
Definition 25 ((Strong) Mixing). A measure preserving (invertible) transformation f :
M → M is (strong) mixing if for any two measurable sets A, B ⊂ M we have:
$$\lim_{n \to \infty} \mu(f^n(A) \cap B) = \mu(A)\mu(B)$$
Again, for a non-invertible transformation we replace f n in the definition with f −n . The
word “strong” is frequently omitted from this definition, and we will follow this convention
and refer to “strong mixing” simply as “mixing”. This is the most common, and the most
intuitive definition of mixing, and we will describe the intuition behind the definition. To
do this, we will not assume that µ(M ) = 1.
Within the domain M let A denote a region of, say, black fluid and let B denote any
other region within M . Mathematically, we denote the amount of black fluid that is
contained in B after n applications of f by
$$\mu(f^n(A) \cap B),$$
that is, the volume of $f^n(A)$ that ends up in B. Then the fraction of black fluid contained
in B is given by
$$\frac{\mu(f^n(A) \cap B)}{\mu(B)}.$$
Intuitively, the definition of mixing should be that, as the number of applications of f is
increased, for any region B we would have the same proportion of black fluid in B as the
proportion of black fluid in M. That is,
$$\frac{\mu(f^n(A) \cap B)}{\mu(B)} - \frac{\mu(A)}{\mu(M)} \to 0 \quad \text{as } n \to \infty.$$
Now if we take µ(M) = 1, we have
$$\mu(f^n(A) \cap B) - \mu(A)\mu(B) \to 0 \quad \text{as } n \to \infty,$$
which is our definition of (strong) mixing. Thinking of this in a probabilistic manner, this
means that given any subdomain, upon iteration it becomes (asymptotically) independent
of any other subdomain.
Like ergodicity, measure-theoretic mixing has a counterpart in topological dynamics,
called topological mixing.
Definition 26 (Topological mixing). A (topological) dynamical system f : X → X is
topologically mixing if for every pair of open sets U, V ⊂ X there exists an integer N > 0
such that $f^n(U) \cap V \neq \emptyset$ for all n ≥ N.
Note the relationship between this definition and definition 21. For topological transitivity we simply require that for any two open sets, an integer n (which will depend on
the open sets in question) can be found such that the nth iterate of one set intersects
the other. For topological mixing we require an integer N (which may also depend on the
open sets) such that the nth iterate of one set intersects the other for every n ≥ N. Again
the measure-theoretic concept of mixing is stronger than topological mixing, so that the
following theorem holds, but not the converse.
Theorem 3. Suppose a measure-preserving dynamical system (M, A, f, µ) is mixing.
Then f is topologically mixing.
Proof: See for example [Petersen(1983)].
Definition 27 (Weak Mixing). The measure preserving transformation f : M → M is
said to be weak mixing if for any two measurable sets A, B ⊂ M we have:
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \left| \mu(f^k(A) \cap B) - \mu(A)\mu(B) \right| = 0$$
It is easy to see that weak mixing implies ergodic (take A such that f (A) = A, and
take B = A. Then µ(A) − µ(A)2 = 0 implies µ(A) = 0 or 1). The converse is not true,
for example an irrational rotation is not weak mixing.
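The irrational rotation of Example 1 makes the gap between ergodicity and weak mixing visible. In the sketch below the sets A = B = [0, 1/4), the rotation number, and the helper `arc_overlap` are assumptions made for illustration; the overlap $\mu(f^k(A) \cap B)$ is computed exactly as the length of the intersection of two arcs.

```python
import math

def arc_overlap(shift, a, b):
    """Length of ([0, a) + shift) ∩ [0, b) on the circle R/Z, for 0 < a, b <= 1."""
    s = shift % 1.0
    total = 0.0
    for wrap in (-1.0, 0.0):                 # the shifted arc may wrap around once
        lo, hi = s + wrap, s + wrap + a
        total += max(0.0, min(hi, b) - max(lo, 0.0))
    return total

omega = (math.sqrt(5) - 1) / 2
a = b = 0.25
n = 5000
terms = [arc_overlap(k * omega, a, b) for k in range(n)]

cesaro_mean = sum(terms) / n                            # ergodicity: tends to a * b = 0.0625
mean_abs_dev = sum(abs(t - a * b) for t in terms) / n   # weak mixing would need this to tend to 0
```

The averaged overlap settles near µ(A)µ(B), but the averaged absolute deviation stays bounded away from zero: the arc keeps rotating rigidly and never spreads out.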
Mixing is a stronger notion than weak mixing, and indeed mixing implies weak mixing.
Although the converse is not true, it is difficult to find an example of a weak mixing
system which is not mixing (such an example is constructed in [Parry(1981)]). This
makes it unlikely to see weak mixing which is not mixing in applications.
Figure 10: A sketch illustrating the principle of mixing. Under iteration of f, the set
A spreads out over M, until the proportion of A found in any set B is the same as the
proportion of A in M.
Definition 28 (Light Mixing). The measure preserving transformation f : M → M is
said to be light mixing if for any two measurable sets A, B ⊂ M we have:
$$\liminf_{n \to \infty} \mu(f^n(A) \cap B) > 0$$
Definition 29 (Partially Mixing). The measure preserving transformation f : M → M
is said to be partially mixing if for any two measurable sets A, B ⊂ M and for some
0 < α < 1 we have:
$$\liminf_{n \to \infty} \mu(f^n(A) \cap B) \geq \alpha \mu(A)\mu(B)$$
Definition 30 (Order n-mixing). The measure preserving transformation f : M → M is
said to be order n-mixing if for any measurable sets A1 , A2 , . . . An ⊂ M we have:
$$\lim_{\substack{m_i \to \infty \\ |m_i - m_j| \to \infty}} \mu(f^{m_1}(A_1) \cap f^{m_2}(A_2) \cap \cdots \cap f^{m_n}(A_n)) = \mu(A_1)\mu(A_2) \cdots \mu(A_n)$$
Example 2. Example of mixing: the Baker’s map
Let f : [0, 1] × [0, 1] → [0, 1] × [0, 1] be a map of the unit square given by
$$f(x, y) = \begin{cases} (2x, \; y/2) & 0 \leq x < 1/2 \\ (2x - 1, \; (y + 1)/2) & 1/2 \leq x \leq 1 \end{cases} \qquad (5.0.2)$$
Figure 11: The action of the Baker’s map of example 2. The unit square is stretched by a
factor of two in the x direction while contracted by a factor of one half in the y direction.
To return the image to the unit square the right hand half is cut off and placed on top of the
left hand half.
Note that this definition guarantees that both x and y remain in the interval [0, 1]. This
map can be found defined in slightly different ways, using (mod 1) to guarantee this. The
action of this map is illustrated in figure 11. The name of the transformation comes from
the fact that its action can be likened to the process of kneading bread when baking. At each
iterate, the x variable is stretched by a factor of two, while the y variable is contracted by
a factor of one half. This ensures that the map is area-preserving. Since the stretching in
the x direction leaves the unit square, the right hand half of the image is cut off and placed
on top of the left half. (This is similar to the Smale horseshoe discussed in the following
chapter, except that there the map is kept in the unit square by folding rather than by
cutting.)
Theorem 4. The Baker’s map of example 2 is mixing.
Proof:
This is simple to see diagrammatically. Figure 12 shows two sets A and B in the unit
square. For simplicity we have chosen A to be the bottom half of the unit square, so that
µ(A) = 1/2, while B is some arbitrary rectangle. The following five diagrams in figure
12 show the image of A under 5 iterations of the Baker’s map. It is clear that after n
iterates the image of A consists of $2^n$ strips of width $1/2^{n+1}$. It is then easy to see that
$$\lim_{n \to \infty} \mu(f^n(A) \cap B) = \frac{\mu(B)}{2} = \mu(A)\mu(B).$$
A similar argument suffices for any two sets A and B, and so the map is mixing.
Example 3. The Kenics Mixer
The Kenics mixer is a micromixer based on the action of the Baker’s map.
Typically the same argument cannot be made, since chaotic dynamics goes hand in hand
with a huge increase in geometrical complexity. A naive numerical scheme might be
attempted, based on the definitions given above, which rely on computing $\mu(f^k(A) \cap B)$.
Several problems are immediately evident. For example, which pair of sets A and B should
we take? Also, numerical problems are encountered due to the massive expansion in one
direction and massive contraction in another. (For example, in the Baker’s map, two
points initially a distance ε apart in the y-direction are a distance $\varepsilon/2^n$ apart
after n iterations. Since $10^{-16} \approx 2^{-50}$, using double precision one would expect to have lost all
accuracy after about 50 iterations.)
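Both the naive scheme and its precision problem can be tried on the Baker's map of equation (5.0.2). The following is a minimal sketch, not a recommended method: the Monte Carlo sampling, the choice of rectangle B, the sample size and the seed are all assumptions made for illustration. The second part tracks the x-coordinate, where the doubling discards one binary digit per iterate.

```python
import random
from fractions import Fraction

def baker(x, y):
    """One iterate of the Baker's map of equation (5.0.2)."""
    if x < 0.5:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def estimate_overlap(n_iter, box, n_samples=20_000, seed=0):
    """Monte Carlo estimate of mu(f^n(A) ∩ B), A = bottom half, B = box (x0, x1, y0, y1)."""
    rng = random.Random(seed)
    x0, x1, y0, y1 = box
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), 0.5 * rng.random()   # uniform sample point in A
        for _ in range(n_iter):
            x, y = baker(x, y)
        hits += (x0 <= x <= x1) and (y0 <= y <= y1)
    return 0.5 * hits / n_samples                  # rescale by mu(A) = 1/2

box = (0.2, 0.7, 0.1, 0.4)                         # mu(B) = 0.15
estimate = estimate_overlap(10, box)               # compare with mu(A) mu(B) = 0.075

# Precision loss: a double is a finite binary fraction, so repeated doubling
# of x eventually shifts every bit out and the float orbit collapses onto 0.
x_float, x_exact = 0.1, Fraction(1, 10)
for _ in range(60):
    x_float = baker(x_float, 0.0)[0]
    x_exact = baker(x_exact, Fraction(0))[0]
```

After ten iterates the estimate is already close to µ(A)µ(B). After sixty iterates the floating-point x-coordinate is exactly zero, while exact rational arithmetic shows the true orbit is still cycling, so long runs in double precision cannot be trusted.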
Figure 12: Successive iterates of the Baker’s map on the set A.
Figure 13: The Kenics mixer (picture taken from REFERENCE)
6. Transfer Operators
Instead of looking at the behaviour of individual sets, we can look at the evolution of densities
or some other observable. To do so, we define transfer operators. For more details, see
for example [Lasota & Mackey(1994)] or [Choe(2005)].
Figure 14: Flow patterns for the Kenics mixer (picture taken from REFERENCE)
Definition 31. Let $(M, \mathcal{A}, \mu)$ be a measure space. Any linear operator $P : L^1 \to L^1$ (i.e.,
an operator on integrable functions ϕ) satisfying
• $P\varphi \geq 0$ for $\varphi \geq 0$, $\varphi \in L^1$
• $\|P\varphi\| = \|\varphi\|$ for $\varphi \geq 0$, $\varphi \in L^1$
is called a Markov operator.
A particular type of Markov operator is the Frobenius-Perron operator, which is used
for studying the evolution of densities.
Definition 32. Let $(M, \mathcal{A}, \mu)$ be a measure space. A Markov operator $P : L^1 \to L^1$
satisfying
$$\int_A P\varphi(x) \, d\mu = \int_{f^{-1}(A)} \varphi(x) \, d\mu$$
for $A \in \mathcal{A}$ is called a Frobenius-Perron operator.
The Frobenius-Perron operator has the following additional properties:
(1) P is linear ($P(\lambda_1 \varphi_1 + \lambda_2 \varphi_2) = \lambda_1 P\varphi_1 + \lambda_2 P\varphi_2$)
(2) $\int_M P\varphi(x) \, d\mu = \int_M \varphi(x) \, d\mu$
(3) If $P_n$ is the Frobenius-Perron operator for $f^n$ then $P_n = P^n$ (i.e., $\int_A P^n \varphi(x) \, d\mu = \int_{f^{-n}(A)} \varphi(x) \, d\mu$)
Another particular type of operator is the Koopman operator U : L∞ → L∞ (i.e., an
operator on essentially bounded functions), defined by
U ϕ(x) = ϕ(f (x))
So the Koopman operator is concerned with the evolution of observable functions. It has
the properties:
(1) U is linear
(2) $\|U\varphi\|_{L^\infty} \leq \|\varphi\|_{L^\infty}$ (i.e., U is a contraction)
(3) $\langle P\varphi, \psi \rangle = \langle \varphi, U\psi \rangle$ (i.e., U is adjoint to the Frobenius-Perron operator)
(Here $\langle a, b \rangle$ signifies the inner product given by $\int_M ab \, d\mu$.)
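The adjoint relation in property (3) can be verified exactly in a finite setting. The sketch below is a discrete analogue, assumed for illustration: it restricts the Cat Map of section 7.2 to an N × N lattice of rational points, where the map is a permutation, and uses the uniform counting measure. The lattice size and function names are hypothetical choices.

```python
import random

N = 7  # lattice size (assumed for illustration)

def cat(i, j):
    """Cat map (7.2.1) restricted to the lattice {0,...,N-1}^2; a permutation of the lattice."""
    return (i + j) % N, (i + 2 * j) % N

points = [(i, j) for i in range(N) for j in range(N)]

def koopman(phi):
    """(U phi)(x) = phi(f(x))."""
    return {p: phi[cat(*p)] for p in points}

def frobenius_perron(phi):
    """For an invertible measure-preserving map, (P phi)(f(x)) = phi(x)."""
    out = {}
    for p in points:
        out[cat(*p)] = phi[p]
    return out

def inner(a, b):
    """Inner product with respect to the counting measure on the lattice."""
    return sum(a[p] * b[p] for p in points)

rng = random.Random(1)
phi = {p: rng.randint(-5, 5) for p in points}
psi = {p: rng.randint(-5, 5) for p in points}
lhs = inner(frobenius_perron(phi), psi)   # <P phi, psi>
rhs = inner(phi, koopman(psi))            # <phi, U psi>
```

Because the arithmetic is in integers, the two inner products agree exactly: P pushes values forward along the map while U pulls observables back.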
6.1. Connection with mixing.
Proposition 1. [Lasota & Mackey(1994)] Let $(M, \mathcal{A}, f, \mu)$ be a measure-preserving dynamical system, with P the Frobenius-Perron operator corresponding to f. Then
(1) If f is measure-preserving then the constant density ϕ(x) = 1 is a fixed point of P
(i.e., ϕ(x) = 1 is a stationary density, P1 = 1).
(2) If f is ergodic then ϕ(x) = 1 is the unique stationary density.
(3) If f is mixing then ϕ(x) = 1 is unique and ‘stable’.
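The ‘stability’ in item (3) means that other densities are driven towards the constant density under iteration of P. A minimal sketch follows, assuming for illustration the doubling map f(x) = 2x (mod 1), which is mixing with respect to length but is not one of the systems studied in these notes. For this map Definition 32 gives $P\varphi(x) = \frac{1}{2}[\varphi(x/2) + \varphi((x+1)/2)]$, since $f^{-1}$ of an interval consists of two intervals of half the length. The initial density and the evaluation grid are arbitrary choices.

```python
def frobenius_perron(phi):
    """P for f(x) = 2x (mod 1): (P phi)(x) = [phi(x/2) + phi((x+1)/2)] / 2."""
    return lambda x: 0.5 * (phi(x / 2) + phi((x + 1) / 2))

density = lambda x: 2 * x                       # initial density on [0, 1), integrates to 1
grid = [(k + 0.5) / 100 for k in range(100)]    # evaluation points

deviations = []                                 # max |P^n density - 1| over the grid
for n in range(11):
    deviations.append(max(abs(density(x) - 1.0) for x in grid))
    density = frobenius_perron(density)
```

For this particular starting density the deviation from the constant density halves at every application of P, so `deviations` decays geometrically.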
Proposition 2. [Lasota & Mackey(1994)] Let (M, A, f, µ) be a measure-preserving dynamical system, with U the Koopman operator corresponding to f . Then (for ϕ ∈ L1 and
ψ ∈ L∞ )
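• f is ergodic if and only if
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \langle \varphi, U^k \psi \rangle = \langle \varphi, 1 \rangle \langle 1, \psi \rangle$$
• f is weak mixing if and only if
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \left| \langle \varphi, U^k \psi \rangle - \langle \varphi, 1 \rangle \langle 1, \psi \rangle \right| = 0$$
• f is mixing if and only if
$$\lim_{n \to \infty} \langle \varphi, U^n \psi \rangle = \langle \varphi, 1 \rangle \langle 1, \psi \rangle$$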
7. Example systems
Throughout the following sections we will define a number of mixing indices which
quantify in some way how well a system is mixed. We will refer frequently to each of the
following systems.
7.1. Baker’s Map. The definition of the Baker’s Map has been given in example 2. The
principle behind it, that of stretching, cutting and stacking, has been recognised as a
fundamental mechanism of fluid mixing for many years. Figure 15 shows sketches taken
from [Danckwerts(1953a)] and [Spencer & Wiley(1951)] illustrating this point.
Figure 15: Sketches of Baker’s Map dynamics from [Danckwerts(1953b)] and
[Spencer & Wiley(1951)] illustrating the fundamental action of the map.
7.2. Cat Map. The Cat Map is an archetypal example (perhaps the archetypal example) of a two-dimensional area-preserving linear diffeomorphism with the demonstrable
properties of ergodicity, mixing, the Bernoulli property, exponential decay of correlations,
existence of Markov partitions, uniform hyperbolicity, and just about every other property
connected with chaotic dynamics. It is given by the toral automorphism:
$$f(x, y) = (x + y, \; x + 2y) \pmod 1 \qquad (7.2.1)$$
The dynamics of this map is shown in figure 16.
Figure 16: Dynamics of the Arnold Cat Map
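Some basic properties of the Cat Map can be checked directly from equation (7.2.1). The sketch below is illustrative only: it reads off the matrix of the linear map, computes its determinant and eigenvalues in closed form, and uses exact rational arithmetic to find the period of one rational point. The point (1/5, 2/5) and the function names are assumptions chosen for the example.

```python
import math
from fractions import Fraction

def cat_map(x, y):
    """Arnold cat map (7.2.1) on the unit torus."""
    return (x + y) % 1, (x + 2 * y) % 1

# The map is linear with matrix [[1, 1], [1, 2]].
det = 1 * 2 - 1 * 1                                           # = 1, so area is preserved
trace = 1 + 2
lam_unstable = (trace + math.sqrt(trace**2 - 4 * det)) / 2    # (3 + sqrt(5)) / 2 > 1
lam_stable = (trace - math.sqrt(trace**2 - 4 * det)) / 2      # (3 - sqrt(5)) / 2 < 1

def period(x, y, max_iter=10_000):
    """Period of a rational point, computed exactly with Fractions (None if not found)."""
    p = (x, y)
    for n in range(1, max_iter + 1):
        p = cat_map(*p)
        if p == (x, y):
            return n
    return None

p_fifths = period(Fraction(1, 5), Fraction(2, 5))
```

One eigenvalue exceeds 1 and the other is its reciprocal, which is the stretching and contraction behind the map's hyperbolicity. Points with rational coordinates are periodic, so they form part of the measure-zero set of atypical orbits.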
7.3. Standard Map. The standard map is a nonlinear perturbation of a shear map,
given by
$$f(x, y) = (x + y + K \sin(2\pi x), \; y + K \sin(2\pi x)) \pmod 1 \qquad (7.3.1)$$
where K is a parameter. When K = 0 the map is a simple shear, and increasing K
increases the complexity in the dynamics. The dynamics of the standard map is illustrated
in figure 17.
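That the standard map (7.3.1) preserves area for every K can be checked numerically. The sketch below estimates the Jacobian determinant by central finite differences on the map before reduction mod 1. The step size, the sample points, the value K = 0.3 and the function names are assumptions made for illustration.

```python
import math

def standard_map_lift(x, y, K):
    """Standard map (7.3.1) before reduction mod 1, so that it can be differentiated."""
    s = K * math.sin(2 * math.pi * x)
    return x + y + s, y + s

def standard_map(x, y, K):
    """Standard map (7.3.1) on the unit torus."""
    xn, yn = standard_map_lift(x, y, K)
    return xn % 1.0, yn % 1.0

def jacobian_det(x, y, K, h=1e-6):
    """Central finite-difference estimate of the Jacobian determinant at (x, y)."""
    fx_p, fx_m = standard_map_lift(x + h, y, K), standard_map_lift(x - h, y, K)
    fy_p, fy_m = standard_map_lift(x, y + h, K), standard_map_lift(x, y - h, K)
    dxdx, dydx = (fx_p[0] - fx_m[0]) / (2 * h), (fx_p[1] - fx_m[1]) / (2 * h)
    dxdy, dydy = (fy_p[0] - fy_m[0]) / (2 * h), (fy_p[1] - fy_m[1]) / (2 * h)
    return dxdx * dydy - dxdy * dydx

dets = [jacobian_det(0.1 * i, 0.07 * j, K=0.3) for i in range(10) for j in range(10)]
shear = standard_map(0.25, 0.5, K=0.0)   # with K = 0 the map is the shear (x + y, y)
```

The determinant is 1 at every sample point up to discretization error, in agreement with the analytic calculation in which the two terms involving K cancel.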
Figure 17: Dynamics of the standard map
8. Segregation
In 1953 P.V. Danckwerts published the seminal paper “The definition and measurement
of some characteristics of mixtures” [Danckwerts(1953a)]. From his point of view, any
quantitative description of mixing must (to be of practical value)
(1) be related as closely as possible to the properties of the qualitative mixture
(2) be convenient to measure
(3) be widely applicable without modification
(4) not be dependent on arbitrary tests
An underlying assumption of such a description is that ultimate particles (that is,
the smallest particles capable of independent movement) are sufficiently small, compared
to the size of any portion of mixture analyzed. If this is the case, then the idea of
concentration at a point is meaningful. Moreover, in a “perfectly mixed” mixture, any
samples will have the same composition (cf the definition of ergodic).
Danckwerts’ main idea is that there are two largely independent mechanisms in a mixing
process, and so two complementary measurements will be required to quantify a mixture.
The two processes are:
(1) liquids are broken into “clumps” and intermingled (cf. the Baker’s map, figure 11). The average
size of unmixed clumps will continue to decrease. Often such clumps form long
streaks (striations).
(2) molecular diffusion, reducing a mixture of mutually soluble liquids to uniformity.
This is frequently slow unless some clumping has occurred.
For liquids these independent processes have distinguishable results. The first reduces the
size of clumps, while the second obliterates differences in concentration at the boundaries
between clumps. This indicates two measures of the degree of mixing: scale of segregation
and intensity of segregation.
In the following we assume two liquids, A and B, occupying a region M of normalized
volume (µ(M ) = 1), have concentrations a = a(p) and b = b(p) at any point p. The
mean concentrations are labelled ā and b̄. Thus ā + b̄ = 1, and a(p) + b(p) = 1 for any
p. Typically we will consider ā = b̄ = 1/2, but for a (partially) mixed state we would
typically expect $a(p) \neq b(p) \neq 1/2$.
8.1. Scale of segregation. Choose two points $p_1$ and $p_2$ a distance r apart. Measure the
concentrations $a_1 = a(p_1)$ and $a_2 = a(p_2)$ at both, and calculate the deviations of these
from the mean, $(a_1 - \bar{a})$ and $(a_2 - \bar{a})$. For a good mixture we would like these deviations
to be small. Compute the average of the product of these over a large number of pairs
of points, $\overline{(a_1 - \bar{a})(a_2 - \bar{a})}$, and finally divide by the mean square deviation $\overline{(a - \bar{a})^2}$ to
normalize. This defines the coefficient of correlation,
$$R(r) = \frac{\overline{(a_1 - \bar{a})(a_2 - \bar{a})}}{\overline{(a - \bar{a})^2}} \qquad (8.1.1)$$
If we choose r = 0, so that a1 = a2 , we have
R(0) ≡ 1,
that is, at length scale zero, the mixture is completely segregated. In general, values of
R(r) near 1 mean that a large quantity of A at a point is associated with a large amount of
A a distance r away. If R(r) = 0, we have a random relationship between concentrations
at length scale r. It is possible that R(r) is negative, especially if the mixture has some
regular pattern.
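A negative correlation coefficient is easy to produce with a regular pattern. The sketch below evaluates (8.1.1) on a one-dimensional periodic concentration field, with r measured in grid cells and the average taken over all pairs of cells a distance r apart. The one-dimensional setting, the stripe width and the function name are simplifying assumptions.

```python
def correlation(a, r):
    """R(r) of (8.1.1) for a periodic 1D concentration field a, with r in grid cells."""
    n = len(a)
    mean = sum(a) / n
    var = sum((v - mean) ** 2 for v in a) / n          # mean square deviation
    cov = sum((a[i] - mean) * (a[(i + r) % n] - mean) for i in range(n)) / n
    return cov / var

stripes = ([1.0] * 4 + [0.0] * 4) * 8     # regular stripes, each 4 cells wide
R = [correlation(stripes, r) for r in range(9)]
```

Here R(0) = 1, R falls to zero at half a stripe width, and R = −1 at a separation of one full stripe width, where a point in fluid A is always paired with a point in fluid B.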
Suppose we have the simple pattern shown in figure 11...
In a situation with no underlying pattern to the flow structure, we would expect to see
a monotonically decreasing correlogram for R(r). To extract a measure of mixing from
the correlogram we have the linear scale of segregation:
$$S = \int_0^\infty R(r) \, dr = \int_0^{\hat{r}} R(r) \, dr$$
which gives the average size of clumps, and the volume scale of segregation:
$$V = 2\pi \int_0^\infty r^2 R(r) \, dr = 2\pi \int_0^{\hat{r}} r^2 R(r) \, dr$$
which gives the volume of clumps.
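The linear scale of segregation can be estimated from a sampled correlogram by numerical quadrature. In the sketch below the upper limit $\hat{r}$ is assumed to be the first zero of R(r); the notes do not define it, so this is an assumption made for the example, as are the one-dimensional periodic field, the trapezoidal rule and the function names.

```python
def correlogram(a, r_max):
    """R(r) for r = 0..r_max (in grid cells) for a periodic 1D concentration field a."""
    n = len(a)
    mean = sum(a) / n
    var = sum((v - mean) ** 2 for v in a) / n
    return [sum((a[i] - mean) * (a[(i + r) % n] - mean) for i in range(n)) / (n * var)
            for r in range(r_max + 1)]

def linear_scale(R, dr=1.0):
    """Trapezoidal estimate of S, integrating up to the first zero of R (an ASSUMED r-hat)."""
    S = 0.0
    for k in range(1, len(R)):
        if R[k - 1] <= 0.0:
            break
        S += 0.5 * (R[k - 1] + max(R[k], 0.0)) * dr
    return S

coarse = ([1.0] * 8 + [0.0] * 8) * 4      # stripes 8 cells wide
fine = ([1.0] * 4 + [0.0] * 4) * 8        # stripes 4 cells wide
S_coarse = linear_scale(correlogram(coarse, 16))
S_fine = linear_scale(correlogram(fine, 16))
```

Halving the stripe width halves the estimate of S, consistent with the interpretation of S as a measure of clump size.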
Figure 18 shows an example from [Danckwerts(1953a)]. This illustrates the idea that
if clumps of black fluid are distributed amongst white fluid, S indicates the size of clumps
of black fluid. Figure 19 shows the effect of elongation of the clumps on the shape of the
correlogram and the value of S.
In figure 20 we show correlograms for the standard map (equation (7.3.1)) on the left.
This illustrates decreasing S with increasing iterations. On the right is the correlogram
for the third iterate of the Baker’s map (see figure 11), showing the effect of a pattern in
the mixture.
8.1.1. Measurement of scale of segregation. A technique for computing the correlation
function R(r) is detailed in [Tucker(1991)]...
Figure 18: Figures from [Danckwerts(1953a)] showing a sample mixture and its correlogram. Here S/d = 0.42, where d is the diameter of the circles.
Figure 19: Figures from [Danckwerts(1953a)] showing the effect of elongating clumps.
Here S/d = 1.1 where the strips are of width d and length 10d.
Figure 20: Correlogram (R(r) versus r) for the iterates of the standard map shown in
figure 17.
8.2. Intensity of segregation. The second mixing index of [Danckwerts(1953a)] is intensity of segregation. This is defined as
$$I = \frac{\sigma_a^2}{\bar{a}\bar{b}} \qquad (8.2.1)$$
Recall that $\sigma_a^2$ is simply the variance of the concentration a, so the intensity of segregation can be thought of as a rescaled variance. It measures how much the composition
at each point differs from the average composition of the mixture. It is the ratio of the
variance of the concentration at a given state to the variance $\bar{a}\bar{b}$ of the completely segregated state. It is
a second moment.
Similarly to the scale of segregation, we have I = 1 when segregation is complete (the
concentration at every point is either black or white), and I = 0 when the concentration
is uniform. I reflects the extent to which the concentrations in the clumps depart from
the mean, and not the relative amount of A and B, or the size of the clumps.
Figure 21 (adapted from [Tucker(1991)]) illustrates that scale of segregation and intensity of segregation are independent measurements.
Figure 21: Mixtures differing in scale of segregation and in intensity of segregation, the two axes of the figure [Tucker(1991)].
8.2.1. Measurement of intensity of segregation. To measure I in practice, we are likely to
use a grid-based method. Dividing the domain into N boxes of size δ we have
\[ I = \frac{\sum_{i=1}^{N} (a_i - \bar{a})^2}{N \bar{a}(1 - \bar{a})} \]
where $a_i$ represents the volume ratio in box i. A value of I = 0 implies the mixture
is completely mixed at the length scale δ. Note that I depends on the choice of δ, and in
particular as δ → 0, I → 1.
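The dependence of I on the box size δ can be seen in a small numerical experiment. The following is a minimal sketch, assuming a concentration field stored as a 2-D array with values in [0, 1] and a mean strictly between 0 and 1; the helper names and the checkerboard test field are hypothetical.

```python
import numpy as np

def coarse_grain(field, box):
    """Average a 2-D concentration field over non-overlapping box x box cells."""
    n0, n1 = field.shape
    assert n0 % box == 0 and n1 % box == 0, "box must divide the grid size"
    return field.reshape(n0 // box, box, n1 // box, box).mean(axis=(1, 3))

def intensity_of_segregation(a):
    """Grid estimate I = sum_i (a_i - abar)^2 / (N abar (1 - abar)).

    Requires 0 < abar < 1, otherwise the denominator vanishes.
    """
    a = np.asarray(a, dtype=float).ravel()
    abar = a.mean()
    return np.sum((a - abar) ** 2) / (a.size * abar * (1.0 - abar))

# Hypothetical test field: an 8 x 8 checkerboard of black (1) and white (0).
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)

I_fine = intensity_of_segregation(coarse_grain(checker, 1))    # boxes = single squares
I_coarse = intensity_of_segregation(coarse_grain(checker, 2))  # boxes = 2 x 2 squares
print(I_fine, I_coarse)
```

With boxes the size of a single square every box is pure black or pure white and I = 1; with 2 × 2 boxes every box has concentration 1/2 and I = 0, although the underlying pattern is unchanged.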
Danckwerts derives in [Danckwerts(1953a)] a relationship between intensity and scale
of segregation:
\[ -\frac{1}{I}\frac{dI}{dt} \propto -\left.\frac{d^2 R}{dr^2}\right|_{r=0} \]
9. Transport matrices
Transport matrices can be used to investigate mixing efficiency over a finite time (rather
than in the infinite-time limit). The procedure is as follows. Divide the domain into a (large) number of
regions/cells. A transport matrix (or distribution matrix) characterizes how much of the fluid in
each cell is transported to each other cell.
In the following, we assume an N × N grid on a unit square domain, giving $N^2$ cells
of equal area $a = 1/N^2$. Denote by $a_{ij}$ the amount of fluid in cell i which is in cell j
after the mixing procedure, for $1 \le i, j \le N^2$. Define an $N^2 \times N^2$ matrix P with (i, j)th
component $p_{ij} = a_{ij}/a$. This normalization gives $p_{ij} \in [0, 1]$ for all i, j.
By area-preservation/incompressibility we have
\[ \sum_{j=1}^{N^2} p_{ij} = 1 \tag{9.0.2} \]
\[ \sum_{i=1}^{N^2} p_{ij} = 1 \tag{9.0.3} \]
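To make the construction concrete, here is a minimal sketch of how a transport matrix might be estimated numerically by following sample points from each cell. It assumes the common form of the Baker's map, (x, y) → (2x, y/2) for x < 1/2 and (2x − 1, (y + 1)/2) otherwise, which may differ in detail from the version used elsewhere in these notes. The function names, the cell numbering (row by row) and the number m of sample points per cell side are all hypothetical choices.

```python
import numpy as np

def bakers_map(x, y):
    """Assumed form of the Baker's map on the unit square."""
    left = x < 0.5
    return np.where(left, 2 * x, 2 * x - 1), np.where(left, y / 2, (y + 1) / 2)

def cell_index(x, y, N):
    """Index of the grid cell containing (x, y), numbered row by row."""
    ix = np.minimum((x * N).astype(int), N - 1)
    iy = np.minimum((y * N).astype(int), N - 1)
    return iy * N + ix

def transport_matrix(f, N, m=10):
    """Estimate p_ij, the fraction of cell i carried into cell j by the map f.

    Each cell is sampled on an m x m lattice of points, so the result is
    only an approximation for a general map.
    """
    s = (np.arange(N * m) + 0.5) / (N * m)        # midpoints of sub-cells
    x, y = np.meshgrid(s, s)
    x, y = x.ravel(), y.ravel()
    src = cell_index(x, y, N)
    dst = cell_index(*f(x, y), N)
    P = np.zeros((N * N, N * N))
    np.add.at(P, (src, dst), 1.0)                 # count points moving i -> j
    return P / (m * m)

P = transport_matrix(bakers_map, N=4)
print(P.sum(axis=1))   # row sums, cf. (9.0.2)
print(P.sum(axis=0))   # column sums, cf. (9.0.3)
```

Row sums equal 1 by construction, since every sample point lands in some cell. Column sums equal 1 only to the extent that the sampled map preserves area; for this piecewise-linear map on a 4 × 4 grid the lattice sampling happens to be exact.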
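Let $r_{A_i}^{(n)}$ be the ratio of fluid A in the ith cell at iteration n. Then setting
\[ q_i^{(n)} = \frac{r_{A_i}^{(n)}}{N^2 \bar{r}_A}, \]
where $\bar{r}_A$ is the mean ratio of A over all cells, $q_i^{(n)}$ is normalized so that
\[ \sum_{i=1}^{N^2} q_i^{(n)} = 1. \]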
Thus
\[ q^{(n)} = (q_1^{(n)}, \ldots, q_{N^2}^{(n)}) \]
gives the normalized distribution of A at iterate n. Let $q^{(0)}$ be the initial distribution, so
that
\[ q^{(n)} = q^{(0)} P^n \]
(This is effectively the statement that we have a Markov process.) If P is invertible then
we can write
\[ q^{(0)} = q^{(n)} P^{-n} \]
so that we can find the initial distribution required to achieve any given final (after n
iterates) distribution. Suppose we wish to achieve ‘perfect mixing’, which in this sense is
a uniform distribution of A at the length scale of the grid:
\[ q^{(n)} = \frac{1}{N^2}(1, 1, \ldots, 1). \]
Then $q_j^{(0)} = 1/N^2$ for every j, since property 9.0.3 is preserved under matrix inversion. This implies
that to achieve perfect mixing, we must have started in a perfectly mixed state. Transport
matrices which produce perfect mixing in this sense from a less than perfectly mixed initial
state are non-invertible.
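This argument can be checked numerically on a toy example. The sketch below is illustrative only: the 4-cell matrix P (a mixture of the identity and a cyclic shift) and the "perfect mixer" U are hypothetical matrices chosen because one is invertible and doubly stochastic while the other is singular.

```python
import numpy as np

m = 4                                    # number of cells (hypothetical example)
shift = np.roll(np.eye(m), 1, axis=1)    # cyclic permutation matrix
P = 0.7 * np.eye(m) + 0.3 * shift        # rows and columns sum to 1; invertible
u = np.full(m, 1.0 / m)                  # uniform ("perfectly mixed") distribution

# q(0) = q(n) P^{-n} with q(n) uniform and n = 5: the result is again uniform.
P_inv = np.linalg.inv(P)
q0 = u @ np.linalg.matrix_power(P_inv, 5)

# A matrix that maps every distribution to the uniform one has rank 1.
U = np.full((m, m), 1.0 / m)
q_any = np.array([1.0, 0.0, 0.0, 0.0])   # all of A initially in one cell
print(q0, q_any @ U, np.linalg.matrix_rank(U))
```

The invertible P can only return the uniform distribution when traced backwards, whereas U reaches the uniform state from a fully segregated start in one step and is singular.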
Example 4. Baker’s map...
The idea of using such a matrix to study and quantify mixing comes originally (I
think) from [Spencer & Wiley(1951)]. In this paper they illustrate the idea using the
map shown on the right of figure 7.1. This is a transformation very like the Baker’s map.
The dynamics of the six cells yields the transport matrix
\[ P = \frac{1}{4}\begin{pmatrix}
1 & 1 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 1
\end{pmatrix} \]
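Iterating the transformation equates to taking successive powers of P , which are:
\[ P^2 = \frac{1}{16}\begin{pmatrix}
3 & 3 & 2 & 3 & 3 & 2 \\
3 & 3 & 2 & 3 & 3 & 2 \\
3 & 2 & 3 & 3 & 2 & 3 \\
3 & 2 & 3 & 3 & 2 & 3 \\
2 & 3 & 3 & 2 & 3 & 3 \\
2 & 3 & 3 & 2 & 3 & 3
\end{pmatrix}, \]
\[ P^3 = \frac{1}{64}\begin{pmatrix}
11 & 11 & 10 & 11 & 11 & 10 \\
11 & 11 & 10 & 11 & 11 & 10 \\
11 & 10 & 11 & 11 & 10 & 11 \\
11 & 10 & 11 & 11 & 10 & 11 \\
10 & 11 & 11 & 10 & 11 & 11 \\
10 & 11 & 11 & 10 & 11 & 11
\end{pmatrix}, \]
\[ P^4 = \frac{1}{256}\begin{pmatrix}
43 & 43 & 42 & 43 & 43 & 42 \\
43 & 43 & 42 & 43 & 43 & 42 \\
43 & 42 & 43 & 43 & 42 & 43 \\
43 & 42 & 43 & 43 & 42 & 43 \\
42 & 43 & 43 & 42 & 43 & 43 \\
42 & 43 & 43 & 42 & 43 & 43
\end{pmatrix}. \]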
It is clear that this mixing operation quickly distributes the material uniformly. This
is quantified in [Spencer & Wiley(1951)] by considering the standard deviation of the
entries in each row of the matrices. For comparison, [Spencer & Wiley(1951)] also consider
an irregular transport matrix P̂ , which mixes material less uniformly than the previous
operation:
\[ \hat{P} = \frac{1}{6}\begin{pmatrix}
0 & 1 & 2 & 3 \\
1 & 2 & 0 & 3 \\
5 & 0 & 1 & 0 \\
0 & 3 & 3 & 0
\end{pmatrix} \]
with successive powers:
\[ \hat{P}^2 = \frac{1}{36}\begin{pmatrix}
11 & 11 & 11 & 3 \\
2 & 14 & 11 & 9 \\
5 & 5 & 11 & 15 \\
18 & 6 & 3 & 9
\end{pmatrix}, \]
\[ \hat{P}^3 = \frac{1}{216}\begin{pmatrix}
66 & 42 & 42 & 66 \\
69 & 57 & 42 & 48 \\
60 & 60 & 66 & 30 \\
21 & 57 & 66 & 72
\end{pmatrix}, \]
\[ \hat{P}^4 = \frac{1}{1296}\begin{pmatrix}
252 & 348 & 372 & 324 \\
267 & 327 & 324 & 378 \\
390 & 270 & 276 & 360 \\
387 & 351 & 324 & 234
\end{pmatrix}. \]
The standard deviations of the entries in each row are:
Matrix    Row    Standard deviation
P         Any    0.1179
P²        Any    0.0295
P³        Any    0.0074
P⁴        Any    0.0018
P̂         1      0.2357
P̂         2      0.2357
P̂         3      0.3727
P̂         4      0.2887
P̂²        1      0.0962
P̂²        2      0.1227
P̂²        3      0.1179
P̂²        4      0.1559
P̂³        1      0.0556
P̂³        2      0.0471
P̂³        3      0.0651
P̂³        4      0.0916
P̂⁴        1      0.0346
P̂⁴        2      0.0296
P̂⁴        3      0.0394
P̂⁴        4      0.0437
The standard deviations tabulated above indicate the rate at which the uniform mixing
operation P reaches its uniformly distributed state. For the irregular mixing operation
P̂ , the different standard deviations for different rows of the powers of the transport
matrix indicate where best to place an initial sample to be distributed. For example,
if four iterates of the mixing operation are used, cell 2, corresponding to the smallest
deviation from uniformity, should be used. These examples are of course simple, and even
in 1951 [Spencer & Wiley(1951)] recognize that “in practice, rather large and unwieldy
matrices would be encountered. Fortunately, the necessary operations of inversion, matrix
multiplication etc. can be carried out quite readily using punched card methods”. Fifty
years later, manipulation of very large matrices is a much more straightforward matter.
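The row standard deviations are easily recomputed today. The following minimal sketch does so for the two matrices above, assuming that the population standard deviation (NumPy's default, ddof=0) is the intended measure. Under that assumption it reproduces the tabulated values for P and for P̂² and P̂³; it is not guaranteed to match every entry of the original table. The helper name row_std is hypothetical.

```python
import numpy as np

# Transport matrices as given in the text.
P = np.array([[1, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 0],
              [1, 0, 1, 1, 0, 1],
              [1, 0, 1, 1, 0, 1],
              [0, 1, 1, 0, 1, 1],
              [0, 1, 1, 0, 1, 1]]) / 4.0
P_hat = np.array([[0, 1, 2, 3],
                  [1, 2, 0, 3],
                  [5, 0, 1, 0],
                  [0, 3, 3, 0]]) / 6.0

def row_std(M, n):
    """Population standard deviation of the entries in each row of M**n."""
    return np.linalg.matrix_power(M, n).std(axis=1)

for n in range(1, 5):
    print(n, np.round(row_std(P, n), 4), np.round(row_std(P_hat, n), 4))

# Cell (1-based) whose row of P_hat**4 is closest to uniform.
best_cell = int(np.argmin(row_std(P_hat, 4))) + 1
print(best_cell)
```

The last line selects the starting cell with the smallest row deviation after four iterates, which is the placement criterion described above.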
9.1. Perron-Frobenius Theory.
Theorem 5. Let matrix Q be m × m, non-negative ($q_{ij} \ge 0$) and irreducible. Then
(1) Q has a real eigenvalue r such that $r \ge |\lambda_i|$ for any other eigenvalue $\lambda_i$ (dominant
eigenvalue/spectral radius).
(2) the eigenvector for r exists and has entries ≥ 0.
(3) $\min_i \sum_j q_{ij} \le r \le \max_i \sum_j q_{ij}$.
We can apply this to the transport matrix P (assuming it is irreducible). Since 9.0.2
and 9.0.3 hold (rows and columns sum to 1), part (3) gives r = 1, so we have $\lambda_1 = 1$ and $|\lambda_i| \le 1$ for $i = 2, \ldots, N^2$.
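As a quick numerical check of these statements, the sketch below computes the spectrum of the irregular 4 × 4 matrix P̂ of [Spencer & Wiley(1951)] quoted earlier. It is only an illustration on one small example; the variable names are hypothetical.

```python
import numpy as np

P_hat = np.array([[0, 1, 2, 3],
                  [1, 2, 0, 3],
                  [5, 0, 1, 0],
                  [0, 3, 3, 0]]) / 6.0

# Moduli of the eigenvalues, largest first.
moduli = np.sort(np.abs(np.linalg.eigvals(P_hat)))[::-1]

# Part (3): the dominant eigenvalue lies between the extreme row sums.
row_sums = P_hat.sum(axis=1)

# Part (2): the vector of ones is a non-negative eigenvector for r = 1,
# and (by the column sums) the uniform row vector is a left eigenvector.
ones = np.ones(4)
u = np.full(4, 0.25)

print(moduli, row_sums.min(), row_sums.max())
print(P_hat @ ones, u @ P_hat)
```

Here all row sums equal 1, so the bounds in part (3) pin the dominant eigenvalue to exactly 1, and the remaining eigenvalues have modulus strictly less than 1.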
9.1.1. Irreducibility.
Definition 33. The m × m matrix Q is irreducible if you cannot permute rows and
columns to achieve block upper triangular form. That is, ∄ J such that
\[ J^T Q J = \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix} \]
where J is a permutation matrix (all zeros except one 1 entry in each row and each
column) and $A_{11}$ and $A_{22}$ are square.
This is a negative definition, and in a sense rather unhelpful (testing all possible permutations of a large matrix is likely to be laborious). A natural question then is how to
tell whether a given matrix is irreducible.
Theorem 6. Q is a non-negative irreducible matrix if $(I_m + Q)^{m-1} > 0$.
(Note here that the m in the power is the same as the dimension of Q.) $I_m$ is the m × m
identity matrix. A further relevant definition is:
Definition 34. The m × m matrix Q is primitive if and only if $Q^k > 0$ for some k. (That
is, $(Q^k)_{ij} > 0$ for some k and each i, j ∈ [1, m].)
The properties of irreducibility and primitivity are related by:
Q is primitive =⇒ Q is irreducible
Q is irreducible with positive trace =⇒ Q is primitive
Theorem 7. If Q is primitive then r > |λi |.
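Both properties can be tested mechanically. The following is a minimal sketch: irreducibility is tested with the criterion of theorem 6, and primitivity by checking a single power of Q. The choice of exponent m² − 2m + 2 relies on Wielandt's bound for primitive matrices, a standard result that is not stated in these notes, so it should be read as an assumption of the sketch. Only the zero/non-zero pattern of the matrices is tracked, to avoid overflow or underflow in high powers. All function names are hypothetical.

```python
import numpy as np

def pattern_power(Q, k):
    """Zero/non-zero pattern of Q**k for a non-negative square matrix Q."""
    B = (np.asarray(Q) > 0).astype(int)
    R = np.eye(B.shape[0], dtype=int)
    for _ in range(k):
        R = ((R @ B) > 0).astype(int)
    return R

def is_irreducible(Q):
    """Criterion of theorem 6: (I_m + Q)^(m-1) > 0."""
    Q = np.asarray(Q, dtype=float)
    m = Q.shape[0]
    return bool(pattern_power(Q + np.eye(m), m - 1).all())

def is_primitive(Q):
    """Q^k > 0 with k = m^2 - 2m + 2 (Wielandt's bound, assumed here)."""
    m = np.asarray(Q).shape[0]
    return bool(pattern_power(Q, m * m - 2 * m + 2).all())

cyclic = np.roll(np.eye(3), 1, axis=1)        # permutation: irreducible, not primitive
triangular = np.array([[1.0, 1.0],
                       [0.0, 1.0]])           # already block upper triangular
P_hat = np.array([[0, 1, 2, 3],
                  [1, 2, 0, 3],
                  [5, 0, 1, 0],
                  [0, 3, 3, 0]]) / 6.0        # irregular matrix from the text

for name, Q in [("cyclic", cyclic), ("triangular", triangular), ("P_hat", P_hat)]:
    print(name, is_irreducible(Q), is_primitive(Q))
```

The cyclic permutation illustrates that irreducibility does not imply primitivity: its trace is zero and its powers are again permutation matrices, so no power is strictly positive.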
9.2. Eigenvalues of the transport matrix. Suppose P is diagonalizable. Then we can
write
\[ \Lambda = B^{-1} P B \]
where B is a non-singular matrix and Λ is a diagonal matrix with diagonal entries $\Lambda_{ii} = \lambda_i$,
$i = 1, \ldots, N^2$. We can arrange these in descending order so that $|\lambda_i| \ge |\lambda_j|$ for i < j.
Recall that theorem 5 gives $\lambda_1 = 1$. Writing $B = (b_{ij})$ and $B^{-1} = (\hat{b}_{ij})$ we have
\[ P^n = B \Lambda^n B^{-1} = \lambda_1^n H_1 + \lambda_2^n H_2 + \cdots + \lambda_{N^2}^n H_{N^2} \]
where $(H_k)_{ij} = b_{ik}\hat{b}_{kj}$, $1 \le i, j, k \le N^2$ (i.e., the matrices $H_k$ can be computed from the
eigenvectors of P ). It is easy to see (and in fact follows as a consequence of theorem 5)
that
\[ H_1 = \frac{1}{N^2}(1)_{N^2} \]
(i.e., $H_1$ is a matrix consisting entirely of entries $1/N^2$). If $|\lambda_2| < 1$ (which is certainly
the case if P is a primitive matrix, by theorem 7), then
\[ \lim_{n\to\infty} P^n = H_1. \]
If so, then the infinite time distribution is given by
\[ q^{(\infty)} = \lim_{n\to\infty} q^{(n)} = \lim_{n\to\infty} q^{(0)} P^n = q^{(0)} H_1 = \frac{1}{N^2}(1, 1, \ldots, 1) \]
by the normalization of $q^{(0)}$ (its entries sum to 1). Hence we achieve, eventually, the ideal mixed state at the length scale of the
grid independently of the initial condition. Note that this situation corresponds to
intensity of segregation I = 0.
We consider as a means of quantification the speed of approach of $P^n$ to $H_1$. That is,
the values of $|\lambda_i|$ for i ≥ 2 give an index of mixing efficiency.
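The role of the second eigenvalue can be illustrated on the six-cell matrix P of [Spencer & Wiley(1951)] quoted in section 9. The sketch below is a minimal numerical check, not a general method: it compares the second-largest eigenvalue modulus with the observed rate at which the largest entry of $P^n - H_1$ shrinks. Here the matrix has 6 cells, so $H_1$ has all entries 1/6; the variable names are hypothetical.

```python
import numpy as np

P = np.array([[1, 1, 0, 1, 1, 0],
              [1, 1, 0, 1, 1, 0],
              [1, 0, 1, 1, 0, 1],
              [1, 0, 1, 1, 0, 1],
              [0, 1, 1, 0, 1, 1],
              [0, 1, 1, 0, 1, 1]]) / 4.0
m = P.shape[0]
H1 = np.full((m, m), 1.0 / m)            # limiting matrix, all entries 1/m

# Eigenvalue moduli in descending order; lam[0] should be 1.
lam = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
lam2 = lam[1]

# Largest deviation of P**n from H1 for n = 1..5, and successive ratios.
errs = [np.abs(np.linalg.matrix_power(P, n) - H1).max() for n in range(1, 6)]
ratios = [errs[n + 1] / errs[n] for n in range(len(errs) - 1)]

print(lam2, ratios)
```

For this matrix each iterate reduces the deviation from uniformity by the factor $|\lambda_2|$ = 1/4, which is the same factor of four seen between successive row standard deviations of P, P², P³ and P⁴.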
References
[Birkhoff(1931)] Birkhoff, G. D. (1931). Proof of the ergodic theorem. Proceedings of the National Academy of
Sciences USA, 17, 656–660.
[Brin & Stuck(2002)] Brin, M. & Stuck, G. (2002). Introduction to Dynamical Systems. Cambridge University Press.
[Choe(2005)] Choe, G. H. (2005). Computational Ergodic Theory, volume 13 of Algorithms and Computation in Mathematics. Springer.
[Danckwerts(1953a)] Danckwerts, P. V. (1953a). The definition and measurement of some characteristics
of mixtures. Applied Scientific Research, A3, 279–296.
[Danckwerts(1953b)] Danckwerts, P. V. (1953b). Theory of mixtures and mixing. Research, 6, 355–361.
[Galaktionov et al.(2003)] Galaktionov, O. S., Anderson, R., Peters, G., & Meijer, H. (2003). Analysis
and optimization of Kenics static mixers. International Polymer Processing, 18, 138–150.
[Katok & Hasselblatt(1995)] Katok, A. & Hasselblatt, B. (1995). Introduction to the Modern Theory of
Dynamical Systems. Cambridge University Press, Cambridge.
[Katznelson & Weiss(1982)] Katznelson, Y. & Weiss, B. (1982). A simple proof of some ergodic theorems.
Israel J. Math., 42, 291–296.
[Kryloff & Bogoliouboff(1937)] Kryloff, N. & Bogoliouboff, N. (1937). La théorie générale de la mesure
dans son application à l’étude des systèmes dynamiques de la mécanique non linéaire. Ann. Math.,
38(1), 65–113.
[Lasota & Mackey(1994)] Lasota, A. & Mackey, M. C. (1994). Chaos, Fractals, Noise: Stochastic Aspects
of Dynamics, volume 97 of Applied Mathematical Sciences. Springer-Verlag, second edition.
[Ottino(1989a)] Ottino, J. M. (1989a). The Kinematics of Mixing: Stretching, Chaos, and Transport.
Cambridge University Press, Cambridge, England. Reprinted 2004.
[Ottino(1989b)] Ottino, J. M. (1989b). The mixing of fluids. Scientific American, 260, 56–67.
[Ottino et al.(1994)] Ottino, J. M., Jana, S. C., & Chakravarthy, V. S. (1994). From Reynolds’s stretching
and folding to mixing studies using horseshoe maps. Physics of Fluids, 6(2), 685–699.
[Parry(1981)] Parry, W. (1981). Topics in ergodic theory. Cambridge University Press, Cambridge.
[Petersen(1983)] Petersen, K. (1983). Ergodic Theory. Cambridge University Press, Cambridge.
[Pollicott & Yuri(1998)] Pollicott, M. & Yuri, M. (1998). Dynamical Systems and Ergodic Theory. Volume
40 of London Mathematical Society student texts. Cambridge University Press, Cambridge.
[Robinson(1998)] Robinson, C. (1998). Dynamical Systems: Stability, Symbolic Dynamics, and Chaos.
CRC Press.
[Smale(1967)] Smale, S. (1967). Differentiable dynamical systems. Bull. Amer. Math. Soc., 73, 747–817.
[Spencer & Wiley(1951)] Spencer, R. S. & Wiley, R. M. (1951). The mixing of very viscous liquids. J.
Colloid. Sci., 6, 133–145.
[Sturman et al.(2006)] Sturman, R., Ottino, J. M., & Wiggins, S. (2006). The Mathematical Foundations
of Mixing, volume 22 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge
University Press.
[Tucker(1991)] Tucker, C. L. (1991). Principles of mixing measurement. In Mixing in Polymer Processing,
volume 23 of Plastics Engineering, pages 101–127. Marcel Dekker Inc.
[Yi et al.(2002)] Yi, M., Qian, S., & Bau, H. H. (2002). A magnetohydrodynamic chaotic stirrer. Journal
of Fluid Mechanics, 468, 153–177.