- Table of Contents
- Chapter 1: Introduction To and Survey of Parameter Estimation
- Chapter 2: Probability
- Chapter 3: Introduction To Statistics
- Chapter 4: Parameter Estimation Methods
- Chapter 5: Introduction To Linear Estimation
- Chapter 6: Matrix Analysis For Linear Parameter Estimation
- Chapter 7: Minimization of Sum of Squares Functions For Models Nonlinear In Parameters
- Chapter 8: Design of Optimal Experiments
- Appendices
- Index

CHAPTER 1

INTRODUCTION TO AND SURVEY OF PARAMETER ESTIMATION

1.1 INTRODUCTION
One of the fundamental tasks of engineering and science, and indeed of
mankind in general, is the extraction of information from data. Parameter
estimation is a discipline that provides tools for the efficient use of data in
the estimation of constants appearing in mathematical models and for
aiding in modeling of phenomena.
The models may be in the form of algebraic, differential, or integral
equations and their associated initial and boundary conditions. An estimated parameter may or may not have a direct physical significance.
Parameter estimation can also be visualized as a study of inverse
problems. In the solution of partial differential equations one classically
seeks a solution in a domain knowing the boundary and initial conditions
and any constants. In the inverse problem not all these constants would be
known. Instead discrete measurements of the dependent variable inside the
domain must be used to estimate values for these constants, also called
parameters.
Parameter estimation is needed in the modern world for the solution of
the many diverse problems related to the space program, investigation of
the atom, and modeling of the economy. Examples and applications in this
book, however, are directed to estimation problems occurring in engineering and science in which partial differential equations as well as ordinary
differential and algebraic equations are used to model the phenomena.
Fortunately, simultaneous with the development of increased need of parameter estimation, computers have been built that make parameter estimation practicable for a great array of applications. It should be noted that both digital computational and data acquisition facilities are practical necessities in parameter estimation. Both these facilities have been readily available only since the late 1950s or early 1960s, whereas estimation was first extensively discussed by Legendre in 1806 [1] and Gauss in 1809 [2]. In Gauss's classic paper he claimed usage of the method of least squares (still used in parameter estimation) as early as 1795 in connection with the orbit determination of minor planets. For this reason Gauss is recognized as being the first to use this important tool of parameter estimation.
The name " parameter e stimation" is n ot universally used. O ther terms
are nonlinear least squares, nonlinear estimation, nonlinear regression,·
a nd identification, although the latter sometimes is given a quite different
meaning. Estimation is a statistical term a nd i dentification is a n electrical
engineering term .
1.1.1 Parameters, Properties, and States
A mathematical model of a dynamic process usually involves ordinary or partial differential equations. Sometimes the solution of these equations is a relatively simple set of algebraic equations. In any case, there are dependent and independent variables and also certain constants. The dependent variables are sometimes called state variables (or signals). The constants may be parameters.

In experiments the states are frequently measured directly, but the parameters are not. Approximate values of the parameters are inferred from measurements of the states. Since only approximate parameter values are found, the parameters are said to be estimated. This book is primarily concerned with estimating parameters. Other books concentrate on providing "best" predictions of states based on knowledge of independent variables. The two problems are quite similar. When parameters are estimated, state estimates are usually found simultaneously.
A parameter having a physical significance for a solid or fluid might also be termed a property. Examples of parameters that might also be properties are characteristics of materials such as density, specific heat, viscosity, thermal conductivity, electrical conductivity, Young's modulus, emittance, and electrical capacitance. A property frequently involves the concept of per unit length, area, or volume. Examples of quantities that are parameters but not properties are weight or mass of a given specimen, electrical resistance of a section of wire, and drag experienced by a truck moving at a constant velocity.

* Parameter estimation is not necessarily nonlinear as implied by some of these terms.
The concepts of parameters, properties, and states are illustrated by the following examples.
Example 1.1.1

Newton's second law states for a system that Fₓ(t) = m aₓ(t), where Fₓ(t) is the force in the x direction, aₓ(t) is the acceleration in the x direction, and m is mass. Force and acceleration, which can be functions of time t, can be considered to be states, whereas mass is a parameter. Force and acceleration are both often easily measured; usually mass is easily measured separately also, but there can be cases when the mass must be inferred, such as when determining the mass of comets and planets. If a body is homogeneous, the mass is the product of its density and volume. The density is a property. If the volume were known and measurements of Fₓ and aₓ were available, the parameter to be estimated would be the density, which also happens to be a property of the material.
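As a sketch of how such a mass would actually be inferred, suppose simultaneous noisy measurements of the two states, force and acceleration, are available. Minimizing the sum of squares Σ(Fᵢ − m aᵢ)² over m gives m̂ = ΣFᵢaᵢ / Σaᵢ². The data below are hypothetical, not from the text.

```python
# Hypothetical illustration of Example 1.1.1: infer the parameter m in
# F_x(t) = m * a_x(t) from simultaneous noisy measurements of the two
# states.  Minimizing S = sum (F_i - m*a_i)^2 over m gives the least
# squares estimate m_hat = sum(F_i * a_i) / sum(a_i^2).
a = [1.0, 2.0, 3.0, 4.0]    # measured accelerations, m/s^2 (assumed data)
F = [2.1, 3.9, 6.2, 7.8]    # measured forces, N (assumed data)

m_hat = sum(Fi * ai for Fi, ai in zip(F, a)) / sum(ai * ai for ai in a)
print(m_hat)    # close to 2 kg for these data
```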
Example 1.1.2

Ohm's law states that E(t) = R I(t), where E(t) is voltage, I(t) is current, and R is electric resistance. Voltage and current are states, whereas resistance is a parameter. If the resistance of a wire of known length and diameter is being determined, one could instead estimate the electric resistivity of that type of wire, a parameter which is also a property.
Example 1.1.3

An object is thrown vertically above the earth with an initial velocity of V₀. From the solution of the appropriate differential equation, the distance s of the object above the earth is described by s = V₀t − gt²/2, where g is the acceleration of gravity. Here s would be the state and g the parameter. The independent variable is time t. V₀ could be a parameter or a state. This illustrates that the parameter and state estimation problems sometimes overlap.
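Because s = V₀t − gt²/2 is linear in both V₀ and g, estimating the two together is a small linear least squares problem. The sketch below uses synthetic, noise-free data (assumed values V₀ = 20 m/s, g = 9.8 m/s²), which the normal equations recover exactly.

```python
# Example 1.1.3 as a two-parameter linear estimation problem: the model
# s = V0*t - g*t**2/2 is linear in V0 and g, so least squares reduces to
# 2x2 normal equations.  Data are synthetic and noise-free (assumed).
V0_true, g_true = 20.0, 9.8
t = [0.5, 1.0, 1.5, 2.0, 2.5]
s = [V0_true*ti - g_true*ti**2/2 for ti in t]   # "measurements"

# model columns: ds/dV0 = t and ds/dg = -t^2/2
A11 = sum(ti*ti for ti in t)
A12 = sum(ti*(-ti*ti/2) for ti in t)
A22 = sum((ti*ti/2)**2 for ti in t)
b1 = sum(si*ti for si, ti in zip(s, t))
b2 = sum(si*(-ti*ti/2) for si, ti in zip(s, t))

det = A11*A22 - A12*A12          # Cramer's rule for the 2x2 system
V0_hat = (b1*A22 - A12*b2) / det
g_hat = (A11*b2 - A12*b1) / det
print(V0_hat, g_hat)
```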
Above, the term parameter was applied to what might be termed "physical" parameters in contrast with statistical parameters. Examples of statistical parameters are variances and correlation coefficients of the measurement errors. Both types of parameters may have to be estimated in some problems; in others only the physical parameters need be found.
1.1.2 Purpose of This Chapter

The purpose of this chapter is to survey some of the basic problems and concepts of parameter estimation covered in this book. By understanding
some of the ideas in a simple form, the reader can better comprehend the
detailed treatment in later chapters.
Much of parameter estimation can be related to five optimization
problems. The first problem is the choice of the best function to extremize.
The most common function chosen for minimizing is the sum of squares of
deviations. This yields the method of least squares discussed in Section 1.3.
The second optimization problem is the minimization of the chosen function, which is also discussed in Section 1.3. These first two optimization problems are usually what is meant by the parameter estimation problem, which is discussed further in Section 1.2.3. The third optimization problem involves optimal design of experiments to obtain the "best" parameter estimates. This is discussed in Section 1.2.4. If several competing mathematical models are known, but the true model is unknown, the (fourth) problem is the optimal design of experiments to discriminate between the models; see Section 1.2.5. (By a "known" model we mean that the mathematical structure of the equation is known even though the values of certain parameters may be unknown.) The fifth and most difficult problem is the determination of mathematical models when there is so little information that a complete, finite list of competing models is not known (Section 1.2.6).
In addition to the basic optimization problems in parameter estimation there are a number of basic concepts. One of these relates to what we call sensitivity coefficients. A sensitivity coefficient is formed by taking the first derivative of a dependent variable (i.e., a state variable) with respect to a parameter. This is discussed further in Section 1.4. Sensitivity coefficients are important because they can give information regarding linearity and identifiability. Linearity is concerned with the dependent variable(s) being linear or nonlinear in the parameters (Section 1.4). There are cases where unique solutions for some parameters do not exist; this relates to identifiability, which is discussed in Sections 1.3 and 1.4.
Another concept, parsimony, long a principle by which choices among alternative explanations of physical phenomena have been made, has recently been emphasized by Box [4] in its application to parameter estimation. Box asks that a model have a minimum number of parameters consistent with the physical basis, if there is one.
1.1.3 Related Research

A number of individuals and groups have contributed to the research in the past decade. Some of the statisticians and engineers who have made significant contributions to parameter estimation are G. E. P. Box, N. Draper, J. S. Hunter, M. J. Box, W. G. Hunter, Y. Bard, J. H. Seinfeld, and L. Lapidus. Some books on parameter estimation with a statistical emphasis are given in references 3 through 6.

Another group that has made a large contribution to estimation is the control and systems group of electrical engineers. Most of their work relates to state estimation rather than parameter estimation, which they call identification. Some books on state estimation are by Sage and Melsa [7], Sage [8], Deutsch [9], and Bryson and Ho [10]. Books by Sage and Melsa [11], Graupe [12], and Mendel [13] discuss identification. This group usually is concerned with estimating states and parameters in sets of ordinary differential equations.

Another group is composed of econometricians, that is, economists with a strong interest in statistics. One reference is Kmenta [14]. Econometricians have generally concerned themselves with models which can be approximated adequately by systems of linear algebraic equations.

Other valuable works, not identified with any of the above groups, are references 15-17.
1.1.4 Relation to Analytical Design Theory
Parameter estimation is related to analytical design theory. The concepts
of choosing the best cost function and minimizing it are common to both.
Because of the similarity, there may be some technology transfer from
parameter estimation theory to design theory.
In addition to the design aspects related to cost functions mentioned above, parameter estimation is also concerned with the design of "best" experiments. This involves various ideas including criteria, constraints, and sensitivity. Furthermore, modern design is utilizing statistics to a greater extent than formerly to describe tolerances, life of structures, and so on.
1.2 FUNDAMENTAL PROBLEMS

A number of distinctions between various estimation problems should be understood. Some of these distinctions can be confusing because similar, but not identical, problems are encountered in control engineering.
1.2.1 Deterministic or Classical Problem

In the classical problem one mathematically models a system in a certain domain and seeks to calculate the dependent variable(s) in the domain for a known model and initial and boundary conditions. There is no uncertainty in any of these. Not only is the structure of the model known (e.g., an ordinary differential equation of known order) but all the relevant parameters or properties are known. This is illustrated by Fig. 1.1, which might be visualized as the heating of a billet with q(t) being the heat input and η(t) the temperature of the billet. The model might be

β₁ dη/dt + β₀η = q(t)   (1.2.1)

η(0) = B   (1.2.2)

where (1.2.1) is the differential equation containing the input q(t) and (1.2.2) is the initial condition. All the parameters, β₀, β₁, B, and q(t), are known. The objective is to calculate the state η(t) for t > 0, in other words, to solve the differential equation.

[Figure 1.1 Classical problem of known input and system. The problem is to calculate the output state.]

Of all the problems listed here the classical problem is the one engineers are most often trained to solve. It is not the subject of this book.

1.2.2 State Estimation Problem

In the state estimation problem, the state is estimated using measurements of the input and the state. See Fig. 1.2. This problem is similar to the classical one in that η(t) is needed and the model is known. There are extra complications, however, in that the observed input contains the noise w(t) and that measurements are only available for the output state η corrupted with the noise ε(t). ("Noise" means nonsystematic measurement errors.) Using the preceding example one could still write (1.2.1) but q(t) would not be precisely known and neither would B in (1.2.2). In the solution of this problem the statistics of w(t) and ε(t) are usually assumed to be known. One seeks a "best" or optimal estimate η̂(t) of the true system state η(t). It is in connection with this problem that the term "filter" is used [7-12].

1.2.3 Parameter Estimation Problem

In the parameter estimation problem the structure of the differential equation is known; measurements of the input q(t) as well as the initial condition(s) [B in (1.2.2)] or boundary conditions are available. Some or all of the parameters may be unknown. The problem is to obtain the "best" or optimal estimate of these parameters using the measured values of input and output.

Because measurements invariably contain errors, solution of parameter estimation problems utilizes concepts of probability and statistics. The requisite probability background is reviewed in Chapter 2 and the bases of the statistical methods are given in Chapter 3. The reader who has an adequate background in probability and statistics can omit these two chapters.

This estimation problem is illustrated by Fig. 1.3. The "unknown" system is modeled by a differential equation containing unknown parameters. This problem also involves state estimation because η(t) is unknown and is usually estimated at the same time as the parameters.

The investigation of the parameter estimation problem is a primary objective of this text. The emphasis is on parameter estimation techniques that are appropriate for analysis of dynamic experiments. Methods useful for estimation involving linear and nonlinear partial differential equations are given particular attention.
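The classical problem of (1.2.1) and (1.2.2) can be sketched numerically. Assuming a constant input q(t) = q, the exact state is η(t) = q/β₀ + (B − q/β₀)e^(−β₀t/β₁), and a simple forward-Euler integration of the differential equation reproduces it; all numerical values below are illustrative assumptions.

```python
import math

# Minimal sketch of the classical problem (1.2.1)-(1.2.2): all parameters
# are known and the only task is to compute the state.  For constant
# input q the exact solution is
#   eta(t) = q/beta0 + (B - q/beta0) * exp(-(beta0/beta1)*t),
# which forward-Euler integration of beta1*deta/dt + beta0*eta = q
# should reproduce to within the discretization error.
beta0, beta1, B, q = 2.0, 1.0, 0.0, 4.0   # illustrative values

def eta_exact(t):
    return q/beta0 + (B - q/beta0) * math.exp(-(beta0/beta1) * t)

dt, t, eta = 1.0e-4, 0.0, B
while t < 1.0 - 1e-12:
    eta += dt * (q - beta0 * eta) / beta1   # deta/dt = (q - beta0*eta)/beta1
    t += dt

print(abs(eta - eta_exact(1.0)))   # small discretization error
```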
1.2.4 Optimum Experiment Problem

The optimum experiment problem can be illustrated using Fig. 1.3. An objective is to adjust any inputs such as q(t) or boundary and initial conditions so as to minimize the effect of errors on estimated values of the parameters. In other words the output η(t) would be made as "sensitive" as possible to the parameters. Adjusting q(t) means the selection of the time variation to accomplish the objective. Another objective would be to find the best location for sensors and the best duration for taking measurements. There are certain realistic constraints that must be included when seeking these optimums, such as the maximum allowable experiment duration and maximum temperature rise. Optimum experiments are discussed mainly in Chapter 8.

[Figure 1.2 State estimation problem. The problem is to estimate η(t).]

[Figure 1.3 Parameter estimation problem. The problem is to estimate certain unknown parameters in the model of the system.]
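A minimal sketch of the sensitivity idea behind optimum experiments, using the single-parameter cooling model T(t) = T∞ + (T₀ − T∞)e^(−βt) of Section 1.3.2: the sensitivity ∂T/∂β = −t(T₀ − T∞)e^(−βt) is largest in magnitude at t = 1/β, so a measurement near that time is most informative about β. The parameter values are assumed for illustration.

```python
import math

# Sketch of the optimum-experiment idea for the cooling model
# T(t) = Tinf + (T0 - Tinf)*exp(-beta*t): the sensitivity
# dT/dbeta = -t*(T0 - Tinf)*exp(-beta*t) peaks in magnitude at
# t = 1/beta, suggesting when a measurement is most useful.
beta, T0, Tinf = 0.1, 300.0, 100.0   # illustrative values

def abs_sensitivity(t):
    return abs(-t * (T0 - Tinf) * math.exp(-beta * t))

# crude grid search for the most sensitive measurement time
times = [i * 0.01 for i in range(1, 5000)]
t_best = max(times, key=abs_sensitivity)
print(t_best)   # near 1/beta = 10
```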
1.2.5 Discrimination Problem
In the discrimination problem there are two or more possible candidates
for the model, one of which is the true model. The objective is to design
experiments that will enable one to decide upon the correct model. There
are some similarities with the optimum experiment problem. This is also
discussed in Chapter 8.
1.2.6 Identification Problem

The identification problem is similar to the parameter estimation problem in that there may be unknown parameters in the model. The problem is much more complex because the structure of the model (e.g., differential equation) is unknown. Developing models is sometimes called model building, which is discussed in various connections in Chapters 5 to 8.

1.3 SIMPLE EXAMPLES

Typical problems are outlined in this section for estimation problems involving algebraic, ordinary differential, and partial differential equations. These examples are given to introduce the student to a number of parameter estimation concepts that are amplified in subsequent chapters.

1.3.1 Linear Algebraic Model

Suppose that a number of distinct experiments have been performed for a given material at different temperatures, T, and that at each T a value of the thermal conductivity k has been determined. Hence there are a number of data sets (Y₁, T₁), (Y₂, T₂), …, where Yᵢ is a measured value of k and Tᵢ is the temperature in the ith experiment. The data are shown in Fig. 1.4.

[Figure 1.4 Simulated data for thermal conductivity vs. temperature.]

A model must now be proposed for k versus T. If there is any applicable physical law relating k and T it should be used. For this example none is known, but the data of Fig. 1.4 suggest a linear relation in T,

k = β₀ + β₁T   (1.3.1)

where β₀ and β₁ are unknown parameters. The measurements Yᵢ are related to k(Tᵢ), abbreviated kᵢ, by

Yᵢ = kᵢ + εᵢ,   i = 1, 2, …, n   (1.3.2a)

where εᵢ is an unknown error. For n measurements there are n equations with two parameters and n unknown errors,

Y₁ = β₀ + β₁T₁ + ε₁
Y₂ = β₀ + β₁T₂ + ε₂
⋮
Yₙ = β₀ + β₁Tₙ + εₙ   (1.3.2b)

If n = 1, both β₀ and β₁ cannot be estimated. If n = 2, estimates of β₀ and β₁ can be obtained from (1.3.2b) by neglecting ε₁ and ε₂ and solving the two equations to obtain

β̂₁ = (Y₂ − Y₁)/(T₂ − T₁),   β̂₀ = Y₁ − β̂₁T₁   (1.3.3)

where the "hat" on β̂₀ and β̂₁ indicates estimate. The k̂ curve passes through both experimental points. Note that T₁ and T₂ cannot be the same temperature.
For n > 2, a straight line (i.e., a linear curve) cannot simultaneously pass through all the points shown in Fig. 1.4. One can, however, imagine a number of strategies to place the line. For example, one could draw a line by eye through the data. After measuring the intercept and slope, β₀ and β₁ would be estimated. This has a number of advantages including simplicity and a visual check of the "fit." Moreover, all of us have had experience with this method. There are some severe shortcomings, however, including the lack of reproducibility: different observers draw the line rather differently. Equally important is the disadvantage that the method does not lend itself to direct extension to more complex cases.

Other relatively straightforward methods, such as the method of sequential differences, are discussed by Rabinowicz [18].
The well-known method of least squares can be utilized to meet the objections noted above. The sum of squares of the errors,

S = Σᵢ₌₁ⁿ εᵢ² = Σ(Yᵢ − kᵢ)² = Σ(Yᵢ − β₀ − β₁Tᵢ)²   (1.3.4)

is minimized with respect to the parameters β₀ and β₁. The sum of squares function S must be equal to or greater than zero simply because it is the sum of n terms, each of which is a square. S can be zero if and only if every measurement Yᵢ is on the line.

We can expand the S expression given by (1.3.4) to get (omitting, as we sometimes do, the explicit designation of limits)

S = ΣYᵢ² − 2β₀ΣYᵢ − 2β₁ΣTᵢYᵢ + nβ₀² + 2β₀β₁ΣTᵢ + β₁²ΣTᵢ²   (1.3.5)

showing that in the three-dimensional coordinate system S is an elliptical paraboloid with one minimum. See Fig. 1.5. Note that (1.3.5) is of second degree in β₀ and β₁. Differentiate S with respect to β₀ and β₁ to obtain

∂S/∂β₀ = −2Σ(Yᵢ − β₀ − β₁Tᵢ)
∂S/∂β₁ = −2Σ(Yᵢ − β₀ − β₁Tᵢ)Tᵢ   (1.3.6)

[Figure 1.5 Elliptic paraboloid illustrating S given by (1.3.5).]

Both derivatives of (1.3.6) are linear in β₀ and β₁. A necessary condition for a minimum is that both the derivatives in (1.3.6) be zero. Setting these derivatives equal to zero and solving simultaneously yields

β̂₀ = (ΣYᵢ ΣTᵢ² − ΣTᵢ ΣTᵢYᵢ) / [nΣTᵢ² − (ΣTᵢ)²]
β̂₁ = (nΣTᵢYᵢ − ΣTᵢ ΣYᵢ) / [nΣTᵢ² − (ΣTᵢ)²]   (1.3.7)

Introduction of these values into (1.3.1) yields an estimate of k which is designated k̂. A residual is defined by

Residual = eᵢ = Yᵢ − k̂ᵢ   (1.3.8)

which is not identical to the error, εᵢ.
In this example it is implied that εᵢ is completely unknown. When this is true, the least squares procedure just given is recommended. If, however, one knows that εᵢ has a variance σᵢ² (this term is discussed in Chapter 2), some other estimation procedure might be better.
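One such alternative, of the kind developed in later chapters, is weighted least squares: each squared deviation is weighted by the reciprocal variance 1/σᵢ², so the more precise measurements influence the fit more. A minimal sketch, with the standard deviations below assumed purely for illustration:

```python
# Weighted least squares sketch: minimize
#   S = sum w_i * (Y_i - b0 - b1*T_i)^2  with  w_i = 1/sigma_i^2,
# so precise measurements count more.  Data reuse the conductivity
# values of Example 1.3.1; the sigma values are assumed.
T = [300.0, 350.0, 400.0, 450.0]
Y = [0.0255, 0.0309, 0.0350, 0.0377]
sigma = [0.0005, 0.0005, 0.0010, 0.0010]   # assumed standard deviations

w = [1.0 / s**2 for s in sigma]
Sw = sum(w)
Swx = sum(wi*Ti for wi, Ti in zip(w, T))
Swy = sum(wi*Yi for wi, Yi in zip(w, Y))
Swxx = sum(wi*Ti*Ti for wi, Ti in zip(w, T))
Swxy = sum(wi*Ti*Yi for wi, Ti, Yi in zip(w, T, Y))

# weighted normal equations, solved in closed form
b1 = (Sw*Swxy - Swx*Swy) / (Sw*Swxx - Swx**2)
b0 = (Swy - b1*Swx) / Sw
print(b0, b1)
```

With equal weights this reduces to the ordinary least squares result (1.3.7).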
Example 1.3.1

The thermal conductivity of air in units of W/m-K versus temperature in kelvin has been measured to be the following values:

T (K):       300      350      400      450
k (W/m-K):   0.0255   0.0309   0.0350   0.0377

Find estimates of the parameters β₀ and β₁ in (1.3.1) using least squares.

Solution

ΣTᵢ = 300 + 350 + 400 + 450 = 1500
ΣTᵢ² = (300)² + (350)² + (400)² + (450)² = 575,000
ΣYᵢ = 0.0255 + 0.0309 + 0.0350 + 0.0377 = 0.1291
ΣYᵢTᵢ = 0.0255(300) + ... + 0.0377(450) = 49.43

Then β̂₀ and β̂₁ from (1.3.7) are

β̂₀ = [0.1291(575,000) − 1500(49.43)] / [4(575,000) − (1500)²] = 0.00175 W/m-K
β̂₁ = [4(49.43) − 1500(0.1291)] / [4(575,000) − (1500)²] = 0.0000814 W/m-K²

Note that the parameter estimates β̂₀ and β̂₁ have units, each being different. The residuals are −0.00067, 0.00066, 0.00069, and −0.00068, which have a zero sum. The minimum sum of squares, which is the sum of the squares of these terms, is 1.823×10⁻⁶. In calculations such as these, at least an electronic calculator usually is needed because there frequently are small differences of large numbers.

1.3.2 Linear First Order Differential Equation Model

A case for which a fundamental law can be invoked is that of dropping a thin plate initially at temperature T₀ into a fluid at T∞. From the first law of thermodynamics one can derive [19]

dT/dt + β(T − T∞) = 0   (1.3.9a)

where β is considered the unknown parameter. Unlike the previous example β has clear physical significance; it is given by the group

β = h / (ρ cₚ L)

where h is the heat transfer coefficient, ρ density, cₚ specific heat, and L the half-thickness of the plate. It is not possible to estimate h, ρ, cₚ, and L independently when given only measurements of T. One of the concepts in estimation, called identifiability, relates to the question of which parameters or groups of parameters can be uniquely estimated. See Section 1.5 and Appendix A.

In addition to (1.3.9a) an initial condition is needed to obtain a solution; an appropriate one is

T(0) = T₀   (1.3.9b)

The solution of (1.3.9) for constant β and T∞ is

T(t) = T∞ + (T₀ − T∞)e^(−βt)   (1.3.10)

In the classical problem one stops at this point. In the estimation problem, measurements of T are used to estimate β.

Temperature data for the cooling of a plate using a single thermocouple (a temperature sensor) and uniform time spacing are shown in Fig. 1.6. Note that even though the differential equation is linear, T(t,β) in (1.3.10) is a nonlinear function of β; that is, the derivative of (1.3.10) with respect to β is a function of β, unlike k given by (1.3.1). This is discussed further in Section 1.4.

[Figure 1.6 Simulated temperature measurements of a cooled thin plate.]
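The arithmetic of Example 1.3.1 is easy to check numerically with the closed-form expressions (1.3.7):

```python
# Numerical check of Example 1.3.1: ordinary least squares estimates
# from the closed-form expressions (1.3.7), using the air-conductivity
# data of the example.
T = [300.0, 350.0, 400.0, 450.0]
Y = [0.0255, 0.0309, 0.0350, 0.0377]
n = len(T)

ST = sum(T)                            # 1500
STT = sum(t*t for t in T)              # 575,000
SY = sum(Y)                            # 0.1291
STY = sum(t*y for t, y in zip(T, Y))   # 49.43

den = n*STT - ST**2
b0 = (SY*STT - ST*STY) / den
b1 = (n*STY - ST*SY) / den
residuals = [y - (b0 + b1*t) for t, y in zip(T, Y)]
S_min = sum(r*r for r in residuals)

print(b0, b1, S_min)   # approximately 0.00175, 0.0000814, 1.823e-6
```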
The simplest method of estimation involves the use of three temperature measurements including T₀ and T∞. Observe that at least three measurements are needed although only one parameter is being estimated. This is in contrast with the preceding case for which measurements at two temperatures were sufficient to estimate two parameters.

For measurements of T₀ and T∞ and Tᵢ at tᵢ, designated Y₀, Y∞, and Yᵢ, an estimate of β is

β̂ = −(1/tᵢ) ln[(Yᵢ − Y∞)/(Y₀ − Y∞)]   (1.3.11)

As either tᵢ → 0, corresponding to Tᵢ → T₀, or tᵢ → ∞, corresponding to Tᵢ → T∞, the error in β̂ due to some small error in Tᵢ becomes very large. Thus β̂ is more sensitive to errors at some measurement times than others, which suggests the subjects of sensitivity and optimum experimental design (see Chapter 8).

If T₀ and T∞ are not precisely known they can also be considered parameters like β. They are dissimilar from β in that (a) they are particular values of the dependent variable (termed the state variable in the systems literature) and (b) repeated measurements of these are available in this particular example.

One other parameter that could be estimated for this problem is the starting time. The starting time can be seen to be an unknown if one imagines several successive digital temperature measurements taken before the plate is dropped into the fluid. At the instant the plate contacts the fluid the plate's temperature rapidly changes; see Fig. 1.6. The time at which the plate contacts the fluid might not correspond to the instant at which any measurement was taken.

Suppose that the starting time is known to be zero and that all the measurements are for t > 0. For finding estimates of any combination of the parameters T₀, T∞, and β one could start with the sum of squares for n measurements

S = Σᵢ₌₁ⁿ [Yᵢ − T∞ − (T₀ − T∞)e^(−βtᵢ)]²   (1.3.12)

and minimize it with respect to the parameters. The derivatives of (1.3.12) are linear in terms of T₀ and T∞ but nonlinear in terms of β. The nonlinearity complicates the search for a minimum.

One way to minimize S with respect to a nonlinear parameter is simply to plot S versus that parameter and graphically find the minimum. This is a slow procedure, but can give insight.

Example 1.3.2

Suppose it is known that T∞ is equal to 100 and T₀ is equal to 300 and that two measurements of T are available, Y₁ = 220 at 5 sec and Y₂ = 170 at 10 sec. Estimate using least squares the parameter β in (1.3.10). Use a trial and error approach.

Solution

A first estimate of β can be obtained from (1.3.11) using the first observation. We obtain

β̂ = −(1/5) ln[(220 − 100)/(300 − 100)] = 0.1022

Let us then evaluate the sum of squares function S in the neighborhood of that value. From (1.3.12) we can write

S = (120 − 200e^(−5β))² + (70 − 200e^(−10β))²

which we evaluate at β = 0.1022 to find S = 3.901. Now another value of β must be tried. Let us try β = 0.1; this gives S = 14.493. Because this S value is bigger than the value for β = 0.1022, let us try a larger value than β = 0.1022. At β = 0.11, S = 32.988. Hence the minimum must be between β = 0.1 and 0.11 and is probably nearer the first value. Let us try 0.103, which gives S = 2.214. Then the minimum S must be between β = 0.1022 and 0.11. A further value of β = 0.105 yields S = 2.853, and thus β must be between 0.1022 and 0.105, which region could be explored further. One could continue in this trial and error manner to estimate β more accurately. This is a possible approach but it is very tedious and time-consuming, particularly if more than one parameter is present. More direct methods of minimizing S are given in Chapter 7.

It is instructive to plot the function S for this case. See Fig. 1.7. Note that the minimum is near β = 0.1 and a local maximum is approached at large β. Thus in addition to ∂S/∂β being equal to zero near β = 0.1, it also approaches zero as β → ∞. Even more ill-behaved S functions are possible. See Problem 1.5.

[Figure 1.7 Sum of squares function for exponential example.]
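The trial-and-error search of Example 1.3.2 can be automated. The sketch below uses golden-section search, one of many one-dimensional minimization methods (Chapter 7 treats more direct approaches), on an interval that the hand calculations show brackets the minimum.

```python
import math

# Automating the trial-and-error search of Example 1.3.2:
# golden-section minimization of
#   S(beta) = (120 - 200*exp(-5*beta))^2 + (70 - 200*exp(-10*beta))^2
# over an interval known from the hand search to bracket the minimum.

def S(beta):
    return (120.0 - 200.0*math.exp(-5.0*beta))**2 + \
           (70.0 - 200.0*math.exp(-10.0*beta))**2

def golden_min(f, a, b, tol=1e-8):
    g = (math.sqrt(5.0) - 1.0) / 2.0   # inverse golden ratio
    c, d = b - g*(b - a), a + g*(b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - g*(b - a)
        else:
            a, c = c, d
            d = a + g*(b - a)
    return 0.5*(a + b)

beta_hat = golden_min(S, 0.09, 0.12)
print(beta_hat, S(beta_hat))   # minimum near beta = 0.104
```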
1.3.3 Partial Differential Equation Example
Consider again the same physical problem of a plate dropped suddenly into a fluid. Instead of negligible internal resistance (Bi = hL/k < 0.1) assume that there is a significant variation of temperature across the plate. The describing equations for constant properties and a plate of width 2L are [19]

ρcₚ ∂T/∂t = k ∂²T/∂x²,   0 < x < 2L   (1.3.13)

−k ∂T/∂x|ₓ₌₀ = h[T∞ − T(0,t)],   −k ∂T/∂x|ₓ₌₂L = h[T(2L,t) − T∞]   (1.3.14a,b)

T(x,0) = T₀   (1.3.15)

This is a problem which is linear in the dependent variable, T. For the estimation problem we can consider T as a function of a number of variables,

T = T(x, t; k, ρcₚ, h, T₀, T∞)   (1.3.16)

Another aspect of identifiability is the determination of what parameters or groups of parameters can be uniquely estimated. For example, (1.3.13) and (1.3.14) can be divided by k to yield the groups ρcₚ/k and h/k. Since no term in these groups appears elsewhere in the problem, one would anticipate that these groups could be simultaneously estimated. That this is not always true can be proved by noting that this physical problem is identical to the one in Section 1.3.2 for which only the ratio of these two parameters could be estimated. It happens that Bi = hL/k must be equal to approximately one or greater in order to estimate both. There may be other conditions that would also preclude estimation for this example. The condition for identifiability is discussed in Section 1.5.

In order to estimate the parameters one can again use the sum of squares function. Instead of a single summation over time, one could have a double summation over time and sensors located at different positions,

S = Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ [Yⱼ(i) − ηⱼ(i)]²   (1.3.17)

The subscript j is for position, i is for time. There are m discrete locations and n different times. Yⱼ(i) designates an observation and ηⱼ(i) a value obtained from the model.

In parameter estimation one must be able to solve the model repeatedly for different parameter values. For this example an exact solution is available as an infinite series but it may be easier to approximate the solution using a finite-difference representation. Such a solution can also be modified to treat nonlinearities entering in either the differential equation or the boundary conditions.

Note that in this example several different kinds of measurements are required: temperature, time, and length. Measurements of the initial conditions and boundary conditions may not be sufficient for parameter estimation; interior measurements may be needed (identifiability). The location of sensors and duration of the experiment are studied in connection with optimum experiments.
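A sketch of the finite-difference representation suggested above, using an explicit (forward-time, central-space) scheme with half-cell energy balances at the two convective surfaces. The property values are assumed for illustration and are chosen so that Bi = hL/k < 0.1, where the lumped model of Section 1.3.2 (with β = h/(ρcₚL)) should nearly agree with the distributed solution.

```python
import math

# Explicit (FTCS) finite-difference sketch of (1.3.13)-(1.3.15):
#   rho*cp * dT/dt = k * d2T/dx2,  0 < x < 2L,
# with convective boundary conditions and T(x,0) = T0.  Surface nodes
# use half-cell energy balances.  Property values are illustrative
# assumptions giving Bi = h*L/k = 0.05 < 0.1 (lumped model nearly valid).
k, rho_cp, h = 1.0, 1.0, 0.5
L = 0.1                      # half-thickness; plate occupies 0 < x < 2L
T0, Tinf = 300.0, 100.0

N = 10                       # intervals across the plate
dx = 2.0*L / N
dt = 5.0e-5                  # satisfies the explicit stability limit
alpha = k / rho_cp

T = [T0]*(N + 1)
t_end, t = 0.2, 0.0
while t < t_end - 1e-12:
    Tn = T[:]
    for i in range(1, N):    # interior nodes: central difference in x
        Tn[i] = T[i] + dt*alpha*(T[i+1] - 2.0*T[i] + T[i-1])/dx**2
    # half-cell energy balances at the two convective surfaces
    Tn[0] = T[0] + dt*(k*(T[1] - T[0])/dx + h*(Tinf - T[0]))/(rho_cp*dx/2.0)
    Tn[N] = T[N] + dt*(k*(T[N-1] - T[N])/dx + h*(Tinf - T[N]))/(rho_cp*dx/2.0)
    T, t = Tn, t + dt

T_avg = sum(T)/len(T)
T_lumped = Tinf + (T0 - Tinf)*math.exp(-(h/(rho_cp*L))*t_end)
print(T_avg, T_lumped)   # close, since Bi is small
```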
1.4 SENSITIVITY COEFFICIENTS

In this section a brief introduction to sensitivity coefficients is given. Consider the true mathematical model to be given by η(x,t,β), where x and t are independent variables and β is a parameter vector. The first derivative of η with respect to βᵢ will be called the sensitivity coefficient for βᵢ and designated Xᵢ,

Xᵢ = ∂η/∂βᵢ   (1.4.1)

On some occasions the right side of (1.4.1) is multiplied by βᵢ and still called simply a sensitivity coefficient.

Sensitivity coefficients are very important because they indicate the magnitude of change of the response η due to perturbations in the values of the parameters. It is for this reason we have given Xᵢ, defined by (1.4.1), the name "sensitivity coefficient." They appear in relation to many facets of parameter estimation. The reader is urged to pay particular attention to them and even to plot them versus their independent variable(s) if their shapes are not obvious. One area where the sensitivity coefficients appear
is in the identifiability problem, which is briefly discussed in Section 1.5. Another area where the Xᵢ's appear is the Gauss method of linearizing the estimation problem when the model is nonlinear in terms of parameters (see Section 7.4). In the optimal design of experiments discussed in Chapter 8, the sensitivities also play a key role.
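When no closed-form expression for η is available, the sensitivity coefficients defined by (1.4.1) can be approximated by finite differences. A minimal sketch; the exponential model and step size below are illustrative assumptions, not forms from the text:

```python
import numpy as np

# Central-difference approximation of the sensitivity coefficient
# X_i = d(eta)/d(beta_i) of (1.4.1), useful when eta is only available
# numerically.

def sensitivity(model, beta, i, t, h=1e-6):
    b_plus = beta.copy();  b_plus[i]  += h
    b_minus = beta.copy(); b_minus[i] -= h
    return (model(t, b_plus) - model(t, b_minus)) / (2.0 * h)

# Assumed model for illustration: eta = beta1 * exp(-beta2 * t)
def eta(t, beta):
    return beta[0] * np.exp(-beta[1] * t)

t = np.linspace(0.0, 2.0, 5)
beta = np.array([2.0, 1.0])
X1 = sensitivity(eta, beta, 0, t)    # analytically exp(-beta2 * t)
X2 = sensitivity(eta, beta, 1, t)    # analytically -beta1 * t * exp(-beta2 * t)
```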
The sensitivity coefficients also appear in a Taylor's series for η(β₁,…,βₚ,t) about the neighborhood of the point (b₁,b₂,…,bₚ), which we shall denote b. Provided η has continuous derivatives near β = b, we can write

    η(β₁,…,βₚ,t) = η(b,t) + [∂η(b,t)/∂β₁](β₁-b₁) + … + [∂η(b,t)/∂βₚ](βₚ-bₚ)
                 + [∂²η(b,t)/∂β₁²](β₁-b₁)²/2! + … + [∂²η(b,t)/∂β₁∂β₂](β₁-b₁)(β₂-b₂) + …    (1.4.2)

If the derivatives ∂^(r+s)η/∂βᵢ^r∂βⱼ^s (i,j = 1,…,p) for r+s > 1 are zero, then η is said to be linear in the parameters. For η a linear function of β₁ and β₂, we can write

    η = η(b,t) + X₁(β₁-b₁) + X₂(β₂-b₂)    (1.4.3)

This relation is an equality rather than an approximation if both X₁ and X₂ are not functions of the parameters. Hence η is linear in its parameters if all the sensitivity coefficients are not functions of any parameter(s).

Consider now some simple examples. The β₁, β₂, and β₃ sensitivity coefficients for the algebraic model given by (1.4.4) are, respectively, given by (1.4.5), the first of which is X₁ = 1. Since each of these is independent of all the parameters, η given by (1.4.4) is linear in its parameters. Estimation involving models linear in the parameters is generally easier and more direct than estimation involving nonlinear parameters.

Another algebraic model which occurs in many fields is given by (1.4.6). The β₁ and β₂ sensitivities of this equation, given by (1.4.7), contain the parameters, and thus η given by (1.4.6) is nonlinear in terms of its parameters. If, however, the only parameter of interest is β₁, η is linear in terms of β₁.

The evaluation of sensitivity coefficients need not begin with an expression for η but could be initiated with the given differential equation. For example, if the derivative of (1.3.9a) (a linear differential equation) is taken with respect to β,

    dX/dt = -(T - T∞) - βX    (1.4.8a)

    X(0) = 0    (1.4.8b)

Equation 1.4.8a is termed the sensitivity equation for this case and, together with (1.4.8b), constitutes a statement of the sensitivity problem. In (1.4.8a) it is assumed that T (or η in the notation of this section) is a known function obtained from a previous solution of the original differential equation and initial condition. Since β appears explicitly in (1.4.8a), the sensitivity coefficient X is a function of β. Consequently the dependent variable T is nonlinear in β, as can be verified by differentiating (1.3.10).
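The sensitivity problem (1.4.8a,b) can be integrated step by step alongside the original equation. The sketch below assumes the cooling-law form dT/dt = -β(T - T∞) with T(0) = T₀ for (1.3.9a), which is the form consistent with (1.4.8a); the parameter values are illustrative:

```python
import numpy as np

# Integrate the model and its sensitivity equation (1.4.8a,b) together.
# Assumed model: dT/dt = -beta (T - Tinf), T(0) = T0. Differentiating with
# respect to beta gives dX/dt = -(T - Tinf) - beta X with X(0) = 0.
# Explicit Euler is used only to keep the sketch short.

def solve(beta, T0, Tinf, t_end=2.0, n=2000):
    dt = t_end / n
    T, X = T0, 0.0
    for _ in range(n):
        dT = -beta * (T - Tinf)
        dX = -(T - Tinf) - beta * X
        T += dt * dT
        X += dt * dX
    return T, X

beta, T0, Tinf = 0.5, 100.0, 20.0
T, X = solve(beta, T0, Tinf)
# Exact values for comparison: T = Tinf + (T0 - Tinf) e^(-beta t) and
# X = -(T0 - Tinf) t e^(-beta t); X depends on beta, so T is nonlinear in beta.
```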
1.5 IDENTIFIABILITY
There are some models for which it is not possible to estimate uniquely all the parameters from measurements. Rather, it is possible to estimate only certain functions of them. This is part of the identifiability problem. See Appendix A for a derivation of an identifiability criterion.

In this section several simple cases for which one cannot uniquely estimate all the parameters are discussed. Later an identifiability criterion utilizing sensitivity coefficients is introduced and related to some of the cases previously investigated.
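The flavor of these cases can be previewed numerically. For an illustrative model in which the parameters enter only through the single combination β₁ + Aβ₂ (A = 2 and f(t) = t are invented choices here), the sum of squares S has a whole line of minimizers rather than a unique minimum:

```python
import numpy as np

# Illustrative non-identifiability: eta = (beta1 + A*beta2) * f(t) with
# A = 2 and f(t) = t (assumed for illustration). Any (beta1, beta2) pair
# with the same beta1 + 2*beta2 produces the same eta, hence the same S.

t = np.array([1.0, 2.0, 3.0])
Y = 3.0 * t                         # data consistent with beta1 + 2*beta2 = 3

def S(b1, b2):
    eta = (b1 + 2.0 * b2) * t
    return float(np.sum((Y - eta) ** 2))

on_line  = [S(3.0, 0.0), S(1.0, 1.0), S(-1.0, 2.0)]   # all zero: no unique minimum
off_line = S(1.0, 0.0)                                 # positive
```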
A model that will not permit estimation of both β₁ and β₂ is given by (1.5.1), where A is a constant and f(t) is any known function of t. In this case one can only estimate β = β₁ + Aβ₂ given measurements of Yᵢ versus tᵢ. In the S,β₁,β₂ space, S does not have a unique minimum, but instead has a minimum along a line which projects into β = β₁ + Aβ₂ in the β₁,β₂ plane; see Fig. 1.8.

Figure 1.8  Contours of minimum S for various cases where not all the parameters can be uniquely estimated.

Consider next the model given by (1.5.2). From inspection we see that β₁ and β₂ can be replaced by the product β = β₁β₂ and that any combination of β₁ and β₂ equal to β would yield the same value of ηᵢ for a given tᵢ. In terms of the three-dimensional space of S plotted versus β₁ and β₂, there is a minimum S along a curved line which projects into β₂ = β/β₁ in the β₁,β₂ plane, as shown in Fig. 1.8.

A very similar case to (1.5.2) is (1.5.3), where again only the ratio β is unique and various combinations of β₁ and β₂ could be given to provide β = β₁/β₂ = constant. In the S, β₁, and β₂ coordinates, the minimum of S occurs along a straight line β₁ = ββ₂ projected into the β₁,β₂ plane. In Fig. 1.8 this line passes through the origin.

A less obvious case is for the model

    ηᵢ = β₁/(β₂ + β₃tᵢ)    (1.5.4)

Dividing the numerator and denominator by β₂ yields

    ηᵢ = α₁/(1 + α₂tᵢ)    (1.5.5)

where it is seen that ηᵢ is a function of α₁ = β₁/β₂, α₂ = β₃/β₂, and tᵢ, and thus only α₁ and α₂ can be simultaneously estimated.

Another simple case where all three parameters cannot be uniquely estimated is for

    ηᵢ = β₁e^(-(β₂+β₃tᵢ))    (1.5.6a)

which can be written as

    ηᵢ = α₁e^(-β₃tᵢ),  α₁ = β₁e^(-β₂)    (1.5.6b)

where only α₁ and β₃ can be found.

There are other cases where the parameters cannot (easily) be uniquely estimated if measurements are made only over a certain range of the independent variable or at certain values. One example is given by (1.5.7) for max|tᵢ| small compared to unity. For such a model it is possible to estimate accurately only β₁ + 10β₂ if |tᵢ| is small. This model is thus similar to (1.5.1) for small |tᵢ|. For sufficiently "large" tᵢ both β₁ and β₂ can be estimated. Another example is for the model

    ηᵢ = β₁tᵢ + β₂ sin β₃tᵢ    (1.5.8a)

for small β₃tᵢ, since then ηᵢ can be approximated by

    ηᵢ ≈ (β₁ + β₂β₃)tᵢ    (1.5.8b)

Hence for small max|β₃tᵢ|, instead of being able to estimate uniquely all three parameters we can estimate only β₁ + β₂β₃.

Many other cases could be cited that demonstrate that only certain functions of parameters can be estimated from measurements of ηᵢ versus its independent variable(s). Some of these cases may not be at all obvious. This is particularly true where there are a number of parameters and the model is a differential equation. Rather than depending upon being able to manipulate the model so that groups of parameters appear, we would be helped by having some criterion that could be applied to the above algebraic models and also to models involving differential equations. In the latter case we imagine that the solutions of the equations and the sensitivity coefficients are available in graphical or tabular form. It turns out in the algebraic cases above, as well as for other cases involving differential equations, that the sensitivity coefficients can provide insight into the cases for which parameters can and cannot be estimated. Parameters can be estimated if the sensitivity coefficients over the range of the observations are not linearly dependent. This is the criterion that we shall use to determine if the parameters can be simultaneously estimated without ambiguity. See Appendix A for a derivation of this criterion.

Linear dependence occurs when for p parameters the relation

    C₁ ∂ηᵢ/∂β₁ + C₂ ∂ηᵢ/∂β₂ + … + Cₚ ∂ηᵢ/∂βₚ = 0    (1.5.9)

is true for all i observations and for not all the C values equal to zero.

Let us illustrate the above criterion for a few examples. For (1.5.1) note that

    ∂ηᵢ/∂β₂ = A ∂ηᵢ/∂β₁

and thus, if C₁ = A and C₂ = -1, (1.5.9) is satisfied. Consequently, both β₁ and β₂ cannot be estimated simultaneously.

Another example involves (1.5.4), for which

    ∂ηᵢ/∂β₁ = 1/(β₂ + β₃tᵢ),  ∂ηᵢ/∂β₂ = -β₁/(β₂ + β₃tᵢ)²,  ∂ηᵢ/∂β₃ = -β₁tᵢ/(β₂ + β₃tᵢ)²

It is not immediately obvious from an inspection of these sensitivity relations that there is linear dependence. It can be verified, however, that if C₁ = β₁, C₂ = β₂, and C₃ = β₃, linear dependence exists; in equation form, we then have

    β₁ ∂ηᵢ/∂β₁ + β₂ ∂ηᵢ/∂β₂ + β₃ ∂ηᵢ/∂β₃ = 0

which form can occur in various cases with linear dependence. The dependent variable η and the sensitivity coefficients for the model (1.5.4) are depicted in Fig. 1.9 for β₂ = 1.

Figure 1.9  Dependent variable η and sensitivity coefficients for η = β₁/(β₂ + β₃t) with β₂ = 1.

It is strongly recommended that the sensitivity coefficients be plotted and carefully examined to see if linear dependence exists or even is approached. The relation given above between the coefficients can be approximately verified by graphically adding the three together to obtain zero at each instant of time. Furthermore, note that the β₁ and β₃ sensitivities seem to have approximately proportional magnitudes for β₃t greater than 3. This means that not only is it impossible to estimate β₁, β₂, and β₃ simultaneously from measurements of η versus t, but it is difficult to estimate only β₁ and β₃ using data for β₃t > 3 if β₂ ≤ 1.
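The linear-dependence criterion can also be checked numerically: evaluate the sensitivity coefficients at the observation times, stack them as columns of a matrix, and inspect its singular values. A minimal sketch for the model η = β₁/(β₂ + β₃t); the parameter values and times are illustrative choices:

```python
import numpy as np

# Numerical check of the criterion (1.5.9) for eta = beta1/(beta2 + beta3*t):
# a (near-)zero smallest singular value of the sensitivity matrix signals
# (near-)linear dependence of the sensitivity coefficients.

t = np.linspace(0.1, 5.0, 50)
b1, b2, b3 = 2.0, 1.0, 1.0
den = b2 + b3 * t
X1 = 1.0 / den               # d(eta)/d(beta1)
X2 = -b1 / den**2            # d(eta)/d(beta2)
X3 = -b1 * t / den**2        # d(eta)/d(beta3)

A = np.column_stack([X1, X2, X3])
sv = np.linalg.svd(A, compute_uv=False)
# Since b1*X1 + b2*X2 + b3*X3 = 0 at every t, the smallest singular value
# is zero to machine precision: only two parameter combinations are identifiable.
```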
1.6 S UMMARY AND CONCLUSIONS
1. Parameter estimation is a discipline that provides tools for the efficient use of data for aiding in mathematical modeling of phenomena and the estimation of constants appearing in these models. The problem of estimating parameters is that of finding constants appearing in an equation describing a system, as suggested by Fig. 1.3.

2. One way to estimate the parameters for a large variety of models is to use least squares, which involves minimizing the sum of squares of differences between measurements and model values. The minimization problem can be either linear or nonlinear.

3. One cannot always independently estimate all the parameters that appear in the model. It is clear that not all the parameters may be estimated if parameters appear in groups, but in some cases not even all these groups may be found. This is related to the subject of identifiability.
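For a model linear in the parameters, the least squares minimization of point 2 reduces to a small linear-algebra problem. A minimal sketch with invented data:

```python
import numpy as np

# Least squares for a model linear in the parameters, eta = beta0 + beta1*t:
# minimize S = sum (Y_i - eta_i)^2. The data below are invented.

t = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.array([1.1, 2.9, 5.2, 6.8])

A = np.column_stack([np.ones_like(t), t])      # design matrix
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)   # minimizes ||A beta - Y||^2
residuals = Y - A @ beta                       # sum to zero when an intercept is fit
```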
PROBLEMS

1.1  The thermal conductivity k has been found from four independent experiments at different temperatures to be given by

     Tᵢ (°C):      100   200   300   400
     kᵢ (W/m-C):    90    98   111   121

     (a) Estimate β₀ and β₁ in (1.3.1), using least squares.
         Answer: 78.5, 0.106
     (b) Calculate the residuals.
         Answer: 0.9, -1.7, 0.7, 0.1
     (c) For β₀ = 80, plot S versus β₁ in the neighborhood of the minimum.

1.2  (a) Derive using least squares an estimate of β for the simple model

         ηᵢ = β

         for n measurements. Assume Yᵢ = ηᵢ + εᵢ, εᵢ being the measurement error.
     (b) Also derive estimates for β₀ and β₁ for the model

         ηᵢ = β₀ + β₁ sin tᵢ

1.3  Some actual measurements for the specific heat cₚ of Armco iron at room temperature are, in units of kJ/kg-C,

     i:    1       2       3       4       5       6       7       8       9       10
     cₚ:   0.4287  0.4363  0.4451  0.4409  0.4442  0.4400  0.4400  0.4405  0.4375  0.4333

     Using the model of Problem 1.2a, estimate cₚ. Plot the residuals as a function of i. What is the sum of the residuals?

1.4  (a) For the model

         η = e^(βt)

         and the data given below, estimate β by plotting S versus β. Cover the range 0 to -20.0.
     (b) Compare the curve with Fig. 1.7.
     (c) Compare the residuals with the true errors (eᵢ = Yᵢ - ηᵢ), also given below:

     tᵢ:          0.25    0.5     0.75    1.0      1.25    1.5
     Data, Yᵢ:    0.419   0.204   0.159   -0.106   0.042   …
     Errors, εᵢ:  -0.053  -0.019  0.054   -0.156   0.0187  …

1.5  Plot S versus β for the model η = 100 sin βt, with βt in radians, and for the data t₁ = 2.79, Y₁ = 34.2; t₂ = 6.98, Y₂ = 64.2; and t₃ = 8.38, Y₃ = 86. Investigate the range 0 < β < 1.1 for Δβ increments at least as small as 0.1. (A programmable calculator would be helpful to get the solution.) What conclusions can you draw?
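Several of the problems above ask for S to be evaluated or plotted over a range of β; a brute-force grid evaluation is enough for that. The sketch below uses the model and data of Problem 1.5 as reconstructed here (η = 100 sin βt), so treat those specifics as assumptions:

```python
import numpy as np

# Brute-force evaluation of S(beta) on a grid, as several problems request.
# Model and data follow Problem 1.5 as reconstructed (an assumption).

t = np.array([2.79, 6.98, 8.38])
Y = np.array([34.2, 64.2, 86.0])

betas = np.arange(0.0, 1.1 + 1e-9, 0.01)
S = np.array([np.sum((Y - 100.0 * np.sin(b * t)) ** 2) for b in betas])
best = betas[np.argmin(S)]     # plotting S vs beta shows the shape of the surface
```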
1.6  How can (1.3.12) be changed to permit estimation of the starting time t₀ and T₀? Assume that measurements are available for t both less than and greater than t₀. Also assume that the plate has been at T₀ for a "long" time before t₀.

1.7  (a) For the model

         ηᵢ = β₁/(1 + β₂tᵢ)

         and the data

         tᵢ:   0     1    2    3
         Yᵢ:   200   55   30   20

         calculate the sum of squares S in the rectangular region 100 < β₁ < 300 and 2.0 < β₂ < 4.0. In particular, evaluate S at β₁ = 100, 200, and 300 with β₂ = 2.0, 3.0, and 4.0.
     (b) Is ηᵢ linear in β₁ and β₂?
     (c) Based on the information in (a), estimate β₁ and β₂.
     (d) Using the search procedure in (a), is it more or less than twice as much work to find two parameters as it is to estimate one?

1.8  The current i in the circuit of Fig. 1.10 after the switch S is closed satisfies the differential equation

         L di/dt + Ri - E = 0

     where L is inductance, R is resistance, and E is voltage. An initial condition is i = i₀ at t = 0. Note that a solution for i is in terms of L, R, E, and i₀.
     (a) What is (are) the dependent variable(s)?
     (b) What is (are) the independent variable(s)?
     (c) What is (are) the state(s)?
     (d) What could be termed parameters?
     (e) What could be termed properties?
     (f) The solution of the problem is

         i = (E/R)[1 - e^(-(R/L)t)] + i₀e^(-(R/L)t)

         Is i linear in E? R? L? i₀?
     (g) What parameters or groups of parameters can be estimated given measurements of i?

     Figure 1.10  Circuit for Problem 1.8.

1.9  For the following expressions for the model ηᵢ, indicate for the various βⱼ values if ηᵢ is linear or nonlinear in terms of them.
     (a) ηᵢ = β₁ + β₂ sin(πtᵢ/t₀)
     (b) ηᵢ = β₁tᵢe^(-β₂tᵢ)
     (c) ηᵢ = β₁e^(-β₂tᵢ)/(1 + β₁tᵢ + β₂tᵢ²)
     (d) ηᵢ = Σⱼ (1 - βⱼtᵢ)e^(-jωtᵢ)

1.10 For the following expressions for the model η, derive expressions for the sensitivity coefficients. Also plot the sensitivity coefficients and η versus β₂t. For parts (b) and (c) graph η/β₁, ∂η/∂β₁, and (β₂/β₁)∂η/∂β₂ versus β₂t. (If values of β₁ and β₂ are needed, let β₁ = 2 and β₂ = 1.)
     (a) η = β₁ + β₂t
     (b) η = β₁ cos β₂t    (0 ≤ β₂t ≤ 4π)
     (c) η = β₁(1 - e^(-β₂t))    (0 ≤ β₂t ≤ 3)

1.11 For the model …, where t is in radians, plot the sensitivity coefficients for -2 ≤ t ≤ 2. Over what range (if any) do the parameters β₁ and β₂ appear to be linearly dependent?

1.12 Find a linear relation between the sensitivity coefficients for β₁, β₂, and β₃ for the model …
1.13 Consider the model (see Fig. 1.11)

         η = η₀ - β₁(t - t₀)    (t ≥ t₀)
         η = η₀                 (t < t₀)

     (β₁ is positive)

     Figure 1.11  η for Problem 1.13.

     (a) Find and graph ∂η/∂η₀.
     (b) Find and graph ∂η/∂t₀.
     (c) Can t₀ and η₀ be simultaneously estimated using only two measurements of η if β₁ is known?

CHAPTER 2
PROBABILITY
2.1 RANDOM HAPPENINGS

If a room thermostat is set at 21°C, we do not expect the temperature throughout the room, or even right at the thermostat, to remain constant. Rather, we expect the temperature at any point to change continually and continuously while remaining very near 21°C.

If we run a test of braking distance by repeatedly bringing a car to 55 mph, then applying the brakes, we expect the distance covered after application of the brakes to differ from trial to trial no matter how we try to make sure that the road and wind conditions and pressure on the brake pedal are the same from trial to trial. We do hope to settle on some typical distance and perhaps on some measure of variability. In both cases, thermostat and braking distance, there are elements of stability and elements of randomness.

Example 2.1.1

As a simple example of randomness with an element of stability, let us observe successive determinations of percent defective in a sampling inspection of items from a production line. (The data to be exhibited were actually generated by computer simulation.) Successive items were inspected and declared to be either Good or Defective. The first two items were found to be Good, the third Defective, and so on. The results of the first 500 determinations are given in Table 2.1A. To make a long series of such determinations easier to contemplate, Table 2.1B gives the number of defectives found among the first n items inspected for various n up