CHAPTER 1

INTRODUCTION TO AND SURVEY OF PARAMETER ESTIMATION

1.1 INTRODUCTION

One of the fundamental tasks of engineering and science, and indeed of mankind in general, is the extraction of information from data. Parameter estimation is a discipline that provides tools for the efficient use of data in the estimation of constants appearing in mathematical models and for aiding in modeling of phenomena. The models may be in the form of algebraic, differential, or integral equations and their associated initial and boundary conditions. An estimated parameter may or may not have a direct physical significance.

Parameter estimation can also be visualized as a study of inverse problems. In the solution of partial differential equations one classically seeks a solution in a domain knowing the boundary and initial conditions and any constants. In the inverse problem not all these constants would be known. Instead, discrete measurements of the dependent variable inside the domain must be used to estimate values for these constants, also called parameters.

Parameter estimation is needed in the modern world for the solution of the many diverse problems related to the space program, investigation of the atom, and modeling of the economy. Examples and applications in this book, however, are directed to estimation problems occurring in engineering and science in which partial differential equations as well as ordinary differential and algebraic equations are used to model the phenomena.

Fortunately, simultaneously with the development of an increased need for parameter estimation, computers have been built that make parameter estimation practicable for a great array of applications. It should be noted that both digital computational and data acquisition facilities are practical necessities in parameter estimation.
Both these facilities have been readily available only since the late 1950s or early 1960s, whereas estimation was first extensively discussed by Legendre in 1806 [1] and Gauss in 1809 [2]. In Gauss's classic paper he claimed usage of the method of least squares (still used in parameter estimation) as early as 1795 in connection with the orbit determination of minor planets. For this reason Gauss is recognized as being the first to use this important tool of parameter estimation.

The name "parameter estimation" is not universally used. Other terms are nonlinear least squares, nonlinear estimation, nonlinear regression,* and identification, although the latter sometimes is given a quite different meaning. Estimation is a statistical term and identification is an electrical engineering term.

1.1.1 Parameters, Properties, and States

A mathematical model of a dynamic process usually involves ordinary or partial differential equations. Sometimes the solution of these equations is a relatively simple set of algebraic equations. In any case, there are dependent and independent variables and also certain constants. The dependent variables are sometimes called state variables (or signals). The constants may be parameters. In experiments the states are frequently measured directly, but the parameters are not. Approximate values of the parameters are inferred from measurements of the states. Since only approximate parameter values are found, the parameters are said to be estimated. This book is primarily concerned with estimating parameters. Other books concentrate on providing "best" predictions of states based on knowledge of independent variables. The two problems are quite similar. When parameters are estimated, state estimates are usually found simultaneously.

A parameter having a physical significance for a solid or fluid might also be termed a property.
Examples of parameters that might also be properties are characteristics of materials such as density, specific heat, viscosity, thermal conductivity, electrical conductivity, Young's modulus, emittance, and electrical capacitance. A property frequently involves the concept of per unit length, area, or volume. Examples of quantities that are parameters but not properties are weight or mass of a given specimen, electrical resistance of a section of wire, and drag experienced by a truck moving at a constant velocity.

*Parameter estimation is not necessarily nonlinear as implied by some of these terms.

The concepts of parameters, properties, and states are illustrated by the following examples.

Example 1.1.1

Newton's second law states for a system that F_x(t) = m a_x(t), where F_x(t) is the force in the x direction, a_x(t) is the acceleration in the x direction, and m is mass. Force and acceleration, which can be functions of time t, can be considered to be states, whereas mass is a parameter. Force and acceleration are both often easily measured; usually mass is easily measured separately also, but there can be cases when the mass must be inferred, such as when determining the mass of comets and planets. If a body is homogeneous, the mass is the product of its density and volume. The density is a property. If the volume were known and measurements of F_x and a_x were available, the parameter to be estimated would be the density, which also happens to be a property of the material.

Example 1.1.2

Ohm's law states that E(t) = R I(t), where E(t) is voltage, I(t) is current, and R is electric resistance. Voltage and current are states, whereas resistance is a parameter. If the resistance of a wire of known length and diameter is being determined, one could instead estimate the electric resistivity of that type of wire, a parameter which is also a property.
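To make the idea in Example 1.1.2 concrete, here is a small sketch (our own illustration, not from the text) of estimating the parameter R from simulated measurements of the states: the current values are treated as exact and the voltage readings carry invented measurement errors. Minimizing the sum of squared deviations with respect to R gives a closed form.

```python
# Least-squares estimate of a single parameter (resistance R) from
# simulated measurements of the states E (voltage) and I (current).
# The data values below are invented for illustration only.

currents = [0.5, 1.0, 1.5, 2.0]      # I_i, amperes (assumed exact)
voltages = [5.1, 9.8, 15.2, 19.9]    # Y_i, measured E with errors

# Minimizing S = sum (Y_i - R*I_i)^2 over R gives
#   R_hat = sum(Y_i * I_i) / sum(I_i^2)
r_hat = sum(y * i for y, i in zip(voltages, currents)) / \
        sum(i * i for i in currents)
print(r_hat)   # close to the nominal 10 ohms used to generate the data
```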
Example 1.1.3

An object is thrown vertically above the earth with an initial velocity of V₀. From the solution of the appropriate differential equation, the distance s of the object above the earth is described by s = V₀t − gt²/2, where g is the acceleration of gravity. Here s would be the state and g the parameter. The independent variable is time t. V₀ could be a parameter or a state. This illustrates that the parameter and state estimation problems sometimes overlap.

Above, the term parameter was applied to what might be termed "physical" parameters in contrast with statistical parameters. Examples of statistical parameters are variances and correlation coefficients of the measurement errors. Both types of parameters may have to be estimated in some problems; in others only the physical parameters need be found.

1.1.2 Purpose of This Chapter

The purpose of this chapter is to survey some of the basic problems and concepts of parameter estimation covered in this book. By understanding some of the ideas in a simple form, the reader can better comprehend the detailed treatment in later chapters.

Much of parameter estimation can be related to five optimization problems. The first problem is the choice of the best function to extremize. The most common function chosen for minimizing is the sum of squares of deviations. This yields the method of least squares discussed in Section 1.3. The second optimization problem is the minimization of the chosen function, which is also discussed in Section 1.3. These first two optimization problems are usually what is meant by the parameter estimation problem, which is discussed further in Section 1.2.3. The third optimization problem involves optimal design of experiments to obtain the "best" parameter estimates. This is discussed in Section 1.2.4.
If several competing mathematical models are known, but the true model is unknown, the (fourth) problem is the optimal design of experiments to discriminate between the models; see Section 1.2.5. (By a "known" model we mean that the mathematical structure of the equation is known even though the values of certain parameters may be unknown.) The fifth and most difficult problem is the determination of mathematical models when there is so little information that a complete, finite list of competing models is not known (Section 1.2.6).

In addition to the basic optimization problems in parameter estimation there are a number of basic concepts. One of these relates to what we call sensitivity coefficients. A sensitivity coefficient is formed by taking the first derivative of a dependent variable (i.e., a state variable) with respect to a parameter. This is discussed further in Section 1.4. Sensitivity coefficients are important because they can give information regarding linearity and identifiability. Linearity is concerned with the dependent variable(s) being linear or nonlinear in the parameters (Section 1.4). There are cases where unique solutions for some parameters do not exist; this relates to identifiability, which is discussed in Sections 1.3 and 1.4. Another concept, parsimony, long a principle by which choices among alternative explanations of physical phenomena have been made, has recently been emphasized by Box [4] in its application to parameter estimation. Box asks that a model have a minimum number of parameters consistent with the physical basis, if there is one.

1.1.3 Related Research

A number of individuals and groups have contributed to the research in the past decade. Some of the statisticians and engineers who have made significant contributions to parameter estimation are G. E. P. Box, N. Draper, J. S. Hunter, M. J. Box, W. G. Hunter, Y. Bard, J. H. Seinfeld, and L. Lapidus. Some books on parameter estimation with a statistical emphasis are given in references 3 through 6.

Another group that has made a large contribution to estimation is the control and systems group of electrical engineers. Most of their work relates to state estimation rather than parameter estimation, which they call identification. Some books on state estimation are by Sage and Melsa [7], Sage [8], Deutsch [9], and Bryson and Ho [10]. Books by Sage and Melsa [11], Graupe [12], and Mendel [13] discuss identification. This group usually is concerned with estimating states and parameters in sets of ordinary differential equations.

Another group is composed of econometricians, that is, economists with a strong interest in statistics. One reference is Kmenta [14]. Econometricians have generally concerned themselves with models which can be approximated adequately by systems of linear algebraic equations. Other valuable works, not identified with any of the above groups, are references 15-17.

1.1.4 Relation to Analytical Design Theory

Parameter estimation is related to analytical design theory. The concepts of choosing the best cost function and minimizing it are common to both. Because of the similarity, there may be some technology transfer from parameter estimation theory to design theory. In addition to the design aspects related to cost functions mentioned above, parameter estimation is also concerned with the design of "best" experiments. This involves various ideas including criteria, constraints, and sensitivity. Furthermore, modern design is utilizing statistics to a greater extent than formerly to describe tolerances, life of structures, and so on.

1.2 FUNDAMENTAL PROBLEMS

A number of distinctions between various estimation problems should be understood. Some of these distinctions can be confusing because similar, but not identical, problems are encountered in control engineering.

1.2.1 Deterministic or Classical Problem

In the classical problem one mathematically models a system in a certain domain and seeks to calculate the dependent variable(s) in the domain for a known model and initial and boundary conditions. There is no uncertainty in any of these. Not only is the structure of the model known (e.g., an ordinary differential equation of known order) but all the relevant parameters or properties are known. This is illustrated by Fig. 1.1, which might be visualized as the heating of a billet with q(t) being the heat input and η(t) the temperature of the billet. The model might be

    β₁ dη/dt + β₀η = q(t)    (1.2.1)

    η(0) = B    (1.2.2)

where (1.2.1) is the differential equation containing the input q(t) and (1.2.2) is the initial condition. All the parameters, β₀, β₁, B, and q(t), are known. The objective is to calculate the state η(t) for t > 0, in other words, to solve the differential equation. Of all the problems listed here the classical problem is the one engineers are most often trained to solve. It is not the subject of this book.

Figure 1.1 Classical problem of known input and system. The problem is to calculate the output state.

1.2.2 State Estimation Problem

In the state estimation problem, the state is estimated using measurements of the input and the state. See Fig. 1.2. This problem is similar to the classical one in that η(t) is needed and the model is known. There are extra complications, however, in that the observed input contains the noise w(t) and that measurements are only available for the output state η corrupted with the noise ε(t). ("Noise" means nonsystematic measurement errors.) Using the preceding example one could still write (1.2.1) but q(t) would not be precisely known and neither would B in (1.2.2). In the solution of this problem the statistics of w(t) and ε(t) are usually assumed to be known. One seeks a "best" or optimal estimate of the true system state η(t). It is in connection with this problem that the term "filter" is used [7-12].

Figure 1.2 State estimation problem. The problem is to estimate η(t).

1.2.3 Parameter Estimation Problem

In the parameter estimation problem the structure of the differential equation is known; measurements of the input q(t) as well as the initial condition(s) [B in (1.2.2)] or boundary conditions are available. Some or all of the parameters may be unknown. The problem is to obtain the "best" or optimal estimate of these parameters using the measured values of input and output. Because measurements invariably contain errors, solutions of parameter estimation problems utilize concepts of probability and statistics. The requisite probability background is reviewed in Chapter 2 and the bases of the statistical methods are given in Chapter 3. The reader who has an adequate background in probability and statistics can omit these two chapters.

This estimation problem is illustrated by Fig. 1.3. The "unknown" system is modeled by a differential equation containing unknown parameters. This problem also involves state estimation because η(t) is unknown and is usually estimated at the same time as the parameters. The investigation of the parameter estimation problem is a primary objective of this text. The emphasis is on parameter estimation techniques that are appropriate for analysis of dynamic experiments. Methods useful for estimation involving linear and nonlinear partial differential equations are given particular attention.

Figure 1.3 Parameter estimation problem. The problem is to estimate certain unknown parameters in the model of the system.

1.2.4 Optimum Experiment Problem

The optimum experiment problem can be illustrated using Fig. 1.3. An objective is to adjust any inputs such as q(t) or boundary and initial conditions so as to minimize the effect of errors on estimated values of the parameters. In other words the output η(t) would be made as "sensitive" as possible to the parameters. Adjusting q(t) means the selection of the time variation to accomplish the objective. Another objective would be to find the best location for sensors and the best duration for taking measurements. There are certain realistic constraints that must be included when seeking these optimums, such as the maximum allowable experiment duration and maximum temperature rise. Optimum experiments are discussed mainly in Chapter 8.

1.2.5 Discrimination Problem

In the discrimination problem there are two or more possible candidates for the model, one of which is the true model. The objective is to design experiments that will enable one to decide upon the correct model. There are some similarities with the optimum experiment problem. This is also discussed in Chapter 8.

1.2.6 Identification Problem

The identification problem is similar to the parameter estimation problem in that there may be unknown parameters in the model. The problem is much more complex because the structure of the model (e.g., differential equation) is unknown. Developing models is sometimes called model building, which is discussed in various connections in Chapters 5 to 8.

1.3 SIMPLE EXAMPLES

Typical problems are outlined in this section for estimation problems involving algebraic, ordinary differential, and partial differential equations. These examples are given to introduce the student to a number of parameter estimation concepts that are amplified in subsequent chapters.

1.3.1 Linear Algebraic Model

Suppose that a number of distinct experiments have been performed for a given material at different temperatures, T, and that at each T a value of the thermal conductivity k has been determined. Hence there are a number of data sets (Y₁, T₁), (Y₂, T₂), ..., where Yᵢ is a measured value of k and Tᵢ is the temperature in the ith experiment. The data are shown in Fig. 1.4.

Figure 1.4 Simulated data for thermal conductivity vs. temperature.

A model must now be proposed for k versus T. If there is any applicable physical law relating k and T it should be used. For this example none is known, but the data of Fig. 1.4 suggest a linear relation in T,

    k = β₀ + β₁T    (1.3.1)

where β₀ and β₁ are unknown parameters. The measurements Yᵢ are related to k(Tᵢ), abbreviated kᵢ, by

    Yᵢ = kᵢ + εᵢ

where εᵢ is an unknown error. For n measurements there are n equations with two parameters and n unknown errors,

    Yᵢ = β₀ + β₁Tᵢ + εᵢ,  i = 1, 2, ..., n    (1.3.2a)

the first two of which are

    Y₁ = β₀ + β₁T₁ + ε₁
    Y₂ = β₀ + β₁T₂ + ε₂    (1.3.2b)

If n = 1, both β₀ and β₁ cannot be estimated. If n = 2, estimates of β₀ and β₁ can be obtained from (1.3.2b) by neglecting ε₁ and ε₂ and solving the two equations to obtain

    β̂₁ = (Y₂ − Y₁)/(T₂ − T₁),  β̂₀ = Y₁ − β̂₁T₁    (1.3.3)

where the "hat" on β̂₀ and β̂₁ indicates estimate. The k̂ curve passes through both experimental points. Note that T₁ and T₂ cannot be the same temperature.

For n > 2, a straight line (i.e., a linear curve) cannot simultaneously pass through all the points shown in Fig. 1.4. One can, however, imagine a number of strategies to place the line. For example, one could draw a line by eye through the data. After measuring the intercept and slope, β₀ and β₁ would be estimated. This has a number of advantages including simplicity and a visual check of the "fit." Moreover, all of us have had experience with this method. There are some severe shortcomings, however, including the lack of reproducibility.
Different observers draw the line rather differently. Equally important is the disadvantage that the method does not lend itself to direct extension to more complex cases. Other relatively straightforward methods, such as the method of sequential differences, are discussed by Rabinowicz [18].

The well-known method of least squares can be utilized to meet the objections noted above. The sum of squares of the errors,

    S = Σᵢ₌₁ⁿ εᵢ² = Σᵢ₌₁ⁿ (Yᵢ − kᵢ)² = Σᵢ₌₁ⁿ (Yᵢ − β₀ − β₁Tᵢ)²    (1.3.4)

is minimized with respect to the parameters β₀ and β₁. The sum of squares function S must be equal to or greater than zero simply because it is the sum of n terms, each of which is a square. S can be zero if and only if every measurement Yᵢ is on the line. We can expand the S expression given by (1.3.4) to get (omitting, as we sometimes do, the explicit designation of limits)

    S = ΣYᵢ² − 2β₀ΣYᵢ − 2β₁ΣTᵢYᵢ + nβ₀² + 2β₀β₁ΣTᵢ + β₁²ΣTᵢ²    (1.3.5)

Note that (1.3.5) is of second degree in β₀ and β₁, showing that in the three-dimensional coordinate system S is an elliptic paraboloid with one minimum. See Fig. 1.5.

Figure 1.5 Elliptic paraboloid illustrating S given by (1.3.5).

Differentiate S with respect to β₀ and β₁ to obtain

    ∂S/∂β₀ = −2Σ(Yᵢ − β₀ − β₁Tᵢ)
    ∂S/∂β₁ = −2Σ(Yᵢ − β₀ − β₁Tᵢ)Tᵢ    (1.3.6)

Both derivatives of (1.3.6) are linear in β₀ and β₁. A necessary condition for a minimum is that both the derivatives in (1.3.6) be zero. Setting these derivatives equal to zero and solving simultaneously yields

    β̂₀ = [ΣYᵢ ΣTᵢ² − ΣTᵢ ΣTᵢYᵢ] / [n ΣTᵢ² − (ΣTᵢ)²]
    β̂₁ = [n ΣTᵢYᵢ − ΣTᵢ ΣYᵢ] / [n ΣTᵢ² − (ΣTᵢ)²]    (1.3.7)

Introduction of these values into (1.3.1) yields an estimate of k which is designated k̂. A residual is defined by

    eᵢ = Yᵢ − k̂ᵢ    (1.3.8)

which is not identical to the error, εᵢ. In this example it is implied that εᵢ is completely unknown. When this is true, the least squares procedure just given is recommended. If, however, one knows that εᵢ has a variance σᵢ² (this term is discussed in Chapter 2), some other estimation procedure might be better.
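The closed-form estimators of (1.3.7) are straightforward to implement. The following sketch (the function name fit_line is ours, not from the text) applies them to the air thermal conductivity data of Example 1.3.1.

```python
# Closed-form least-squares estimates of intercept and slope, as in (1.3.7):
#   beta0_hat = (sum(Y)*sum(T^2) - sum(T)*sum(T*Y)) / (n*sum(T^2) - sum(T)^2)
#   beta1_hat = (n*sum(T*Y) - sum(T)*sum(Y)) / (n*sum(T^2) - sum(T)^2)

def fit_line(T, Y):
    """Return (beta0_hat, beta1_hat) minimizing S = sum (Y - b0 - b1*T)^2."""
    n = len(T)
    st, sy = sum(T), sum(Y)
    stt = sum(t * t for t in T)
    sty = sum(t * y for t, y in zip(T, Y))
    d = n * stt - st * st      # nonzero as long as the T_i are not all equal
    beta0 = (sy * stt - st * sty) / d
    beta1 = (n * sty - st * sy) / d
    return beta0, beta1

# Thermal conductivity of air versus temperature (the Example 1.3.1 data):
T = [300, 350, 400, 450]
Y = [0.0255, 0.0309, 0.0350, 0.0377]
b0, b1 = fit_line(T, Y)
print(b0, b1)   # approximately 0.00175 and 0.0000814
```

Note the denominator check: if all temperatures were equal (the n = 2 restriction T₁ ≠ T₂ generalized), d would vanish and the parameters could not be separated.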
Example 1.3.1

The thermal conductivity of air in units of W/m-K versus temperature in kelvin has been measured to be the following values:

    T (K)        300      350      400      450
    k (W/m-K)    0.0255   0.0309   0.0350   0.0377

Find estimates of the parameters β₀ and β₁ in (1.3.1) using least squares.

Solution

    ΣTᵢ = 300 + 350 + 400 + 450 = 1500
    ΣTᵢ² = 300² + 350² + 400² + 450² = 575,000
    ΣYᵢ = 0.0255 + 0.0309 + 0.0350 + 0.0377 = 0.1291
    ΣTᵢYᵢ = 0.0255(300) + ... + 0.0377(450) = 49.43

Then β̂₀ and β̂₁ from (1.3.7) are

    β̂₀ = [0.1291(575,000) − 1500(49.43)] / [4(575,000) − (1500)²] = 0.00175 W/m-K
    β̂₁ = [4(49.43) − 1500(0.1291)] / [4(575,000) − (1500)²] = 0.0000814 W/m-K²

Note that the parameter estimates β̂₀ and β̂₁ have units, each being different. The residuals are −0.00067, 0.00066, 0.00069, and −0.00068, which have a zero sum. The minimum sum of squares, which is the sum of the squares of these terms, is 1.823×10⁻⁶. In calculations such as these, at least an electronic calculator usually is needed because there frequently are small differences of large numbers.

1.3.2 Linear First Order Differential Equation Model

A case for which a fundamental law can be invoked is that of dropping a thin plate initially at temperature T₀ into a fluid at T∞. From the first law of thermodynamics one can derive [19]

    dT/dt = β(T∞ − T)    (1.3.9a)

where β is considered the unknown parameter. Unlike the previous example, β has clear physical significance; it is given by the group

    β = h/(ρc_p L)

where h is the heat transfer coefficient, ρ density, c_p specific heat, and L the half-thickness of the plate. It is not possible to estimate h, ρ, c_p, and L independently when given only measurements of T. One of the concepts in estimation, called identifiability, relates to the question of which parameter or groups of parameters can be uniquely estimated. See Section 1.5 and Appendix A.

In addition to (1.3.9a) an initial condition is needed to obtain a solution; an appropriate one is

    T(0) = T₀    (1.3.9b)

The solution of (1.3.9) for constant β and T∞ is

    T(t) = T∞ + (T₀ − T∞)e^(−βt)    (1.3.10)

In the classical problem one stops at this point. In the estimation problem, measurements of T are used to estimate β. Temperature data for the cooling of a plate using a single thermocouple (a temperature sensor) and uniform time spacing are shown in Fig. 1.6.

Figure 1.6 Simulated temperature measurements of a cooled thin plate.

Note that even though the differential equation is linear, T(t,β) in (1.3.10) is a nonlinear function of β; that is, the derivative of (1.3.10) with respect to β is a function of β, unlike k given by (1.3.1). This is discussed further in Section 1.4.

The simplest method of estimation involves the use of three temperature measurements including T₀ and T∞. Observe that at least three measurements are needed although only one parameter is being estimated. This is in contrast with the preceding case for which measurements at two temperatures were sufficient to estimate two parameters. For measurements of T₀ and T∞ and Tᵢ at tᵢ, designated Y₀, Y∞, and Yᵢ, an estimate of β is

    β̂ = −(1/tᵢ) ln[(Yᵢ − Y∞)/(Y₀ − Y∞)]    (1.3.11)

As either tᵢ → 0, corresponding to Tᵢ → T₀, or tᵢ → ∞, corresponding to Tᵢ → T∞, the error in β̂ due to some small error in Tᵢ becomes very large. Thus β̂ is more sensitive to errors at some measurement times than others, which suggests the subjects of sensitivity and optimum experimental design (see Chapter 8).

If T₀ and T∞ are not precisely known they can also be considered parameters like β. They are dissimilar from β in that (a) they are particular values of the dependent variable (termed the state variable in the systems literature) and (b) repeated measurements of these are available in this particular example. One other parameter that could be estimated for this problem is the starting time. The starting time can be seen to be an unknown if one imagines several successive digital temperature measurements taken before the plate is dropped into the fluid. At the instant the plate contacts the fluid the plate's temperature rapidly changes; see Fig. 1.6. The time at which the plate contacts the fluid might not correspond to the instant at which any measurement was taken.

Suppose that the starting time is known to be zero and that all the measurements are for t > 0. For finding estimates of any combination of the parameters T₀, T∞, and β one could start with the sum of squares for n measurements

    S = Σᵢ₌₁ⁿ [Yᵢ − T∞ − (T₀ − T∞)e^(−βtᵢ)]²    (1.3.12)

and minimize it with respect to the parameters. The derivatives of (1.3.12) are linear in terms of T₀ and T∞ but nonlinear in terms of β. The nonlinearity complicates the search for a minimum. One way to minimize S with respect to a nonlinear parameter is simply to plot S versus that parameter and graphically find the minimum. This is a slow procedure, but can give insight.

Example 1.3.2

Suppose it is known that T∞ is equal to 100 and T₀ is equal to 300 and that two measurements of T are available, Y₁ = 220 at 5 sec and Y₂ = 170 at 10 sec. Estimate using least squares the parameter β in (1.3.10). Use a trial and error approach.

Solution

A first estimate of β can be obtained from (1.3.11) using the first observation. We obtain

    β̂ = −(1/5) ln[(220 − 100)/(300 − 100)] = 0.1022

Let us then evaluate the sum of squares function S in the neighborhood of that value. From (1.3.12) we can write

    S = (120 − 200e^(−5β))² + (70 − 200e^(−10β))²

which we evaluate at β = 0.1022 to find S = 3.901. Now another value of β must be tried. Let us try β = 0.1; this gives S = 14.493. Because this S value is bigger than the value for β = 0.1022, let us try a larger value than β = 0.1022. At β = 0.11, S = 32.988. Hence the minimum must be between β = 0.1 and 0.11 and is probably nearer the first value. Let us try 0.103, which gives S = 2.214. Then the minimum S must be between β = 0.1022 and 0.11. A further value of β = 0.105 yields S = 2.853, and thus β must be between 0.1022 and 0.105, which region could be explored further. One could continue further in this trial and error manner to estimate β more accurately. This is a possible approach but it is very tedious and time-consuming, particularly if more than one parameter is present. More direct methods of minimizing S are given in Chapter 7.

It is instructive to plot the function S for this case. See Fig. 1.7. Note that the minimum is near β = 0.1 and a local maximum is approached at large β. Thus in addition to ∂S/∂β being equal to zero near β = 0.1, it also approaches zero as β → ∞. Even more ill-behaved S functions are possible. See Problem 1.5.

Figure 1.7 Sum of squares function for exponential example.

1.3.3 Partial Differential Equation Example

Consider again the same physical problem of a plate dropped suddenly into a fluid. Instead of negligible internal resistance (Bi = hL/k < 0.1), assume that there is a significant variation of temperature across the plate. The describing equations for constant properties and a plate of width 2L are [19]

    k ∂²T/∂x² = ρc_p ∂T/∂t,  0 < x < 2L    (1.3.13)

    −k ∂T/∂x|x=0 = h[T∞ − T(0,t)],  ∂T/∂x|x=L = 0    (1.3.14a,b)

    T(x,0) = T₀    (1.3.15)

This is a problem which is linear in the dependent variable, T. For the estimation problem we can consider T as a function of a number of variables,

    T = T(x, t; k, ρc_p, h, T₀, T∞)    (1.3.16)

In parameter estimation one must be able to solve the model repeatedly for different parameter values. For this example an exact solution is available as an infinite series, but it may be easier to approximate the solution using a finite-difference representation. Such a solution can also be modified to treat nonlinearities entering in either the differential equation or the boundary conditions.

Note that in this example several different kinds of measurements are required: temperature, time, and length. Measurements of the initial conditions and boundary conditions may not be sufficient for parameter estimation; interior measurements may be needed (identifiability). The location of sensors and duration of the experiment are studied in connection with optimum experiments.

Another aspect of identifiability is the determination of what parameters or groups of parameters can be uniquely estimated. For example, (1.3.13) and (1.3.14) can be divided by k to yield the groups ρc_p/k and h/k. Since no term in these groups appears elsewhere in the problem, one would anticipate that these groups could be simultaneously estimated. That this is not always true can be proved by noting that this physical problem is identical to the one in Section 1.3.2 for which only the ratio of these two parameters could be estimated. It happens that Bi = hL/k must be equal to approximately one or greater in order to estimate both. There may be other conditions that would also preclude estimation for this example. The condition for identifiability is discussed in Section 1.5.

In order to estimate the parameters one can again use the sum of squares function. Instead of a single summation over time, one could have a double summation over time and sensors located at different positions,

    S = Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ [Yⱼ(i) − ηⱼ(i)]²    (1.3.17)

The subscript j is for position and i is for time. There are m discrete locations and n different times. Yⱼ(i) designates an observation and ηⱼ(i) a value obtained from the model.
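The finite-difference representation mentioned above can be sketched very simply. The following code (our own illustration; every property value is an assumption, not data from the text) marches an explicit scheme for (1.3.13)-(1.3.15) on the half-plate 0 ≤ x ≤ L, using the convection condition at x = 0 and the symmetry condition at the midplane.

```python
# Explicit finite-difference sketch of the plate problem (1.3.13)-(1.3.15),
# solved on the half-plate 0 <= x <= L: convection at x = 0, symmetry
# (insulated) at x = L. All numerical values are illustrative assumptions.

k = 10.0            # thermal conductivity, W/m-K (assumed)
rho_cp = 1.0e6      # rho * c_p, J/m^3-K (assumed)
h = 200.0           # heat transfer coefficient, W/m^2-K (assumed)
L = 0.01            # half-thickness, m (assumed); Bi = h*L/k = 0.2
T0, Tinf = 300.0, 100.0

alpha = k / rho_cp  # thermal diffusivity
nx = 21
dx = L / (nx - 1)
dt = 0.4 * dx * dx / alpha   # satisfies the explicit stability limit
T = [T0] * nx

for _ in range(200):         # march 200 time steps (2 s of cooling here)
    Tn = T[:]
    for i in range(1, nx - 1):   # interior nodes
        T[i] = Tn[i] + alpha * dt / dx**2 * (Tn[i-1] - 2.0*Tn[i] + Tn[i+1])
    # surface node: energy balance on a half-cell using -k dT/dx = h(Tinf - T)
    T[0] = Tn[0] + dt * (h * (Tinf - Tn[0]) + k * (Tn[1] - Tn[0]) / dx) \
           / (rho_cp * dx / 2.0)
    # midplane node: symmetry, mirror the interior neighbor
    T[-1] = Tn[-1] + alpha * dt / dx**2 * 2.0 * (Tn[-2] - Tn[-1])

print(T[0], T[-1])   # surface cools fastest; the midplane lags
```

Repeating such a solution for trial values of the parameters is exactly what evaluating the sum of squares (1.3.17) requires.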
I I n this section a brief introduction to sensitivity coefficients is given. Consider the true mathematical model to be given by 1 /(x,t,P) where x a nd t are independent variables a nd P is a p arameter vector. T he first derivative of 1/ with respect to f3i will be called the sensitivity coefficient for f3i a nd d esignated X i' (1.4.1 ) O n some occasions the right side of (1.4.1) is multiplied by f3i a nd still called simply a sensitivity coefficient. Sensitivity coefficients are very important because they indicate the magnitude of change of the response 1/ d ue to perturbations in the values of the parameters. I t is for this reason we have given Xi' defined by (1.4.1), the name "sensitivity coefficient." They appear in relation to many facets of parameter estimation. The reader is urged to pay particular attention to them a nd even to plot them versus their independent variables(s) if their shapes are n ot obvious. O ne a rea where the sensitivity coefficients appear I. CHAPTER I 1 NIll0DUcnON OF PARAMETER ESIlMATION is in the identifiability problem, which is briefly discussed in Section 1.5. Another area where the X/s appear is the Gauss method of linearizing the estimation problem when the model is nonlinear in terms of parameters (see Section 7.4). In the optimal design of experiments discussed in Chapter 8, the sensitivities also p laya key role. The sensitivity coefficients also. appear in a Taylor's series for TJ(PI,···,Pp,/) about the neighborhood of the point (bl,b1, . .. , b) which we shall denote b. Provided ' I has continuous derivatives near IJ ~ b, we can write a1J(b, t) a1J(b, t ) ' I( PI, . .. ,pp,t)=1J(b,t)+ ~(PI- bl )+ . .. + ~(Pp- bp) + a~(b,t) ( P I - b l )2 apl 2' l (1.4.2) 19 P2 sensitivites of this equation are (1.4.7) which contain the parameters a nd thus 'I given by (1.4.6) is nonlinear in terms of its parameters. I f, however, the only parameter of interest is PI' ' I is linear in terms of PI. 
If the derivatives ∂^(r+s)η/∂β_i^r ∂β_j^s (i, j = 1,…,p) for r + s > 1 are all zero, then η is said to be linear in the parameters. For η a linear function of β₁ and β₂, we can write

η = β₁X₁ + β₂X₂    (1.4.3)

This relation is an equality rather than an approximation if both X₁ and X₂ are not functions of the parameters. Hence η is linear in its parameters if all the sensitivity coefficients are not functions of any parameter(s).

Consider now some simple examples. The β₁, β₂, and β₃ sensitivity coefficients for the algebraic model

η_i = β₁ + β₂t_i + β₃t_i²    (1.4.4)

are, respectively,

X₁ = 1,  X₂ = t_i,  X₃ = t_i²    (1.4.5)

Since each of these is independent of all the parameters, η given by (1.4.4) is linear in its parameters. Estimation involving models linear in the parameters is generally easier and more direct than estimation involving nonlinear parameters.

Another algebraic model which occurs in many fields is

η = β₁e^(β₂t)    (1.4.6)

The β₁ and β₂ sensitivities of this equation are

X₁ = e^(β₂t),  X₂ = β₁te^(β₂t)    (1.4.7)

which contain the parameters, and thus η given by (1.4.6) is nonlinear in terms of its parameters. If, however, the only parameter of interest is β₁, η is linear in terms of β₁.

The evaluation of sensitivity coefficients need not begin with an expression for η but could be initiated with the given differential equation. For example, if the derivative of (1.3.9a) (a linear differential equation) is taken with respect to β, one obtains

dX/dt = −(T − T∞) − βX    (1.4.8a)

X(0) = 0    (1.4.8b)

Equation 1.4.8a is termed the sensitivity equation for this case and, together with (1.4.8b), constitutes a statement of the sensitivity problem. In (1.4.8a) it is assumed that T (or η in the notation of this section) is a known function obtained from a previous solution of the original differential equation and initial condition. Since β appears explicitly in (1.4.8a), the sensitivity coefficient X is a function of β. Consequently the dependent variable T is nonlinear in β, as can be verified by differentiating (1.3.10).

1.5 IDENTIFIABILITY

There are some models for which it is not possible to uniquely estimate all the parameters from measurements. Rather it is possible to estimate only certain functions of them.
This is part of the identifiability problem. See Appendix A for a derivation of an identifiability criterion. In this section several simple cases for which one cannot uniquely estimate all the parameters are discussed. Later an identifiability criterion utilizing sensitivity coefficients is introduced and related to some of the cases previously investigated.

A model that will not permit estimation of both β₁ and β₂ is

η_i = (β₁ + Aβ₂)f(t_i)    (1.5.1)

where A is a constant and f(t) is any known function of t. In this case one can only estimate β = β₁ + Aβ₂ given measurements of η_i versus t_i. In the S, β₁, β₂ space, S does not have a unique minimum, but instead has a minimum along a line which projects into β = β₁ + Aβ₂ in the β₁, β₂ plane; see Fig. 1.8.

Figure 1.8 Contours of minimum S for various cases where not all the parameters can be uniquely estimated.

Consider next the model

η_i = β₁β₂f(t_i)    (1.5.2)

From inspection we see that β₁ and β₂ can be replaced by the product β = β₁β₂, and any combination of β₁ and β₂ equal to β would yield the same value of η_i for a given t_i. In terms of the three-dimensional space of S plotted versus β₁ and β₂, there is a minimum S along a curved line which projects into β₂ = β/β₁ in the β₁, β₂ plane, as shown in Fig. 1.8.

A very similar case to (1.5.2) is

η_i = (β₁/β₂)f(t_i)    (1.5.3)

where again only the ratio β₁/β₂ is unique, and various combinations of β₁ and β₂ could be given to provide β = β₁/β₂ = constant. In the S, β₁, β₂ coordinates, the minimum of S occurs along the straight line β₁ = ββ₂ projected into the β₁, β₂ plane. In Fig. 1.8 this line passes through the origin.

A less obvious case is for the model

η_i = β₁/(β₂ + β₃t_i)    (1.5.4)

Dividing numerator and denominator by β₁ yields

η_i = 1/(α₁ + α₂t_i),  α₁ = β₂/β₁,  α₂ = β₃/β₁    (1.5.5)

where it is seen that η_i is a function of α₁, α₂, and t_i, and thus only α₁ and α₂ can be simultaneously estimated.

Another simple case where all three parameters cannot be uniquely estimated is for

η_i = β₁e^(−(β₂ + β₃t_i))    (1.5.6a)

which can be written

η_i = α₁e^(−β₃t_i),  α₁ = β₁e^(−β₂)    (1.5.6b)

where only α₁ and β₃ can be found.

There are other cases where the parameters cannot (easily) be uniquely estimated if measurements are made only over a certain range of the independent variable or at certain values. One example is

η_i = β₁ + β₂e^(t_i)    (1.5.7)

for max|t_i| small compared to unity. For such a model it is possible to estimate accurately only β₁ + β₂ if |t_i| is small. This model is thus similar to (1.5.1) for small |t_i|. For sufficiently "large" t_i both β₁ and β₂ can be estimated.

Another example is for the model

η_i = β₁t_i + β₂ sin β₃t_i    (1.5.8a)

for small β₃t_i, since then η_i can be approximated by

η_i ≈ (β₁ + β₂β₃)t_i    (1.5.8b)

Hence for small max|β₃t_i|, instead of being able to estimate uniquely all three parameters we can estimate only β₁ + β₂β₃.

Many other cases could be cited that demonstrate that only certain functions of parameters can be estimated from measurements of η_i versus its independent variable(s). Some of these cases may not be at all obvious. This is particularly true when there are a number of parameters and the model is a differential equation. Rather than depending upon being able to manipulate the model so that groups of parameters appear, we would be helped by having some criterion that could be applied to the above algebraic models and also to models involving differential equations. In the latter case we imagine that the solutions of the equations and the sensitivity coefficients are available in graphical or tabular form. It turns out in the algebraic cases above, as well as for other cases involving differential equations, that the sensitivity coefficients can provide insight into the cases for which parameters can and cannot be estimated.
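The lack of a unique minimum in these cases is easy to exhibit numerically. The sketch below uses invented data for a product model of the type of (1.5.2) with f(t_i) = t_i: every (β₁, β₂) pair with the same product gives exactly the same sum of squares S, so the minimum of S lies along a curve rather than at a point.

```python
import numpy as np

# Invented data with "true" product beta1*beta2 = 6 plus small made-up noise.
t = np.array([1.0, 2.0, 3.0, 4.0])
Y = 6.0 * t + np.array([0.1, -0.2, 0.05, 0.0])

def S(b1, b2):
    """Sum of squares for the product model eta_i = b1*b2*t_i."""
    resid = Y - b1 * b2 * t
    return float(resid @ resid)

# All of these parameter pairs have product 6.0 and give identical S:
S_a = S(2.0, 3.0)
S_b = S(1.0, 6.0)
S_c = S(12.0, 0.5)
```

Because S depends on β₁ and β₂ only through their product, no amount of data of this kind can separate the two parameters; only the product is identifiable.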
Parameters can be estimated if the sensitivity coefficients over the range of the observations are not linearly dependent. This is the criterion that we shall use to determine if the parameters can be simultaneously estimated without ambiguity. See Appendix A for a derivation of this criterion. Linear dependence occurs when for p parameters the relation

C₁X₁ + C₂X₂ + ⋯ + C_pX_p = 0    (1.5.9)

is true for all i observations and for not all the C values equal to zero.

Let us illustrate the above criterion for a few examples. For (1.5.1) note that

X₁ = f(t_i),  X₂ = Af(t_i)

and thus, if C₁ = A and C₂ = −1, (1.5.9) is satisfied. Consequently, both β₁ and β₂ cannot be estimated simultaneously.

Another example involves (1.5.4), for which

∂η_i/∂β₁ = 1/(β₂ + β₃t_i),  ∂η_i/∂β₂ = −β₁/(β₂ + β₃t_i)²,  ∂η_i/∂β₃ = −β₁t_i/(β₂ + β₃t_i)²

It is not immediately obvious from an inspection of these sensitivity relations that there is linear dependence. It can be verified, however, that if C₁ = β₁, C₂ = β₂, and C₃ = β₃, linear dependence exists; in equation form, we then have

β₁ ∂η_i/∂β₁ + β₂ ∂η_i/∂β₂ + β₃ ∂η_i/∂β₃ = 0

which form can occur in various cases with linear dependence.

The dependent variable η and the sensitivity coefficients for the model (1.5.4) are depicted in Fig. 1.9 for β₂ = 1.

Figure 1.9 Dependent variable η and sensitivity coefficients for η = β₁/(β₂ + β₃t) with β₂ = 1.

It is strongly recommended that the sensitivity coefficients be plotted and carefully examined to see if linear dependence exists or even is approached. The relation given above between the coefficients can be approximately verified by graphically adding the three together to obtain zero at each instant of time. Furthermore, note that the β₁ and β₃ sensitivities seem to have approximately proportional magnitudes for β₃t greater than 3.
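The dependence relation for (1.5.4) can also be checked numerically rather than graphically. The parameter values below are illustrative (β₂ = 1, as in Fig. 1.9).

```python
import numpy as np

# Numerical check of beta1*X1 + beta2*X2 + beta3*X3 = 0 for the model
# eta = beta1/(beta2 + beta3*t), using the sensitivity expressions above.
b1, b2, b3 = 2.0, 1.0, 1.0           # illustrative values; beta2 = 1 as in Fig. 1.9
t = np.linspace(0.0, 5.0, 11)
den = b2 + b3 * t

X1 = 1.0 / den                        # d(eta)/d(beta1)
X2 = -b1 / den**2                     # d(eta)/d(beta2)
X3 = -b1 * t / den**2                 # d(eta)/d(beta3)

combo = b1 * X1 + b2 * X2 + b3 * X3   # vanishes at every t
ratio = X3[1:] / X1[1:]               # tends toward -b1/b3 as beta3*t grows
```

The combination vanishes at every time, confirming the linear dependence, and the ratio X₃/X₁ = −β₁t/(β₂ + β₃t) approaches the constant −β₁/β₃ as β₃t grows, which is the approximate proportionality just noted.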
This means that not only is it impossible to estimate β₁, β₂, and β₃ simultaneously from measurements of η versus t, but it is also difficult to estimate only β₁ and β₃ using data for β₃t > 3 if β₂ ≤ 1.

1.6 SUMMARY AND CONCLUSIONS

1. Parameter estimation is a discipline that provides tools for the efficient use of data in aiding the mathematical modeling of phenomena and in the estimation of constants appearing in these models. The problem of estimating parameters is that of finding constants appearing in an equation describing a system, as suggested by Fig. 1.3.

2. One way to estimate the parameters for a large variety of models is to use least squares, which involves minimizing the sum of squares of differences between measurements and model values. The minimization problem can be either linear or nonlinear.

3. One cannot always independently estimate all the parameters that appear in the model. It is clear that not all the parameters may be estimated if parameters appear in groups, but in some cases not even all these groups may be found. This is related to the subject of identifiability.

REFERENCES

1. Legendre, A. M., Nouvelles Méthodes pour la Détermination des Orbites des Comètes, Paris, 1806.
2. Gauss, K. F., Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections, 1809; reprinted by Dover Publications, Inc., New York, 1963.
3. Draper, N. R. and Smith, H., Applied Regression Analysis, John Wiley & Sons, Inc., New York, 1966.
4. Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control, Holden-Day, Inc., San Francisco, 1970.
5. Myers, R. H., Response Surface Methodology, Allyn and Bacon, Inc., Boston, 1971.
6. Bard, Y., Nonlinear Parameter Estimation, Academic Press, New York, 1974.
7. Sage, A. P. and Melsa, J. L., Estimation Theory with Applications to Communications and Control, McGraw-Hill Book Co., New York, 1971.
8. Sage, A. P., Optimum Systems Control, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968.
9. Deutsch, R., Estimation Theory, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1965.
10. Bryson, A. E., Jr. and Ho, Yu-Chi, Applied Optimal Control: Optimization, Estimation and Control, Blaisdell Publishing Co., Waltham, Mass., 1969.
11. Sage, A. P. and Melsa, J. L., System Identification, Academic Press, New York, 1971.
12. Graupe, D., Identification of Systems, Van Nostrand-Reinhold Co., New York, 1972.
13. Mendel, J. M., Discrete Techniques of Parameter Estimation: The Equation Error Formulation, Marcel Dekker, Inc., New York, 1973.
14. Kmenta, J., Elements of Econometrics, The Macmillan Co., New York, 1971.
15. Bevington, P. R., Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill Book Company, New York, 1969.
16. Wolberg, J. R., Prediction Analysis, D. Van Nostrand Co., Inc., Princeton, N.J., 1967.
17. Lewis, T. O. and Odell, P. L., Estimation in Linear Models, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1970.
18. Rabinowicz, E., An Introduction to Experimentation, Addison-Wesley Publishing Co., Reading, Mass., 1970.
19. Kreith, F., Principles of Heat Transfer, 3rd ed., Intext Educational Publishers, New York, 1973.

PROBLEMS

1.1 The thermal conductivity k has been found from four independent experiments at different temperatures to be

T_i (°C):      100  200  300  400
k_i (W/m-C):    90   98  111  121

(a) Estimate β₀ and β₁ in (1.3.1), using least squares. Answer: 78.5, 0.106.
(b) Calculate the residuals. Answer: 0.9, −1.7, 0.7, 0.1.
(c) For β₀ = 80, plot S versus β₁ in the neighborhood of the minimum.

1.2 (a) Derive, using least squares, an estimate of β for the simple model η_i = β for n measurements. Assume Y_i = η_i + ε_i, ε_i being the measurement error.
(b) Also derive estimates for β₀ and β₁ for the model η_i = β₀ + β₁ sin t_i.

1.3 Some actual measurements for the specific heat c_p of Armco iron at room temperature are, in units of kJ/kg-C,

i:    1       2       3       4       5       6       7       8       9       10
c_p:  0.4287  0.4363  0.4451  0.4409  0.4442  0.4400  0.4400  0.4405  0.4375  0.4333

Using the model of Problem 1.2a, estimate c_p. Plot the residuals as a function of i. What is the sum of the residuals?

1.4 (a) For the model η = e^(βt) and the data given below, estimate β by plotting S versus β. Cover the range 0 to −20.0.
(b) Compare the curve with Fig. 1.7.
(c) Compare the residuals with the true errors (e_i = Y_i − η_i), also given below:

t_i:   0.25    0.50    0.75    1.00    1.25
Y_i:   0.419   0.204   0.159  −0.106   0.042
ε_i:  −0.053  −0.019   0.054  −0.156   0.0187

1.5 Plot S versus β₀ for the model η = 100 sin β₀t, with β₀ in radians, and for the data t₁ = 2.79, Y₁ = 34.2; t₂ = 6.98, Y₂ = 64.2; t₃ = 8.38, Y₃ = 86. Investigate the range 0 < β₀ < 1.1 using Δβ₀ increments at least as small as 0.1. (A programmable calculator would be helpful in getting the solution.) What conclusions can you draw?

1.6 How can (1.3.12) be changed to permit estimation of the starting time t₀ as well as T₀? Assume that measurements are available for t both less than and greater than t₀. Also assume that the plate has been at T₀ for a "long" time before t₀.

1.7 (a) For the model η_i = β₁/(1 + β₂t_i) and the data

t_i:  0    1   2   3
Y_i:  200  55  30  20

calculate the sum of squares S in the rectangular region 100 < β₁ < 300 and 2.0 < β₂ < 4.0. In particular, evaluate S at β₁ = 100, 200, and 300 with β₂ = 2.0, 3.0, and 4.0.
(b) Is η_i linear in β₁ and β₂?
(c) Based on the information in (a), estimate β₁ and β₂.
(d) Using the search procedure in (a), is it more or less than twice as much work to find two parameters as it is to estimate one?

1.8 The current i in the circuit of Fig. 1.10 after the switch S is closed satisfies the differential equation

L di/dt + Ri − E = 0

where L is inductance, R is resistance, and E is voltage. An initial condition is i = i₀ at t = 0.
(a) What is (are) the dependent variable(s)?
(b) What is (are) the independent variable(s)?
(c) What is (are) the state(s)?
(d) What could be termed parameters?
(e) What could be termed properties?
(f) The solution of the problem is

i = (E/R)[1 − e^(−(R/L)t)] + i₀e^(−(R/L)t)

Is i linear in E? In R? In L? In i₀?
(g) What parameters or groups of parameters can be estimated given measurements of i?

Figure 1.10 Circuit for Problem 1.8.

1.9 For the following expressions of the model η_i, indicate for the various β_j values whether η_i is linear or nonlinear in terms of them.
(a) η_i = β₁ + β₂ sin(πt_i/t₀)
(b) η_i = β₁t_i e^(−β₂t_i)
(c) η_i = β₁e^(−β₂t_i)
(d) η_i = 1 + β₁t_i + β₂t_i²

1.10 For the following expressions for the model η, derive expressions for the sensitivity coefficients. Also plot the sensitivity coefficients and η versus β₂t. For parts (b) and (c) graph η/β₁, ∂η/∂β₁, and (β₂/β₁)∂η/∂β₂ versus β₂t. (If values of β₁ and β₂ are needed, let β₁ = 2 and β₂ = 1.)
(a) η = β₁ + β₂t
(b) η = β₁ cos β₂t (0 ≤ β₂t ≤ 4π)
(c) η = β₁(1 − e^(−β₂t)) (0 ≤ β₂t ≤ 3)

1.11 For the model η = …, where t is in radians, plot the sensitivity coefficients for −2 ≤ t ≤ 2. Over what range (if any) do the parameters β₁ and β₂ appear to be linearly dependent?

1.12 Find a linear relation between the sensitivity coefficients for β₁, β₂, and β₃ for the model η = ….

1.13 Consider the model (see Fig. 1.11)

η = η₀ − β₁(t − t₀) for t ≥ t₀;  η = η₀ for t < t₀  (β₁ is positive)

Figure 1.11 η for Problem 1.13.

(a) Find and graph ∂η/∂η₀.
(b) Find and graph ∂η/∂t₀.
(c) Can t₀ and η₀ be simultaneously estimated using only two measurements of η if β₁ is known?

CHAPTER 2

PROBABILITY

2.1 RANDOM HAPPENINGS

If a room thermostat is set at 21°C, we do not expect the temperature throughout the room, or even right at the thermostat, to remain constant. Rather, we expect the temperature at any point to change continually and continuously while remaining very near 21°C. If we run a test of braking distance by repeatedly bringing a car to 55 mph and then applying the brakes, we expect the distance covered after application of the brakes to differ from trial to trial no matter how we try to make sure that the road and wind conditions and the pressure on the brake pedal are the same from trial to trial. We do hope to settle on some typical distance and perhaps on some measure of variability. In both cases, thermostat and braking distance, there are elements of stability and elements of randomness.

Example 2.1.1 As a simple example of randomness with an element of stability, let us observe successive determinations of percent defective in a sampling inspection of items from a production line. (The data to be exhibited were actually generated by computer simulation.) Successive items were inspected and declared to be either Good or Defective. The first two items were found to be Good, the third Defective, and so on. The results of the first 500 determinations are given in Table 2.1A. To make a long series of such determinations easier to contemplate, Table 2.1B gives the number of defectives found among the first n items inspected for various n up