CHAPTER 8  DESIGN OF OPTIMAL EXPERIMENTS

8.11 A plate which is subjected to a large instantaneous pulse of energy Q at x = 0 and is insulated at x = L has the solution for the temperature

T(x,t) = T_0 + \frac{Q}{cL}\left[1 + 2\sum_{n=1}^{\infty} \cos\frac{n\pi x}{L}\, e^{-n^2 t^+}\right]

where t^+ = \pi^2 \alpha t / L^2, c is the density-specific heat product, and Q has units of energy (Btu or J) per unit area. For x = 0 the temperature is infinite at time zero and decays to T_0 + Q/cL for large time. At x = L the temperature starts at T_0 and increases to T_0 + Q/cL.

(a) Find an expression for the alpha sensitivity at x/L = 1.

(b) Evaluate using a computer the expression found in (a) for 0 < t^+ < 3. For a fixed value of Q (and no restriction on the range of T) show that the optimum time to take a single measurement is t^+ = 1.38. Also show that this time corresponds to the time at which the temperature at x = L has reached one half of the maximum temperature rise. This "one-half" time is the basis of finding alpha in pulse or flash experiments. See the paper by Parker, Jenkins, Butler, and Abbott [27].

(c) Also using a computer, find the optimum experiment duration for many equally spaced measurements at x/L = 1.

8.13 (a) A large number of measurements uniformly spaced in time have been made at x = 0 and x = L in the heat-conducting body discussed in Section 8.5.2.2. For m_0 and m_1 sensors at x = 0 and L, respectively, show that Delta^+ given by (8.3.7) can be written as

\Delta^+ = [zC_{11,0} + (1-z)C_{11,1}][zC_{22,0} + (1-z)C_{22,1}] - [zC_{12,0} + (1-z)C_{12,1}]^2

where z = m_0/m and 1 - z = m_1/m. The third subscript in C_{ij,0} or C_{ij,1} refers to x = 0 or x = L, respectively. The standard statistical assumptions are valid.

(b) Derive an expression for z at which Delta^+ is a maximum, assuming that z can assume any value in the interval 0 to 1.

(c) The following values are for the heat-conducting body discussed in Section 8.5.2.2:

C_{11,0} = 0.07609    C_{11,1} = 0.1062
C_{22,0} = 0.1552     C_{22,1} = 0.126
C_{12,0} = -0.0422    C_{12,1} = 0.0148

The values correspond to the dimensionless time t^+ = 0.65. The first two subscripts correspond to k (a 1 subscript) or c (a 2 subscript). Using the expression derived in part (b), find a value for z.

(d) What conclusions can you draw from the results of this problem?
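Problem 8.11(b) asks for a computer evaluation. The following is a minimal Python sketch of that computation, assuming the series solution and the normalization t^+ = \pi^2 \alpha t / L^2 given in the problem statement; the function names and the 50-term truncation are illustrative choices, not part of the text.

```python
import numpy as np

def T_plus(tp, terms=50):
    # Dimensionless temperature rise at x/L = 1, with t+ = pi^2 * alpha * t / L^2
    n = np.arange(1, terms + 1)
    return 1.0 + 2.0 * np.sum((-1.0) ** n * np.exp(-n**2 * tp))

def alpha_sensitivity(tp, terms=50):
    # Scaled sensitivity alpha * dT+/d(alpha) = t+ * dT+/dt+ at x/L = 1
    n = np.arange(1, terms + 1)
    return -2.0 * tp * np.sum((-1.0) ** n * n**2 * np.exp(-n**2 * tp))

tps = np.linspace(0.01, 3.0, 3000)
X = np.array([alpha_sensitivity(t) for t in tps])
T = np.array([T_plus(t) for t in tps])

print("t+ maximizing |X_alpha|:", tps[np.argmax(np.abs(X))])
print("half-rise time t+ (T+ = 0.5):", tps[np.argmin(np.abs(T - 0.5))])
```

The two printed times can be compared with the value t^+ = 1.38 quoted in the problem.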
APPENDIX A

IDENTIFIABILITY CONDITION

A.1 INTRODUCTION

The problem of investigating the conditions under which parameters can be uniquely estimated is called the identifiability problem. A convenient means of anticipating slow convergence or even nonconvergence in estimating parameters can save unnecessary time and expense. Also, if easy-to-apply identifiability conditions are known, insight can often be provided to avoid the problem of nonidentifiability, through either the use of a different experiment or a smaller set of parameters that are identifiable.

The purpose of this appendix is to derive the identifiability criterion that the sensitivity coefficients in the neighborhood of the minimum sum of squares function must be linearly independent over the range of the measurements. This criterion applies for linear and nonlinear estimation. It is derived only for a weighted sum of squares function, which includes least squares, weighted least squares, and ML estimation with normal errors, in each case with no constraints on the parameters. For MAP estimation with prior parameter information it might be possible to estimate the parameters even if the sensitivity coefficients are linearly dependent.

This condition of independence of the sensitivity coefficients is particularly convenient if the number of parameters is not large, say, less than six. Even if the number is larger, linear dependence between two or three of the parameters can sometimes be readily detected from graphs of the sensitivity coefficients. The plotting of the coefficients is extremely important and should be done for each new problem before attempting to estimate the parameters.

A.2 THEORY

Consider a general sum of squares function for measurements given by

S = \sum_{u=1}^{n} \sum_{v=1}^{n} (Y_u - \eta_u)\, w_{uv}\, (Y_v - \eta_v)    (A.1)

where w_{uv} is an element of W, a square, symmetric, positive-definite matrix. Let the function S possess continuous derivatives in the neighborhood of its minimum in the parameter space, which occurs when \eta is evaluated at \beta^*.

A Taylor series expansion of S in the neighborhood of its minimum is

S(\beta) = S(\beta^*) + \sum_{i=1}^{p} S'_{\beta_i}(\beta_i - \beta_i^*) + \frac{1}{2}\sum_{i=1}^{p}\sum_{j=1}^{p} S''_{\beta_i\beta_j}(\beta_i - \beta_i^*)(\beta_j - \beta_j^*) + \cdots    (A.2)

where

S'_{\beta_i} = \left.\frac{\partial S}{\partial \beta_i}\right|_{\beta^*}    (A.3a)

S''_{\beta_i\beta_j} = \left.\frac{\partial^2 S}{\partial \beta_i\, \partial \beta_j}\right|_{\beta^*}    (A.3b)

Using S defined by (A.1) in (A.3) gives

S'_i = -2 \sum_{u=1}^{n}\sum_{v=1}^{n} w_{uv}(Y_v - \eta_v^*)\, X^*_{ui}    (A.4a)

S''_{ij} = 2 \sum_{u=1}^{n}\sum_{v=1}^{n} w_{uv}\left[X^*_{ui} X^*_{vj} - (Y_v - \eta_v^*)\, X^*_{uij}\right]    (A.4b)

where

X^*_{ui} = \left.\frac{\partial \eta_u}{\partial \beta_i}\right|_{\beta^*}, \qquad X^*_{uij} = \left.\frac{\partial^2 \eta_u}{\partial \beta_i\, \partial \beta_j}\right|_{\beta^*}    (A.4c)

The expression X_{ui} in (A.4a) is called a sensitivity coefficient. For a model linear in the parameters the cross-derivative X_{uij} in (A.4b) is equal to zero, as are also the third and higher order derivatives of S. Note that the condition of continuous derivatives of S with respect to \beta will be satisfied if \eta and its derivatives are continuous functions of \beta.

A necessary condition for S to possess a minimum at \beta^* is that

S'_{\beta_i} = 0 \quad \text{for } i = 1, 2, \ldots, p    (A.5)

Define the determinant D_r as

D_r = \det\left[S''_{ij}\right], \quad i, j = 1, 2, \ldots, r    (A.6)

Then S, approximated by the terms explicitly given by (A.2), has a unique local minimum if, in addition to (A.5) being true, it is also true that

D_r > 0 \quad \text{for } r = 1, 2, \ldots, p    (A.7)

which is the condition that the matrix of the S''_{ij} be positive definite; see reference 1. A minimum can exist with weaker conditions, however. For example, if D_r = 0 a minimum may exist but it may not be unique; that is, the minimum could be along a line rather than at a point. The conditions given by (A.5) and (A.7) are necessary and sufficient conditions for a unique local minimum.
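The positive-definiteness test of (A.6) and (A.7) is easy to check numerically. Below is a minimal Python sketch that evaluates the leading principal minors D_r for a hypothetical 3 x 3 matrix [S''_{ij}] chosen only for illustration.

```python
import numpy as np

# Hypothetical symmetric matrix [S''_ij] evaluated at the minimum (illustrative values).
Spp = np.array([[2.0, 0.5, 0.1],
                [0.5, 1.5, 0.3],
                [0.1, 0.3, 1.0]])

# Leading principal minors D_r of (A.6); all must be positive for a unique
# local minimum, condition (A.7).
for r in range(1, 4):
    print(r, np.linalg.det(Spp[:r, :r]))
```

All three minors are positive for this example, so the quadratic approximation (A.2) has a unique local minimum.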
We wish to relate the conditions D_r > 0 and D_r = 0 to the sensitivity coefficients. Let us define the normalized quantities

S''^+_{ij} = \frac{S''_{ij}}{(S''_{ii}\, S''_{jj})^{1/2}}    (A.8a)

D_r^+ = \det\left[S''^+_{ij}\right], \quad i, j = 1, 2, \ldots, r    (A.8b)

Then using (A.5), S'_i = 0 for all i, (A.2) can be written

S(\beta) - S(\beta^*) \approx \frac{1}{2}\sum_{i=1}^{p}\sum_{j=1}^{p} S''_{ij}\, \Delta\beta_i\, \Delta\beta_j, \qquad \Delta\beta_i = \beta_i - \beta_i^*    (A.9)

where, in the neighborhood of the minimum,

S''_{ij} = 2 \sum_{u=1}^{n}\sum_{v=1}^{n} w_{uv}\left[X_{ui} X_{vj} - (Y_v - \eta_v^*)\, X_{uij}\right]    (A.10)

Now (A.9) is a quadratic form, and if a unique minimum is to exist it is necessary that all the determinants

D_r^+, \quad r = 1, 2, \ldots, p    (A.11)

be greater than zero.

Suppose that a minimum exists at \beta^* but that the minimum is not unique; for example, it exists along a line or in a plane. This results in D_r^+ = 0 for some r. Suppose first that the term in (A.10) involving X_{uij} is negligible in its contribution to S''_{ij}. Notice that this term becomes negligible as the residuals Y_v - \eta_v^* become small, but this is not true for the X_{ui} X_{vj} terms. (For linear-in-the-parameters cases, X_{uij} is always zero.) Furthermore, assume that

w_{uv} = \sigma_u^{-2} \text{ for } u = v, \qquad w_{uv} = 0 \text{ for } u \neq v    (A.12)

Then S''_{ij} given by (A.10) becomes

S''_{ij} = 2 \sum_{u=1}^{n} (\sigma_u^{-1} X_{ui})(\sigma_u^{-1} X_{uj})    (A.13)

The summation in (A.13) can be considered to form an inner product involving vectors a_i, i = 1, 2, \ldots, r:

a_i = \left[\sigma_1^{-1} X_{1i} \;\; \sigma_2^{-1} X_{2i} \;\; \cdots \;\; \sigma_n^{-1} X_{ni}\right]^T    (A.14)

Use this interpretation in (A.13) and introduce (A.13) into (A.11) to get, within positive normalizing factors,

D_r^+ = \det\left[a_i^T a_j\right], \quad i, j = 1, 2, \ldots, r    (A.15)

which can be considered a Gram determinant of a_1, a_2, \ldots, a_r. It is known that D_r^+ is equal to zero if and only if the vectors a_i are linearly dependent, which means

\sum_{i=1}^{r} C_i\, a_i = 0    (A.16a)

or

\sum_{i=1}^{r} C_i\, X_{ki} = 0 \quad \text{for } k = 1, 2, \ldots, n    (A.16b)

for not all C_i being equal to zero. In other words, if the (continuous) sensitivity coefficients are linearly dependent in the neighborhood of the minimum, there is no unique minimum, and all the r parameters cannot be simultaneously and uniquely estimated. This is the desired relation. Note, however, that this result assumes that the term involving X_{uij} in (A.10) can be dropped, that w_{uv} is given by (A.12), that there is no prior information, and that there are no parameter constraints.

Suppose (A.16) is written in the form

X_{ur} = \sum_{j=1}^{r-1} \alpha_j X_{uj}, \quad u = 1, 2, \ldots, n    (A.17)

where at least one \alpha_j is not equal to zero. Also form the summation involving a_r,

S''_{rj} = 2 \sum_{u=1}^{n} (\sigma_u^{-1} X_{ur})(\sigma_u^{-1} X_{uj})    (A.18)

In vector form, (A.17) together with the definition (A.14) yields

a_r = \sum_{j=1}^{r-1} \alpha_j\, a_j    (A.19)

Using (A.17) and (A.19) in (A.18) then produces

S''_{rj} = \sum_{i=1}^{r-1} \alpha_i\, S''_{ij}, \quad j = 1, 2, \ldots, r    (A.20)

We have shown, for linear dependence of the sensitivity coefficients, (A.17), that a given column of the square matrix in D_r^+ given by (A.11) can be considered to be a linear combination of the other columns. But if any column of a square matrix is a linear combination of the other columns, then the determinant of that matrix is zero. Consequently, the sum of squares function S does not have a unique minimum in the \beta_1, \beta_2, \ldots, \beta_r space, and thus not all of these parameters can be uniquely determined.

The results given above apply for r equal to 1 to p parameters. Note, however, that if the linear dependence condition of the sensitivities given by (A.17) is satisfied for r < p, then it is also satisfied for r + 1, r + 2, \ldots, p, because C_{r+1}, C_{r+2}, \ldots, C_p can be set equal to zero in (A.17).
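The Gram-determinant argument can be seen numerically. In the minimal Python sketch below, two hypothetical sensitivity-coefficient columns are proportional (an illustrative case, not from the text), so the determinant of X^T W X vanishes and one eigenvalue is numerically zero.

```python
import numpy as np

# Hypothetical sensitivity matrix: three measurement times, two parameters whose
# sensitivity coefficients are proportional (second column = 2 * first column),
# i.e., linearly dependent as in (A.17).
X = np.array([[0.1, 0.2],
              [0.4, 0.8],
              [0.7, 1.4]])
W = np.eye(3)   # unit weighting, w_uv = delta_uv with sigma_u = 1 as in (A.12)

# Gram determinant of the weighted sensitivity vectors; ~0 here, so there is
# no unique minimum and the two parameters are not simultaneously identifiable.
print(np.linalg.det(X.T @ W @ X))

# One eigenvalue is (numerically) zero, anticipating Section A.4.
print(np.linalg.eigvalsh(X.T @ W @ X))
```

Re-weighting by any positive-definite W leaves the determinant at zero, in line with comment (c) below.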
A.3 COMMENTS

(a) Parameters cannot all be uniquely estimated, for \eta being linear or nonlinear in the parameters, if the sensitivity coefficients are linearly dependent over the range of the measurements. This is true if (i) the S function is formed by some weighted least squares function, (ii) the sensitivities are continuous functions of the parameters, (iii) there is no prior information regarding the parameters, and (iv) there are no constraints on the parameters.

(b) If the X_{uij} term in (A.10) is negligible or is dropped, the determinant D_r^+ is proportional to |X^T W X|, which must not be zero when attempting to estimate parameters using the Gauss method discussed in Section 7.4. Hence, using the Gauss method, none of the parameters can be estimated if the sensitivities are linearly dependent. Other methods might permit one to obtain certain parameters, but not all, since there would be no unique minimum of S if the conditions in (a) above are true.

(c) Regardless of the form of W, if |X^T X| = 0 it is also true that |X^T W X| = 0. Hence if the sensitivities are linearly dependent, there is no choice of W possible that will cause |X^T W X| to be not zero.

(d) If |X^T X| \neq 0, then |X^T W X| may or may not be equal to zero. But if ML estimation is used as mentioned in (b), |X^T W X| would not be zero if |X^T X| \neq 0.

A.4 RELATION TO EIGENVALUES

The determinant D_p^+ is numerically equal to the product of its eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_p,

D_p^+ = \lambda_1 \lambda_2 \cdots \lambda_p    (A.21)

Now the matrix in (A.11) is real and symmetric, which results in all the eigenvalues being real. Also, because of (A.21), D_p^+ will be equal to zero if and only if at least one of the \lambda_i values is equal to zero. Since S''^+_{ij} is normalized, the scale of the \lambda_i values (or the choice of their units) is unimportant; it is the relative magnitudes of the \lambda_i values that are significant. If one value is much smaller than the others (but not zero), the X^T W X matrix is probably ill-conditioned and the minimum of S is not well-defined. This would occur when there is "almost" linear dependence of the sensitivity coefficients. In such cases there will be relatively large inaccuracy (large variances) in the parameters.

Consider the case of two parameters. The eigenvalues \lambda_1 and \lambda_2 are the roots of

\begin{vmatrix} S''^+_{11} - \lambda & S''^+_{12} \\ S''^+_{12} & S''^+_{22} - \lambda \end{vmatrix} = 0    (A.22)

which are

\lambda_{1,2} = \frac{1}{2}\left\{(S''^+_{11} + S''^+_{22}) \mp \left[(S''^+_{11} + S''^+_{22})^2 - 4\Delta\right]^{1/2}\right\}    (A.23)

where \Delta is the determinant in (A.22) with \lambda set equal to zero. Let \lambda_1 be the smaller eigenvalue. Then the ratio of the eigenvalues, \lambda_1/\lambda_2, is always between zero and one; \lambda_1/\lambda_2 is given by

\frac{\lambda_1}{\lambda_2} = \frac{1 - (1 - \xi)^{1/2}}{1 + (1 - \xi)^{1/2}}    (A.24)

where

\xi = \frac{4\Delta}{(S''^+_{11} + S''^+_{22})^2}    (A.25)

which is also limited to between zero and one. For small \xi it can be demonstrated that

\frac{\lambda_1}{\lambda_2} \approx \frac{\xi}{4}    (A.26)

Only for S''^+_{11} = S''^+_{22} and S''^+_{12} = 0 does \lambda_1/\lambda_2 equal unity; \xi equals one only under the same conditions. The above analysis suggests, for more than two parameters as well, that smallness of

\xi = \frac{4\left[S''^+_{11} S''^+_{22} - (S''^+_{12})^2\right]}{(S''^+_{11} + S''^+_{22})^2}    (A.27)

could be used to see if there is near linear dependence of the sensitivity coefficients. (Note that if the X_{uij} term in (A.10) can be dropped, the components of (X^+)^T W X^+ are given by S''_{ij}.) When \xi goes to zero, at least one eigenvalue is equal to zero; the maximum value of \xi is unity. Thus, in addition to plotting the sensitivity coefficients, one could examine \xi to see if it is near zero. If it is, the experiment is poorly designed, and one or more of the parameters should not be estimated, but rather certain groups of parameters. If possible, the experiment should be redesigned so that \xi is not so small. However, the recommended criterion for accomplishing this is not \xi, but rather the numerator of (A.27), subject to certain constraints. See Chapter 8.
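For two parameters, (A.24)-(A.27) can be verified directly. The following minimal Python sketch computes \xi and the eigenvalue ratio \lambda_1/\lambda_2 for a hypothetical normalized matrix [S''^+_{ij}] with strongly correlated sensitivities (the values are illustrative), and checks the ratio against a direct eigenvalue computation.

```python
import numpy as np

# Hypothetical normalized matrix [S''+_ij] for two parameters; the large
# off-diagonal entry models nearly dependent sensitivity coefficients.
S = np.array([[1.00, 0.95],
              [0.95, 1.00]])

xi = 4.0 * np.linalg.det(S) / np.trace(S) ** 2          # (A.25) / (A.27)
ratio = (1 - np.sqrt(1 - xi)) / (1 + np.sqrt(1 - xi))   # (A.24)

lam = np.linalg.eigvalsh(S)                 # direct eigenvalues as a check
print(xi, ratio, lam[0] / lam[1])           # small xi flags near-dependence
```

Here \xi \approx 0.0975 and \lambda_1/\lambda_2 \approx 0.026, flagging near linear dependence; note that \xi/4 \approx 0.024 is already close to the exact ratio, as (A.26) suggests.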
REFERENCES

1. Beveridge, G. S. G. and Schechter, R. S., Optimization: Theory and Practice, McGraw-Hill Book Company, New York, 1970, p. 217.

APPENDIX B

ESTIMATORS AND COVARIANCES FOR VARIOUS ESTIMATION METHODS FOR THE LINEAR MODEL \eta = X\beta

For each method the entries below give the assumptions used, the estimator, the covariance matrix of b, cov(\hat{y}) for \hat{y} = X_1 b, and the usual estimator for unknown \sigma^2.

Ordinary least squares (OLS)
  Assumptions 11--11 (\psi = cov(\epsilon) general):
    Estimator: b_LS = (X^T X)^{-1} X^T Y
    Covariance of b: (X^T X)^{-1} X^T \psi X (X^T X)^{-1}
    cov(\hat{y}): X_1 (X^T X)^{-1} X^T \psi X (X^T X)^{-1} X_1^T
  Assumptions 1111-11 (\psi = \sigma^2 I):
    Estimator: same as above
    Covariance of b: \sigma^2 (X^T X)^{-1}
    cov(\hat{y}): \sigma^2 X_1 (X^T X)^{-1} X_1^T
    Estimator for unknown \sigma^2: s^2 = (Y - \hat{Y})^T (Y - \hat{Y})/(n - p), where \hat{Y} = X b_LS

Maximum likelihood (ML)
  Assumptions 11-1111 (\psi known):
    Estimator: b_ML = (X^T \psi^{-1} X)^{-1} X^T \psi^{-1} Y
    Covariance of b: (X^T \psi^{-1} X)^{-1}
    cov(\hat{y}): X_1 (X^T \psi^{-1} X)^{-1} X_1^T
  Assumptions 11-1011 (\psi = \sigma^2 \Omega, \Omega known):
    Estimator: b_ML = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} Y
    Covariance of b: \sigma^2 (X^T \Omega^{-1} X)^{-1} (\sigma^2 assumed known)
    cov(\hat{y}): \sigma^2 X_1 (X^T \Omega^{-1} X)^{-1} X_1^T
    Estimator for unknown \sigma^2: s^2 = (Y - \hat{Y})^T \Omega^{-1} (Y - \hat{Y})/(n - p), where \hat{Y} = X b_ML

Gauss-Markov
  Assumptions 11-011:
    Estimator: b_GM = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} Y
    Covariance of b: \sigma^2 (X^T \Omega^{-1} X)^{-1}
    cov(\hat{y}): \sigma^2 X_1 (X^T \Omega^{-1} X)^{-1} X_1^T
    Estimator for unknown \sigma^2: s^2 = (Y - \hat{Y})^T \Omega^{-1} (Y - \hat{Y})/(n - p), where \hat{Y} = X b_GM

Maximum a posteriori (MAP)
  Assumptions 11-1112:
    Estimator: b_MAP = \mu_\beta + P_MAP X^T \psi^{-1} (Y - X\mu_\beta), where P_MAP = (X^T \psi^{-1} X + V_\mu^{-1})^{-1}
    Covariance: cov(b - \beta) = P_MAP
    cov(\hat{y}): X_1 P_MAP X_1^T
  Assumptions 11-1012 (\psi = \sigma^2 \Omega):
    Estimator: b_MAP = \mu_\beta + \tilde{P}_MAP X^T \Omega^{-1} (Y - X\mu_\beta), where \tilde{P}_MAP = [X^T \Omega^{-1} X + \sigma^2 V_\mu^{-1}]^{-1}
    Covariance: cov(b - \beta) = \sigma^2 \tilde{P}_MAP (\sigma^2 assumed known)
    cov(\hat{y}): \sigma^2 X_1 \tilde{P}_MAP X_1^T
    Estimator for unknown \sigma^2: \hat{\sigma}^2 = (Y - \hat{Y})^T \Omega^{-1} (Y - \hat{Y})/n, where \hat{Y} = X b_MAP (note that iteration is required for \hat{\sigma}^2)

\psi = cov(\epsilon) for the assumption strings denoted 11--. (See Section 6.1.5 or the inside rear cover for the list of standard assumptions.)
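The formulas above can be exercised numerically. Below is a minimal Python sketch of the OLS, ML, and MAP estimators for \eta = X\beta, using simulated data and a hypothetical prior; all names and numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear model Y = X beta + eps with known error covariance psi.
n, p = 20, 2
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
beta_true = np.array([1.0, 3.0])
psi = 0.04 * np.eye(n)      # here independent errors with sigma^2 = 0.04
Y = X @ beta_true + rng.multivariate_normal(np.zeros(n), psi)

# OLS: b = (X^T X)^{-1} X^T Y
b_ols = np.linalg.solve(X.T @ X, X.T @ Y)

# ML with psi known: b = (X^T psi^{-1} X)^{-1} X^T psi^{-1} Y
W = np.linalg.inv(psi)
b_ml = np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)
cov_ml = np.linalg.inv(X.T @ W @ X)     # covariance of b_ML

# MAP with prior mean mu and prior covariance V:
# b = mu + (X^T psi^{-1} X + V^{-1})^{-1} X^T psi^{-1} (Y - X mu)
mu = np.zeros(p)
V = np.eye(p)
P_map = np.linalg.inv(X.T @ W @ X + np.linalg.inv(V))
b_map = mu + P_map @ (X.T @ W @ (Y - X @ mu))

print(b_ols, b_ml, b_map)
```

With independent constant-variance errors, as here, the OLS and ML estimates coincide, while the MAP estimate is pulled toward the prior mean by an amount governed by V.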
")T,,, ")T,,, U.,/2 V( · ) Vb V~ W lj W x X l - Du 1 00(1- a/2) percentage point of the normal distribution Variance operator; V (A)- E{ [ A - £(A) t } Covariance matrix of b, ( p X pI Covariance matrix of I'~, ( pxp] (I'~ is prior vector of fJ) Component o f weighting matrix W Weighting matrix; for ML, W =",-I. ( nxn] Coordinate o r independent variable I f " is linear in the Sensitivity matrix; X = (V~" XfJ. (V ~" reduces to X. parameter as in ,,= Tt Ty APPENDIX C L IST OF S YMBOLS A PPENDIX Y O bservation v ector, ( nX II Pr~dicted v ector o f o bservations, (n X I I; f or l inear c ase Z M odified s ensitivity m atrix, Z = D -1X; Y Y =Xb In x pl 1:)_____________________________________ S OME ESTIMATION PROGRAMS GREEK SYMBOLS 0/ Associated with confidence interval o r region percent .confidence; see Section 6.8 See Section 7.6 for parameter related to reducing the interval for calculating S (·' Parameter vector. [ p X I) G amma function; see Section 6.8 Optimum experiment criterion (see Chapter 8); for standard assumptions of Y =,,+I:, £ (1:)=0, I: with normal density. known independent variable values. tit known within a multiplicative constant, we have 11= IXT", - IXI Error vector, [n X I) ; usually Y = " + t Expected value vector, regression vector, model vector, (n X I) Moving average parameter Eigenvalue. Section 6.8 Parameter vector known from prior information [p x I) Autoregressive parameter Correlation coefficient; (2 .6. 17) Standard deviation of constant variance observation errors Constant variance of observation errors Variance of E, ; V (Ej )=O/ a;. V arianceofu, ; V(Uj)=o~ a a P f (·) 11 t lJ 9 >. 1'/3 p P a 0 2 0,2 ~ X2 tit o Vari~nce of E j • used for the AR case designated a I; 0,2 = o~(I- p2) - I. See below (6 .9.9) . ~iagonal matrix; usually </>= £ (uu T ) for £ (u)=O a nd where £ (u;"1)=O for i =t:j ; [ nxn) Chi-squared statistic Covariance matrix of the observation errors; for £ (t)=O. " '= £ (u T ) Known part of tit . as in tit = 0 20 where 0 2 is unknown; [n x n) In this appendix a few c omputer programs are referenced. Many others are available. F or additional references see Himmelblau ( 01. pp. 170, 171,203), Bard [ 02 , pp. 323, 324), and Kuester a nd Mize [D3). LINEAR E STIMATION P ROGRAMS U NFIT A linear least squares program with optional constraints to make the parameters nonnegative, a dd to a constant, etc. This is o ne o f eighteen statistical routines written by 1. R . Miller ( 04). U NREG A linear least squares program that is described in reference 0 3 where an example a nd the listing are given .. O MNIT AB A general purpose computer program for statistical and numerical analysis [OS) . NONLINEAR ESTIMATION P ROGRAMS O THER S YMBOLS . ,. Vp () M atrix d erivative o perator. V /3=(3;3p , . . . 3;3Pp l B ARD A nonlinear least squares program that uses the Gauss method [ 03, p . 218). BSOLVE A nonlinear least squares program that uses Marquardt's method ( 03). N UN IBM Share Program SO 3094 written by Marquardt and others. Written in F ORTRAN IV for IBM 7040. Uses M arquardt's method with derivatives o r finite difference approximations to solve weighted least squares problems. 493 APPENDIX D SOME E STIMAnON PROGRAMS NLiNA This is a program written at Michigan State University by 1. V. Beck and available from him. It uses the sequential and Box-Kanemasu modifications of the Gauss method. SSQMIN This program uses the Powell procedure and is discussed in reference OJ. Index N CJ1 CD REFERENCES D I. 02. 03. 0 4. OS. Himmelblau. O. 
REFERENCES

D1. Himmelblau, D. M., Process Analysis by Statistical Methods, John Wiley & Sons, Inc., New York, 1970.
D2. Bard, Y., Nonlinear Parameter Estimation, Academic Press, Inc., New York, 1974.
D3. Kuester, J. L. and Mize, J. H., Optimization Techniques with Fortran, McGraw-Hill Book Co., New York, 1973.
D4. Miller, J. R., On-Line Analysis for Social Scientists, MAC-TR-40, Project MAC, Massachusetts Institute of Technology, Cambridge, Mass., 1967.
D5. Hilsenrath, J., Ziegler, G., Messina, C. G., Walsh, P. J., and Herbold, R., OMNITAB, A Computer Program for Statistical and Numerical Analysis, Nat. Bur. of Std. Handbook 101, U.S. Government Printing Office, Washington, D.C., 1966. Reissued Jan. 1968, with corrections.

Index

Abbott, G. L., 415, 480
Abramowitz, M., 11
Al-Araji, S., 263, 319
Analysis of covariance, 131
Analysis of variance, 130, 131, 175, 178
Aris, R., 414
Arkin, H., 78
Assumptions, Gauss-Markov, 134, 232
  standard, 134, 228, 229
  violation of, 185-204, 290-319, 393, 400, 401, 459, 460
Atkinson, A. C., 435, 438, 439, 474
Autocovariance, 59
Bacon, D. W., 319, 414
Badavas, P. C., 432, 474
Bard, Y., 4, 24, 335, 362, 364, 375, 386, 411, 414, 472, 475, 493, 494
Bayesian estimation, 97-101. See also Maximum a posteriori estimation
Bayes' theorem, 46, 47, 160, 164, 210
Beale, E. M. L., 414
Beck, J. V., 263, 319, 415, 474, 475, 494
Beveridge, G. S. G., 338, 414, 481
Bevington, P. R., 24
Beyer, W. H., 78, 319
Bias, 89
Bias error, 180
Bonacina, C., 474
Booth, G. W., 414
Box, G. E. P., 24, 114, 129, 162, 204, 229, 232, 319, 359, 363, 364, 369-376, 380, 386, 414, 415, 419, 432, 438, 439, 469, 470, 474, 475
Box, M. J., 4
Box-Kanemasu interpolation method, 362-371, 381, 494
Box-Muller transformation, 126
Brownlee, K. A., 204, 226
Bryson, A. E., Jr., 24
Burington, R. S., 78, 204, 319, 415
Butler, C. P., 475, 480
Cannon, J. R., 474
Carslaw, H. S., 474
Central limit theorem, 64, 67, 186
Chebyshev's inequality, 62
Chi-squared test, 268, 269
Cochran's theorem, 176
Coefficient of multiple determination, 173-175
Colored errors, see Errors, correlated