* * $Id: mnhess.fdoc,v 1.1 2002/05/14 15:32:56 mf Exp $ * * $Log: mnhess.fdoc,v $ * Revision 1.1 2002/05/14 15:32:56 mf * Added Minuit source, plus other stuff from a year ago. * * Revision 1.1.1.1 1996/03/07 14:31:30 mclareni * Minuit * * #include "minuit/pilot.h" SUBROUTINE MNHESS(FCN,FUTIL) #include "minuit/d506dp.inc" CC Calculates the full second-derivative matrix of FCN CC by taking finite differences. When calculating diagonal CC elements, it may iterate so that step size is nearly that CC which gives function change= UP/10. The first derivatives CC of course come as a free side effect, but with a smaller CC step size in order to obtain a known accuracy. CC #include "minuit/d506cm.inc" EXTERNAL FCN,FUTIL DIMENSION YY(MNI) LOGICAL LDEBUG CHARACTER CBF1*22 C LDEBUG = (IDBG(3) .GE. 1) IF (AMIN .EQ. UNDEFI) CALL MNAMIN(FCN,FUTIL) Set up internal parameters and find starting value for minimum at that point. IF (ISTRAT .LE. 0) THEN NCYC = 3 TLRSTP = 0.5 TLRG2 = 0.1 ELSE IF (ISTRAT .EQ. 1) THEN NCYC = 5 TLRSTP = 0.3 TLRG2 = 0.05 ELSE NCYC = 7 TLRSTP = 0.1 TLRG2 = 0.02 ENDIF This just establishes how much work to do... it would be better to have an estimator and specify the accuracy needed (or some tradeoff) but then you have to know how sensitive you will be to errors, etc. That may have been impractical. IF (ISW(5).GE.2 .OR. LDEBUG) WRITE (ISYSWR,'(A)') + ' START COVARIANCE MATRIX CALCULATION.' CFROM = 'HESSE ' NFCNFR = NFCN Number of times the function has been invoked CSTATU= 'OK ' NPARD = NPAR C make sure starting at the right place This comment is a bit cryptic... Note, however, that unless some other routine modifies NPAR as a side effect, NPARD and NPARX are always simply NPAR CALL MNINEX(X) Applies inversion of limits to the internal external variables where necessary, to modify the corresponding external parameters. NPARX = NPAR CALL FCN(NPARX,GIN,FS1,U,4,FUTIL) FS1 is then f(u) NFCN = NFCN + 1 IF (FS1 .NE. AMIN) THEN Unless you call mnhess just after mnamin, this could be the case. But the implecation is that it would differ only by a small amount. DF = AMIN - FS1 WRITE (CBF1(1:12),'(G12.3)') DF CALL MNWARN('D','MNHESS', + 'function value differs from AMIN by '//CBF1(1:12) ) ENDIF AMIN = FS1 So in any event, amin = f(u) IF (LDEBUG) WRITE (ISYSWR,'(A,A)') ' PAR D GSTEP ', +' D G2 GRD SAG ' C . . . . . . diagonal elements . C ISW(2) = 1 if approx, 2 if not posdef, 3 if ok C AIMSAG is the sagitta we are aiming for in second deriv calc. sagitta means the difference between the quardratic curve and the (hyper)chord representing the linearization of f. Thus a small sagitta means we are aiming for a small inaccuracy in the quantity, in this case the second derivative (I think) xxxx AIMSAG = SQRT(EPSMA2)*(ABS(AMIN)+UP) AIMSAG is sqrt(2) e**.25 * ( | f(u) | + UP ) This is an indirect way to adjust the differnce size h (D in our case) to an optimal value for minimizing the error in the second derivative. The h which you would want is some factor (which I believe is about 2.5-5 depending on how big eps of f is) times epsilon_f^.25 times a characteristic curvature cx = (f/f'''')^.25. S is then D^2*g/2 which is {3-12} * sqrt(eps) times scale of f / curvature of f ^2 * cx^2. And now we make the key observation that the scale of f involves not only f(u) but also f at nearby points, and use |f(u)|+UP. C Zero the second derivative matrix NPAR2 = NPAR*(NPAR+1)/2 DO 10 I= 1,NPAR2 10 VHMAT(I) = 0. VHMAT is logically M(i,j) symmetric on i and j; the indices are internal indices but the meaning is "the external parameter corresponding to this internal index." The meaning of VMHAT ends up as the **INVERSE** of the Hessian matrix partial^2 f / partial i partial j but it does not consistently mean that internal to this routine. C C Loop over variable parameters for second derivatives IDRV = 2 This seems to be set to 2 and then never used, except to write out in an LDEBUG message. DO 100 ID= 1, NPARD I = ID + NPAR - NPARD xxxx A bit puzzing: how can NPAR-NPARD ever be non-zero? IEXT = NEXOFI(I) Work with the external parameter value IF (G2(I) .EQ. ZERO) THEN WRITE (CBF1(1:4),'(I4)') IEXT CALL MNWARN('W','HESSE', + 'Second derivative enters zero, param '//CBF1(1:4) ) WINT = WERR(I) WERR is the global parameter error, that is, the error in the EXTERNAL variable corresponding to internal index I. IF (NVARL(IEXT) .GT. 1) THEN That is, for a limit variable... CALL MNDXDI(X(I),I,DXDI) local scaling factor going from internal to external. Note that this is always positive, but safety first... IF (ABS(DXDI) .LT. .001) THEN WINT = .01 Let's see: You take the error in the external variable, and use this to figure the error in the internal variable. If the external variable goes up very slowly with internal, that would mean that either Pint is very near to pi/2 (that is, the variable is near to the limit), OR (b-a) is tiny. In those cases, you arbitraily set the internal error to .01. That can't be very smart; everywhere else it is dependent on Mii. Also, this means if I rescale a limit range to, say, .00001 - .00002 rather than 1-2, we profoundly affect the calculation! ELSE WINT = WINT/ABS(DXDI) This always gets back to the internal variable error, that is, to sqrt(Mii+UP), within the assumption that the error is small compared to the range. But if it not small, then there is a difference. ENDIF ENDIF G2(I) = UP/WINT**2 If f(x) is a parabola with minimum at x0, and f(x0+sigma) = f(x0-sigma) = f(x0)+UP, then d^2 f/d x^2 = m/sigma^2. ENDIF So for any parameter with indicated second derivative of zero, you establish the second derivative based on the value of error, which in turn was based on the diagonal Hessian element: WINT was the sqrt(UP*Mii) for free variables so G2(I) will be 1/Mii; with Mii taken from the previous iteration. XTF = X(I) DMIN = 8.*EPSMA2*ABS(XTF) This is 16*sqrt(machine epsilon) * |x| C C find step which gives sagitta = AIMSAG D = ABS(GSTEP(I)) GSTEP(I) starts out as .1 * the user-supplied error in the parameter, and evolves how ? xxxx DO 40 ICYC= 1, NCYC NCYC is 3, 5, or 7 depending on the strategy choice C loop here only if SAG=0. DO 25 MULTPY= 1, 5 C take two steps X(I) = XTF + D CALL MNINEX(X) Actually, since only this X(I) has changed, MNINEX is overkill; something like MNPINT is wanted. But since MNPINT is driven by the external variable and provides the internal value, we would want to go the other way. This is just an efficiency matter, however. NPARX = NPAR CALL FCN(NPARX,GIN,FS1,U,4,FUTIL) FS1 = f(x+step) NFCN = NFCN + 1 X(I) = XTF - D CALL MNINEX(X) CALL FCN(NPARX,GIN,FS2,U,4,FUTIL) FS2 = f(x-step) NFCN = NFCN + 1 X(I) = XTF Notice that at the end, X(I) is restored to what it was at the start. But U(I) is not, at least, until later. SAG = 0.5*(FS1+FS2-2.0*AMIN) AMIN was already the value of f(x) so the sag is the deviation of the chord from f(x-sigma) to f(x+sigma) from its linearization. IF (SAG .NE. ZERO) GO TO 30 If the sag is apparently zero, increase your delta by a factor of 10 (up to .51 in the limit case) and try again -- but no more than 5 tries. IF (GSTEP(I) .LT. ZERO) THEN This is a limit parameter, so a scale of 1 is natural IF (D .GE. .5) GO TO 26 D = 10.*D IF (D .GT. 0.5) D = 0.51 GO TO 25 ENDIF D = 10.*D 25 CONTINUE If you fall out of the loop without finding a D for which SAG != 0, ... 26 WRITE (CBF1(1:4),'(I4)') IEXT CALL MNWARN('W','HESSE', + 'Second derivative zero for parameter'//CBF1(1:4) ) GO TO 390 Just because one second derivative is zero, that does not mean you can't invert the Hessian. But it does mean that the Hessian is a symmetric matrix with a zero on the diagonal, which therefore can't be positive definite - and thus is useless to estimate the minimum of f! C SAG is not zero 30 G2BFOR = G2(I) G2(I) = 2.*SAG/D**2 GRD(I) = (FS1-FS2)/(2.*D) Gradient component in the I direction; this is exact if f is exactly quadratic. IF (LDEBUG) WRITE (ISYSWR,31) I,IDRV,GSTEP(I),D,G2(I),GRD(I),SAG 31 FORMAT (I4,I2,6G12.5) GSTEP(I) = SIGN(D,GSTEP(I)) which is signum(GSTEP(I)) * abs(D) So the step changes to that step required to get a non-zero sagitta. For a reasonable fucntion with a minimum here, that will be no change at all. DIRIN(I) = D Notice that the incoming value of DIRIN(I) has been discarded. YY(I) = FS1 This is f(x+D) DLAST = D D = SQRT(2.0*AIMSAG/ABS(G2(I))) For a quadratic, the sagitta is d^2 A/2 where A is the second derivative. The above just inverts that to find D such that the sagitta is AIMSAG. C if parameter has limits, max int step size = 0.5 STPINM = 0.5 IF (GSTEP(I) .LT. ZERO) D = MIN(D,STPINM) IF (D .LT. DMIN) D = DMIN DMIN, to remind ourselves, is 16*sqrt(machine epsilon) * |x| C see if converged IF (ABS((D-DLAST)/D) .LT. TLRSTP) GO TO 50 IF (ABS((G2(I)-G2BFOR)/G2(I)) .LT. TLRG2 ) GO TO 50 If either D **or** G2 is not changing much, break out. For the center strategy, TLRSTP = .3, and TLRG2 = .05. D = MIN(D, 10.*DLAST) D = MAX(D, 0.1*DLAST) No really radical scale changes allowed. If we have yet to break out due to a good enough D, repeat the loop up to 3, 5, or 7 times. 40 CONTINUE C end of step size loop WRITE (CBF1,'(I2,2E10.2)') IEXT,SAG,AIMSAG CALL MNWARN('D','MNHESS','Second Deriv. SAG,AIM= '//CBF1) C OK, when we get here one of two things has happened: If we iterated thru more than one value of D, then we have successfully found a value of D such that the predicted sagitta -- based on the previously calculated second derivative -- matches the actual sagitta found from stepping by D in each direction. (And that sagitta happens to be AIMSAG). So we know our second derivative is accurate enough. If we got here without having to repeat the iteration, then we have a value of D where the predicted sagitta based on the second derivative computed at the end of the last migrad step of the algorithm -- notice that GSTEP(I) was set to D -- is the previous value of AIMSAG, and the actual sagitta found by stepping by D is close to the new value of AIMSAG. Now the new approximation to second derivative g is twice the sagitta due to the incoming D (at x[N]) over the square of the incoming D. And since the new D equals the incoming D, this new D and the new g together satisfy the sagitta relation. So again we know the second derivative is accurate enough. 50 CONTINUE NDEX = I*(I+1)/2 VHMAT(NDEX) = G2(I) Which is the computed second derivative. Note that at this stage, VHMAT will be holding terms of the Hessian, NOT its inverse. 100 CONTINUE C end of diagonal second derivative loop That loop was done for each internal variable. CALL MNINEX(X) This call seems superfluous, since MNINEX was called earlier in the routine, and although each X(I) was changed a few times, each always ended up with its original value. However, when the internal variable was reset to its original value, nobody reset the corresponding external variable. SO this call is needed. C refine the first derivatives IF (ISTRAT .GT. 0) CALL MNHES1(FCN,FUTIL) Refines GRD and GSTEP, and forms DGRD. ISW(2) = 3 C ISW(2) represents the nature of the Hessian: 1 if approx, 2 if not posdef, 3 if ok But how do we know at this point that the 2nd derivatives were all positive? A negative sagitta will give a negative g2, and this is perfectly consistent with all code to here. DCOVAR = 0. Apparently, DCOVAR (in common) is 0 if the 2nd derivative matrix is invertible, 1 otherwise. It somewhat replicates the logical states of ISW(2), but not really. C . . . . off-diagonal elements IF (NPAR .EQ. 1) GO TO 214 DO 200 I= 1, NPAR DO 180 J= 1, I-1 XTI = X(I) XTJ = X(J) X(I) = XTI + DIRIN(I) X(J) = XTJ + DIRIN(J) Remember, at this point DIRIN(i) = the value of D that gave the target sagitta. MNHES1 did not alter that, though it did change GSTEP(i). We are about to do: d^2 f / dx dy = [ f(x + hx*xHat + hy*yHat) + f(x) - f(x + hx*xHat) - f(x + hy*yHat) ] / [hx*hy] The mixed derivative formula used has the merit of using only one fucntion evaluation per pair of indices. The appropriate value of h to use for this formula is hx = (3 epsilon_f * cxxy)**(1/3) where cxxy is a characteristic curvature of the form fxxy**2/fyyx. This value is proportional to epsilon**.25. The optimal value is proportional to epsilon**.33, for this same derivative method (we don't really want to do a less biased method because here we are doing O(N^2) function evaluations.) CALL MNINEX(X) Same comment as above about only (in this case) 2 X(i) changing, yet MNINEX does work for every internal variable. Again, this is just inefficiency. CALL FCN(NPARX,GIN,FS1,U,4,FUTIL) NFCN = NFCN + 1 X(I) = XTI X(J) = XTJ ELEM = (FS1+AMIN-YY(I)-YY(J)) / (DIRIN(I)*DIRIN(J)) NDEX = I*(I-1)/2 + J VHMAT(NDEX) = ELEM 180 CONTINUE 200 CONTINUE 214 CALL MNINEX(X) Again, needed since external variables were not all reset when the internals were. C verify matrix positive-definite CALL MNPSDF Potentially modifies VHMAT to try making it posdef. Forms PSTAR as the vector of eigenvalues. DO 220 I= 1, NPAR DO 219 J= 1, I NDEX = I*(I-1)/2 + J P(I,J) = VHMAT(NDEX) 219 P(J,I) = P(I,J) 220 CONTINUE P is now the symmetric matrix of VHMAT values CALL MNVERT(P,MAXINT,MAXINT,NPAR,IFAIL) IF (IFAIL .GT. 0) THEN CALL MNWARN('W','HESSE', 'Matrix inversion fails.') GO TO 390 ENDIF C . . . . . . . calculate e d m EDM = 0. DO 230 I= 1, NPAR This calculates sum(i,j) { del_i(f) del_j(f) / partial_ij (f) } EDM is documented in mnhelp as "Estimated vertical difference to minimum." But is this calculation doing that? For example, taking one variable f(x) = x^2, at x = 3, EDM would be 6^2/2 = 18. The potential improvement, however, is just 9. It looks like EDM is off by a factor of 2. However... This loop also sets VHMAT to TWICE the corresponding element of P. So it looks like VHMAT is coming out to 2*Hess^{-1}??? xxxx C off-diagonal elements NDEX = I*(I-1)/2 DO 225 J= 1, I-1 NDEX = NDEX + 1 ZTEMP = 2.0 * P(I,J) EDM = EDM + GRD(I)*ZTEMP*GRD(J) 225 VHMAT(NDEX) = ZTEMP C diagonal elements NDEX = NDEX + 1 VHMAT(NDEX) = 2.0 * P(I,I) EDM = EDM + P(I,I) * GRD(I)**2 230 CONTINUE IF (ISW(5).GE.1 .AND. ISW(2).EQ.3 .AND. ITAUR.EQ.0) + WRITE(ISYSWR,'(A)')' COVARIANCE MATRIX CALCULATED SUCCESSFULLY' GO TO 900 C failure to invert 2nd deriv matrix 390 ISW(2) = 1 DCOVAR = 1. CSTATU = 'FAILED ' IF (ISW(5) .GE. 0) WRITE (ISYSWR,'(A)') + ' MNHESS FAILS AND WILL RETURN DIAGONAL MATRIX. ' DO 395 I= 1, NPAR NDEX = I*(I-1)/2 DO 394 J= 1, I-1 NDEX = NDEX + 1 394 VHMAT(NDEX) = 0.0 NDEX = NDEX +1 G2I = G2(I) IF (G2I .LE. ZERO) G2I = 1.0 395 VHMAT(NDEX) = 2.0/G2I 900 RETURN END