Small-scale design issues in the Minimization package
-----------------------------------------------------

1 - How is it best to present the various sorts of number-collections that
the user and program internals need to deal with? Examples are (coordinates
of) external and internal points, gradients, lists of second derivatives,
errors, scales, and so forth.

There are two possible philosophies:

a) Use vector<double> for all of them.

b) Provide distinct classes, e.g. Point, Gradient, InternalPoint, etc. to
   represent the distinct concepts.

We have chosen (b) because it provides a form of "type-safety", ensuring
that an entity meant as (for example) an Internal quantity or a Gradient
cannot silently be used where an External quantity or the coordinates of a
Point are expected. To quote a C++ design guru: "Are these separate
concepts or are they interchangeable?"

There is a sub-issue here. Given that we have Point, Gradient and so forth,
is it best to try to avoid repetition of the very similar code implementing
these, using inheritance? We came to the conclusion that this would be
unwise because:

(1) Public inheritance would be wrong, because there isn't an "is usable
    as" relationship between these things.

(2) Private inheritance is possibly interesting; it is a technique for
    inheriting implementation but *not* for enabling polymorphic use. Lots
    of people hate private inheritance, however, and about half the
    repeated code is in ctors and the like, which would still need to be
    repeated.

Given the decision to provide distinct classes, we still must commit to
having ways that the basic user can avoid needing knowledge of them. One
example: our UserFunction class, which turns an ordinary C-style function
into the Function object our Problems work with, needs to be constructible
not only from a
    double f(const Point& p)
but also from a
    double f(const std::vector<double>& p).
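The two points above (distinct classes for distinct concepts, plus a
UserFunction that hides them from the basic user) can be sketched roughly
as follows. This is only an illustration under assumptions: the class
layouts, the coords() and components() accessors, operator(), and the two
sum-of-squares functions are hypothetical and are not taken from the
package itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal Point: a distinct type wrapping the coordinates.
class Point {
public:
    explicit Point(const std::vector<double>& c) : coords_(c) {}
    const std::vector<double>& coords() const { return coords_; }
private:
    std::vector<double> coords_;
};

// Hypothetical Gradient: same storage, but a separate type that cannot be
// passed where a Point is expected.
class Gradient {
public:
    explicit Gradient(const std::vector<double>& g) : comps_(g) {}
    const std::vector<double>& components() const { return comps_; }
private:
    std::vector<double> comps_;
};

// Hypothetical UserFunction: wraps either flavor of C-style function.
class UserFunction {
public:
    typedef double (*PointFn)(const Point&);
    typedef double (*VectorFn)(const std::vector<double>&);

    explicit UserFunction(PointFn f)  : pf_(f), vf_(nullptr) {}
    explicit UserFunction(VectorFn f) : pf_(nullptr), vf_(f) {}

    // Evaluate at a Point, unpacking to a plain vector if necessary.
    double operator()(const Point& p) const {
        return pf_ ? pf_(p) : vf_(p.coords());
    }
private:
    PointFn  pf_;
    VectorFn vf_;
};

// Two example user functions computing the same sum of squares.
double sumSqPoint(const Point& p) {
    double s = 0.0;
    for (std::size_t i = 0; i < p.coords().size(); ++i)
        s += p.coords()[i] * p.coords()[i];
    return s;
}
double sumSqVector(const std::vector<double>& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) s += v[i] * v[i];
    return s;
}

// Usage: both wrappers give the same value at the same Point.
void demoUserFunction() {
    std::vector<double> c;
    c.push_back(3.0);
    c.push_back(4.0);
    Point p(c);
    UserFunction fromPoint(sumSqPoint);
    UserFunction fromVector(sumSqVector);
    assert(fromPoint(p) == fromVector(p));
    // fromPoint(Gradient(c));  // would not compile: Gradient is not a Point
}
```

The commented-out last line is the "type-safety" of choice (b) in action: a
Gradient is rejected at compile time even though it holds the same kind of
data as a Point.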
Finally, in naming these classes, we decided that since user code might see
external classes but would never see internal ones, we should use the
short/familiar names for the external classes. Thus the individual classes
are Point, Gradient, ... and InternalPoint, InternalGradient, ...

----------------------------------------------------------

2 - What is the best way to let the Function tell the Algorithm whether or
not it can compute its own gradient (and similarly for second derivatives
and hessian)?

Clearly the natural pattern for this is that the (user's minimization
function object) class, which is derived from an abstract Function class,
can optionally override a virtual method gradient() if it knows how to
efficiently compute its own gradient; otherwise the Algorithm is left to
use its ingenuity in numerically computing the gradient if it needs it. But
how does the Algorithm know what to do?

There are three ways to answer this question, but one just can't be made to
work.

a) The naive way (and the conceptually simplest to anybody coming from a
   procedural programming background) is to require that the derived class
   provide, along with gradient(), an override for a method
   bool gradientAvailable(). In the base Function class, that would return
   false.

   A trap in this approach is that if a user Function overrides gradient()
   but forgets to override gradientAvailable(), the program will silently
   use the Algorithm's way of computing gradients instead of the user's.
   This is identical to the case in Minuit, where even if FCN can properly
   react to FCN(IFLAG=2), this will never be called unless ISW(3) has been
   set by calling setGradient.

b) A better (but unfortunately impossible) way would be to have the base
   class gradientAvailable() somehow detect whether the address of the
   virtual method gradient() matches that of the base class method. There
   is a technique to do this which works in gcc 3, but which exploits a
   non-C++-standard behavior to do so.
c) There is a design pattern that lets you do this in a cool way: the
   Algorithm calls for the gradient by calling
       Function::grad(const Point& p, Gradient& g,
                      void callback(const Point&, Gradient&))
   where callback is the Algorithm's method of computing the gradient.
   (The base-class version would presumably just invoke callback, while a
   Function that knows its own gradient would override grad() and ignore
   it.)

We chose approach (a) because of two problems with (c):

i)  In (c) there would be no way to ask whether the gradient() is available
    without actually calling it. We can imagine algorithms which might want
    to do different things depending on whether, for example, the analytic
    hessian is efficiently available or not.

ii) There is added apparent complexity; to the uninitiated maintainer, you
    are throwing in this pattern from left field with no obvious reason.
    If you look at the calling sequence in the algorithm,

        if (gradAvailable) f.gradient(p, g);
        else               computeGrad(p, g);

    versus

        typedef void (specificAL::* ComputeGrad)(const Point&, Gradient&);
        ComputeGrad gradCallback;
        f.gradient(p, g, gradCallback);

    the former is easier to read and follow and understand.

-----------------------------------------------------------

3 - How should Domain provide ways to convert objects that non-trivially
depend on derivatives of the transformation?

There are two possible philosophies:

a) Domain can have conversion methods to convert each specific type of
   object needed (for example, a gradient) from internal to external
   coordinates and vice-versa. For example:
       convertGradientExtToInt (const Point& p, const Gradient& ge,
                                InternalGradient& gi)

b) Domain can have methods to provide particular derivatives, which the
   algorithms can then use to perform these conversions. For example, if
   every Domain were separable like RectilinearDomain is, then
       derivativesExtToInt (const Point& p, TransformationDerivatives& d)

There is a major problem with (b): for non-separable transformations (for
example, spherical coordinates) the output type would have to be a matrix
partial_i(f_j).
And then the algorithm would need to deal with this general case, with
added complexity just where you don't want it, and gratuitous inefficiency.
So we choose (a).

A pair of minor side issues:

It would be possible in principle to just have one method name, convert(),
and disambiguate depending on the arguments. For example,
    convert (const Point& p, const Gradient& ge, InternalGradient& gi)
    convert (const InternalPoint& pi, Point& p)
We decided that this is too cute and offers gratuitous opportunity for
confusion; we chose instead to embed the conversion type in the method
names, but don't feel strongly about this choice.

And there is some syntactic clarity advantage to making "output" types
appear as return types, e.g.
    InternalGradient convertGradientExtToInt (const Point& p,
                                              const Gradient& ge)
There was some feeling of potential efficiency disadvantage in this; also
this would remove the degree of type-checking that comes from using the
long convert routine names. And it is conceivable that sometimes one wants
to compute two things at once. So we chose the form that provides output by
argument rather than by return.


Larger-scale design issues in the Minimization package
------------------------------------------------------

4 - How is ownership of objects essential to the Problem expressed?

The issue is that if we allow the user to pass a simple pointer (or
reference) to the Function, Domain, TerminationCriteria, or Algorithm to a
Problem, then there is always the chance that by the time that object is to
be used, it will be out of scope. The usual C++ idiom for this sort of
situation is that you pass the Problem an auto_ptr to the Function and so
forth.

But the user will occasionally need to directly interact with those
objects. For example, he may want to call myFunc.useMoreDataPoints() after
a few preliminary steps. In what form is this access provided?
There are two approaches, which we illustrate via showing a use-case:

a)  auto_ptr<Function> fap (new MyFunction);
    Problem prob(fap);
    // Notice fap has handed over ownership; it can go out of scope now
    prob.whateversteps();
    {MyFunction* fp = dynamic_cast<MyFunction*>(prob.getFunction());
     if (!fp) throw "Gee, I thought that I supplied a MyFunction";
     fp->changeSomething();
     ...
    } // Notice that I didn't delete fp

b)  {MyFunction f;
     Problem prob(f);  // can't let f go out of scope till prob does
     prob.whateversteps();
     f.changeSomething();
    }

Given that approach (b) looks cleaner and more readable, why do we choose
to supply the function via auto_ptr as in (a)? The trouble is that in (b)
there is no way to guard against the user letting f go out of scope before
the Problem is finished; and if that happens there will be a very
tough-to-debug use of a wild pointer.

In the end, though, what decides us on (a) is that among the community of
C++ cognoscenti and experts, conventional wisdom is to use the auto_ptr
idiom to express the notion of passing a resource (f) to an owning entity
(prob). Given that, why should our community re-invent the answer to that
issue, or reject their consensus?

5 - Should Problem be a copyable type?

The issue is really: should we give Problem all the boilerplate needed to
be used in std:: containers? The worst hitch is in the copy ctor. The
trouble is that once you have two Problems sharing the same Function
instance, which one is to take care of deleting it? Rather than face this
and similar issues, we just say that there is no copy ctor.
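
Issues 4 and 5 together can be illustrated with a small sketch. It is not
the package's code: it swaps in std::unique_ptr for auto_ptr (auto_ptr was
removed from the language in C++17, and unique_ptr expresses the same
hand-over of ownership), and the Function, MyFunction and Problem
definitions, along with evaluate() and demoOwnership(), are hypothetical
stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical abstract Function, standing in for the package's base class.
class Function {
public:
    virtual ~Function() {}
    virtual double operator()(double x) const = 0;
};

// Hypothetical user function with a setting the user wants to change later.
class MyFunction : public Function {
public:
    MyFunction() : scale_(1.0) {}
    double operator()(double x) const override { return scale_ * x * x; }
    void changeSomething() { scale_ = 2.0; }
private:
    double scale_;
};

// Hypothetical Problem: owns its Function and cannot be copied.
class Problem {
public:
    explicit Problem(std::unique_ptr<Function> f) : f_(std::move(f)) {}
    Problem(const Problem&) = delete;             // no copy ctor (issue 5)
    Problem& operator=(const Problem&) = delete;

    Function* getFunction() { return f_.get(); }  // non-owning access
    double evaluate(double x) const { return (*f_)(x); }
private:
    std::unique_ptr<Function> f_;  // deleted exactly once, by this Problem
};

static_assert(!std::is_copy_constructible<Problem>::value,
              "Problem must not be copyable");

// Use-case (a): hand the function over, then reach it again through prob.
double demoOwnership() {
    std::unique_ptr<Function> fap(new MyFunction);
    Problem prob(std::move(fap));  // ownership moves into prob
    assert(!fap);                  // fap is now empty
    MyFunction* fp = dynamic_cast<MyFunction*>(prob.getFunction());
    if (!fp) throw "Gee, I thought that I supplied a MyFunction";
    fp->changeSomething();         // the user never deletes fp
    return prob.evaluate(3.0);
}
```

Because exactly one Problem holds the owning pointer, the question "which
one deletes the Function?" never arises, which is the reason given above
for leaving out the copy ctor.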