A Recommendation for Run II Physics Analysis Software

Eileen Berman, Philippe Canal, Frank Chlebana, Irwin Gaines, Herb Greenlee, Jeff Kallenbach, Rob Kennedy, Qizhong Li, Pasha Murat, Gordon Watts

 

Abstract:

This report contains the results of the Run II Physics Analysis Software Recommendations committee (PASREC). It contains a summary of the work completed by the committee as well as a recommendation for Run II analysis software.

I. Introduction:

II. Objective statements about candidate products

ROOT:

    1. ROOT is a complete, full-featured package that meets the functional requirements
    2. There are some trivial unacceptable features (use of CMZ, lack of build scripts) which should not be a stumbling block, but will require a formal collaboration with the ROOT team
    3. There is a large, world-wide user base, but so far limited use for serious HEP analysis
    4. ROOT can cope with the CDF and D0 data models
    5. ROOT has an effective internal data format well matched to HEP needs
    6. The present version of CINT is a potential serious drawback (buggy, undocumented, limited C++ features, hard to support, poorly engineered). This will require a decision to enhance/upgrade/replace, which would require significant work.
    7. The user interface is not very friendly
    8. The interconnectedness of the various modules is substantial. External modules must conform to (ROOT specific non-standard) ROOT protocols to be functional.
    9. The package is not highly engineered (i.e., it has grown organically rather than been designed). The current implementation reflects this evolution; for example, it has not kept up with the C++ language standard (has its own container classes, etc.). Even beyond CINT, the product has many bugs.
    10. It will require some relatively straightforward customization to support casual users
    11. There is an active and responsive support team with good archives and an active mailing list

NIRVANA:

    1. NIRVANA could become the core of a full package that meets the functional requirements, but it is not there yet.
    2. It has excellent plotting facilities and GUI.
    3. It is a well-engineered package, highly modular.
    4. The proposed full featured NIRVANA adopts a sound strategy of relying on standards and distinct components each with their own support (like PYTHON) to provide plug-and-play capability
    5. We would have complete control of NIRVANA development
    6. There is no large user community for existing components, and only limited use outside FNAL
    7. A minimum of 6-12 months would be required to include the scripting language, overall framework and HEP specific features
    8. There will be no extensive experience with the full version of NIRVANA before the start of Run II.

Commercial products:

    1. licensing costs are unclear, especially for university collaborators
    2. dealing efficiently with very large data files is not yet demonstrated
    3. not optimized for HEP style analysis (concentrates on unbinned rather than binned distributions, doesn't support histograms as dynamic objects as needed for online use)
    4. Many attractive features, quickly getting much closer to our style of analysis: very nice scripting languages, good interfaces to FORTRAN and C++, excellent visualization, data models that support user defined structures, highly portable, etc.

Shareware:

    1. Octave typically lags one release behind the commercial alternative. This can be a problem since needed features may only be in the most recent commercial releases
    2. Octave does not support Matlab visualization, but instead relies on the gnuplot package, which is totally inadequate.

III. Detailed Product evaluations from PASFRG and PASSUMA criteria:

ROOT - See Appendix 1

Nirvana - See Appendix 2

MATLAB - See Appendix 3

IV. Conclusions:

Either ROOT or Nirvana could meet the functional and support requirements

V. Recommendations for Run II:

We recommend that ROOT be adopted as the standard physics analysis package for Run II, contingent on a collaborative agreement with the ROOT team. It should be recognized that this recommendation depends critically on timing and on sharing development with outside collaborators, and the steering committee should assess the validity of these assumptions in evaluating the recommendation. In particular, if the requirement for an immediate choice is being driven by on-line needs (which may not require the full functionality of an off-line analysis package immediately), it needs to be determined if the components of NIRVANA that already exist are adequate for the immediate needs.

VI. Long-term Recommendations:

It is highly likely that by the end of Run II (or by the time of the LHC) commercial components will be heavily used for analysis tasks. Commercial offerings should continue to be investigated and made available (perhaps on limited platforms). The Computing Division should also initiate formal collaboration with the LHC++ project so as to have some influence on the choices made and direction taken. These two initiatives, while lower priority than the immediate ROOT support and development needs, should position us to take full advantage of the expected evolution of these products.

Appendix 1 - ROOT Evaluation

Run II Physics Analysis Software Requirements Checklist: ROOT

DATA ACCESS

DATA ANALYSIS

Scripting Language: C/C++ interpreted language (CINT)
 

User Control:
 

Data Selection:
 

 

Input/Output:

 

Numeric and Mathematical Functionality:

 

Offline Compatibility:
 

Prototyping:
 

DATA PRESENTATION

USABILITY

 

Run II Physics Analysis Software Support Requirements Checklist: ROOT

---------------------------------

PasSuma Checklist - ROOT

----------------------------

Contributors:
- Rob Kennedy     31-July-1998
- several additions by Pasha Murat (Aug 04 1998)

----------------------------

A) Support

1) Maturity and Completeness
a) What is the customer base and what is their experience and opinion? For
commercial software or non-HEP freeware, one should get a list of customers and
references.

RDK) 	The customer base is primarily HEP experimenters and support personnel,
with a number of experiments officially using ROOT as part of their data
handling mechanism or design. For example, here is a list of applications and
links to ROOT located at: http://root.cern.ch/root/ExApplications.html

	NA49 ROOT Physics Analysis Classes 
	ROOT Primer by Soren Lange 
	ROOT at GSI 
	More 3D visualization for the CMS Track Reconstruction Prototype 
	Clusterization in the CMS ECAL 
	The PHOBOS Analysis Toolkit 
	The E907 experiment 
	SAL Scientific Applications On Linux 
	The Cetus Links 
	The Rosebud Package 
	ROOT used for event monitoring in the Finuda experiment 
	ATLFast++, the ATLAS fast MonteCarlo 
	gh2root: Generates C++ classes to convert Geant3 KINE/HITS to ROOT 
	Direct Photons produced at RHIC energies 
	ROOT in STAR (large heavy ion experiment in Brookhaven) 
	The ALICE simulation/reconstruction framework 

PM) more customers:
 - STAR@RHIC decided to proceed with full-scale evaluation of ROOT as a
   CERNLIB replacement

 - CDF activities:
    - many physicists are trying to use ROOT for analysis
    - prototyping of ROOT-based online consumers
    - simulation project is prototyping ROOT tools
    - CDF SVXII test stand is writing the data out as ROOT ntuples
    - CDF Karlsruhe group is prototyping a ROOT-based interactive event display

b) How long has the product been in existence? What version is the product at?
How many major releases have there been? How often is there a minor release?
Several major releases or regular minor releases with integrated bug fixes are
good signs of a well supported mature product.  Availability of published books
on the product is also a sign of maturity as well as an established customer
base.

RDK) 	The ROOT project was started by Rene Brun late November 1994. His long
time collaborator Fons Rademakers joined the project around January 1995. In
the middle of August 1995 Nenad Buncic joined the team, followed by Valery Fine
in December 1995. Masaharu Goto created and supports CINT, including the ROOT
variant of CINT, RINT.

RDK)	The current version is 2.00/09 (ROOT's style of version numbers). There
have been two major releases that I know of (1.0 and 2.0), and patches appear
about every 2 to 4 weeks. The patches appear to be roughly equally divided
between developer-realized issues and user defect reports/feature requests.

c) How long will the product survive? Are there any competing products that are
likely to win the market (including freeware). Who is the product developer and
are they well supported financially (graduate student or full time staff).

RDK)	I recall that Pasha had a statement from Rene which was a commitment to
support ROOT until ?. It is unlikely that commercial packages will replace the
demand for this product. After all, they have not succeeded to date, and they
predate ROOT. With less certainty, in my opinion, it is unlikely that freeware
packages will replace ROOT either, since ROOT offers significant functionality
not found in other packages, such as interactive object browsing. ROOT is
developed by the "ROOT team" (see above) which consists of at least two
full-time developers, two part-time developers, and a number of specialists
working on specific aspects of ROOT as time permits.

PM) ROOT team has an excellent record and many years of experience
with HEP software. R. Brun was the leading developer of CERNLIB, F.Rademakers
for several years has been maintaining CERNLIB, V.Fine ported CERNLIB to PC/Intel
architecture (DOS/Windows 95/Windows NT). The team is very productive.

???) Financial backing

2) Who supports users
a) Who provides consulting support? Commercial, other Lab, CERN, Fermilab?  Are
they responsive? Newsgroups and dejanews may provide some information on 
support response (though these tend to be biased). This is rather subjective 
and should be treated as such.

RDK)	Consulting support is primarily provided by Rene Brun and Fons
Rademakers, with FNAL local consulting unofficially provided as time permits by
Pasha Murat. In the opinion of everyone I have spoken with, the ROOT support
turn-around after contacting them is good to outstanding. To get a better idea
of the support activity, see the ROOTtalk e-mail archive at:
http://root.cern.ch/root/roottalk/AboutRootTalk.html

b) Who can get support? Particularly for commercial software, can any user of
the product access the support services or are these limited to a pre-specified
list of local contacts.

RDK)	Anyone can get support. There is no requirement to sign up or provide
personal information before one can post to ROOTtalk, though one can do so to
receive the e-mail and responses in one's own e-mail browser. I prefer to
use the WWW interface to ROOTtalk myself. Presumably, if ROOT is selected by
FNAL, then some local support apparatus might be set up to help alleviate the
support burden on the ROOT developers (as we have done with Kai, with C++, and
other topics).

c) Is the use of the product in the community enough that there is a pool of
people/knowledge to draw from for support if needed? HEP use should be 
assessed; PAW knowledge in the HEP community is widespread and Root is growing. A
dedicated newsgroup would be a plus.

RDK)	Yes, in my opinion, there are enough users of ROOT on different
operating systems (Unix and Windows) to provide a knowledge base outside of the
ROOT (support) team itself. ROOTtalk acts like a newsgroup, though the exact
mechanism is different (http://root.cern.ch/root/roottalk/AboutRootTalk.html).
ROOT in many ways can be used as an overhauled PAW, though the PAW model of
data analysis does not seem to lead to the most efficient use of ROOT.

PM) The ROOT user community is 99.9% HEP. There are physicists using ROOT at
practically all the US HEP laboratories, including BNL, LBL, LLNL, FNAL, and
SLAC. There are also ROOT users at CERN and in Russia, again within the HEP
community. In terms of accumulated knowledge of the product, BNL and FNAL are
already capable of providing local support.

d) Is user training needed and available? What is the cost?

RDK)	In my opinion, user training is needed, though there are many documents
available to help users understand and use ROOT. I think a five page tutorial
on how to start, interact with, and quit ROOT would be a good complement to
existing documentation. Also needed is English documentation on ROOT CINT, as
well as a one-page tutorial on what is known to work and what fails with CINT.
The cost of the five page tutorial would be very small. The cost of the CINT
documentation may be an FTE week or two of someone's time who is familiar with
CINT.

PM)   Many ROOT commands have PAW counterparts
(hist/plot vs hist->Draw(), for example), so PAW users adapt to the "ROOT philosophy"
pretty easily. New commands/tools require more training, mostly in C++ itself.
I also heard comments that using CINT makes it much easier for physicists to make
their first steps in learning C++.

e) Is training required and available for support staff? What is the cost (time
and money)? For commercial products such as SAS and IDL, support and user
training may be required to optimally use the product, and that cost should be
folded in.

RDK)	No specific training is required for support staff. ROOT includes
documentation of its internal data formats and implementation classes (not
visible to the user). Some of the mechanisms in ROOT are non-standard
(especially RTTI), and will take a few FTE weeks to document more completely for
the support staff.

f) How much (local) support will be required (is it complicated and hard to
use)? This and the remaining questions in this section can be determined by
talking to current users or scanning any newsgroups, mailing lists, or FAQs.

RDK)	This depends heavily on how we plan to improve ROOT and CINT locally,
and on how many of its limitations will be fixed by September 1 and how many we
will simply accept. Many new users have been productive with ROOT in a few days
to a week, but many heavy PAW users have found that the transition in thinking
makes ROOT seem complicated, tedious, and bug-prone. I think that once ROOT is
selected and a much larger number of users have adapted to it (and provided
analyses for others to use as examples), the need for local support
will predominantly be for adapting ROOT to OS/compiler combinations not
supported officially by the ROOT team, and for adding staff to the ROOT team to
handle uncovered defects and implement new features. Perhaps local support
could start out as dedicated ROOT testers to have the most impact on ROOT's
quality.

g) For commercial or freeware, what kind and quality of user level support is
provided?

RDK)	Via ROOTtalk, individual users interact via e-mail directly with Rene
and Fons. One does not always get an instant fix, but one does get a thoughtful
intelligent reply to your e-mail. In some cases, other users who know the 
answer will step in and answer your e-mail.

PM) there are 2 mailing lists - ROOTTALK and ROOTDEV - the first intended for
    general discussion, the second one - for bug reports. Most of the users
    use the 1st list for all the purposes.

h) Is the software completely and well documented at the user level?

RDK)	In my opinion, the software is well-documented as to what it *should*
do, but not necessarily as to what it is known to be capable of doing right
now. This is in part due to the development style of the ROOT team, which
emphasizes the goal functionality without listing what subset of this has been
fully "certified" as operational. The documentation is not as comprehensive and
reference-oriented as some commercial documentation I have seen, but it ranks
very well against other freeware package documentation.

PM) the ROOT software is extensively documented, the documentation system 
    is source-based, in this sense it is more developer-oriented. What could
    be improved is the documentation for the beginners (including non-experienced
    C++ users)

i) Is a system manager required in order to install and/or maintain the
package? If so, this would significantly complicate matters for some remote
users who do not have ready access to (or a friendly relationship with) the
system managers of their computer.

RDK)	A system manager is not required to use this, but we probably will
distribute this from FNAL through the UAS UPS/UPD model, which implies that a
"products" support person will probably install the UPS root product on
machines. Some machines include an alternative "products" area administered by a
normal user, bypassing the requirement that someone have access to the
"products" account. ROOT is compatible with this approach too.

3) Licensing
a) What types of licensing are available?

RDK)	There is only one license, making ROOT free for non-commercial use.

b) What is the cost? For Universities? Lab?

RDK)	ROOT is free for non-commercial use for everyone. If you want to pay
for it, I am sure that the ROOT team will accept donations. They seem to be very
willing to accept computer accounts on machines with compilers that allow them
access to different OS/compiler combinations (cdfsga and Kai C++, for example).

B) Maintenance

1) Who provides it and how much
a) Who provides maintenance both local and external to the Lab? What are the
fallbacks (if the maintainer(s) is run over by bus or the company folds)?

RDK)	Currently the ROOT team and associates provide all the maintenance.
There is no reason to believe users (FNAL, for instance) cannot contribute to
maintenance and have changes rolled back into the ROOT repository. For now,
however, one must learn CMZ to do this, which inhibits users from working with
the source code to overcome locally discovered problems. I tried to get ROOT to
work under Linux2 with KCC v3.2 (local build with debug symbols) and made
little progress.

RDK)	Since the source code and build procedures are available (though we
would like to see them moved out of CMZ with kumacs and into CVS with
makefiles), anyone can provide maintenance. For now that is almost entirely the
ROOT team and associates. If Rene and Fons were on an ill-fated airliner, a
collaborative maintenance team could be formed which would function like the
EGCS compiler development "team", using a world-readable CVS repository.
Clearly this would be different from having two or more maverick coders turning
out 100 new lines per day (my groundless guess), but the product would survive
the transition.

PM) Users from BNL have started actively contributing to the code distribution.
 - S.Adler (BNL) generated RPMs for i386 and Alpha Linux
 - D.Morrison (BNL) generated ROOT distribution tar-file based on GNU
   configuration tools - autoconf, libtool, and automake.


b) Are the maintainers responsive and are bug fixes turned around in a
reasonable amount of time?

RDK)	In my opinion and that of others, yes. Some defects unrelated to core
functionality take longer to get fixed, but this is a reasonable choice on the
part of the ROOT team.

c) Does the software maintainer need additional training (beyond that needed by
users). If so, is it available and at what cost.

RDK)	Right now, today, a maintainer must learn CMZ. I have done it with help
from Pasha, Pasha has done it, and we would not wish such on our fiercest
competitor. With a move to a CVS repository and makefiles, this burden will be
eliminated. ROOT is a diverse package, though. It includes elements of 
Graphics, HTML, Postscript, data structures, complicated RTTI, statistics, and 
basic data presentation. No one person here is likely to be able to cover all those
subjects at an expert level and maintain 100% of ROOT. It will cost some time 
to familiarize those expert in a subject with the source code in ROOT related to
that subject.

d) What is maintenance/licensing costs for commercial products?

RDK)	Maintenance is free. It would not hurt to give them an account on your
machine if you are working with an OS/compiler version to which they do not
have ready access.

e) How much software is there (line count)? How much needs to be supported
locally (how many people required)? Can/should support be split up into areas 
of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant
for non-commercial software that will be maintained locally.

RDK)	"The ROOT system consists of about 480,000 lines of code (390,000 lines
C++ and 92,000 lines C). The C language is used in CINT and in pieces of public
domain code that perform specific functions like, terminal I/O handling
(Getline), data compression (Zip) and the 3D interactive interface to X/Windows
(X3D)." Also, much of the C code is the result of translating F77 (MINUIT,
Simulation packages).

RDK)	The support can be split up into areas of expertise fairly easily. All
of the maintainers would have to understand at some level the basic
infrastructure: memory management, data structures, IPC services, and so on.
Beyond that, the modularity in ROOT is based on high-level areas of expertise.
Here is a text translation of the "ROOT System Tree" to give some idea how this
is organized. Roughly each subject below has its own library of classes.

                            NA49
                             |
                       RINT (ROOT CINT)
                     CINT C++ Interpreter
                    /                 |
  Detector Description       User Interface Components     Minimization
        |                                       |            |
  Geometry Rendering\                           Formula Evaluation
          |          \---\                              |
  Style Management        Containers                 Ntuples
    |            |              |                       |
3D Graphics      |          Object I/O                Trees
    |            |              |                       |
2D Graphics      |              |                 /Histogramming
 |     |         |              |       /--------/      |
 |  Postscript   Object Runtime Services            IPC Services
 |     |         |              |                       |
X11/Windows/Mac Interface     Memory Management     OS Interface

f) In the case of commercial software, is source code available (in escrow)?
This would be required for finding bugs locally or in case the company folds.
This may be an additional cost.

RDK)	Source code is available, although it is maintained in a CMZ repository.

2) Maintenance Infrastructure
a) What kind of build environment is provided. Is it robust? This is mostly
relevant for non-commercial software that may need to be co-maintained.

RDK)	The maintenance and build environment is CMZ (CERN Patchy combined with
CERN Zebra). It is robust, but very clumsy and tedious to use and arcane in its
user interactions. In some cases, I have had to put a symlink to the Kai
compiler in order to get CMZ to recognize where it is. Surely there is a way to
avoid this, but the symlink was faster than finding and reading CMZ
documentation. This is completely unacceptable, and would not be too difficult
to change (just time-consuming).

b) Can the package be built AT ALL on new or different sub-systems? Root still
provides NO makefiles.

RDK)	ROOT is beginning to include makefiles for some selected OS/compiler
combinations as of 2.00/09, but not many. Once one learns CMZ, and is willing
to edit code within CMZ, then one can adapt ROOT to new or different
systems. Once ROOT moves to CVS with makefiles, this will be much, much easier.

c) Is the source repository accessible so that local support persons can select
which changes to accept and which to reject for local use?  Root still uses 
CMZ.

RDK)	The primary source code repository is not available to the general
public. One can, if one knows where to look, get a *copy* of the CMZ file
containing all the source and build procedures for a particular version of ROOT.
One cannot tell from the filename, however, *which* version of ROOT a
particular CMZ file contains (a convention problem from the ROOT team). This
should all be changed to an open CVS repository as is used for Egcs compiler
development.

d) Will the software have to be maintained and/or extended locally and
externally? If so, can the software be maintained in a common repository. If
separate repositories, what is the commitment to keep them from diverging from
modifications, extensions and bug fixes.  This excludes locally maintained
extensions which use pre-defined APIs or hooks into the product, which we will
have to maintain ourselves in any case.

RDK)	A small FNAL ROOT team might port ROOT to new OS/compiler combinations
that the ROOT team does not have access to and make high priority modifications
(fix shared memory access for online monitoring programs during data-taking).

RDK)	We should develop a model with the ROOT team that is somewhere between
a single shared repository (Egcs model) and a sub-ordinate repository where
changes are fed back to the "central ROOT team" for consideration for inclusion
in the master repository (CLHEP - FNAL Zoom model). I do not think we should
allow or tolerate divergence in separate sibling repositories (BaBar - CDF
Framework collaboration model) because resyncing the ROOT repositories will be
an overwhelming task which will make permanent divergence seem like an economic
alternative. ROOT as a product does not exclude any of these models once it is
moved to a CVS-based repository. For now, maintenance and development
collaboration with CMZ as the repository does not seem very economical as all
local personnel would have to be trained in using CMZ effectively in a
collaboration.

e) Is the software passed through quality assurance software such as Purify or
Insure++ before being put into production?

RDK)	To my knowledge, ROOT has not been "Purifyed" or "Insured". It is clear
that ROOT leaks memory, for instance. Its own statistics show that clearly.
ROOT sessions end with a memory allocation/de-allocation histogram which
documents a large number of memory leaks, some of considerable size. I do not know
how ROOT will be judged by a C++ 

f) Are there any restrictions that would prevent the product from being placed
in the run II infrastructure (i.e. UPS/UPD)? In particular, the ability to
support more than one version of the product on the system.

RDK)	No, there are not. I am doing exactly this with ROOT v2.00/08 in support
of CDF's Event I/O facilities. ROOT depends on a single environment variable,
ROOTSYS, to indicate the base of its internal file system. LD_LIBRARY_PATH is
sometimes required to be set on some OSs. This is very easy to implement in the
UPS/UPD framework.
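
As a rough illustration of why this requirement is easy to meet when several
versions are installed side by side, the sketch below builds a separate
environment for each version. Only the variable names ROOTSYS and
LD_LIBRARY_PATH come from the answer above. The helper root_environment, the
/products/root prefix, and the version directory names are hypothetical, and
the sketch assumes the shared libraries sit under $ROOTSYS/lib. It does not
model how UPS/UPD itself resolves products.

```python
import os


def root_environment(products_dir, version, base_env=None):
    """Return a copy of the environment configured for one ROOT version.

    products_dir and version are hypothetical placeholders; a real UPS/UPD
    setup would resolve the install directory itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    rootsys = os.path.join(products_dir, version)
    env["ROOTSYS"] = rootsys
    # Assumption: shared libraries live under $ROOTSYS/lib.
    libdir = os.path.join(rootsys, "lib")
    previous = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = libdir + (os.pathsep + previous if previous else "")
    return env


# Two versions can coexist because each session gets its own environment.
env_a = root_environment("/products/root", "v2_00_08", base_env={})
env_b = root_environment("/products/root", "v2_00_09", base_env={})
```

Because nothing outside these two variables is touched, switching versions
amounts to starting a session with a different environment.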

g) Are release notes and change lists provided with releases? For example, the
commercial product IDL comes with "what's new" and release notes lists.

RDK)	Yes, there are good "CHANGES" files available on the web to allow users
to determine if a desirable patch is in a new version before downloading and
installing a new version of ROOT. They can be found at:
http://root.cern.ch/root/Availability.html underneath the version numbers at
the top of the page.
 
3) Maturity and Completeness
a) Are there active mailing lists/FAQs/newsgroups for the product? How do they
reflect on the product? Root has a support list, the commercial product IDL has
FAQs and a newsgroup.

RDK)	ROOT has an e-mail support list ROOTtalk which includes a search engine
on the e-mail digest, http://root.cern.ch/root/roottalk/AboutRootTalk.html.

b) Are recent releases extensions/enhancements and not bug fixes?

RDK)	My impression from reading through the CHANGES files for the last few
minor releases is that there is roughly a one-to-one ratio of added
features/classes and reactive modifications. By reactive modifications, I mean
changes to existing code which do not add functionality, such as defect
fixes, run-time improvements, and minor design-related changes.

c) Are product releases reasonably paced and useful?

RDK)	There have been two major releases that I know of (1.0 and 2.0) in the
last 15 months, and patches appear about every 2 to 4 weeks. The patches appear
to be roughly equally divided between developer-realized issues and user defect
reports/feature requests. Depending on the features in ROOT which you exploit,
the patch releases may or may not be immediately useful to you.

PM)  For example, this spring the concept of a multifile tree was introduced;
     in the coming release (2.11) we expect to have a new LaTeX interface for
     writing formulas.
 
4) Modularity:
a) Will the tool/software need to be upgraded (additions/replacements) to
satisfy Run II functional requirements, and how difficult will this be? Does 
the product provide API/hooks to easily interface locally written extensions? Would
existing support be able to handle this or would manpower need to be added? For
example adding a command line (e.g. Python) to Histoscope is thought to be
difficult. Anticipated additions and replacements should be identified.

RDK)	To meet the functional requirements, ROOT will at least need some work
done on CINT to make it more standards-compliant and more robust. A complete
C++ interpreter to replace CINT would probably require acquiring a C++
front-end (from EDG, US$60k) and applying roughly 2 to 4 FTE years of effort.
More modest approaches to improving CINT would require less effort, but are not
now well-defined. *** This answer should be better developed ***

RDK)	ROOT provides enough hooks to allow locally written extensions to be
used with ROOT in most cases. It would still take a fair amount of effort,
however, to replace an existing ROOT "module" like Linear Algebra with a
locally developed solution, due to ROOT RTTI expectations and ROOT module
interdependence.

PM) There are "2-way" hooks available: 
- ROOT  C++ classes could be used in the offline/online code, 
- the existing offline shared libraries could be loaded in dynamically
  and be used within the ROOT interactive framework

b) Is the product modular? Is it broken down logically and physically into
reasonably distinct sub-systems? Can some of these sub-systems be replaced by
external packages with the same functionality? For example, in Root can Linear
Algebra or Minimization be done by packages developed specifically to solve
these topics separately from Root? A mini code review should be performed on 
the package to determine what would be involved in replacing such an identified
component or sub-system.

RDK)	ROOT is very modular, but some modules are also fairly interdependent.
It would be very difficult to remove some ROOT modules and re-use them outside
of the ROOT framework (which is not what is meant here, of course). It would
take a fair amount of extra effort to take a module like Linear Algebra and
install a replacement which has all the "ROOT-like" functionality as the
original, especially in the context of ROOT RTTI which permits interactive
object browsing. Another issue is that the ROOT Linear Algebra module has an
interface which must be preserved for other ROOT modules to continue to
function, thus probably requiring a replacement to use some interface adapter
layer before installing it.
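
The interface adapter layer mentioned above can be sketched as follows. This
is a minimal illustration of the adapter idea only, written in Python for
brevity. ExternalMatrix stands in for a hypothetical external package, and the
method names GetNrows, GetNcols, At and Mult are illustrative assumptions, not
the actual ROOT Linear Algebra interface. The RTTI support needed for
interactive object browsing is not modeled.

```python
class ExternalMatrix:
    """Stand-in for a matrix class from a hypothetical external package."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def multiply(self, other):
        # Plain row-by-column matrix product.
        cols = list(zip(*other.rows))
        return ExternalMatrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )


class MatrixAdapter:
    """Exposes the interface that other modules are assumed to call.

    The method names are illustrative, not the real ROOT interface.
    """

    def __init__(self, impl):
        self._impl = impl

    def GetNrows(self):
        return len(self._impl.rows)

    def GetNcols(self):
        return len(self._impl.rows[0]) if self._impl.rows else 0

    def At(self, i, j):
        return self._impl.rows[i][j]

    def Mult(self, other):
        # Delegate to the external implementation, then re-wrap the result.
        return MatrixAdapter(self._impl.multiply(other._impl))


a = MatrixAdapter(ExternalMatrix([[1, 2], [3, 4]]))
b = MatrixAdapter(ExternalMatrix([[0, 1], [1, 0]]))
c = a.Mult(b)
```

Callers see only the adapter, so the external package can change without
touching the modules that depend on the preserved interface.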

c) If new functionality needs to be added, is the software sufficiently modular
such that the code changes can be localized? For example, it is believed this 
is not the case for adding STL support to cint. A mini code review should be
performed to assess whether extensive and/or destabilizing code changes would 
be required to add the functionality.

RDK)	Most of ROOT, in my experience, appears to have fairly localized
functionality, allowing extensions to be fairly easily added. CINT is the
obvious exception. CINT is poorly and irregularly organized compared to C/C++
compilers, lacking distinct parsing, symbol table management, and action code.
Further, its fundamental design is flawed in that the parser itself is the wrong
variant to handle C++ syntax efficiently, requiring the syntax to be implicitly
expressed in coded procedures instead of an easily edited, conceptually clear
grammar.

d) Is the software sufficiently modular to be such that bug fixes are 
localized? A mini code review focused on a particular section or component of 
the software should provide information on this.

RDK)	Most of ROOT, in my experience, appears to be sufficiently modular,
allowing bug fixes to be fairly easily added. 

e) If a component needs to be replaced in its entirety, is the software
sufficiently modular such that a new component can be slotted in with minimal
disruption? For example, Root depends on functionality in cint other than the
interpreter in a fundamental way.

RDK)	Due to ROOT RTTI expectations, it would not be trivial to drop in
module replacements, but neither would it be technically difficult. The
exception to this is CINT, which has a "broader interface" to ROOT than a
simple C++ interpreter. CINT passes additional information about objects to the
rest of the ROOT infrastructure that a "traditional" C++ interpreter would not.
Also, it is not clear that the ROOT-CINT interface is well-documented.

PM) It is important to understand that the dependence discussed provides many 
unique features not available in other packages,  for example, ROOTCINT is used 
for automatic generation of dictionaries, which makes it trivial to hook up 
any external code.

f) What distinct (external) packages or interfaces are required to build and/or
run the package... Motif (shared libraries), OpenGL, etc.  Are there external
software components that are out of the maintainer's control? LHC++ depends on
numerous commercial packages, Nirvana/Histoscope on motif. All such packages 
and interfaces should be identified.

RDK)	No external packages are required to build and use ROOT, though OpenGL
appears to be capable of being used with ROOT. ROOT does not require Motif, 3D
X11 packages (one is supplied), or any other commercial/freeware package to be
supplied by maintainers or users. ROOT does supply, integrated into its source
tree, several freeware packages which it does use.
 
5) Portability
a) Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a
specific Run II platform isn't supported, what it would take to get support for
it should be determined.

RDK)	All Run II platforms are supported by ROOT. We (Pasha) have asked the
ROOT team to support the Kai C++ compiler, and they have done so with our
providing accounts for ROOT developers on appropriately equipped platforms. 

b) Porting of code to new platforms; this applies to non-commercial software
that is currently only supported on selected systems. The issue to be raised is
the ability of the original developers to accept changes to be incorporated 
into the base code so any porting done here is done once (aside from effects of
future OS upgrades) and does not have to be re-done with each release of the
software.

RDK)	Because ROOT is already supported on many more OSs than are being
considered for Run II, I do not think porting to new OSs is a potential
problem. Since they have already ported ROOT to a (relatively)
standards-compliant compiler, I do not think that porting ROOT to a new
standards-compliant compiler is a potential problem.

c) How sensitive is the package to minor OS and/or compiler and/or system 
header variations? Root under Linux may be sensitive to which C-libs are in use, 
which distribution you are using, which kernel you are using, which system header
patches you have applied (esp., cint), and so on. PAW is sensitive to OS
upgrades. This can be determined by looking at support history in newsgroups,
other support logs, or talking to the user community.

RDK)	This sensitivity of ROOT to minor OS/header changes was definitely a
problem with ROOT v1. Since then, ROOT has adapted to the same Linux
distribution that FNAL has chosen to support, and distributes ROOT for old and
new versions of the Linux C library. This does not mean that changing an
important system header will not affect ROOT, just that we do not now see a
problem with the changes I recommended to the FNAL Linux distribution.

d) 64 bit considerations: Does the software run on 64 bit platforms/OS
(alpha/Unix, SGI/Unix, future, e.g. merced)? Will it be difficult to port to 64
bit systems (a la COMIS for PAW)?

RDK)	The C++ code in general should be 64-bit clean. I do not know if the C
and F77-converted-to-C code is, but I suspect it is or can be easily adapted to
be 64-bit clean. I wonder about the implications though of no longer being able
to convert ntuples from HBOOK to ROOT since Zebra does not function on 64 bit
systems. Note that while Dec Unix and SGI IRIX are both 64 bit systems, we run
both with 32 bit pointers (Dec) or in 32 bit mode (SGI) largely because Zebra
does not function correctly on a complete 64 bit system.
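
As a rough illustration of what "64-bit clean" means here, the sketch below shows what happens when an address is forced into a 32-bit field, which is the typical failure in code written with 32-bit pointers in mind. The address value is hypothetical, and nothing here is taken from ROOT or Zebra.

```python
import struct

# Pointer width of the interpreter running this sketch.
pointer_bits = struct.calcsize("P") * 8

# Hypothetical address above the 4 GB line.
address = 0x0000_0001_2345_6789

# Code that is not 64-bit clean keeps only the low 32 bits of a pointer.
truncated = address & 0xFFFF_FFFF

# Packing the full address into a 32-bit unsigned field fails outright.
try:
    struct.pack("<I", address)
    fits_in_32_bits = True
except struct.error:
    fits_in_32_bits = False
```

The truncated value no longer identifies the original location, which is why such code has to be run with 32 bit pointers or in 32 bit mode.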

PM) The most system-dependent part of ROOT system - CINT - has been ported to 
64-bit IRIX architecture at BNL. 

e) Are there Endianship and other heterogeneous environment considerations?

RDK)	ROOT currently only supports big-endian IEEE (IEEE floating point)
files. That is a reasonable choice during their rapid development phase; Trybos
at CDF has made the same choice. Nevertheless, we should require that ROOT
support little-endian IEEE files in the future to improve performance on
little-endian systems, such as Linux/Windows based on Intel chips and Dec Unix
based on Alpha chips.
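
The performance cost mentioned above comes from byte swapping on every read. A minimal sketch using Python's standard `struct` module shows the two byte orders for one IEEE single-precision value; it illustrates the general issue and says nothing about how ROOT itself does the conversion.

```python
import struct
import sys

value = 1.5  # exactly representable in IEEE 754 single precision

big = struct.pack(">f", value)     # big-endian byte order
little = struct.pack("<f", value)  # little-endian byte order

# The two encodings are byte-reversed copies of each other.
reversed_match = big == little[::-1]

# A little-endian host reading a big-endian file must swap every value.
needs_swap = sys.byteorder == "little"
decoded = struct.unpack(">f", big)[0]
```

On Intel and Alpha machines `needs_swap` is true, so every value read from a big-endian file pays for the swap.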

f) Is the software product build dependent on a specific compiler or is it
compiler independent? If compiles are not needed for the product, are there any
compiler dependencies present in the API used for locally written extensions? 
In particular, if the software needs to be built with Run II compilers, it should
be verified if it can.

RDK)	The ROOT build procedures are mildly dependent on the compiler in use,
but not much more so than any other C++ product. After all, different compilers
have different switches to express similar concepts. ROOT libraries, because
they are built from C++ code, are specific to a particular C++ compiler and to
certain switches (exceptions on/off with Kai C++, threads on/off with MS VC++).
ROOT memory management is known to fail or misbehave on some OS-compiler
combinations for various reasons such as a compiler not allowing overload of
global new.
 
6) Standards:
a) Are standards followed? Compiler standards? Library standards (e.g. STL,
POSIX). Are they fully supported (e.g. cint and STL)?

RDK)	ROOT is relatively standard C++, and hides vast OS dependencies in OS
interface modules (for Unix, Windows, Mac, etc). Currently, CINT has difficulty
dealing with many advanced C++ features, however, and so certainly cannot be
labeled standards-compliant.

b) Are there any support and maintenance standards or procedures? For example,
any control over what goes into releases?

RDK)	The ROOT team is small, and explicit written procedures for support and
maintenance do not seem appropriate.

c) Are good coding practices (documentation) followed? Is there good 
developer's documentation (how easily can the product be "taken over"?)

RDK)	ROOT follows a C++ coding style which is reasonable and which they have
published: http://root.cern.ch/root/Conventions.html. The documentation in
general is vast, though several pieces are missing. A short 5 page "What to do
when you first use ROOT" tutorial would be invaluable since users' first
experience with ROOT is through a clumsy command line interface (ROOT CINT)
which has a non-obvious command language ("How do I QUIT ROOT!"). Also missing
is significant English language documentation on what language (hopefully a
proper subset of C++) CINT does support.

d) Are good "Computer Science" techniques and methods used, for example in a
language interpreter (see below)?

RDK)	ROOT CINT is not a good example of Computer Science techniques. It's a
long story, and perhaps Chih-Hao and Scott Snyder can contribute to fill this
in from their PasFrg/PasSuma talks.

e) Is there a design methodology applied? Are any design tools used? (such as a
code generator). If so, do we need to have and/or support these tools?

RDK)	No higher level tools appear to be used in the development of ROOT.
 
7) Reliability and Security:
a) Are there any security maintenance concerns with the product?

RDK)	There are security concerns only if we use one of the ROOT data server
programs. These might be attacked to at least deny service, at worst to damage
or alter data. ROOT developers may not be aware of buffer overruns within ROOT
which could be exploited to at least "do anything" on a system which the
"owner" of the server daemon has permission to do, at worst to "do anything" on
a system, period. If we do not use ROOT server daemons, then the only security
issues might be with IPC services used, but these are minor concerns related to
who on a system can access the data (in shared memory, on a socket) of another
program.

b) Is the product likely to crash and if so, how does it recover? What would be
the impact? Do system managers need to intervene?

RDK)	ROOT does not in general use a master server daemon, so crashes do not
generally involve system managers. ROOT has some facilities to recover from
crashes, while a datafile is open for instance. Perhaps Pasha could elaborate?

PM) ROOT output buffers are regularly (after each 1 GByte written, for example) 
flushed out. The autosave frequency is defined by the user. Autosave doesn't affect 
efficiency of the disk space usage. A datafile recovery procedure equivalent to that 
of HBOOK ntuple recovery is available.
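
The autosave behaviour described above can be sketched as a writer that flushes its buffer once a user-chosen number of bytes has accumulated. `AutosaveWriter` and its parameters are hypothetical names for illustration and are not the ROOT interface; the threshold is set to a few bytes so the effect is visible.

```python
import io


class AutosaveWriter:
    """Buffers records and flushes after a user-chosen number of bytes."""

    def __init__(self, sink, autosave_bytes):
        self.sink = sink
        self.autosave_bytes = autosave_bytes
        self.pending = bytearray()
        self.flush_count = 0

    def write(self, record):
        self.pending += record
        if len(self.pending) >= self.autosave_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.pending.clear()
            self.flush_count += 1


sink = io.BytesIO()
writer = AutosaveWriter(sink, autosave_bytes=8)
for _ in range(5):
    writer.write(b"abcd")  # 4 bytes per record

# What would survive if the job crashed at this point.
saved_before_crash = sink.getvalue()
```

Only the records written since the last flush are at risk in a crash, so the autosave frequency sets the upper bound on lost data.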

c) Any government regulation applied to the product? Export restrictions?

RDK)	No, there are no US regulatory problems to my knowledge. I cannot speak
for other countries represented at FNAL.

d) Are there any Y2K issues?

RDK)	ROOT does use a ROOT-specific time/date format (why oh why?), and it
should be checked for Y2K safety.
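
One simple form such a check could take is an ordering test across the century boundary. The two packed formats below are hypothetical and are not ROOT's actual time/date format, which is not described here; they only show what a Y2K-unsafe and a Y2K-safe encoding look like under the same test.

```python
def pack_two_digit(year, month, day):
    """Hypothetical unsafe format: keeps only the last two digits of the year."""
    return (year % 100) * 10000 + month * 100 + day


def pack_four_digit(year, month, day):
    """Hypothetical safe format: keeps the full year."""
    return year * 10000 + month * 100 + day


before = (1999, 12, 31)
after = (2000, 1, 1)

# A safe format must keep dates in chronological order across the boundary.
two_digit_ordered = pack_two_digit(*before) < pack_two_digit(*after)
four_digit_ordered = pack_four_digit(*before) < pack_four_digit(*after)
```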
 
8) Application specific:
a) Will an interface/adapter need to be made to fit the tool in with the rest 
of the analysis tool framework (e.g. data import/export). Does the product provide
well defined and documented API/hooks for such an extension?

RDK)	No specific adapter is required if one uses ROOT as a physics analysis
tool. ROOT has enough hooks to allow user-developed formats to be read/written
which are compatible with ROOT data browsing at some level. CDF for instance is
attempting to use these hooks to write ROOT-compatible event data files.

b) Does the product use a language interpreter? If so, does it support the full
language? Is it written following correct computer science techniques and
algorithms? Is it sensitive to language changes, etc.? (Examples are COMIS
and cint, which support only subsets of their language.) Or, turning the
question around, is the language computationally complete? (Then forget about the
language it was trying to emulate and treat it as a completely new language.)

RDK)	ROOT uses a C/C++ interpreter called CINT. CINT supports a subset of
the Standard C++ language, but it is not well-documented *what* subset of C++
is supported. Since it is not specifically documented what language is
supported, it is impossible (if I understand the statement above) to say that
CINT is computationally complete. It is not written using modern Computer
Science techniques. For instance, there are no distinct parsing, syntax
checking, and parse tree navigation phases. There is no easily edited,
conceptually comprehensible grammar description.

RDK)	One of the marketing statements from ROOT is that there is "only one
language for users of ROOT to learn". This has clearly not been achieved. COMIS
was not modern Fortran, and CINT is even less modern C++. Users must learn not
only the features of C++ which are not supported (or are poorly supported) by
CINT, like templates, STL, C++ RTTI for dynamic dispatch, but they must also
remember the C/C++ expressions which ROOT cannot handle (like ***p++, which in
fact is not so absurd for certain coding styles). One cannot in general take
"sophisticated" C++ code and run it with CINT, and sometimes "simple" C++ code
fails too.

c) Is the data processing model correct? For example, in MATLAB, the model is 
to read in a file then process all its data - will this work with Run II sized
files?

RDK)	ROOT supports both extremes of data processing: read all data and work
on it (efficient for small data sets), and read only a piece of the data,
process it, and then do it again with the next piece of data. At least at the
data model design level, the user has control over where in between these
extremes an analysis job will operate.
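
The two extremes can be sketched in a few lines. This is a generic illustration with made-up function names, not ROOT code: the first function holds every event in memory at once, the second holds at most `piece_size` events at a time.

```python
def read_all(source):
    """Load every event into memory, then process."""
    events = list(source)
    return sum(events)


def read_in_pieces(source, piece_size):
    """Process one piece at a time so only piece_size events are held."""
    total = 0
    piece = []
    for event in source:
        piece.append(event)
        if len(piece) == piece_size:
            total += sum(piece)
            piece = []
    total += sum(piece)  # leftover events in the last, partial piece
    return total


events = range(10)
total_all = read_all(events)
total_pieces = read_in_pieces(iter(events), piece_size=3)
```

Both give the same answer; the difference is peak memory use, which is what matters for Run II sized files.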

PM) Conceptually ROOT continues the development line we refer to as "CERNLIB",
and it has a very good chance of becoming a successor to CERNLIB. 
The STAR collaboration at RHIC started a full scale evaluation of ROOT not only as a 
PAW replacement, but as a CERNLIB replacement.

d) Are there restrictions on the input data (say size or format)? For example,
in Histoscope the size of ntuples is limited because it is stored and processed
in memory.

RDK)	Input data must be in ROOT format, or have some code translate the data
on the fly into ROOT format. There is no practical limit to my knowledge on
input data size, but this may depend on the data model in use. ROOT exchanges
input data between disk and memory as needed. ROOT may be limited, depending on
the data model, to the virtual memory size of a machine for a piece of an
event/histogram (branch), for an entire event/histogram, or for a ROOT I/O
buffer. Given the cost of memory nowadays, I do not consider such limits to be
of any concern.

e) What is the minimal environment to run the product? How does
performance/capability scale up while the environment scales up?

RDK)	This may vary between Unix and Windows. For Unix, one needs simply a
supported system, 

???)	For Windows?

Appendix 2 - Nirvana Evaluation

Run II Physics Analysis Software Requirements Checklist: Nirvana

DATA ACCESS

  1. Access rates (online):
  2. Access rates (offline): 3 times HBook.
  3. Serial vs random access: both are available.  (random is in log(N))
  4. Granularity of access: Column.  Size of column chunk customizable at creation (63K)
  5. Foreign Input and Output Formats: API is available for this.
  6. Specialized output formats: All information about objects is retrievable from API
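
Items 3 and 4 above (random access in log(N), column granularity with a chunk size fixed at creation) can be pictured with the sketch below. How Nirvana actually implements this is not described here; binary search over chunk boundaries is simply one common way to get log(N) lookup, and `CHUNK_ROWS` and `fetch` are hypothetical names.

```python
import bisect

CHUNK_ROWS = 4  # hypothetical chunk size, fixed when the column is created

# One column of an ntuple, stored as a list of chunks.
column = list(range(100, 110))
chunks = [column[i:i + CHUNK_ROWS] for i in range(0, len(column), CHUNK_ROWS)]
chunk_starts = list(range(0, len(column), CHUNK_ROWS))  # first row of each chunk


def fetch(row):
    """Locate the chunk holding `row` by binary search, then index into it."""
    k = bisect.bisect_right(chunk_starts, row) - 1
    return chunks[k][row - chunk_starts[k]]


value = fetch(9)
```

Only the chunk containing the requested row needs to be read, which is what makes random access on a single column cheap.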

DATA ANALYSIS

Scripting Language:

A python interface will be added using these items in the design process.  (3 man months needed)
 

  1. full featured scripting language
  2. analysis tool's object model
  3. extract data from events
  4. express complex mathematical expressions
  5. debugging facilities
  6. interface the scripting language to dynamically linked compiled high level languages
  7. User Control:

    The Python interface will provide those.
     

  8. control functions
  9. Mathematical operations
  10. Results of analysis available to users
  11. command line recall and interactive command line editing.
  12. Data Selection:

    The Python interface will provide those.
     

  13. program selection criteria using extracted data
  14. display selection criteria as text
  15. Input/Output:

     

  16. support its own object I/O format:    YES
  17. allow its own format object files to be read or written from compiled programs.:    YES
  18. read or write object files in foreign formats:    Can easily be added.
  19. write selected event objects to one or more output streams:    YES
  20. object definition language and/or be able to define new object formats programmatically:    YES
  21. read events in one format, convert and write them out in a different format:    YES
  22. virtual streaming:    Yes with very little work (thanks to the efficiency of random access)
  23. Numeric and Mathematical Functionality:

    Additional high level functionality can be provided by Python's add-ons.
     

  24. accurate and precise numerical functionality, including double precision.:    YES
  25. Analysis capabilities applied to fetched data as well as subsequent renditions:    YES
  26. Functions operating on multiple data sets:    Yes from C, Not yet from GUI (will be added in GUI upgrade)
  27. fit, parameterize, and calculate statistical quantities from data:    YES
  28. user control of fitting algorithms. :    YES (Minuit)
  29. Offline Compatibility:

    All of those will be available for both Python and C.
     

  30. tailor the sequence of mathematical operations
  31. ability to include external software in their analysis.
  32. functionality of the analysis package linked into user defined code.
  33. Prototyping:

     

  34. prototyping of simple versions which can later be expanded upon. : may need to translate Python to C
  35. Prototyped sequences contain the full interface of an arbitrarily complex version. : Yes

DATA PRESENTATION

          The Graphical Interface needs to be upgraded to handle column wise ntuples, operations on multiple histograms,  and improve the 'glue' between components (Histoscope, Nfit). (6 man months)

  1. Interactive visualization:     Yes
  2. Presentation quality graphical output:     Yes
  3. Formal publication of graphical output:     No, but the output PostScript can be re-formatted in Adobe Illustrator.

USABILITY

  1. Batch vs. interactive processing:     Yes
  2. Sharing data structures:     Yes
  3. Shared access by several clients:  Yes (very good)
  4. Parallel processing (using distinct data streams):     Yes
  5. Debugging and profiling:     Yes (python and C)
  6. Modularity (user code):     Yes
  7. Modularity (system code):     Yes
  8. Access to source code:     Yes
  9. Robustness:     Good
  10. Web based documentation:  Yes
  11. Use of standards:     Yes
  12. Portability:     Pretty good (has been ported to all needed platforms)
  13. Scalability:     Good
  14. Performance:     Good
  15. User Friendliness:     Very Good

 

Run II Physics Analysis Software Support Requirements Checklist: Nirvana

---------------------------------

PasSuma Checklist - Nirvana

----------------------------

Philippe Canal

----------------------------

Support

Maturity and Completeness

Difficult to quantify.  However NEdit, by the same authors with the same technologies, is being used successfully by a large community.  Python has a very large community of users and a large set of resources.  Eventually Histoscope and NFit could be distributed as Python add-ons, thus growing the user base.

 

Has existed since 1991.  New versions have mostly been for added features and ports to new operating systems. The newest version (alpha release of version 5.00) introduces column wise ntuples and redefines some file operations.

 

Nirvana is 'owned' by Fermilab and will survive as long as Fermilab supports it.

Who supports users

Fermilab

 

Fermilab's 'customer'.
 

Yes, it is already used by the HEP community, but there is no newsgroup.  Python has newsgroups and existing knowledgeable people in the HEP community.
Might need training for Python.  GUI is good enough to be learnt on the spot.
Available for Python.
Same as python (and the Nirvana group for Nirvana's specific parts)

 
 
 

Yes up to version 4.  New features will need documentation.

No.

Licensing

 
Free.
 

 
Free except for Fermi's man power.

Maintenance

Who provides it and how much:

Fermilab
Yes
No.

N/A
 

 
Around 130000 lines.  The interpreter, the graphics part, and the C implementation should be separable to a large extent.

 
Yes

 

Maintenance Infrastructure

Ansi C and Motif.  Standard makefiles.
Yes

 
Uses CVS

N/A

 
Yes
 

No
Yes

Maturity and Completeness

No
Yes

 
 

 

Modularity:

Will need an upgrade of the Graphical Interface and the addition of an interpreter (with Python the likely choice)  (The latter comment seems to be a little hasty to me.)
Yes.
Yes
To a large extent.
Yes. Graphical Interface, interpreter and I/O are three sub-packages.
Currently need Motif and Minuit.  Will need python.

 

Portability

Available on ALL platforms as of June 98.

 

ok.

 
not very sensitive.

 
ready.
 

Nirvana writes big-endian IEEE (IEEE floating point) files.

 
No

 

Standards:

ok.
the few authors have entire control.
it's alright
yes
No high level design tool is currently used.

Reliability and Security:

relies on .rhosts for some interprocess communication (actually rsh)
 

 
unlikely
 

 
no
 

 
not that I know of.

Application specific:

 
Yes.

 
Yes
 

 
Yes

 
Histoscope can now also have disk resident ntuples.  The size limitation is now 2Gb.

I don't know for sure but it should be able to scale nicely.

 

 

Appendix 3 - MATLAB Evaluation

Run II Physics Analysis Software Requirements Checklist : MATLAB

Disclaimer - This report contains a combination of fact, observation, and opinion. It is our best understanding of the situation under our particular circumstances, and is not guaranteed to be correct or to apply to conditions other than those under which the evaluation was made. It is for the use of the Fermilab / HEP community only.

DATA ACCESS

  1. Access rates (online):
  2. Access rates (offline):
  3. Serial vs random access: Both
  4. Granularity of access: Matrix/row/column/element/structure element
  5. Foreign Input and Output Formats: Available via F, C, C++ API
  6. Specialized output formats: Available via F, C, C++ API

DATA ANALYSIS

Scripting Language:

  1. full featured scripting language: Yes. MATLAB has its own scripting language which is complete and includes looping, arg passing, and "capture" capabilities. These can be stored in .m files using a history capture mechanism or the built-in editor.
  2. analysis tool's object model: double precision matrices; also supports int, char
  3. extract data from events: Yes - commands to extract rows, columns, elements. It is also possible to have C structures in the arrays, so variable lengths are available.
  4. express complex mathematical expressions: That's what it does
  5. debugging facilities: Yes, and profiling
  6. interface the scripting language to dynamically linked compiled high level languages: Yes - scripting language can call .dll or equivalent
  7. User Control:

  8. control functions: Yes
  9. Mathematical operations: Matrix operations, ODE solvers, FFTs, curve fitting...
  10. Results of analysis available to users: At any point - rows/columns can be stored to their own variables or printed out from the current array.
  11. command line recall and interactive command line editing: Yes. Full command line editing, with good editor (NT version) to save work. Both the command line editor and the command file editor have Win32 controls (ctl-C - ctl-V). Win32 version provides DOS directory traversal
  12. Data Selection:

  13. program selection criteria using extracted data: Yes
  14. display selection criteria as text: Yes
  15. Input/Output:

  16. support its own object I/O format: Yes - double[][]
  17. allow its own format object files to be read or written from compiled programs: Yes - std dll's produced
  18. read or write object files in foreign formats: No
  19. write selected event objects to one or more output streams: Yes
  20. object definition language and/or be able to define new object formats programmatically: Yes
  21. read events in one format, convert and write them out in a different format: Yes
  22. virtual streaming
  23. Numeric and Mathematical Functionality:

  24. accurate and precise numerical functionality, including double precision: Yes - default model
  25. Analysis capabilities applied to fetched data as well as subsequent renditions: Yes
  26. Functions operating on multiple data sets: Yes
  27. fit, parameterize, and calculate statistical quantities from data: Yes - built-in hist, bar3(lego) functions, polynomial fits, error bar plot
  28. user control of fitting algorithms: Yes, or implement your own or use a library one pretty easily.
  29. Offline Compatibility:

  30. tailor the sequence of mathematical operations: Yes
  31. ability to include external software in their analysis: As long as it eventually matches data model
  32. functionality of the analysis package linked into user defined code.
  33. Prototyping:

  34. prototyping of simple versions which can later be expanded upon: Yes - major strength. These .m files are simple to write, can be combined/split/changed to code
  35. Prototyped sequences contain the full interface of an arbitrarily complex version: Yes

DATA PRESENTATION

  1. Interactive visualization: Yes, with capability to write callbacks for additional functionality. Native "Handle Graphics". Also many advanced vis features - texture, lighting, ...
  2. Presentation quality graphical output: Yes - see demo
  3. Formal publication of graphical output: Yes. There also exists the capability to save the graphics in a native .m file for inclusion later. Printing can be in mono/color, level1/level2, eps/ps, Adobe Illustrator illustration file. There is an "online Notebook" feature using MS Word

USABILITY

  1. Batch vs. interactive processing: Yes - licenses are cheaper too
  2. Sharing data structures:
  3. Shared access by several clients:
  4. Parallel processing (using distinct data streams):
  5. Debugging and profiling: Yes
  6. Modularity (user code): Yes
  7. Modularity (system code): N/A
  8. Access to source code: No
  9. Robustness: Seems good - error messages are precise, and the facility to trap and recover from errors in user code is present.
  10. Web based documentation: Yes
  11. Use of standards: N/A
  12. Portability: All platforms
  13. Scalability: Excellent
  14. Performance: Good
  15. User Friendliness: Outstanding

 

Run II Physics Analysis Software Support Requirements Checklist: MATLAB

---------------------------------

PasSuma Checklist - MATLAB

----------------------------

Jeff Kallenbach

----------------------------

Disclaimer - This report contains a combination of fact, observation, and opinion. It is our best understanding of the situation under our particular circumstances, and is not guaranteed to be correct or to apply to conditions other than those under which the evaluation was made. It is for the use of the Fermilab / HEP community only.

A) Support

1) Maturity and Completeness

What is the customer base and what is their experience and opinion? For commercial software or non-HEP freeware, one should get a list of customers and references.

There are a number of references on Mathworks' WWW pages. They claim to have around 400,000 users worldwide. I have asked for HEP references (users at SLAC, BNL, LANL, etc.) and am waiting.

How long has the product been in existence? What version is the product at? How many major releases have there been? How often is there a minor release? Several major releases or regular minor releases with integrated bug fixes are good signs of a well supported mature product. Availability of published books on the product is also a sign of maturity as well as an established customer base.

The product has been around for ~15 years. Current release is 5.2.  

How long will the product survive? Are there any competing products that are likely to win the market (including freeware). Who is the product developer and are they well supported financially (graduate student or full time staff).

There is a freeware version, called Octave. It is unclear whether this will really overtake MATLAB, though. The fact that the Octave graphics are still very crude indicates that not a lot of work is being applied to areas we consider important.

2) Who supports users

Who provides consulting support? Commercial, other Lab, CERN, Fermilab? Are they responsive? Newsgroups and dejanews may provide some information on support response (though these tend to be biased). This is rather subjective and should be treated as such.

 

There are e-mail and phone hotlines for support questions. A local base of knowledge could easily be developed to help with FAQ and other straightforward questions. The newsgroup is busy, with about 800 articles posted in the last two to three weeks.  

Who can get support? Particularly for commercial software, can any user of the product access the support services or are these limited to a pre-specified list of local contacts.

Anyone at the site would be able to contact the hotline

Is the use of the product in the community enough that there is a pool of people/knowledge to draw from for support if needed? HEP use should be assessed; PAW knowledge in the HEP community is widespread and Root is growing. A dedicated newsgroup would be a plus.

HEP usage is unknown. The newsgroup is busy, and the q & a in there look healthy, ranging from beginner-level inquiries to refinements of graphics output. There seem to be plenty of users.

 Is user training needed and available? What is the cost?

It is available and expensive (about $500/person/2-day course). From my experience the product is pretty easy to get started with using the doc.

 

Is training required and available for support staff? What is the cost (time and money)? For commercial products such as SAS and IDL, support and user training may be required to optimally use the product, and that cost should be folded in.

There is such a thing, but it isn't yet priced.

How much (local) support will be required (is it complicated and hard to use)? This and the remaining questions in this section can be determined by talking to current users or scanning any newsgroups, mailing lists, or FAQs.

The product seems very easy to use. I would think though that a local working group or mailing list would be helpful, as we write our data interface modules and other HEP-specific functions. The volume of licenses that we will require will need some attention, as well as coordination with the license server people (fnalu-admin).

For commercial or freeware, what kind and quality of user level support is provided?

The helpwin facility and www-based documentation are outstanding. Response to technical questions from the hotline has been very good. The books and helpwin are full of examples. Getting started with the product is very easy.

 Is the software completely and well documented at the user level?

Yes - see above

Is a system manager required in order to install and/or maintain the package? If so, this would significantly complicate matters for some remote users who do not have ready access to (or a friendly relationship with) the system managers of their computer.

Not required, but recommended. Workarounds exist for the case where peons do the installation. UPS/UPD tailoring would be pretty easy.

3) Licensing

What types of licensing are available?

Flexlm, floating seats exist for Un*x and NT (these are separate). However, any of our collaborators can use licenses off of our server. In addition, there is another breakdown of the Unix licenses into interactive (programmer's) and batch licenses.

What is the cost? For Universities? Lab?

The cost is very high. They are quoting ~$3000 per floating Unix seat per year, with 20% discounts for large volumes.
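
To give a feel for the scale, the quoted figures can be turned into a small calculation. The seat price and discount are the approximate numbers quoted above; the seat count at which the "large volume" discount starts is not stated, so `DISCOUNT_THRESHOLD` below is a purely hypothetical placeholder.

```python
SEAT_PRICE = 3000         # approximate quoted price per floating Unix seat per year (USD)
DISCOUNT_PERCENT = 20     # quoted discount for large volumes
DISCOUNT_THRESHOLD = 50   # hypothetical: what counts as "large" is not stated


def annual_cost(seats):
    """Yearly license cost in USD under the assumptions above."""
    cost = seats * SEAT_PRICE
    if seats >= DISCOUNT_THRESHOLD:
        cost = cost * (100 - DISCOUNT_PERCENT) // 100
    return cost


small_site = annual_cost(10)
large_site = annual_cost(100)
```

Even with the discount, a hundred floating seats would run to a few hundred thousand dollars per year under these assumptions.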

 

B) Maintenance

1) Who provides it and how much:

Who provides maintenance both local and external to the Lab? What are the fallbacks (if the maintainer(s) is run over by bus or the company folds)?

I would envision a pool of local knowledge (mailing lists, module repository) to supplement the hotline. As a commercial product, it would not be "maintained" by fixing the code, but by feeding questions/problems back to MathWorks and then updating the local installations. If the company folds we could go to Octave.

Are the maintainers responsive and are bug fixes turned around in a reasonable amount of time?

We didn't encounter any bugs. The support line was responsive in the case of our two questions.

Does the software maintainer need additional training (beyond that needed by users)? If so, is it available and at what cost?

Probably not - just relaying bugs and fixes.

What are the maintenance/licensing costs for commercial products?

Two ways to go. We can buy a four-year service agreement, in which case we pay for 4 years of upgrades/hotline/etc. for 20% of our purchase price. This includes perpetual licenses. Or, we can renew one year at a time, for 20% of the current price. In this case the license must be renewed each year.

How much software is there (line count)? How much needs to be supported locally (how many people required)? Can/should support be split up into areas of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant for non-commercial software that will be maintained locally.

Probably N/A.

In the case of commercial software, is source code available (in escrow)? This would be required for finding bugs locally or in case the company folds. This may be an additional cost.

It is not available.

2) Maintenance Infrastructure

What kind of build environment is provided? Is it robust? This is mostly relevant for non-commercial software that may need to be co-maintained.

Standard build kits for Unix and InstallShield kits for Win32. The product is distributed on CD; we could probably distribute it via ZIP files/upd kits.

Can the package be built AT ALL on new or different sub-systems? Root still provides NO makefiles.

Kits exist for all of our platforms.

Is the source repository accessible so that local support persons can select which changes to accept and which to reject for local use?

N/A

Will the software have to be maintained and/or extended locally and externally? If so, can the software be maintained in a common repository? If there are separate repositories, what is the commitment to keep them from diverging through modifications, extensions and bug fixes? This excludes locally maintained extensions which use pre-defined APIs or hooks into the product, which we will have to maintain ourselves in any case.

I can see us sharing modules and controlling those however we choose. But changes to the distribution itself will not occur.

Is the software passed through quality assurance software such as Purify or Insure++ before being put into production?

N/A

Are there any restrictions that would prevent the product from being placed in the Run II infrastructure (i.e. UPS/UPD)? In particular, is it possible to support more than one version of the product on the system?

No such restrictions.

Are release notes and change lists provided with releases? For example, the commercial product IDL comes with "what's new" and release notes lists.

Yes - release notes are provided.

3) Maturity and Completeness

Are there active mailing lists/FAQs/newsgroups for the product? How do they reflect on the product? Root has a support list, the commercial product IDL has FAQs and a newsgroup.

All of this exists. There is a newsgroup where users share ideas and code. The MathWorks WWW pages are very complete, with FAQs and the like.

Are recent releases extensions/enhancements and not bug fixes?

Yes. Version 5.2 is primarily an upgrade.

Are product releases reasonably paced and useful?

Yes. Much work is going into Win32 versions.

4) Modularity:

Will the tool/software need to be upgraded (additions/replacements) to satisfy Run II functional requirements, and how difficult will this be? Does the product provide API/hooks to easily interface locally written extensions? Would existing support be able to handle this or would manpower need to be added? For example adding a command line (e.g. python) to Histoscope is thought to be difficult. Anticipated additions and replacements should be identified.

The only big deficiency I see is possibly with the data management. It is unclear how this will behave when fed a 2GB ntuple. Some code to help manage such data may have to be written. The API is excellent and well-documented. CD (PAT, I guess) would probably be able to handle the support with existing manpower. This would include getting distribution and licenses set up, upd'ifying the product, and helping the experimenters get going with their data interface modules. Also preferable would be management of a local module base and mailing list. I see maintenance of this product requiring less work than a HEP-ware product.

Is the product modular? Is it broken down logically and physically into reasonably distinct sub-systems? Can some of these sub-systems be replaced by external packages with the same functionality? For example, in Root can Linear Algebra or Minimization be done by packages developed specifically to solve these topics separately from Root? A mini code review should be performed on the package to determine what would be involved in replacing such an identified component or sub-system.

I didn't actually do it, but I think such a thing could be done without much problem. For example, it would seem to be straightforward to replace (actually override) the plotting with OpenInventor graphics.

 

If new functionality needs to be added, is the software sufficiently modular such that the code changes can be localized? For example, it is believed this is not the case for adding STL support to cint. A mini code review should be performed to assess whether extensive and/or destabilizing code changes would be required to add the functionality.

N/A

Is the software sufficiently modular to be such that bug fixes are localized? A mini code review focused on a particular section or component of the software should provide information on this.

N/A 

If a component needs to be replaced in its entirety, is the software sufficiently modular such that a new component can be slotted in with minimal disruption? For example, Root depends on functionality in cint other than the interpreter in a fundamental way.

??

What distinct (external) packages or interfaces are required to build and/or run the package... Motif (shared libraries), OpenGL, etc. Are there external software components that are out of the maintainer's control? LHC++ depends on numerous commercial packages, Nirvana/Histoscope on motif. All such packages and interfaces should be identified.

This package is self-contained.

5) Portability

Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a specific Run II platform isn't supported, what it would take to get support for it should be determined.

All Run II platforms are supported. We could share licenses between Un*x platforms.

Porting of code to new platforms; this applies to non-commercial software that is currently only supported on selected systems. The issue to be raised is the ability of the original developers to accept changes to be incorporated into the base code so any porting done here is done once (aside from effects of future OS upgrades) and does not have to be re-done with each release of the software.

N/A

How sensitive is the package to minor OS and/or compiler and/or system header variations? Root under Linux may be sensitive to which C-libs are in use, which distribution you are using, which kernel you are using, which system header patches you have applied (esp., cint), and so on. PAW is sensitive to OS upgrades. This can be determined by looking at support history in newsgroups, other support logs, or talking to the user community.

64 bit considerations: Does the software run on 64 bit platforms/OS (alpha/Unix, SGI/Unix, future, e.g. merced)? Will it be difficult to port to 64 bit systems (a la COMIS for PAW)?

Works on 64-bit platforms.

Are there endianness and other heterogeneous environment considerations?

There don't seem to be any.

Is the software product build dependent on a specific compiler or is it compiler independent? If compiles are not needed for the product, are there any compiler dependencies present in the API used for locally written extensions? In particular, if the software needs to be built with Run II compilers, it should be verified if it can.

6) Standards:

Are standards followed? Compiler standards? Library standards (e.g. STL, POSIX)? Are they fully supported (e.g. cint and STL)?

N/A

Are there any support and maintenance standards or procedures? For example, any control over what goes into releases?

N/A

Are good coding practices (documentation) followed? Is there good developer's documentation (how easily can the product be "taken over"?)

N/A

Are good "Computer Science" techniques and methods used, for example in a language interpreter (see below)?

N/A

Is there a design methodology applied? Are any design tools used? (such as a code generator). If so, do we need to have and/or support these tools?

N/A

7) Reliability and Security:

Are there any security maintenance concerns with the product?

None

Is the product likely to crash and if so, how does it recover? What would be the impact? Do system managers need to intervene?

I wasn't able to crash it on NT. There are reports of a crash on the newsgroup, with what sounded like a typical seg fault.

Are any government regulations applied to the product? Export restrictions?

No government regulations. Foreign collaborators could probably make use of our licenses, but this may cost more.

Are there any Y2K issues?

8) Application specific:

Will an interface/adapter need to be made to fit the tool in with the rest of the analysis tool framework (e.g. data import/export)? Does the product provide well defined and documented API/hooks for such an extension?

Yes. The basic MATLAB data model is a double precision array. Interfaces will have to be written from experiments' data models.
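
To give a feel for what such an interface involves, here is a minimal sketch in Python (not MATLAB API code) that flattens event records into a single buffer of double precision values. The field names and the record layout are hypothetical and are not taken from either experiment's data model. Column-major ordering is used because that is how MATLAB lays out its arrays in memory.

```python
from array import array

# Hypothetical event fields; a real interface would take these
# from the experiment's own data model.
FIELDS = ("run", "event", "pt", "eta")


def events_to_matrix(events, fields=FIELDS):
    """Flatten event records into a column-major buffer of doubles.

    Returns (buffer, n_rows, n_cols), with one row per event and
    one column per field.
    """
    n_rows = len(events)
    n_cols = len(fields)
    buf = array("d", [0.0]) * (n_rows * n_cols)
    for row, event in enumerate(events):
        for col, name in enumerate(fields):
            # Every value is converted to a double, whatever its original type.
            buf[col * n_rows + row] = float(event[name])
    return buf, n_rows, n_cols


if __name__ == "__main__":
    sample = [
        {"run": 1, "event": 10, "pt": 25.5, "eta": -1.2},
        {"run": 1, "event": 11, "pt": 40.0, "eta": 0.3},
    ]
    buf, n_rows, n_cols = events_to_matrix(sample)
    print(n_rows, n_cols, list(buf))
```

Note that integer quantities such as run and event numbers become doubles in this scheme, and nested or variable-length structures would need an explicit flattening convention.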

Does the product use a language interpreter? If so, does it support the full language? Is it written following correct computer science techniques and algorithms? Is it sensitive to language changes, etc. (examples are COMIS and cint, which support only subsets of their language)? Or, turning the question around, is the language computationally complete? (Then forget about the language it was trying to emulate and treat it as a completely new language.)

MATLAB has a very nice interpreter which can be learned pretty easily. In addition, the commands can be saved in scripts or compiled to shared objects. It is not C++, though.

Is the data processing model correct? For example, in MATLAB, the model is to read in a file then process all its data - will this work with Run II sized files?

This is a problem - some code will have to be written to handle Run II sized files.
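
One common way around a read-everything-then-process model is to stream the file in bounded chunks and accumulate results as you go. The sketch below illustrates the idea in Python, not in MATLAB; the record layout (four little-endian doubles per event) and the function names are assumptions made purely for illustration.

```python
import io
import struct

# Assumed record layout: four little-endian doubles per event.
RECORD = struct.Struct("<4d")


def iter_chunks(stream, records_per_chunk=10000):
    """Yield lists of unpacked records, reading a bounded amount at a time.

    Assumes a regular file or in-memory stream, where read() returns the
    requested number of bytes except at end of file.
    """
    chunk_bytes = RECORD.size * records_per_chunk
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break
        # Ignore a truncated trailing record, if any.
        usable = len(data) - len(data) % RECORD.size
        yield [RECORD.unpack_from(data, off)
               for off in range(0, usable, RECORD.size)]


def column_mean(stream, column, records_per_chunk=10000):
    """Compute the mean of one column without holding the whole file in memory."""
    total = 0.0
    count = 0
    for chunk in iter_chunks(stream, records_per_chunk):
        for record in chunk:
            total += record[column]
            count += 1
    return total / count if count else float("nan")


if __name__ == "__main__":
    # Build a small in-memory "file" of five records for demonstration.
    raw = b"".join(RECORD.pack(float(i), 2.0 * i, 0.0, 0.0) for i in range(5))
    print(column_mean(io.BytesIO(raw), 0, records_per_chunk=2))
```

Only one chunk is resident at a time, so memory use is set by records_per_chunk rather than by the file size.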

Are there restrictions on the input data (say size or format)? For example, in Histoscope the size of ntuples is limited because it is stored and processed in memory.

MATLAB has a similar restriction at this time.
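
Because the basic data model is a double precision array (8 bytes per value), the in-memory footprint of an ntuple can be estimated with simple arithmetic. The sketch below assumes every variable is stored as a double and ignores any overhead; the event and variable counts are hypothetical.

```python
BYTES_PER_DOUBLE = 8


def ntuple_bytes(n_events, n_variables):
    """Bytes needed to hold an ntuple entirely in memory as doubles."""
    return n_events * n_variables * BYTES_PER_DOUBLE


def max_events(memory_bytes, n_variables):
    """Largest number of events that fits in the given amount of memory."""
    return memory_bytes // (n_variables * BYTES_PER_DOUBLE)


if __name__ == "__main__":
    # Hypothetical ntuple: one million events with 50 variables each.
    print(ntuple_bytes(1_000_000, 50), "bytes")
    # How many 50-variable events fit in 2 GB?
    print(max_events(2 * 1024**3, 50), "events")
```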

What is the minimal environment to run the product? How does performance/capability scale up while the environment scales up?

The program (Win32 version) runs fine on a minimal PC.