A Recommendation for Run II Physics Analysis Software

Eileen Berman, Philippe Canal, Frank Chlebana, Irwin Gaines, Herb Greenlee, Jeff Kallenbach, Rob Kennedy, Qizhong Li, Pasha Murat, Gordon Watts

Abstract:

This report contains the results of the Run II Physics Analysis Software Recommendations committee (PASREC). It contains a summary of the work completed by the committee as well as a recommendation for Run II analysis software.

I. Introduction:

Goals of committee:

Develop a short list of candidate products
Evaluate candidate products with respect to the PASFRG and PASSUMA criteria
Deliver a recommendation to the Run II Computing Steering Committee for a package to support physics analysis needs for Run II.

A survey revealed that the collaborations, while not having a near-term need for very high performance or complete functionality, do want a decision made very soon so they can begin the transition away from existing packages (PAW) and not have to make any additional transitions during Run II. Thus, they are asking for a product that is available now with most features, will satisfy the full set of functional requirements soon, and will be guaranteed to be supported throughout Run II.
The committee discarded PAW (no support, doesn't run on 64-bit systems) and LHC++ (maturity, portability).
The committee concentrated on ROOT, NIRVANA, and commercial/shareware packages, typified by MATLAB and IDL as commercial products and OCTAVE as shareware.

II. Objective statements about candidate products

ROOT:

ROOT is a complete, full-featured package that meets the functional requirements
There are some trivial unacceptable features (use of CMZ, lack of build scripts) which should not be a stumbling block, but will require a formal collaboration with the ROOT team
There is a large, world-wide user base, but so far limited use for serious HEP analysis
ROOT can cope with the CDF and D0 data models
ROOT has an effective internal data format well matched to HEP needs
The present version of CINT is a potential serious drawback (buggy, undocumented, limited C++ features, hard to support, poorly engineered). This will require a decision to enhance/upgrade/replace, which would require significant work.
The user interface is not very friendly
The interconnectedness of the various modules is substantial. External modules must conform to (ROOT specific non-standard) ROOT protocols to be functional.
The package is not highly engineered (i.e., it has grown organically rather than been designed). The current implementation reflects this evolution; for example, it has not kept up with the C++ language standard (it has its own container classes, etc.). Even beyond CINT, the product has many bugs.
It will require some relatively straightforward customization to support casual users
There is an active and responsive support team with good archives and an active mailing list

NIRVANA:

NIRVANA could become the core of a full package that meets the functional requirements, but it is not there yet.
It has excellent plotting facilities and GUI.
It is a well-engineered package, highly modular.
The proposed full featured NIRVANA adopts a sound strategy of relying on standards and distinct components each with their own support (like PYTHON) to provide plug-and-play capability
We would have complete control of NIRVANA development
There is no large user community for existing components, and only limited use outside FNAL
A minimum of 6-12 months would be required to include the scripting language, overall framework and HEP specific features
There will be no extensive experience with the full version of NIRVANA before the start of Run II.

Commercial products:

licensing costs are unclear, especially for university collaborators
dealing efficiently with very large data files is not yet demonstrated
not optimized for HEP style analysis (concentrates on unbinned rather than binned distributions, doesn't support histograms as dynamic objects as needed for online use)
Many attractive features, quickly getting much closer to our style of analysis: very nice scripting languages, good interfaces to FORTRAN and C++, excellent visualization, data models that support user defined structures, highly portable, etc.
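
To make the binned-versus-unbinned distinction above concrete, the following is a minimal C++ sketch of a fixed-width histogram that can be filled one entry at a time, as an online monitor would do while events arrive. The `Histogram1D` class and its methods are hypothetical names chosen for illustration; this is not the interface of ROOT, HBOOK, or any of the commercial products evaluated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical fixed-width 1D histogram (illustration only, not a real package API).
class Histogram1D {
public:
    Histogram1D(std::size_t nbins, double lo, double hi)
        : lo_(lo), hi_(hi), counts_(nbins, 0.0), underflow_(0.0), overflow_(0.0) {
        assert(nbins > 0 && hi > lo);
    }

    // Add one entry. Values outside [lo, hi) are tallied as under/overflow;
    // NaN values are ignored.
    void fill(double x, double weight = 1.0) {
        if (std::isnan(x)) return;
        if (x < lo_) { underflow_ += weight; return; }
        if (x >= hi_) { overflow_ += weight; return; }
        std::size_t bin = static_cast<std::size_t>(
            (x - lo_) / (hi_ - lo_) * static_cast<double>(counts_.size()));
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // guard against rounding
        counts_[bin] += weight;
    }

    double binContent(std::size_t i) const { return counts_.at(i); }
    double underflow() const { return underflow_; }
    double overflow() const { return overflow_; }
    std::size_t nbins() const { return counts_.size(); }

private:
    double lo_;
    double hi_;
    std::vector<double> counts_;
    double underflow_;
    double overflow_;
};
```

Only the per-bin sums survive a call to `fill`; the raw values are not retained, which is what separates this binned style from an unbinned sample that keeps every measurement.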

Shareware:

Octave typically lags one release behind the commercial alternative. This can be a problem since needed features may only be in the most recent commercial releases
Octave does not support Matlab visualization, but instead relies on the gnuplot package, which is totally inadequate.

III. Detailed Product evaluations from PASFRG and PASSUMA criteria:

ROOT - See Appendix 1

Nirvana - See Appendix 2

MATLAB - See Appendix 3

IV. Conclusions:

Either ROOT or Nirvana could meet the functional and support requirements

ROOT would require a dedicated team of three people, up to at least the beginning of Run II, for maintenance and support, especially for CINT improvement/replacement and GUI upgrades (see Rob Kennedy's slides in appendix 1) and to incorporate NIRVANA-like visualization capabilities. The support load might ease somewhat after the start of Run II.
Adoption of ROOT would be contingent on forging an effective collaboration with the ROOT team. The collaborative agreement must explicitly specify, at a minimum, the decision-making procedures for the future direction of ROOT, plans for the future of CINT, future support commitments, and any intellectual property issues associated with ROOT.
NIRVANA would require a dedicated team of three people for the next 12 months to add features not currently supported (see Philippe Canal's proposed plan in appendix 2). The support load would probably ease somewhat at that point.
The major differences between ROOT and NIRVANA are:
ROOT exists now and is immediately usable, if buggy; NIRVANA supports certain styles of analysis well already but would not offer full PAW-like functionality for some time
ROOT has a larger current user base and better possibilities to leverage off of outside support
NIRVANA would be under our complete control
NIRVANA is better engineered and would likely require less support in the long term (but we would need to supply all support locally)
NIRVANA offers better possibilities of easily integrating 3rd party products
Commercial products, while improving rapidly, are not close enough to our traditional styles of analysis to be exclusively adopted today (although they may very well be completely acceptable by the beginning of Run II)

V. Recommendations for Run II:

We recommend that ROOT be adopted as the standard physics analysis package for Run II, contingent on a collaborative agreement with the ROOT team. It should be recognized that this recommendation depends critically on timing and on sharing development with outside collaborators, and the steering committee should assess the validity of these assumptions in evaluating the recommendation. In particular, if the requirement for an immediate choice is being driven by on-line needs (which may not require the full functionality of an off-line analysis package immediately), it needs to be determined if the components of NIRVANA that already exist are adequate for the immediate needs.

VI. Long-term Recommendations:

It is highly likely that by the end of Run II (or by the time of the LHC) commercial components will be heavily used for analysis tasks. Commercial offerings should continue to be investigated and made available (perhaps on limited platforms). The Computing Division should also initiate formal collaboration with the LHC++ project so as to have some influence on the choices made and direction taken. These two initiatives, while lower priority than the immediate ROOT support and development needs, should position us to take full advantage of the expected evolution of these products.

Appendix 1 - ROOT Evaluation

Run II Physics Analysis Software Requirements Checklist: ROOT

DATA ACCESS

Access rates (online): ????

supports shared memory option
supports sockets

Access rates (offline) : 5-6 MBytes/sec demonstrated on Linux/Pentium w/o serious optimization

Serial vs random access: supports both modes

Granularity of access: leaf of the tree

Foreign Input and Output Formats:

h2root (PAW ntuple -- ROOT tree)
gh2root: conversion of GEANT3 hit banks into ROOT format

Specialized output formats: several standard objects:

ROOT Tree (ntuple),
ROOT histogram
ROOT containers (TClonesArray, for example)

DATA ANALYSIS

Scripting Language: C/C++ interpreted language (CINT)

full featured scripting language : YES

available; its major weakness is the flip side of its strongest advantage: C++ is much more difficult to interpret than languages like Python or Java, so not all the features of C++ are supported at the interactive level

analysis tool's object model :

data are stored in the external files in ROOT Trees,
primary interface: command line (interactive scripts, shared libraries)
GUI interface / GUI builder

extract data from events :

TTree::GetEvent(i)

express complex mathematical expressions : YES

scripting language is C+(+), so all the math is available (TMath class)

debugging facilities: YES

CINT supports a primitive level of debugging; one can step through the interpreted code
insert cout/printf statements

interface the scripting language to dynamically linked compiled high level languages

TSystem::Load(char* filename) loads an arbitrary shared library. An interface to the library's contents must still be provided; an automatic tool for generating such interfaces (rootcint) exists

User Control:

control functions: YES

interface to operating system

Mathematical operations: YES

Results of analysis available to users: YES

ROOT file
.ps, .eps, .gif, .C (ROOT macro)

command line recall and interactive command line editing: YES

Data Selection:

program selection criteria using extracted data: YES

display selection criteria as text : YES

normally stored in CINT scripts

Input/Output:

support its own object I/O format: YES

non-templated object format, objects are supposed to be derived from the base class TObject
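
The requirement that persistent objects derive from a common base class can be sketched as follows. This is only an illustration of the design pattern in standard C++; `PersistentObject`, `Track`, `serialize`, and `writeAll` are hypothetical names, and the text format written here is an assumption that has nothing to do with ROOT's actual file format or the real TObject interface.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical common base class: anything that can be written out derives from it.
class PersistentObject {
public:
    virtual ~PersistentObject() {}
    virtual std::string className() const = 0;
    virtual void writeTo(std::ostream& out) const = 0;
};

// Hypothetical user class that opts in to I/O by deriving from the base class.
class Track : public PersistentObject {
public:
    Track(double pt, double eta) : pt_(pt), eta_(eta) {}
    std::string className() const override { return "Track"; }
    void writeTo(std::ostream& out) const override { out << pt_ << ' ' << eta_; }

private:
    double pt_;
    double eta_;
};

// Serialize one object as "<class name> <payload>" (assumed toy format).
std::string serialize(const PersistentObject& obj) {
    std::ostringstream out;
    out << obj.className() << ' ';
    obj.writeTo(out);
    return out.str();
}

// The writer needs no templates: it only sees the base-class interface.
void writeAll(const std::vector<const PersistentObject*>& objects, std::ostream& out) {
    for (const PersistentObject* obj : objects) {
        assert(obj != nullptr);
        out << serialize(*obj) << '\n';
    }
}
```

The design choice this illustrates is that the I/O layer is written once against the base class, at the cost of forcing every stored type into a single inheritance hierarchy.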

allow its own format object files to be read or written from compiled programs: YES

read or write object files in foreign formats: YES

write selected event objects to one or more output streams: YES

supported, several output files may be opened at a time:
TFile f1("a.root","RECREATE");
TFile f2("b.root","RECREATE");

object definition language and/or be able to define new object formats programmatically: YES
- interpreted language is C++

read events in one format, convert and write them out in a different format: YES

virtual streaming: YES

Numeric and Mathematical Functionality:

accurate and precise numerical functionality, including double precision: YES
- all math operations implemented in double precision

Analysis capabilities applied to fetched data as well as subsequent renditions

Functions operating on multiple data sets

fit, parameterize, and calculate statistical quantities from data : YES
- uses MINUIT for fitting (the package tested and polished by the HEP community
over the decades)
- all the functionality of HBOOK package including histogram fitting is already available

user control of fitting algorithms: YES
- MINUIT has an extensive interactive interface
- source code is available

Offline Compatibility:

tailor the sequence of mathematical operations : YES

ability to include external software in their analysis: YES

also export CINT scripts into offline code

functionality of the analysis package linked into user defined code: YES

Prototyping:

prototyping of simple versions which can later be expanded upon: YES
Prototyped sequences contain the full interface of an arbitrarily complex version: YES

prototyped methods could be immediately used in the offline code

DATA PRESENTATION

Interactive visualization: YES

GUI builder is available,

simple graphics editor

Presentation quality graphical output: YES

TPostscript class (.ps, .eps)
.GIF format

Formal publication of graphical output: YES

LaTeX interface is coming

USABILITY

Batch vs. interactive processing: YES

interactive session: root
batch mode : root -b script

Sharing data structures: YES

Shared access by several clients: YES

Parallel processing (using distinct data streams): YES

Debugging and profiling:

debugging of the scripts

profiling of the compiled code

Modularity (user code): YES

Modularity (system code): YES

Access to source code: YES

http://root.cern.ch

Robustness: to be improved

Web based documentation: YES

http://root.cern.ch

Use of standards:

Portability: YES

all UNIX flavours
Windows (95/NT)
Mac OS

Scalability: YES

based on the consideration of scalability CDF has decided to write event data in ROOT format

Performance: better than HBOOK

User Friendliness: to be improved

this is our first experience with a large-scale C++ product

Run II Physics Analysis Software Support Requirements Checklist : ROOT

---------------------------------

PasSuma Checklist - ROOT

----------------------------

Contributors:
- Rob Kennedy     31-July-1998
- several additions by Pasha Murat (Aug 04 1998)

----------------------------

A) Support

1) Maturity and Completeness

a) What is the customer base and what is their experience and opinion? For
commercial software or non-HEP freeware, one should get a list of customers and
references.

RDK) 	The customer base is primarily HEP experimenters and support personnel,
with a number of experiments officially using ROOT as part of their data
handling mechanism or design. For example, here is a list of applications and
links to ROOT located at: http://root.cern.ch/root/ExApplications.html

	NA49 ROOT Physics Analysis Classes 
	ROOT Primer by Soren Lange 
	ROOT at GSI 
	More 3D visualization for the CMS Track Reconstruction Prototype 
	Clusterization in the CMS ECAL 
	The PHOBOS Analysis Toolkit 
	The E907 experiment 
	SAL Scientific Applications On Linux 
	The Cetus Links 
	The Rosebud Package 
	ROOT used for event monitoring in the Finuda experiment 
	ATLFast++, the ATLAS fast MonteCarlo 
	gh2root: Generates C++ classes to convert Geant3 KINE/HITS to ROOT 
	Direct Photons produced at RHIC energies 
	ROOT in STAR (large heavy ion experiment in Brookhaven) 
	The ALICE simulation/reconstruction framework 

PM) more customers:
 - STAR@RHIC decided to proceed with full-scale evaluation of ROOT as a
   CERNLIB replacement

 - CDF activities:
 - many physicists are trying to use ROOT for analysis
 - prototyping of ROOT-based online consumers
 - simulation project is prototyping ROOT tools
 - CDF SVXII test stand is writing the data out as ROOT ntuples
 - CDF Karlsruhe group is prototyping a ROOT-based interactive event display

b) How long has the product been in existence? What version is the product at?
How many major releases have there been? How often is there a minor release?
Several major releases or regular minor releases with integrated bug fixes are
good signs of a well supported mature product.  Availability of published books
on the product are also a sign of maturity as well as an established customer
base.

RDK) 	The ROOT project was started by Rene Brun late November 1994. His long
time collaborator Fons Rademakers joined the project around January 1995. In
the middle of August 1995 Nenad Buncic joined the team, followed by Valery Fine
in December 1995. Masaharu Goto created and supports CINT, including the ROOT
variant of CINT, RINT.

RDK)	The current version is 2.00/09 (ROOT's style of version numbers). There
have been two major releases that I know of (1.0 and 2.0), and patches appear
about every 2 to 4 weeks. The patches appear to be roughly equally divided
between developer-realized issues and user defect reports/feature requests.

c) How long will the product survive? Are there any competing products that are
likely to win the market (including freeware). Who is the product developer and
are they well supported financially (graduate student or full time staff).

RDK)	I recall that Pasha had a statement from Rene which was a commitment to
support ROOT until ?. It is unlikely that commercial packages will replace the
demand for this product. After all, they have not succeeded to date, and they
predate ROOT. With less certainty, in my opinion, it is unlikely that freeware
packages will replace ROOT either, since ROOT offers significant functionality
not found in other packages, such as interactive object browsing. ROOT is
developed by the "ROOT team" (see above) which consists of at least two
full-time developers, two part-time developers, and a number of specialists
working on specific aspects of ROOT as time permits.

PM) The ROOT team has an excellent record and many years of experience
with HEP software. R. Brun was the leading developer of CERNLIB, F.Rademakers
for several years has been maintaining CERNLIB, V.Fine ported CERNLIB to PC/Intel
architecture (DOS/Windows 95/Windows NT). The team is very productive.

???) Financial backing

2) Who supports users

a) Who provides consulting support? Commercial, other Lab, CERN, Fermilab?  Are
they responsive? Newsgroups and dejanews may provide some information on 
support response (though these tend to be biased). This is rather subjective 
and should be treated as such.

RDK)	Consulting support is primarily provided by Rene Brun and Fons
Rademakers, with FNAL local consulting unofficially provided as time permits by
Pasha Murat. In everyone's opinion I have spoken with, the ROOT support
turn-around after contacting them is good to outstanding. To get a better idea
of the support activity, see the ROOTtalk e-mail archive at:
http://root.cern.ch/root/roottalk/AboutRootTalk.html

b) Who can get support? Particularly for commercial software, can any user of
the product access the support services or are these limited to a pre-specified
list of local contacts.

RDK)	Anyone can get support. There is no requirement to sign up or provide
personal information before one can post to ROOTtalk, though one can sign up to
receive the e-mail and responses directly in one's e-mail browser. I prefer to
use the WWW interface to ROOTtalk myself. Presumably, if ROOT is selected by
FNAL, then some local support apparatus might be set-up to help alleviate the support
burden on the ROOT developers (as we have done with Kai, with C++, and other
topics).

c) Is the use of the product in the community enough that there is a pool of
people/knowledge to draw from for support if needed? HEP use should be 
assessed; PAW knowledge in the HEP community is widespread and Root is growing. A
dedicated newsgroup would be a plus.

RDK)	Yes, in my opinion, there are enough users of ROOT on different
operating systems (Unix and Windows) to provide a knowledge base outside of the
ROOT (support) team itself. ROOTtalk acts like a newsgroup, though the exact
mechanism is different (http://root.cern.ch/root/roottalk/AboutRootTalk.html).
ROOT in many ways can be used as an overhauled PAW, though the PAW model of
data analysis does not seem to lead to the most efficient use of ROOT.

PM) The ROOT user community is 99.9% HEP community. In practically all the US
HEP laboratories including BNL, LBL, LLNL, FNAL, SLAC there are physicists using 
ROOT. There are ROOT users at CERN and in Russia, again - in HEP community.
From the point of view of accumulated knowledge of the product, BNL and FNAL are 
already capable of providing the local support. 

d) Is user training needed and available? What is the cost?

RDK)	In my opinion, user training is needed, though there are many documents
available to help users understand and use ROOT. I think a five page tutorial
on how to start, interact with, and quit ROOT would be a good complement to
existing documentation. Also needed is English documentation on ROOT CINT, as
well as a one-page tutorial on what is known to work and what fails with CINT.
The cost of the five page tutorial would be very small. The cost of the CINT
documentation may be an FTE week or two of someone's time who is familiar with
CINT.

PM)   Many ROOT commands have their PAW counterparts
(hist/plot vs hist->Draw(), for example), so PAW users adapt to "ROOT philosophy"
pretty easily. New commands/tools require more training, mostly in C++ itself.
I also heard comments that using CINT makes it much easier for physicists to make
their first steps in learning C++.

e) Is training required and available for support staff? What is the cost (time
and money)? For commercial products such as SAS and IDL, support and user
training may be required to optimally use the product, and that cost should be
folded in.

RDK)	No specific training is required for support staff. ROOT includes
documentation of its internal data formats and implementation classes (not
visible to the user). Some of the mechanisms in ROOT are non-standard
(especially RTTI), and will take a few FTE weeks to document more completely for the
support staff.

f) How much (local) support will be required (is it complicated and hard to
use)? This and the remaining questions in this section can be determined by
talking to current users or scanning any newsgroups, mailing lists or FAQs.

RDK)	This depends heavily on how we plan to improve ROOT and CINT locally,
and how many of its limitations will be fixed by September 1 or we will simply
accept. Many new users have been productive with ROOT in a few days to a week,
but many heavy PAW users have found the transition in thinking to make using
ROOT seem complicated, tedious, and bug-prone. I think that once ROOT is
selected and a much larger number of users have adapted to it (provided examples
of analyses for others to use as examples), then the need for local support
will predominantly be for adapting ROOT to OS/compiler combinations not
supported officially by the ROOT team, and to add staff to the ROOT team to
handle uncovered defects and implement new features. Perhaps local support
could start out as dedicated ROOT testers to have the most impact on ROOT's
quality.

g) For commercial or freeware, what kind and quality of user level support is
provided?

RDK)	Via ROOTtalk, individual users interact via e-mail directly with Rene
and Fons. One does not always get an instant fix, but one does get a thoughtful
intelligent reply to your e-mail. In some cases, other users who know the 
answer will step in and answer your e-mail.

PM) there are 2 mailing lists - ROOTTALK and ROOTDEV - the first intended for
    general discussion, the second one - for bug reports. Most of the users
    use the 1st list for all the purposes.

h) Is the software completely and well documented at the user level?

RDK)	In my opinion, the software is well-documented as to what it *should*
do, but not necessarily as to what it is known to be capable of doing right
now. This is in part due to the development style of the ROOT team, which
emphasizes the goal functionality without listing what subset of this has been
fully "certified" as operational. The documentation is not as comprehensive and
reference-oriented as some commercial documentation I have seen, but it ranks
very well against other freeware package documentation.

PM) the ROOT software is extensively documented, the documentation system 
    is source-based, in this sense it is more developer-oriented. What could
    be improved is the documentation for the beginners (including non-experienced
    C++ users)

i) Is a system manager required in order to install and/or maintain the 
package? If so, this would significantly complicate matters for some remote
users who do not have ready access to (or a friendly relationship with) the system
managers of their computer.

RDK)	A system manager is not required to use this, but we probably will
distribute this from FNAL through the UAS UPS/UPD model, which implies that a
"products" support person will probably install the UPS root product on
machines. Some machines include an alternative "products" area administered by a
normal user, bypassing the requirement that someone have access to the
"products" account. ROOT is compatible with this approach too.

3) Licensing

a) What types of licensing are available?

RDK)	There is only one license, making ROOT free for non-commercial use.

b) What is the cost? For Universities? Lab?

RDK)	ROOT is free for non-commercial use for everyone. If you want to pay
for it, I am sure that the ROOT team will accept donations. They seem to be very
willing to accept computer accounts on machines with compilers that allow them
access to different OS/compiler combinations (cdfsga and Kai C++, for example).

B) Maintenance

1) Who provides it and how much

a) Who provides maintenance both local and external to the Lab? What are the
fallbacks (if the maintainer(s) is run over by bus or the company folds)?

RDK)	Currently the ROOT team and associates provide all the maintenance.
There is no reason to believe users (FNAL, for instance) cannot contribute to
maintenance and have changes rolled back into the ROOT repository. For now,
however, one must learn CMZ to do this, which inhibits users from working with
the source code to overcome locally discovered problems. I tried to get ROOT to
work under Linux2 with KCC v3.2 (local build with debug symbols) and made
little progress.

RDK)	Since the source code and build procedures are available (though we
would like to see them moved out of CMZ with kumacs and into CVS with
makefiles), anyone can provide maintenance. For now that is almost entirely the
ROOT team and associates. If Rene and Fons were on an ill-fated airliner, a
collaborative maintenance team could be formed which would function, like the
EGCS compiler development "team", using a world-readable CVS repository.
Clearly this would be different from having two or more maverick coders turning
out 100 new lines per day (my groundless guess), but the product would survive
the transition.

PM) Users from BNL started actively contributing into the code distribution.
 - S.Adler(BNL) generated rpm's for i386 and Alpha linux'es
 - D.Morrison (BNL) generated ROOT distribution tar-file based on GNU
   configuration tools - autoconf, libtool, and automake.


b) Are the maintainers responsive and are bug fixes turned around in a
reasonable amount of time?

RDK)	In my opinion and that of others, yes. Some defects unrelated to core
functionality take longer to get fixed, but this is a reasonable choice on the
part of the ROOT team.

c) Does the software maintainer need additional training (beyond that needed by
users). If so, is it available and at what cost.

RDK)	Right now, today, a maintainer must learn CMZ. I have done it with help
from Pasha, Pasha has done it, and we would not wish such on our fiercest
competitor. With a move to a CVS repository and makefiles, this burden will be
eliminated. ROOT is a diverse package, though. It includes elements of 
Graphics, HTML, Postscript, data structures, complicated RTTI, statistics, and 
basic data presentation. No one person here is likely to be able to cover all those
subjects at an expert level and maintain 100% of ROOT. It will cost some time 
to familiarize those expert in a subject with the source code in ROOT related to
that subject.

d) What is maintenance/licensing costs for commercial products?

RDK)	Maintenance is free. It would not hurt to give them an account on your
machine if you are working with an OS/compiler version to which they do not
have ready access.

e) How much software is there (line count)? How much needs to be supported
locally (how many people required)? Can/should support be split up into areas 
of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant
for non-commercial software that will be maintained locally.

RDK)	"The ROOT system consists of about 480,000 lines of code (390,000 lines
C++ and 92,000 lines C). The C language is used in CINT and in pieces of public
domain code that perform specific functions like, terminal I/O handling
(Getline), data compression (Zip) and the 3D interactive interface to X/Windows
(X3D)." Also, much of the C code is the result of translating F77 (MINUIT,
simulation packages).

RDK)	The support can be split up into areas of expertise fairly easily. All
of the maintainers would have to understand at some level the basic
infrastructure: memory management, data structures, IPC services, and so on.
Beyond that, the modularity in ROOT is based on high-level areas of expertise.
Here is a text translation of the "ROOT System Tree" to give some idea how this
is organized. Roughly each subject below has its own library of classes.

                            NA49
                             |
                       RINT (ROOT CINT)
                     CINT C++ Interpreter
                    /                 |
  Detector Description       User Interface Components     Minimization
        |                                       |            |
  Geometry Rendering\                           Formula Evaluation
          |          \---\                              |
  Style Management        Containers                 Ntuples
    |            |              |                       |
3D Graphics      |          Object I/O                Trees
    |            |              |                       |
2D Graphics      |              |                 /Histogramming
 |     |         |              |       /--------/      |
 |  Postscript   Object Runtime Services            IPC Services
 |     |         |              |                       |
X11/Windows/Mac Interface     Memory Management     OS Interface

f) In the case of commercial software, is source code available (in escrow)?
This would be required for finding bugs locally or in case the company folds.
This may be an additional cost.

RDK)	Source code is available, although it is maintained in a CMZ repository.

2) Maintenance Infrastructure

a) What kind of build environment is provided. Is it robust? This is mostly
relevant for non-commercial software that may need to be co-maintained.

RDK)	The maintenance and build environment is CMZ (CERN Patchy combined with
CERN Zebra). It is robust, but very clumsy and tedious to use and arcane in its
user interactions. In some cases, I have had to put in a symlink to the Kai
compiler in order to get CMZ to recognize where it is. Surely there is a way to
avoid this, but the symlink was faster than finding and reading CMZ
documentation. This is completely unacceptable, and would not be too difficult
to change (just time-consuming).

b) Can the package be built AT ALL on new or different sub-systems? Root still
provides NO makefiles.

RDK)	ROOT is beginning to include makefiles for some selected OS/compiler
combinations as of 2.00/09, but not many. Once one learns CMZ, and is willing to
edit code within CMZ, then one can adapt ROOT to new or different
systems. Once ROOT moves to CVS with makefiles, this will be much, much easier.

c) Is the source repository accessible so that local support persons can select
which changes to accept and which to reject for local use?  Root still uses CMZ.

RDK)	The primary source code repository is not available to the general
public. One can, if one knows where to look, get a *copy* of the CMZ file
containing all the source and build procedures for a particular version of ROOT.
One cannot tell from the filename, however, *which* version of ROOT a particular
CMZ file contains (a convention problem from the ROOT team). This should all be
changed to an open CVS repository as is used for Egcs compiler development.

d) Will the software have to be maintained and/or extended locally and
externally? If so, can the software be maintained in a common repository. If
separate repositories, what is the commitment to keep them from diverging from
modifications, extensions and bug fixes.  This excludes locally maintained
extensions which use pre-defined APIs or hooks into the product, which we will
have to maintain ourselves in any case.

RDK)	A small FNAL ROOT team might port ROOT to new OS/compiler combinations
that the ROOT team does not have access to and make high priority modifications
(fix shared memory access for online monitoring programs during data-taking).

RDK)	We should develop a model with the ROOT team that is somewhere between
a single shared repository (Egcs model) and a sub-ordinate repository where
changes are fed back to the "central ROOT team" for consideration for inclusion
in the master repository (CLHEP - FNAL Zoom model). I do not think we should
allow or tolerate divergence in separate sibling repositories (BaBar - CDF
Framework collaboration model) because resyncing the ROOT repositories will be
an overwhelming task which will make permanent divergence seem like an economic
alternative. ROOT as a product does not exclude any of these models once it is
moved to a CVS-based repository. For now, maintenance and development
collaboration with CMZ as the repository does not seem very economical as all
local personnel would have to be trained in using CMZ effectively in a
collaboration.

e) Is the software passed through quality assurance software such as Purify or
Insure++ before being put into production?

RDK)	To my knowledge, ROOT has not been "Purifyed" or "Insured". It is clear
that ROOT leaks memory, for instance. Its own statistics show that clearly.
ROOT sessions end with a memory allocation/de-allocation histogram which
documents a large number of memory leaks, some of considerable size. I do not know
how ROOT will be judged by a C++ 

f) Are there any restrictions that would prevent the product from being placed
in the run II infrastructure (i.e. UPS/UPD)? In particular, the ability to
support more than one version of the product on the system.

RDK)	No, there are not. I am doing exactly this with ROOT v2.00/08 in support
of CDF's Event I/O facilities. ROOT depends on a single environment variable,
ROOTSYS, to indicate the base of its internal file system. LD_LIBRARY_PATH is
sometimes required to be set on some OSs. This is very easy to implement in the
UPS/UPD framework.
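
As a minimal sketch of the point above, the following Python fragment builds a separate environment for each installed ROOT version, roughly as a product setup action might. The install area `/products/root`, the version directory names, and the `lib` subdirectory are assumptions made for illustration; they are not taken from the ROOT or UPS/UPD documentation.

```python
import posixpath


def root_environment(version, base_env, products_dir="/products/root"):
    """Return a copy of base_env configured for one ROOT version.

    products_dir and the "lib" subdirectory are hypothetical; only the
    variable names ROOTSYS and LD_LIBRARY_PATH come from the text above.
    """
    env = dict(base_env)
    rootsys = posixpath.join(products_dir, version)
    env["ROOTSYS"] = rootsys

    # Prepend this version's libraries so they win over any other version.
    libdir = posixpath.join(rootsys, "lib")
    previous = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = libdir if not previous else libdir + ":" + previous
    return env


# Two versions can coexist because each gets its own environment.
old = root_environment("v2_00_08", {"LD_LIBRARY_PATH": "/usr/lib"})
new = root_environment("v2_00_09", {"LD_LIBRARY_PATH": "/usr/lib"})
```

Because nothing is shared between the two dictionaries, switching versions amounts to choosing which environment a session is started with.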

g) Are release notes and change lists provided with releases? For example, the
commercial product IDL comes with "what's new" and release notes lists.

RDK)	Yes, there are good "CHANGES" files available on the web to allow users
to determine if a desirable patch is in a new version before downloading and
installing a new version of ROOT. They can be found at:
http://root.cern.ch/root/Availability.html underneath the version numbers at
the top of the page.

3) Maturity and Completeness

a) Are there active mailing lists/FAQs/newsgroups for the product? How do they
reflect on the product? Root has a support list, the commercial product IDL has
FAQs and a newsgroup.

RDK)	ROOT has an e-mail support list ROOTtalk which includes a search engine
on the e-mail digest, http://root.cern.ch/root/roottalk/AboutRootTalk.html.

b) Are recent releases extensions/enhancements and not bug fixes?

RDK)	My impression from reading through the CHANGES files for the last few
minor releases is that there is roughly a one-to-one ratio of added
features/classes and reactive modifications. By reactive modifications, I mean
changes to existing code which do not add functionality, such as defect
fixes, run-time improvements, and minor design-related changes.

c) Are product releases reasonably paced and useful?

RDK)	There have been two major releases that I know of (1.0 and 2.0) in the
last 15 months, and patches appear about every 2 to 4 weeks. The patches appear
to be roughly equally divided between developer-realized issues and user defect
reports/feature requests. Depending on the features in ROOT which you exploit,
the patch releases may or may not be immediately useful to you.

PM)  For example, this spring the concept of a multifile tree was introduced;
     in the coming release (2.11) we expect to have a new LaTeX interface for
     writing formulas.

4) Modularity:

a) Will the tool/software need to be upgraded (additions/replacements) to
satisfy Run II functional requirements, and how difficult will this be? Does 
the product provide API/hooks to easily interface locally written extensions? Would
existing support be able to handle this or would manpower need to be added? For
example adding a command line (e.g. Python) to Histoscope is thought to be
difficult. Anticipated additions and replacements should be identified.

RDK)	To meet the functional requirements, ROOT will at least need some work
done on CINT to make it more standards-compliant and more robust. A complete
C++ interpreter to replace CINT would probably require acquiring a C++
front-end (from EDG, US$60k) and applying roughly 2 to 4 FTE years of effort.
More modest approaches to improving CINT would require less effort, but are not
now well-defined. *** This answer should be better developed ***

RDK)	ROOT provides enough hooks to allow locally written extensions to be
used with ROOT in most cases. It would still take a fair amount of effort,
however, to replace an existing ROOT "module" like Linear Algebra with a
locally developed solution, due to ROOT RTTI expectations and ROOT module
interdependence.

PM) There are "2-way" hooks available:
- ROOT C++ classes could be used in the offline/online code,
- the existing offline shared libraries could be loaded in dynamically
  and be used within the ROOT interactive framework

b) Is the product modular? Is it broken down logically and physically into
reasonably distinct sub-systems? Can some of these sub-systems be replaced by
external packages with the same functionality? For example, in Root can Linear
Algebra or Minimization be done by packages developed specifically to solve
these topics separately from Root? A mini code review should be performed on 
the package to determine what would be involved in replacing such an identified
component or sub-system.

RDK)	ROOT is very modular, but some modules are also fairly interdependent.
It would be very difficult to remove some ROOT modules and re-use them outside
of the ROOT framework (which is not what is meant here, of course). It would
take a fair amount of extra effort to take a module like Linear Algebra and
install a replacement which has all the "ROOT-like" functionality of the
original, especially in the context of ROOT RTTI which permits interactive
object browsing. Another issue is that the ROOT Linear Algebra module has an
interface which must be preserved for other ROOT modules to continue to
function, thus probably requiring a replacement to use some interface adapter
layer before installing it.
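
The interface adapter layer mentioned above can be sketched as follows. This is only an illustration of the pattern: `ExternalMatrix` stands in for a locally developed package, and the method names on the adapter (`GetNrows`, `GetNcols`, `Mult`) are hypothetical stand-ins for whatever interface the other modules expect, not a statement of the actual ROOT Linear Algebra API.

```python
class ExternalMatrix:
    """Stand-in for a locally developed linear algebra class (hypothetical)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def multiply(self, other):
        # Plain row-by-column matrix product.
        columns = list(zip(*other.rows))
        return ExternalMatrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )


class LegacyMatrixAdapter:
    """Presents the interface assumed by existing callers on top of
    ExternalMatrix, so the callers themselves need no changes."""

    def __init__(self, impl):
        self._impl = impl

    def GetNrows(self):
        return len(self._impl.rows)

    def GetNcols(self):
        return len(self._impl.rows[0]) if self._impl.rows else 0

    def __call__(self, i, j):
        # Element access in the style m(i, j).
        return self._impl.rows[i][j]

    def Mult(self, other):
        return LegacyMatrixAdapter(self._impl.multiply(other._impl))
```

The adapter only translates calls. It does not supply the run-time type information needed for interactive object browsing, which is the part of the replacement effort the text identifies as the harder one.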

c) If new functionality needs to be added, is the software sufficiently modular
such that the code changes can be localized? For example, it is believed this 
is not the case for adding STL support to cint. A mini code review should be
performed to assess whether extensive and/or destabilizing code changes would 
be required to add the functionality.

RDK)	Most of ROOT, in my experience, appears to have fairly localized
functionality, allowing extensions to be fairly easily added. CINT is the
obvious exception. CINT is poorly and irregularly organized compared to C/C++
compilers, lacking distinct parsing, symbol table management, and action code.
Further, its fundamental design is flawed in that the parser itself is the wrong
variant to handle C++ syntax efficiently, requiring the syntax to be implicitly
expressed in coded procedures instead of an easily edited, conceptually clear
grammar.

d) Is the software sufficiently modular to be such that bug fixes are 
localized? A mini code review focused on a particular section or component of 
the software should provide information on this.

RDK)	Most of ROOT, in my experience, appears to be sufficiently modular,
allowing bug fixes to be fairly easily added. 

e) If a component needs to be replaced in its entirety, is the software
sufficiently modular such that a new component can be slotted in with minimal
disruption? For example, Root depends on functionality in cint other than the
interpreter in a fundamental way.

RDK)	Due to ROOT RTTI expectations, it would not be trivial to drop in
module replacements, but neither would it be technically difficult. The
exception to this is CINT, which has a "broader interface" to ROOT than a
simple C++ interpreter. CINT passes additional information about objects to the
rest of the ROOT infrastructure that a "traditional" C++ interpreter would not.
Also, it is not clear that the ROOT-CINT interface is well-documented.

PM) It is important to understand that the dependence discussed provides many 
unique features not available in other packages,  for example, ROOTCINT is used 
for automatic generation of dictionaries, which makes it trivial to hook up 
any external code.

f) What distinct (external) packages or interfaces are required to build and/or
run the package... Motif (shared libraries), OpenGL, etc.  Are there external
software components that are out of the maintainer's control? LHC++ depends on
numerous commercial packages, Nirvana/Histoscope on motif. All such packages 
and interfaces should be identified.

RDK)	No external packages are required to build and use ROOT, though OpenGL
appears to be capable of being used with ROOT. ROOT does not require Motif, 3D
X11 packages (one is supplied), or any other commercial/freeware package to be
supplied by maintainers or users. ROOT does supply, integrated into its source
tree, several freeware packages which it does use.

5) Portability

a) Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a
specific Run II platform isn't supported, what it would take to get support for
it should be determined.

RDK)	All Run II platforms are supported by ROOT. We (Pasha) have asked the
ROOT team to add support for the Kai C++ compiler, and they have done so with our
providing accounts for ROOT developers on appropriately equipped platforms. 

b) Porting of code to new platforms; this applies to non-commercial software
that is currently only supported on selected systems. The issue to be raised is
the ability of the original developers to accept changes to be incorporated 
into the base code so any porting done here is done once (aside from effects of
future OS upgrades) and does not have to be re-done with each release of the
software.

RDK)	Because ROOT is already supported on many more OSs than are being
considered for Run II, I do not think porting to new OSs is a potential
problem. Since they have already ported ROOT to a (relatively)
standards-compliant compiler, I do not think that porting ROOT to a new
standards-compliant compiler is a potential problem.

c) How sensitive is the package to minor OS and/or compiler and/or system 
header variations? Root under Linux may be sensitive to which C-libs are in use, 
which distribution you are using, which kernel you are using, which system header
patches you have applied (esp., cint), and so on. PAW is sensitive to OS
upgrades. This can be determined by looking at support history in newsgroups,
other support logs, or talking to the user community.

RDK)	This sensitivity of ROOT to minor OS/header changes was definitely a
problem with ROOT v1. Since then, ROOT has adapted to the same Linux
distribution that FNAL has chosen to support, and distributes ROOT for old and
new versions of the Linux C library. This does not mean that changing an
important system header will not affect ROOT, just that we do not now see a
problem with the changes I recommended to the FNAL Linux distribution.

d) 64 bit considerations: Does the software run on 64 bit platforms/OS
(Alpha/Unix, SGI/Unix, future, e.g. Merced)? Will it be difficult to port to 64
bit systems (a la COMIS for PAW)?

RDK)	The C++ code in general should be 64-bit clean. I do not know if the C
and F77-converted-to-C code is, but I suspect it is or can be easily adapted to
be 64-bit clean. I wonder about the implications though of no longer being able
to convert ntuples from HBOOK to ROOT since Zebra does not function on 64 bit
systems. Note that while Dec Unix and SGI IRIX are both 64 bit systems, we run
both with 32 bit pointers (Dec) or in 32 bit mode (SGI) largely because Zebra
does not function correctly on a complete 64 bit system.

PM) The most system-dependent part of the ROOT system - CINT - has been ported to
the 64-bit IRIX architecture at BNL.

e) Are there endianness and other heterogeneous environment considerations?

RDK)	ROOT currently only supports big-endian IEEE (IEEE floating point)
files. That is a reasonable choice during their rapid development phase; Trybos
at CDF has made the same choice. Nevertheless, we should require that ROOT
support little-endian IEEE files in the future to improve performance on
little-endian systems, such as Linux/Windows based on Intel chips and Dec Unix
based on Alpha chips.
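
To make the performance remark concrete: on a little-endian machine, every value read from a big-endian file has to have its bytes reversed before use. The sketch below uses Python's standard `struct` module to show the two byte orders for an IEEE 754 double. The helper names are illustrative only and say nothing about how ROOT implements its I/O.

```python
import struct
import sys


def pack_big_endian(values):
    """Serialize floats as big-endian IEEE 754 doubles (8 bytes each)."""
    return struct.pack(">%dd" % len(values), *values)


def unpack_big_endian(data):
    """Read big-endian doubles; struct swaps bytes if the host needs it."""
    count = len(data) // 8
    return list(struct.unpack(">%dd" % count, data))


def host_needs_byte_swap():
    """True on little-endian hosts (e.g. Intel, Alpha) reading such files."""
    return sys.byteorder == "little"


big = struct.pack(">d", 1.0)     # byte order as stored in the file
little = struct.pack("<d", 1.0)  # byte order native to Intel/Alpha
```

The two encodings of the same number are exact byte reversals of each other, so a file format that also allowed little-endian storage would let those hosts skip the swap entirely.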

f) Is the software product build dependent on a specific compiler or is it
compiler independent? If compiles are not needed for the product, are there any
compiler dependencies present in the API used for locally written extensions? 
In particular, if the software needs to be built with Run II compilers, it should
be verified if it can.

RDK)	The ROOT build procedures are mildly dependent on the compiler in use,
but not much more so than any other C++ product. After all, different compilers
have different switches to express similar concepts. ROOT libraries, because
they are built from C++ code, are specific to a particular C++ compiler and to
certain switches (exceptions on/off with Kai C++, threads on/off with MS VC++).
ROOT memory management is known to fail or misbehave on some OS-compiler
combinations for various reasons such as a compiler not allowing overload of
global new.

6) Standards:

a) Are standards followed? Compiler standards? Library standards (e.g. STL,
POSIX)? Are they fully supported (e.g. cint and STL)?

RDK)	ROOT is relatively standard C++, and hides vast OS dependencies in OS
interface modules (for Unix, Windows, Mac, etc). Currently, CINT has difficulty
dealing with many advanced C++ features, however, and so certainly cannot be
labeled standards-compliant.

b) Are there any support and maintenance standards or procedures? For example,
any control over what goes into releases?

RDK)	The ROOT team is small, and explicit written procedures for support and
maintenance do not seem appropriate.

c) Are good coding practices (documentation) followed? Is there good 
developer's documentation (how easily can the product be "taken over"?)

RDK)	ROOT follows a C++ coding style which is reasonable and which they have
published: http://root.cern.ch/root/Conventions.html. The documentation in
general is vast, though several pieces are missing. A short 5 page "What to do
when you first use ROOT" tutorial would be invaluable since users' first
experience with ROOT is through a clumsy command line interface (ROOT CINT)
which has a non-obvious command language ("How do I QUIT ROOT!"). Also missing
is significant English language documentation on what language (hopefully a
proper subset of C++) CINT does support.

d) Are good "Computer Science" techniques and methods used, for example in a
language interpreter (see below)?

RDK)	ROOT CINT is not a good example of Computer Science techniques. It's a
long story, and perhaps Chih-Hao and Scott Snyder can contribute to fill this
in from their PasFrg/PasSuma talks.

e) Is there a design methodology applied? Are any design tools used? (such as a
code generator). If so, do we need to have and/or support these tools?

RDK)	No higher level tools appear to be used in the development of ROOT.

7) Reliability and Security:

a) Are there any security maintenance concerns with the product?

RDK)	There are security concerns only if we use one of the ROOT data server
programs. These might be attacked to at least deny service, at worst to damage
or alter data. ROOT developers may not be aware of buffer overruns within ROOT
which could be exploited to at least "do anything" on a system which the
"owner" of the server daemon has permission to do, at worst to "do anything" on
a system, period. If we do not use ROOT server daemons, then the only security
issues might be with IPC services used, but these are minor concerns related to
who on a system can access the data (in shared memory, on a socket) of another
program.

b) Is the product likely to crash and if so, how does it recover? What would be
the impact? Do system managers need to intervene?

RDK)	ROOT does not in general use a master server daemon, so crashes do not
generally involve system managers. ROOT has some facilities to recover from
crashes, while a datafile is open for instance. Perhaps Pasha could elaborate?

PM) ROOT output buffers are regularly (after each 1 GByte written, for example) 
flushed out. The autosave frequency is defined by the user. Autosave doesn't affect 
efficiency of the disk space usage. A datafile recovery procedure equivalent to that 
of HBOOK ntuple recovery is available.
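
The autosave behaviour described above can be illustrated with a small writer that pushes its buffered records to the underlying stream each time a user-chosen number of bytes has accumulated. This is a sketch of the general idea only; the class name and its attributes are hypothetical, and it does not reproduce ROOT's file format or recovery procedure.

```python
import io


class AutosaveWriter:
    """Buffers records and flushes them once autosave_bytes have piled up."""

    def __init__(self, stream, autosave_bytes):
        self._stream = stream
        self._autosave_bytes = autosave_bytes  # user-defined autosave interval
        self._pending = []
        self._pending_size = 0
        self.flush_count = 0

    def write(self, record):
        self._pending.append(record)
        self._pending_size += len(record)
        if self._pending_size >= self._autosave_bytes:
            self.flush()

    def flush(self):
        if self._pending:
            self._stream.write(b"".join(self._pending))
            self._stream.flush()
            self._pending = []
            self._pending_size = 0
            self.flush_count += 1


stream = io.BytesIO()
writer = AutosaveWriter(stream, autosave_bytes=10)
for record in (b"abcd", b"efgh", b"ijkl", b"mn"):
    writer.write(record)
# If the job crashed here, only the records written since the last
# autosave (b"mn") would be missing from the stream.
```

A smaller interval bounds the amount of data lost in a crash at the price of more frequent writes; the same bytes reach the file either way, which is consistent with the statement that autosave does not affect disk space usage.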

c) Any government regulation applied to the product? Export restrictions?

RDK)	No, there are no US regulatory problems to my knowledge. I cannot speak
for other countries represented at FNAL.

d) Are there any Y2K issues?

RDK)	ROOT does use a ROOT-specific time/date format (why oh why?), and it
should be checked for Y2K safety.
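
One way such a check could be framed is as a round-trip and ordering test across the century boundary. In the sketch below the packed layout (`BASE_YEAR`, the bit widths) is invented purely for illustration and is not claimed to be the format ROOT uses; the part that carries over is the test function, which accepts any pack/unpack pair.

```python
import datetime

BASE_YEAR = 1995  # hypothetical epoch year of the illustrative format


def pack(dt):
    """Pack a datetime into one integer (illustrative layout only)."""
    return ((dt.year - BASE_YEAR) << 26 | dt.month << 22 | dt.day << 17
            | dt.hour << 12 | dt.minute << 6 | dt.second)


def unpack(word):
    """Inverse of pack for the illustrative layout."""
    return datetime.datetime(
        (word >> 26) + BASE_YEAR, (word >> 22) & 0xF, (word >> 17) & 0x1F,
        (word >> 12) & 0x1F, (word >> 6) & 0x3F, word & 0x3F)


def survives_rollover(pack_fn, unpack_fn):
    """Check that dates on both sides of 2000-01-01 round-trip and
    that the packed values still sort in chronological order."""
    before = datetime.datetime(1999, 12, 31, 23, 59, 59)
    after = datetime.datetime(2000, 1, 1, 0, 0, 0)
    round_trip_ok = (unpack_fn(pack_fn(before)) == before
                     and unpack_fn(pack_fn(after)) == after)
    ordering_ok = pack_fn(before) < pack_fn(after)
    return round_trip_ok and ordering_ok
```

A format that stores the full year offset passes this test, while one that keeps only a two-digit year fails the ordering check even if it round-trips.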

8) Application specific:

a) Will an interface/adapter need to be made to fit the tool in with the rest 
of the analysis tool framework (e.g. data import/export). Does the product provide
well defined and documented API/hooks for such an extension?

RDK)	No specific adapter is required if one uses ROOT as a physics analysis
tool. ROOT has enough hooks to allow user-developed formats to be read/written
which are compatible with ROOT data browsing at some level. CDF for instance is
attempting to use these hooks to write ROOT-compatible event data files.

b) Does the product use a language interpreter? If so, does it support the full
language? Is it written following correct computer science techniques and
algorithms? Is it sensitive to language changes, etc.? (Examples are COMIS
and cint, which support only subsets of their language.) Or, turning the
question around, is the language computationally complete? (Then forget about the
language it was trying to emulate and treat it as a completely new language.)

RDK)	ROOT uses a C/C++ interpreter called CINT. CINT supports a subset of
the Standard C++ language, but it is not well-documented *what* subset of C++
is supported. Since it is not specifically documented what language is
supported, it is impossible (if I understand the statement above) to say that
CINT is computationally complete. It is not written using modern Computer
Science techniques. For instance, there are no distinct parsing, syntax
checking, and parse tree navigation phases. There is no easily edited,
conceptually comprehensible grammar description.

RDK)	One of the marketing statements from ROOT is that there is "only one
language for users of ROOT to learn". This has clearly not been achieved. COMIS
was not modern Fortran, and CINT is even less modern C++. Users must learn not
only the features of C++ which are not supported (or are poorly supported) by
CINT, like templates, STL, C++ RTTI for dynamic dispatch, but they must also
remember the C/C++ expressions which ROOT cannot handle (like ***p++, which in
fact is not so absurd for certain coding styles). One cannot in general take
"sophisticated" C++ code and run it with CINT, and sometimes "simple" C++ code
fails too.

c) Is the data processing model correct? For example, in MATLAB, the model is
to read in a file then process all its data - will this work with Run II sized
files?

RDK)	ROOT supports both extremes of data processing: read all data and work
on it (efficient for small data sets), and read only a piece of the data,
process it, and then do it again with the next piece of data. The user has
control over, at least at the data model design level, where in between these
extremes an analysis job will operate.
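
The two extremes can be sketched in a few lines of Python. The function names are illustrative and the data are plain lists standing in for values read from a file; nothing here uses or describes the ROOT API.

```python
def mean_all_at_once(values):
    """Read-everything model: the whole data set is held in memory."""
    return sum(values) / len(values)


def chunked(values, size):
    """Yield successive pieces of the data, as a reader would from disk."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def mean_in_chunks(chunks):
    """Piecewise model: only one chunk is in memory at any time."""
    total = 0.0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total / count


data = [1.0, 2.0, 3.0, 4.0, 5.0]
whole = mean_all_at_once(data)
piecewise = mean_in_chunks(chunked(data, 2))
```

Both give the same answer; the difference is that the second keeps only running totals between chunks, so its memory use is set by the chunk size rather than by the size of the data set. Choosing the chunk size is the kind of decision the text places at the data model design level.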

PM) Conceptually, ROOT continues the development line we refer to as "CERNLIB",
and it has a very good chance of becoming the successor to CERNLIB.
The STAR collaboration at RHIC has started a full-scale evaluation of ROOT not only as a
PAW replacement, but also as a CERNLIB replacement.

d) Are there restrictions on the input data (say size or format)? For example,
in Histoscope the size of ntuples is limited because it is stored and processed
in memory.

RDK)	Input data must be in ROOT format, or have some code translate the data
on the fly into ROOT format. There is no practical limit to my knowledge on
input data size, but this may depend on the data model in use. ROOT exchanges
input data between disk and memory as needed. ROOT may be limited, depending on
the data model, to the virtual memory size of a machine for a piece of an
event/histogram (branch), for an entire event/histogram, or for a ROOT I/O
buffer. Given the cost of memory nowadays, I do not consider such limits to be
of any concern.

e) What is the minimal environment to run the product? How does
performance/capability scale up while the environment scales up?

RDK)	This may vary between Unix and Windows. For Unix, one needs simply a
supported system, 

???)	For Windows?

Appendix 2 - Nirvana Evaluation

Run II Physics Analysis Software Requirements Checklist: Nirvana

DATA ACCESS

Access rates (online):

Access rates (offline): 3 times HBook.

Serial vs random access: both are available. (random is in log(N))

Granularity of access: Column. Size of column chunk customizable at creation (63K)

Foreign Input and Output Formats: API is available for this.

Specialized output formats: All information about objects is retrievable from the API
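
As an illustration of how column access in chunks can offer both serial access and random access in log(N), the sketch below stores one column as a list of chunks and finds the chunk holding a given row by binary search over the chunk start offsets. This is an assumed structure chosen for the example; the checklist does not describe how Nirvana actually organizes its columns, and `ChunkedColumn` is a hypothetical name.

```python
import bisect


class ChunkedColumn:
    """One ntuple column stored as a sequence of chunks."""

    def __init__(self, chunks):
        self._chunks = [list(c) for c in chunks if len(c) > 0]
        self._starts = []  # row number of the first entry in each chunk
        total = 0
        for chunk in self._chunks:
            self._starts.append(total)
            total += len(chunk)
        self._length = total

    def __len__(self):
        return self._length

    def __getitem__(self, row):
        """Random access: binary search over chunk starts, O(log N)."""
        if not 0 <= row < self._length:
            raise IndexError(row)
        k = bisect.bisect_right(self._starts, row) - 1
        return self._chunks[k][row - self._starts[k]]

    def __iter__(self):
        """Serial access: walk the chunks in order."""
        for chunk in self._chunks:
            for value in chunk:
                yield value


column = ChunkedColumn([[10, 11, 12], [13], [14, 15]])
```

Here N is the number of chunks, so a larger chunk size means fewer search steps per lookup but more data read for each access.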

DATA ANALYSIS

Scripting Language:

A python interface will be added using these items in the design process. (3 man months needed)

full featured scripting language
analysis tool's object model
extract data from events
express complex mathematical expressions
debugging facilities
interface the scripting language to dynamically linked compiled high level languages

User Control:

The Python interface will provide those.

control functions
Mathematical operations
Results of analysis available to users
command line recall and interactive command line editing.

Data Selection:

The Python interface will provide those.

program selection criteria using extracted data
display selection criteria as text

Input/Output:

support its own object I/O format: YES
allow its own format object files to be read or written from compiled programs: YES
read or write object files in foreign formats: Can easily be added.
write selected event objects to one or more output streams: YES
object definition language and/or be able to define new object formats programmatically: YES
read events in one format, convert and write them out in a different format: YES
virtual streaming: Yes with very little work (thanks to the efficiency of random access)

Numeric and Mathematical Functionality:

Additional high-level functionality can be provided by Python add-ons.

accurate and precise numerical functionality, including double precision.: YES
Analysis capabilities applied to fetched data as well as subsequent renditions: YES
Functions operating on multiple data sets: Yes from C, Not yet from GUI (will be added in GUI upgrade)
fit, parameterize, and calculate statistical quantities from data: YES
user control of fitting algorithms. : YES (Minuit)

Offline Compatibility:

All of those will be available for both Python and C.

tailor the sequence of mathematical operations
ability to include external software in their analysis.
functionality of the analysis package linked into user defined code.

Prototyping:

prototyping of simple versions which can later be expanded upon. : may need to translate Python to C
Prototyped sequences contain the full interface of an arbitrarily complex version. : Yes

DATA PRESENTATION

The Graphical Interface needs to be upgraded to handle column-wise ntuples and operations on multiple histograms, and to improve the 'glue' between components (Histoscope, Nfit). (6 man months)

Interactive visualization: Yes

Presentation quality graphical output: Yes

Formal publication of graphical output: No; the PostScript output can be reformatted in Adobe Illustrator.

USABILITY

Batch vs. interactive processing: Yes

Sharing data structures: Yes

Shared access by several clients: Yes (very good)

Parallel processing (using distinct data streams): Yes

Debugging and profiling: Yes (python and C)

Modularity (user code): Yes

Modularity (system code): Yes

Access to source code: Yes

Robustness: Good

Web based documentation: Yes

Use of standards: Yes

Portability: Pretty good (has been ported to all needed platforms)

Scalability: Good

Performance: Good

User Friendliness: Very Good

Run II Physics Analysis Software Support Requirements Checklist: Nirvana

---------------------------------

PasSuma Checklist - Nirvana

----------------------------

Philippe Canal

----------------------------

Support

Maturity and Completeness

What is the customer base and what is their experience and opinion? For commercial software or non-HEP freeware, one should get a list of customers and references.

Difficult to quantify. However, NEdit, written by the same authors with the same technologies, is being used successfully by a large community. Python has a very large community of users and a large set of resources. Eventually Histoscope and NFit could be distributed as Python add-ons, thus growing the user base.

How long has the product been in existence? What version is the product at? How many major releases have there been? How often is there a minor release? Several major releases or regular minor releases with integrated bug fixes are good signs of a well supported mature product. Availability of published books on the product are also a sign of maturity as well an established customer base.

Has existed since 1991. New versions have mostly been for added features and ports to new operating systems. The newest version (alpha release of version 5.00) introduces column-wise ntuples and redefines some file operations.

How long will the product survive? Are there any competing products that are likely to win the market (including freeware). Who is the product developer and are they well supported financially (graduate student or full time staff).

Nirvana is 'owned' by Fermilab and will survive as long as Fermilab supports it.

Who supports users

Who provides consulting support? Commercial, other Lab, CERN, Fermilab? Are they responsive? Newsgroups and dejanews may provide some information on support response (though these tend to be biased). This is rather subjective and should be treated as such.

Fermilab

Who can get support? Particularly for commercial software, can any user of the product access the support services or are these limited to a pre-specified list of local contacts.

Fermilab's 'customer'.

Is the use of the product in the community enough that there is a pool of people/knowledge to draw from for support if needed? HEP use should be assessed; PAW knowledge in the HEP community is widespread and Root is growing. A dedicated newsgroup would be a plus.

Yes, it is already used by the HEP community, but there is no newsgroup. Python has newsgroups, and there are knowledgeable people in the HEP community.

Is user training needed and available? What is the cost?

Might need training for Python. GUI is good enough to be learnt on the spot.

Is training required and available for support staff? What is the cost (time and money)? For commercial products such as SAS and IDL, support and user training may be required to optimally use the product, and that cost should be folded in.

Available for Python.

How much (local) support will be required (is it complicated and hard to use)? This and the remaining questions in this section can be determined by talking to current users or scanning any newsgroups, mailing lists or FAQs.

Same as Python (and the Nirvana group for Nirvana's specific parts)

For commercial or freeware, what kind and quality of user level support is provided ?

Is the software completely and well documented at the user level?

Yes up to version 4. New features will need documentation.

Is a system manager required in order to install and/or maintain the package? If so, this would significantly complicate matters for some remote users who do not have ready access to (or a friendly relationship with) the system managers of their computer.

No.

Licensing

What types of licensing are available?

Free.

What is the cost? For Universities? Lab?

Free except for Fermilab's manpower.

Maintenance

Who provides it and how much:

Who provides maintenance both local and external to the Lab? What are the fallbacks (if the maintainer(s) is run over by a bus or the company folds)?

Fermilab

Are the maintainers responsive and are bug fixes turned around in a reasonable amount of time?

Yes

Does the software maintainer need additional training (beyond that needed by users). If so, is it available and at what cost.

No.

What is maintenance/licensing costs for commercial products?

N/A

How much software is there (line count)? How much needs to be supported locally (how many people required)? Can/should support be split up into areas of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant for non-commercial software that will be maintained locally.

Around 130000 lines. The interpreter, the graphics part, and the C implementation should be separable to a large extent.

In the case of commercial software, is source code available (in escrow)? This would be required for finding bugs locally or in case the company folds. This may be an additional cost.

Yes

Maintenance Infrastructure

What kind of build environment is provided. Is it robust? This is mostly relevant for non-commercial software that may need to be co-maintained.

ANSI C and Motif. Standard makefiles.

Can the package be built AT ALL on new or different sub-systems? Root still provides NO makefiles.

Yes

Is the source repository accessible so that local support persons can select which changes to accept and which to reject for local use? Root still uses CMZ.

Uses CVS

Will the software have to be maintained and/or extended locally and externally? If so, can the software be maintained in a common repository. If separate repositories, what is the commitment to keep them from diverging from modifications, extensions and bug fixes. This excludes locally maintained extensions which use pre-defined APIs or hooks into the product, which we will have to maintain ourselves in any case.

N/A

Is the software passed through quality assurance software such as Purify or Insure++ before being put into production?

Yes

Are there any restrictions that would prevent the product from being placed in the run II infrastructure (i.e. UPS/UPD)? In particular, the ability to support more than one version of the product on the system.

No

Are release notes and change lists provided with releases? For example, the commercial product IDL comes with "what's new" and release notes lists.

Yes

Maturity and Completeness

Are there active mailing lists/FAQs/newsgroups for the product? How do they reflect on the product? Root has a support list, the commercial product IDL has FAQs and a newsgroup.

No

Are recent releases extensions/enhancements and not bug fixes?

Yes

Are product releases reasonably paced and useful?

Modularity:

Will the tool/software need to be upgraded (additions/replacements) to satisfy Run II functional requirements, and how difficult will this be? Does the product provide API/hooks to easily interface locally written extensions? Would existing support be able to handle this or would manpower need to be added? For example adding a command line (e.g. python) to Histoscope is thought to be difficult. Anticipated additions and replacements should be identified.

Will need an upgrade of the graphical interface and the addition of an interpreter (with Python the likely choice). (The latter comment seems a little hasty to me.)

Is the product modular? Is it broken down logically and physically into reasonably distinct sub-systems? Can some of these sub-systems be replaced by external packages with the same functionality? For example, in Root can Linear Algebra or Minimization be done by packages developed specifically to solve these topics separately from Root? A mini code review should be performed on the package to determine what would be involved in replacing such an identified component or sub-system.

Yes.

If new functionality needs to be added, is the software sufficiently modular such that the code changes can be localized? For example, it is believed this is not the case for adding STL support to cint. A mini code review should be performed to assess whether extensive and/or destabilizing code changes would be required to add the functionality.

Yes

Is the software sufficiently modular to be such that bug fixes are localized? A mini code review focused on a particular section or component of the software should provide information on this.

To a large extent.

If a component needs to be replaced in its entirety, is the software sufficiently modular such that a new component can be slotted in with minimal disruption? For example, Root depends on functionality in cint other than the interpreter in a fundamental way.

Yes. Graphical Interface, interpreter and I/O are three sub-packages.

What distinct (external) packages or interfaces are required to build and/or run the package... Motif (shared libraries), OpenGL, etc. Are there external software components that are out of the maintainer's control? LHC++ depends on numerous commercial packages, Nirvana/Histoscope on motif. All such packages and interfaces should be identified.

Currently needs Motif and Minuit. Will need Python.

Portability

Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a specific Run II platform isn't supported, what would it take to get support for it should be determined.

Available on all platforms as of June 98.

Porting of code to new platforms; this applies to non-commercial software that is currently only supported on selected systems. The issue to be raised is the ability of the original developers to accept changes to be incorporated into the base code so any porting done here is done once (aside from effects of future OS upgrades) and does not have to be re-done with each release of the software.

ok.

How sensitive is the package to minor OS and/or compiler and/or system header variations? Root under Linux may be sensitive to which C-libs are in use, which distribution you are using, which kernel you are using, which system header patches you have applied (esp., cint), and so on. PAW is sensitive to OS upgrades. This can be determined by looking at support history in newsgroups, other support logs, or talking to the user community.

not very sensitive.

64 bit considerations: Does the software run on 64 bit platforms/OS (alpha/Unix, SGI/Unix, future, e.g. merced)? Will it be difficult to port to 64 bit systems (a la COMIS for PAW).

ready.

Are there Endianship and other heterogeneous environment considerations?

Nirvana writes big-endian files with IEEE floating-point numbers.
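
As an illustration of why a fixed byte order matters in a heterogeneous environment, the following is a minimal sketch (in Python, not Nirvana code) of writing and reading big-endian IEEE doubles. The sample values are invented for the example, and nothing here describes Nirvana's actual file layout.

```python
import struct

# Sample values, invented for this illustration.
values = [1.0, -2.5, 3.141592653589793]

# ">" selects big-endian byte order; "d" is an 8-byte IEEE 754 double.
packed = struct.pack(">3d", *values)

# Decoding with the same explicit byte order gives the original numbers
# on any host, whatever its native byte order.
decoded = list(struct.unpack(">3d", packed))

# Decoding the same bytes as little-endian gives different numbers,
# which is the failure a fixed on-disk byte order avoids.
misread = list(struct.unpack("<3d", packed))
```

Because the byte order is stated explicitly on both the write and the read side, a file produced on one architecture can be read unchanged on another.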

Is the software product build dependent on a specific compiler or is it compiler independent? If compiles are not needed for the product, are there any compiler dependencies present in the API used for locally written extensions? In particular, if the software needs to be built with Run II compilers, it should be verified if it can.

Standards:

Are standards followed. Compiler standards? Library standards (e.g. STL, POSIX). Are they fully supported (e.g. cint and STL)?

ok.

Are there any support and maintenance standards or procedures? For example, any control over what goes into releases?

the few authors have entire control.

Are good coding practices (documentation) followed? Is there good developer's documentation (how easily can the product be "taken over"?)

it's alright

Are good "Computer Science" techniques and methods used, for example in a language interpreter (see below)

yes

Is there a design methodology applied? Are any design tools used? (such as a code generator). If so, do we need to have and/or support these tools?

No high level design tool is currently used.

Reliability and Security:

Are there any security maintenance concerns with the product?

relies on .rhosts for some interprocess communication (actually rsh)

Is the product likely to crash and if so, how does it recover? What would be the impact? Do system managers need to intervene?

unlikely

Any government regulation applied to the product? Export restrictions?

Are there any Y2K issues?

Not that I know of.

Application specific:

Will an interface/adapter need to be made to fit the tool in with the rest of the analysis tool framework (e.g. data import/export). Does the product provide well defined and documented API/hooks for such an extension?

Yes.

Does the product use a language interpreter. If so does it support the full language. Is it written following correct computer science techniques and algorithms? Is it sensitive against language changes, etc. (examples are COMIS and cint, which support only subsets of their language). Or, turning the question around, is the language computational complete? (Then forget about the language it was trying to emulate and treat it as completely new language).

Yes

Is the data processing model correct? For example, in MATLAB, the model is to read in a file then process all its data - will this work with Run II sized files.

Yes

Are there restrictions on the input data (say size or format)? For example, in Histoscope the size of ntuples is limited because it is stored and processed in memory.

Histoscope can now also have disk-resident ntuples. The size limitation is now 2 GB.

What is the minimal environment to run the product? How does performance/capability scale up while the environment scales up?

I don't know for sure but it should be able to scale nicely.

Appendix 3 - MATLAB Evaluation

Run II Physics Analysis Software Requirements Checklist : MATLAB

Disclaimer - This report contains a combination of fact, observation, and opinion. It is our best understanding of the situation under our particular circumstances, and is not guaranteed to be correct or to apply to conditions other than those under which the evaluation was made. It is for the use of the Fermilab / HEP community only.

DATA ACCESS

Access rates (online):

Access rates (offline):

Serial vs random access: Both

Granularity of access: Matrix/row/column/element/structure element

Foreign Input and Output Formats: Available via F, C, C++ API

Specialized output formats: Available via F, C, C++ API

DATA ANALYSIS

Scripting Language:

full featured scripting language: Yes. MATLAB has its own scripting language, which is complete and includes looping, argument passing, and "capture" capabilities. Scripts can be stored in .m files using a history capture mechanism or the built-in editor.

analysis tool's object model: Double precision matrices; int and char are also supported.

extract data from events: Yes - commands to extract rows, columns, elements. It is also possible to have C structures in the arrays, so variable lengths are available.

express complex mathematical expressions: That's what it does

debugging facilities: Yes, and profiling

interface the scripting language to dynamically linked compiled high level languages: Yes - scripting language can call .dll or equivalent

User Control:

control functions: Yes

Mathematical operations: Matrix operations, ODE solvers, FFTs, curve fitting...

Results of analysis available to users: At any point - rows/columns can be stored to their own variables or printed out from the current array.

command line recall and interactive command line editing: Yes. Full command line editing, with a good editor (NT version) to save work. Both the command line editor and the command file editor have Win32 controls (ctl-C, ctl-V). The Win32 version provides DOS directory traversal.

Data Selection:

program selection criteria using extracted data: Yes

display selection criteria as text: Yes

Input/Output:

support its own object I/O format: Yes - double[][]

allow its own format object files to be read or written from compiled programs: Yes - std dll's produced

read or write object files in foreign formats: No

write selected event objects to one or more output streams: Yes

object definition language and/or be able to define new object formats programmatically: Yes

read events in one format, convert and write them out in a different format: Yes

virtual streaming:

Numeric and Mathematical Functionality:

accurate and precise numerical functionality, including double precision: Yes - default model

Analysis capabilities applied to fetched data as well as subsequent renditions: Yes

Functions operating on multiple data sets: Yes

fit, parameterize, and calculate statistical quantities from data: Yes - built-in hist, bar3 (lego) functions, polynomial fits, error bar plot

user control of fitting algorithms: Yes, or implement your own or use a library one pretty easily.

Offline Compatibility:

tailor the sequence of mathematical operations: Yes

ability to include external software in their analysis: As long as it eventually matches the data model

functionality of the analysis package linked into user defined code:

Prototyping:

prototyping of simple versions which can later be expanded upon: Yes - major strength. These .m files are simple to write, can be combined/split/changed to code

Prototyped sequences contain the full interface of an arbitrarily complex version: Yes

DATA PRESENTATION

Interactive visualization: Yes, with capability to write callbacks for additional functionality. Native "Handle Graphics". Also many advanced vis features - texture, lighting, ...

Presentation quality graphical output: Yes - see demo

Formal publication of graphical output: Yes. There also exists the capability to save the graphics in a native .m file for inclusion later. Printing can be in mono/color, level1/level2, eps/ps, Adobe Illustrator illustration file. There is an "online Notebook" feature using MS Word

USABILITY

Batch vs. interactive processing: Yes - licenses are cheaper too

Sharing data structures:

Shared access by several clients:

Parallel processing (using distinct data streams):

Debugging and profiling: Yes

Modularity (user code): Yes

Modularity (system code): N/A

Access to source code: No

Robustness: Seems good - error messages are precise, and the facility to trap and recover from errors in user code is present.

Web based documentation: Yes

Use of standards: N/A

Portability: All platforms

Scalability: Excellent

Performance: Good

User Friendliness: Outstanding

Run II Physics Analysis Software Support Requirements Checklist: MATLAB

---------------------------------

PasSuma Checklist - MATLAB

----------------------------

Jeff Kallenbach

----------------------------

A) Support

1) Maturity and Completeness

What is the customer base and what is their experience and opinion? For commercial software or non-HEP freeware, one should get a list of customers and references.

There are a number of references on Mathworks' WWW pages. They claim to have around 400,000 users worldwide. I have asked for HEP references (users at SLAC, BNL, LANL, etc.) and am waiting.

How long has the product been in existence? What version is the product at? How many major releases have there been? How often is there a minor release? Several major releases or regular minor releases with integrated bug fixes are good signs of a well supported mature product. Availability of published books on the product are also a sign of maturity as well an established customer base.

The product has been around for ~15 years. Current release is 5.2.

How long will the product survive? Are there any competing products that are likely to win the market (including freeware). Who is the product developer and are they well supported financially (graduate student or full time staff).

There is a freeware version, called Octave. It is unclear whether this will really overtake MATLAB, though. The fact that the Octave graphics are still very crude indicates that not a lot of work is being applied to areas we consider important.

2) Who supports users

Who provides consulting support? Commercial, other Lab, CERN, Fermilab? Are they responsive? Newsgroups and dejanews may provide some information on support response (though these tend to be biased). This is rather subjective and should be treated as such.

There are e-mail and phone hotlines for support questions. A local base of knowledge could easily be developed to help with FAQ and other straightforward questions. The newsgroup is busy, with about 800 articles posted in the last two to three weeks.

Who can get support? Particularly for commercial software, can any user of the product access the support services or are these limited to a pre-specified list of local contacts.

Anyone at the site would be able to contact the hotline

Is the use of the product in the community enough that there is a pool of people/knowledge to draw from for support if needed? HEP use should be assessed; PAW knowledge in the HEP community is widespread and Root is growing. A dedicated newsgroup would be a plus.

HEP usage is unknown. The newsgroup is busy, and the q & a in there look healthy, ranging from beginner-level inquiries to refinements of graphics output. There seem to be plenty of users.

Is user training needed and available? What is the cost?

It is available and expensive (about $500/person/2-day course). From my experience the product is pretty easy to get started with using the doc.

Is training required and available for support staff? What is the cost (time and money)? For commercial products such as SAS and IDL, support and user training may be required to optimally use the product, and that cost should be folded in.

There is such a thing, but it isn't yet priced.

How much (local) support will be required (is it complicated and hard to use)? This and the remaining questions in this section can be determined by talking to current users or scanning any newsgroups, mailists or FAQS.

The product seems very easy to use. I would think, though, that a local working group or mailing list would be helpful as we write our data interface modules and other HEP-specific functions. The volume of licenses that we will need will require some attention, as well as coordination with the license server people (fnalu-admin).

For commercial or freeware, what kind and quality of user level support is provided?

The helpwin facility and www-based documentation are outstanding. Response to technical questions from the hotline has been very good. The books and helpwin are full of examples. Getting started with the product is very easy.

Is the software completely and well documented at the user level?

Yes - see above

Is a system manager required in order to install and/or maintain the package? If so, this would significantly complicate matters for some remote users who do not have ready access to (or a friendly relationship with) the system managers of their computer.

Not required, but recommended. Workarounds exist for the case where peons do the installation. UPS/UPD tailoring would be pretty easy.

3) Licensing

What types of licensing are available?

FLEXlm floating seats exist for Un*x and NT (these are separate). However, any of our collaborators can use licenses off of our server. In addition, the Unix licenses are further broken down into interactive (programmer's) and batch licenses.

What is the cost? For Universities? Lab?

The cost is very high. They are quoting ~$3000 per floating Unix seat per year, with 20% discounts for large volumes.
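
To give a feel for the scale of these numbers, here is a rough worked calculation in Python. Only the ~$3000 seat price and the 20% discount come from the quote above; the seat count is hypothetical, and the volume at which the discount applies was not stated, so it is left as an explicit choice.

```python
SEAT_PRICE_USD = 3000         # quoted approximate price per floating Unix seat per year
VOLUME_DISCOUNT_PERCENT = 20  # quoted discount for large volumes

def annual_cost(seats, discounted):
    """Approximate yearly cost in dollars for a number of floating seats."""
    price = SEAT_PRICE_USD
    if discounted:
        # Integer arithmetic keeps the dollar amounts exact.
        price = SEAT_PRICE_USD * (100 - VOLUME_DISCOUNT_PERCENT) // 100
    return seats * price

# Hypothetical seat count, chosen only for illustration.
example_seats = 50
full_price_total = annual_cost(example_seats, discounted=False)
discounted_total = annual_cost(example_seats, discounted=True)
```

With these assumed inputs the yearly total is $150,000 at the full price and $120,000 with the discount applied.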

B) Maintenance

1) Who provides it and how much:

Who provides maintenance both local and external to the Lab? What are the fallbacks (if the maintainer(s) is run over by bus or the company folds)?

I would envision a pool of local knowledge (mailing lists, module repository) to supplement the hotline. As a commercial product, it would not be "maintained" by fixing the code, but by feeding questions/problems back to MathWorks and then updating the local installations. If the company folds we could go to Octave.

Are the maintainers responsive and are bug fixes turned around in a reasonable amount of time?

We didn't encounter any bugs. The support line was responsive in the case of our two questions.

Does the software maintainer need additional training (beyond that needed by users). If so, is it available and at what cost.

Probably not - just relaying bugs and fixes.

What is maintenance/licensing costs for commercial products?

Two ways to go. We can buy a four-year service agreement, in which case we pay for 4 years of upgrades/hotline/etc. for 20% of our purchase price. This includes perpetual licenses. Or, we can renew one year at a time, for 20% of the current price. In this case the license must be renewed each year.

How much software is there (line count)? How much needs to be supported locally (how many people required)? Can/should support be split up into areas of expertise (e.g. motif/graphics, interpreter, etc.). This is mainly significant for non-commercial software that will be maintained locally.

Probably N/A.

In the case of commercial software, is source code available (in escrow)? This would be required for finding bugs locally or in case the company folds. This may be an additional cost.

It is not available.

2) Maintenance Infrastructure

What kind of build environment is provided. Is it robust? This is mostly relevant for non-commercial software that may need to be co-maintained.

Standard build kits for Unix and InstallShield kits for Win32. The product is distributed on CD; we could probably redistribute it via ZIP files or UPD kits.

Can the package be built AT ALL on new or different sub-systems? Root still provides NO makefiles.

Kits exist for all of our platforms.

Is the source repository accessible so that local support persons can select which changes to accept and which to reject for local use?

N/A

Will the software have to be maintained and/or extended locally and externally? If so, can the software be maintained in a common repository. If separate repositories, what is the commitment to keep them from diverging from modifications, extensions and bug fixes. This excludes locally maintained extensions which use pre-defined APIs or hooks into the product, which we will have to maintain ourselves in any case.

I can see us sharing modules and controlling those however we choose. But changing the distribution itself will not occur.

Is the software passed through quality assurance software such as Purify or Insure++ before being put into production?

N/A

Are there any restrictions that would prevent the product from being placed in the run II infrastructure (i.e. UPS/UPD)? In particular, the ability to support more than one version of the product on the system.

No such restrictions.

Are release notes and change lists provided with releases? For example, the commercial product IDL comes with "what's new" and release notes lists.

Yes - release notes are provided.

3) Maturity and Completeness

Are there active mailing lists/FAQs/newsgroups for the product? How do they reflect on the product? Root has a support list, the commercial product IDL has FAQs and a newsgroup.

All of this exists. There is a newsgroup where users share ideas and code. The MathWorks WWW pages are very complete, with FAQs and the like.

Are recent releases extensions/enhancements and not bug fixes?

Yes. Version 5.2 is primarily an upgrade.

Are product releases reasonably paced and useful?

Yes. Much work is going into Win32 versions.

4) Modularity:

Will the tool/software need to be upgraded (additions/replacements) to satisfy Run II functional requirements, and how difficult will this be? Does the product provide API/hooks to easily interface locally written extensions? Would existing support be able to handle this or would manpower need to be added? For example adding a command line (e.g. python) to Histoscope is thought to be difficult. Anticipated additions and replacements should be identified.

The only big deficiency I see is possibly with the data management. It is unclear how this will behave when fed a 2 GB ntuple. Some code to help manage such data may have to be written. The API is excellent and well-documented. CD (PAT, I guess) would probably be able to handle the support with existing manpower. This would include getting distribution and licenses set up, upd'ifying the product, and helping the experimenters get going with their data interface modules. Also preferable would be management of a local module base and mailing list. I see maintenance of this product requiring less work than a HEP-ware product.

Is the product modular? Is it broken down logically and physically into reasonably distinct sub-systems? Can some of these sub-systems be replaced by external packages with the same functionality? For example, in Root can Linear Algebra or Minimization be done by packages developed specifically to solve these topics separately from Root? A mini code review should be performed on the package to determine what would be involved in replacing such an identified component or sub-system.

I didn't actually do it, but I think such a thing could be done without much problem. For example, it would seem to be straightforward to replace (actually override) the plotting with OpenInventor graphics.

If new functionality needs to be added, is the software sufficiently modular such that the code changes can be localized? For example, it is believed this is not the case for adding STL support to cint. A mini code review should be performed to assess whether extensive and/or destabilizing code changes would be required to add the functionality.

N/A

Is the software sufficiently modular to be such that bug fixes are localized? A mini code review focused on a particular section or component of the software should provide information on this.

N/A

If a component needs to be replaced in its entirety, is the software sufficiently modular such that a new component can be slotted in with minimal disruption? For example, Root depends on functionality in cint other than the interpreter in a fundamental way.

What distinct (external) packages or interfaces are required to build and/or run the package... Motif (shared libraries), OpenGL, etc. Are there external software components that are out of the maintainer's control? LHC++ depends on numerous commercial packages, Nirvana/Histoscope on motif. All such packages and interfaces should be identified.

This package is self-contained.

5) Portability

Platform availability: Linux, NT, Solaris, IRIX, HP, DEC-Unix ... ? If a specific Run II platform isn't supported, what would it take to get support for it should be determined.

All Run II platforms are supported. We could share licenses between Un*x platforms.

Porting of code to new platforms; this applies to non-commercial software that is currently only supported on selected systems. The issue to be raised is the ability of the original developers to accept changes to be incorporated into the base code so any porting done here is done once (aside from effects of future OS upgrades) and does not have to be re-done with each release of the software.

N/A

How sensitive is the package to minor OS and/or compiler and/or system header variations? Root under Linux may be sensitive to which C-libs are in use, which distribution you are using, which kernel you are using, which system header patches you have applied (esp., cint), and so on. PAW is sensitive to OS upgrades. This can be determined by looking at support history in newsgroups, other support logs, or talking to the user community.

64 bit considerations: Does the software run on 64 bit platforms/OS (alpha/Unix, SGI/Unix, future, e.g. merced)? Will it be difficult to port to 64 bit systems (a la COMIS for PAW).

Works on 64-bit platforms.

Are there Endianship and other heterogeneous environment considerations?

There don't seem to be

Is the software product build dependent on a specific compiler or is it compiler independent? If compiles are not needed for the product, are there any compiler dependencies present in the API used for locally written extensions? In particular, if the software needs to be built with Run II compilers, it should be verified if it can.

6) Standards:

Are standards followed. Compiler standards? Library standards (e.g. STL, POSIX). Are they fully supported (e.g. cint and STL)?

N/A

Are there any support and maintenance standards or procedures? For example, any control over what goes into releases?

N/A

Are good coding practices (documentation) followed? Is there good developer's documentation (how easily can the product be "taken over"?)

N/A

Are good "Computer Science" techniques and methods used, for example in a language interpreter (see below)?

N/A

Is there a design methodology applied? Are any design tools used? (such as a code generator). If so, do we need to have and/or support these tools?

N/A

7) Reliability and Security:

Are there any security maintenance concerns with the product?

None

Is the product likely to crash and if so, how does it recover? What would be the impact? Do system managers need to intervene?

I wasn't able to crash it on NT. There are reports of a crash on the newsgroup, with what sounded like a typical seg fault.

Any government regulation applied to the product? Export restrictions?

No government regulations. Foreign collaborators could probably make use of our licenses, but this may cost more.

Are there any Y2K issues?

8) Application specific:

Will an interface/adapter need to be made to fit the tool in with the rest of the analysis tool framework (e.g. data import/export). Does the product provide well defined and documented API/hooks for such an extension?

Yes. The basic MATLAB data model is a double precision array. Interfaces will have to be written from experiments' data models.
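
The kind of interface meant here can be sketched as follows. This is Python rather than MATLAB or its API, and the event records and field names are invented for the example; the point is only that each event is flattened into one row of double precision numbers, one column per selected quantity.

```python
def events_to_matrix(events, fields):
    """Return one row of floats per event, one column per named field."""
    return [[float(event[name]) for name in fields] for event in events]

# Hypothetical event records; the field names are invented for this example.
events = [
    {"run": 1, "pt": 25.4, "eta": -0.7},
    {"run": 1, "pt": 31.0, "eta": 1.2},
]

# Select the quantities to analyze; everything else is dropped.
matrix = events_to_matrix(events, ["pt", "eta"])
```

A real interface would also have to decide how to represent variable-length and nested quantities, which a plain rectangular array of doubles does not hold directly.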

Does the product use a language interpreter. If so does it support the full language. Is it written following correct computer science techniques and algorithms? Is it sensitive against language changes, etc. (examples are COMIS and cint, which support only subsets of their language). Or, turning the question around, is the language computational complete? (Then forget about the language it was trying to emulate and treat it as completely new language).

MATLAB has a very nice interpreter which can be learned pretty easily. In addition, the commands can be saved in scripts or compiled to shared objects. It is not C++, though.

Is the data processing model correct? For example, in MATLAB, the model is to read in a file then process all its data - will this work with Run II sized files.

A problem - some code will have to be written to handle Run II sized files.
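
One common way around a read-everything-then-process model is to read the file in fixed-size pieces and keep only running totals in memory. The sketch below shows the idea in Python; it is not MATLAB code, and the file layout (a flat sequence of big-endian doubles) and the function name are assumptions made for the example.

```python
import struct

def chunked_mean(path, chunk_values=65536):
    """Mean of a file of big-endian doubles, read one chunk at a time."""
    total = 0.0
    count = 0
    with open(path, "rb") as f:
        while True:
            # Never hold more than chunk_values numbers in memory.
            block = f.read(8 * chunk_values)
            if not block:
                break
            n = len(block) // 8  # whole doubles in this block
            values = struct.unpack(f">{n}d", block[: 8 * n])
            total += sum(values)
            count += n
    # An empty file yields 0.0 rather than a division by zero.
    return total / count if count else 0.0
```

Memory use is set by `chunk_values` rather than by the file size, so the same code works for a small test file and for a file much larger than available memory.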

Are there restrictions on the input data (say size or format)? For example, in Histoscope the size of ntuples is limited because it is stored and processed in memory.

MATLAB has a similar restriction at this time.

What is the minimal environment to run the product? How does performance/capability scale up while the environment scales up?

The program (Win32 version) runs fine on a minimal PC.

A Recommendation for Run II Physics Analysis Software

Abstract:

II. Objective statements about candidate products

V. Recommendations for Run II:

VI. Long-term Recommendations:

Appendix 1 - ROOT Evaluation

Run II Physics Analysis Software Requirements Checklist: ROOT

DATA ACCESS

DATA ANALYSIS

Input/Output:

Numeric and Mathematical Functionality:

DATA PRESENTATION

USABILITY

Run II Physics Analysis Software Support Requirements Checklist : ROOT

A) Support

1) Maturity and Completeness

2) Who supports users

3) Licensing

B) Maintenance

1) Who provides it and how much

2) Maintenance Infrastructure

3) Maturity and Completeness

4) Modularity:

5) Portability

6) Standards:

7) Reliability and Security:

8) Application specific:

Appendix 2 - Nirvana Evaluation

Run II Physics Analysis Software Requirements Checklist: Nirvana

DATA ACCESS

DATA ANALYSIS

DATA PRESENTATION

USABILITY

Run II Physics Analysis Software Support Requirements Checklist: Nirvana

Support

Maturity and Completeness

Who supports users

Licensing

Maintenance

Who provides it and how much:

Maintenance Infrastructure

Maturity and Completeness

Modularity:

Portability

Standards:

Reliability and Security:

Application specific:

Appendix 3 - MATLAB Evaluation

Run II Physics Analysis Software Requirements Checklist : MATLAB

DATA ACCESS

DATA ANALYSIS

DATA PRESENTATION

USABILITY

Run II Physics Analysis Software Support Requirements Checklist: MATLAB

A) Support

1) Maturity and Completeness

2) Who supports users

3) Licensing

B) Maintenance

1) Who provides it and how much:

2) Maintenance Infrastructure

3) Maturity and Completeness

4) Modularity:

5) Portability

6) Standards:

7) Reliability and Security:

8) Application specific: