Run II Physics Analysis Software Functional Requirements

This document describes the functional requirements for Physics Analysis Software for Run II. In general, these requirements should cover all software needed to access, analyze, and present, in reports and publications, data at the volumes that will need to be handled in Run II. The requirements are organized into several categories representing the major functions of access, analysis, and presentation, with a final category dealing with usability issues. Requirements containing the word "must" are mandatory, and failure to meet any of them would disqualify a product from consideration. Requirements containing the word "should" are desirable; all other things being equal, a product satisfying more of the desirable requirements would be preferred over one satisfying fewer. Each section of requirements is preceded by descriptive text giving some background justifying the specific needs.

DATA ACCESS

The software must allow data in a variety of formats to be retrieved for subsequent analysis in online, offline interactive, and offline batch environments. The rate of access must support common online and offline uses. Events must be accessible both serially and randomly, and data must be accessible in chunks smaller than entire events.
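The access patterns described above (serial access, random access, and access to chunks smaller than an event) can be illustrated with a minimal sketch. It assumes a made-up storage layout that is not taken from this document: each event is a set of named banks of double-precision values, and an index records where each bank starts. The names write_events, read_bank, iter_events, "tracks", and "calorimeter" are hypothetical and chosen only for illustration.

```python
import io
import struct

def write_events(stream, events):
    """Write events to the stream; return an index of
    event number -> {bank name: (byte offset, value count)}.
    The layout is a hypothetical one, used only for illustration."""
    index = {}
    for event_no, banks in enumerate(events):
        index[event_no] = {}
        for name, values in banks.items():
            index[event_no][name] = (stream.tell(), len(values))
            stream.write(struct.pack(f"<{len(values)}d", *values))
    return index

def read_bank(stream, index, event_no, name):
    """Random access to one bank of one event (a chunk smaller than the event)."""
    offset, count = index[event_no][name]
    stream.seek(offset)
    return list(struct.unpack(f"<{count}d", stream.read(8 * count)))

def iter_events(stream, index):
    """Serial access: yield whole events in event-number order."""
    for event_no in sorted(index):
        yield {name: read_bank(stream, index, event_no, name)
               for name in index[event_no]}

# Two toy events, each with two banks (invented values).
events = [
    {"tracks": [1.0, 2.0], "calorimeter": [10.5]},
    {"tracks": [3.0], "calorimeter": [20.25, 30.0]},
]
store = io.BytesIO()
index = write_events(store, events)

all_events = list(iter_events(store, index))          # serial read of everything
calo_1 = read_bank(store, index, 1, "calorimeter")    # one bank of event 1 only
```

In this sketch the index is what makes random and partial access cheap: a reader interested only in the calorimeter bank never touches the bytes of the track bank.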

It is unrealistic to expect all experiments to use a common data format (or even for a single experiment to use the same format for all stages of analysis). Data formats must support different optimizations based on different access patterns. These considerations lead to the requirements on input of foreign data formats and creation of specialized output formats.

Detailed Requirements:

DATA ANALYSIS

Data analysis consists of the related processes of selecting samples of events, performing analysis on these samples by calculating various mathematical functions from the data in the selected events, allowing interactive variation both in the selection criteria and the calculations performed, preserving samples of events in specialized (optimized) formats for later re-analysis, and preserving the functions and selection criteria.
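As a rough illustration of the processes listed above, the sketch below selects a sample of events, calculates a function over the sample, and preserves the selection criteria so that the same selection can be repeated later. The event fields (n_tracks, energy), the cut values, and the function names are all hypothetical, and storing the criteria as JSON is simply one convenient assumption, not something this document prescribes.

```python
import json

# Toy events with invented fields and values.
events = [
    {"n_tracks": 2, "energy": 45.0},
    {"n_tracks": 5, "energy": 91.2},
    {"n_tracks": 7, "energy": 12.5},
]

# Selection criteria held as plain data, so they can be varied
# interactively and saved alongside the results.
cuts = {"n_tracks": {"min": 3}, "energy": {"min": 10.0, "max": 100.0}}

def passes(event, cuts):
    """Return True if the event satisfies every min/max limit in cuts."""
    for field, limits in cuts.items():
        value = event[field]
        if "min" in limits and value < limits["min"]:
            return False
        if "max" in limits and value > limits["max"]:
            return False
    return True

def select(events, cuts):
    """Select the sample of events that pass the cuts."""
    return [event for event in events if passes(event, cuts)]

def mean_energy(sample):
    """One example of a function calculated from the selected events."""
    return sum(event["energy"] for event in sample) / len(sample)

sample = select(events, cuts)

# Preserve the criteria, then restore them and repeat the selection.
saved_cuts = json.dumps(cuts, sort_keys=True)
restored_cuts = json.loads(saved_cuts)
reselected = select(events, restored_cuts)
```

Keeping the criteria as data rather than as code is one way to make them easy both to change interactively and to preserve; a real tool might equally well store them in its scripting language.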

One important tool for this analysis is a scripting language which allows the physicist to specify both the selection criteria and the mathematical operations to be applied to the data, and to control the overall analysis, plotting, and presentation environment. This scripting language must therefore combine some of the functionality of programming languages with some of that of command-line or menu-driven control interfaces.

However, the basic requirement is that the analysis tool provide a rich interactive environment that supports easy control of data access and analysis description as well as interactive development of physics algorithms, with some level of compatibility with offline code, so that algorithms developed with the analysis tool are usable offline and offline code can be incorporated into analysis. It is felt that the most effective paradigm to meet this requirement is to require the analysis tool to support linking with external high level language (HLL) routines. The scripting language then does not need to be identical to any particular high level language (or subset of a language) as long as it allows basic data access, commands, simple evaluations, flow control and looping, and, most importantly, invocation of precompiled or dynamically linked high level language procedures. It is also important for the scripting language to support the offline object model for data. There is no requirement, however, for COMIS-like interactive functionality as long as the scripting language supports links to HLL routines.
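The idea of a scripting layer that invokes dynamically linked HLL procedures can be sketched as follows. This is only an illustration under stated assumptions: Python stands in for the scripting language, the standard C math library stands in for precompiled offline code, and the function transverse_momentum and the track values are hypothetical. The only library routine called is the C function hypot, loaded at run time through Python's ctypes module.

```python
import ctypes
import ctypes.util

# Locate the C math library at run time. If it cannot be found by name,
# fall back to the symbols already loaded into the process (POSIX only).
_libm_path = ctypes.util.find_library("m")
_libm = ctypes.CDLL(_libm_path) if _libm_path else ctypes.CDLL(None)

# Declare the C signature: double hypot(double x, double y).
_libm.hypot.argtypes = [ctypes.c_double, ctypes.c_double]
_libm.hypot.restype = ctypes.c_double

def transverse_momentum(px, py):
    """Hypothetical script-level wrapper around the compiled routine."""
    return _libm.hypot(px, py)

# Script-level looping and selection, with the arithmetic done in compiled code.
tracks = [(3.0, 4.0), (0.5, 1.2), (6.0, 8.0)]   # invented (px, py) pairs
selected = [t for t in tracks if transverse_momentum(*t) > 2.0]
```

The script supplies only data access, flow control, and the call itself; the calculation lives in a compiled routine that could be shared unchanged with offline code, which is the compatibility argument made above.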

It might be argued that portability and ease of use (and learning) considerations would suggest that the scripting language be identical to some existing HLL. However, it is felt that dynamic linking is a better way to support portability and offline compatibility. Even if the scripting language shares its syntax with some HLL, it will need many new commands to support data plotting and presentation that are not native to the HLL anyway. Moreover, the interactive scripting language will never be totally identical to the HLL on which it might be based, causing problems with new bugs and limited portability. It was therefore concluded that there is no requirement for the scripting language to be derived from some HLL, although it is recognized that, when used carefully, such a scripting language can have certain advantages.

Detailed Requirements:

DATA PRESENTATION

It must be possible to view the results of data analysis interactively and to save them in standard formats for presentation to colleagues and for inclusion in informal and formal publications. The analysis software needs to provide interactive tools to modify the various features of graphical presentations (colors, labels, etc.), and once the user is satisfied with the presentation on a computer terminal, the software needs to preserve essentially this exact image.

Detailed Requirements:

USABILITY

Besides the specific functions described above, the software needs to obey certain rules to ensure it can be widely and effectively used. These cover areas such as portability, performance, modularity, robustness, and use of standards.

Detailed Requirements: