Minutes of the Second MONARC Meeting at CERN, July 30 1998 (Draft)
     ____________________________________________________________

     Present:  Paolo Capiluppi (INFN/Rome), Mauro Campanella (INFN/Milano),
                  Dirk Duellmann (CERN/RD45), Philippe Galvez (Caltech),
                  Irwin Gaines* (FNAL),  Frank Harris* (Oxford), Ian McArthur*(Oxford),
                  Harvey Newman (Caltech), Bianca Osculati* (INFN/Genoa),
                  Les Robertson (CERN/IT), Simona Rolli (FNAL),
                  R D Schaffer (CERN/RD45), Leonello Servoli (INFN/Perugia),
                  Jamie Shiers (CERN/RD45), Krzysztof Sliwa* (Tufts),
                  Andrei Tsaregorodtsev (Marseilles),
                  Richard Wilkinson*(Caltech), David Williams (CERN/IT) 

     [*] By video


     Introductory Note:   These minutes are extracted from a long and somewhat involved
                                   meeting. MONARC members and especially the Institute group
                                   leaders are requested to read these minutes carefully, and
 


 
  o H. Newman opened the meeting, noting that its conclusions would be
     preliminary because Laura Perini could not be present.

     An initial go-round of remote participants indicated that there were various audio
     problems, some of which may have been associated with the reconfiguration of the
     transatlantic network and the fact that there were 2 other videoconferences going on.

          Action Item: In all future meetings, technical tests and setups will commence
                             30 minutes before the start of the meeting. Joining only at the
                             moment the meeting starts is not advised, especially
                             if the user and/or the setup is "new".

     The third MONARC meeting, shortly before the release of Draft -1 of the PEP, is
     scheduled for

         Tuesday August 18 at CERN at 16:00, with Virtual Room Video available.

     It is proposed that Laura Perini chair this meeting.

     The MONARC name was accepted.

  o Mauro Campanella submitted some of his ideas, distributed just before the
     meeting and derived from the note

      "The analysis model and the optimization of geographical distribution of computing
             resources: a strong connection" by him and Laura Perini, which may be found at

                     http://www.mi.infn.it/~cmp/modelsv1.html

      Mauro noted that many Models can be ruled out prior to any detailed simulations,
      and emphasized the need for more "paper Models" of the Analysis Process. A Model
      often requires too many resources in at least one area, or turnaround times are too
      long if one presumes to do certain operations (example: too-frequent re-reconstruction)
      during the course of the analysis. He then noted the
      advantage of information from experiments that are running now or will run soon
      in choosing a sensible initial guess (or guesses) at the Models to be simulated,
      and suggested that we explore finding collaborators from these experiments.
      [I note that several can be found amongst those on the MONARC mailing list.]
      Frank Harris expressed a similar opinion after the meeting, by E-mail.


               Action Items: Obtain collaboration with some of: Babar, D0, CDF, etc.

                                   Set up a team to begin looking at Models on paper, and eliminate
                                       those that are clearly unreasonable. 



 
      Mauro requested, and it was approved, that his note become the first MONARC
      technical note, to be labeled MONARC 98/01.


              Action Item: A volunteer to set up MONARC Web Pages, including links
                                     to Technical Notes, is requested.

      In summary, he emphasized the need to focus on a few areas in the Working Groups:
          - choice and validation of at least two simulation systems;
          - input parameters for the analysis models (experiments' wishes/goals, data
            on present experiments);
          - analysis of possible models;
          - database performance on WAN, setting up a distributed testbed.

    o Harvey presented a proposed PEP Outline, together with a detailed description of an
       idealized set of Phases, Tasks and Subtasks for the project. His slides may
       be found at



                       http://l3www.cern.ch/~newman/monarc/july3098/index.htm 

       or as a Powerpoint file, monarc730.ppt, in the /july3098/ directory above.

       The Phases were foreseen to follow a linear plan including Setup, Startup, Modeling,
        Refined Modeling, and Verification and Convergence, all between 10/98 and 12/99
        when the Computing Technical Progress Report was foreseen to be submitted.

        There are many items to set up, parametrize, install, manage and bring into operation.
        The behavior of the Object Database and of HPSS is complex and will be hard to
        abstract. It will be hard to focus on sensible Models early, and there may also be
        several "branch points" where Models may be inherently different (example:
        centralized versus highly distributed Models) and where the analysis of different
        Classes of Models may persist, perhaps for most of the Project.

       Paolo commented that the plan as shown was quite "linear" compared to the reality
       where the study goes through a development cycle (Simulate/analyze/evaluate/refine/
       resimulate, etc.). This is clearly true, but it was agreed that a defined project has to
       assume that each of these Study+development loops lasts for only a limited time.

       He also expressed concern, supported by many, about the tightness of the schedule given
       that there were many geographically dispersed people, nearly all with a limited time to
       work on the project. In this light the limit on the Project to 12/99 could be unrealistic.

       A review of the Tasks in the plan presented by Harvey led to more general concern
       about the depth of the problem(s) to be solved and the limited time available. Dirk
       asked about how the studies might account for certain major "events" such as loss
       of contact with a database partition over an extended period. It was said that the Models
       would have to assume that the major components, including the database, had to be
       reliable. RD45 was requested to provide information as to the "extra" resources
       required to robustly mirror the data, to satisfy the (heavy) requirements for reliability.
       Apart from the specifics, this discussion served to illustrate that even "refined"
       Model studies will fall short of the real level of complexity of the distributed
       systems the LHC experiments will build.

       Mauro emphasized that there will be intense activity in the early days of LHC
       running, until the analysis and software are in a somewhat stable "production"
       status. One cannot assume that the lower data rate will mean relatively low
       demands on the Computing Model in 2005 compared to when the LHC is running
       at full luminosity. This comment served to illustrate that certain uses of the
       resources will be restricted, and strict priorities might have to be set by each
       experiment as a matter of policy (example: the Production Reconstruction group
       gets absolute priority if resources are not adequate to cover all activities in the
       data analysis). Such policy decisions should be reflected in the Models where
       relevant.

       As a result of the discussion it was agreed that MONARC would not start out
       by requesting delays from the LCB, but the scope of work would have to be moderated
       to be more realistic. Since the primary goals of MONARC are to identify and
       classify some (not the unique) "plausible" Models, the nature of the milestones
       to be achieved within the allotted time of 12-14 months could be more flexible.
       In fact, the Computing Model considerations reported in the CTPRs at the end of 1999
       could simply reflect the state of advance of the MONARC project. The issue of
       whether the CTPRs should be a few months later in order to take advantage of
       a more adequate set of Model studies by MONARC will be for the LCB to decide
       during 1999.

   o  Harvey presented a first proposal for Working Groups, as the basis for discussion
       (see his slides for further details on the tasks of each WG):

       Systems Design, Analysis and Network Process Design, Simulations, Model Measurements,
            and Steering

       David Williams noted that Systems (or Site) "Design" might be better called
       Systems Parametrization; and similarly for the Analysis Process. It was noted that
       "Design" in the context of a Simulation study of a complex system with different
       levels of granularity, could mean as little as the specification of a few of the main
       components, their performances and interconnections (e.g. processing power, data storage
       and I/O channel speed for a computing system).

       There was some division of opinion as to whether there should be just three larger WGs,
       called Systems, Processes and Steering, or whether the proposed structure should be maintained.
       The reason for relatively small WGs was that people can work and intercommunicate better
       in a small group dealing with a limited number of issues. Sufficiently frequent meetings would
       be the way in which adequate inter-WG communication would be assured. There were different
       views expressed on the focus of the Project: to get down to the simulations of Models, even
       if the assumptions in the first ones were not so realistic (I. Gaines, C. von Praun); or
       alternatively to focus on getting a reasonable starting point by paper studies of the
       Analysis Process, and by detailed discussions of the possible site-architectures and funding
       constraints foreseen for 2005. The Steering Group, which everyone agreed was needed,
       would have to settle on the focus of the MONARC project plan and schedule, preferably
       during the month of August.



               Action: Form the Steering Group and have it begin discussions of these
                           issues. 

       From a practical standpoint it became evident that the structure and course of the
       project would depend greatly on the manpower actually available, and the interests
       and resources of each group in MONARC.



             Top Priority Action Item: Each group is to review the proposed tasks, and
                    Express:
                              (1) Their interests and experience in certain tasks
                              (2) Their desire/willingness to take responsibility for some
                                         of those tasks, or for an area of the Project.
                              (3) Nominations to serve as Working Group Chair
                                         or a subgroup chair
                              (4) Commitments of individuals to certain tasks
                                         (names; fraction of time on the project;
                                           periods of availability for work).

              Each group leader in the Project is requested to formulate and submit
              a response from his Institute to the MONARC management by August 17
              at the latest.
                                 


         David also noted that the Steering Group should take on the task of helping
         to define an initial class of plausible Models to be subjected to simulation study,
         based on their experience and knowledge of funding and political constraints.

         It was noted that the structure of the project need not stay fixed over
         time. Once a focused series of simulation studies was underway, it might be
         advantageous to reorient the WGs and the balance between them.

         A tie between these considerations and the need for an officially-adopted CERN
         vision of LHC Computing, including a clear view of its scale and unique character
         relative to nearer-term HEP experiments, was noted.

         The relationship of MONARC to CERN/IT was discussed. There are
         a number of areas where direct involvement by CERN/IT would be a great
         advantage:

         Les Robertson is asked to consider the above, and propose contacts in MONARC
         from the appropriate IT groups. Jamie Shiers and other RD45 members in MONARC
         are requested to help with all ODBMS aspects.
 
 
   o   The schedule of meetings and writing was then discussed.

        The difficulty of meeting deadlines in August, in the presence of business travel
        as well as vacations, was noted.

             August 1-21:    Work on the PEP Draft; informal meetings as needed.

             August 18:      Next meeting, with Laura Perini as chair. Review status of PEP
                             in the light of the upcoming release of Draft "-1" of the PEP.
                             Review the action items from the July 30 meeting, especially
                             the task/responsibility matrix.

             August 21 (?):  Distribute Draft "-1" for comment.

             August 27:      Meet to finalize comments on Draft "-1" and form Draft "0".
                             Assign one Editor to collect comments and further develop
                             the draft.
                             [NB -- this is during the Atlas SW Week at Ann Arbor.
                             Virtual room video is available from there, and it
                             is suggested that we conduct the meeting with
                             broad Atlas participation from Ann Arbor as well
                             as CERN.]

             Sept. 8:        Distribute Draft "0" for comment.

             Sept. 14:       Meet in Lyon (Monday morning);
                             final comments on Draft "0" discussed;
                             form Draft "1" and distribute for comment.

             Sept. 19:       (Saturday) Meeting at CERN for intensive work on the
                             Draft to be submitted to the LCB.

             Sept. 22:       Submit PEP to LCB.

             Sept. 26:       Very last comments on the PEP.

             October 1:      Final submission of the PEP to the LCB.

             October 6:      Present PEP.

             October 7:      Project (officially) starts.

    o  Volunteers for writing the PEP were requested.

       It was also requested that Krzysztof begin on the structure of the PEP very soon,
       as Laura and Harvey will have limited availability during the first half of August.

      It was agreed that the team of Krzysztof, Laura, Harvey, and Paolo would coordinate
      on the writing. Mauro volunteered for some writing assignments. CERN/IT contributions
      were also solicited, in the areas outlined above.

  o  Paolo suggested a solution for MONARC management, satisfying the constraints of

     Paolo's proposal was distributed by E-mail: Laura as Project Leader, Harvey as Spokesperson,
        and Krzysztof as head of the Simulation Group (tasks as defined on Harvey's slides).
        This was approved by persons present and the MONARC collaboration has been
        requested to give any comments by August 6.

    o AOB: The issue of the Modeler (update):

                A job description for the Modeler has been finalized:


         Modeler/Analyst of Distributed Computing Systems
         for Experiments at CERN's Large Hadron Collider

                       Job Description
         _________________________________________

 Task
 ____

 Development of a system of simulation software that allows the
 modeling of globally distributed computer systems for large-scale
 physics data analysis.

 This work will be performed in the context
 of a project being carried out by a collaboration of physicists
 and computer scientists from Europe and the US, studying future
 Models of how physicists could do the data analysis at experiments
 to be performed at CERN's Large Hadron Collider (LHC) starting in
 2005. Data analysis on this scale, or with this degree of geographical spread,
 does not exist now, but is expected to become possible through
 advances in information technology over the next several years.

 Qualifications
 ______________
 
 
 Education:
 ________

 A university degree in computer science or engineering, physics,
 electrical engineering, or the equivalent.

 Knowledge and experience:
 ______________________

 The successful candidate will understand how to define software
 abstractions of real computers and networks for the purpose
 of modeling and simulating their behavior in hypothetical
 distributed computing systems.

 A proven track record of software development in C++ and/or Java
 is required, as is experience and familiarity with software
 development on both NT and Unix operating systems. Candidates
 should be skilled at quickly understanding existing applications
 at the source code level, and be comfortable modifying, developing
 and porting those applications.

 Candidates must have a good knowledge of modern software design
 techniques, GUI-building and especially Object-Oriented analysis
 and design. They must understand the need for, and be capable of,
 fully documenting their work.

 Familiarity with existing (commercial or other) modeling tools,
 such as ModSim or ComNet, would be a major advantage.

 Good knowledge of English, including the ability to write clear
 documentation, is required. A working knowledge of French or Italian
 is an advantage.

 Location:
 ________
 
 The candidate will be located at CERN, but some travel to the U.S.
 and European countries is foreseen.

 Duration:
 ________

 The project will last for a minimum of two years. The candidate's
 contract is subject to annual renewal, for the duration of the
 project.

 Salary:
 ______

 From 5500 CHF to 7000 CHF/month, according to experience and level
 of expertise.



 
                It is hoped to find a candidate in August. All members of MONARC
                are invited to identify candidates, or post the job description to various
                lists. The successful candidate would work as an associate in CERN/IT
                Division and would be a staff member of Caltech.

                                                                                           -- Harvey Newman
                                                                                              August 1, 1998