Next Meeting: Tuesday March 23, 1999 16:00 MET
Topic: Regional Centres

Minutes of MONARC Architecture Working Group Meeting -- Mar. 9, 1999

I. The first part of the discussion centered on the draft document
==================================================================
reviewing the architectures of existing experiments.
===================================================

I.A. Lessons from the LEP experience
------------------------------------

David Williams addressed the issues that were raised concerning the
accuracy of the initial planning for LEP experiments (Feb. 23 action
item 2). He described various studies and reports from that period.
First, there was a study that produced the so-called Green Book which
he described as an attempt to convince CERN to invest significant
money in computing at LEP startup. By way of reminding us of the
issues that were significant at the time, he mentioned that much of
the discussion was about VM/CMS, VMS, and networking. In the 6 months
after the appearance of this report, CERN did establish a budget line
of 8 MCF for LEP computing. This was then followed by a somewhat
briefer report, the MUSCLE Report, that specifically dealt with issues
of CPU and disk capacity and the capacity of the networks required for
LEP computing. In July of 1989, a report called "Computing at CERN in
the 1990's" was released. David remarked that this report was
technically interesting but had little impact because it was
commissioned under one Director General but appeared just as a new
Director General was starting.

We then turned to how to use the information from LEP to advantage in
our MONARC report. It was observed that there is a big difference
between the computing resources needed at startup and those needed
when the machine really achieves (or approaches) its design
luminosity. Tim Smith agreed to try to compare CERN advance plans for
LEP, the LEP numbers at startup, and the situation when high
luminosity operation was achieved.

We agreed that one goal was to determine whether we, as a field, have
a 'reasonable track record' in preparing for new machines and
predicting the computing resources they will need at turn-on and as
the program goes into full swing. Irwin said that he felt that there
were two models of planning for computing: one can try to make
accurate projections or one can just say that we need whatever we can
get and push for as much as possible. He expressed the hope that for
LHC we will be able to successfully employ the first model. Joel said
that he believed that we did not do well when we attempted to
understand the needs for something completely new but that once we had
a 'baseline' of relevant experience to draw from, so that our
projections were incremental, we did reasonably well. Vivian
reformulated David's argument as 'politics tends to push the computing
budget low so that we always need more'. Vivian also felt that we
should cite LEP as an example of how technology changes were
incorporated during the course of the program and were absolutely
critical to meeting the needs.

In terms of improving our information base, Tim is able to get CERN
statistics on experiment utilization over the last 10 years. David and
Tim will try to track the evolution of disk and CPU during this period
and also their costs.

I.B. Comparison of LHC to FNAL Run II
-------------------------------------

People wanted to understand why there was a factor of 100 in CPU
requirements between Run II at FNAL and LHC. Irwin said that he felt
that there was really a factor of 5 in the amount of data (even though
the number of events was similar), that the complexity level was much
higher (4-12 times more interactions/crossing, higher energy), and
that a large penalty for Objectivity (or other database) overhead had
been recently added. He felt that this meant that the factor of 100
was at least understandable although it might represent an
overestimate of a factor of 2.

I.C. Objectivity Overhead
-------------------------

While people do not believe that a factor of three for Objectivity
overhead is necessarily inevitable, all agreed that some cushion or
contingency was needed in the estimates at this point and that we
shouldn't reduce the numbers in the report.

I.D. Distributed Computing
--------------------------

We continue to look for examples of distributed processing. We had
hoped to have information on E791, which distributed its primary
reconstruction, E831/FOCUS, which did its splitting and stripping at
two sites away from Fermilab, and D0's CBPF group, which ran a remote
reconstruction service over the network in Brazil for D0. The
information was not available at the meeting but has since surfaced.
It will be posted on the MONARC Architecture Web page.

I.E. Action List for the Report
-------------------------------

Vivian gave a summary of the status of the Action List:

1) Expand the analysis of why Monte Carlo was more successfully
   distributed than other parts of the task (Vivian)
   **** IN PROGRESS

2) Locate a copy of the Green Book and look up initial LEP estimates
   (Tim)
   **** MUCH WORK ON THIS REPORTED VIA EMAIL AND AT MEETING

3) Get permission and list names of 'contact persons' on experiments
   -- (survey team)
   **** DONE. I WOULD LIKE TO ADD SOME WORDS THANKING THEM AND SAYING
   SOMETHING THAT WILL TAKE THE HEAT OFF THEM IF PEOPLE IN THEIR OWN
   COLLABORATIONS HAVE OTHER OPINIONS ABOUT WHAT HAPPENED.

4) Get 'range of times' for analysis tasks (survey team)
   **** IN PROGRESS

5) Contact Laura to learn whether we can get a list of resources
   available for analysis at Italian institutions (Joel)
   **** NEED TO CONTACT MAURO NEXT

6) Add FOCUS (FNAL E831) to the survey (Joel)
   **** DONE (by Vivian)

7) Try to learn about D0 Monte Carlo production at CBPF and elsewhere
   (Mike) and Michigan State ACP farm (Joel)
   **** IN PROGRESS. INFORMATION IN HAND AND TO BE POSTED TO WEB PAGE
   ASAP

8) Add paragraph above into conclusions (Vivian)
   **** DONE

9) Generate revised draft (Vivian)
   **** IN PROCESS. APPARENTLY NEVER-ENDING TASK

We agreed that after one more draft, we would submit this to the
MONARC management for review.

II. Regional Centers
====================

II.A Status of Regional Centers Discussion
------------------------------------------

Luciano briefly discussed issues concerning the Regional Centers. He
said that he thought that in order to define a basic computing model,
we needed to answer some very basic questions such as the number of
regional centers per experiment and the number of regional centers per
country. There was a brief discussion about which nations might be
interested and able to establish major centers of the kind we have
been referring to as 'Tier1'. Possibilities include France, Italy, US,
UK, and Japan. Steve O'Neal described a facility being constructed at
Liverpool for LHCb simulation. It consists of a 300-processor PC farm.
He felt this was a promising development which boded well for support
of LHC computing in the UK. It was also noted that IN2P3 had a formal
memorandum of understanding with BaBar/SLAC. It was suggested that we
might learn something from their experience in working with SLAC.

II.B Recent Developments
------------------------

COMMENT: Some circumstances external to the Architecture Working Group
have pushed the discussion of Regional Centers to the forefront. These
will influence our schedule of activities. Two communications which
were already circulated to the MONARC mailing list are included below
for your convenience. For the Architecture Working Group, the most
urgent task is now to make the statement, extracted from the messages
below, come true!

   "Within two weeks or less, we expect to release our draft documents
   that will briefly describe estimated resource and service
   requirements."

Laura Perini's summary of meeting with LCB
++++++++++++++++++++++++++++++++++++++++++

2 - Already at the December LCB meeting, when MONARC was approved, we
were asked also to provide a set of guidelines for Regional Centers
(see LCB minutes). This request has been stressed again, and in the
discussion it came out clearly that this point also has a political
content (related, e.g., to the geographical coverage and constituency
of the Centers) that needs to be discussed in the experiments, where
the right contacts at country level should be naturally available. It
has however been noted that at some sites the Center will be common
to at least 2 LHC experiments, thus making the MONARC approach to
commonality especially valuable.

We are required to present our ideas on the guidelines for Regional
Centers already in the June progress report, together with the work
plan for phase 2 and our ideas on a possible continuation for a
phase 3 of implementation. During today's discussion someone noticed
that stressing the point of guidelines for Regional Centers could be
seen as an encouragement for a continuation of the project, as this
point is clearly implementation oriented.

I propose that at the next plenary on Monday 15th we discuss how to
proceed, in connection with the experiments and the candidate RC's
that are already in MONARC, for producing a first agreed version of
the documents on RC's which have been prepared in the Architecture WG;
a special MONARC meeting, with the participation of the
representatives of the candidate RC's, to be well prepared in advance,
could be needed (sometime in April?).

Announcement of the first meeting of Regional Centre Representatives of the
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
MONARC project
++++++++++++++

Dear Colleagues and Regional Representatives,

I am writing to invite you to the first meeting of Regional Centre
Representatives of the MONARC project, to be held at CERN on April 13
at 16:00 - 18:00 (Geneva time). Attendance by ISDN videoconference
will be possible.

The MONARC (Models of Networked Analysis at Regional Centres) project
is studying Computing Models that will be able to meet the challenging
data access and analysis needs of the LHC experiments. For CMS and
ATLAS, for example, each of the experiments foresees that there will
be several regional centres in addition to a large facility at CERN,
linked by very high speed wide area networks. This distributed model
is being designed to meet the physicists' analysis needs in a
responsive and cost-efficient manner.

The MONARC Site and Networks Architecture Working Group has made
significant progress towards understanding the scope of the problem,
in terms of the quantitative needs for computing and storage, and the
services required to allow a Regional Centre to function effectively.
It is now the right time for us to begin discussions in a broader
community, especially with representatives of centres (or initiatives
for future centres), in support of physicists working on LHC
experiments. These discussions are essential for us to better define
and converge on the technical requirements, and to understand the
particular conditions that may apply in each country or region.

Within two weeks or less, we expect to release our draft documents
that will briefly describe estimated resource and service
requirements. These documents should provide you with useful
information, and will be used as a basis of further discussions.
You are also invited, if time permits, to send comments on the
documents, and on related issues of Regional Centres, prior to the
meeting.

Please confirm your attendance, or the attendance of an appropriate
representative, at your earliest convenience. Let us know if you would
like to attend by (ISDN) videoconference.

Further details of the meeting will be sent out at the beginning of
April. In the meantime, more information on MONARC may be found on the
Web at http://www.cern.ch/MONARC. The Project Execution Plan (PEP)
includes some detailed information on the aims and work plan of the
project. Other Web sites of possible interest include the RD45
( http://wwwinfo.cern.ch/asd/cernlib/rd45/ ) and GIOD
( http://pcbunn.cithep.caltech.edu ) projects on Object Databases.

MONARC members and "friends" are invited to send me the names of
additional representatives who would like to attend this meeting.

Sincerely,

Harvey Newman
Caltech