Marc W. Mengel
October 17, 1997
Previous work on build systems at Fermilab has led to the conclusion that a better tool for managing builds on multiple platforms would greatly boost productivity in getting products released.
A second issue is the complexity of the build system; happi is a multi-layered system, with each layer written in a different scripting language. uas_build uses a makefile-template system which was intended to free the user from having to know the details of Makefiles, but which led to more confusion due to synchronization issues between the config file and the Makefiles built from it.
Both systems in their original form use symbolic link ``forests'' to provide the illusion of multiple copies of the sources, in an attempt to cut down on copies and divergence between copies. This implementation leads to subtle problems if, for example, symbolic links get added for non-source files. Since the lab as a whole is moving towards CVS as a version control system, CVS seems a much better choice for maintaining multiple source trees for multiple builds.
Another problem with the previous build systems has been authentication. If a build needs to be done on a system separate from the usual build cluster environment (for things like Oracle licenses, use of node-locked software checking tools, etc.), running commands on that system can be difficult. Likewise, if the user needs access to a mix of AFS and non-AFS systems, using rsh often doesn't get the user authenticated.
I propose we use a simple framework to manage this problem which combines the nice user interface of happi with even more flexibility than uas_build. We can use the expect package to manage multiple telnet sessions for us. We have existing expect scripts which handle logging into various sorts of systems, and others which interconnect Tk text windows with an expect session. This makes it possible to build a simple screen interface which logs onto multiple systems and presents each system's login session in its own text window, where typing in that window works like a regular terminal session and sends characters to the remote system.
Buttons can be added to this system which send specific canned commands to all or some of the sessions thus connected, and if this is done with appropriate system specific substitutions, various operations can be launched on those systems that will proceed in parallel. The interaction script can also be taught to look for particular keyword or pattern based strings coming back from the build session and do things like highlight text and note success or failure of given actions. We could add standard messages to our new Makefile templates to facilitate this understanding.
At this point, by integrating existing code fragments, we can (as before) perform specific canned actions on a list of build machines, and keep track of their success/failure. But we can also click on one of the windows and interact directly with the session, and successful completion of various tasks will still be noted, since the same watcher that notes the completion of automated tasks will see success messages when they result from user interaction.
Platform specific initialization can be included in the login portion of the package, simply by checking the platform type when we log in and performing the appropriate actions.
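As a rough illustration of the platform check, the sketch below maps a reported platform name to a list of initialization commands. It is written in Python purely for illustration (the proposal itself builds on expect). The table contents, the function name init_commands_for, and the use of `uname -s` output as the platform key are all assumptions, not part of the existing tools.

```python
# Hypothetical table: platform name (as printed by `uname -s`) -> commands
# to send right after login. The entries are placeholders, not real site setup.
PLATFORM_INIT = {
    "IRIX": ["echo init for IRIX"],
    "SunOS": ["echo init for SunOS"],
    "Linux": ["echo init for Linux"],
}


def init_commands_for(uname_output, table=PLATFORM_INIT):
    """Return the initialization commands for the platform named in uname_output.

    Unknown platforms get an empty list, so login still succeeds and the
    user can initialize the session by hand.
    """
    platform = uname_output.strip()  # drop the trailing CR/LF from the session
    return list(table.get(platform, []))
```

Keeping the table as plain data means the command search path information already recorded in uas_build and happi could be loaded into it rather than hard-coded.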
I propose the project be built in several stages, each of which will be useful in itself, with each stage building on the previous ones.
The initial prototype would simply be capable of logging into multiple systems, presenting the interaction windows for each system, and letting the user enter commands that will execute on all systems simultaneously. It is estimated that this will be 10..20 hours of effort.
A second version would add platform specific initialization (which would combine information already recorded in uas_build and happi about command search paths, etc.) to the system, and flavor replacement in the command execution. That means you could tell the tool to issue a command like:
setup -f %F product vx_y
and the appropriate flavor string would be automatically substituted on each platform. This allows testing of command sequences that will be made into canned actions in the next phase. This should be 5..10 hours of work.
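The substitution step could look something like the following Python sketch. Only the %F placeholder comes from the text above; the host names, the flavor strings, and the function names are illustrative assumptions.

```python
def expand_flavor(command, flavor):
    """Replace every %F placeholder in command with the given flavor string."""
    return command.replace("%F", flavor)


def commands_per_host(template, flavors):
    """Build the concrete command for each host from one command template.

    flavors maps a host name to that host's flavor string.
    """
    return {host: expand_flavor(template, flavor)
            for host, flavor in flavors.items()}


# Hypothetical hosts and flavor strings, for illustration only.
EXAMPLE_FLAVORS = {"hostA": "IRIX+6", "hostB": "Linux+2"}
EXAMPLE_COMMANDS = commands_per_host("setup -f %F product vx_y", EXAMPLE_FLAVORS)
```

Each session would then be sent its own entry from the resulting table, so a single typed command fans out into one platform-appropriate command per system.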
Next, various standard actions could be added as buttons, pull-down menus, etc., exposing the standard make targets in our makefiles (and other common actions) as canned sequences. Generating and testing a short list of these would take only a few hours, but this is an activity that will probably continue throughout the life of the product as choices of what is ``standard'' are extended.
Code could and probably should be added to the event loop dealing with text returning from the remote systems to recognize:
This would allow the addition of a status icon next to each system interaction window which would indicate the state of the activity on the given system. Then a user could see at a glance that a command was still running, or that the command failed, etc. This is another area that could be incrementally improved through the life of the product, after an initial few hours of work to get underway. A status icon indicating part 1, above, should probably be added to a very early version as a proof of concept, and other bits and pieces could be added as needed.
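A minimal sketch of such a watcher follows, in Python for illustration. The marker format (`@@ BEGIN name`, `@@ OK name`, `@@ FAIL name`) is entirely hypothetical; the actual messages would be whatever the new Makefile templates are made to print. The class and state names are likewise assumptions.

```python
import re

# Hypothetical marker lines the Makefile templates might emit.
MARKER = re.compile(r"^@@ (BEGIN|OK|FAIL) (\S+)\s*$")

STATE_FOR_MARKER = {"BEGIN": "running", "OK": "succeeded", "FAIL": "failed"}


class SessionStatus:
    """Tracks the state of one remote session from the text it sends back."""

    def __init__(self):
        self.state = "idle"
        self.action = None

    def feed(self, line):
        """Examine one line of session output and return the current state.

        Lines that are not markers leave the state unchanged, so ordinary
        compiler output and text typed by the user pass through harmlessly.
        """
        match = MARKER.match(line)
        if match:
            kind, self.action = match.groups()
            self.state = STATE_FOR_MARKER[kind]
        return self.state
```

Because the watcher only looks at returned text, it cannot tell whether a marker was triggered by a button or by the user typing the command by hand, which is the property the design relies on.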
The build system should be able to determine, for a given product on a given system, whether the product has been checked out of CVS into its build area, whether a local instance of the product has been declared to the appropriate UPS database, whether the software has passed its regression tests (this should be determined by a standardized stamp file in the product build makefile templates), and whether the product has been distributed.
It should eventually have standard actions to attempt to step through these stages, but to let the user intervene if a stage fails. If the user intervention succeeds, then the automatic process can pick back up where the user left off.
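One way to get this pick-up-where-you-left-off behavior is to ask the system which stages are complete instead of remembering which ones the tool itself ran. The Python sketch below assumes a caller-supplied is_done check and a table of actions; both, along with the stage names, are hypothetical stand-ins for the checks described above.

```python
# Stage names are illustrative labels for the four conditions described above.
STAGES = ["checked_out", "declared", "tested", "distributed"]


def advance(is_done, actions):
    """Run each stage that is not yet done, in order.

    is_done(stage) reports whether a stage is complete on the target system;
    actions[stage]() attempts that stage. Returns the name of the stage that
    failed, or None once every stage is complete. Stages completed by hand
    are skipped on the next call because completion is re-checked each time.
    """
    for stage in STAGES:
        if is_done(stage):
            continue
        actions[stage]()
        if not is_done(stage):
            return stage  # stop here and let the user intervene
    return None
```

After a failure the user fixes the problem in the session window and the same function is simply called again; no separate resume logic is needed.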
The build system should be configurable on a per-system, per-group, and per-user basis. Lists of systems upon which to build, system types, etc. should be obtained from one of several sources so that groups can share pertinent configuration information, but override it when needed. Information that the system needs to maintain includes:
This information should be looked for:
That way, sites using the build system in the ``standard'' way will have very little to override, but individual users can tailor the behavior of the build for their own purposes (for example to have their builds use a file tree under their home directory rather than one in a system-wide location). As the product matures, these pieces of data should be configurable from within the system.
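The override behavior can be sketched with a layered lookup, shown here in Python. The precedence order (user over group over system), the key names, and the paths are assumptions chosen to match the home-directory example above; the actual search locations are not specified here.

```python
from collections import ChainMap


def effective_config(system, group, user):
    """Combine three configuration layers into one lookup table.

    ChainMap searches its mappings left to right, so a key set by the user
    wins over the group's value, which in turn wins over the system default.
    """
    return ChainMap(user, group, system)


# Hypothetical settings, for illustration only.
SYSTEM_CONFIG = {"build_root": "/usr/build", "hosts": ["hostA", "hostB"]}
GROUP_CONFIG = {"hosts": ["hostA", "hostB", "hostC"]}
USER_CONFIG = {"build_root": "/home/someuser/build"}

EXAMPLE = effective_config(SYSTEM_CONFIG, GROUP_CONFIG, USER_CONFIG)
```

A site that sets nothing at the group or user level passes empty tables and gets the system defaults unchanged.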
FB001
A Draft Design for a MultiPlatform Build Manager
The translation was initiated by Marc Mengel on Wed Dec 3 12:05:20 CST 1997