Chapter 10: Monitoring Enstore on the Web
There are several installed Enstore systems at Fermilab. Currently these include STKEN for general Fermilab users, CDFEN for CDF RunII, and D0EN for D0 RunII. For each Enstore system, a separate but structurally identical series of web pages is available for monitoring the system and any jobs you’ve submitted to it. The currently implemented websites for Enstore monitoring include:
• http://www-stken.fnal.gov/enstore/enstore_system.html for STKEN
• http://www-cdfen.fnal.gov/enstore/enstore_system.html for CDFEN
• http://www-d0en.fnal.gov/enstore/enstore_system.html for D0EN
We recommend that you bookmark the appropriate one in your browser.
In this section, we briefly describe the format and function of the web pages that are of interest to users, and show you how to navigate them.
The Enstore pages present snapshots of the status of various components of the Enstore system, and the pages are updated and refreshed periodically. The auto-refresh interval varies from page to page and does not necessarily match the information update interval, which also varies from page to page. See the online help screens for more detailed information.
Note for Netscape users: Links on these pages are intended to take you straight to the item of interest, not to the top of the page on which it’s found. Due to a Netscape bug, you’ll find yourself at the top of the target page. To get to the item of interest, place your cursor in the URL area of the browser and hit Enter.
10.1 Top Page
The top page for monitoring an Enstore system is located at http://www-<xyz>en.fnal.gov/enstore/ (where <xyz> is one of stk, cdf or d0), as given above. This page has two sections, each containing links to other pages.
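The URL pattern can be written down in a few lines of Python. This is only an illustrative sketch: the function name `top_page_url` is hypothetical, and the URLs it produces are the three listed at the start of this chapter.

```python
# Enstore systems named in this chapter.
ENSTORE_SYSTEMS = ("STKEN", "CDFEN", "D0EN")


def top_page_url(system):
    """Return the monitoring top-page URL for an Enstore system name."""
    name = system.upper()
    if name not in ENSTORE_SYSTEMS:
        raise ValueError("unknown Enstore system: %r" % system)
    return "http://www-%s.fnal.gov/enstore/enstore_system.html" % name.lower()


for system in ENSTORE_SYSTEMS:
    print(system, top_page_url(system))
```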
10.1.1 Enstore System Status Links
The links under the Enstore System Status heading lead to status web pages for the Enstore system and its servers, shown here for the D0EN system:

The pages to which these links point (with the exceptions of Quota and Usage and Production System’s Overall Status) share a header format, described in section 10.2 Header Format for Status Pages.
Under the Information header are links for finding help, documentation, and so on.

10.2 Header Format for Status Pages
Here we see the header for the Mass Storage Status At-A-Glance page (from the link “Enstore System Summary” on the top page):

Header elements:
• In the upper-right corner you’ll find the page title (in this case, Mass Storage Status At-A-Glance).
• Underneath the page title is the name of the Enstore server that created this web page (e.g., Enstore), and the date and time that the current page was created. If the time shown here is more than a few minutes earlier than the current time, you should refresh your browser window to get updated information.
• The buttons in the upper-left corner are quick links to different pages:
Home
the top page, described in section 10.1 Top Page
System
the Status At-A-Glance page, described in section 10.3 Mass Storage Status-At-A-Glance Page (the page associated with the “Enstore System Summary” link on the top page; it is the page shown in the above image)
Servers
the Enstore Server Status page, described in section 10.4 Enstore Server Status (the page associated with the “Enstore Server Status” link on the top page)
Encp
the Encp History page, described in section 10.10 Encp History (the page associated with the “encp History” link on the top page)
Help
page-specific online help
• Underneath these buttons you’ll find the Enstore system identifier; in the above image, it is D0EN: Enstore for the D0/RunII AML/2.
10.3 Mass Storage Status-At-A-Glance Page
What? |
The Status-At-A-Glance page presents summarized information indicating which parts of the Enstore system are up and working, which parts have problems, which have a scheduled outage, and other system information. It also provides a mapping between Enstore servers and the nodes that run them. |
Why? |
Start at this page when investigating any possible problem. This page indicates which if any components of your Enstore system are experiencing problems. |
How? |
To arrive at this page, start at the top page and click “Enstore System Summary”, or click the System button on any of the pages. |
The Enstore components and servers listed on this page are described in Chapter 8: Overview of the Enstore Servers.
Page Description
The page is divided into three sections. They list systems and servers, and code them with colored ball icons to indicate their status. The Help button at the top of the page (see section 10.2 Header Format for Status Pages) describes the status icons.

Enstore Overall Status summarizes the status (from left to right) of Enstore as a whole, the tape robots, the network, and alarm components. There is only one link:
The “alarms” link takes you to the Enstore Active Alarms page described in section 10.12 Enstore Active Alarms.
Enstore Individual Server Status lists all servers (Chapter 8), library managers (section 8.3), movers (section 8.4), and media changers (section 8.5), each with its own status indicator. Each link in this section takes you to its corresponding server entry on the page described in section 10.4 Enstore Server Status.
The third section lists the nodes in the Enstore system and the servers that run on each of them (below we show the first few rows of the table). It carries no status information.

There is a legend at the bottom of the page for the status icons which looks like this:

10.4 Enstore Server Status
What? |
As the page title implies, the Enstore Server Status page provides the status of all the Enstore servers included in your system. This includes movers, library managers, and so on. The servers are described in Chapter 8: Overview of the Enstore Servers. |
Why? |
Use this page to find out what a particular server is currently doing, and what work it has pending. |
How? |
To arrive at this page, start at the top page and click “Enstore Server Status”, or click the Servers button on any of the pages. |
Page Description
The page is divided into two sections.

The first section, Shortcuts, is simply a compilation of links that point to anchors in the table that makes up the second section. There is also a link labelled “Full File List” which takes you to the Active File List page, described in section 10.5 Active File List (useful for tracking down the library manager associated with a file when you know only the file’s name).
The second (and main) section is a status table which lists all the servers, and displays the server name, status, host, date/time, and last time alive for each. Some server names and status information in this table have links to pages with more information. We define the statuses below by server type.
This page is updated and refreshed periodically.
• The link on a library manager (LM) name points to the Library Manager Queues page for the corresponding LM (see section 10.6 Library Manager Queues).
• The link “Full Queue Elements” points to the Full Library Manager Info page (see section 10.7 Full Library Manager Info).
• The link on a volume name takes you to a text page with the volume’s inventory information (see section 10.8 Tape Inventory Page (Text)).
• The link on a mover name points to the Movers page (see section 10.9 Movers Page).

Statuses for library managers (LM) include:
alive : unlocked
LM is working normally
alive : locked
LM is rejecting new encp requests, but continues to assign jobs already in the pending queue to movers; encp does not retry
alive : nowrite
LM is locked for write requests
alive : noread
LM is locked for read requests
alive : ignore
LM is ignoring new encp requests (returning “ok” to encp), but continues to assign jobs already in the pending queue to movers; encp retries internally so the user is unaware of the delay
alive : pause
LM is ignoring new encp requests, and holding pending jobs
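The library manager states above can be summarized as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that “locked for write requests” means new writes are rejected while new reads are still accepted (and the reverse for noread), which the descriptions above imply but do not state outright.

```python
def parse_lm_status(status):
    """Split a status such as 'alive : locked' into ('alive', 'locked')."""
    liveness, _, state = status.partition(":")
    return liveness.strip(), state.strip()


def new_request_outcome(status, op):
    """Return 'accepted', 'rejected' or 'ignored' for a new encp request.

    op is 'read' or 'write'.
    """
    if op not in ("read", "write"):
        raise ValueError("op must be 'read' or 'write'")
    liveness, state = parse_lm_status(status)
    if liveness != "alive":
        raise ValueError("library manager is not alive: %r" % status)
    if state == "unlocked":
        return "accepted"
    if state == "locked":
        return "rejected"  # encp does not retry
    if state == "nowrite":
        # Assumption: reads are unaffected by a write lock.
        return "rejected" if op == "write" else "accepted"
    if state == "noread":
        # Assumption: writes are unaffected by a read lock.
        return "rejected" if op == "read" else "accepted"
    if state in ("ignore", "pause"):
        return "ignored"  # with "ignore", encp retries internally
    raise ValueError("unknown library manager state: %r" % state)


print(new_request_outcome("alive : nowrite", "read"))
```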
The link on a mover name points to the corresponding mover (MV) on the Movers page (described in section 10.9 Movers Page).

Statuses for movers include:
alive : IDLE
MV is idle because there are no jobs to process
alive : SETUP
MV is in the initial phase of a job; it is setting up a connection with encp for a transfer
alive : busy mounting volume <volname>
MV is waiting for the media changer to finish mounting a tape; the volume name is given
alive : SEEK
a tape is mounted, and the correct read or write location on the tape is being located
alive : busy reading/writing <n> bytes from/to Enstore
MV is reading data from Enstore, or writing data to Enstore; the number of bytes read or written so far is given
alive : busy dismounting volume <volname>
MV is waiting for the media changer to finish dismounting a volume; the volume name is given
alive : HAVE BOUND volume - IDLE
MV has completed a job but is waiting for a subsequent job for the same tape; the tape is still in the drive
alive : DRAINING
MV is completing its last job before going offline; it will not accept more jobs
alive : CLEANING
a cleaning tape is in the drive; MV cannot accept more jobs until the cleaning has finished
alive : OFFLINE
MV is offline and not accepting jobs (MV name displayed in orange)
alive : FINISH_WRITE
MV has finished writing and is waiting for file and volume metadata to be created
alive : ERROR <text>
MV is in an error state described by the text, and cannot accept more jobs (MV name displayed in orange)
Statuses for all the servers, including movers and library managers:
alive
server is working normally
timed out
the inquisitor hasn’t received the latest “I’m alive” message from the server
dead
duration of “timed out” status on server has exceeded configured limit (server name appears in orange)
not monitoring
server is known to the Enstore system, but is not currently being monitored by the inquisitor (server name is displayed in gray)
cant update status
similar to “timed out”, but means that when the inquisitor asked the LM or mover for more information, it didn’t get a response
10.5 Active File List
What? |
The Active File List page lists the data files being actively worked on by your Enstore system. The files are listed by user node. |
Why? |
Use this page to find your file. This is the right starting page for checking on your job if you only know the name of the file you’re reading or writing (i.e., you don’t know the volume or any Enstore server information). This page has links to pages containing more job-related information. |
How? |
To arrive at this page, start at the top page and click “Enstore Server Status”. Then under Shortcuts, click “Full File List”. |
Page Description
The files are listed by their full path and name. For each file, the node from/to which it is being read/written is also given. This page doesn’t distinguish between read and write.

Scroll as necessary or do a search to find your file in the list and click on it. This will take you to the Library Manager Queues page for the library manager servicing your job; see section 10.6 Library Manager Queues.
10.6 Library Manager Queues
What? |
The Library Manager Queues page lists the encp jobs that a selected library manager is currently managing or has pending in a queue (encp is described in Chapter 6: Copying Files with Encp). Movers in states other than busy or IDLE are listed at the bottom of the page (mover statuses are described in section 10.4 Enstore Server Status). |
Why? |
Use this page to find out the status of a particular library manager’s read and/or write queue(s) once you know which LM is servicing your job. You can find the status of your job, and its priority relative to other jobs in the queue. From this page you can click links to get full details on the processing of your file and on the volume associated with your file. |
How? |
To arrive at this page, follow this string of links starting at the top page. Click “Enstore Server Status”, then: • If you know the filename but not the library manager, then under Shortcuts, click “Full File List”. Click on your file of interest. This will take you to the Library Manager Queues page for the appropriate library manager. • If you know the library manager, you can click directly on the link in the second section of the Enstore Server Status page, instead. |
If any suspect volumes are associated with an LM, this information is displayed on the individual library manager status page.
For the Reads, files are listed by volume. For each volume in use, the page lists the mover servicing it. Each encp job is listed on a separate line. The line lists the host to which the file is to be copied, the last 70 or so bytes of the filename (filenames can get quite long), the file’s current priority in the queue, and the file’s position on the tape.

To get full information on the library manager’s processing, click Full Queue Elements next to Reads to arrive at the Full Library Manager Info page (described in section 10.7 Full Library Manager Info). To get full information on a volume being read, click the volume id at the top-left of the queue containing your file. This takes you to the text inventory page for that volume (described in section 10.8 Tape Inventory Page (Text)).
Under Writes, this page lists write jobs by file family. A mover is listed after the file family for a file only if the file is currently being worked on.

Each file in the write queue appears on a separate line. Each line lists several pieces of information: the host from which the file is to be copied, the last several bytes of the filename (filenames can get quite long), and the current priority and file family width.
The mover name provides a link to the Movers page, described in section 10.9 Movers Page.
Job Processing and File Family Width
Normally, the number of WRITE jobs running per file family can equal but not exceed the file family width (see section 2.2.2 File Family Width). If a READ job is running on a tape that is not marked full, this also counts against the width.
But note: the width is checked only when assigning WRITE jobs. Even if the number of current jobs equals the width, a new READ job can therefore still start on a tape that’s not full, so the width may temporarily be exceeded. (If the tape is marked full, the READ job does not count against the width and can start anyway.) Any pending WRITE job must wait until the number of jobs that count against the width drops below the width value.
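The width rule can be sketched in Python as follows. This is a simplified model written for illustration, not Enstore’s actual scheduling code; the class and function names are hypothetical, all jobs are assumed to belong to the same file family, and no constraint other than the width is modeled.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    kind: str                # "READ" or "WRITE"
    tape_full: bool = False  # whether the tape in use is marked full


def counts_against_width(job):
    """Every WRITE counts; a READ counts only if its tape is not marked full."""
    if job.kind == "WRITE":
        return True
    return not job.tape_full


def width_allows_start(new_kind, running, width):
    """Say whether the file family width lets a new job start.

    The width is checked only when assigning WRITE jobs, so a READ is
    never held back by it (which is how the width can be exceeded).
    """
    if new_kind == "READ":
        return True
    used = sum(1 for job in running if counts_against_width(job))
    return used < width


running = [RunningJob("WRITE"), RunningJob("READ", tape_full=False)]
print(width_allows_start("WRITE", running, width=2))
print(width_allows_start("READ", running, width=2))
```

With a width of 2, the two running jobs already use up the width, so a new WRITE must wait while a new READ may still start.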
10.6.4 Additional Movers
Underneath this information, there may be a table listing additional movers.

Movers that are in any of the following states are listed here (see section 10.4 Enstore Server Status for status descriptions):
• CLEANING
• DISMOUNT_WAIT
• ERROR
• HAVE_BOUND
• OFFLINE
Movers not listed anywhere on the page may be assumed to be IDLE, i.e., waiting for a job.
10.7 Full Library Manager Info
What? |
The Full Library Manager Info page displays the job parameters for each file in a given library manager’s current READ and WRITE queues (e.g., local file name, local node, file family, volume ID, priority, etc.). |
Why? |
Use this page to find the status of a READ or WRITE job, e.g., the file’s position in the queue, how long it’s been in a queue, when it was “dequeued” (i.e., when processing started), and other details about how Enstore is processing it. |
How? |
To arrive at this page, follow this string of links starting at the top page. Click “Enstore Server Status”, find the library manager you want, and click “Full Queue Elements”. If you don’t know which LM you want but you know the file, take this route. On the Enstore Server Status page under Shortcuts, click “Full File List”. On the Active File List page, click on your file of interest. This will take you to the Library Manager Queues page for the appropriate library manager. Here, click “Full Queue Elements” next to Reads to come to the Full Library Manager Info page. On this page, you can scroll down and locate your file. |
Page Description

Your file will appear as one of two types of entries on this page: one type for files being worked on, and another for files pending in the queue.

For those files being worked on, the mover name is given, and it provides a link to the Movers page, described in section 10.9 Movers Page.
10.8 Tape Inventory Page (Text)
There are a couple of pages that present volume inventory information. One is straight text, discussed in this section. The other page is dynamically generated HTML; see section 10.16 Tape Inventory Page (Dynamic HTML) . The formats of both pages are similar.
What? |
For each volume declared to your Enstore system, there is a page that presents volume inventory information in straight text format. The page gets updated periodically; be aware that it may not reflect the most recent information. |
Why? |
Use this page to find out details of the storage of your file(s) on a volume, to see how full a tape is, or to check the inhibits. |
How? |
If you only know the filename: To arrive at the text web page, follow this string of links starting at the top page: click “Enstore Server Status”; then under Shortcuts, click “Full File List”. Click on your file of interest. This will take you to the Library Manager Queues page for the appropriate library manager. Find your file, and click the corresponding volume ID to come to the inventory page for that volume. If you know the volume name: To arrive at the text web page, follow this string of links starting at the top page: click “Enstore Server Status”; then look for the volume name listed with one of the active library managers, and click on the volume. |
Page Description
At the top of the volume inventory (text) page, you’ll find the volume ID, the last accessed date, the number of bytes free, the number of bytes written, and the inhibits (described below).
The volume inventory contains a line for each file on the volume, listed in location order. In addition to the tape label, this page lists the bfid, size, location_cookie, delflag, and original_name (the name given in the encp command used to write it). Scroll down to the bottom of the page to find information for the tape volume itself.

Inhibits
The inhibits are listed on the page in the format system_inhibit[0] - system_inhibit[1].
system_inhibit[0] can take any of the following values:
none
the normal state (no inhibits)
READONLY
volume is read-only
DELETED
volume has been deleted, but admins can still restore the metadata if the volume has not been reused
NOACCESS
no access allowed (set by system to prevent further access to volume on which it found an error; once the problem is resolved, operator must clear the NOACCESS state)
NOTALLOW
no access allowed (set manually by the operator to prevent access to volume)
system_inhibit[1] can take any of the following values:
none
the normal state (no inhibits)
full
volume is full
migrated
files have been migrated to another tape
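As an illustration, the inhibit pair can be split and checked with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and the exact spacing around the hyphen on the page is not guaranteed, so the parser tolerates any whitespace.

```python
# Values documented above for each half of the inhibit pair.
SYSTEM_INHIBIT_0 = {
    "none": "normal state (no inhibits)",
    "READONLY": "volume is read-only",
    "DELETED": "volume has been deleted",
    "NOACCESS": "no access allowed (set by the system after an error)",
    "NOTALLOW": "no access allowed (set manually by the operator)",
}
SYSTEM_INHIBIT_1 = {
    "none": "normal state (no inhibits)",
    "full": "volume is full",
    "migrated": "files have been migrated to another tape",
}


def parse_inhibits(text):
    """Split 'system_inhibit[0] - system_inhibit[1]' into a validated pair."""
    first, sep, second = text.partition("-")
    first, second = first.strip(), second.strip()
    if not sep or first not in SYSTEM_INHIBIT_0 or second not in SYSTEM_INHIBIT_1:
        raise ValueError("unrecognized inhibits: %r" % text)
    return first, second


def access_blocked(text):
    """True if the volume is in one of the two 'no access allowed' states."""
    first, _ = parse_inhibits(text)
    return first in ("NOACCESS", "NOTALLOW")


first, second = parse_inhibits("READONLY - full")
print(SYSTEM_INHIBIT_0[first], "/", SYSTEM_INHIBIT_1[second])
```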
10.9 Movers Page
What? |
The Movers page displays the current status of all the movers. (The mover statuses are described in section 10.4 Enstore Server Status.) |
Why? |
Use this page to see how far into a job a mover is, or to check other job details related to the mover, e.g., what volume is being used for your job. |
How? |
There are several paths to arrive at this page. The two easiest and most common are: • On the top page click “Enstore Server Status”. Click on a mover. • On the top page click “Enstore Server Status”. Choose a library manager to get to the Library Manager Queues page, then click on a mover. |
When you click on a specific mover, you are brought to the entry for that mover on the Movers page.
Page Description
This web page shows the most recent known state of all of the movers in the Enstore system. The first image (below) shows the field headings. The online help page provides a detailed description of the fields.

This next image shows movers that are busy mounting, seeking and writing tapes. The pnfs and user filenames are given as appropriate:

This image shows movers that are idle (awaiting a job), and busy reading a tape:

Understanding the Number of Bytes Read/Written
For a READ job,
“Last/Current Read (bytes)” means “bytes read from tape”
“Last/Current Write (bytes)” means “bytes written to user’s file”
whereas for a WRITE job,
“Last/Current Read (bytes)” means “bytes read from user’s file”
“Last/Current Write (bytes)” means “bytes written to tape”
For jobs in progress, the “Current Read (bytes)” value is by necessity higher than the “Current Write (bytes)” value.
For finished jobs (e.g., of status IDLE or busy dismounting volume), you can compare “Last Read (bytes)” to “Last Write (bytes)” to tell if a job was a READ or WRITE. The file size is always bigger on tape than on the user’s disk because the file family wrapper is on the tape copy only. So for example, on a READ job, Enstore reads a larger file from tape and writes a smaller one to disk, and thus the “Last Read (bytes)” value is larger than “Last Write (bytes)” (as shown in image below). The converse is true for a WRITE job.
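The comparison for finished jobs amounts to the following Python sketch. The function name is hypothetical, and the byte counts in the example are made up for illustration.

```python
def classify_finished_job(last_read_bytes, last_write_bytes):
    """Guess whether a finished mover job was a READ or a WRITE.

    The tape copy carries the file family wrapper, so it is the larger
    of the two counts: a READ reads more (from tape) than it writes,
    and a WRITE writes more (to tape) than it reads.
    """
    if last_read_bytes > last_write_bytes:
        return "READ"
    if last_read_bytes < last_write_bytes:
        return "WRITE"
    return None  # equal counts: cannot tell from these two values alone


# Made-up byte counts, for illustration only.
print(classify_finished_job(1049600, 1048576))
print(classify_finished_job(1048576, 1049600))
```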

10.10 Encp History
What? |
This page lists the last several encp transfers that have completed, either successfully or with an error. |
Why? |
Use this page to review recent encp transfers. |
How? |
To arrive at the Encp History page, click “Encp History” on the top page or the Encp button at the top of any page. |
Page Description
On the Encp History page:
Successful transfers show time that transfer completed, node, username and storage group, mover interface (the TCP/IP interface used on the mover node), bytes transferred, volume ID, and rates for network, transfer, drive, disk and overall (all in Mb/s). See section 6.5.3 Encp Transfer Rates Defined.
Unsuccessful transfers show time of attempted transfer, node, username, storage group and error summary. Each error summary contains a link to a more detailed error message.
The top portion of the page is a table listing details of each recent transfer:
The value under Bytes provides a link to the Files Transferred area of the page which gives you the originating and destination file names:

At the very bottom of the page, you can find the errors in red, if any:

10.11 Enstore Configuration
What? |
The Enstore Configuration page shows the Enstore system’s current configuration. |
Why? |
This page is for administrators. But if you want configuration information on any component in the Enstore system, you can look here. For example: • If you see a server listed as unmonitored (in grey) on the Enstore Server Status page, you can verify its status here (if the element inq_ignore appears, the server is unmonitored). • If you want to check the log files for activity related to a particular mover, look here for the logname value associated with the mover, then search the log files for that string. |
How? |
To arrive at the Enstore Configuration page, click “Configuration” on the top page. |
Page Description
The page is divided into two sections:
• The first section provides a quick link to each of the servers listed in the table in the second section.
• The second section is a (potentially quite long) table containing detailed configuration information for all of the Enstore servers. For each server, there is a table row for each element that appears in the server’s configuration. The information displayed includes the element name and its current value. No interpretation of the values is done, so for instance if the value is a Python dictionary, then it is presented here as such. The server names, and under them the element names, are organized alphabetically.
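If the configuration table is pictured as a Python dictionary keyed by server name, the two checks mentioned for this page (is a server unmonitored, and what is its logname) look roughly like this. The server names, the `host` element and all values below are invented for illustration; only the element names `inq_ignore` and `logname` come from this chapter, and the layout of the dictionary is itself an assumption.

```python
# Hypothetical excerpt of a configuration; names and values are made up.
sample_config = {
    "example1.mover": {"host": "node01", "logname": "EX1MV"},
    "example2.mover": {"host": "node02", "logname": "EX2MV", "inq_ignore": 1},
}


def is_unmonitored(config, server):
    """A server is unmonitored if the element inq_ignore appears for it."""
    return "inq_ignore" in config[server]


def logname_for(config, server):
    """Return the string to search for in the log files, if configured."""
    return config[server].get("logname")


for server in sorted(sample_config):
    print(server, is_unmonitored(sample_config, server), logname_for(sample_config, server))
```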
This image shows the top of the table in the second section on the Enstore Configuration page, including the (truncated) entry for one of the system’s movers:

10.12 Enstore Active Alarms
What? |
The Enstore Active Alarms page lists the alarms that have been raised but not yet resolved. |
Why? |
This page is for administrators, but as a user, you can always look here for information when there is a problem with Enstore. In particular, if a volume is set to NOACCESS, you can look here to find out which mover was involved. |
How? |
To arrive at the Enstore Active Alarms page, click “Alarms” on the top page. |
The page is quite wide; we show first the left side, then the right.


10.13 Enstore Log Files
What? |
The Enstore Log Files page provides links to Enstore system-specific user log files and to standard Enstore daily log files. You can search log files or retrieve entire log files. |
Why? |
This page is for administrators. You can use the log files to retrace Enstore activity, to understand Enstore problems or behavior, and so on. |
How? |
To arrive at the Enstore Log Files page, click “Log Files” on the top page. |
There are three sections to the Enstore Log Files page.
Link to Search Page
First there is a link to a search page; look for “Enstore log files may also be searched”. Use the Help button for information on constructing your search string. (Shown in the image below.)
User Specified Log Files
The next part is entitled User Specified Log Files. It lists miscellaneous log files configured and maintained for your Enstore system.

Any given Enstore installation may contain some or all of these log files:
FAILED Transfers
Lists all encp jobs that failed; lists by volume and by mover
Recent (robot) log messages
Displays all the messages from the robot (for the most recent few days)
Active Monitor Log
Displays the data transfer rate between the base node and all other nodes in the same Enstore system, including movers
Cambot (D0)
Displays a live image photographed by a camera mounted inside the D0 ADIC robot
Enstore Node Information
Displays information on all nodes belonging to this Enstore system
Network-At-A-Glance
Displays network interface status of all nodes relative to base node; uses colored icons for easy identification of problems
PNFS Export List
Lists all the existing PNFS areas for the Enstore system (when PNFS is mounted, these are the areas that get NFS-mounted)
Enstore Log Files
The bottom portion of the page is called Enstore Log Files. It displays a calendar of the current (and possibly the previous) month from which you can click the date of the log file to view (the image below was captured on January 2, 2004, the last date that shows an entry). The size of the log file is given.
Requesting a day’s log file is memory intensive and very slow due to the large size of the log file.

10.14 Quota and Usage
What? |
The Quota and Usage page provides information on your Enstore volume usage, organized by library and by storage group. The page is not real-time; it displays a recent snapshot. |
Why? |
Administrators and users can look here to see a variety of details about your Enstore system’s resource and quota management. |
How? |
To arrive at the Quota and Usage page, click “Quota and Usage” on the top page. |
This page displays the following fields:
Library
library manager
Storage Group
storage group
Req. Alloc.
requested volume (e.g., tape) allocation
Auth. Alloc.
authorized volume allocation
Quota
total space allowed in robot
Allocated
total number of volumes currently allocated in the robot
Blank Vols
of the allocated volumes, the number that are blank
Used Vols
of the allocated volumes, the number that are written
Deleted Vols
of the allocated volumes, the number that have been deleted
Space Used
total space used on all allocated volumes
Active Files
total number of active (non-deleted) files on all allocated volumes
Deleted Files
total number of deleted files on all allocated volumes
Unknown Files
10.15 Enstore Plots
What? |
The Enstore Plots page provides information on Enstore performance in a visual format. These are not real-time; they are snapshots. |
Why? |
You can look here to see a variety of details about your Enstore system’s recent performance. |
How? |
To arrive at the Enstore Plots page, click “Plots” on the top page. |
The Enstore Plots page provides information on some statistics that Enstore gathers from the log files it produces. Several plots are available:
• Drive Utilization
• Bytes/Day Plot
• Bytes/Day per Mover Plot
• Mount Latency Plot
• Mounts/Day per Drive Type
• Storage group activity (STKEN only)
• Total bytes/Day
• Total bytes/Day combining all three systems (D0EN, CDFEN, STKEN)
• Total bytes written/Day
• Cumulative Mounts Plot
• Transfer Activity (log) Plot
• Transfer Activity Plot
• Mounts/Day Plot
• Null Terabytes/Day (Instantaneous Rate Plot)
• Real Terabytes/Day (Instantaneous Rate Plot)

All the plots are described in the online help page. Each plot is available for viewing three ways:
• a small version of the plot (postage stamp) displayed directly on the page
• a full size version of the plot; click on the postage stamp to display
• a PostScript copy of the plot; click on the (postscript) link to display
10.16 Tape Inventory Page (Dynamic HTML)
What? |
For each volume declared to your Enstore system you can dynamically (re)create a page that presents its volume inventory information in HTML format. |
Why? |
Use this page to find out details of the storage of your file(s) on a volume, to see how full a tape is, or to check the inhibits. |
How? |
To arrive at this page start at the top page, scroll down to the Information table, and click “Tape Inventory”. This brings you to a list of all declared volumes in your Enstore system. Click on the volume of interest. |
This page is dynamic and uses a lot of server time; please minimize the number of times you regenerate this page.

Find the volume of interest and click it to get a file listing. The format is essentially identical to that shown in section 10.8 Tape Inventory Page (Text), but in addition under the heading, you get a list of volume parameters:
{'blocksize': 131072,
'capacity_bytes': 107374182400L,
'comment': '',
'declared': 'Thu Jul 25 15:27:27 2002',
'eod_cookie': 'none',
'external_label': 'IFA001L1',
'first_access': 'Wed Oct 23 12:42:43 2002',
'last_access': 'Fri Nov 8 10:43:08 2002',
'library': 'samlto',
'media_type': '3480',
'remaining_bytes': 107374182400L,
'si_time': ('Fri Dec 6 10:22:41 2002', 'Wed Dec 31 18:00:00 1969'),
'sum_mounts': 2,
'sum_rd_access': 0,
'sum_rd_err': 0,
'sum_wr_access': 0,
'sum_wr_err': 0,
'system_inhibit': ['none', 'none'],
'user_inhibit': ['none', 'none'],
'volume_family': 'd0backup.d0backup_30.cpio_odc',
'wrapper': 'cpio_odc'}
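The listing above is the printed form of a Python dictionary, so it can be read back into Python if you copy it from the page. The sketch below is a hedged example, not part of Enstore: the function names are hypothetical, and because the trailing `L` on the byte counts is Python 2 long-integer notation that Python 3 rejects, a simple pattern (a heuristic, which could misfire on unusual string values) strips it before parsing.

```python
import ast
import re

# Matches a bare integer literal followed by the Python 2 long suffix "L",
# but not digits inside a quoted label such as 'IFA001L1'.
_LONG_SUFFIX = re.compile(r"(?<![\w'])(\d+)L(?![\w'])")


def parse_volume_parameters(text):
    """Turn the printed volume parameters into a Python 3 dictionary."""
    return ast.literal_eval(_LONG_SUFFIX.sub(r"\1", text))


def fraction_used(volume):
    """Fraction of the volume's capacity that is no longer free."""
    capacity = volume["capacity_bytes"]
    return (capacity - volume["remaining_bytes"]) / capacity


# A shortened copy of the listing above.
sample = """{'capacity_bytes': 107374182400L,
 'external_label': 'IFA001L1',
 'remaining_bytes': 107374182400L,
 'system_inhibit': ['none', 'none']}"""

volume = parse_volume_parameters(sample)
print(volume["external_label"], fraction_used(volume))
```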