Fermi National Accelerator Laboratory

Volume 23  |  Friday, August 25, 2000  |  Number 15

'The Software formerly known as Sherlock ...
... is now called Sleuth.'

by Kurt Riesselmann

Photo caption: Dave Toback and Bruce Knuteson are the people behind Sleuth.

Whether it's your newborn, your latest cooking creation or a physics software package, finding a great name is always a difficult task. You search for something terrific. Perhaps you want to create a wonderful association. Or you may have to focus on avoiding trouble with family, friends -- or the law.

Bruce Knuteson, a UC Berkeley graduate student and member of the DZero collaboration at Fermilab, thought he had found the perfect name for his particle-physics software, an analysis package for quickly searching large amounts of experimental data for new particle signatures. Since the identification of new and unpredictable particles requires detective-like strategies, Knuteson chose a name connected to more than a century of detective work: Sherlock.

The name turned out to be too good. Sherlock is already the name of a commercial software package that finds electronic files rather than particles. Knuteson and Dave Toback, a postdoc at the University of Maryland and co-author of the program, were prohibited from using the name Sherlock in the future.

To remedy the situation, Knuteson and Toback initiated a naming contest within the DZero collaboration. Fellow scientists, who clearly appreciated the detective aspect of the name-stripped software, made appropriate suggestions. Based on those results, the program formerly known as Sherlock is now called Sleuth.

What is Sleuth all about?

Knuteson's and Toback's software creation is designed to sleuth for new particles in collider data using a model-independent approach -- making as few assumptions as possible, and embracing as many possibilities as nature may offer.

"Even if you are looking for nuggets, you need to make sure that you don't miss the diamonds in your gold pan," says Toback.

Sleuth is designed to catch it all.

"Most search strategies test whether experimental data fit a particular model," explains Knuteson. "This approach is reasonable if the number of competing models is small -- contrary to our situation. Given the plethora of new possibilities, we decided to assume as little as possible about what we might discover. We want to let the data speak to us, rather than asking it narrowly-focused questions."

Run II data collection will start in March 2001, and the CDF and DZero detectors will record almost a million proton-antiproton scattering events per day. Physicists expect to identify about one hundred classes of events, with all events in a particular event class identified by a unique number of leptons, electroweak bosons, neutrinos and jets. For each event class, the software package Sleuth assigns two to four variables, which can be calculated from the transverse momenta and missing energy associated with the particles and jets observed in each event.

Sleuth looks at one event class at a time. It uses various other software packages to calculate the Standard Model background in a given class. Then the search for new particles begins.

"New interactions may create what we call signal events, which, we hope, look different from the background," says Toback. "Sleuth contains an algorithm to help distinguish these possible signal events from the background events in a very efficient and model-independent way."

The importance of pobfutoatonoes

To identify signal events in a specific event class, Sleuth transforms the background distribution to every statistician's field of dreams: the unit box.

Instead of several variables ranging from zero to infinity, Sleuth introduces a new set of special variables limited to values between 0 and 1. In addition, Sleuth's transformation spreads the background uniformly across the unit box, making it a flat distribution. A set of signal events would ideally stand out like Mount Everest in the Midwest. Physicists, however, are prepared to look for the appearance of a molehill.
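To make the flattening step concrete, here is a minimal one-variable sketch in Python. It assumes the background is available as a simulated sample and maps each value to the fraction of background values at or below it (the empirical cumulative distribution), a standard way to turn a continuous distribution into a flat one between 0 and 1. The names make_unit_map and background, and the exponential toy spectrum, are illustrative assumptions; this article does not spell out Sleuth's actual multi-variable transformation.

```python
import bisect
import random


def make_unit_map(background_sample):
    """Return a function that maps a value into [0, 1] using the
    empirical cumulative distribution of the background sample."""
    ordered = sorted(background_sample)
    n = len(ordered)

    def to_unit(x):
        # Fraction of background values that are <= x.
        return bisect.bisect_right(ordered, x) / n

    return to_unit


# Hypothetical background: a falling spectrum with mean 50 (arbitrary units).
rng = random.Random(0)
background = [rng.expovariate(1 / 50.0) for _ in range(10000)]

to_unit = make_unit_map(background)
flat = [to_unit(x) for x in background]

# Count transformed background values in 10 equal-width bins of [0, 1].
counts = [0] * 10
for u in flat:
    counts[min(int(u * 10), 9)] += 1
```

After the mapping, each of the ten bins holds roughly the same number of background events, so a localized excess in real data would show up as a bump on an otherwise level floor.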

When Sleuth analyzes the data of an event class as recorded by the particle detectors, it looks for deviations from the flat background. Data taking, of course, is a statistical process and is subject to fluctuations. To test whether a seeming excess of events in one region of data space could be a hint for new physics, Sleuth calculates a quantity that Knuteson refers to as a "pobfutoatonoe:" probability of background fluctuating up to or above the observed number of events. Each pobfutoatonoe value indicates how likely it is that the mismatch between the data recorded and the Standard Model prediction is just a statistical fluke.
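For a single region, a quantity of this kind can be sketched as a Poisson tail probability. The sketch below assumes that event counts follow a Poisson distribution and that the expected background is known exactly; both are simplifying assumptions made here for illustration, and the function name is hypothetical rather than taken from Sleuth.

```python
import math


def prob_background_fluctuates_up(expected_background, observed):
    """P(N >= observed) for N ~ Poisson(expected_background)."""
    if observed <= 0:
        return 1.0
    # Sum P(N = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected_background)  # P(N = 0)
    below = 0.0
    for k in range(observed):
        below += term
        term *= expected_background / (k + 1)  # P(N = k + 1) from P(N = k)
    return max(0.0, 1.0 - below)


# Example: 2 background events expected, 8 observed.
p_large_excess = prob_background_fluctuates_up(2.0, 8)
# Example: 2 background events expected, 3 observed.
p_small_excess = prob_background_fluctuates_up(2.0, 3)
```

The larger the excess over the expected background, the smaller the probability, so p_large_excess comes out well below p_small_excess.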

To calculate a pobfutoatonoe (pronounced pub-phooto-what-to-know), Sleuth needs to divide the unit box of a certain event class into many small subspaces. Knuteson looked a long time before discovering the concept of Voronoi regions, which yields an algorithm that can accomplish this task in a sensible, reproducible and quick way (see figure).

"This is sort of cute," Knuteson smiles. "I was fortunate to stumble across Ken Clarkson's Internet pages at Bell Laboratories in Murray Hill, New Jersey. Clarkson worked in the area of computational geometry and had developed a software package to calculate Voronoi regions in many dimensions. It turned out to be the perfect solution to my problem." Clarkson's software needs only a few minutes to draw all Voronoi regions for 1000 data entries in a two-dimensional unit box.
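The idea behind Voronoi regions is simple to state: each data point owns the part of the box that lies closer to it than to any other data point. The Python sketch below illustrates that definition with a brute-force nearest-point search and estimates each region's area by random sampling. It is a stand-in for illustration only, not Clarkson's package, which computes the exact geometry; the function names and the five example points are assumptions.

```python
import random


def nearest_index(point, sites):
    """Index of the site closest to point. By definition, point lies
    in the Voronoi region of that site."""
    px, py = point
    return min(
        range(len(sites)),
        key=lambda i: (sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2,
    )


def estimate_region_areas(sites, n_samples=20000, seed=1):
    """Estimate the area of each Voronoi region inside the unit box
    by throwing uniformly distributed random points."""
    rng = random.Random(seed)
    hits = [0] * len(sites)
    for _ in range(n_samples):
        sample = (rng.random(), rng.random())
        hits[nearest_index(sample, sites)] += 1
    return [h / n_samples for h in hits]


# Hypothetical data: four points near the corners and one in the center.
sites = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75), (0.5, 0.5)]
areas = estimate_region_areas(sites)
```

Because the transformed background is flat, the background expected in a region is the total expected background multiplied by the region's area, which is why the areas matter.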

Once the unit box is divided into Voronoi regions, Sleuth assembles subspaces composed of many adjacent regions and calculates a pobfutoatonoe for each subspace. By comparing many different subspaces, Sleuth provides a measure for judging how significant a set of seemingly unusual events is, without the bias of model-dependent search strategies. The subspaces with the smallest pobfutoatonoe values are the most likely candidates for physics signatures beyond the Standard Model.
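The scan over subspaces can be sketched in one dimension, where Voronoi regions are intervals bounded by the midpoints between neighboring data points and a subspace is a run of adjacent intervals. The Python sketch below assumes Poisson counting and a flat background, tries every run, and keeps the one with the smallest probability. All names and the toy data are hypothetical, and the sketch applies no correction for the number of subspaces tried, which any real significance estimate would need.

```python
import math


def poisson_tail(expected, observed):
    """P(N >= observed) for N ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)
    below = 0.0
    for k in range(observed):
        below += term
        term *= expected / (k + 1)
    return max(0.0, 1.0 - below)


def voronoi_edges(points):
    """Edges of the 1-D Voronoi regions on [0, 1]: each sorted point
    owns the interval between the midpoints to its neighbors."""
    xs = sorted(points)
    return [0.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [1.0]


def most_interesting_subspace(points, expected_total):
    """Scan all runs of adjacent regions and return
    (probability, low_edge, high_edge) for the smallest probability."""
    edges = voronoi_edges(points)
    n = len(points)
    best = (1.0, 0.0, 1.0)
    for start in range(n):
        for stop in range(start + 1, n + 1):
            low, high = edges[start], edges[stop]
            observed = stop - start  # one data point per region
            expected = expected_total * (high - low)  # flat background
            p = poisson_tail(expected, observed)
            if p < best[0]:
                best = (p, low, high)
    return best


# Toy data: 10 evenly spread points plus a tight cluster near 0.8.
data = [0.05 + 0.1 * i for i in range(10)] + [0.80, 0.805, 0.81, 0.815, 0.82]
p_best, low_edge, high_edge = most_interesting_subspace(data, expected_total=10.0)
```

On this toy data the scan settles on a narrow interval around the cluster, where several events sit in a region that should hold less than one background event.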

"Sleuth cannot identify any particular event to be the signal for a new particle," Toback cautions. "However, it would be able to single out 10 events and indicate that 8 of these events are not in accord with the Standard Model."

No matter what kinds of new particles Knuteson and Toback may discover, they'd better talk to a lawyer before naming them.

Last modified 8/25/2000