When Computers Wake You Up at Night

by Philip Greenspun, working with Zak Kohane, Bill Long, Ken Mandl, and Pete Szolovits at the MEDG Group


In the 1960s and 1970s, corporations and universities were working hard on the problem of building computer systems that could replace human beings. Wages for the average working stiff were rising and robots seemed like the only way to grow the economy without inflation.

It is the 1990s now. Lack of progress in the artificial intelligence labs has discouraged researchers and funding agencies. Moreover, today's economy wouldn't really need the robots. American workers can be hired for almost nothing. Foreign workers can be hired over the Internet for literally nothing. In a good year, a big company might give its CEO a $50 million raise, enough to hire more than 1000 workers. It is hardly worth diverting the focus of management to replace workers, each of whom earns only 1/500th as much as a typical executive. That would be like asking executives to think about new detergents to improve the efficiency of the janitorial staff. Sure, higher efficiencies could be obtained, but it isn't going to significantly affect the bottom line.

So what does a 1990s organization need from computer systems? Better monitoring. Ubiquitous computers and networks mean that more data is available than ever before. Though workers cost bupkes, you can't always hire 1000 of them to sift through a huge data set because it is too difficult to coordinate their work. Also, if the data are dull, they may all fall asleep.

My personal vision of the future contains Joe Haggard, a moderately paid technician sitting at a terminal. His task is to maintain 1000 machines, each of which hardly ever requires any attention. All of these machines are running Web servers so that they can report their state remotely and can be adjusted remotely. Work for Joe consists of waiting for a monitor to beep him, checking a machine, adjusting it if necessary, and then going back to waiting.

As with most people in academia, my vision of the future is the same as the average industry person's vision of five years ago. Joe Haggard = Jane Intensive Care Nurse. Joe Haggard = Charlie Air Traffic Controller. There are already a lot of people working like Joe Haggard. The only reason that there aren't more is that not enough data are available on-line.

What is wrong with today's monitors?

I attended a talk by Chris Tsien here at the MIT Artificial Intelligence Laboratory. She hired undergraduates to sit in the intensive care unit (ICU) for 300 hours and record how many times nurses were alerted by monitors. In a hospital ICU, a "monitor" is a physical box that watches a signal, e.g., electrocardiogram or pulse oximeter, and beeps if it doesn't like the current value of the signal. They are manufactured by companies such as Hewlett-Packard and SpaceLabs.

Chris found that 92% of the alarms were false. Only 8% of the time did an alerted nurse actually need to do something. Her team observed no cases in which something went wrong and a monitor did not beep. ["Poor Prognosis for existing monitors in the intensive care unit", Christine L. Tsien & James C. Fackler, Crit Care Med 1997, 25:4.]

Upon hearing this result, an AI professor said "What a bunch of losers those monitor engineers are!" I thought to myself "Ten-year-old machines never once failed to operate properly in 298 hours and we think we have something to teach the engineers who built them?"

Is there in fact anything wrong with these monitors? Maybe not. If you only have one monitor box hooked up to one patient being treated by one nurse, then definitely not. But the very existence of the monitors has made it easier for hospitals to assign fewer nurses to the ICU. So now one nurse is potentially exposed to alerts from 100 monitors hooked up to 5 patients.

Monitors are signal-centric. There is no attempt to make them patient-centric. For example, the pulse oximeter monitor doesn't look at the ECG signal or the state of the ECG monitor. Would a patient-centric monitor be good? Sure. It could integrate data from multiple sensors and only raise the alarm when a combination of signals suggested trouble. But a nurse-centric monitor would be even better. If there is only one nurse and she is busy resuscitating Patient A, whose heart is stopped, then there isn't much point in alerting her to the fact that Patient B is getting a little too much oxygen. If Patient B's heart should stop as well, then the best thing that a nurse-centric monitor could do would be to call someone over from another part of the hospital.
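
The difference between the three kinds of monitor can be sketched in a few lines. This is only an illustration under stated assumptions: the `Reading` fields, the "two abnormal signals" rule, and the routing strings are all hypothetical, not how any real ICU monitor decides anything.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    signal: str             # e.g. "ecg" or "pulse_ox"
    abnormal: bool          # what a signal-centric box would beep about
    critical: bool = False  # hypothetical flag for life-threatening values


def patient_status(readings):
    """Patient-centric view: combine all of one patient's signals."""
    abnormal = [r for r in readings if r.abnormal]
    if any(r.critical for r in abnormal):
        return "critical"
    if len(abnormal) >= 2:
        return "needs attention"
    # A lone non-critical blip is treated here as a probable false alarm.
    return "ok"


def route(status, nurse_busy_with_emergency):
    """Nurse-centric view: decide who, if anyone, should be interrupted."""
    if status == "ok":
        return "nobody"
    if not nurse_busy_with_emergency:
        return "alert nurse"
    # The only nurse is already resuscitating another patient.
    return "call backup" if status == "critical" else "hold until nurse is free"
```

The signal-centric box corresponds to looking at a single `Reading.abnormal` flag; the other two functions only add context, first about the patient and then about the nurse.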

I know a guy named Keith whose job is to make sure that ZD Net's 30-or-so Web sites are all up and running. It is a beeper job and the beeper goes off when any of the ZD Net URLs becomes unreachable. All of the Web sites are hosted from a handful of machines inside a machine room at BBN Planet. When BBN Planet becomes disconnected from the wider Internet, all 30 sites become simultaneously inaccessible. Keith does a great job of system administration so his servers very seldom suffer internal failures. Thus, almost every time Keith's beeper goes off, it is because of a BBN Planet problem that affects all of his sites. So he knows that the first page will be followed by 30 more over a 15-minute period.
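
One way to spare Keith the 30 follow-up pages is to group failures by what the sites have in common before paging anyone. The sketch below assumes we keep a table mapping each URL to its upstream network; the function name, the `upstream_of` table, and the 50% threshold are hypothetical choices for illustration.

```python
from collections import defaultdict


def pages_to_send(down_urls, upstream_of, threshold=0.5):
    """Collapse many per-URL failures into one page per shared upstream.

    upstream_of maps every monitored URL to the network it sits behind.
    """
    by_upstream = defaultdict(list)
    for url, upstream in upstream_of.items():
        by_upstream[upstream].append(url)

    down = set(down_urls)
    pages = []
    for upstream, urls in sorted(by_upstream.items()):
        failed = [u for u in urls if u in down]
        if not failed:
            continue
        if len(failed) > 1 and len(failed) / len(urls) >= threshold:
            # Most sites behind this upstream failed together: send one page.
            pages.append(
                f"{upstream}: {len(failed)} of {len(urls)} sites unreachable, "
                "probably an upstream outage"
            )
        else:
            pages.extend(f"{u} is unreachable" for u in failed)
    return pages
```

With 30 sites behind one upstream and all of them down, this produces a single page instead of 30; a lone failing site still gets its own page.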

Rethinking Monitors: Management Structure

There is a natural tendency in computer science circles to focus on the "nerd user", someone who can design a monitor and specify it in a formal language, and who then expects to watch the output of the monitor himself or herself. An example of this kind of monitor is a Web server pinger that complains when a URL is unreachable.
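
For concreteness, here is roughly what such a pinger looks like, using only the Python standard library. The function names and the 10-second timeout are assumptions made for this sketch; the `opener` and `check` parameters exist only so the logic can be exercised without a network.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, HTTP error, or bad URL.
        return False


def complaints(urls, check=is_reachable):
    """One complaint line per URL that failed the check."""
    return [f"{url} is unreachable" for url in urls if not check(url)]
```

Note that everything interesting about who reads the complaints, and when, is left out, which is exactly the limitation of the nerd-user view.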

In the Real World, a more typical management structure would be the following:

The world already has great support for big shots. It would be nice to make the tech drone's life a little easier (and after all, the tech drone is one of us). But the suffering loser has been completely neglected until now.

Rethinking Monitors: System Structure

Let's start from the suffering loser's point of view. He has a computer screen. Although the Buddhists will tell you that desire is the root of suffering, my personal experience leads me to point the finger at system administration. To make life as painless as possible for users, we restrict our interface to work with commonly available Web browsers, e.g., Netscape Navigator and Microsoft Internet Explorer. So a Java applet is OK but a helper or plug-in application that the user has to install is not OK.

What is a suffering loser able to do? Pick a set of monitors and ask that they be aggregated together onto one Web page [implementation note: we use HTML framesets, with each monitor getting its own subframe]. The top-level frameset has a few controls for adding or removing monitors. All the rest of the user interface is per-monitor, though we define some standards here so that users won't be faced with complete chaos. Also, because the monitors are generally writing out an HTML page, much of the user interface is being rendered by their Web browser and hence will be inherently uniform.
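
A minimal sketch of generating such a top-level frameset, assuming each monitor is reachable at its own URL and that the controls live in a page called `controls.html` (a hypothetical name, as are the frame names and the 60-pixel control strip):

```python
from html import escape


def frameset_page(monitor_urls, controls_url="controls.html"):
    """Build a frameset with a control strip on top and one row per monitor."""
    # Fixed-height row for the controls, then equal shares for the monitors.
    rows = ",".join(["60"] + ["*"] * len(monitor_urls))
    frames = [f'  <frame src="{escape(controls_url)}" name="controls">']
    for i, url in enumerate(monitor_urls):
        frames.append(f'  <frame src="{escape(url)}" name="monitor{i}">')
    return "\n".join(
        ["<html>", f'<frameset rows="{rows}">', *frames, "</frameset>", "</html>"]
    )
```

Adding or removing a monitor then amounts to regenerating this page with a different list of URLs.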

When a monitor is reporting unhappiness, the suffering loser may want more detail. Rather than develop a user interface to allow the user to adjust the portion of the screen occupied by a particular monitor, we will rely on his experience with his Web browser. Detail and justification links from the monitor's subframe should target new browser windows. The user can resize, iconify, expand, or close these new windows at will, using the same interface that he uses in day-to-day operation of the computer.

A Big Split: Continuous or Periodic Suffering?

If a monitor hasn't gone off for days or weeks, it seems natural for the suffering loser to want to stop staring at the screen. So we need to introduce a big split in the system design to support notification of suffering losers who expect to suffer no more frequently than every few days. They won't be staring at a Netscape browser waiting for something to change. We'll have to send email, page them, or call them. We won't be able to count on them responding, since they might be asleep or traveling, and thus we must have a procedure for escalating notification.
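
An escalation procedure can be as simple as an ordered list of people and channels, each with a deadline for acknowledgment. The sketch below is an assumption about how one might structure it; `Step`, `send`, and `acknowledged` are hypothetical names, and the real delivery mechanisms are passed in as functions.

```python
from dataclasses import dataclass


@dataclass
class Step:
    channel: str       # "email", "pager", or "phone"
    contact: str       # address or number for that channel
    wait_minutes: int  # how long to wait for an acknowledgment


def escalate(steps, send, acknowledged):
    """Try each step in order until someone acknowledges the alert.

    send(channel, contact) delivers the notification.
    acknowledged(wait_minutes) blocks for up to that long and returns
    True if a human responded.
    Returns the step that got a response, or None if nobody answered.
    """
    for step in steps:
        send(step.channel, step.contact)
        if acknowledged(step.wait_minutes):
            return step
    return None
```

A typical chain might be email to the suffering loser, then his pager, then a phone call to his boss; a `None` result means the whole chain was exhausted and somebody needs to rethink the list.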

Monitoring Turns Into Reminding

If we assume that the Suffering Loser is overworked and is beginning to forget to perform tasks, then monitor email alerts eventually blend in with other kinds of email reminders, perhaps even reminders for things in the user's personal life. At that point, we have to think about what the possible kinds of reminders are.

A Big Split (for reminders): Drop-dead or Nag-Nag-Nag

A drop-dead event is one where you have to do something by a certain date. If you don't get to it, you nonetheless don't want to be reminded about it again. Examples of drop-dead events are "wish Susan a happy 25th birthday", "buy a ticket to France by May 5th when the good fares expire", or "return that Nikon camera before your 30 days are up".

A nag-nag-nag event is one where you have to do something eventually. It would be better to do it sooner but if you don't get to it now, you must eventually. Examples of nag-nag-nag events are "get your teeth cleaned", "get an oil change for the car", "give the dog his monthly heartworm pill".

The next split (for reminders): Periodicity

A one-time event is just that, e.g., "buy a ticket to France by May 5th."

A calendar-periodic event is one that happens periodically in time regardless of the user's actions, e.g., "wish Susan a happy birthday" (comes around exactly the same time next year regardless of when or whether you wish her a happy birthday this year).

A user-periodic event is one that needs to happen on a date that is a function of when the user last acted, e.g., the teeth cleaning, oil change, and dog heartworm pill nag-nag-nag examples above.
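
The two splits fit in a small data structure. This is a sketch under assumptions: the field names, the three-day lead time for drop-dead reminders, and the choice to treat calendar-periodic events as yearly are all mine, made only to keep the example short.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Reminder:
    text: str
    due: date
    nag: bool                      # True: nag-nag-nag; False: drop-dead
    periodicity: str = "one-time"  # "one-time", "calendar", or "user"
    interval_days: int = 0         # only used for user-periodic events


def should_send(r, today, lead_days=3):
    """Drop-dead reminders stop after the date; nags keep coming."""
    if r.nag:
        return today >= r.due
    return r.due - timedelta(days=lead_days) <= today <= r.due


def next_due(r, done_on=None):
    """Next due date, or None if the event does not recur."""
    if r.periodicity == "calendar":
        # Same day next year, whatever the user did or did not do.
        try:
            return r.due.replace(year=r.due.year + 1)
        except ValueError:  # February 29 followed by a non-leap year
            return r.due.replace(year=r.due.year + 1, day=28)
    if r.periodicity == "user":
        if done_on is None:
            return r.due  # still outstanding, so the date does not move
        return done_on + timedelta(days=r.interval_days)
    return None
```

Note that `next_due` for a user-periodic event needs to know when the user actually acted, which is why the reminding system has to hear back from the user.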

E-Mail Reminding is a Two-Way Street

After e-mailing the user, what does a reminding system need to get back? Here are some possible responses: Some of these are useful to a system that wants to keep a database of whether reminders/alerts were followed. Some are essential for generating future user-periodic reminders.

Reminding Turns Into Education

I tell you some stuff. Then I tell you more stuff. Sometimes it is the same stuff that I told you before. Does that sound like school?

Paul Pimsleur, a pioneering language instructor, claimed that the key to language education was forcing a student to recall a word at graduated intervals. For example, the student would be asked to repeat a word immediately, then use it in a simple sentence 30 seconds later, then use it 3 minutes later, then be exposed to it again 30 minutes later, then a day later, etc. Pimsleur developed language tapes in which, if the student listened to one tape per day, these intervals were observed.

If one could boil a subject down to various points, it would be easy to present them to students in quick graduated intervals on the Web and then follow up the longer intervals with email.
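
As a sketch, the schedule for a single point might look like this. The intervals are the ones listed above, measured here from the first exposure (an assumption, since they could equally be measured from the previous one); the one-hour cutoff between Web and email is also an assumption.

```python
from datetime import datetime, timedelta

# Graduated intervals from the Pimsleur description above.
INTERVALS = [
    timedelta(0),
    timedelta(seconds=30),
    timedelta(minutes=3),
    timedelta(minutes=30),
    timedelta(days=1),
]


def schedule(start, web_cutoff=timedelta(hours=1)):
    """Return (when, channel) pairs for presenting one point to a student.

    Short intervals happen while the student is still at the browser;
    anything past web_cutoff becomes an email follow-up.
    """
    return [
        (start + gap, "web" if gap < web_cutoff else "email")
        for gap in INTERVALS
    ]
```

Longer intervals would simply be appended to the list and would all fall on the email side of the cutoff.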

We'd need some extra richness in possible responses by students:

If the user says he knows the point already, then presumably we reduce the frequency with which we remind him in the future. Note that this may vary with the user's age. A 10-year-old who understands something might need to be reminded in 6 months but a 50-year-old could go for two years.

If the user doesn't respond, it might be reasonable to assume that he needed the reminder and not require him to give us any feedback. The problem with this is that we don't want to send email into a black hole. What does it mean if we've sent 15 messages and the user hasn't responded? We're communicating perfectly? Or he's gone on vacation and we're filling his mailbox with SPAM that will get lost in the 2000 other messages that greet him upon his return?

If the user says he is confused then a computer-based education system should offer additional documentation on the point of confusion. If there is a human instructor periodically involved with this process then the point should be noted for the student's next meeting with the instructor.
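
The three cases above translate into a small piece of per-point state. Everything named here is hypothetical: the response strings `"knows"` and `"confused"`, the doubling of the interval, and the limit of five unanswered messages are placeholders for whatever policy one actually settles on.

```python
from dataclasses import dataclass


@dataclass
class PointState:
    interval_days: int = 1
    unanswered: int = 0
    paused: bool = False
    flagged_for_instructor: bool = False


def record_response(state, response, max_unanswered=5):
    """Update one point's state; response is "knows", "confused", or None."""
    if response is None:
        # Silence: keep reminding, but not forever into a black hole.
        state.unanswered += 1
        if state.unanswered >= max_unanswered:
            state.paused = True
        return state

    state.unanswered = 0
    state.paused = False
    if response == "knows":
        state.interval_days *= 2  # remind less often from now on
    elif response == "confused":
        state.interval_days = 1   # start over on this point
        state.flagged_for_instructor = True
    return state
```

The age adjustment mentioned above would show up as a different multiplier per user rather than a fixed doubling.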

Education Turns Into Medicine

Could the "instructor" in the above example be a medical doctor? Absolutely. When a patient develops juvenile diabetes, there is a big list of items that the doctor needs to communicate to the patient and be sure that the patient understands. In order to make the best use of scarce visit time, it would be great for the doctor to know on which points a patient was "checked out" and on which he was either confused or still in need of more education.

How Does Email Turn Into Feedback?

My personal favorite method for getting feedback from people to whom my servers send email is to include encoded URLs as plain text. If they are using a modern mail client such as Netscape Navigator, the string will be recognized with a regular expression matcher and turned into a mouseable hyperlink. Something like:
if you're confused, click on the URL below
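
A minimal sketch of how such a URL might be built and decoded on the server side. The host `reminders.example.com`, the `/feedback` path, and the parameter names are hypothetical placeholders, not the URLs my servers actually send.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute your own server.
BASE = "http://reminders.example.com/feedback"


def feedback_url(user_id, reminder_id, response):
    """Encode who is answering, which reminder, and what they said."""
    query = urlencode(
        {"user": user_id, "reminder": reminder_id, "response": response}
    )
    return f"{BASE}?{query}"


def decode_feedback(url):
    """Recover the three fields when the user clicks the link."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


def email_body(user_id, reminder_id):
    """Plain-text body; the mail client turns the URL into a hyperlink."""
    return (
        "if you're confused, click on the URL below\n"
        + feedback_url(user_id, reminder_id, "confused")
        + "\n"
    )
```

Because the URL sits on a line by itself as plain text, it survives mail clients that know nothing about HTML; the user can still paste it into a browser.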

Infrastructure to Operate

This stuff isn't exactly rocket science, but to make it happen without losing your hair you need the following infrastructural items:
  1. computer system connected 24x7 to the Internet with maintenance contract and reliable power
  2. professionally administered relational database management system for recording which reminders are to be sent, which have been sent, and what feedback has been received from users
  3. convenient and high-performance Web/DB integration tools
It sounds trivial but basically there is no group within our lab (MIT LCS) that has all of these items.
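
To give a feel for item 2, here is a sketch of the three tables involved, using SQLite from the Python standard library as a stand-in for a professionally administered RDBMS. The table and column names are assumptions made for this example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE reminders (
    reminder_id INTEGER PRIMARY KEY,
    user_email  TEXT NOT NULL,
    body        TEXT NOT NULL,
    due_date    TEXT NOT NULL
);
CREATE TABLE sent (
    sent_id     INTEGER PRIMARY KEY,
    reminder_id INTEGER NOT NULL REFERENCES reminders(reminder_id),
    sent_at     TEXT NOT NULL
);
CREATE TABLE feedback (
    feedback_id INTEGER PRIMARY KEY,
    sent_id     INTEGER NOT NULL REFERENCES sent(sent_id),
    response    TEXT NOT NULL,
    received_at TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Create the tables in a fresh database and return the connection."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def unanswered(db):
    """IDs of reminders with at least one message that drew no feedback."""
    rows = db.execute(
        """
        SELECT DISTINCT r.reminder_id
        FROM reminders r
        JOIN sent s ON s.reminder_id = r.reminder_id
        LEFT JOIN feedback f ON f.sent_id = s.sent_id
        WHERE f.feedback_id IS NULL
        ORDER BY r.reminder_id
        """
    )
    return [row[0] for row in rows]
```

The `unanswered` query is the kind of question the database exists to answer: which messages went into a black hole.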

Personal Experience

Here are some sites that I personally built and operate...


Reader's Comments

I've been very happy with some free Perl-based monitoring software called "mon", available from: http://www.kernel.org/software/mon/ It can check everything we are interested in from one place, and is very configurable, so it can nag us at intervals, or just tell us when something is down and then back up again. Mon can contact us via e-mail or pager, which is very handy. It's still true that there are a lot of false reports though, hiccups in the system, or services that go down just briefly and come back up on their own. We run mon on a server that runs no mission critical applications so we don't get in a catch-22 of losing the mission critical apps and the monitoring software at the same time. We also wrote a small script that emails once a day to tell us that the monitoring script is still running, so we don't have to worry that mon quit running for some reason.

-mark Summersault website design and hosting

-- Mark Stosberg, January 23, 1999
