Iowa Caucuses, Boeing 737 MAX, and Apollo 11: software is usually the weakest link

From TechCrunch:

A smartphone app tasked with reporting the results of the Iowa caucus has crashed, delaying the result of the first major count in nominating a Democratic candidate to run for the U.S. presidency.

Previously, I wrote about how a handful of lines of code could have prevented the Boeing 737 MAX’s software from trimming the airliner into a dive-bomber-at-Midway nose-down attitude (see “Boeing 737 MAX crash and the rejection of ridiculous data”, for example).
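
As a rough illustration of what "rejecting ridiculous data" might look like in a handful of lines, here is a minimal Python sketch. It is not Boeing's code or any real avionics logic. The function names, the use of two redundant angle-of-attack readings, and every threshold are hypothetical values chosen only to show the idea: if the inputs are implausible or disagree, refuse to act on them.

```python
import math
from typing import Optional

# All thresholds are illustrative assumptions, not values from any real aircraft.
MAX_PLAUSIBLE_AOA_DEG = 30.0        # hypothetical: larger readings are treated as sensor faults
MAX_SENSOR_DISAGREEMENT_DEG = 5.0   # hypothetical: the two sensors must agree this closely
TRIM_TRIGGER_AOA_DEG = 14.0         # hypothetical: angle above which nose-down trim is considered


def plausible_aoa(left_deg: float, right_deg: float) -> Optional[float]:
    """Return a trusted angle of attack, or None if the data looks ridiculous."""
    for reading in (left_deg, right_deg):
        # Reject NaN/infinite values and physically implausible angles.
        if not math.isfinite(reading) or abs(reading) > MAX_PLAUSIBLE_AOA_DEG:
            return None
    # Reject the pair if the two sensors tell very different stories.
    if abs(left_deg - right_deg) > MAX_SENSOR_DISAGREEMENT_DEG:
        return None
    return (left_deg + right_deg) / 2.0


def should_trim_nose_down(left_deg: float, right_deg: float) -> bool:
    """Allow automatic nose-down trim only when the input data is believable."""
    aoa = plausible_aoa(left_deg, right_deg)
    if aoa is None:
        return False  # ridiculous data: do nothing and leave it to the pilots
    return aoa > TRIM_TRIGGER_AOA_DEG
```

With these made-up limits, `should_trim_nose_down(74.5, 15.0)` returns `False` because one reading is outside the plausible range, while `should_trim_nose_down(16.0, 15.0)` returns `True` because both sensors agree on a high angle.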

I recently visited the Kennedy Space Center visitor center. In the building housing one of the leftover Saturn V rockets there is a compelling “Lunar Theater” presentation explaining that software overloaded the computer system in the Lunar Module during Apollo 11, the first landing on the moon. According to the dramatic retelling, the mission was saved only because the crew hand-flew the spaceship to a successful landing. In other words, all of the civil, mechanical, electrical, and aeronautical engineering challenges were met, but the software failed.

[Update: See comments below for how the software in this case may have been blameless!]

The books for sale at the KSC do not encourage young visitors to become computer programmers…

Maybe it is time to switch to Haskell?

Also, what if the Iowa debacle had happened in some other country? Would U.S. media report it as resulting from a fundamental problem with that country’s culture and educational system? Whereas if it happens here in the U.S. it is just an unfortunate freak event?

Related:

  • Apollo 11: Mission Out of Control (WIRED): “The inside story of how Neil Armstrong and Buzz Aldrin struggled to touch down on the moon, while their guidance computer kept crashing. Again and again.”

21 thoughts on “Iowa Caucuses, Boeing 737 MAX, and Apollo 11: software is usually the weakest link”

  1. That’s why India, the world’s largest democracy, runs mostly trouble-free elections using electronic voting machines specified and designed by the electoral commission. Unlike the insecure commercial EVM, the logic is frozen at the time of manufacture, and they do not use general-purpose microprocessors.

    We could gain a lot by shedding Not Invented Here syndrome and adopting Indian voting machines, Canadian air traffic control, European military tanker aircraft and so on.

    BTW, don’t you know kids in the US and UK dream of being “vloggers” and “youtubers”? It’s only Chinese kids that still dream of becoming astronauts.

  2. While software is often the cause of problems, it’s not true that the Apollo 11 software was at fault during the landing. See:

    https://en.wikipedia.org/wiki/Apollo_11#Lunar_descent

    “To blame the computer for the Apollo 11 problems is like blaming the person who spots a fire and calls the fire department. Actually, the computer was programmed to do more than recognize error conditions. A complete set of recovery programs was incorporated into the software. The software’s action, in this case, was to eliminate lower priority tasks and re-establish the more important ones. The computer, rather than almost forcing an abort, prevented an abort. If the computer hadn’t recognized this problem and taken recovery action, I doubt if Apollo 11 would have been the successful Moon landing it was.”

    Even though the computer was restarting frequently, each restart was nearly instantaneous, and Armstrong was still basically using a fly-by-wire system to land.

    • Is it fair to say that a software system that restarts frequently and freaks everyone out is performing properly? The WIRED article (and how can we doubt anything in WIRED) makes it sound as though programs that shouldn’t have been running at all during the landing were, in fact, running and hogging resources.

  3. Way back when I was a student, I had a part-time job as a programmer on a research project at the university involving many people doing lots of observations. People double-checked each other for every step of the process except for my computer part. When I suggested that maybe my part should also be double-checked by someone else, they were not happy and acted like I was a lazy slacker.

  4. Phil – Arthur is right. Software was the hero of the Apollo 11 landing, not the culprit. The software largely mitigated a hardware design flaw, arguably made worse by human error (the incorrect switch setting).

    • We need to go down to KSC and correct their presentation then! They make it sound as though the software was completely messed up and the only thing that saved the mission was heroic hand-flying.

  5. Phil, WIRED was right, but apparently left out context. There was a hardware failure and a switch misconfiguration that, together, resulted in the computer being sent additional radar data, constantly turning on and off, that it shouldn’t have been receiving at all at that point. The computer didn’t have enough capacity to handle that and the real mission at the same time, so it was programmed to shed less-important tasks, i.e. processing the incoming faulty data. That resulted in the program alarm code that Houston had to look up in the manuals to understand. But the restarts were so fast that the lander’s ability to fly wasn’t affected, which is exactly what was planned.

    NASA, like many of us, likes to emphasize the heroic nature of astronauts’ actions, not just in this case, but in general. Blaming the problem on the computer fits into that narrative, so it gets repeated. But this is one rare case where the software designers had actually anticipated the problem and solved it well, in advance.

    I don’t remember where I read this, but there had been some discussion of leaving such failsafe code out of the Apollo Guidance Computer’s software (after all, it had only 36K words of program memory). The software engineers designing the system insisted on including it, and managed to find a way to do so.

    I only wish my laptop could reboot in milliseconds, not even disturbing what I’m doing.

  6. I’d like to second everyone who has explained that the Apollo 11 Guidance Computer actually saved the mission, not the other way around as KSC and others have attempted to overdramatize. I read about it many years ago and I’m disgusted that KSC isn’t telling it like it was. Of course, it was also fortunate to have Neil Armstrong on hand – his experience during Gemini 8 was undoubtedly an asset that day. When you have to have someone fly a spacecraft, it helps to have a man who…has actually flown a spacecraft.

    https://en.wikipedia.org/wiki/Gemini_8#Emergency

    Cool under pressure. If Armstrong hadn’t gotten control, it was curtains:

    “…the tumble rate had reached 296 degrees per second and Armstrong decided to shut down the OAMS and use the Reentry Control System (RCS) thrusters, located on the Gemini’s nose, to stop the tumble. Scott later praised Armstrong’s actions as their spacecraft spun: ‘The guy was brilliant. He knew the system so well. He found the solution, he activated the solution, under extreme circumstances … it was my lucky day to be flying with him.’”

    Good video here:
    https://www.airspacemag.com/videos/category/new-label/how-neil-armstrong-saved-the-gemini-8-spacec/

    As far as the Iowa Caucuses go, look at it this way: A brand-new smartphone app, improperly vetted and stress tested, being rolled out for a do-or-die debut after a slew of rule changes that even the rule makers only dimly understand, and, oh yeah, it’s all got to work perfectly the very first time because Iowa is such an important sendoff for the rest of the entire primary season. What could go wrong?

    >Also, what if the Iowa debacle had happened in some other country?

    This IS some other country now.

  7. It’s kind of amazing that a lowly computer with its pawmade rope core memory had some kind of task switcher with the capability of recognizing when it was maxed out. Modern programmers are too absorbed with job interview questions & coding contests to imagine such wizardry.

  8. >>> Maybe it is time to switch to Haskell?
    “Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.”

    PhilG – I am sure you would recognize this quote 😉

  9. JFK-VOICE: “We do these things not because they are easy, but because we thought they would be easy!”

    • Although Eisenhower created NASA (building on NACA, from World War I) and rockets were launched from Cape Canaveral starting during the Truman Administration, JFK is ubiquitous at the KSC visitor center! You will hear the moon speech 10 times during a typical visit. You will never hear about the timing with respect to the failed Bay of Pigs invasion.
