
Overview

of the results in Architecture and Implementation of Online Communities by Philip Greenspun

In 1993 I set out to find the most useful and most efficient ways to harness the power of the Internet. I quickly concluded that (1) the Internet's greatest strength was connecting people separated in space and time, and (2) the most pressing and difficult need in society was for better tools to support education outside the traditional classroom. Combining these conclusions, I focused on the challenge of building high-quality software and Web services for online community. Many of the results presented in this thesis are applicable to simpler Web applications, such as ecommerce, but the heart of the thesis is attacking the problem of producing sustainable online communities.

The methods that I used in my research are fairly traditional in the engineering world: I identified a problem, conceived a solution, implemented a system to test the solution, and wrote about what worked and what didn't. The presentation in the chapters that follow is non-traditional for a thesis in that it assumes little background in the subject area. This is to ensure that the ideas and lessons are absorbed by as many people as possible. Someone who has spent 25 years working with relational database management systems won't find Chapter 12 very enlightening, but the chapter is there so that an intelligent, thoughtful person doesn't get stuck because of unfamiliarity with particular technologies.

One aspect of my research is traditional in engineering but not in computer science: I continuously measured my ideas, architecture, and programs against the needs of users and administrators. Scientists measure their results against Nature. Engineers in industry measure their results against customers and competitors. Too often academic computer scientists don't measure their results against anything.

Does that mean that I'm down on academic computer science? No. In fact, this overview is meant to assist the academic reader in figuring out which portions of the thesis are likely to be of interest.

As with the thesis that follows, two threads wind through this overview: (1) the user experience; (2) the technology behind it.

Envisioning the User Experience

Before thinking about the details of the user experience, it is worth considering the raison d'etre of a Web site. If a site is to be useful and therefore popular, I claim that it must fall into one of the following categories:
  1. service
  2. entertainment
  3. education
There are other Web sites out there that might fall into such categories as "corporate vanity" or "crass promotion", but I think that these may safely be ignored for the purposes of this thesis. Since they aren't going to be useful to users, they won't get enough traffic to require thoughtful engineering.

Examples of sites that fall into the service category:

Entertainment:

Almost every other site, if it is actually popular, falls into the education category. We can draw a sociological conclusion from the preceding examples: people want services, entertainment, and education.

Identifying the Relevant Engineering Problems

Given the sociological objective of delivering services, entertainment, and education to users, what are the relevant engineering problems? In 1993, I made the following assumptions:

Given those assumptions, I focused my research effort on building two layers of technology:
  1. A reliable, monitored substrate of network connectivity, operating system, and relational database management system.
  2. An open-source reusable toolkit for building Web sites for collaboration.
The first layer, which I've called the ArsDigita Server Architecture, can be applied to any Web service that requires a database management system back-end, ranging from the most primitive order-taking ecommerce site to the most sophisticated collaborative authoring environments. The second layer, which I've called the ArsDigita Community System, is more application-specific. It is intended to support development of online communities with an educational dimension, can be useful for building collaborative Web-based applications that replace desktop applications, and can be stretched to solve some commercial problems.

So that we do not lose momentum amidst the myriad and sometimes arcane ways in which computer systems can fail, I've relegated discussion of the ArsDigita Server Architecture to Appendix A. We're now free to talk about applying the ArsDigita Community System toolkit.

Required Elements of Online Community

Let us step back from the technology for a moment and consider the required elements for a sustainable online community:
  1. magnet content authored by experts
  2. means of collaboration
  3. powerful facilities for browsing and searching both magnet content and contributed content
  4. means of delegation of moderation
  5. means of identifying members who are imposing an undue burden on the community and ways of changing their behavior and/or excluding them from the community without them realizing it
  6. means of software extension by community members themselves
Before we look at which of these elements are addressed by the system built for this thesis, let's consider whether any of these points are novel, debatable, or unsupportable.

Element 1, that a community can't get off the ground without magnet content, was novel in 1993 when I began my research. The spectacular public failures of thousands of community sites over the ensuing half-decade have reduced the novelty and increased the supportability of this idea. Chapter 1 contains examples of sites that failed due to lack of magnet content and sites that have become vastly richer because of it.

Element 2, that a sustainable community needs means of collaboration that automatically organize user-contributed content, remains novel. Most online communities depend on human administrators to collect user-contributed content and associate it with the magnet content. This activity will typically wear out the site owner (if a non-commercial site) and/or render the site unprofitable to operate (if a commercial site). Chapters 3 and 15 (Case 4) give examples of this element.

Element 3, browsing/searching magnet and user-contributed content, is not novel in and of itself. However, the systems implemented and described in the thesis are novel. They differ substantially from the standard USENET-style threaded discussion user interface that permeates community Web services. Chapter 15 outlines some of the straightforward improvements of the software over USENET. A sustainable moderated online community needs very high quality tools to help new users find previously answered questions. Otherwise moderators and long-term participants become overwhelmed with duplicate questions.

Element 4, that a community will not be scalable unless there are semi-automatic means of delegation of moderation, is novel, as is the implementation discussed in Chapter 3.

Element 5, that costly community members must be identified and converted to supportable members, is novel, as are the means outlined to determine a member's administration cost and the means that I've implemented and tested to exclude persistently costly members. Chapter 15 contains a brief discussion of how a problem with one community member can overwhelm a 25,000-user/day site and a surprisingly simple and effective method for dealing with the problem.

Element 6, that software to support an online community must be extensible by community members themselves, is the most novel element. Conventional wisdom (and the references below) assumes that a priesthood of programmers will produce the ultimate community/collaboration system. The users will use it. My research has led me to conclude that no priesthood will be smart or large enough to serve users' needs. The solution is to let the priesthood worry about concurrency control and transactions (i.e., they get to build the RDBMS, the Web server that talks to the RDBMS, and a language interpreter) and to let the users extend fundamentally sound community software in a safe interpreted language. Examples of the utter failure of conventional approaches are given in the application servers section of Chapter 13.

How can we know whether or not these elements are truly required? By looking at successful and unsuccessful online communities and trying to correlate degree of success with possession of the allegedly required elements. This sounds like progress but it immediately gets us into a different kind of trouble: how do we measure the success of an online community?

Online community success metrics (that don't work)

All of the online communities discussed in this thesis have an educational dimension. Thus, one would think that the most important metrics would focus on the learner, e.g., "how many minutes did it take users to get their questions answered?"

Simple learning effectiveness metrics aren't appropriate for evaluating community site software. First, it might be the case that the publishers invested an extraordinary amount of effort writing the magnet content. The Encyclopedia Britannica site (http://www.eb.com) would rank very high on the "did the user get an answer to his or her question?" metric, and yet the EB folks have made no attempt to build an on-line community.

A second argument against simple learning effectiveness as a metric is that it does not take into account the scarcity of experts' time. Suppose that at Site X, for each learner 15 experts are each willing to devote 20 hours per week to teaching, but at Site Y, for each learner there is only 1/100th of an expert available and that expert can only spend 1 hour per week. Site X is always going to provide a more effective learning experience, even if its community technology is much more primitive than Site Y's: Site X has 300 expert-hours available per learner per week, while Site Y has 0.01.

Community members' time will remain scarce despite any conceivable improvement in computing technology. Experts' time will be particularly scarce, especially on non-commercial sites where users are donating time to help other users. The site owner or administrator will be the most burdened, a problem that is again especially acute on non-commercial sites. The most useful metrics will all get at questions like "Is the software making the most effective use of the experts' time?" or "Is the community able to capture and make productive all the hours that experts are willing to donate?"

Let's consider the site administrator first. If we ask simply "What kind of server-side technology change would best reduce the number of hours required to run the service?" then the answer is "unplug the server." We presumably need to refine our question to adjust in some way for usage and/or benefits delivered to the community.

How to measure usage/benefit though? As discussed at the end of Chapter 4, it isn't clear whether a community site should want a user to show up, get an answer, and then leave. Suppose Joe User shows up looking for a phone number but is entertained and enlightened enough to spend two hours on the site. Maybe we should consider the site a success, even if Joe does not find the originally desired phone number.

An online community success metric that works

My experiments have shown that a workable metric of the effectiveness and sustainability of a community site is the ratio of material authored by site users (community members) to material authored by the publisher.

Consider http://www.photo.net/cr/, a travel story about Costa Rica. This site has been live since early 1995 and yet has never felt like a community. People have an intense interest in a travel destination for the month or two before their vacation. After they return from Costa Rica and go back to work, they don't have a continuing interest in improving their knowledge about the country or their skills for traveling to Costa Rica. They focus on their jobs and their next travel destination. Here are some numbers for publisher- and user-authored content:

Because users were unable to submit photographs under the older version of the community software, I have not included the publisher's photographs as content in the above statistics. In any case, it is not really meaningful to talk about photos in terms of bytes. Because we don't want a popular long-running discussion to skew the statistics, we count only two months' worth of bulletin board postings toward the user-contributed content.

Bottom line: user-contributed content is only 7 percent of publisher-authored content. Compare to http://www.photo.net/photo/, a page devoted to photography that is often explicitly referred to as an on-line community:

User-contributed content is 393 percent of publisher-authored content, i.e., 4 times as much as the publisher provided. Eighty percent of the service was user-contributed. Had we included archived (not deleted by the moderator) bboard postings older than two months, we would have found that 92 percent of photo.net was user-contributed (i.e., the archived bboard postings alone were about 10 times larger than publisher-authored content).
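To make the arithmetic behind these percentages explicit, here is the conversion between the two ways of stating the metric, sketched in Python with the numbers quoted above. (The "about 10 times" figure is approximate, which is why the computed 91 percent differs slightly from the quoted 92 percent.)

    def user_share(pct_of_publisher):
        """Convert user-contributed content, expressed as a percent of
        publisher-authored content, into the user-contributed share of
        the total site."""
        return pct_of_publisher / (pct_of_publisher + 100.0)

    print(round(user_share(393) * 100))   # 80: photo.net, two months of postings
    print(round(user_share(1000) * 100))  # 91: archive at roughly 10x publisher content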

The metric seems to work. We have two sites that offer similar technological means of collaboration. The one that is perceived to be a failure as an online community scores low on the "percent of user-contributed content" metric and the one that is perceived to be a success scores high.

Let's try another pair of sites. The first is http://www.photo.net/italy/, my 600-image photo essay and guidebook to Italy. I don't think of the site as having a significant online community dimension. Here are the numbers:

User-contributed content is 15% of publisher-authored content. Let's compare this to http://www.photo.net/wtr/, my Web service for people who want to learn more about building Web services. My perception of this service is that it is beginning to take off as an online community, but it is nowhere near as developed as photo.net. The raw numbers do not confirm this observation: User-contributed content is only 15% of the publisher-authored content. However, it turns out that the archived discussion threads in this service tend to be less chatty and more relevant than at photo.net, so it wouldn't be unfair to include the full 724 MB of archived bboard content. That would bring user-contributed content up to 41% of publisher-authored content, significantly better under our metric than the total failures as communities (/cr and /italy) and significantly worse than the big success (/photo).

Note: data are as of November 26, 1998.

How does the toolkit help?

We have a metric to evaluate success. We have our required elements:
  1. magnet content authored by experts
  2. means of collaboration
  3. powerful facilities for browsing and searching both magnet content and contributed content
  4. means of delegation of moderation
  5. means of identifying members who are imposing an undue burden on the community and ways of changing their behavior and/or excluding them from the community without them realizing it
  6. means of software extension by community members themselves
How does the toolkit that I built help publishers achieve these six required elements? We will consider each element separately.

For Element 1 ("magnet content authored by experts"), the toolkit helps by encouraging publishers to structure content rather than use WYSIWYG tools. People in both industry and academia have been working hard since 1993 to make WYSIWYG tools for constructing Web sites. Every year these tools get better and yet sites get worse. Part of the problem, as noted in Chapter 4, is that most of this software is attacking the wrong problem: the putatively arcane syntax of HTML. There is a more fundamental problem, however, which is that a WYSIWYG tool enables a user to do things by hand but then forever forces the user to do everything by hand. This is OK for desktop publishing where the user wants to produce a physical artifact and then move on with life. Web sites, however, by their very nature must adapt and grow. A publisher who is forced to make thousands of changes by hand will become exhausted and give up.

The image library example (Chapter 6) is a good illustration of the power of even the most primitively structured content. By keeping captions for 7,000 images in flat files, I was able to painlessly convert from in-line GIF to in-line JPEG. The same rudimentary database enabled me to construct Emacs authoring shortcuts. The same database enabled me to build an image search service in about one hour: http://db.photo.net/stock/ (handles more than 500 queries per day; the program searches through the text of file names, captions, and technical details). The same database enabled me to enrich my site with the FlashPix format and OpenPix protocol. Despite its non-existent budget and staff, photo.net probably serves more high-res images via the OpenPix protocol than all the other users of the protocol combined. Perl and flat files aren't very high tech, but they're apparently much more effective than what most rich publishers are using.
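To show how little machinery this requires, here is a minimal sketch of a search over such a flat-file caption database, written in Python rather than the Perl of the original; the tab-separated layout and the file name are assumptions for illustration, since the exact file format is not specified here.

    def search_images(db_path, query):
        """Return (filename, caption) pairs whose filename, caption, or
        technical details contain the query string, case-insensitively."""
        q = query.lower()
        matches = []
        with open(db_path, encoding="utf-8") as f:
            for line in f:
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 3:
                    continue  # skip malformed records
                filename, caption, tech = fields[0], fields[1], fields[2]
                if q in filename.lower() or q in caption.lower() or q in tech.lower():
                    matches.append((filename, caption))
        return matches

    # e.g., search_images("captions.db", "gotland")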

For Element 2 ("means of collaboration"), the toolkit offers subsystems that collect comments on static .html articles, arrange discussions into threads, post classified ads and auction bids on those ads, collect comments on news items (and push them into the archives when the news becomes stale), and arrange stories that are intended to be durable, stand by themselves, and collect few comments.

For Element 3 ("browsing and searching magnet and contributed content together"), the toolkit makes careful use of modern full-text search engines. My experiments have demonstrated that this is critical to running a site. Without really high quality searching, discussion group moderators become overwhelmed as newcomers ask duplicate questions.

For Element 4 ("means of delegation of moderation"), the toolkit offers reports that show site administrators activity on a per-user basis. The toolkit also offers means of allowing multiple people to co-moderate a discussion forum.
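Such a report reduces to a simple aggregate query against the posting table. The sketch below uses a hypothetical schema (a bboard_posts table with user_id and posted_at columns), not the actual ArsDigita Community System data model, with Oracle-style SQL since the toolkit ran against Oracle.

    # Hypothetical schema, for illustration only: bboard_posts(user_id, posted_at).
    ACTIVITY_BY_USER = """
    select user_id,
           count(*)       as n_posts,
           max(posted_at) as most_recent_post
      from bboard_posts
     where posted_at > sysdate - 30
     group by user_id
     order by n_posts desc
    """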

For Element 5 ("means of excluding burdensome members"), the toolkit does not offer any silver bullets. There is an unavoidable tension between facilitating participation and spam- and bozo-proofing a community. For example, if a site were to require forum contributors to obtain a personal Verisign certificate before posting, that would presumably reduce impostors to a minimum. However, the overhead and expense of obtaining the Verisign certificate would stifle a nascent community. Pre-moderation may protect community members from irrelevant, redundant, or offensive postings. However, if there isn't a 24x7 group of energetic moderators, the fact that contributions don't make it to the live site for many hours may itself stifle discussion (for example, on the old http://www.photo.net/photo/, by the time one of the three moderators got around to deleting an off-topic camera shopping posting, the person who asked the question had usually gotten at least one helpful response).

The toolkit's comprehensive tracking of each user's history with the site makes it possible to try some innovative techniques:

If none of these approaches works for a particular user, it is easy to flag that person in an RDBMS column. The obvious thing to do would be to serve a "you've been excluded for being burdensome" page. This may enrage the excluded user, who is then motivated to acquire a fake identity on hotmail.com and return under the guise of a new user. As described in Chapter 15, the ArsDigita Community System is capable of presenting a plausible picture of a deadlocked database and a flaky site. Historically, the burdensome user becomes frustrated with trying to post material and leaves. This approach is innovative, but we do not claim that it is a complete solution.
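A minimal sketch of this exclusion technique, with invented names throughout (the real implementation was Tcl running inside AOLserver and differs in detail):

    from dataclasses import dataclass

    @dataclass
    class Member:
        name: str
        flagged_as_burdensome: bool = False  # the RDBMS column mentioned above

    # A plausible-looking transient failure; ORA-00060 is Oracle's deadlock
    # error, matching the "deadlocked database" picture described above.
    FAKE_ERROR = ("Database connection failed: ORA-00060: deadlock detected "
                  "while waiting for resource. Please try again later.")

    def insert_posting(member, text):
        return f"posted for {member.name}: {text}"  # stand-in for the real insert

    def handle_post(member, text):
        if member.flagged_as_burdensome:
            # No explicit "you are banned" page; the poster blames the
            # site, gets frustrated, and eventually leaves.
            return FAKE_ERROR
        return insert_posting(member, text)

    print(handle_post(Member("alice"), "hello"))
    print(handle_post(Member("bozo", flagged_as_burdensome=True), "spam"))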

For Element 6 ("means of software extension by community members themselves"), the toolkit offers first and foremost its own open-source software. By leaving concurrency control and transactions to the RDBMS, and building pages in a safe interpreted language, we ensure that a novice programmer attempting to extend the system is likely to break only one page at a time.

Another important feature of the ArsDigita Community System is an emphasis on transportable software, but not in the traditional direction. Since 1993, virtually all Web technology developments have been aimed at allowing publishers to transport software that runs on users' desktop machines. Java virtual machines have been embedded in commercial browsers since 1996. Tens of millions of these Java-capable browsers have been installed on desktops worldwide with remarkably little impact; most people cannot think of a valuable Web service that has been made more useful via client-side Java.

So enthusiastic are computer scientists about moving code from server to client that, despite the universal commercial adoption of Java, academic research continues in the area of transportable-to-the-client software architecture. The CURL project at MIT (http://curl.lcs.mit.edu/curl/) is a good example of some of the best ideas within this genre.

I argue that perhaps we're trying to move code in the wrong direction. A Web service operator will never be able to anticipate all user needs. For a static data set, it might be sufficient for a Web publisher to offer raw data in a structured format, e.g., a CSV file suitable for importing into a spreadsheet. This won't work, though, if what a user wants is some kind of algorithmic operation on new or changed data. We should be working on better ways for users to supply code to publishers.

Even in its infancy, the toolkit provided a very primitive example of client-specified code in the user-specified alerts for the bulletin board subsystem. A user can say "I want a daily summary of messages containing the keyword Nikon". The discussion forum system described in the community chapter incorporates a primitive formal language in which users can specify which messages they want and with what frequency.
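A toy version of such an alert, with an invented representation (the toolkit's actual formal language is richer than a single keyword and frequency):

    from dataclasses import dataclass

    @dataclass
    class Alert:
        email: str
        keyword: str    # deliver messages containing this word
        frequency: str  # e.g., "daily" or "weekly"

    def digest(alert, messages):
        """Return the messages that would go into one summary for this alert."""
        return [m for m in messages if alert.keyword.lower() in m.lower()]

    msgs = ["Which Nikon lens for portraits?", "Tripod recommendations?"]
    print(digest(Alert("joe@example.com", "Nikon", "daily"), msgs))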

Even the most expensively-produced commercial sites are currently unable to offer the server-side scripting that users need. For example, I'd like to see all the articles written by my friend Carey Goldberg, a reporter at The New York Times, but don't want to subscribe to the physical paper or check http://www.nytimes.com every day. I can order a section of the paper by email, but there is no way for me to tell the Times's server that I'd like "all articles by Carey Goldberg, delivered by email".

Related Work

The Computer as a Communications Device (Licklider and Taylor 1968; available at http://www.memex.org/licklider.html) contains a section entitled "on-line interactive communities". The paper sets forth most of the potential advantages of on-line communities listed in this thesis. Licklider and Taylor also deftly address the social issues of differential computer network access that today support an entire punditry industry.

As referenced in the last chapter of my thesis, Douglas Engelbart's demonstration at the December 1968 Fall Joint Computer Conference in San Francisco was at least as prescient, with the kicker that he actually implemented almost everything that he predicted would be useful.

The Great Men of 1960s Computer Science anticipated 1990s Internet-based computing to a degree that seems remarkable. Yet sadly we still need to sit at our desks and design systems and write code. Why?

The Great Men envisioned a system to support concurrent interaction by generals and business executives (i.e., important white guys like themselves). They didn't think too hard about organizing content contributed by users who weren't working concurrently. They didn't anticipate user interactions with these systems lasting for years or decades and therefore requiring thousands of moderation/categorization hours. Because they didn't foresee the thousands of required moderation hours, they didn't think about the problem of means of delegation of moderation. Because they were building systems for Important White Guys, they didn't anticipate the problem of how a community might identify and exclude an anonymous bozo.

More fundamentally, the Great Men did not anticipate the stagnation of software development technology. They assumed that software tools would be improved and therefore that programming challenges would be met relatively effortlessly. Licklider's 1960 Man-Computer Symbiosis notes that "a multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem solving of military significance. That would leave, say, five years to develop man-computer symbiosis and 15 years to use it."

This is somewhat frustrating for the working engineer. We have to figure out how to make on-line communities work. We have to figure out how to make sure that they stay up and running. We have to figure out how to make sure that they don't cost too much time or money to operate. We have to figure out how to engineer them in such a way that non-programmers can extend the source code without breaking it. When we're all done, we have to give the credit to Vannevar Bush, Licklider, Engelbart, and Negroponte. We might even have to pay patent royalties to Jerry Lemelson's estate (as Hal Abelson notes, "Success has many fathers but not all of them were smart enough to patent it.").

Do engineers have the right to be bitter? Why should anyone get credit for mere programming? The important thing is the idea.

My answer is to quote Al Drake. He tells students who complain about losing points on a quiz for a sign error, "sometimes a sign error means the rocket goes down instead of up." A great idea plus a server that isn't responding isn't a great system. If building great systems were easy, they'd all have been built in the 1970s.

The first Multi-User Dungeon (MUD) was built in 1979 by Roy Trubshaw and Richard Bartle at Essex University on a DECsystem-10 time-sharing computer. According to Richard Bartle's December 1990 report Interactive Multi-User Computer Games, MUDs did some interesting things:

How has technology improved as the world moved from MUD to the Web? By and large users have been stripped of their ability to perform server-side programming. In its place, they've gotten Java-enriched banner advertisements that crash their browsers.

And that, apparently, is progress.


philg@mit.edu