Things I learned the Hard Way

(about running Web sites)

by Philip Greenspun for Web Tools Review

Note: some of these ideas have been fleshed out in my book on Web publishing.


Decide what your strengths are

Do you have $10 million/year to spend on content? If so, you can probably build something like HomeArts (a fine Hearst service, I might add), Pathfinder, or (shudder), HotWIRED. If you should find yourself a bit short, say, by $9.99 million, then you'll have to be a bit more creative if you want to get to several hundred thousand hits/day.

Porn is the most obvious strategy. I added a page of nudes to my photography magazine and, within two weeks, got an extra 3000 readers/day. These extra readers seem to be with me permanently and account for at least 30,000 hits/day. Still, there is something a little unsatisfying about this approach...

Unless you are a very creative person yourself, you will need to get some content somewhere. In the early days of wide-area hypertext, i.e., back in the 1960s, people thought that great collaborative documents would be produced with the aid of all this newfangled technology. It is true that the best sentence in Travels with Samantha was written by someone else: "You make me sound like a world-class caustic nympho bitch" (Jennifer, Chapter XVI; commenting on my description of her). It is also true that the other voices section of the book contains some gems. However, by and large you can't count on getting anything of much value from any individual reader.

Does that mean you can forget reader contributions? Not exactly. Even though you can't get anything of value from any one reader, maybe you can get something of value from them collectively. For example, look at the classified ad section of photo.net. I spent three days programming this relational database-backed service. I realize now that I have over 1000 records of what different cameras are worth. I have the beginnings of a blue book for used cameras. Maybe that will have some value. (I don't mean commercial value, by the way; I don't try to make money off my site unless you count shipping out a few photos.)
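
As a rough illustration of how individual ads might be rolled up into a blue book, here is a minimal Python sketch. It is not the photo.net implementation: the record layout (model name plus asking price), the function name, and the sample figures are all invented for the example.

```python
from collections import defaultdict
from statistics import median

def blue_book(ads):
    """Return {model: (number_of_ads, median_asking_price)}."""
    prices = defaultdict(list)
    for model, price in ads:
        # Normalize the model name so "nikon f3 " and "Nikon F3" group together.
        prices[model.strip().lower()].append(price)
    return {model: (len(p), median(p)) for model, p in prices.items()}

# Hypothetical sample ads, not real data from the site.
sample_ads = [
    ("Nikon F3", 450),
    ("nikon f3 ", 500),
    ("Nikon F3", 400),
    ("Canon AE-1", 150),
]
```

Calling blue_book(sample_ads) groups the three Nikon ads under one key. The median is used rather than the mean so that one wildly optimistic seller does not skew the figure.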

Maybe you aren't enough of an SQL and Unix whore to make your computer automatically combine reader contributions into something valuable. Then perhaps you can find a partner, maybe a non-profit organization, with rights to a body of material. Convert it all to HTML, install a full-text search engine, and you've got a worthwhile site.

Web publishing is getting much more competitive. Gone are the days when one could kick back and watch traffic climb 10%/month on a fundamentally lame site. If you don't have piles of money, you have to figure out what you can do better than the rest of the folks out there.

The Web is more flexible than publishing on paper... NOT

Electronic publishing is great. No deadlines. Revise anything at any time. Change your mind in a big way? mv or rm the file. That's what I thought in late 1993 when I was getting into Web publishing. It is now March 1996 and I've decided that Web publishing is in many ways less flexible than printing on dead trees.

If you're Hearst and you decide that a magazine isn't making it, you can just stop printing it. Put a "final issue" sticker on the cover, include an apologetic letter from the editor thanking the readers for their loyal support, pay the photographers and writers, and do something else the next month.

Travels with Samantha was the first Web site that I built. The files are organized rather badly so I decided early on to move a file or two. My site was only getting about 10,000 hits/day so I only got 100 email messages from users complaining about being led into dead ends from other sites. I only got 10 angry messages from other Web publishers who'd linked to those now-missing files. I put in some symlinks and redirects and my mailbox stopped humming.
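
The text does not say how the symlinks and redirects were set up, so the following is only a sketch of the idea in Python: keep a table of old paths and answer requests for them with a permanent redirect to the new location. The file names in MOVED and the names resolve and RedirectHandler are hypothetical.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical old-path -> new-path table; the real file names are not given.
MOVED = {
    "/old-name.html": "/new-name.html",
}

def resolve(path):
    """Return the new location for a moved file, or None if it never moved."""
    return MOVED.get(path)

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        new_location = resolve(self.path)
        if new_location is not None:
            # 301 tells browsers and search engines the move is permanent.
            self.send_response(301)
            self.send_header("Location", new_location)
            self.end_headers()
        else:
            # A real server would serve the requested file here.
            self.send_error(404, "Not Found")
```

To try it locally, one could pass RedirectHandler to http.server.HTTPServer and call serve_forever(). In practice the same table usually lives in the Web server's own configuration rather than in application code.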

I got a taste of my own medicine a few months later when Brandon Plewe at SUNY Buffalo changed the syntax of his CGI script that generates maps from the Xerox map server. I had linked every city name in Travels with Samantha to those maps. The day he made the change, I got hammered with email from unhappy users. I figured out the problem and sent Brandon a message begging him to make the old syntax available for my users, which he was gracious enough to do.

If you put something on the Web, you are inviting people to link to it. Webmasters worldwide will each spend a few minutes linking your pages into theirs. If you capriciously change your mind about file names and directory structures, or just decide that something doesn't belong anymore, you are forcing hundreds of webmasters to revisit a file that they might have put to bed months or years ago. You are causing thousands or tens of thousands of users to be led into an ugly 404 Not Found screen. You are a bad Net citizen.

What's wrong with being a bad Net citizen? Well, if your server gets 10 hits/second at peak hours like mine does, you may find that being a bad Net citizen fills up your inbox rather fast. It is easy to make friends on the Net but even easier to make enemies. For example, by trying to stay friends with my readers, I recently made an enemy out of an author.

He'd posted something in a USENET newsgroup. I asked him if I could publish it on my Web server and he agreed. I edited it and reformatted it into HTML and stuck it on-line. Everyone was happy. Some time later, the author decided that he'd like to sell the article to a dead trees magazine and that it was somehow unfair to the readers of that magazine if they paid for something they could have gotten for free on-line. So he asked me to remove the article for a period of time.

Note that this is a request he never would have made of a dead trees magazine. Once the magazine had been printed, the author would have assumed the article could not be withdrawn. However, he knew that it was technically possible for me to rm the file and to have my mirror site administrator in Finland rm the file. So he asked. I'd never gotten a request like that before but I thought about the fact that the URL had already been indexed by the search engines and was very likely being linked to by other sites. I didn't want to break faith with my readers. I knew that services like DejaNews were going to hold onto his article regardless of what I did. I thought that letting a print publication chase something off the Internet, albeit temporarily, was a bad precedent. I knew that if readers were led into a blind hole by AltaVista, WebCrawler, and Lycos that they wouldn't be shy about sending me email. So I refused and made an enemy out of a nice guy and good writer.

Is there a lesson to be learned here? Yes. Make it clear to authors that Web publishing by default means "forever at the same URL." That means they can hand it out to their friends but it also means that they can't take it back.

[Amusing anecdote: A friend of mine noticed that, after moving around some icons on her web server, there were a lot of error messages in the log. The "referer" header revealed that a bunch of people were linking to her icons as in-lines rather than copying them to their servers. She sent them email but they ignored her, so she put the icons back with some interesting modifications.]
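
A sketch of the kind of log check that would reveal this, written in Python and assuming the server writes the common "combined" log format with the referer in quotes after the status and byte count. The function name, the list of image suffixes, and the host names are assumptions for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches: "GET /path HTTP/1.0" 404 123 "http://referring.site/page"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")

def hotlinkers(lines, own_host):
    """Count image requests per referring host other than our own."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if not match.group("path").lower().endswith(IMAGE_SUFFIXES):
            continue  # only images are of interest here
        host = urlparse(match.group("referer")).netloc.lower()
        if host and host != own_host:
            counts[host] += 1
    return counts
```

Feeding it the lines of an access log and the site's own host name returns a tally of outside sites whose pages pull in the images directly. Requests with no referer (logged as "-") are ignored.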

Do not solicit user feedback

I had a "please tell me how you like Travels with Samantha" form. I discovered two things: Nicolas Pioch, a fellow winner of Best of the Web '94 (for his beautiful WebLouvre) told me of how he progressively buried his email address further and further away from his most popular pages. "People were asking me for the names of cheap hotels in Paris," he said.

Of course, your error pages should solicit user email, though if you are running a good stats program, then you'll be on top of this without their prodding... won't you?

Note: I eventually came up with a database-backed comment server solution for user feedback and described the system in Chapter 14 of my first book on Web publishing.

If you're going to solicit user feedback...

... then ask them how they feel, not how they are going to feel.

For example, in a medical education "tip of the day" system that I built with some folks at Children's Hospital, we would be sending juvenile diabetics e-mail periodically with information that they needed to manage their condition. We wanted to keep track of which patients were "checked out" on which pieces of knowledge so that a doctor could use scarce face-to-face minutes with a patient to quickly focus on those areas in which the patient needed more education.

What we needed for our database was "should we send this patient the same tip a few weeks from now?" But it would have been a terrible idea to ask the patient "are you going to have forgotten this a few weeks from now and need to be reminded again?" How is the patient supposed to know? And in any case why force a patient to go through the mental exercise of trying to predict his behavior and feelings?

The solution was to ask a patient to "click here if you already knew this". That is cognitively a much easier question to process. And if the patient already knew the tip and had just been reminded again then it was pretty safe to assume that we did not need to send him e-mail anymore.
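
The bookkeeping behind that question can be sketched in a few lines of Python. This is not the system built at Children's Hospital; the Patient class, the function names, and the tip identifiers are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Patient:
    email: str
    # Tips the patient has told us he already knew.
    known_tips: set = field(default_factory=set)

def record_already_knew(patient, tip_id):
    """Called when the patient clicks "click here if you already knew this"."""
    patient.known_tips.add(tip_id)

def tips_to_send(patient, all_tip_ids):
    """Tips the patient has not marked as known, in their original order."""
    return [tip for tip in all_tip_ids if tip not in patient.known_tips]
```

The patient only ever answers a question about the present ("did I know this?"), and the system derives the answer to the question about the future ("should we send it again?") on its own.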

Come up with a coherent link strategy

If you have a popular server, you will get requests to "exchange links." Personally I thought this was pernicious even before I started getting 10 messages/day asking for links out. I link to resources that I think complement my services and will be valued by my readers. If the linked-to author thinks my work sucks and won't link back to me, should I remove the link? Should that change my opinion of the value of his service?

If someone asks me to link to their site for no special reason, I tell them that I don't try to maintain a comprehensive list of links to the rest of the Internet, even in one subject area. Even if I did, what would be the utility of this to my readers? They have Yahoo. They have AltaVista. Why do they need my half-heartedly maintained list?

I unconditionally refuse to link to anyone whose server isn't fast and who doesn't believe this width and height tag fad from 1994 is going to last. Why frustrate users by sending them to incompetently maintained sites?

Anyway, the important thing is to have a policy before you set out to publish. Are you going to link from inside your text? From a separate page? From the bottom of every page? From a relational database table? How are you going to make sure that your links are current?

Finally, it doesn't hurt to inform people that you are linking to them. Maybe they'll like what they see and link back to you, but that's not really the point. The point is to discourage them from reshuffling their URLs (see above) and to encourage them to warn you if they are going to add a service, move a service, or whatever.




philg@mit.edu

Reader's Comments

You are causing thousands or tens of thousands of users to be led into an ugly 404 Not Found screen.

A great way to do this is to get a URL published in a magazine. I had the URL http://www.interlog.com/~john13/vcoach/vcoach.htm published and about 10% of the readers typed it in wrong (including some very creative spellings). Makes you realize how important it is to make URLs pronounceable and easily spellable.

John

-- John Russell, April 12, 1999

tipjar.com is designed to be a way to bypass the publishing industry, based pretty much on the ideas in Snow Crash. Muchas gracias to Nina Paley, who suggested I read Snow Crash shortly before her bout with CTS.

-- david nicol, April 19, 1999

One little tip that I learned the hard way, with my (shameless plug) dslreports site, was that assigning users passwords automatically (to ensure they have a real email address) is a lot harder than I had imagined.

First, I tried random combinations of letters and numbers. Easy! But I got emails complaining that people couldn't log in, so, looking more closely, I found people were not seeing the difference between "l" and "1", so I took those characters out.

But the complaints continued... this time it was "0" and "O", so I took those out.

Then people would type "1" instead of "i".

So I took "i" out.

Then I found people had started using copy-paste, and they would accidentally include a leading or trailing space from my this-is-your-password email, so I allowed spaces.

This just left a few real hard-core dummies who didn't feel that upper and lower case were important.
So I made passwords case-insensitive, and shortened them to almost cash machine PIN sized.
Finally, my mailbox is quiet, but not before annoying quite a few users along the way...
It's this type of thing that they never taught me in computer science.


-- Justin Beech, July 13, 1999
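
The lessons in the comment above can be sketched in Python as follows. This is not the dslreports code: the function names and the default length of 6 are assumptions, and "allowing spaces" is read here as stripping leading and trailing whitespace before comparing.

```python
import secrets

# Lowercase letters and digits, minus the look-alikes named above: l, 1, 0, o, i.
ALPHABET = "abcdefghjkmnpqrstuvwxyz23456789"

def make_password(length=6):
    """Generate a short random password from the unambiguous alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def normalize(typed):
    """Forgive pasted whitespace and differences in case."""
    return typed.strip().lower()

def check_password(typed, issued):
    """Compare what the user typed against the password that was issued."""
    return secrets.compare_digest(
        normalize(typed).encode(), normalize(issued).encode()
    )
```

A password pasted with a trailing space or typed in capitals still passes check_password. Short, case-insensitive passwords are easier to guess, so a sketch like this trades some security for fewer support emails.
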

I don't entirely see why the amusing anecdote about changing icons that are linked to by another site *is* amusing. (Actually I do, but I'm concerned with the principle.) If someone deep links to an image on your page, don't you owe them as much responsibility as to someone who links to your text? If not, why not?

Philip usually favours the idea that people can deep link to text and that readers are entitled not to be "led by the nose" according to the site owner's intention. How come it's different for pictures?

-- philip jones, April 18, 2000

If someone deep links to an image on your page, don't you owe them as much responsibility as to someone who links to your text? If not, why not?

The point was not the deep linking to pictures, but the use of images, icons, etc., uncredited, as part of another site. Say I found a nice image saying 'home': I could use it in my site design with an <IMG SRC="http://their.site.com/home_icon.gif"> instead of saving the image locally on my server. This then costs their.site.com the bandwidth. It may not be much in many cases, but it is wrong in principle. Either link with an <A HREF> tag, or save the image on your own server.

-- Michael Jemmeson, April 19, 2000

All things being equal, I think a site is much more easily managed by confining persistent elements (not only graphics, but text as well) to the local structure of the site. I have enough trouble maintaining the various links that are supposed to be there, not to mention links that (IMHO) shouldn't... BTW, I have noticed that visitors enjoy (too much I think) the opportunity to point out _any and all_ mistakes, broken links, etc., and I get too much hatemail already to risk linking my menu or bullet graphics directly from another site. I recall stories of webmasters replacing linked photos of generic content with explicit content, which is a little over the top, but if you choose to link to someone's stuff without notifying them, I guess it's your own fault. The deciding factor for me in this case is that she asked them several times via email to stop linking to her images, but they just ignored her... I mean how much trouble is it really to answer an email?

-- Deke Conine, August 3, 2001