How do you clump database content?

Philip Greenspun's Homepage : Philip Greenspun's Homepage Discussion Forums : Ask Philip : One Thread

In chapter 6 of the Internet Application Workbook, you and the other authors outline a unified content data model for holding news items, tutorial articles, user comments, questions and answers, and other items in a single content table. One of the columns in this table is content_type, which you can use to tag a row as a forum_posting, article, etc.

But how do you separate, say, an 'Ask Philip' forum posting from an 'Ask Philip's evil twin' posting, or a 'Camera Equipment' posting from a 'Medium Format' posting? Perhaps with a content-forum map? But what do you call this map, and how general can it be? Is there a 'Forums' table, which maps forum postings to forums, or can you have a more general 'Categories' or 'Clumps' or 'Views' table, mapping any content to any grouping, with everything from Ask Philip to Philip News Items to Philip Travel News Items to Philip Tutorials in it?

And then how would you define subcategories or sub-clumps, like a 'Lenses' forum that can be filtered out from the 'Equipment' forum?

-- R Tate, May 22, 2003


The answer in the ancient ArsDigita Community System toolkit (versions 3.x and earlier, which are the only ones that I worked on) is that each bboard forum is represented with a row in a bboard_topics table. That row indicates how the forum is displayed, whether it is private, who is the primary moderator for display on the pages, etc. Each message has a column mapping to the forum ID. So a message belongs to one and only one forum topic.
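
A minimal sketch of that layout, using Python's sqlite3 module so it can be run without a database server. Only the bboard_topics table name comes from the description above; the bboard_messages table, every column name, and the sample rows are assumptions made for illustration and are not the actual ACS 3.x schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per forum. Column names are illustrative guesses.
    CREATE TABLE bboard_topics (
        topic_id          INTEGER PRIMARY KEY,
        topic_name        TEXT NOT NULL UNIQUE,
        private_p         INTEGER NOT NULL DEFAULT 0,  -- 1 = private forum
        primary_moderator TEXT
    );
    -- Each message carries exactly one forum ID, so it lives in one forum.
    CREATE TABLE bboard_messages (
        message_id INTEGER PRIMARY KEY,
        topic_id   INTEGER NOT NULL REFERENCES bboard_topics(topic_id),
        subject    TEXT NOT NULL
    );
""")

# Made-up sample data.
conn.executemany(
    "INSERT INTO bboard_topics (topic_id, topic_name) VALUES (?, ?)",
    [(1, "Ask Philip"), (2, "Medium Format")],
)
conn.executemany(
    "INSERT INTO bboard_messages (topic_id, subject) VALUES (?, ?)",
    [
        (1, "How do you clump database content?"),
        (2, "Which 6x6 camera?"),
        (1, "Favorite airplane?"),
    ],
)


def subjects_in_forum(conn, topic_name):
    """Return the subjects of all messages in one forum, oldest first."""
    rows = conn.execute(
        """
        SELECT m.subject
        FROM bboard_messages m
        JOIN bboard_topics t ON t.topic_id = m.topic_id
        WHERE t.topic_name = ?
        ORDER BY m.message_id
        """,
        (topic_name,),
    )
    return [row[0] for row in rows]


print(subjects_in_forum(conn, "Ask Philip"))
```

Separating an 'Ask Philip' posting from a 'Medium Format' posting is then just a filter on the forum ID, at the cost that a message cannot appear in two forums.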

I believe that Karl Goldstein had a clever idea that every item of content in a site could have a parent_ID column. All the topics would be content items and every message under a topic would have its parent_ID column point to the topic.
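
One way the parent_ID idea could be sketched, again with Python's sqlite3 module. This is a guess at the general shape rather than a reconstruction of anyone's actual design: the content table layout, the 'forum' content_type value, and the sample rows are all assumptions. It also shows how a sub-forum such as 'Lenses' could hang under 'Equipment' simply by pointing its own parent_id at the parent forum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Forums and postings are both content items; parent_id links them.
    CREATE TABLE content (
        content_id   INTEGER PRIMARY KEY,
        parent_id    INTEGER REFERENCES content(content_id),  -- NULL = top level
        content_type TEXT NOT NULL,
        title        TEXT NOT NULL
    );
    -- Made-up sample data.
    INSERT INTO content (content_id, parent_id, content_type, title) VALUES
        (1, NULL, 'forum',         'Equipment'),
        (2, 1,    'forum',         'Lenses'),
        (3, 1,    'forum_posting', 'Which tripod?'),
        (4, 2,    'forum_posting', 'Best 50mm lens?'),
        (5, NULL, 'forum',         'Ask Philip'),
        (6, 5,    'forum_posting', 'How do you clump database content?');
""")


def postings_under(conn, forum_id):
    """Return titles of all postings in a forum and in any of its sub-forums."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(content_id) AS (
            SELECT ?                      -- start at the requested forum
            UNION ALL
            SELECT c.content_id           -- then walk down through children
            FROM content c
            JOIN subtree s ON c.parent_id = s.content_id
        )
        SELECT c.title
        FROM content c
        JOIN subtree s ON s.content_id = c.content_id
        WHERE c.content_type = 'forum_posting'
        ORDER BY c.content_id
        """,
        (forum_id,),
    )
    return [row[0] for row in rows]


print(postings_under(conn, 1))  # Equipment, including Lenses
print(postings_under(conn, 2))  # Lenses only
```

The recursive query is what lets the 'Lenses' postings show up in 'Equipment' while still being filterable on their own. Like the forum ID column, a single parent_id still gives each item only one parent.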

I think that you probably also want a separate mapping to categories, maybe on several dimensions, so that someone could say "Show me all content items on this site that relate to China" or "show me all content items on this site that were authored by people living within 30 miles of the following zip code."
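
A separate category mapping might look something like the following sketch, which assumes a many-to-many map table between content and categories, with a dimension column so that place, subject, and so on can coexist. The table names (categories, content_category_map), the column names, and the sample rows are hypothetical. The zip code example would need geographic data about authors and is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (
        content_id   INTEGER PRIMARY KEY,
        content_type TEXT NOT NULL,
        title        TEXT NOT NULL
    );
    -- A category belongs to one dimension, e.g. 'place' or 'subject'.
    CREATE TABLE categories (
        category_id INTEGER PRIMARY KEY,
        dimension   TEXT NOT NULL,
        name        TEXT NOT NULL,
        UNIQUE (dimension, name)
    );
    -- Many-to-many: any content item can sit in any number of categories.
    CREATE TABLE content_category_map (
        content_id  INTEGER NOT NULL REFERENCES content(content_id),
        category_id INTEGER NOT NULL REFERENCES categories(category_id),
        PRIMARY KEY (content_id, category_id)
    );
    -- Made-up sample data.
    INSERT INTO content (content_id, content_type, title) VALUES
        (1, 'article',       'Travels in China'),
        (2, 'forum_posting', 'Visa for Beijing?'),
        (3, 'news',          'New medium format camera');
    INSERT INTO categories (category_id, dimension, name) VALUES
        (1, 'place',   'China'),
        (2, 'subject', 'Photography');
    INSERT INTO content_category_map (content_id, category_id) VALUES
        (1, 1), (1, 2), (2, 1), (3, 2);
""")


def titles_in_category(conn, dimension, name):
    """Return titles of all content items, of any type, in one category."""
    rows = conn.execute(
        """
        SELECT c.title
        FROM content c
        JOIN content_category_map m ON m.content_id = c.content_id
        JOIN categories k ON k.category_id = m.category_id
        WHERE k.dimension = ? AND k.name = ?
        ORDER BY c.content_id
        """,
        (dimension, name),
    )
    return [row[0] for row in rows]


print(titles_in_category(conn, "place", "China"))
```

Because the map is independent of content_type, "show me all content items on this site that relate to China" returns articles and forum postings together, and one item can appear under several categories at once.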

Partly it depends on the site. If you have a very focussed audience on a narrow topic you might be able to have one set of categories for all content on the site. For example, if you created a site just for owners of a particular model of airplane you might be able to get away with a very simple categorization. But I don't think that is possible for a 500,000-user broad-topic service such as

-- Philip Greenspun, June 15, 2003