Reading for this week:
Online assistance: 6.916 Q&A forum
Another issue is a perennial side-show in the world-wide computer programming circus: the spectacle of nerds arguing over programming tools. The data model can't represent the information that the users need, the application doesn't do what what the users need it to do, and instead of writing code, the "engineers" are arguing about Java versus Lisp versus Perl versus Tcl. If you want to know why computer programmers get paid less than medical doctors, consider the situation of two trauma surgeons arriving at an accident scene. The patient is bleeding profusely. If surgeons were like programmers, they'd leave the patient to bleed out in order to have a really satisfying argument over the merits of two different kinds of tourniquet.
If you're programming one Web page at a time, you can switch to the Language du Jour in search of higher productivity. But you won't achieve significant gains unless you switch from writing code for one page. You need to think about ways to write down a formal description of the application and user experience, then let the computer generate the application automatically.
In order to make your system comprehensible to the CIO/CTO types who will be adopting it, we'll use their vocabulary. A table row is an "object" and each column is an "element" of that object.
We need a way to represent the kinds of objects that our system will represent first. Let's assume that we'll have at least the following object types:
languageand one of type
person. Why not link to the users table instead? John McCarthy might not be a registered user of the system. Some of the people you'll be referencing, e.g., John Von Neumann, are dead. Characteristics of the Lisp language would be stored as elements of the language object.
For an example of what a completed system of this nature might look like to the casual reader, visit Paul Black's Dictionary of Algorithms, Data Structures, and Problems at http://hissa.nist.gov/dads/.
For each object type we'll be creating an Oracle table. For each Oracle table we create, we store one row in the metadata table:
We need to store information about what elements will be kept for each type of object. Note that some elements are common across object types:
create table km_metadata_object_types ( table_name varchar(21) primary key, pretty_name varchar(100) not null, pretty_plural varchar(100) );
For elements that are unique to an object type (i.e., elements other than the standard ones listed above), we need to insert one row in a metadata table per element:
Notice that the
create table km_metadata_elements ( metadata_id integer primary key, table_name not null references km_metadata_object_types, column_name varchar(30) not null, pretty_name varchar(100) not null, abstract_data_type varchar(30) not null, -- ie. "text" or "shorttext" "boolean" "user" -- this one is not null except when abstract_data_type is "user" oracle_data_type varchar(30), -- "varchar(4000)" -- e.g., "not null" or "check foobar in ('christof', 'patrick')" extra_sql varchar(4000), -- values are 'text', 'textarea', 'select', 'radio', -- 'selectmultiple', 'checkbox', 'checkboxmultiple', 'selectsql' presentation_type varchar(100) not null, -- e.g., for textarea, this would be "rows=6 cols=60", for select, Tcl list, -- for selectsql, an SQL query that returns N district values -- for email addresses mailto: presentation_options varchar(4000), -- pretty_name is going to be the short prompt, -- e.g., for an update page, but we also need something -- longer if we have to walk the user through a long form entry_explanation varchar(4000), -- if they click for yet more help help_text varchar(4000), -- note that this does NOT translate into a "not null" constraint in Oracle -- if we did this, it would prevent users from creating rows incrementaly mandatory_p char(1) check (mandatory_p in ('t','f')), -- ordering in Oracle table creation, 0 would be on top, 1 underneath, etc. sort_key integer, -- ordering within a form, lower number = higher on page form_sort_key integer, -- if there are N forms, starting with 0, to define this object, -- on which does this go? (relevant for very complex objects where -- you need more than one page to submit) form_number integer, -- for full text index include_in_ctx_index_p char(1) check (include_in_ctx_index_p in ('t','f')), -- add forms should be prefilled with the default value default_value varchar(200), check ((abstract_data_type not in ('user') and oracle_data_type is not null) or (abstract_data_type in ('user'))), unique(table_name,column_name) );
km_metadata_elementstable contains specifications for 1) generating the CREATE TABLE sql commands that you'll need to build the database data structures for storing knowledge and 2) building user interfaces to manipulate data in those tables.
If you have trouble feeding the above table definition to SQL*Plus running under an Emacs shell, cut and paste it into a file called "foo.sql". Then runfrom the shell.
# sqlplus student23/thepassword < foo.sql
So that the /admin/km/ directory will be writable by the Web server. Now use the prototype builder (documented at http://photo.net/doc/prototype.html; available on your own server at /admin/prototype/) to generate admin pages for the
> cd /web/yourservername/www/admin/ > chmod a+w km
overview. Here are some examples:
personin order to represent the prime developers of the system)
problemwill be linked to objects of type
algorithm(which algorithm solves the problem),
publication(papers that set forth the problem), and
person(people who were involved in stating or solving the problem)
algorithmwill be linked to objects of type
problem(what need the algorithm addresses),
publication(papers that describe the algorithm or implementations of it)
person(people who were involved in developing the algorithm)
Each object table should have an
object_id column. Use a
single Oracle sequence to generate keys for this column.
In addition to the metadata-driven object table definitions we'll define a generalized mapping table to support links between knowledge objects:
Notice that this table allows the users to map an object to any other object in the system, regardless of type.
create table km_object_object_map ( table_name_a not null references km_metadata_object_types, -- Assume all object type tables have integer primary keys. -- For a generalized mapping table we can't put an integrity -- constraint on this field. table_id_a integer not null, table_name_b not null references km_metadata_object_types, table_id_b integer not null, -- User-entered reason for relating two objects, e.g. -- to distinguish between John McCarthy the developer of -- Lisp and Gerry Sussman and Guy Steele, who added lexical scoping -- in the Scheme dialect map_comment varchar(4000), creation_user not null references users, creation_date date default sysdate not null, primary key (table_name_a, table_id_a, table_name_b, table_id_b) );
For simplicity, assume that associations are bidirectional. If a user
associates the Huffman encoding algorithm (used in virtually every
compression scheme, including JPEG) with the person David A. Huffman
(MIT Course VI grad student and then faculty member), we should also
interpret that to mean that the person David A. Huffman is associated
with the algorithm for Huffman encoding. This is why the columns in
km_object_object_map have names like "table_name_a"
instead of "from_table".
The primary key constraint above has the side effect of creating an index that makes it fast to ask the question "is object A related to object B?". For efficiency in querying "is object B related to object A?", create a concatenated index on the columns in the other order. (The trees chapter of SQL for Web Nerds, at http://photo.net/sql/trees.html, gives some examples of concatenated indices. Also read the composite indices section of the Oracle Tuning manual: http://philip.greenspun.com/sql/ref/composite_indices. See also the Oracle SQL Reference section at http://philip.greenspun.com/sql/ref/create_index.)
util_prepare_updatefrom /tcl/00-ad-utilities.tcl useful in building object-create-2.tcl.
When object-create-2.tcl is done inserting the row into the database,
ns_returnredirect to object-display.tcl. This
page should have small hyperlinks to edit single fields at a time (all
linking to object-edit-element.tcl with different arguments). This
page should show all the currently linked objects and have an "add link"
hyperlink to object-add-link.tcl.
The page returned by object-add-link.tcl will look virtually identical
to /km/index.tcl and will in fact link to the same URL:
object-browse-one-type.tcl. When called with only
table_name, this page will display a table of object
names with dimensional controls at the top. The dimensions should be
"mine|everyone's" and "creation date". The user ought to be able to
click on a table header and sort by that column.
When called with extra arguments, object-browse-one-type.tcl will pass
those arguments through to object-view-one.tcl and, if the user
clicks a confirmation button, will eventually result in
object-add-link-2.tcl being invoked. The extra arguments should be
table_id_a, which will
ultimately end up in the corresponding columns of a row in
Modify object-view-one.tcl so that you explicitly close the TCP connection to the user (using
-- we will be updating the reuse_p column of views so it -- will be easier to have a primary key create sequence km_object_view_id; create table km_object_views ( object_view_id integer primary key, -- which user user_id not null references users, -- two columns to specify which object object_id integer not null, table_name varchar(21) not null, view_date date not null, reuse_p char(1) default 'f' check(reuse_p in ('t','f')) );
ns_conn close). This will stop the Netscape icon to stop spinning but the AOLserver thread will remain alive so that you can log.
ns_conn close, insert a row into the
km_object_views table iff there isn't already a log row
for this user/object pair within 24 hours. You could do this with a
ns_db dmlstatement. Here's an example of an INSERT statement that only has an effect if there isn't already a row in the table.
Apply this idea to the problem of thread-safe logging if and only if there isn't an identical row logged within the last 24 hours.
insert into msg_id_generator (last_msg_id) select ('000000') from dual where 0 = (select count(last_msg_id) from msg_id_generator);
Date/time arithmetic: see http://photo.net/sql/dates.html.
reuse_pflag of the most recent relevant row in
km_object_views. The page should raise an error if it can't find a row to update.
km_object_viewsthat will make the code in exercises 6 and 7 go fast.
Create a file /admin/km/generate-sws-triggers.tcl to read the meta data tables and
sws_table_to_section_mapfor every object type defined in
pseudo_containssource code from http://software.arsdigita.com/www/doc/sql/pl-sql.sql.
See http://philip.greenspun.com/sql/ref/dbms_lob and http://philip.greenspun.com/sql/ref/intermediatext as references.
It was revised in January 2000 by Andrew Grumet (email@example.com) and Philip Greenspun. The old version asked students to build a KM system for an entire university. This revision restricts the system to the field of computer science. The old version asked students to explicitly model which types of objects could be linked to which other types. This revision suggests building a simplified system where any object may be linked to any other. The old version is available at http://philip.greenspun.com/teaching/psets/ps4/ps4-v1.adp.
It is permanently housed at http://philip.greenspun.com/teaching/psets/ps4/ps4.adp.