Online bboard for assistance:
Objectives: By doing this problem set you will learn
This problem set requires you to learn a lot of new software, so make sure you get started early: plan to spend at least two or three sessions on it. There is nothing difficult here, but you need to work through the initial mechanics of using Tcl, SQL, and running the Web server; and this takes time.
Please feel free to use the on-line forum to ask your questions about this problem set or any other class-related problem. You can use the forum to view other people's questions, and provide or view answers. Also, we strongly recommend that you do as much of your beginning work as possible in the supervised labs, where the Lab Assistants on hand to help you.
http://lcswww1.lcs.arsdigita.comif you are in the course at MIT -- from a Web browser running on your local client machine (not on your Web server machine). If the server is running, you'll see a message with the name of your virtual server, followed by "test page".
Read Using the LCS Web/db Computing Facility and follow the instructions to log on to your Web server machine.
m-x shell" to get a Unix shell. Type "
tclsh" to start the Tcl shell program. Try a few simple Tcl commands. Also, type "
info tclversion" at the tclsh prompt to make sure that you're running Tcl 8.3, the same version that is compiled into AOLserver.
Now define a recursive Fibonacci procedure in Tcl. Execute and test.
Hint: If you're writing Tcl programs of more than two or three lines, you'll find more convenient to type the code into a separate Emacs buffer (set the buffer to to tcl mode using m-x tcl-mode) and cut and paste from there into the Tcl shell buffer. Actually, you'll almost never be running Tcl from the shell in this course, but you will be writing lots of Tcl in Emacs as you create Web pages.
/web/yourvirtualserver/www/psets/basics/two-plus-two.adp. If these files are missing from your server machine, download them from ps-basics.tar and put them in
Augment the page so that (1) you add a $4000 South American Cichlid
aquarium as an option, (2) you use a constructor procedure to build
each aquarium element (instead of simply calling
(3) you add an element to the aquarium for how many of each type of
aquarium will be installed (4) you use procedures to extract
type, cost and quantity from an aquarium element (instead of simply calling
(5) you print out quantity-dependent subtotals and the grand
total at the bottom.
Hint: If the code you write doesn't produce what you expect when you
load the page (or, as is common, produces nothing at all), this is
likely to be a Tcl error. To see if there was an error, look at the
error log for your server, which is in the file
http://yourserveraddress/psets/basics/simple-tcl-page.tcl. Using Emacs running on the server machine, examine the source code for this page in
/web/yourvirtualserver/www/psets/basics/simple-tcl-page.tcl. Also look at the source code for the target of the form, which is in
/web/yourvirtualserver/psets/basics/tcl/simple-tcl-page-2.tcl. If these files are missing from your server machine, download them from ps-basics.tar and put them in
Notice how we use Tcl to read the form variables. Try out the form a
couple of times, using your browser. Now debug the regular expression
simple-tcl-page-2.tcl so that it properly handles the
names "Tammy Faye Baker" and "William H. Gates III".
Hint 1: It is easier if you don't try to do this in a single regexp. Use if then elseif then elseif ...
regexp has a side-effect. If you use a
if statement, make sure that you wrap your
regexp in braces so that they don't all get
Hint 3: Keep in mind that a match succeeds if the pattern matches a substring of the data. If you want to force your pattern to match the entire data item, you'll have to use an appropriate regexp to ensure this.
ns_httpgetto query several online bookstores to find price and stock information and displays the results in an HTML table. Save your program in files called
books-2.tclso people can access your service over the web.
We suggest querying barnesandnoble.com and www.1bookstreet.com. Your
program should be robust to timeouts, errors at the foreign sites, and
network problems. You can ensure this by wrapping a Tcl
catch statement around your call to
ns_httpget. Test your program with the following ISBNs:
0385494238, 0062514792, 0140260404, 0679762906.
Try adding more bookstores, but you may need to do kludges to make them work. For example, amazon.com and wordsworth.com tend to respond with a 302 redirect if the client doesn't give them a session ID in the query.
Extra credit: From which of the preceding books is the following quote taken?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers."
Note that you have to end your SQL commands with a semicolon in SQL*Plus. These are not part of the SQL language and you shouldn't use these when writing SQL in your Tcl programs for AOLserver.
create table my_courses ( course_number varchar(20) );
Insert a few rows, e.g.,
Commit your changes:
insert into my_courses (course_number) values ('6.916');
See what you've got:
One of the main benefits of using an RDBMS is persistence. Everything that you create stays around even after you log out. Normally, that's a good thing, but in this case you probably want to clean up after your experiment:
select * from my_courses;
Quit SQL*Plus with the
drop table my_courses;
/web/yourvirtualserver/www/psets/basics/tcl/quotations.tcl, which is the source code for a page that displays quotations that have been stored in the Oracle database. Visit this page with your Web browser and you should get an error. The reason for the error is that the program is calling a procedure that doesn't exist:
ad_header("ArsDigita Header"). You can confirm this suspicion by using Emacs to read
/home/aol/log/yourvirtualserver-error.log, which is where AOLserver logs any notices or problems.
To get AOLserver to load procedure definitions at server startup, you
have to put .tcl files in your server's private Tcl library:
/web/yourvirtualserver/tcl/. Create a file called "basics-defs.tcl" in
this directory and define the following Tcl procedures:
ad_header page_title-- returns
bodytags, with argument enclosed within the
ad_footer-- returns a string that will close the
Reload the quotations.tcl page and you get ... the same error!
AOLserver doesn't know that you've added a file to the private
library; this is only checked at server startup. Go to a Unix shell
and "restart-aolserver yourservername" (this is the big hammer; it
kills your server's Unix process so that Unix will restart AOLserver
restart-aolserver does not come back
with "Killing 10234" or some other process ID, you'll know that you
did not succeed (perhaps you made a typo when specifying your server
Reload the quotations.tcl page and you get ... a slightly different
error! The program is trying to query a table that doesn't exist:
Go back to your sql shell and restart SQL*Plus. Copy the table
definition from the comments at the top of the file
feed this definition to Oracle. Go back to your Web browser and
reload the page that previously gave you an error. Things should now
work, although the
quotations table is empty.
Use the form on the web page to manually add the following quotation, under an appropriate category of your choice: "640K ought to be enough for anybody" (Bill Gates). Note that it would be funnier if our table had a column for recording the date of the quotation (1981) but we purposely kept our data model as simple as possible.
Return to SQL*Plus and
select * from the table to see
that your quotation has been inserted into the table. The horrible
formatting is an artifact of your having declared the
quote column to be 4000 characters long.
In SQL*Plus, insert a quotation with some hand-coded SQL. To see the
form of the SQL
insert command you should use, examine
the code on the page quotation_add.tcl. After creating this new table
select * again, and you should now see two rows.
Hint: Don't forget that SQL quotes strings using single quotes, not double quotes.
Now reload the
quotations.tcl URL from your Web browser.
If you don't see your new quotation here, that's because you didn't type
COMMIT; in SQL*Plus. This is one of the big features of a relational
database management system: simultaneously connected users are
protected from seeing each other's unfinished transactions.
This is a SQL*Loader control file. You'll look at SQL*Loader more in Oraexercise 1 (below). For now, just invoke SQL*Loader from the Unix shell with the
Here your-id should be your database username followed by a slash, followed by your database password. Reload the
sqlldr userid=your-id control=quotation-list.ctl
quotations.tclweb page, and verify that the new quotations are in the data base. If you check you directory, you should see that SQL*Loader also generated a
quotation-add.tcl, paying particular attention to how they access the database. Notice how they represent SQL commands as strings in Tcl, which are then used together with appropriate Tcl commands from the database API. You can find some documentation on the API in Brief introduction to database access using AOLServer and the ACS, but the comments in the code files should be reasonably self-explanatory.
quotations.tclpage and modify it so that categories entry is done via a select box of existing categories (you will want to use the "SELECT DISTINCT" SQL command). For new categories, provide an alternative text entry box labeled "new category". Make sure to modify
quotation-add.tclso that it recognizes when a new category is being defined.
Hint: In order to simplify debugging, you may find it useful to test your query using SQL*PLUS in the shell, before integrating the query into a Tcl page.
Hint 1: Read about simple queries in SQL for Web nerds.
Hint 3: SQL's UPPER and LOWER functions
You can personalize Web services with the aid of magic cookies. A cookie issued by the server directs the browser to store data in browser's computer. To issue a cookie, the server includes a line like
in the HTTP header sent to the browser. HereSet-Cookie: cookie_name=value; path=/ ; expires=Fri, 01-Jan-2010 01:00:00 GMT
cookie_nameis the name for this cookie, and
valueis the associated value, which can contain any character or format except for semicolon, which terminates a cookie. The
pathspecifies which URLs on the server the cookie applies to. Designating a path of slash (
/) includes all URLs on the server.
After the browser has accepted a server's cookie, it will include the cookie name and value as part of its HTTP requests whenever it asks that server for an applicable URL. Your Tcl programs can read this information using the AOLServer API
After the expiration date, the browser no longer sends the cookie information. The server can also issue cookies with no specified expiration date, in which case, the cookie is not persistent -- the browser uses it only for that one session.
[ns_set get [ns_conn headers] Cookie]
You can see an example of how cookies are issued and read, by
visiting the URL
and examining the Tcl for file and the associated URLs
Observe how expire-cookies gets rid of cookies by reissuing them with
an expiration date that has already past.
Reference: The magic cookie spec is available from http://home.netscape.com/newsref/std/cookie_spec.html.
Hint 1: It is possible to build this system using an ID cookie for the browser and keeping the set of killed quotations in Oracle. However, if you're not going to allow users to log in and claim their profile, there really isn't much point in keeping data on the server. In fact, by keeping killed quotation IDs in your users' browser cookies, you've achieved the holy grail of academic database management system researchers: a distributed database!
Hint 2: It isn't strictly copacetic with the cookie spec, but you can have a cookie value containing spaces. Tcl stores a list of integers internally as those numbers separated by spaces. So the easiest and simplest way to store the killed quotations is as a space-separated list.
Hint 3: Don't filter the quotations in Tcl. It is generally a sign of incompetent programming when you query more data from Oracle than you're going to display to the end-user. SQL is a very powerful query language. You can use the NOT IN feature to exclude a list of quotations.
In theory, people who wish to exchange structured data over the Web can cooperate using XML (eXtensible Markup Language), a 1998 standard from the Web Consortium (www.w3.org/XML/). XML has started to become widely hyped over the past year as "the next big thing on the Web", but in practice, hardly anybody uses XML yet (summer 2000). Fortunately for you in completing this problem set, you can cooperate with your fellow students: the overall goal is to make quotations in your database exportable in a structured format so that other students' applications can read them.
In order to cooperate, we need (1) an agreed-upon URL at everyone's server where the quotations database may be obtained; and (2) an agreed-upon format for the quotations. In point of fact, we could avoid the need for prior agreement by setting up infrastructures for service discovery and by employing techniques for self-describing data -- both of which we'll deal with later in the semester -- but we'll keep things simple for now.
We'll format our quotations using XML. XML structures consist of data
strings enclosed in HTML-like tags of the form
what kind of thing the data is supposed to be.
Here's an informal example, showing the structure we'll use for our quotations:
Notice that there's a separate tag for each column in our SQL table:<quotations> <onequote> <quotation_id>1</quotation_id> <insertion_date>1999-02-04</insertion_date> <author_name>Bill Gates</author_name> <category>Computer Industry Punditry</category> <quote>640K ought to be enough for anybody.</quote> </onequote> <onequote> .. another row from the quotations table ... </onequote> ... some more rows </quotations>
There's also a "wrapper" tag that identifies each row as a<quotation_id> <insertion_date> <author_name> <category> <quote>
<onequote>structure, and an outer wrapper that identifies a sequence of
<onequote>structures as a
Our DTD will start with a definition of the
This says that the<!ELEMENT quotations (onequote)+>
quotationselement must contain at least one occurrence of
onequotebut may contain more than one. Now we have to say what constitutes a legal
This says that the sub-elements, such as<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
quotation_idmust each appear exactly once and in the specified order. Now we have to define an XML element that actually contains something other than other XML elements:
This says that whatever falls between<!ELEMENT quotation_id (#PCDATA)>
</quotation_id>is to be interpreted as raw characters rather than as containing further tags (PCDATA stands for "parsed character data").
Here's our complete DTD:
The point of building a DTD is not just to satisfy some anal, formalistic craving of Web protocol designers. Rather, the DTD provides a machine-readable description of the XML data structure that can be fed to parsers that can then automatically tokenize the XML document. (This is a simple example of a self-describing data technique.)<!-- quotations.dtd --> <!ELEMENT quotations (onequote)+> <!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)> <!ELEMENT quotation_id (#PCDATA)> <!ELEMENT insertion_date (#PCDATA)> <!ELEMENT author_name (#PCDATA)> <!ELEMENT category (#PCDATA)> <!ELEMENT quote (#PCDATA)>
In this problem set, however, we won't use the DTD at all. You'll
need to use only the informal XML
quotationstable, produces an XML document in the preceding form, and returns it to the client with a MIME type of "application/xml". Place this in a file quotations-xml.tcl, so that other users can retrieve the data by visiting that agreed upon URL.
To get you started, we've provided the file
Requesting this URL with a Web browser should offer to let you to save
the document to a local file, and you can then examine it with a text
editor on your local machine. (Alternatively, your browser may
already have some special behavior defined for MIME type application/xml. IE
5.5, for example, will automatically display the XML document.) The
differences between our example and your program is that you'll need
to produce a document containing the entire table and you'll need to
generate it on the fly.
/psets/basics/tcl/quotations-xml.tclfrom another student's server using
quotation_id. (You don't want keys from the foreign server conflicting with what is already in your database.)
Hints: You might want to set up a temporary table using
table quotations_temp as select * from quotations and then drop
it after you're done debugging, so that you don't mess up your own
Rather than having you work with DTDs and general XML parsers, we've
gone for simplicity here by predefining for you a parser in Tcl that
understands only this particular quotations XML structure. The
You have to install this file in your server's private Tcl library,
/web/yourvirtualserver/tcl/, for this function
to be callable by .tcl and .adp pages. (Don't forget to restart your
Web server.) The
parse_all procedure takes an XML
quotation structure as argument and returns a Tcl list, showing the
parts and subparts of the structure. To see an example of the format,
use your browser to visit the page
Note: these exercises are designed to familiarize you with XML. In
most cases, XML processing should be done using Oracle's Java
create table my_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
my_stockstable. You'll have to create an appropriate control file, as when you earlier preloaded the quotations table. You can probably guess the format of the control file by analogy with the quotations example, but you should at least be aware of the existence of the SQL*Loader documentation in the Oracle docs at http://philip.greenspun.com/sql/ref/utilities)
Hint: If you leave blank lines at the end of the control file, the log file will contain error complaints about bad records with missing columns, since Oracle's implementation is too shoddily written to realize that a blank line isn't supposed to be a record definition. (What can you expect from a company with a mere $10 billion in annual revenues?)
stock_priceswith three columns:
symbol, quote_date, price. After this one statement, you should have created the table and filled it with one row per symbol in
my_stocks. The date and price columns should be filled with the current date and a nominal price.
Select symbol, sysdate as quote_date, 31.415 as price from my_stocks;.
create table newly_acquired_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
insert into .. select ...statement (with a WHERE clause appropriate to your sample data), copy about half the rows from
stock_prices, produce a report showing symbol, number of shares, price per share, and current value.
my_stocks. Run your query from Oraexercise 3. Notice that your new stock does not appear in the report. This is because you've JOINed them with the constraint that the symbol appear in both tables.
Modify your statement to use an OUTER JOIN instead so that you'll get a complete report of all your stocks, but won't get price information if none is available.
Choose a web page that provides some numerical data that is updated reasonably frequently, e.g., a couple of times a day. Two examples are:
ns_schedule_procto schedule your procedure to run once an hour. Build a .tcl page that people can access to see how data change over time.
One of the things you might do is keep track of when the server you are accessing is up. For example, one of the interesting things about Amazon is that they often lose control of their server farm and database. (They write a lot of C code and one programmer's sloppiness can generate a catastrophic failure of the entire service.) You might want to build your system so that you can record (a) times when the server is unreachable, and (b) times when the page served some error message. For example, Amazon will send "Our store is closed temporarily for scheduled maintenance". (You'll sometimes get this during the middle of weekdays when they would definitely not have intentionally scheduled any maintenance.)
You can also display your data graphically as a chart. You can do this in a way that is reasonably browser independent, by using single-pixel GIFs and WIDTH and HEIGHT tags now. Grab the software in http://software.arsdigita.com/tcl/ad-graphing.tcl and put it in your server's private Tcl directory (/web/yourservername/tcl/). Read the docs at http://software.arsdigita.com/www/doc/graphing.html and then write code to generate a pretty chart of your data.
You might also provide a way to return your data as an XML structure, so that people can access it in building their own programs, without going through the same page-scraping shenanigans that you had to in collecting it. The idea that the Web could contain lots of data sources designed for programmatic access, so that people can combine them to build more elaborate programs, hints at the vision of Web services, about which we will have more to say anon.
Place your project on your web server on a page called
project.tcl in your problem set directory, so that others
can access it.
This file is permanently housed at
This material is copyright 1999, 2000, by Philip Greenspun and Hal Abelson. It may be copied, reused, and modified, provided credit is given to the original authors with a hyperlink to this document.