Readings:
Online bboard for assistance:
Objectives: By doing this problem set you will learn
This problem set requires you to learn a lot of new software, so make sure you get started early: plan to spend at least two or three sessions on it. There is nothing difficult here, but you need to work through the initial mechanics of using Tcl, SQL, and running the Web server; and this takes time.
Please feel free to use the on-line forum to ask your questions about this problem set or any other class-related problem. You can use the forum to view other people's questions, and provide or view answers. Also, we strongly recommend that you do as much of your beginning work as possible in the supervised labs, where the Lab Assistants on hand to help you.
http://lcswww1.lcs.arsdigita.com
if
you are in the course at MIT --
from a Web browser running on your local client machine (not on
your Web server machine). If the server is running, you'll see a
message with the name of your virtual server, followed by "test page".
Read Using the LCS Web/db Computing Facility and follow the instructions to log on to your Web server machine.
m-x shell
" to get a Unix
shell. Type "tclsh
" to start the Tcl shell program. Try a few simple
Tcl commands. Also, type "info tclversion
" at the
tclsh prompt to make sure that you're running Tcl 8.3, the same
version that is compiled into AOLserver.
Now define a recursive Fibonacci procedure in Tcl. Execute and test.
Hint: If you're writing Tcl programs of more than two or three lines, you'll find more convenient to type the code into a separate Emacs buffer (set the buffer to to tcl mode using m-x tcl-mode) and cut and paste from there into the Tcl shell buffer. Actually, you'll almost never be running Tcl from the shell in this course, but you will be writing lots of Tcl in Emacs as you create Web pages.
/web/yourvirtualserver/www/psets/basics/two-plus-two.adp
.
If these files are missing from your server machine, download them
from ps-basics.tar
and put them in /web/yourvirtualserver/www
.
Augment the page so that (1) you add a $4000 South American Cichlid
aquarium as an option, (2) you use a constructor procedure to build
each aquarium element (instead of simply calling list
),
(3) you add an element to the aquarium for how many of each type of
aquarium will be installed (4) you use procedures to extract
type, cost and quantity from an aquarium element (instead of simply calling lindex
),
(5) you print out quantity-dependent subtotals and the grand
total at the bottom.
Hint: If the code you write doesn't produce what you expect when you
load the page (or, as is common, produces nothing at all), this is
likely to be a Tcl error. To see if there was an error, look at the
error log for your server, which is in the file
/home/aol/log/yourvirtualserver-error.log
.
http://yourserveraddress/psets/basics/simple-tcl-page.tcl
.
Using Emacs running on the server machine, examine the source code for
this page in
/web/yourvirtualserver/www/psets/basics/simple-tcl-page.tcl
.
Also look at the source code for the target of the form, which is in
/web/yourvirtualserver/psets/basics/tcl/simple-tcl-page-2.tcl
.
If these files are missing from your server machine, download them
from ps-basics.tar
and put them in /web/yourvirtualserver/www
.
Notice how we use Tcl to read the form variables. Try out the form a
couple of times, using your browser. Now debug the regular expression
in simple-tcl-page-2.tcl
so that it properly handles the
names "Tammy Faye Baker" and "William H. Gates III".
Hint 1: It is easier if you don't try to do this in a single regexp. Use if then elseif then elseif ...
Hint 2: regexp
has a side-effect. If you use a
multi-clause if
statement, make sure that you wrap your
calls to regexp
in braces so that they don't all get
evaluated immediately.
Hint 3: Keep in mind that a match succeeds if the pattern matches a substring of the data. If you want to force your pattern to match the entire data item, you'll have to use an appropriate regexp to ensure this.
ns_httpget
to query several online bookstores to find
price and stock information and displays the results in an HTML table.
Save your program in files called
/web/yourvirtualserver/www/psets/basics/books.tcl
and books-2.tcl
so people
can access your service over the web.
We suggest querying barnesandnoble.com and www.1bookstreet.com. Your
program should be robust to timeouts, errors at the foreign sites, and
network problems. You can ensure this by wrapping a Tcl
catch
statement around your call to
ns_httpget
. Test your program with the following ISBNs:
0385494238, 0062514792, 0140260404, 0679762906.
Try adding more bookstores, but you may need to do kludges to make them work. For example, amazon.com and wordsworth.com tend to respond with a 302 redirect if the client doesn't give them a session ID in the query.
Extra credit: From which of the preceding books is the following quote taken?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers."
create table my_courses (
course_number varchar(20)
);
Note that you have to end your SQL commands with a semicolon in
SQL*Plus. These are not part of the SQL language and you shouldn't
use these when writing SQL in your Tcl programs for AOLserver.
Insert a few rows, e.g.,
insert into my_courses (course_number) values ('6.916');
Commit your changes:
commit;
See what you've got:
select * from my_courses;
One of the main benefits of using an RDBMS is persistence.
Everything that you create stays around even after you log out.
Normally, that's a good thing, but in this case you probably want to
clean up after your experiment:
drop table my_courses;
Quit SQL*Plus with the quit
command.
/web/yourvirtualserver/www/psets/basics/tcl/quotations.tcl
,
which is
the source code for a page that displays quotations that have been
stored in the Oracle database. Visit this page with your Web browser
and you should get an error. The reason for the error is that the
program is calling a procedure that doesn't exist:
ad_header
("ArsDigita Header").
You can confirm this suspicion by using
Emacs to read /home/aol/log/yourvirtualserver-error.log
, which is
where AOLserver logs any notices or problems.
To get AOLserver to load procedure definitions at server startup, you
have to put .tcl files in your server's private Tcl library:
/web/yourvirtualserver/tcl/
. Create a file called "basics-defs.tcl" in
this directory and define the following Tcl procedures:
ad_header page_title
-- returns
html
, head
, title
, and
body
tags, with argument enclosed within the
title
tags
ad_footer
-- returns a string that will close the
body
and html
tags
Reload the quotations.tcl page and you get ... the same error!
AOLserver doesn't know that you've added a file to the private
library; this is only checked at server startup. Go to a Unix shell
and "restart-aolserver yourservername" (this is the big hammer; it
kills your server's Unix process so that Unix will restart AOLserver
automatically). If restart-aolserver
does not come back
with "Killing 10234" or some other process ID, you'll know that you
did not succeed (perhaps you made a typo when specifying your server
name).
Reload the quotations.tcl page and you get ... a slightly different
error! The program is trying to query a table that doesn't exist:
quotations
.
Go back to your sql shell and restart SQL*Plus. Copy the table
definition from the comments at the top of the file quotations.tcl
and
feed this definition to Oracle. Go back to your Web browser and
reload the page that previously gave you an error. Things should now
work, although the quotations
table is empty.
Use the form on the web page to manually add the following quotation, under an appropriate category of your choice: "640K ought to be enough for anybody" (Bill Gates). Note that it would be funnier if our table had a column for recording the date of the quotation (1981) but we purposely kept our data model as simple as possible.
Return to SQL*Plus and select *
from the table to see
that your quotation has been inserted into the table. The horrible
formatting is an artifact of your having declared the
quote
column to be 4000 characters long.
In SQL*Plus, insert a quotation with some hand-coded SQL. To see the
form of the SQL insert
command you should use, examine
the code on the page quotation_add.tcl. After creating this new table
row, do select *
again, and you should now see two rows.
Hint: Don't forget that SQL quotes strings using single quotes, not double quotes.
Now reload the quotations.tcl
URL from your Web browser.
If you don't see your new quotation here, that's because you didn't type
COMMIT; in SQL*Plus. This is one of the big features of a relational
database management system: simultaneously connected users are
protected from seeing each other's unfinished transactions.
/web/yourvirtualserver/www/psets/basics/quotation-list.ctl
This is a SQL*Loader control file. You'll look at SQL*Loader
more in Oraexercise 1 (below). For now, just invoke SQL*Loader from
the Unix shell with the sqlldr
command:
sqlldr userid=your-id control=quotation-list.ctl
Here your-id should be your database username followed by a
slash, followed by your database password. Reload the
quotations.tcl
web page, and verify that the new
quotations are in the data base. If you check you directory, you
should see that SQL*Loader also generated a
quotation-list.log
file.
quotations.tcl
and
quotation-add.tcl
, paying particular attention to how they
access the database. Notice how they represent SQL commands as
strings in Tcl, which are then used together with appropriate Tcl
commands from the database API. You can find some documentation on
the API in Brief
introduction to database access using AOLServer and the ACS, but
the comments in the code files should be reasonably self-explanatory.
quotations.tcl
page and modify it so
that categories entry is done via a select box of existing categories
(you will want to use the "SELECT DISTINCT" SQL command). For new
categories, provide an alternative text entry box labeled "new
category". Make sure to modify quotation-add.tcl
so that
it recognizes when a new category is being defined.
Hint: In order to simplify debugging, you may find it useful to test your query using SQL*PLUS in the shell, before integrating the query into a Tcl page.
Hint 1: Read about simple queries in SQL for Web nerds.
Hint 2: like '%foo%'
Hint 3: SQL's UPPER and LOWER functions
You can personalize Web services with the aid of magic cookies. A cookie issued by the server directs the browser to store data in browser's computer. To issue a cookie, the server includes a line like
in the HTTP header sent to the browser. HereSet-Cookie: cookie_name=value; path=/ ; expires=Fri, 01-Jan-2010 01:00:00 GMT
cookie_name
is the
name for this cookie, and value
is the associated value,
which can contain any character or format except for semicolon, which
terminates a cookie. The path
specifies which URLs on
the server the cookie applies to. Designating a path of
slash (/
) includes all URLs on the server.
After the browser has accepted a server's cookie, it will include the cookie name and value as part of its HTTP requests whenever it asks that server for an applicable URL. Your Tcl programs can read this information using the AOLServer API
[ns_set get [ns_conn headers] Cookie]
After the expiration date, the browser no longer sends the cookie
information. The server can also issue cookies with no specified
expiration date, in which case, the cookie is not persistent -- the
browser uses it only for that one session.
You can see an example of how cookies are issued and read, by
visiting the URL
http://yourvirtualserver/psets/basics/tcl/set-cookies.tcl
and examining the Tcl for file and the associated URLs
check-cookies.tcl
and expire-cookies.tcl
.
Observe how expire-cookies gets rid of cookies by reissuing them with
an expiration date that has already past.
Reference: The magic cookie spec is available from http://home.netscape.com/newsref/std/cookie_spec.html.
Hint 1: It is possible to build this system using an ID cookie for the browser and keeping the set of killed quotations in Oracle. However, if you're not going to allow users to log in and claim their profile, there really isn't much point in keeping data on the server. In fact, by keeping killed quotation IDs in your users' browser cookies, you've achieved the holy grail of academic database management system researchers: a distributed database!
Hint 2: It isn't strictly copacetic with the cookie spec, but you can have a cookie value containing spaces. Tcl stores a list of integers internally as those numbers separated by spaces. So the easiest and simplest way to store the killed quotations is as a space-separated list.
Hint 3: Don't filter the quotations in Tcl. It is generally a sign of incompetent programming when you query more data from Oracle than you're going to display to the end-user. SQL is a very powerful query language. You can use the NOT IN feature to exclude a list of quotations.
In theory, people who wish to exchange structured data over the Web can cooperate using XML (eXtensible Markup Language), a 1998 standard from the Web Consortium (www.w3.org/XML/). XML has started to become widely hyped over the past year as "the next big thing on the Web", but in practice, hardly anybody uses XML yet (summer 2000). Fortunately for you in completing this problem set, you can cooperate with your fellow students: the overall goal is to make quotations in your database exportable in a structured format so that other students' applications can read them.
In order to cooperate, we need (1) an agreed-upon URL at everyone's server where the quotations database may be obtained; and (2) an agreed-upon format for the quotations. In point of fact, we could avoid the need for prior agreement by setting up infrastructures for service discovery and by employing techniques for self-describing data -- both of which we'll deal with later in the semester -- but we'll keep things simple for now.
We'll format our quotations using XML. XML structures consist of data
strings enclosed in HTML-like tags of the form
<foo>
and </foo>
, describing
what kind of thing the data is supposed to be.
Here's an informal example, showing the structure we'll use for our quotations:
Notice that there's a separate tag for each column in our SQL table:<quotations> <onequote> <quotation_id>1</quotation_id> <insertion_date>1999-02-04</insertion_date> <author_name>Bill Gates</author_name> <category>Computer Industry Punditry</category> <quote>640K ought to be enough for anybody.</quote> </onequote> <onequote> .. another row from the quotations table ... </onequote> ... some more rows </quotations>
There's also a "wrapper" tag that identifies each row as a<quotation_id> <insertion_date> <author_name> <category> <quote>
<onequote>
structure, and an outer wrapper that
identifies a sequence of <onequote>
structures as a
<quotations>
document.
Our DTD will start with a definition of the quotations
tag:
This says that the<!ELEMENT quotations (onequote)+>
quotations
element must contain at
least one occurrence of onequote
but may contain more
than one. Now we have to say what constitutes a legal
onequote
element:
This says that the sub-elements, such as<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
quotation_id
must
each appear exactly once and in the specified order. Now we have to
define an XML element that actually contains something other than
other XML elements:
This says that whatever falls between<!ELEMENT quotation_id (#PCDATA)>
<quotation_id>
and </quotation_id>
is to be interpreted as raw
characters rather than as containing further tags (PCDATA stands for
"parsed character data").
Here's our complete DTD:
The point of building a DTD is not just to satisfy some anal, formalistic craving of Web protocol designers. Rather, the DTD provides a machine-readable description of the XML data structure that can be fed to parsers that can then automatically tokenize the XML document. (This is a simple example of a self-describing data technique.)<!-- quotations.dtd --> <!ELEMENT quotations (onequote)+> <!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)> <!ELEMENT quotation_id (#PCDATA)> <!ELEMENT insertion_date (#PCDATA)> <!ELEMENT author_name (#PCDATA)> <!ELEMENT category (#PCDATA)> <!ELEMENT quote (#PCDATA)>
In this problem set, however, we won't use the DTD at all. You'll
need to use only the informal XML quotations
example.
quotations
table, produces an XML document in the
preceding form, and returns it to the client with a MIME type of
"application/xml". Place this in a file quotations-xml.tcl, so that
other users can retrieve the data by visiting that agreed upon URL.
To get you started, we've provided the file example-xml.tcl
.
Requesting this URL with a Web browser should offer to let you to save
the document to a local file, and you can then examine it with a text
editor on your local machine. (Alternatively, your browser may
already have some special behavior defined for MIME type application/xml. IE
5.5, for example, will automatically display the XML document.) The
differences between our example and your program is that you'll need
to produce a document containing the entire table and you'll need to
generate it on the fly.
/psets/basics/tcl/quotations-xml.tcl
from
another student's server using
ns_httpget
.
quotation_id
. (You don't
want keys from the foreign server conflicting with what is already in
your database.)
Hints: You might want to set up a temporary table using create
table quotations_temp as select * from quotations
and then drop
it after you're done debugging, so that you don't mess up your own
quotations database.
Rather than having you work with DTDs and general XML parsers, we've
gone for simplicity here by predefining for you a parser in Tcl that
understands only this particular quotations XML structure. The
procedure is parse_all
.
You have to install this file in your server's private Tcl library,
/web/yourvirtualserver/tcl/
, for this function
to be callable by .tcl and .adp pages. (Don't forget to restart your
Web server.) The parse_all
procedure takes an XML
quotation structure as argument and returns a Tcl list, showing the
parts and subparts of the structure. To see an example of the format,
use your browser to visit the page
http://yourvirtualserver/psets/basics/tcl/xml-parse-test.tcl
.
Note: these exercises are designed to familiarize you with XML. In
most cases, XML processing should be done using Oracle's Java
libraries. See http://www.arsdigita.com/doc/xml.html
.
create table my_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
my_stocks
table. You'll have to create an appropriate
control file, as when you earlier preloaded the quotations table. You
can probably guess the format of the control file by analogy with the
quotations example, but you should at least be aware of the existence of the
SQL*Loader documentation in the Oracle docs at http://philip.greenspun.com/sql/ref/utilities)
Hint: If you leave blank lines at the end of the control file, the log file will contain error complaints about bad records with missing columns, since Oracle's implementation is too shoddily written to realize that a blank line isn't supposed to be a record definition. (What can you expect from a company with a mere $10 billion in annual revenues?)
stock_prices
with three columns: symbol,
quote_date, price
. After this one statement, you should have
created the table and filled it with one row per symbol in
my_stocks
. The date and price columns should be filled
with the current date and a nominal price.
Select symbol, sysdate as quote_date, 31.415 as price from my_stocks;
.
create table newly_acquired_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
insert into .. select ...
statement
(with a WHERE clause appropriate to your sample data), copy about half
the rows from my_stocks
into newly_acquired_stocks
my_stocks
and
stock_prices
, produce a report showing symbol, number of
shares, price per share, and current value.
my_stocks
. Run your query from
Oraexercise 3. Notice that your new stock does not appear in the
report. This is because you've JOINed them with the constraint that
the symbol appear in both tables.
Modify your statement to use an OUTER JOIN instead so that you'll get a complete report of all your stocks, but won't get price information if none is available.
Choose a web page that provides some numerical data that is updated reasonably frequently, e.g., a couple of times a day. Two examples are:
ns_schedule_proc
to
schedule your procedure to run once an hour. Build a .tcl page that
people can access to see how data change over time.
One of the things you might do is keep track of when the server you are accessing is up. For example, one of the interesting things about Amazon is that they often lose control of their server farm and database. (They write a lot of C code and one programmer's sloppiness can generate a catastrophic failure of the entire service.) You might want to build your system so that you can record (a) times when the server is unreachable, and (b) times when the page served some error message. For example, Amazon will send "Our store is closed temporarily for scheduled maintenance". (You'll sometimes get this during the middle of weekdays when they would definitely not have intentionally scheduled any maintenance.)
You can also display your data graphically as a chart. You can do this in a way that is reasonably browser independent, by using single-pixel GIFs and WIDTH and HEIGHT tags now. Grab the software in http://software.arsdigita.com/tcl/ad-graphing.tcl and put it in your server's private Tcl directory (/web/yourservername/tcl/). Read the docs at http://software.arsdigita.com/www/doc/graphing.html and then write code to generate a pretty chart of your data.
You might also provide a way to return your data as an XML structure, so that people can access it in building their own programs, without going through the same page-scraping shenanigans that you had to in collecting it. The idea that the Web could contain lots of data sources designed for programmatic access, so that people can combine them to build more elaborate programs, hints at the vision of Web services, about which we will have more to say anon.
Place your project on your web server on a page called
project.tcl
in your problem set directory, so that others
can access it.
This file is permanently housed at
http://philip.greenspun.com/teaching/psets/basics/tcl/basics.adp
.
This material is copyright 1999, 2000, by Philip Greenspun and Hal Abelson. It may be copied, reused, and modified, provided credit is given to the original authors with a hyperlink to this document.