Reading for this week:
Online assistance: Problem Set 1 Q&A forum
Objectives: we're trying to make sure that everyone knows
This first problem set requires you to learn a lot of new software, so make sure you get started early: plan to spend at least two or three sessions on it. There is nothing difficult here, but we do want to lead you through the mechanics of using Tcl, SQL, and running the Web server.
Please feel free to use this forum to ask your questions about Problem Set 1 or any other class-related problem. You can also view other people's questions, and provide or view answers. It is archived, so maybe it'll save you some time!
Hint: If you're writing Tcl programs of more than two or three lines, you may find it convenient to type the code into a separate Emacs buffer (set to tcl mode) and cut and paste from there into the Tcl shell buffer.
Type info tclversion
at the tclsh prompt to make sure
that you're running Tcl 8.2, the same version that is compiled into
AOLserver.
Augment the page so that (1) you add a $4000 South American Cichlid
aquarium as an option, (2) you use a constructor procedure to build
each aquarium element (instead of simply calling list
),
(3) you add an element to the aquarium for how many of each type of
aquarium will be installed (4) you use procedures to extract
type, cost and quantity from an aquarium element (instead of simply calling lindex),
(5) you print out quantity-dependent subtotals and the grand
total at the bottom.
Hint 1: it is easier if you don't try to do this in one regexp. Use if then elseif then elseif ...
Hint 2: regexp
has a side-effect. If you use a
multi-clause if
statement, make sure that you wrap your
calls to regexp
in braces so that they don't all get
evaluated immediately.
ns_httpget
to query several online bookstores to find
price and stock information and displays the results in an HTML table.
Save your program in files called
/web/yourvirtualserver/www/psets/ps1/books.tcl and books-2.tcl so people
can access your service over the web.
We suggest querying wordsworth.com, barnesandnoble.com, and
www.1bookstreet.com (amazon.com tends to respond with a 302 redirect
you if the client doesn't give them a session ID in the query). Your
program should be robust to timeouts, errors at the foreign sites, and
network problems. You can ensure this by wrapping a Tcl
catch
statement around your call to
ns_httpget
. Test your program with the following ISBNs:
0385494238, 0062514792, 0140260404, 0679762906.
Extra credit: From which of the preceding books is the following quote taken?
"The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers."
create table my_courses (
course_number varchar(20)
);
Note that you have to end your SQL commands with a semicolon in
SQL*Plus. These are not part of the SQL language and you shouldn't
use these when writing SQL in your Tcl progams for AOLserver.
Insert a few rows, e.g.,
insert into my_courses (course_number) values ('6.916');
Commit your changes:
commit;
See what you've got:
select * from my_courses;
One of the main benefits of using an RDBMS is persistence.
Everything that you create stays around even after you log out.
Normally, that's a good thing, but in this case you probably want to
clean up after your experiment:
drop table my_courses;
Quit SQL*Plus by typing "c-c c-d".
ad_header
("ArsDigita Header").
You can confirm this suspicion by using
Emacs to read /home/nsadmin/log/yourvirtualserver-error.log, which is
where AOLserver logs any notices or problems.
To get AOLserver to load procedure definitions at server startup, you have to put .tcl files in your server's private Tcl library: /web/yourvirtualserver/tcl/. Create a file called "ps1-defs.tcl" in this directory and define the following Tcl procedures:
ad_header page_title
-- returns HTML, HEAD,
TITLE, and BODY tags, with argument enclosed within the TITLE tags
ad_footer
-- returns a string that will close the
BODY and HTML tags
Reload the quotations.tcl page and you get ... the same error!
AOLserver doesn't know that you've added a file to the private
library; this is only checked at server startup. Go to a Unix shell
and "restart-aolserver yourservername" (this is the big hammer; it
kills your server's Unix process so that Unix will restart AOLserver
automatically). If restart-aolserver
does not come back
with "Killing 10234" or some other process ID, you'll know that you
did not succeed (perhaps you made a typo when specifying your server
name).
Reload the quotations.tcl page and you get ... a slightly different
error! The program is trying to query a table that doesn't exist:
quotations
.
Go back to your sql shell and restart SQL*Plus. Copy the table
definition from the comments at the top of the file quotations.tcl and
feed this definition to Oracle. Go back to your Web browser and
reload the page that previously gave you an error. Things should now
work, although the quotations
table is empty.
Use the form on the web page to manually add the following quotation, under an appropriate category of your choice: "640K ought to be enough for anybody" (Bill Gates). Note that it would be funnier if our table had a column for recording the date of the quotation (1981) but we purposely kept our data model as simple as possible.
Return to SQL*Plus and SELECT * from the table to see that your
quotation has been recorded. The horrible formatting is an artifact
of your having declared the quote
column to be 4000
characters long.
In SQL*Plus, insert a quotation with some hand-coded SQL (if you are feeling lazy, you can cut and paste some SQL from the AOLserver error log; all SQL statements that AOLserver sends to Oracle are logged here). Now reload the quotations.tcl URL from your Web browser. If you don't see your new quote here, that's because you didn't type COMMIT; at SQL*Plus. This is one of the big features of a relational database management system: simultaneously connected users are protected from seeing each other's unfinished transactions.
quotations
table.
ns_set
API. A set is a collection of {key,value} pairs,
which should be a familar idea from 6.001. Selecting from a table
with ns_db select
returns an identifier for a set, whose
keys are the names of the selected columns. Subsequent successive
calls with ns_db getrow
will fill in the values in this
set with the values from successive selected rows. For example,
suppose you obtain a set identifier by selecting the following table
with ns_select
:
writer | book |
---|---|
Tolstoy | Anna Karenina |
Steinbeck | Grapes of Wrath |
Greenspun | Guide to Web Publishing |
Then, after the first call to ns_db getrow
the set will
be
{{writer Tolstoy} {book "Anna Karenina"}}
After the second call to ns_db getrow
the set will will be
{{writer Steinbeck} {book "Grapes of Wrath"}}
and after the third call the set will be
{{writer Greenspun} {book "Guide to Web Publishing"}}
The programs in the files quotations.tcl and quotation-add.tcl illustrate these ideas. It will be well worth your while to study these programs until you understand how they work, because you'll be doing a lot of this kind of programming throughout the semester.
lock table
via a sequencelock table
like '%foo%'
.
You can personalize Web services with the aid of magic cookies. A cookie issued by the server directs the browser to store data in browser's computer. To issue a cookie, the server includes a line like
Set-Cookie: cookie_name=value; path=/ ; expires=Fri, 01-Jan-2010 01:00:00 GMT
in the HTTP header sent to the browser. Here cookie_name
is the
name for this cookie, and value
is the associated value,
which can contain any character or format except for semicolon, which
terminates a cookie. The path
specifies which URLs on
the server the cookie applies to. Designating a path of
slash (/
) includes all URLs on the server.
After the browser has accepted a server's cookie, it will include the cookie name and value as part of its HTTP requests whenever it asks that server for an applicable URL. Your Tcl programs can read this information using the AOLServer API
[ns_set get [ns_conn headers] Cookie]
After the expiration date, the browser no longer sends the cookie
information. The server can also issue cookies with no specified
expiration date, in which case, the cookie is not persistent -- the
browser uses it only for that one session.
You can see an example of how cookies are issued and read, by visiting the URL http://yourvirtualserver/psets/ps1/set-cookies.tcl and examining the Tcl for file and the associated URLs check-cookies.tcl and expire-cookies.tcl. Observe how expire-cookies gets rid of cookies by reissuing them with an expiration date that has already past.
Reference: The magic cookie spec is available from http://home.netscape.com/newsref/std/cookie_spec.html.
Hint 1: it is possible to build this system using an ID cookie for the browser and keeping the set of killed quotations in Oracle. However, if you're not going to allow users to log in and claim their profile, there really isn't much point in keeping data on the server. In fact, by keeping killed quotation IDs in your users' browser cookies, you've achieved the holy grail of academic database management system researchers: a distributed database!
Hint 2: it isn't strictly copacetic with the cookie spec, but you can have a cookie value containing spaces. Tcl stores a list of integers internally as those numbers separated by spaces. So the easiest and simplest way to store the killed quotations is as a space-separated list.
Hint 3: don't filter the quotations in Tcl; it is generally a sign of incompetent programming when you query more data from Oracle than you're going to display to the end-user. SQL is a very powerful query language. You can use the NOT IN feature to exclude a list of quotations.
Here's what we need in order to cooperate:
We'll format the quotations using XML, which is simply a conventional
notation for describing structured data. XML structures consist of
data strings enclosed in HTML-like tags of the form
<foo>
and </foo>
, describing
what kind of thing the data is supposed to be.
Here's an informal example, showing the structure we'll use for our quotations:
<quotations>
<onequote>
<quotation_id>1</quotation_id>
<insertion_date>1999-02-04</insertion_date>
<author_name>Bill Gates</author_name>
<category>Computer Industry Punditry</category>
<quote>640K ought to be enough for anybody.</quote>
</onequote>
<onequote>
.. another row from the quotations table ...
</onequote>
... some more rows
</quotations>
Notice that there's a separate tag for each column in our SQL data model:
<quotation_id>
<insertion_date>
<author_name>
<category>
<quote>
There's also a "wrapper" tag that identifies each row as a
<onequote>
structure, and an outer wrapper that
identifies a sequence of <onequote>
stuctures as a
<quotations>
document.
Our DTD will start with a definition of the quotations
tag:
<!ELEMENT quotations (onequote)+>
This says that the quotations
element must contain at
least one occurrence of onequote
but may contain more
than one. Now we have to say what constitutes a legal
onequote
element:
<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
This says that the sub-elements, such as quotation_id
must
each appear exactly once and in the specified order. Now we have to
define an XML element that actually contains something other than
other XML elements:
<!ELEMENT quotation_id (#PCDATA)>
This says that whatever falls between <quotation_id>
and </quotation_id>
is to be interpreted as raw
characters rather than as containing further tags (PCDATA stands for
"parsed character data").
Here's our complete DTD:
<!-- quotations.dtd -->
<!ELEMENT quotations (onequote)+>
<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
<!ELEMENT quotation_id (#PCDATA)>
<!ELEMENT insertion_date (#PCDATA)>
<!ELEMENT author_name (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT quote (#PCDATA)>
You will find this extremely useful... Hey, actually you won't find
this DTD useful at all for completing this part of the problem set.
The only reasons that DTDs are ever useful is for feeding to XML
parsers because they can then automatically tokenize an XML document.
For implementing your quotations-xml.tcl page, you will only need to look
at informal example.
quotations
table, produces an XML document in the
preceding form, and returns it to the client with a MIME type of
"application/xml". Place this in a file quotations-xml.tcl, so that
other users can retrieve the data by visiting that agreed upon URL.
To get you started, we've provided /psets/ps1/example-xml.tcl. Requesting this URL with a Web browser should offer to let you to save the document to a local file, and you can then examine it with a text editor on your local machine. (This assumes that you haven't defined some special behavior for your browser for MIME type application/xml.) The differences between our example and your program is that you'll need to produce a document containing the entire table and you'll need to generate it on the fly.
ns_httpget
quotation_id
(you don't
want keys from the foreign server conflicting with what is already in
your database)
Hints: You might want to set up a temporary table using
create table quotations_temp as select * from quotations
and then drop it after you're done debugging. You should use
DoubleApos
when presenting data to Oracle for
comparisons.
Rather than having you link in a 100,000-line C program (or a
5,000-line Lisp program) that parses XML documents based on a DTD,
we've gone for simplicity here by predefining for you a parser in Tcl
that understands only this particular DTD for quotations. The
procedure is parse_all
(you
have to install this file in your server's private Tcl library,
/web/yourvirtualserver/tcl/, for this function to be callable by .tcl
and .adp pages) . The parse_all
proc takes an XML
quotation structure as argument and returns a Tcl list, showing the
parts and subparts of the structure. To see an example of the format,
use your browser to visit the page
http://yourvirtualserver/psets/ps1/xml-parse-test.tcl.
Note: these exercises are designed to familiarize you with XML. In most cases, sophisticated XML processing should be done inside Oracle using Java libraries. See http://photo.net/doc/xml.html.
ns_schedule_proc
to
schedule your procedure to run once every hour
Note that you're dipping into the ArsDigita Community System toolkit here, the software with which you'll be occupied in Problem Set 2.
create table my_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
sqlldr
shell command to invoke SQL*Loader to
slurp up your tab-separated file into the my_stocks
table
(see page 1183 of Oracle8: The Complete Reference and
the official Oracle docs at http://philip.greenspun.com/sql/ref/utilities)
stock_prices
with three columns: symbol,
quote_date, price
. After this one statement, you should have
created the table and filled it with one row per symbol in
my_stocks
. The date and price columns should be filled
with the current date and a nominal price. Hint:
select symbol, sysdate as quote_date, 31.415 as price from my_stocks;
.
create table newly_acquired_stocks ( symbol varchar(20) not null, n_shares integer not null, date_acquired date not null );
insert into .. select ...
statement
(with a WHERE clause appropriate to your sample data), copy about half
the rows from my_stocks
into newly_acquired_stocks
my_stocks
and
stock_prices
, produce a report showing symbol, number of
shares, price per share, and current value.
my_stocks
. Run your query from
Oraexercise 3. Notice that your new stock does not appear in the
report. This is because you've JOINed them with the constraint that
the symbol appear in both tables.
Modify your statement to use an OUTER JOIN instead so that you'll get a complete report of all your stocks, but won't get price information if none is available.
ASCII
function will be helpful)
stock_prices
to set each stock's value to whatever is returned by this PL/SQL
procedure
n_shares * price
for
each stock). You'll want to define your JOIN from Oraexercise 3
(above) as a cursor and then use the PL/SQL Cursor FOR LOOP facility.
Hint: when you're all done, you can run this procedure from SQL*Plus
with select portfolio_value() from dual;
.
date_acquired
set to sysdate
)
my_stocks
to
produce a report of symbols and total shares held
stock_prices
to produce a report of symbols and total
value held per symbol
stocks_i_like
that encapsulates the final query.
An alternative is to show the user a table of holdings right on the top-level page, with a sensible subset of the data by default, and provide controls to adjust what is included in the display. What kind of controls? You could have the ones above, plus whatever other views the publisher of the site and the users eventually decide are necessary. Suppose, though, that you can organize the controls along orthogonal dimensions. If you can do that, with just a handful of "dimensional sliders", the user will have many options.
For example, the ArsDigita ticket tracking system is used to store bug reports and feature requests. The following dimensions are employed
If those 24 options aren't enough, the ticket tracking system lets the user re-sort the table by any of the columns by clicking on the column heading.
Try building the same sort of thing for your stock portfolio. You
want a .tcl page that shows the contents of my_stocks
with stock_prices
. Provide controls across the top
(hint: TABLE WIDTH=100% and TD ALIGN=RIGHT will be useful) for the
following dimensions:
export_ns_set_vars
in
$SERVER_HOME/tcl/00-ad-utilities.tcl will be useful for this,
notably because of the exclusion_list
argument), e.g.,
symbol, number of shares, price per share, value of holding.
You can build this from scratch or use the ArsDigita Community System toolkit API calls in http://software.arsdigita.com/packages/acs-core/table-display-procs.tcl as a building block.
It was revised in January 2000 for AOLserver 3.0, which incorporates Tcl 8.2. The old version is available from http://philip.greenspun.com/teaching/psets/ps1/6916.ps1-for-aolserver-2.3.tar.
It is permanently housed at http://philip.greenspun.com/teaching/psets/ps1/ps1.adp.