Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science

6.916: Software Engineering of Innovative Web Services
Emerging Infrastructures for Web Services

Readings:

Objectives: By doing this problem set you will learn

The Big Picture

"We're going to do for programs what the Web did for data." -- Rajiv Gupta, developer of Hewlett-Packard's e-speak, (July, 1998)

"It's exciting the progress that's taking place around XML just even in the last six months. We've got here a standard that Microsoft is very much behind, but not just Microsoft. We've got IBM and many others joining in in things like the SOAP definition that explain how XML can essentially be used as a program-to-program protocol, how programs can exchange arbitrary data with each other." -- Bill Gates (June, 2000)

The Medium-sized Picture

Back in the early 90s, few people had ever heard of Tim Berners-Lee's World Wide Web, and, of those who had, far fewer appreciated its significance. After all, computers had been connected to the Internet since the 70s, and transferring data between computers was commonplace. Yet the Web brought something really new: the perspective of viewing the whole Internet as a single information space, where users accessing data could move seamlessly and transparently from machine to machine by following links.

A similar shift in perspective is currently underway, this time with application programs. Although distributed computing has been around for as long as there have been computer networks, only recently have applications that draw upon many interconnected machines as one vast computing medium begun to be deployed on a large scale. What makes this possible is a set of new protocols for distributed computing that are built upon HTTP and designed for programs interacting with programs, rather than for people surfing with browsers.

There are several kinds of protocols:

  1. Data exchange: Something better than scraping text from web pages. As you saw in the Basic problem set, you can use XML here.

  2. Program invocation: Some way to do remote method invocation, that is, for programs to call programs running on other machines and to reply to such invocations. The proposed standard here, submitted to the World Wide Web Consortium (W3C) in May 2000, is called SOAP (Simple Object Access Protocol).

  3. Self-description (also called Introspection): A machine-readable way for Web services to describe how they are supposed to be called.

  4. Discovery: A way for Web services to automatically learn about other services.

We're currently moving from an environment where applications are deployed on individual machines and Web servers, to a world where applications are composed of pieces -- called services in the current jargon -- that are spread across many different machines, and where the services interact seamlessly and transparently to produce an overall effect. While the consequences of this change could be minor, it's also possible that they could be as profound as the introduction of the Web. In any case, companies are introducing new Web service frameworks that exploit the new infrastructure. Hewlett-Packard's e-speak and Microsoft's .NET are two such frameworks.

In this problem set, you'll explore composition of Web services, both in combining services to create new ones, and creating component services for use by others. You'll be using SOAP, which serves as an underlying method invocation protocol for e-speak and .NET, and (probably) other Web service frameworks that are yet to be introduced.

As part of your exploration, you'll be making use of an experimental service called Terranet, which was initially developed to support this problem set and has been deployed by Microsoft Research.

Note: In order to do these exercises, you'll need to

  1. Define a database table to keep track of methods your server will provide:
    create sequence soap_6916_method_id_seq;
    
    -- Maps method names to tcl pages.
    create table soap_6916_methods (
       method_id       integer primary key,
       method_name     varchar(100) not null unique,
       method_tcl_url  varchar(200) not null unique,
       method_comment  varchar(4000),
       -- this is analogous to the SDL contract
       parameters      varchar(4000) not null
    );
    
  2. Install the method handler, utility and example tcl files from ps-services.tar. Directions are in the README file.
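
To make the role of this table concrete, here is a minimal sketch in Python (not the Tcl used by the problem set files). It assumes an in-memory SQLite database as a stand-in for the course database; SQLite has no sequences, so method_id is assigned automatically. The inserted row and the helper name lookup_method_url are illustrative assumptions, not part of ps-services.tar.

```python
import sqlite3

# In-memory SQLite stands in for the course database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table soap_6916_methods (
        method_id       integer primary key,
        method_name     varchar(100) not null unique,
        method_tcl_url  varchar(200) not null unique,
        method_comment  varchar(4000),
        parameters      varchar(4000) not null
    )
""")

# Register one method; the values are illustrative only.
conn.execute(
    "insert into soap_6916_methods "
    "(method_name, method_tcl_url, method_comment, parameters) "
    "values (?, ?, ?, ?)",
    ("addition", "/service/add.tcl",
     "Takes two numbers and returns the sum.", "x y"),
)

def lookup_method_url(name):
    """Return the tcl page registered for a method name, or None."""
    row = conn.execute(
        "select method_tcl_url from soap_6916_methods where method_name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None

print(lookup_method_url("addition"))
```

The unique constraints mean that each method name maps to exactly one tcl page, which is what lets a single handler dispatch requests by name.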

Invoking methods with SOAP

Take a look at the U.S. Census Bureau's 1990 Census Summary Tape Files, which provide data from the 1990 U.S. Census. Look up some city or town, and retrieve some information. You'll see that the forms interface provided by the Census Bureau may be OK for selecting data items by hand, but it's not at all convenient as a programmatic interface.

In contrast, visit the page census-data-example.tcl in the files supplied for this problem set. This should produce an XML structure with 1990 Census data for Cambridge, MA. If you haven't taught your browser to use a specific application for XML MIME types, the browser will offer to let you save the page, and you can examine it with Emacs. Newer versions of Internet Explorer automatically include an XML viewer.

Our census data example uses SOAP to invoke an experimental .NET service named CensusService, which Microsoft Research has set up for our class to use this semester. CensusService provides much the same 1990 data as the Census Bureau site, but has been implemented as a .NET service.

Examine the source code for census-data-example.tcl. You'll see that we've provided a procedure soap_invoke that lets you invoke methods using SOAP requests. The arguments to soap_invoke are the URL address of the service, the name of the method to invoke, and an XML structure that specifies the arguments for the method.

In general, a service consists of a collection of methods, which a service provider makes available for some purpose. Methods can be invoked in several different ways, depending on how they are set up. In this problem set, we invoke methods by sending SOAP requests, which are messages sent via HTTP POST. A SOAP request consists of a body, enclosed in an envelope. Our procedure soap_invoke sends a SOAP request whose body is the XML structure you specify, wrapped in an appropriate envelope.

In gory detail, the full SOAP request sent over the wire in the census data example is

POST http://terranet.research.microsoft.com/CensusService.asmx HTTP/1.1
Content-Type: text/xml
User-Agent: AOLserver SOAP Client
SOAPAction: http://tempuri.org/GetPoliticalUnitFactsByName
Content-Length: 494
Connection: Close
Host: http://terranet.research.microsoft.com

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <soap:Body>
     <GetPoliticalUnitFactsByName xmlns="http://tempuri.org/">
        <pu>City</pu>
        <name>Cambridge</name>
        <ParentName>Massachusetts</ParentName>
        <year>1990</year>
     </GetPoliticalUnitFactsByName>
  </soap:Body>
</soap:Envelope>
The soap_invoke procedure makes life a bit easier by hiding the details of HTTP POST and SOAP envelope headers. In actuality, web services infrastructures like .NET suppress even more details: they make invoking SOAP methods look just like ordinary procedure invocations. But this problem set leaves the XML layer exposed so you can see what's going on.
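
To see what "wrapping a body in an envelope" amounts to, the following Python sketch assembles a request like the one shown above. It is not the source of soap_invoke (which is Tcl); the function name build_soap_request is an assumption, and the sketch omits the xsi and xsd namespace declarations and does not send anything over the network.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def build_soap_request(method, namespace, args):
    """Return (headers, body) for a SOAP call; args is a list of (name, value)."""
    arg_xml = "".join(
        "<%s>%s</%s>" % (name, escape(str(value)), name) for name, value in args
    )
    body = (
        '<?xml version="1.0"?>'
        '<soap:Envelope xmlns:soap="%s">'
        "<soap:Body>"
        '<%s xmlns="%s">%s</%s>'
        "</soap:Body>"
        "</soap:Envelope>"
    ) % (SOAP_ENV, method, namespace, arg_xml, method)
    headers = {
        "Content-Type": "text/xml",
        "SOAPAction": namespace + method,
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return headers, body

headers, body = build_soap_request(
    "GetPoliticalUnitFactsByName",
    "http://tempuri.org/",
    [("pu", "City"), ("name", "Cambridge"),
     ("ParentName", "Massachusetts"), ("year", "1990")],
)
ET.fromstring(body)  # raises ParseError if the envelope is not well-formed
print(headers["SOAPAction"])
print(body)
```

Note that the SOAPAction header is simply the method's namespace followed by the method name, matching the request shown above.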

Exercise 1

Use the census service to implement a page where the user enters a city and state and gets back a table (or a graph, if you prefer) showing the age breakdown of the city's population (in 1990).

Describing services in machine-readable form

SOAP provides a general way to pass structured arguments to Web methods. How do we know which arguments to pass? The obvious answer is to look at the documentation for each method. Documentation is important, but to achieve automatic and transparent interaction of Web services, we'd like that documentation to be machine readable. That is, we'd like Web services and their methods to be self-describing.

Visit the URL for the services that Microsoft has implemented for us at terranet.research.microsoft.com. There are three services. The page for the CensusService lists four methods: GetPoliticalUnitFactsByName, and three others. This page also has a form interface that lets you invoke two of the methods. Use this interface to get the 1990 political facts for some city, to verify that it works and returns an appropriate XML structure.

The interesting thing about the CensusService page is that it was automatically generated from a data structure called the service's SDL Contract. SDL (Service Definition Language) is a way of specifying what methods a service provides, what arguments the methods expect, and what results they return. Examine the CensusService's SDL contract. The part at the beginning, enclosed within <soap>...</soap> tags, lists the methods that can be invoked by SOAP requests. There are analogous sections for methods that can be invoked using HTTP POST and HTTP GET, but we won't be concerned with these here. Further down, you'll see a section delineated by <schema>...</schema> tags. This describes the arguments required by each of the methods.

Note: If you're wondering why the Microsoft page for the CensusService has only two forms, yet there are four methods, it's because the Microsoft forms are generated only for the HTTP GET part of the SDL contract, and Microsoft currently does not provide GET interfaces for methods that require arguments that are structured data.

Exercise 2

Write a program that uses the SDL contract to automatically generate a form -- similar to the form generated on the CensusService page -- for invoking GetPoliticalUnitFactsByName. Your form should invoke the method by issuing a SOAP request. Rather than just asking the user to fill in input fields as in the Microsoft form, your form should provide a pull-down list of choices for the pu argument, since the type of that argument is a specifically enumerated set of strings.
Hint: For this problem, you need to write a procedure that works for this particular SDL contract. You do NOT need to implement a general solution for arbitrary SDL contracts.

Exercise 3

How would you implement a program to generate an input form for invoking GetPoliticalUnitFactsByRect, which takes a "rectangle" argument, specified as the latitudes and longitudes of the upper left and lower right corners of the rectangle? By examining the SDL contract, you ought to be able to determine that the XML argument for invoking this method should have the form
<pu>....</pu>
<rect>
  <UpperLeft>
    <Lon>...</Lon>
    <Lat>...</Lat>
  </UpperLeft>
  <LowerRight>
    <Lon>...</Lon>
    <Lat>...</Lat>
  </LowerRight>
</rect>
<year>...</year>
Write a brief (one-paragraph) clear explanation of how to implement a program that generates input forms for methods that, like GetPoliticalUnitFactsByRect, have arguments with complex types.
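
The nested argument structure shown above can be assembled mechanically from nested data. The Python sketch below illustrates only that step (building the XML argument, not generating the form); the helper name to_xml and all coordinate values are placeholders chosen for illustration.

```python
from xml.sax.saxutils import escape

def to_xml(name, value, indent=0):
    """Render value as an element called name; dicts become nested elements."""
    pad = "  " * indent
    if isinstance(value, dict):
        inner = "\n".join(to_xml(k, v, indent + 1) for k, v in value.items())
        return "%s<%s>\n%s\n%s</%s>" % (pad, name, inner, pad, name)
    return "%s<%s>%s</%s>" % (pad, name, escape(str(value)), name)

# Coordinates and the pu value are placeholders, not real service data.
args = {
    "pu": "City",
    "rect": {
        "UpperLeft": {"Lon": -71.2, "Lat": 42.4},
        "LowerRight": {"Lon": -71.0, "Lat": 42.3},
    },
    "year": 1990,
}
args_xml = "\n".join(to_xml(k, v) for k, v in args.items())
print(args_xml)
```

The recursion mirrors the structure of the type: a simple value becomes one element, and a complex value becomes an element whose children are rendered the same way.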

Satellite images and maps

Spend a few minutes playing with the Microsoft Terraserver at terraserver.microsoft.com. Zoom in on some region you're interested in. Notice that there are three kinds of images: photographs (taken from satellites and airplanes), topographic maps, and relief maps.

In contrast to the Terraserver, the TerraService at terranet.research.microsoft.com provides a service interface to Terraserver's data and functionality. You can see the methods listed on the TerraService page, and described in the service's SDL contract. The page lat-lon-example.tcl (part of the code for this problem set) uses the GetPlaceList method to find the latitude and longitude of several places in the US whose names include "Cambridge".

The page image-example.tcl shows how to use the latitude and longitude of Cambridge to invoke the TerraService to produce an aerial photograph of Cambridge together with a corresponding topographic map. The page also includes a corresponding street map, produced from MapBlast! by calling an internal URL.

Study the source code for image-example.tcl to see how the images are produced: We start with the latitude and longitude and use these to retrieve some image tile metadata, from which we extract a tile id. The tile id is used as an argument to a procedure terraserver_image_url which provides a URL we can use in an <img src= ...> HTML tag to draw the image.

See the documentation of terraserver_image_url for details, and notice that

  1. terraserver_image_url uses a procedure called get_terraserver_image, which gets the actual image from the service.
  2. You need to have the file terraserver_image_page.tcl in your pset directory for this to work.

Experiment with modifying the parameters in image-example.tcl to show Cambridge at various scales. Note that the photograph and the topographic map remain aligned, but the street map doesn't change, because its arguments are hand-coded.

Exercise 4

Generalize image-example so that you can view matching image, topographic, and street maps at any latitude, longitude, and scale in the United States. You'll need to compute an appropriate argument to feed to MapBlast. Generate some maps of interesting places.

Hint 1: The CT argument in the URL provides MapBlast with the latitude, longitude, and scale, in the form (lat:lon:scale). You'll need to play around to get a good correspondence between Terraserver scale numbers and MapBlast scale numbers.

Hint 2: The actual center point of the image tile returned by Terraserver may not be the same as the latitude and longitude you request, due to details of how the photographs were produced. Consequently the street map and the image maps may not be aligned if you use the initial latitude and longitude as the latitude and longitude for MapBlast. To correct this problem, invoke MapBlast with the actual latitude and longitude of the center of the image, which you can find in the tile metadata. The structure of tile metadata is specified in the TerraService's SDL contract.

Census data maps

Take a look at the U.S. Census Bureau's Tiger Mapping Service, which draws maps that are shaded according to the kind of census data you examined in exercises 1-3. The instructions linked from that page describe how to produce maps by passing arguments to the mapping service in URLs.

Exercise 5

Extend exercise 4 by adding a fourth map, drawn by the Tiger service, and matched to the other three, that is shaded according to some census-data criterion that the user can select. Since the Tiger service shades maps at the census block level, drawing a map that covers too small an area won't produce anything interesting because the mapped area won't contain enough different census blocks.

Hint: In order to match the maps, you'll need to compute the size of the mapped area in latitude and longitude. You can obtain the necessary information from the tile metadata.
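
As a rough illustration of the computation in this hint, the Python sketch below derives the center and the size of a mapped area from its corner coordinates. It assumes that the tile metadata gives you the latitude and longitude of the north-west and south-east corners; check the TerraService's SDL contract for the actual field names. The function name mapped_area and the sample coordinates are placeholders.

```python
def mapped_area(north, west, south, east):
    """Return the center and extent, in degrees, of a lat/lon bounding box."""
    return {
        "center_lat": (north + south) / 2.0,
        "center_lon": (west + east) / 2.0,
        "height_deg": north - south,
        "width_deg": east - west,
    }

# Placeholder corner values, roughly around Cambridge, MA.
area = mapped_area(north=42.40, west=-71.16, south=42.35, east=-71.06)
print(area)
```

This simple subtraction is only valid for areas that do not cross the 180-degree meridian.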

Creating your own service

In this part of the problem set you will add methods to a generic container service running on your student server at /service/handler. The following procedures let you create and invoke these methods using a dumbed-down version of SOAP:

create_6916_method method_name page comment
produces a method that can be invoked remotely with invoke_6916_method. The page argument should specify a tcl page that expects arguments and returns a value. The corresponding service will return this value, wrapped in <result>...</result> tags. The method_name argument is a symbol used to refer to the method, and comment is a comment intended to describe the method. If there already is a method of the given name in the database, then it will be overwritten.

invoke_6916_method host_url method_name args_body
invokes the method named method_name at the host specified by host_url, giving it the arguments args_body. Here args_body should be an XML structure.

invoke_6916_method host_url
Calling invoke_6916_method without method_name or args_body returns an XML structure describing the methods at the given host_url.

invoke_6916_method host_url method_name
Calling invoke_6916_method without args_body returns an XML structure describing the method method_name at the given host_url.
Note: The host_url argument should always end in /service/handler, the location of the handler you installed from ps-services.tar.

For example, suppose you create a page http://lcswwwXX.lcs.mit.edu/service/add.tcl, whose contents are

ad_page_contract { 
    This page returns the sum of x and y.  
} {
    {x ""}
    {y ""}
}

if { [empty_string_p $x] || [empty_string_p $y] } {

    # Insufficient args; return the contract.

    ns_return 200 application/xml "
    <request>
      <param name=x type=notnull/>
      <param name=y type=notnull/>
    </request>
    <response>
      <param name=sum/>
    </response>
    "
    return
}

# We got args; return the sum.

# Brace the expression so that Tcl does not substitute the user-supplied
# values a second time.
set sum [expr {$x + $y}]

set return_xml "<result>
  <sum>$sum</sum>
</result>"

ns_return 200 application/xml $return_xml

If you make this method available remotely with

create_6916_method addition \
                   http://lcswwwXX.lcs.mit.edu/service/add.tcl \
                   "Takes two numbers and returns the sum."
then people can invoke this method with
invoke_6916_method http://lcswwwXX.lcs.mit.edu/service/handler addition "<x>23</x> <y>34</y>"
to get back
<result>
  <sum>57</sum>
</result>
Also, evaluating
invoke_6916_method http://lcswwwXX.lcs.mit.edu/service/handler
will return an XML structure that includes the entry
<service>
  <name>addition</name>
  <param name=x type=notnull/>
  <param name=y type=notnull/>
</service>
The files add3_method_example.tcl and student_server_method_example.tcl, provided in the tar file, give additional examples of tcl pages that can be used with create_6916_method.
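
A caller usually wants the values inside the reply rather than the XML wrapper. The Python sketch below shows one way to build an args_body string like the one in the addition example and to unpack a <result> reply. The helper names build_args_body and parse_result are assumptions made for illustration; they are not procedures supplied in the tar file, and nothing here contacts a server.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def build_args_body(args):
    """Turn a dictionary of arguments into a string of XML elements."""
    return " ".join(
        "<%s>%s</%s>" % (name, escape(str(value)), name)
        for name, value in args.items()
    )

def parse_result(xml_text):
    """Return the children of <result> as a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    if root.tag != "result":
        raise ValueError("expected <result>, got <%s>" % root.tag)
    return {child.tag: child.text for child in root}

args_body = build_args_body({"x": 23, "y": 34})
reply = "<result>\n  <sum>57</sum>\n</result>"  # the reply from the example
print(args_body)
print(parse_result(reply))
```

Values come back as strings, so a caller that needs a number must convert it explicitly.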

Exercise 6

Create a couple of methods. Make something interesting -- perhaps something that uses some real data, which you can fetch from over the Web. Or something that uses the Microsoft CensusService and processes the result in some way.

Exercise 7

Write a page that scans the machines in the class and shows the available services and methods. Encourage your fellow students to complete up through exercise 6, so that your page will have something to show. (Hint 1: You will need the method student_servers available at http://6916.lcs.mit.edu/service/handler. Hint 2: Fall 2000 students should use class_id=22, role=student.)

Exercise 8

Implement a method that uses some of the methods that your classmates have created. Work together to create a cluster of services that produces some interesting result. Notice the Achilles' heel of such a distributed service structure: in order for a compound service to work, all the component methods (which may be on other Web servers, which are out of your control) have to be working.
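
One way to soften this Achilles' heel is to treat every component call as something that may fail, and to record failures instead of letting one broken component abort the whole page. The Python sketch below illustrates the idea with local stand-in functions; the names call_components, working_service, and broken_service are hypothetical, and a real version would wrap actual remote invocations.

```python
def call_components(components):
    """Call each (name, function) pair; collect results and failures separately."""
    results, failures = {}, {}
    for name, func in components:
        try:
            results[name] = func()
        except Exception as err:  # a remote call can fail in many ways
            failures[name] = str(err)
    return results, failures

# Stand-ins for remote method invocations (hypothetical).
def working_service():
    return "<result><sum>57</sum></result>"

def broken_service():
    raise IOError("connection refused")

results, failures = call_components(
    [("addition", working_service), ("census", broken_service)]
)
print(results)
print(failures)
```

The compound service can then decide whether a partial answer is still useful or whether it should report an error to its own caller.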

Dynamic service composition

The compound services you created in exercise 8 probably used particular services, on particular machines, but this is only one possibility. We could also have a service that examined what was available on neighboring machines and then picked component services dynamically, according to some criteria, such as cost or reliability. In order to make this idea effective, we need to augment the human-readable comment for the service with some formal machine-readable descriptions, so that there is some basis by which a program could select a service to use.
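
As a purely hypothetical illustration of such a selection, suppose each candidate service published a cost and a reliability figure. Neither field exists in the handler supplied with this problem set; the field names, host names, and numbers below are all invented for the sketch, which is written in Python.

```python
def pick_service(candidates, min_reliability):
    """Pick the cheapest candidate whose reliability meets the threshold."""
    eligible = [c for c in candidates if c["reliability"] >= min_reliability]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c["cost"])

# Invented descriptions of three services offering the same method.
candidates = [
    {"host": "host-a", "cost": 1.0, "reliability": 0.80},
    {"host": "host-b", "cost": 2.0, "reliability": 0.95},
    {"host": "host-c", "cost": 5.0, "reliability": 0.99},
]

choice = pick_service(candidates, min_reliability=0.9)
print(choice["host"] if choice else "no service is reliable enough")
```

The point is only that a program needs comparable, machine-readable attributes before it can choose; deciding which attributes to publish, and whether to trust them, is the hard part.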

We won't ask you to develop such descriptions in this problem set. There's a lot of activity going on right now in the "B2B e-commerce sector" trying to define standards and protocols to support dynamic service composition, including service descriptions, service directories, and negotiation between services. One example is Hewlett-Packard's Service Framework Specification (SFS) (see the PDF marketing blurb), which is part of HP's overall e-speak effort. Another is the Universal Description, Discovery and Integration (UDDI) service registration protocol being developed by IBM, Microsoft, and Ariba (see the September, 2000 news article).

Overall, the kind of infrastructure you've been exploring in this problem set doesn't quite exist yet, but it is rapidly coming into being. More than likely, within a few years people will take seamless interoperation of Web services completely for granted, just as we take seamless linking of Web pages for granted today.

Who Wrote This and When

This problem set was written by Hal Abelson, Dan Parker (dparker@arsdigita.com), and Andrew Grumet (aegrumet@arsdigita.com) in October 2000 for MIT course 6.916. Microsoft Research's Terranet Service, which was initially developed to support this problem set, was implemented by Jeff Richter and Tom Barclay with support from Dan Fay and Dave Mitchell.

This file is permanently housed at http://philip.greenspun.com/teaching/psets/services/.

This material is copyright 2000 by the authors. It may be copied, reused, and modified, provided credit is given to the original authors with a hyperlink to this document.


Last Modified: October 11 2000, 12:58 AM
Maintainer: dparker@arsdigita.com