Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science

6.171 Mid-Term Exam

Fall 2003

Rules

This is a take-home exam. You can take it where you want to and use any resources you like, except that you must do the exam by yourself (no collaboration, not even with your team members). If you are relying on a section of a book or Web page, cite the source. An 80/100 will be a great score on this exam so if there is one part that is taking forever and doesn't interest you, write up your thoughts and move on.

When you are through with the exam, send your work by email (in plain text, please) to 6171-staff@mit.edu. At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself. We must receive your mail by 11:30 AM on Thursday, November 13th (before class begins).

Handed out: Thursday, November 6th, 1:00 pm. Due: Tuesday, November 13th, 11:30 am.

Resources

Here are some resources that will help you answer the questions below.

Problems or questions? Email ben@mit.edu.

Question 1: Developing a Data Model

Visit www.friendster.com. If you're not already a member, sign up for an account.
  1. Find our own Ben Adida, using a "user search," and add him as a friend (Ben said that he went into Computer Science because he wanted to meet a lot of cool people). You should do this earlier rather than later so that you have time to complete the exercises.
  2. Look at a page describing a user. Look carefully at the information presented on this profile, including friends' comments
  3. Add a few other friends (people in the class are a good idea).
  4. Change your favorite movies or TV shows to try to find people who have the same tastes. Observe how the site presents this information.
  5. Add some testimonials on your friends
  6. Search for people in your network who live in specific cities other than Boston.
Now develop a data model for the friendster.com site. Your data model should be complete enough to handle all of the preceding actions and operate the core friendster.com service. (Ignore the "Shopping" and "Events" sections). Assume the existence of a users table, keyed with user_id. The data model should include appropriate integrity constraints and indices to facilitate the queries that you believe are necessary for the pages that you just experienced. The data model should contain appropriate in-line comments.

Test your data model by feeding it to an RDBMS and making sure that the CREATE TABLE and index definition statements are accepted without errors. You will probably want to create a new database user so that your table definitions don't conflict with any tables that you might have defined for your team project.

Expected length of answer: if your data model contains more than 15 or 20 tables, you're probably handling too many special cases. The important thing is to get the structure of the data model right and the relations. Don't spend a lot of time making sure that you have every possible column in every table. For example it would be okay to say "-- ... address columns ..." to imply that there would be a way to store a member's street address, city, state, etc.

Question 2: Fancy SQL Queries

Using the data model you defined in Question 1 write two computer programs:

Part a: a computer program that generates the page "find people in my network of friends that live within 50 miles of me and who say that they are interested in meeting people for Dating someone of my sex". The program should list those people ordered by ascending geographically distance.

Part b: a computer program that generates the page "how am I related to this person?" and displays all the possible paths between two people.

Ideally each program will be a single SQL query. It is acceptable to use procedural language crutches, e.g., stored procedures. Hint: you can postulate the existence of a table or function that will give you the latitude/longitude of the centroid of a U.S. zip code.

Test your queries by feeding your data model from Question 1 into the RDBMS that you've been using this semester (though you can use a separate tablespace/database/namespace). Fill the tables with at least the following test data (you may add more if you like):

Print out a transcript of a database session in which you use your queries (for (b) you could query to find the friend path between George and Arnold).

Note that it is possible that you'll start doing these queries and discover that you want to make changes to your data model. That's a perfectly natural way to refine a system. See if a data model supports some queries then go back and modify it until it does.

Don't spend more than a few hours on this question. You can get partial credit if you show that you're thinking along the right lines, e.g., by coming up with some smaller queries that give you reports that could be useful in solving the larger problem.

Question 3: Usability Analysis

Based on your experience in Question 1 identify three usability problems with Friendster and write instructions to their programmers on how to fix them. Extra credit: email these ideas to the Friendster folks and attach their response.

Expected length of answer: 1 page.

Question 4: Extending the Application with Distributed Computing

Jose Frio Cuarenta, senior MIT administrator, has decided that friendship networks are the next hot thing and wants to bring the magic of Friendster to MIT by building MITster. Being the total Organization Man, he conceives this as a standalone application in which every MIT-affiliated person registers and comes to visit the site. Philip '82 and Gerry '68 can discover that they both live in the same city, share some interests, and are connected through Hal 'XX [we could tell you but we'd have to kill you].

Jose brings you into the project to lay out the system requirements and your first question is Why can't they just use Friendster? Jose responds that he isn't satisfied with Friendster's community features. A group of friends should be able to start an ad-hoc group with a private discussion forum, collaborative workspace, event planner, etc.

You point out that standalone servers were great in the 1990s but this century is all about distributed computing. Maybe the goal should be to augment rather than supplant Friendster. An MIT affiliate could log into a single server and get all of the services of Friendster plus additional services for groups of MIT friends.

Clearly you're going to have to set up some sort of server to provide the additional community support features. What kind of SOAP interface would you want from Friendster? Write a 1-2-page English description of the interface that you want and one WSDL file for a portion of the interface.

Here again we're looking for a reasonable structure not a complete solution with every possible field defined. It is okay to have some parts elided and commented.

Question 5: Fun with Data Warehousing

Using the Levi Strauss data model defined in the "Real World Example" section of http://philip.greenspun.com/sql/data-warehousing, write SQL queries that answer the following questions: Suppose now that you are asked to prepare a report of average waist size of a pair of shipped pants, broken down by region of the country. To which table would you add a waist_size column and why?

Turning it in

Please email your answers to the questions to 6171-staff@mit.edu.

At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

Who Wrote This and When

This exam was written by Hal Abelson, Ben Adida, Philip Greenspun, and Andrew Grumet, in October 2003 for MIT Course 6.171.. It is copyright 2003 by the authors but may be reused provided credit is given to the original authors with a hyperlink to this document.
philg@mit.edu