Abstraction, Filtration, and Comparison of more and less

by Philip Greenspun and John Morgan; updated March 2011



This is an example of the Abstraction, Filtration, and Comparison of two file paging utilities: more and less. These utilities display text files in a terminal window one screen of output at a time, allowing the user to view a document page by page.

Abstraction Level 0: code

more

The 1904 lines of source code and comments in the C programming language are spread across 44 functions and packaged into a single source file. More was written by Daniel Halbert in 1978.

less

The 19825 lines of source code and comments in the C programming language are spread across 255 functions and 35 different source files (concatenated to a single file for the preceding link). Less was written by Mark Nudelman between 1983 and 1985, motivated by a desire to add the capability of paging both backwards and forwards.

Abstraction Level 1: functional units

To illustrate the process carried out at Abstraction Level 1, we'll look at corresponding sections from more and less. The code sections compared, screen from more and forw from less, each get the next screen's worth of output from the input file.

more

more does its paging with the screen function. The first block is a comment describing the function's purpose plus code defining the parameter list and initializing variables. The function takes two parameters: the file to display, and the size of each page expressed as vertical lines of text.

/*
** Print out the contents of the file f, one screenful at a time.
*/

#define STOP -10

screen (f, num_lines)
register FILE *f;
register int num_lines;
{
    register int c;
    register int nchars;
    int length;			/* length of current line */
    static int prev_len = 1;	/* length of previous line */

The second block is the main loop, where the heavy lifting is done. Here single lines of text are read from the file, formatting checks are made, the line is output to the terminal, and the loop counter is updated. The inner loop body executes once for each line until the entire screen has been painted.

    for (;;) {
	while (num_lines > 0 && !Pause) {
	    if ((nchars = getline (f, &length)) == EOF)
	    {
		if (clreol)
		    clreos();
		return;
	    }
	    if (ssp_opt && length == 0 && prev_len == 0)
		continue;
	    prev_len = length;
	    if (bad_so || (Senter && *Senter == ' ') && promptlen > 0)
		erase (0);
	    /* must clear before drawing line since tabs on some terminals
	     * do not erase what they tab over.
	     */
	    if (clreol)
		cleareol ();
	    prbuf (Line, length);
	    if (nchars < promptlen)
		erase (nchars);	/* erase () sets promptlen to 0 */
	    else promptlen = 0;
	    /* is this needed?
	     * if (clreol)
	     *	cleareol();	* must clear again in case we wrapped *
	     */
	    if (nchars < Mcol || !fold_opt)
		prbuf("\n", 1);	/* will turn off UL if necessary */
	    if (nchars == STOP)
		break;
	    num_lines--;
	}

The last block is executed after the page has been processed. Here the output buffer is flushed, additional formatting checks are made, and the variables (line and chrctr within the screen_start structure) that track position within the file are updated.

	if (pstate) {
		tputs(ULexit, 1, putch);
		pstate = 0;
	}
	fflush(stdout);
	if ((c = Getc(f)) == EOF)
	{
	    if (clreol)
		clreos ();
	    return;
	}

	if (Pause && clreol)
	    clreos ();
	Ungetc (c, f);
	setjmp (restore);
	Pause = 0; startup = 0;
	if ((num_lines = command (NULL, f)) == 0)
	    return;
	if (hard && promptlen > 0)
		erase (0);
	if (noscroll && num_lines >= dlines)
	{
	    if (clreol)
		home();
	    else
		doclear ();
	}
	screen_start.line = Currline;
	screen_start.chrctr = Ftell (f);
    }
}
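
The overall shape of the main loop above can be sketched in a few lines of modern C. This is a minimal illustration, not code from more: the function name show_page, the out parameter, and the fixed-size buffer are all assumptions, and every terminal-handling detail (clearing, underlining, prompts) is left out.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not taken from more: copy up to num_lines lines
 * from in to out and return how many lines were written. A return value
 * smaller than num_lines means end of file was reached first.
 * Assumption: no input line is longer than the buffer; a longer line
 * would be counted as several lines by this sketch.
 */
int show_page(FILE *in, FILE *out, int num_lines)
{
    char line[1024];
    int shown = 0;

    /* Count down the remaining lines, as screen does with num_lines. */
    while (num_lines > 0 && fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        num_lines--;
        shown++;
    }
    fflush(out);
    return shown;
}
```

A caller would invoke show_page once per page, waiting for a keypress between calls, and stop when the return value falls short of the requested page size.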

less

The forw function also starts with a comment describing the function's purpose as well as code defining the parameter list and variable initialization. Notice that in addition to terminal size, the forw function takes in the position within the file as well as several option control flags. One parameter we don't see here is a pointer to a specific file.

/*
 * Display n lines, scrolling forward, 
 * starting at position pos in the input file.
 * "force" means display the n lines even if we hit end of file.
 * "only_last" means display only the last screenful if n > screen size.
 * "nblank" is the number of blank lines to draw before the first
 *   real line.  If nblank > 0, the pos must be NULL_POSITION.
 *   The first real line after the blanks will start at ch_zero().
 */
	public void
forw(n, pos, force, only_last, nblank)
	register int n;
	POSITION pos;
	int force;
	int only_last;
	int nblank;
{
	int eof = 0;
	int nlines = 0;
	int do_repaint;
	static int first_time = 1;

The next unit is called once prior to the display of each new page and carries out calculations that relate to the page as a whole. It initializes variables in preparation for the main line-by-line loop where the actual scrolling will be carried out.

	squish_check();

	/*
	 * do_repaint tells us not to display anything till the end, 
	 * then just repaint the entire screen.
	 * We repaint if we are supposed to display only the last 
	 * screenful and the request is for more than a screenful.
	 * Also if the request exceeds the forward scroll limit
	 * (but not if the request is for exactly a screenful, since
	 * repainting itself involves scrolling forward a screenful).
	 */
	do_repaint = (only_last && n > sc_height-1) || 
		(forw_scroll >= 0 && n > forw_scroll && n != sc_height-1);

	if (!do_repaint)
	{
		if (top_scroll && n >= sc_height - 1 && pos != ch_length())
		{
			/*
			 * Start a new screen.
			 * {{ This is not really desirable if we happen
			 *    to hit eof in the middle of this screen,
			 *    but we don't yet know if that will happen. }}
			 */
			pos_clear();
			add_forw_pos(pos);
			force = 1;
			if (top_scroll == OPT_ONPLUS || first_time)
				clear();
			home();
		} else
		{
			clear_bot();
			/*
			 * Remove the top n lines and scroll the rest
			 * upward, leaving cursor at first new blank line.
			 */
			remove_top(n);
		}

		if (pos != position(BOTTOM_PLUS_ONE) || empty_screen())
		{
			/*
			 * This is not contiguous with what is
			 * currently displayed.  Clear the screen image 
			 * (position table) and start a new screen.
			 */
			pos_clear();
			add_forw_pos(pos);
			force = 1;
			if (top_scroll)
			{
				if (top_scroll == OPT_ONPLUS)
					clear();
				home();
			} else if (!first_time)
			{
				putstr("...skipping...\n");
			}
		}
	}
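
The do_repaint decision above depends on only four values, so it can be restated as a standalone function for experimentation. This is a sketch under stated assumptions: the function name should_repaint is hypothetical, and the values that forw reads from program-wide variables (forw_scroll, sc_height) are passed in as parameters here so that the example is self-contained.

```c
#include <stdbool.h>

/*
 * Restates the do_repaint expression from forw as a pure function.
 * Parameter names mirror the variables in the excerpt; taking them as
 * arguments is an assumption made for this sketch.
 */
bool should_repaint(int n, bool only_last, int forw_scroll, int sc_height)
{
    int screenful = sc_height - 1;   /* the excerpt treats sc_height-1 lines as a screenful */

    /* Only the last screenful is wanted and the request exceeds one screen. */
    if (only_last && n > screenful)
        return true;

    /* The request exceeds the forward scroll limit, unless it is exactly a screenful. */
    return forw_scroll >= 0 && n > forw_scroll && n != screenful;
}
```

With a 24-line terminal, for example, a request for 30 lines with only_last set returns true, while a request for exactly 23 lines returns false regardless of the scroll limit.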

This third block is the main loop, which executes once for each line of output that is displayed to the screen. The loop pulls a single line from the file, updates the table used to track position within the file, prints the line to the screen, and checks to see if special output formatting is necessary.

	while (--n >= 0)
	{
		/*
		 * Read the next line of input.
		 */
		if (nblank > 0)
		{
			/*
			 * Still drawing blanks; don't get a line 
			 * from the file yet.
			 * If this is the last blank line, get ready to
			 * read a line starting at ch_zero() next time.
			 */
			if (--nblank == 0)
				pos = ch_zero();
		} else
		{
			/* 
			 * Get the next line from the file.
			 */
			pos = forw_line(pos);
			if (pos == NULL_POSITION)
			{
				/*
				 * End of file: stop here unless the top line 
				 * is still empty, or "force" is true.
				 * Even if force is true, stop when the last
				 * line in the file reaches the top of screen.
				 */
				eof = 1;
				if (!force && position(TOP) != NULL_POSITION)
					break;
				if (!empty_lines(0, 0) && 
				    !empty_lines(1, 1) &&
				     empty_lines(2, sc_height-1))
					break;
			}
		}
		/*
		 * Add the position of the next line to the position table.
		 * Display the current line on the screen.
		 */
		add_forw_pos(pos);
		nlines++;
		if (do_repaint)
			continue;
		/*
		 * If this is the first screen displayed and
		 * we hit an early EOF (i.e. before the requested
		 * number of lines), we "squish" the display down
		 * at the bottom of the screen.
		 * But don't do this if a + option or a -t option
		 * was given.  These options can cause us to
		 * start the display after the beginning of the file,
		 * and it is not appropriate to squish in that case.
		 */
		if (first_time && pos == NULL_POSITION && !top_scroll && 
#if TAGS
		    tagoption == NULL &&
#endif
		    !plusoption)
		{
			squished = 1;
			continue;
		}
		if (top_scroll == OPT_ON)
			clear_eol();
		put_line();
	}

Finally there is a post-loop block that calls formatting and clean-up functions. This code is run after each page of output has been processed. Notably, this is where the repaint function is called to change what the user sees on his or her screen.

	if (ignore_eoi)
		hit_eof = 0;
	else if (eof && !ABORT_SIGS())
		hit_eof++;
	else
		eof_check();
	if (nlines == 0)
		eof_bell();
	else if (do_repaint)
		repaint();
	first_time = 0;
	(void) currline(BOTTOM);
}
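
The loop above records each line's file position through add_forw_pos. One way such a position table could behave is sketched below: a fixed array with one slot per screen line, where adding a position at the bottom shifts every other entry up by one. The names table_clear, table_add_forward, table_top, and table_bottom, the four-line screen, and the -1 sentinel are all assumptions made for illustration, not identifiers from less.

```c
#include <string.h>

#define SCREEN_LINES 4          /* assumed tiny screen, for illustration only */
#define NO_POSITION  (-1L)      /* stand-in for an empty table entry */

/* One file offset per screen line; index 0 is the top line. */
static long table[SCREEN_LINES];

/* Mark every screen line as empty. */
void table_clear(void)
{
    for (int i = 0; i < SCREEN_LINES; i++)
        table[i] = NO_POSITION;
}

/* Scroll the table up one line and record pos for the new bottom line. */
void table_add_forward(long pos)
{
    memmove(&table[0], &table[1], (SCREEN_LINES - 1) * sizeof table[0]);
    table[SCREEN_LINES - 1] = pos;
}

long table_top(void)    { return table[0]; }
long table_bottom(void) { return table[SCREEN_LINES - 1]; }
```

Because the table always holds the offset of the line at the top of the screen, a pager built this way knows where the current screen begins without re-reading the file.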

Abstraction Level 2: purpose of low-level modules

For a C program, a sensible place to start an Abstraction Level 2 analysis is to list every function, giving a short description of its purpose, and group those functions by source file. However, the newer less program contains approximately 10 times as much code as the older more program and is spread across 35 files. We observed a rough correspondence between each function of more and each file of less.

more

The source of more is divided among 44 functions:

less

As noted above, the source files of less correspond roughly to the functions of more and therefore Abstraction Level 2 is a summary of the capabilities within each file:

Abstraction Level 3: purpose of high-level modules

To go up from Level 2 to Level 3, we simply collect the headlines for each group of functions or files. These headlines are the same for both less and more:

Abstraction Level 4: behavior

Filtration of Abstraction Level 1

Of the example code that we selected at Abstraction Level 1, most sections outside of the main loop should be filtered.

The first blocks, which contain comments, parameter lists, and variable initializations, should be filtered down to just the parameter lists. The comments at the top of a function are conventional, and variable declaration and initialization are mostly dictated by the syntactical requirements of the C language. The second and fourth blocks of less and the third block of more contain initialization and clean-up code that is primarily dictated by the nature of terminal output and common user interface design conventions. Therefore much of these blocks may be filtered out.

Here are the filtered functions, also stripped of comments:

more

screen (f, num_lines)
...
{
...
    for (;;) {
	while (num_lines > 0 && !Pause) {
	    if ((nchars = getline (f, &length)) == EOF)
	    {
		if (clreol)
		    clreos();
		return;
	    }
...
	    prbuf (Line, length);
...
	    num_lines--;
	}
...
	screen_start.line = Currline;
	screen_start.chrctr = Ftell (f);
    }
}

less

	public void
forw(n, pos, force, only_last, nblank)
...
{
...
	while (--n >= 0)
	{
...
		pos = forw_line(pos);
		if (pos == NULL_POSITION)
		{
			eof = 1;
			if (!force && position(TOP) != NULL_POSITION)
				break;
			if (!empty_lines(0, 0) && !empty_lines(1, 1) && empty_lines(2, sc_height-1))
				break;
		}
...
		add_forw_pos(pos);
...
		put_line();
	}
...
}

Both programs loop on a variable representing the line count and decrement it inside a while loop. Because this is common practice, that similarity should also be filtered out; the constructs were left in the example above only for readability.

Filtration of Abstraction Level 2

At Level 2 we filter out functional units not material to the programs in question. In the case of more and less, the command processing, operating system interface, display formatting, basic input/output, and utility units exist primarily to interface with external components of the Unix operating system and to construct a user interface that conforms to the expectations of users of terminal applications.

In addition, some of the units at Level 2 are re-implementations of or wrappers around common system library functions. These units do not relate to the core purpose of the application and have also been filtered out in the following:

more

less

Filtration of Abstraction Level 3

Based upon the function and file filtration at Abstraction Level 2, just two higher-level groups remain in Abstraction Level 3:

Comparison

At Abstraction Level 4, the programs differ only in the ability to page both backwards and forwards through a file.

At Abstraction Level 3, the programs' architectural similarities are apparent. Both programs include modules for paging files and navigating those pages via a search mechanism.

At Abstraction Level 2, the many extra features provided by less are revealed. Whereas more allows for forward scrolling through the file and forward searching based on regular expressions, less allows both forward and backward scrolling, horizontal scrolling, bracket matching, bi-directional regular expression searching, line numbering, custom key bindings, and customizable command prompts. At this level we also notice that the modules from Abstraction Level 3 are segmented into functional units differently.

This dissimilarity is magnified at Abstraction Level 1. The construction of more's paging routine reflects an assumption that the user will be moving forward through the file one page at a time. less, on the other hand, is more flexible: it advances by a given amount, and it goes forward only when the forw function is called. The strategy used with respect to segmentation of functional units reflects these differences. For example, less calls other functions to update the comparatively complex table structure it uses to keep track of offsets for each line, whereas more uses a simple variable to track just the current line's offset. While they share a general approach to paging files, the differences we first noticed at Level 2 are very much present at Level 1 as well.
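
To see why tracking an offset for every line matters, consider the following sketch. It is not drawn from either program: index_lines and read_line_at are hypothetical names, and the example assumes a seekable file in which counting bytes from the start yields a valid offset (true of binary streams on Unix). Once each line's starting offset is known, jumping to an earlier line is a single seek, which a lone "current offset" variable cannot provide.

```c
#include <stdio.h>

/*
 * Record the starting byte offset of every line in f, up to max entries.
 * Returns the number of lines indexed.
 */
int index_lines(FILE *f, long offsets[], int max)
{
    int count = 0;
    int c;
    long pos = 0;            /* bytes read so far */
    int at_line_start = 1;   /* true when the next byte begins a line */

    rewind(f);
    while ((c = getc(f)) != EOF) {
        if (at_line_start && count < max)
            offsets[count++] = pos;
        at_line_start = (c == '\n');
        pos++;
    }
    return count;
}

/*
 * Seek to the start of the given line (0-based) and read it into buf.
 * Returns 0 on success, -1 if the line number is out of range or the
 * seek or read fails.
 */
int read_line_at(FILE *f, const long offsets[], int count,
                 int line, char *buf, int size)
{
    if (line < 0 || line >= count)
        return -1;
    if (fseek(f, offsets[line], SEEK_SET) != 0)
        return -1;
    return fgets(buf, size, f) != NULL ? 0 : -1;
}
```

With the index in hand, paging backward by one screen amounts to calling read_line_at with a smaller line number; the cost is the memory for the table and the bookkeeping to keep it current.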

Finally, at Abstraction Level 0, we see a fundamental difference in the composition of the two programs. They are both UNIX paging utilities written in the C programming language, but differ in significant ways:


johnpatrickmorgan@gmail.com