This is an example of the Abstraction, Filtration, and Comparison of
two file paging utilities: more and less.
These utilities display text files in a terminal window one screen of
output at a time, allowing the user to view a document page by page.
The source of more consists of 1904 lines of code and comments in the C programming language, spread across 44 functions and packaged into a single source file. More was written by Daniel Halbert in 1978.
The source of less consists of 19825 lines of code and comments in the C programming language, spread across 255 functions and 35 different source files (concatenated to a single file for the preceding link). Less was written by Mark Nudelman between 1983 and 1985, motivated by a desire to add the capability of paging both backwards and forwards.
The code sections compared, screen from more and forw from less,
get the next screen's worth of output from the input file.
more does its paging with the screen function. The first block is a
comment describing the function's purpose plus code defining the
parameter list and initializing variables. The function takes two
parameters: the file to display, and the size of each page expressed
as vertical lines of text.
    /*
    ** Print out the contents of the file f, one screenful at a time.
    */

    #define STOP -10

    screen (f, num_lines)
    register FILE *f;
    register int num_lines;
    {
        register int c;
        register int nchars;
        int length;                 /* length of current line */
        static int prev_len = 1;    /* length of previous line */
The second block is the main loop where the heavy-lifting is done. Here single lines of text are read from the file, formatting checks are made, the line is output to the terminal, and the loop counter is updated. This is executed once for each line until the entire screen has been painted.
        for (;;) {
            while (num_lines > 0 && !Pause) {
                if ((nchars = getline (f, &length)) == EOF)
                {
                    if (clreol)
                        clreos();
                    return;
                }
                if (ssp_opt && length == 0 && prev_len == 0)
                    continue;
                prev_len = length;
                if (bad_so || (Senter && *Senter == ' ') && promptlen > 0)
                    erase (0);
                /* must clear before drawing line since tabs on some terminals
                 * do not erase what they tab over.
                 */
                if (clreol)
                    cleareol ();
                prbuf (Line, length);
                if (nchars < promptlen)
                    erase (nchars);     /* erase () sets promptlen to 0 */
                else promptlen = 0;
                /* is this needed?
                 * if (clreol)
                 *     cleareol();     * must clear again in case we wrapped *
                 */
                if (nchars < Mcol || !fold_opt)
                    prbuf("\n", 1);     /* will turn off UL if necessary */
                if (nchars == STOP)
                    break;
                num_lines--;
            }
The last block is executed after the page has been processed. Here the
output buffer is flushed, additional formatting checks are made, and
the variables (line and chrctr within the screen_start structure) that
track position within the file are updated.
            if (pstate) {
                tputs(ULexit, 1, putch);
                pstate = 0;
            }
            fflush(stdout);
            if ((c = Getc(f)) == EOF)
            {
                if (clreol)
                    clreos ();
                return;
            }

            if (Pause && clreol)
                clreos ();
            Ungetc (c, f);
            setjmp (restore);
            Pause = 0; startup = 0;
            if ((num_lines = command (NULL, f)) == 0)
                return;
            if (hard && promptlen > 0)
                erase (0);
            if (noscroll && num_lines >= dlines)
            {
                if (clreol)
                    home();
                else
                    doclear ();
            }
            screen_start.line = Currline;
            screen_start.chrctr = Ftell (f);
        }
    }
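The forward-only structure of screen can be illustrated with a much smaller sketch in present-day C. This is not code from more: the function name show_page, its parameters, and the fixed buffer size are assumptions made for illustration, and all terminal handling (prompt erasure, underlining, the command prompt between pages) is left out. It keeps only the parts described above: read a line, print it, count it, and record the file offset at which the next page begins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from more.c: copy up to num_lines lines
 * from in to out, then store the offset of the next unread byte in
 * *next_offset. Returns the number of lines actually shown.
 */
int show_page(FILE *in, FILE *out, int num_lines, long *next_offset)
{
    char line[1024];
    int shown = 0;

    assert(in != NULL && out != NULL && next_offset != NULL);

    while (shown < num_lines && fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        fputs(line, out);
        /* Count a line only when its newline (or end of file) was reached,
         * so a line longer than the buffer is not counted twice. */
        if ((len > 0 && line[len - 1] == '\n') || feof(in))
            shown++;
    }
    fflush(out);

    /* Like screen_start.chrctr above: remember where the next page starts. */
    *next_offset = ftell(in);
    return shown;
}
```

A caller would invoke show_page repeatedly, waiting for a keypress between calls, and stop when it returns fewer lines than requested. Because only one offset is kept, this design can resume forward but has no record of where earlier lines began.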
The forw function also starts with a comment describing the function's
purpose as well as code defining the parameter list and variable
initialization. Notice that in addition to terminal size, the forw
function takes in the position within the file as well as several
option control flags. One parameter we don't see here is a pointer to
a specific file.
    /*
     * Display n lines, scrolling forward,
     * starting at position pos in the input file.
     * "force" means display the n lines even if we hit end of file.
     * "only_last" means display only the last screenful if n > screen size.
     * "nblank" is the number of blank lines to draw before the first
     *   real line.  If nblank > 0, the pos must be NULL_POSITION.
     *   The first real line after the blanks will start at ch_zero().
     */
    public void
    forw(n, pos, force, only_last, nblank)
        register int n;
        POSITION pos;
        int force;
        int only_last;
        int nblank;
    {
        int eof = 0;
        int nlines = 0;
        int do_repaint;
        static int first_time = 1;
The next unit is called once prior to the display of each new page and carries out calculations that relate to the page as a whole. It initializes variables in preparation for the main line-by-line loop where the actual scrolling will be carried out.
        squish_check();

        /*
         * do_repaint tells us not to display anything till the end,
         * then just repaint the entire screen.
         * We repaint if we are supposed to display only the last
         * screenful and the request is for more than a screenful.
         * Also if the request exceeds the forward scroll limit
         * (but not if the request is for exactly a screenful, since
         * repainting itself involves scrolling forward a screenful).
         */
        do_repaint = (only_last && n > sc_height-1) ||
            (forw_scroll >= 0 && n > forw_scroll && n != sc_height-1);

        if (!do_repaint)
        {
            if (top_scroll && n >= sc_height - 1 && pos != ch_length())
            {
                /*
                 * Start a new screen.
                 * {{ This is not really desirable if we happen
                 *    to hit eof in the middle of this screen,
                 *    but we don't yet know if that will happen. }}
                 */
                pos_clear();
                add_forw_pos(pos);
                force = 1;
                if (top_scroll == OPT_ONPLUS || first_time)
                    clear();
                home();
            } else
            {
                clear_bot();
                /*
                 * Remove the top n lines and scroll the rest
                 * upward, leaving cursor at first new blank line.
                 */
                remove_top(n);
            }

            if (pos != position(BOTTOM_PLUS_ONE) || empty_screen())
            {
                /*
                 * This is not contiguous with what is
                 * currently displayed.  Clear the screen image
                 * (position table) and start a new screen.
                 */
                pos_clear();
                add_forw_pos(pos);
                force = 1;
                if (top_scroll)
                {
                    if (top_scroll == OPT_ONPLUS)
                        clear();
                    home();
                } else if (!first_time)
                {
                    putstr("...skipping...\n");
                }
            }
        }
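The do_repaint expression is the decision that governs the rest of this block, so it may help to see it in isolation. The sketch below restates the same boolean test as a standalone function. The name should_repaint is hypothetical, and passing sc_height and forw_scroll as parameters (they are variables defined outside forw in the excerpt) is an assumption made so the logic can be exercised on its own.

```c
#include <stdbool.h>

/*
 * Hypothetical restatement of the do_repaint test from forw().
 * n           : number of lines requested
 * sc_height   : screen height; the excerpt treats sc_height-1 as a screenful
 * forw_scroll : forward scroll limit, where a negative value means no limit
 * only_last   : show only the last screenful of a larger request
 */
bool should_repaint(int n, int sc_height, int forw_scroll, bool only_last)
{
    int screenful = sc_height - 1;

    /* Case 1: only the last screenful is wanted and more was requested. */
    if (only_last && n > screenful)
        return true;

    /* Case 2: the request exceeds the scroll limit, unless it is exactly
     * one screenful (repainting itself scrolls forward a screenful). */
    return forw_scroll >= 0 && n > forw_scroll && n != screenful;
}
```

With a 24-line screen and a scroll limit of 10, a request for 15 lines triggers a repaint, while a request for exactly 23 lines does not.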
This third block is the main loop, which executes once for each line of output that is displayed to the screen. The loop pulls a single line from the file, updates the table used to track position within the file, prints the line to the screen, and checks to see if special output formatting is necessary.
        while (--n >= 0)
        {
            /*
             * Read the next line of input.
             */
            if (nblank > 0)
            {
                /*
                 * Still drawing blanks; don't get a line
                 * from the file yet.
                 * If this is the last blank line, get ready to
                 * read a line starting at ch_zero() next time.
                 */
                if (--nblank == 0)
                    pos = ch_zero();
            } else
            {
                /*
                 * Get the next line from the file.
                 */
                pos = forw_line(pos);
                if (pos == NULL_POSITION)
                {
                    /*
                     * End of file: stop here unless the top line
                     * is still empty, or "force" is true.
                     * Even if force is true, stop when the last
                     * line in the file reaches the top of screen.
                     */
                    eof = 1;
                    if (!force && position(TOP) != NULL_POSITION)
                        break;
                    if (!empty_lines(0, 0) &&
                        !empty_lines(1, 1) &&
                         empty_lines(2, sc_height-1))
                        break;
                }
            }
            /*
             * Add the position of the next line to the position table.
             * Display the current line on the screen.
             */
            add_forw_pos(pos);
            nlines++;
            if (do_repaint)
                continue;
            /*
             * If this is the first screen displayed and
             * we hit an early EOF (i.e. before the requested
             * number of lines), we "squish" the display down
             * at the bottom of the screen.
             * But don't do this if a + option or a -t option
             * was given.  These options can cause us to
             * start the display after the beginning of the file,
             * and it is not appropriate to squish in that case.
             */
            if (first_time && pos == NULL_POSITION && !top_scroll &&
    #if TAGS
                tagoption == NULL &&
    #endif
                !plusoption)
            {
                squished = 1;
                continue;
            }
            if (top_scroll == OPT_ON)
                clear_eol();
            put_line();
        }
Finally there is a post-loop block that calls formatting and clean-up
functions. This code is run after each page of output has been
processed. Notably, this is where the repaint function is called to
change what the user sees on his or her screen.
        if (ignore_eoi)
            hit_eof = 0;
        else if (eof && !ABORT_SIGS())
            hit_eof++;
        else
            eof_check();
        if (nlines == 0)
            eof_bell();
        else if (do_repaint)
            repaint();
        first_time = 0;
        (void) currline(BOTTOM);
    }
For a C program, a sensible place to start an Abstraction Level 2
analysis is to list every function, giving a short description of its
purpose, and group those functions by source file. However, the newer
less program contains approximately 10 times as much code as the older
more program and is spread across 35 files. We observed a rough
correspondence between each function of more and each file of less.
The source of more is divided among 44 functions:
As noted above, the source files of less correspond roughly to the
functions of more and therefore Abstraction Level 2 is a summary of
the capabilities within each file:
To go up from Level 2 to Level 3, we simply collect the headlines for
each group of functions or files. These headlines are the same for both
less and more:
more: displays a file as a sequence of pages, waiting for user input
between each page
less: displays a file as a sequence of pages, waiting for user input
between each page, with the capability of going backwards or forwards
through the file
Of the example code that we selected at Abstraction Level 1, most sections outside of the main loop should be filtered.
The first blocks, which contain comments, parameter lists, and
variable initializations, should be filtered down to just the
parameter lists. The comments at the top of a function are
conventional, and variable declaration and initialization are mostly
dictated by the syntactical requirements of the C language. The
second and fourth blocks of less and the third block of more contain
initialization and clean-up code that is primarily dictated by the
nature of terminal output and common user interface design
conventions. Therefore much of these blocks may be filtered out.
Here are the filtered functions, also stripped of comments:
    screen (f, num_lines)
    ...
    {
        ...
        for (;;) {
            while (num_lines > 0 && !Pause) {
                if ((nchars = getline (f, &length)) == EOF)
                {
                    if (clreol)
                        clreos();
                    return;
                }
                ...
                prbuf (Line, length);
                ...
                num_lines--;
            }
            ...
            screen_start.line = Currline;
            screen_start.chrctr = Ftell (f);
        }
    }

    public void
    forw(n, pos, force, only_last, nblank)
    ...
    {
        ...
        while (--n >= 0)
        {
            ...
            pos = forw_line(pos);
            if (pos == NULL_POSITION)
            {
                eof = 1;
                if (!force && position(TOP) != NULL_POSITION)
                    break;
                if (!empty_lines(0, 0) &&
                    !empty_lines(1, 1) &&
                     empty_lines(2, sc_height-1))
                    break;
            }
            ...
            add_forw_pos(pos);
            ...
            put_line();
        }
        ...
    }
Note that both programs loop on a variable holding the remaining line count and decrement it inside a while loop. Because this is common practice, it should also be filtered out. However, these constructs were left in the above example for readability.
At Level 2 we filter out functional units not material to the
programs in question. In the case of more and less, the command
processing, operating system interface, display formatting, basic
input/output, and utility units exist primarily to interface with
external components of the Unix operating system and to construct a
user interface that conforms to the expectations of users of terminal
applications.
In addition, some of the units at Level 2 are re-implementations of or wrappers around common system library functions. These units do not relate to the core purpose of the application and have also been filtered out in the following:
Based upon the function and file filtration at Abstraction Level 2, just two higher-level groups remain in Abstraction Level 3:
At Abstraction Level 4, the programs differ only in the ability to page both backwards and forwards through a file.
At Abstraction Level 3, the programs' architectural similarities are apparent. Both programs include modules for paging files and navigating those pages via a search mechanism.
At Abstraction Level 2, the many extra features provided by less are
revealed. Whereas more allows for forward scrolling through the file
and forward searching based on regular expressions, less allows both
forward and backward scrolling, horizontal scrolling, bracket matching,
bi-directional regular expression searching, line numbering, custom
key bindings, and customizable command prompts. At this level we also
notice that the modules from Abstraction Level 3 are segmented into
functional units differently.
This dissimilarity is magnified at Abstraction Level 1. The construction of
more's paging routine reflects an assumption that the user will be moving
forward through the file one page at a time. less, on the other hand, is more
flexible in that it will advance by a given amount and only go forward if the
forw function is called. The strategy used with respect to segmentation of
functional units reflects these differences. For example, less calls other
functions to update the comparatively complex table structure it uses to keep
track of offsets for each line, whereas more uses a simple variable to track
just the current line's offset. While they share a general approach to paging
files, the differences we first noticed at Level 2 are very much present at
Level 1 as well.
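The contrast between a single offset and a per-line offset table can be made concrete with a small sketch. Nothing here is taken from either program: the type and function names, the four-line screen, and the NO_POSITION marker are all hypothetical, chosen only to show why a table of line offsets supports more kinds of movement than one saved offset.

```c
#include <assert.h>
#include <string.h>

#define SCREEN_LINES 4          /* hypothetical; a real screen is taller */
#define NO_POSITION  (-1L)      /* hypothetical marker for an empty slot */

/* more-style bookkeeping: one offset marks where the current screen starts. */
struct single_offset {
    long start;
};

/* less-style bookkeeping: one file offset for every line on the screen. */
struct pos_table {
    long line_start[SCREEN_LINES];
};

/* Mark every screen line as empty. */
void table_clear(struct pos_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < SCREEN_LINES; i++)
        t->line_start[i] = NO_POSITION;
}

/* Scroll forward one line: drop the top entry, append pos at the bottom. */
void table_add_forw(struct pos_table *t, long pos)
{
    assert(t != NULL);
    memmove(&t->line_start[0], &t->line_start[1],
            (SCREEN_LINES - 1) * sizeof t->line_start[0]);
    t->line_start[SCREEN_LINES - 1] = pos;
}

/* Offset of the line currently at the top of the screen. */
long table_top(const struct pos_table *t)
{
    return t->line_start[0];
}

/* Offset of the line currently at the bottom of the screen. */
long table_bottom(const struct pos_table *t)
{
    return t->line_start[SCREEN_LINES - 1];
}
```

With the table, the program always knows the offset of both the top and the bottom line, which is the information needed to continue in either direction. A single saved offset is enough only when every request continues forward from where the last one stopped.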
Finally at Abstraction Level 0 we see a fundamental difference in the composition of the two programs. They are both UNIX paging utilities written in the C programming language, but differ in significant ways:
... (less) rather than Kernighan and Ritchie-style implicit int typing (more)