Perl Programming



Diagramming Web Pages with HTML-Tree


Ever get tired of poorly organized web pages that can't be navigated logically? This Perl script, html-tree, was designed to help with this and other common web site design problems. It's intended to be versatile enough to be used in a variety of ways.

html-tree maps any set of directories on a web server, or the entire server, and generates a graphical representation of the page hierarchy. Every mapped page appears as a link, so the map can be used both to view the full organization of the site and to navigate it.

The program is also highly configurable. A configuration file can list the directories to map and the directories to ignore. You can also specify which file types to map and which to ignore.

For an example, here's the page setup for the user pages at the trilug.org webserver: TriLUG Web Page Map. Warning: the page listing is rather long!

Here's a shorter example. The output of html-tree was pasted directly into this HTML page:

Scott Chilcote's Web Page Map

This page was last updated on December 9, 2003 21:29:30.

Select any of the page titles below to browse the web page.

Scott Chilcote's Page
|    
|----Bike Tour of the C&O Canal Towpath
|----A Bicycle Trip through Umstead Park
|----RTP Bicycle Commuting FAQ
|----Digression
|----Umstead Ride 2
|----Umstead Ride 3
|----Umstead Park 4
|----Umstead Ride 5
|----Umstead Ride 6
|----Umstead Ride 7
|----A Bicycle Trip through Umstead Park
|    
|----Enhancing vi and vim with Perl
|----Enhancing vi and vim with Perl
|----Super Sorts for Flat Files
|----Graphical Disk Fullness with 
|----Finding CPU Hogs
|----Diagramming Web Pages with HTML-Tree
|----Diagramming Web Pages with HTML-Tree
|----Searching a Directory Tree
|----TriLUG Web Site Map
|    
|----Toyota Prius Main Page
|----Toyota Prius Controls
|----Toyota Prius Fun
|----Toyota Prius Main Page
|----Toyota Prius Nav System
|    
|----Faster Program Editing with ctags
|----Working with Unix Object Libraries
|----Setting the Xterm Title Bar
|----Shell Directory Tree Climbing
|----Working with Unix Object Libraries
|    
|----Subwoofer Project
`----Subwoofer Project


How html-tree Works

The script takes the names of one or more top-level directories on a web site, recursively searches below those directories, and creates a map showing where all of the HTML files are. Each entry in the map is labeled with the text from the TITLE tag of the corresponding HTML file. The map is itself written as an HTML file, which can then be made available to web site visitors who want a straightforward way to browse the server.
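As a rough illustration of the labeling step, here is a minimal sketch of pulling the TITLE text out of a page. The subroutine name `extract_title` and the regular expression are assumptions made for this example; they are not taken from the html-tree source, which may do this differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not from html-tree itself): return the text of the
# first TITLE element in a string of HTML, or nothing if there is none.
sub extract_title {
    my ($html) = @_;

    # /i: tag names may be upper or lower case; /s: a title may span lines.
    if ($html =~ m{<title[^>]*>(.*?)</title>}is) {
        my $title = $1;
        $title =~ s/\s+/ /g;     # collapse runs of whitespace and newlines
        $title =~ s/^ | $//g;    # trim the single leading/trailing space left over
        return $title;
    }
    return;
}

my $page = "<html><head><TITLE>\n Super Sorts for Flat Files </TITLE></head></html>";
print extract_title($page), "\n";
```

A page with a missing or empty TITLE returns nothing here, which is the kind of problem the finished map makes easy to spot.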

These maps have many uses. Besides giving visitors a quick and convenient way to get around a site (especially when the pages themselves provide no navigation), they give site administrators useful information. Incorrectly titled pages often go overlooked; the map makes them visible. html-tree also frequently turns up pages that were never meant to be publicly available, so it pays to look over the map and fix these problems.

The core of this script is based on an algorithm from the original Programming Perl by Wall & Schwartz, page 56: a subroutine that recursively descends directory trees. Several modifications were needed to make this routine generate a server map that looked like a reasonable representation of the site. In particular, it had to be changed to list files first and directories afterward; otherwise the branches appeared above the HTML files they belonged with.
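The files-first ordering can be sketched as follows. This is not the routine from the book or from html-tree; the name `walk_html`, the indentation strings, and the choice to skip symbolic links are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sketch: collect the HTML files under $dir into @$out, listing
# the files of each directory before descending into its subdirectories.
sub walk_html {
    my ($dir, $depth, $out) = @_;

    opendir(my $dh, $dir) or return $out;    # skip unreadable directories
    my @names = sort grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);

    my (@files, @subdirs);
    for my $name (@names) {
        my $path = "$dir/$name";
        if (-d $path && !-l $path) {         # assumption: do not follow symlinks
            push @subdirs, $path;
        }
        elsif ($name =~ /\.html?$/i) {
            push @files, $path;
        }
    }

    # Files first, so each directory's pages appear above its sub-branches.
    push @$out, ("|    " x $depth) . "|----$_" for @files;
    walk_html($_, $depth + 1, $out) for @subdirs;
    return $out;
}

# Map the directory named on the command line, or the current one.
my $lines = walk_html($ARGV[0] // ".", 0, []);
print "$_\n" for @$lines;
```

Partitioning the directory entries before recursing is what keeps the pages of a directory together; recursing as each subdirectory is encountered would interleave deeper branches with the current level's files.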

How to Get html-tree

The script has been reasonably well commented and indented, although not rigorously so. Most of the script exists in the form of subroutines. No complex data structures have been used; instead, a large array of strings combined with separators is used to keep track of pages found. The source file, config files, and GIF images are combined into a gzipped TAR file. Click here to download this file.

To use this file, copy it to a work directory on your Unix system, then decompress the archive, list its contents, and extract it by typing:

    gzip -d html-tree.tar.gz
    tar tvf html-tree.tar
    tar xvf html-tree.tar

Then read html-tree.README for the rest of the (short) instructions.


The above text and scripts are Copyright 1997,2001,2003 © by Scott Chilcote.
