Diffstat (limited to 'debian/htdig/htdig-3.2.0b6/htdoc/all.html')
-rw-r--r-- | debian/htdig/htdig-3.2.0b6/htdoc/all.html | 137 |
1 files changed, 137 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/htdoc/all.html b/debian/htdig/htdig-3.2.0b6/htdoc/all.html
new file mode 100644
index 00000000..1f625a57
--- /dev/null
+++ b/debian/htdig/htdig-3.2.0b6/htdoc/all.html
@@ -0,0 +1,137 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+  <head>
+    <title>
+      ht://Dig: Overview of Programs
+    </title>
+  </head>
+  <body bgcolor="#eef7ff">
+    <h1>
+      Overview of Programs
+    </h1>
+    <p>
+      ht://Dig Copyright © 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
+      Please see the file <a href="COPYING">COPYING</a> for
+      license information.
+    </p>
+    <hr size="4" noshade>
+    <p>
+      There are several programs in the ht://Dig package.
+    </p>
+    <h3>
+      <a href="htdig.html">htdig</a>
+    </h3>
+    <p>
+      Digging is the first step in creating a search database. This
+      system uses the word <em>digging</em> while other systems call
+      it <em>harvesting</em> or <em>gathering</em>. In the ht://Dig
+      system, the program <a href="htdig.html">htdig</a> performs
+      the information gathering stage. In this process, the program
+      will act as a regular web user, except that it will follow
+      <em>all</em> hyperlinks that it comes across. (Actually, it
+      will not follow all of them, just those that are within the
+      domain it needs to gather information on...)<br>
+      Each document it goes to is examined and all the unique
+      words in this document are extracted and stored.
+    </p>
+    <p>
+      The digging process will <em>only</em> follow links and has
+      no notion of JavaScript, applets, or user-input forms.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htsearch.html" target="_top">htsearch</a>
+    </h3>
+    <p>
+      Searching is where the users actually get to use all the
+      information that was gathered during the dig and merge
+      stages. The <a href="htsearch.html" target="_top">
+      htsearch</a> program performs the actual searches. It typically
+      produces <code>HTML</code> output which will be seen by the
+      users, though other text formats could be generated by
+      editing the output templates.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htmerge.html">htmerge</a>
+    </h3>
+    <p>
+      Merging does exactly that--it merges one database
+      into another. In previous versions of ht://Dig, the htmerge
+      program also formed databases for use by htsearch from the
+      htdig output. This process is now largely unnecessary except
+      for removal of invalid URLs which is now done by the htpurge
+      program.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htpurge.html">htpurge</a>
+    </h3>
+    <p>
+      Purging removes documents and the associated words from the
+      databases. This should be done after running htdig to remove
+      invalid URLs, documents marked not to be indexed, old
+      versions of modified documents, etc. You can also specify
+      specific URLs to be removed explicitly by htpurge.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htload.html">htload</a>
+    </h3>
+    <p>
+      Loading involves importing the contents of the databases
+      from formatted ASCII text documents as created by htdump or
+      the -t flag from htdig. This is, of course, destructive by
+      nature and data from the text files will replace any
+      conflicting data in the databases.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htdump.html">htdump</a>
+    </h3>
+    <p>
+      Dumping involves exporting the contents of the databases to
+      formatted ASCII text documents. This can be useful for
+      backups, transferring databases between different operating
+      systems, changing the compression or encodings in the
+      ht://Dig configuration, parsing by external utilities. It is
+      <em>not</em> recommended to edit these files by hand, so be
+      warned! (Minor edits will probably be fine.)
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htstat.html">htstat</a>
+    </h3>
+    <p>
+      The htstat program returns statistics on the databases,
+      similar to the -s flags for some of the programs. In
+      addition, it can return a list of URLs in the databases.
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htnotify.html">htnotify</a>
+    </h3>
+    <p>
+      The ht://Dig system includes a handy reminder service which
+      allows HTML authors to add some ht://Dig specific <a href="meta.html">meta
+      information</a> in HTML documents. This meta information is
+      used to email authors after a specified date. Very useful
+      to maintain lists that contain those annoying "new"
+      graphics with new items. (Hint: Things really aren't all
+      that new anymore after 6 months!)<br>
+    </p>
+    <hr noshade>
+    <h3>
+      <a href="htfuzzy.html">htfuzzy</a>
+    </h3>
+    <p>
+      To allow the searches to use "fuzzy" algorithms to match
+      words, the <a href="htfuzzy.html">htfuzzy</a> program can
+      create indexes for several different algorithms.
+    </p>
+    <hr size="4" noshade>
+
+    Last modified: $Date: 2004/05/28 13:15:17 $
+
+  </body>
+</html>