Diffstat (limited to 'debian/htdig/htdig-3.2.0b6/htdoc/all.html')
-rw-r--r-- debian/htdig/htdig-3.2.0b6/htdoc/all.html | 137
1 files changed, 137 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/htdoc/all.html b/debian/htdig/htdig-3.2.0b6/htdoc/all.html
new file mode 100644
index 00000000..1f625a57
--- /dev/null
+++ b/debian/htdig/htdig-3.2.0b6/htdoc/all.html
@@ -0,0 +1,137 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+ <head>
+ <title>
+ ht://Dig: Overview of Programs
+ </title>
+ </head>
+ <body bgcolor="#eef7ff">
+ <h1>
+ Overview of Programs
+ </h1>
+ <p>
+ ht://Dig Copyright &copy; 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
+ Please see the file <a href="COPYING">COPYING</a> for
+ license information.
+ </p>
+ <hr size="4" noshade>
+ <p>
+ There are several programs in the ht://Dig package.
+ </p>
+ <h3>
+ <a href="htdig.html">htdig</a>
+ </h3>
+ <p>
+ Digging is the first step in creating a search database. This
+ system uses the word <em>digging</em> while other systems call
+ it <em>harvesting</em> or <em>gathering</em>. In the ht://Dig
+ system, the program <a href="htdig.html">htdig</a> performs
+ the information gathering stage. In this process, the program
+ acts as a regular web user, except that it follows
+ <em>all</em> hyperlinks that it comes across. (Strictly, it
+ does not follow every link, only those that stay within the
+ domain it has been asked to gather information on.)<br>
+ Each document it visits is examined, and all the unique
+ words in that document are extracted and stored.
+ </p>
+ <p>
+ The digging process will <em>only</em> follow links and has
+ no notion of JavaScript, applets, or user-input forms.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htsearch.html" target="_top">htsearch</a>
+ </h3>
+ <p>
+ Searching is where the users actually get to use all the
+ information that was gathered during the dig and merge
+ stages. The <a href="htsearch.html" target="_top">
+ htsearch</a> program performs the actual searches. It typically
+ produces <code>HTML</code> output which will be seen by the
+ users, though other text formats could be generated by
+ editing the output templates.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htmerge.html">htmerge</a>
+ </h3>
+ <p>
+ Merging does exactly that: it merges one database
+ into another. In previous versions of ht://Dig, the htmerge
+ program also formed databases for use by htsearch from the
+ htdig output. That step is now largely unnecessary; the one
+ remaining task, removal of invalid URLs, is now done by the
+ htpurge program.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htpurge.html">htpurge</a>
+ </h3>
+ <p>
+ Purging removes documents and the associated words from the
+ databases. This should be done after running htdig to remove
+ invalid URLs, documents marked not to be indexed, old
+ versions of modified documents, etc. You can also name
+ specific URLs for htpurge to remove explicitly.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htload.html">htload</a>
+ </h3>
+ <p>
+ Loading involves importing the contents of the databases
+ from formatted ASCII text documents as created by htdump or
+ by the -t flag of htdig. This is, of course, destructive by
+ nature: data from the text files will replace any
+ conflicting data in the databases.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htdump.html">htdump</a>
+ </h3>
+ <p>
+ Dumping involves exporting the contents of the databases to
+ formatted ASCII text documents. This can be useful for
+ backups, transferring databases between different operating
+ systems, changing the compression or encodings in the
+ ht://Dig configuration, or parsing by external utilities. It is
+ <em>not</em> recommended to edit these files by hand, so be
+ warned! (Minor edits will probably be fine.)
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htstat.html">htstat</a>
+ </h3>
+ <p>
+ The htstat program returns statistics on the databases,
+ similar to the -s flag of some of the programs. In
+ addition, it can return a list of URLs in the databases.
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htnotify.html">htnotify</a>
+ </h3>
+ <p>
+ The ht://Dig system includes a handy reminder service which
+ allows HTML authors to add some ht://Dig specific <a href="meta.html">meta
+ information</a> in HTML documents. This meta information is
+ used to email authors after a specified date. This is very
+ useful for maintaining lists that flag new items with those
+ annoying &quot;new&quot; graphics. (Hint: Things really aren't
+ all that new anymore after 6 months!)
+ </p>
+ <hr noshade>
+ <h3>
+ <a href="htfuzzy.html">htfuzzy</a>
+ </h3>
+ <p>
+ To allow the searches to use &quot;fuzzy&quot; algorithms to match
+ words, the <a href="htfuzzy.html">htfuzzy</a> program can
+ create indexes for several different algorithms.
+ </p>
+ <hr size="4" noshade>
+
+ Last modified: $Date: 2004/05/28 13:15:17 $
+
+ </body>
+</html>