summaryrefslogtreecommitdiffstats
path: root/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html
diff options
context:
space:
mode:
Diffstat (limited to 'debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html')
-rw-r--r--debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html239
1 files changed, 239 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html b/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html
new file mode 100644
index 00000000..2acec1d2
--- /dev/null
+++ b/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html
@@ -0,0 +1,239 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+ <head>
+ <title>
+ ht://Dig: htfuzzy
+ </title>
+ </head>
+ <body bgcolor="#eef7ff">
+ <h1>
+ htfuzzy
+ </h1>
+ <p>
+ ht://Dig Copyright &copy; 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
+ Please see the file <a href="COPYING">COPYING</a> for
+ license information.
+ </p>
+ <hr size="4" noshade>
+ <dl>
+ <dd>
+ <h2>
+ Synopsis
+ </h2>
+ </dd>
+ <dd>
+ htfuzzy [-c <em>configfile</em>][-v] <em>algorithm</em> ...
+ </dd>
+ </dl>
+ <dl>
+ <dd>
+ <h2>
+ Description
+ </h2>
+ </dd>
+ <dd>
+ Htfuzzy creates indexes for different "fuzzy" search
+ algorithms. These indexes can then be used by the
+ <a href="htsearch.html" target="_top">htsearch</a> program.
+ </dd>
+ </dl>
+ <dl>
+ <dd>
+ <h2>
+ Options
+ </h2>
+ </dd>
+ <dd>
+ <dl compact>
+ <dt>
+ -c <em>configfile</em>
+ </dt>
+ <dd>
+ Use the specified configuration file instead of the
+ default.
+ </dd>
+ <dt>
+ -v
+ </dt>
+ <dd>
+ Verbose mode. Used once will provide progress feedback,
+ used more than once will overflow even the biggest
+ buffers. :-)
+ </dd>
+ </dl>
+ </dd>
+ </dl>
+ <dl>
+ <dd>
+ <h2>
+ Algorithms
+ </h2>
+ </dd>
+ <dd>
+ Indexes for the following search algorithms can currently
+ be created:
+ <dl>
+ <dt>
+ <strong>soundex</strong>
+ </dt>
+ <dd>
+ Creates a slightly modified <a href="http://www.sog.org.uk/cig/vol6/605tdrake.pdf">soundex</a> key database.
+ A soundex key encodes letters as digits, with similar
+ sounding letters (c, k, q) given the same digit. Vowels
+ are not coded.
+ Differences with the standard soundex algorithm are:
+ <ul>
+ <li>
+ Keys are 6 digits.
+ </li>
+ <li>
+ The first letter is also encoded.
+ </li>
+ </ul>
+ </dd>
+ <dt>
+ <strong>metaphone</strong>
+ </dt>
+ <dd>
+ Creates a metaphone key database. This algorithm is
+ more specific to English, but will get fewer "weird"
+ matches than the soundex algorithm.
+ </dd>
+ <dt>
+ <strong>accents</strong>
+ </dt>
+ <dd>
+ Creates an accents key database. This algorithm will
+ map all accented letters to their unaccented
+ counterparts, so that a search for the unaccented
+ word will yield all variations of this word with
+ accents.
+ </dd>
+ <dt>
+ <strong>endings</strong>
+ </dt>
+ <dd>
+ Creates two databases which can be used to match common
+ word endings. The creation of these databases requires
+ a list of affix rules and a dictionary which uses those
+ affix rules. The format of the affix rules and
+ dictionary files are the ones used by the
+ <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html">
+ ispell</a> program. Included with the distribution are
+ the affix rules for English and a fairly small English
+ dictionary. Other languages can be supported by getting
+ the appropriate affix rules and dictionaries. These are
+ available for many languages; check the ispell
+ distribution for more details.
+ </dd>
+ <dt>
+ <strong>synonyms</strong>
+ </dt>
+ <dd>
+ Creates a database of synonyms for words. It reads a
+ text database of synonyms and creates a database that
+ htsearch can then use. Each line of the text database
+ consists of words where the first word will have the
+ other words on that line as synonyms.
+ </dd>
+ </dl>
+ </dd>
+ </dl>
+ <dl>
+ <dd>
+ <h2>
+ Files
+ </h2>
+ </dd>
+ <dd>
+ <dl>
+ <dt>
+ <a href="attrs.html#config_dir">CONFIG_DIR</a>/htdig.conf
+ </dt>
+ <dd>
+ The default configuration file.
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.accents.db
+ </dt>
+ <dd>
+ (Output) Maps between characters with and without
+ accents for accents fuzzy rule
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.metaphone.db
+ </dt>
+ <dd>
+ (Output) Database of similar-sounding words for
+ metaphone fuzzy rule
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.soundex.db
+ </dt>
+ <dd>
+ (Output) Database of similar-sounding words for soundex
+ fuzzy rule
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#common_dir">COMMON_DIR</a>/english.0, <a href="attrs.html#common_dir">COMMON_DIR</a>/english.aff
+ </dt>
+ <dd>
+ (Input) List of words and affix rules used to generate
+ endings
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#common_dir">COMMON_DIR</a>/root2word.db, <a href="attrs.html#common_dir">COMMON_DIR</a>/word2rood.db
+ </dt>
+ <dd>
+ (Output) Database used for endings fuzzy rule
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#common_dir">COMMON_DIR</a>/synonyms
+ </dt>
+ <dd>
+ (Input) List of groups of words considered synonymous
+ </dd>
+ </dl>
+ <dl>
+ <dt>
+ <a href="attrs.html#common_dir">COMMON_DIR</a>/synonyms.db
+ </dt>
+ <dd>
+ (Output) Database used for synonyms fuzzy rule
+ </dd>
+ </dl>
+ </dd>
+ </dl>
+ <dl>
+ <dd>
+ <h2>
+ See Also
+ </h2>
+ </dd>
+ <dd>
+ <a href="htdig.html">htdig</a>,
+ <a href="htmerge.html">htmerge</a>,
+ <a href="htsearch.html" target="_top">htsearch</a>,
+ <a href="attrs.html">Configuration file format</a>, and
+ <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html">
+ ispell</a>.
+ </dd>
+ </dl>
+ <hr size="4" noshade>
+
+ Last modified: $Date: 2004/06/12 13:39:13 $
+
+ </body>
+</html>