Diffstat (limited to 'debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html')
-rw-r--r-- | debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html | 239 |
1 files changed, 239 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html b/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html new file mode 100644 index 00000000..2acec1d2 --- /dev/null +++ b/debian/htdig/htdig-3.2.0b6/htdoc/htfuzzy.html @@ -0,0 +1,239 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> +<html> + <head> + <title> + ht://Dig: htfuzzy + </title> + </head> + <body bgcolor="#eef7ff"> + <h1> + htfuzzy + </h1> + <p> + ht://Dig Copyright © 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br> + Please see the file <a href="COPYING">COPYING</a> for + license information. + </p> + <hr size="4" noshade> + <dl> + <dd> + <h2> + Synopsis + </h2> + </dd> + <dd> + htfuzzy [-c <em>configfile</em>][-v] <em>algorithm</em> ... + </dd> + </dl> + <dl> + <dd> + <h2> + Description + </h2> + </dd> + <dd> + Htfuzzy creates indexes for different "fuzzy" search + algorithms. These indexes can then be used by the + <a href="htsearch.html" target="_top">htsearch</a> program. + </dd> + </dl> + <dl> + <dd> + <h2> + Options + </h2> + </dd> + <dd> + <dl compact> + <dt> + -c <em>configfile</em> + </dt> + <dd> + Use the specified configuration file instead of the + default. + </dd> + <dt> + -v + </dt> + <dd> + Verbose mode. Used once will provide progress feedback, + used more than once will overflow even the biggest + buffers. :-) + </dd> + </dl> + </dd> + </dl> + <dl> + <dd> + <h2> + Algorithms + </h2> + </dd> + <dd> + Indexes for the following search algorithms can currently + be created: + <dl> + <dt> + <strong>soundex</strong> + </dt> + <dd> + Creates a slightly modified <a href="http://www.sog.org.uk/cig/vol6/605tdrake.pdf">soundex</a> key database. + A soundex key encodes letters as digits, with similar + sounding letters (c, k, q) given the same digit. Vowels + are not coded. + Differences with the standard soundex algorithm are: + <ul> + <li> + Keys are 6 digits. + </li> + <li> + The first letter is also encoded. 
+ </li> + </ul> + </dd> + <dt> + <strong>metaphone</strong> + </dt> + <dd> + Creates a metaphone key database. This algorithm is + more specific to English, but will get fewer "weird" + matches than the soundex algorithm. + </dd> + <dt> + <strong>accents</strong> + </dt> + <dd> + Creates an accents key database. This algorithm will + map all accented letters to their unaccented + counterparts, so that a search for the unaccented + word will yield all variations of this word with + accents. + </dd> + <dt> + <strong>endings</strong> + </dt> + <dd> + Creates two databases which can be used to match common + word endings. The creation of these databases requires + a list of affix rules and a dictionary which uses those + affix rules. The format of the affix rules and + dictionary files are the ones used by the + <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html"> + ispell</a> program. Included with the distribution are + the affix rules for English and a fairly small English + dictionary. Other languages can be supported by getting + the appropriate affix rules and dictionaries. These are + available for many languages; check the ispell + distribution for more details. + </dd> + <dt> + <strong>synonyms</strong> + </dt> + <dd> + Creates a database of synonyms for words. It reads a + text database of synonyms and creates a database that + htsearch can then use. Each line of the text database + consists of words where the first word will have the + other words on that line as synonyms. + </dd> + </dl> + </dd> + </dl> + <dl> + <dd> + <h2> + Files + </h2> + </dd> + <dd> + <dl> + <dt> + <a href="attrs.html#config_dir">CONFIG_DIR</a>/htdig.conf + </dt> + <dd> + The default configuration file. 
+ </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.accents.db + </dt> + <dd> + (Output) Maps between characters with and without + accents for accents fuzzy rule + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.metaphone.db + </dt> + <dd> + (Output) Database of similar-sounding words for + metaphone fuzzy rule + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#database_dir">DATABASE_DIR</a>/db.soundex.db + </dt> + <dd> + (Output) Database of similar-sounding words for soundex + fuzzy rule + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#common_dir">COMMON_DIR</a>/english.0, <a href="attrs.html#common_dir">COMMON_DIR</a>/english.aff + </dt> + <dd> + (Input) List of words and affix rules used to generate + endings + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#common_dir">COMMON_DIR</a>/root2word.db, <a href="attrs.html#common_dir">COMMON_DIR</a>/word2root.db + </dt> + <dd> + (Output) Databases used for endings fuzzy rule + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#common_dir">COMMON_DIR</a>/synonyms + </dt> + <dd> + (Input) List of groups of words considered synonymous + </dd> + </dl> + <dl> + <dt> + <a href="attrs.html#common_dir">COMMON_DIR</a>/synonyms.db + </dt> + <dd> + (Output) Database used for synonyms fuzzy rule + </dd> + </dl> + </dd> + </dl> + <dl> + <dd> + <h2> + See Also + </h2> + </dd> + <dd> + <a href="htdig.html">htdig</a>, + <a href="htmerge.html">htmerge</a>, + <a href="htsearch.html" target="_top">htsearch</a>, + <a href="attrs.html">Configuration file format</a>, and + <a href="http://fmg-www.cs.ucla.edu/fmg-members/geoff/ispell.html"> + ispell</a>. + </dd> + </dl> + <hr size="4" noshade> + + Last modified: $Date: 2004/06/12 13:39:13 $ + + </body> +</html> |
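Reviewer note: the soundex entry in the Algorithms section of the added page describes the key format only in prose (letters become digits, vowels are not coded, keys are 6 digits, the first letter is encoded too). The Python sketch below illustrates that description only; it is not the code htfuzzy ships. The letter groups are those of classic Soundex, and the function name `soundex_key`, the zero padding, and the collapsing of adjacent repeated digits are assumptions made for illustration.

```python
# Illustrative sketch of the soundex variant described in htfuzzy.html.
# NOT the htfuzzy implementation: the names, the zero padding and the
# duplicate-collapsing rule are assumptions.

# Letter groups from classic Soundex; vowels and h, w, y carry no code.
SOUNDEX_GROUPS = {
    "bfpv": "1",
    "cgjkqsxz": "2",
    "dt": "3",
    "l": "4",
    "mn": "5",
    "r": "6",
}
CODES = {
    letter: digit
    for letters, digit in SOUNDEX_GROUPS.items()
    for letter in letters
}

KEY_LENGTH = 6  # the page states that keys are 6 digits


def soundex_key(word: str) -> str:
    """Return a 6-digit key in which the first letter is encoded as well."""
    digits = []
    previous = ""
    for ch in word.lower():
        code = CODES.get(ch, "")  # uncoded characters map to ""
        # Emit a digit only when it differs from the directly preceding code.
        if code and code != previous:
            digits.append(code)
        previous = code
        if len(digits) == KEY_LENGTH:
            break
    # Assumed padding: short keys are filled with zeros on the right.
    return "".join(digits).ljust(KEY_LENGTH, "0")


if __name__ == "__main__":
    for w in ("cat", "kat", "quit", "robert"):
        print(w, soundex_key(w))
```

Under these assumptions "cat", "kat" and "quit" share one key because c, k and q fall in the same group, which is the behaviour the page describes for similar-sounding letters.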