1 files changed, 102 insertions, 0 deletions
diff --git a/debian/htdig/htdig-3.2.0b6/htdoc/hts_method.html b/debian/htdig/htdig-3.2.0b6/htdoc/hts_method.html
new file mode 100644
index 00000000..d4a7c676
--- /dev/null
+++ b/debian/htdig/htdig-3.2.0b6/htdoc/hts_method.html
@@ -0,0 +1,102 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+  <head>
+	<title>
+	  ht://Dig: htsearch
+	</title>
+  </head>
+  <body bgcolor="#eef7ff">
+	<h1>
+	  htsearch
+	</h1>
+	<p>
+	  ht://Dig Copyright &copy; 1995-2004 <a href="THANKS.html">The ht://Dig Group</a><br>
+	  Please see the file <a href="COPYING">COPYING</a> for
+	  license information.
+	</p>
+	<hr size="4" noshade>
+	<h2>
+	  Search Method Used
+	</h2>
+	<p>
+	  The way htsearch performs its search and applies its ranking
+	  rules is fairly complicated. This is an attempt at explaining
+	  in global terms what goes on when htsearch searches.
+	</p>
+	<p>
+	  htsearch gets a list of (case insensitive) words from the HTML
+	  form that invoked
+	  it. If htsearch was invoked with boolean expression parsing
+	  enabled, it will do a quick syntax check on the input words.
+	  If there are syntax errors, it will display the syntax error
+	  file that is specified with the
+	  <a href="attrs.html#syntax_error_file">syntax_error_file</a>
+	  attribute.
+	</p>
+	<p>
+	  If the boolean parser was not enabled, the list of words is
+	  converted into a boolean expression by putting either "and"s
+	  or "or"s between the words. (This depends on the search
+	  type.)  Phrases within double quotes (") specify that the words
+	  must occur sequentially within the document.
+	</p>
+	<p>
+	  If a word is immediately preceded by a field specifier
+	  (title:, heading:, author:, keyword:, descr:, link:, url:)
+	  then it will only match documents in which the word occurred
+	  within that field.  For example, descr:foo only matches documents
+	  containing &lt;meta name="description" content="... foo ..."&gt;.
+	  The link: field refers to the text in the hyperlinks to a document,
+	  rather than text within the document itself.  Similarly url:
+	  (will eventually) refer to the actual URL of the document, not any
+	  of its contents.
+	  The prefixes exact: and hidden: are also accepted.
+	  The former (will) cause the
+	  <a href="attrs.html#search_algorithm">fuzzy search algorithm</a>
+	  not to be applied to this word, while the latter causes the word
+	  not to be displayed in the query string of the results page.
+	</p>
+	<p>
+	  Each of the words in the list (but not within a phrase) is now
+	  expanded using the search algorithms that were specified in the
+	  <a href="attrs.html#search_algorithm">search_algorithm</a>
+	  attribute. For example, the endings algorithm will convert a
+	  word like "person" into "person or persons". In this fashion,
+	  all the specified algorithms are used on each of the words
+	  and the result is a new boolean expression.
+	</p>
+	<p>
+	  The next step is to perform database lookups on the words in
+	  the expression. The results of these lookups are then passed
+	  to the boolean expression parser.
+	</p>
+	<p>
+	  The boolean expression parser is a simple recursive descent
+	  parser with an operand stack. It knows how to deal with
+	  "not", "and", "or" and parentheses. The result of the parser
+	  will be one set of matches.<br>
+	  Note that the operator "not" means "without" and is binary:
+	  you cannot write "cat and not dog" or just "not dog", but
+	  you can write "cat not dog".
+	</p>
+	<p>
+	  At this point, the matches are ranked. The rank of a match is
+	  determined by the weight of the words that caused the match
+	  and the weight of the algorithm that generated the word. Word
+	  weights are generally determined by the importance of the
+	  word in a document. For example, words in the title of a
+	  document have a much higher weight than words at the bottom
+	  of the document.
+	</p>
+	<p>
+	  Finally, when the document ranks have been determined and the
+	  documents sorted, the resulting matches are displayed. If
+	  paged output is required, only a subset of all the matches
+	  will be displayed.
+	</p>
+	<hr size="4" noshade>
+
+	Last modified: $Date: 2004/05/28 13:15:18 $
+
+  </body>
+</html>