From 802da9dd5d4bc18f46a916eedc0c5c1980a15e59 Mon Sep 17 00:00:00 2001
From: Lorry Tar Creator
Date: Sun, 17 Mar 2013 20:07:05 +0000
Subject: Imported from /home/lorry/working-area/delta_docbook-xsl/docbook-xsl-1.78.1.tar.bz2.
---
 webhelp/docs/ch03s02.html | 178 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 178 insertions(+)
 create mode 100644 webhelp/docs/ch03s02.html

diff --git a/webhelp/docs/ch03s02.html b/webhelp/docs/ch03s02.html
new file mode 100644
index 0000000..8fd6f85
--- /dev/null
+++ b/webhelp/docs/ch03s02.html
@@ -0,0 +1,178 @@

Search - README: Web-based Help from DocBook XML

Search


An overview of the design of the search mechanism.

Search is implemented entirely on the client side; no server is involved. User queries are processed by JavaScript inside the browser, which finds the matching results by comparing the query with a simplified 'index' that also resides in JavaScript. The search mechanism has two main parts.

  • Indexing: First we need to traverse the content in the docs folder and index the words in it. This is done by webhelpindexer.jar in the xsl/extensions/ folder. You can invoke it with the ant index command from the root of the webhelp directory. The source of webhelpindexer has now moved to its own location at trunk/xsl-webhelpindexer/. Check out the DocBook trunk svn directory to get this source. Then make your changes and recompile by simply running the ant command. My assumption is that it can be opened in NetBeans IDE with one click. If you are using IntelliJ IDEA, you can create a new project from existing sources. The indexer supports word scoring, stemming of words, and the languages English, German, and French. For CJK (Chinese, Japanese, Korean) languages, it uses bi-gram tokenizing to break up the words, since CJK languages do not put spaces between words.

    When ant index is run, it generates five output files:

    • htmlFileList.js - This contains an array named fl, which stores details of all the files indexed by the indexer. It also defines doStem, which determines whether stemming should be used. It defaults to false.

    • htmlFileInfoList.js - This includes some metadata about the indexed files in an array named fil: the file name, the file (HTML) title, and a summary of the content. An entry looks like fil["4"]= "ch03.html@@@Developer Docs@@@This chapter provides an overview of how webhelp is implemented.";

    • index-*.js (three index files) - These three files store the actual index of the content. The index is added to an array named w.

  • Querying: Query processing happens entirely on the client side. The following JavaScript files handle it.

    • nwSearchFnt.js - This handles the user query and returns the search results. It tokenizes the query words, drops unnecessary punctuation and common words, applies stemming if the DocBook language supports it, and so on.

    • {$indexer-language-code}_stemmer.js - This includes the stemming library. nwSearchFnt.js calls the stemmer method in this file for stemming, e.g. var stem = stemmer(foobar);
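The bi-gram tokenizing mentioned under Indexing can be illustrated with a minimal JavaScript sketch. The real indexer is the Java program webhelpindexer.jar, so this is not its code: the function name bigrams is hypothetical, and the sketch assumes the input is an unbroken run of CJK characters.

```javascript
// Hypothetical helper, not part of webhelpindexer: split a run of
// characters into overlapping two-character tokens (bi-grams).
function bigrams(text) {
  // Array.from iterates by code point, so characters outside the
  // Basic Multilingual Plane are not cut in half.
  const chars = Array.from(text);
  if (chars.length < 2) {
    // A single character is kept as its own token; empty input gives no tokens.
    return chars;
  }
  const tokens = [];
  for (let i = 0; i < chars.length - 1; i++) {
    tokens.push(chars[i] + chars[i + 1]);
  }
  return tokens;
}

console.log(bigrams("全文検索"));
```

For the four-character string 全文検索 this yields the three overlapping tokens 全文, 文検 and 検索, so a query for any adjacent pair of characters can be matched without knowing where the word boundaries are.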

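The fil entry format described for htmlFileInfoList.js can be unpacked by splitting on the @@@ separator. The sketch below assumes every entry carries exactly the three fields listed (file name, title, summary); the function name parseFileInfo is hypothetical and is not taken from the webhelp sources.

```javascript
// Sample entry copied from the description of htmlFileInfoList.js.
const fil = [];
fil["4"] = "ch03.html@@@Developer Docs@@@This chapter provides an overview of how webhelp is implemented.";

// Hypothetical helper: split one entry into its three fields.
function parseFileInfo(entry) {
  const parts = entry.split("@@@");
  return {
    file: parts[0],
    title: parts[1],
    // Rejoin the rest in case a summary ever contains the separator itself.
    summary: parts.slice(2).join("@@@"),
  };
}

const info = parseFileInfo(fil["4"]);
console.log(info.file);  // ch03.html
console.log(info.title); // Developer Docs
```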

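The query steps listed for nwSearchFnt.js (tokenize, drop punctuation and common words, stem, look up) can be sketched roughly as follows. This is not the code of nwSearchFnt.js. The stop-word list, the layout of the index w (each word mapped to a list of file numbers), the name searchFiles, and the trivial stand-in for stemmer are all assumptions made for illustration; the real index files and stemmers may differ.

```javascript
// Assumed data, for illustration only; the real index-*.js layout may differ.
const w = { search: ["4"], index: ["4", "7"] };
const stopWords = new Set(["the", "a", "of", "in"]);

// Stand-in for the stemmer() supplied by the language-specific stemmer file.
function stemmer(word) {
  return word.replace(/(ing|es|s)$/, "");
}

// Hypothetical function: return the file numbers that match the query.
function searchFiles(query, doStem) {
  const words = query
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, " ") // drop punctuation
    .split(/\s+/)
    .filter((t) => t.length > 0 && !stopWords.has(t)); // drop common words
  const hits = new Set();
  for (const word of words) {
    const key = doStem ? stemmer(word) : word;
    // Only consult the index's own entries, not inherited properties.
    if (Object.prototype.hasOwnProperty.call(w, key)) {
      for (const fileNo of w[key]) hits.add(fileNo);
    }
  }
  return Array.from(hits);
}

console.log(searchFiles("Searching the index!", true));
```

With stemming enabled, "Searching" is reduced to "search" and so matches the indexed word; with doStem set to false, only the literal word "index" would match in this example.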
