From 802da9dd5d4bc18f46a916eedc0c5c1980a15e59 Mon Sep 17 00:00:00 2001
From: Lorry Tar Creator
Date: Sun, 17 Mar 2013 20:07:05 +0000
Subject: Imported from /home/lorry/working-area/delta_docbook-xsl/docbook-xsl-1.78.1.tar.bz2.
---
 webhelp/docs/ch03s02s01.html | 192 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100644 webhelp/docs/ch03s02s01.html

diff --git a/webhelp/docs/ch03s02s01.html b/webhelp/docs/ch03s02s01.html
new file mode 100644
index 0000000..e9d07ed
--- /dev/null
+++ b/webhelp/docs/ch03s02s01.html
@@ -0,0 +1,192 @@

New Stemmers - README: Web-based Help from DocBook XML

New Stemmers


Adding new stemmers is simple.

Currently, only English, French, and German stemmers are integrated into WebHelp, but the code is designed so that further stemmers can be added.

What you need:

  • You need two versions of the stemmer: one written in JavaScript and another in Java. Fortunately, Snowball provides Java stemmers for a number of popular languages, and these are already included with the package. You can see the full list in Adding support for other (non-CJKV) languages. If your language is listed there, you only need to find a JavaScript version of the stemmer. New stemmers are generally added at the Snowball Stemmers in other languages location. If a JavaScript stemmer for your language is available, download it. Otherwise, you can write a new stemmer in JavaScript by following the Snowball algorithm for your language; the algorithms are published at Snowball.

  • Name the JavaScript stemmer exactly {$language-code}_stemmer.js. For example, for Italian (it), name it it_stemmer.js. Then copy it to the docbook-webhelp/template/search/stemmers/ folder. (This assumes that docbook-webhelp is the root folder for WebHelp.)

    Note

    Make sure you have changed the webhelp.indexer.language property in build.properties to your language code.


  • Two small changes are then needed in the indexer:

    • Open docbook-webhelp/indexer/src/com/nexwave/nquindexer/IndexerTask.java in a text editor and add your language code to the supportedLanguages String array.

      Example 2. Add new language to supportedLanguages array

      Change the array from:

      private String[] supportedLanguages = {"en", "de", "fr", "cn", "ja", "ko"};
          // currently extended support available for
          // English, German, French and CJK (Chinese, Japanese, Korean) languages only.

      to:

      private String[] supportedLanguages = {"en", "de", "fr", "cn", "ja", "ko", "it"};
          // currently extended support available for
          // English, German, French, CJK (Chinese, Japanese, Korean), and Italian languages only.


    • Now open docbook-webhelp/indexer/src/com/nexwave/nquindexer/SaxHTMLIndex.java and find the code where the stemmer is initialized (search for SnowballStemmer stemmer;). Add a branch there that initializes the stemmer object for your language, as in the example. The stemmer class names are at: docbook-webhelp/indexer/src/com/nexwave/stemmer/snowball/ext/.

      Example 3. Initialize the correct stemmer based on the specified webhelp.indexer.language

      SnowballStemmer stemmer;
      if (indexerLanguage.equalsIgnoreCase("en")) {
          stemmer = new EnglishStemmer();
      } else if (indexerLanguage.equalsIgnoreCase("de")) {
          stemmer = new GermanStemmer();
      } else if (indexerLanguage.equalsIgnoreCase("fr")) {
          stemmer = new FrenchStemmer();
      } else if (indexerLanguage.equalsIgnoreCase("it")) { // If the language code is "it" (Italian)
          stemmer = new italianStemmer(); // Initialize the stemmer to an italianStemmer object.
      } else {
          stemmer = null;
      }
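
The two indexer changes can also be pictured together in a small standalone sketch. It is not the WebHelp indexer code: the class StemmerSelectionSketch, the Stemmer interface, and the helper methods are hypothetical stand-ins so that the example compiles on its own, and the map-based lookup is an alternative to the if/else chain in Example 3, not how SaxHTMLIndex.java is written. It shows that a language code has to appear both in the supported list (Example 2) and in the stemmer selection (Example 3), and that a supported language without a stemmer falls back to null.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

public class StemmerSelectionSketch {

    // Hypothetical stand-in for the SnowballStemmer base class,
    // so that this sketch compiles without the indexer sources.
    public interface Stemmer {
        String name();
    }

    // Mirrors the supportedLanguages array from Example 2, with "it" added.
    static final List<String> SUPPORTED_LANGUAGES =
            Arrays.asList("en", "de", "fr", "cn", "ja", "ko", "it");

    // Language code -> factory for the matching stemmer.
    // The names refer to the classes used in Example 3; the objects are stand-ins.
    static final Map<String, Supplier<Stemmer>> STEMMERS = new LinkedHashMap<>();
    static {
        STEMMERS.put("en", () -> named("EnglishStemmer"));
        STEMMERS.put("de", () -> named("GermanStemmer"));
        STEMMERS.put("fr", () -> named("FrenchStemmer"));
        STEMMERS.put("it", () -> named("italianStemmer")); // the newly added language
    }

    // Creates a stand-in stemmer that only reports the class name it represents.
    static Stemmer named(String className) {
        return () -> className;
    }

    // Case-insensitive check against the supported language list.
    public static boolean isSupported(String languageCode) {
        return languageCode != null
                && SUPPORTED_LANGUAGES.contains(languageCode.toLowerCase(Locale.ROOT));
    }

    // Returns the stemmer for the language, or null when none is registered
    // (the same fallback as the final else branch in Example 3).
    public static Stemmer stemmerFor(String languageCode) {
        if (languageCode == null) {
            return null;
        }
        Supplier<Stemmer> factory = STEMMERS.get(languageCode.toLowerCase(Locale.ROOT));
        return factory == null ? null : factory.get();
    }

    public static void main(String[] args) {
        for (String code : new String[] {"it", "IT", "ja", "xx"}) {
            Stemmer stemmer = stemmerFor(code);
            System.out.println(code + ": supported=" + isSupported(code)
                    + ", stemmer=" + (stemmer == null ? "none" : stemmer.name()));
        }
    }
}
```

In this sketch, adding another language means one new entry in the list and one new entry in the map, which corresponds to the two edits described above.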



That's all. Now run ant build-indexer to compile and build the Java code. Then run ant webhelp to generate the output from your DocBook file. For any questions, contact us or email the docbook mailing list.
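
For the Italian example used throughout this section, the language property from the note above would look roughly like this in build.properties. This is a sketch of the one relevant line only; the comment line is illustrative, and any other properties in your file stay as they are.

```properties
# build.properties (fragment): "it" is the example language code from this section
webhelp.indexer.language=it
```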
