A Jupyter notebook outlining the Paice method of evaluating stemmers

tias112/weight_paice_method

 
 

paice_method

This Jupyter notebook extends Paice's method of evaluating stemmers to search applications. The approach introduces a weight for comparing the overstemming index (OI) and understemming index (UI) scores. The additional weight is calculated from word statistics of the application's vocabulary.

img.png
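For reference, Paice's original (unweighted) indices can be computed roughly as in the sketch below. It is a minimal illustration and not code from this repository: the function name `paice_indices`, the list-of-lists input, and the prefix stemmers in the usage note are assumptions, and each word is assumed to belong to exactly one concept group.

```python
from collections import Counter, defaultdict


def paice_indices(concept_groups, stem):
    """Return (UI, OI) for a stemmer using Paice's unweighted pair counts.

    concept_groups: list of word lists; each inner list is one concept group.
    stem: callable mapping a word to its stem (hypothetical interface).
    """
    total_words = sum(len(group) for group in concept_groups)

    gdmt = gdnt = gumt = 0.0  # global desired-merge, desired-non-merge, unachieved-merge totals
    stem_groups = defaultdict(Counter)  # stem -> {concept group index: word count}

    for index, group in enumerate(concept_groups):
        n = len(group)
        gdmt += 0.5 * n * (n - 1)            # pairs that should be merged
        gdnt += 0.5 * n * (total_words - n)  # pairs that should stay apart
        stems_in_group = Counter(stem(word) for word in group)
        # Pairs inside the group that ended up with different stems.
        gumt += 0.5 * sum(u * (n - u) for u in stems_in_group.values())
        for s, count in stems_in_group.items():
            stem_groups[s][index] += count

    gwmt = 0.0  # global wrongly-merged total
    for per_group in stem_groups.values():
        n_s = sum(per_group.values())
        # Pairs sharing a stem although they come from different concept groups.
        gwmt += 0.5 * sum(v * (n_s - v) for v in per_group.values())

    ui = gumt / gdmt if gdmt else 0.0
    oi = gwmt / gdnt if gdnt else 0.0
    return ui, oi
```

With the toy groups `[["connect", "connected", "connection"], ["cone", "cones"]]`, a stemmer that keeps only the first three characters merges every word into one stem, so UI is 0 and OI is 1. A stemmer that leaves words unchanged gives the opposite result: UI is 1 and OI is 0.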

The first advantage of this approach is that it yields a stemmer-quality metric that is sensitive to the target search application.

The second advantage of the proposed method is that words in large concept groups (for example, a group containing all forms of a verb) do not dominate the results. The weight of each word is equal.
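One way to make each word count equally is to score every word by the fraction of its possible merges that went wrong and then average over words, instead of summing pair counts that grow quadratically with group size. The sketch below illustrates that idea only. The notebook's actual formula may differ, and the name `weighted_indices`, the `weight` callable, and the per-word normalization are all assumptions made for illustration.

```python
from collections import Counter


def weighted_indices(concept_groups, stem, weight=None):
    """Return (UI, OI) as weighted per-word averages (illustrative variant).

    weight: optional callable giving a word's weight, e.g. derived from
    application vocabulary statistics; defaults to 1.0 for every word.
    """
    weight = weight or (lambda word: 1.0)
    words = [word for group in concept_groups for word in group]
    total = len(words)
    stem_size = Counter(stem(word) for word in words)  # words per stem, whole sample

    ui_sum = oi_sum = ui_weight = oi_weight = 0.0
    for group in concept_groups:
        n = len(group)
        in_group = Counter(stem(word) for word in group)  # words per stem, this group
        for word in group:
            s = stem(word)
            wt = weight(word)
            if n > 1:
                # Share of group-mates this word failed to merge with.
                ui_sum += wt * (n - in_group[s]) / (n - 1)
                ui_weight += wt
            if total > n:
                # Share of outside words this word was wrongly merged with.
                oi_sum += wt * (stem_size[s] - in_group[s]) / (total - n)
                oi_weight += wt

    ui = ui_sum / ui_weight if ui_weight else 0.0
    oi = oi_sum / oi_weight if oi_weight else 0.0
    return ui, oi
```

Because every word's score lies between 0 and 1, a group with many members cannot outweigh a small group unless the supplied weights say so.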

Contents

  • stemmers.py: Stemmer classes that extend the base Stemmer class.
  • solr_client.py: A Solr client with a basic API for analysis.
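The two modules could fit together roughly as sketched below. This is not the repository's code: the `stem` method, `SuffixStemmer`, `SolrStemmer`, `analysis_url`, `final_tokens`, and the injectable `fetch` parameter are hypothetical names. The sketch assumes Solr's field analysis handler at `/analysis/field` and a JSON response in which the `index` list alternates analyzer class names with token lists.

```python
import json
from abc import ABC, abstractmethod
from urllib.parse import urlencode
from urllib.request import urlopen


class Stemmer(ABC):
    """Hypothetical base class: subclasses map a word to its stem."""

    @abstractmethod
    def stem(self, word: str) -> str:
        ...


class SuffixStemmer(Stemmer):
    """Toy stemmer that strips the first matching suffix."""

    def __init__(self, suffixes=("ing", "ed", "s")):
        self.suffixes = suffixes

    def stem(self, word):
        for suffix in self.suffixes:
            # Keep at least three characters of the word.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word


def analysis_url(base_url, collection, field_type, text):
    """Build a field analysis request URL for one piece of text."""
    query = urlencode({
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": text,
        "wt": "json",
    })
    return f"{base_url}/{collection}/analysis/field?{query}"


def final_tokens(response, field_type):
    """Return token texts from the last index-time analysis stage (assumed layout)."""
    stages = response["analysis"]["field_types"][field_type]["index"]
    return [token["text"] for token in stages[-1]]


class SolrStemmer(Stemmer):
    """Stemmer that delegates to a Solr field type's analysis chain."""

    def __init__(self, base_url, collection, field_type, fetch=None):
        self.base_url = base_url
        self.collection = collection
        self.field_type = field_type
        # fetch is injectable so the class can be exercised without a live Solr.
        self.fetch = fetch or (lambda url: json.load(urlopen(url)))

    def stem(self, word):
        url = analysis_url(self.base_url, self.collection, self.field_type, word)
        tokens = final_tokens(self.fetch(url), self.field_type)
        return tokens[0] if tokens else word
```

Sharing one `stem` interface lets the same evaluation code score a local stemmer and a Solr analysis chain side by side.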

Example

report1.png
