org.basex.query.ft
Class Scoring

java.lang.Object
  extended by org.basex.query.ft.Scoring

public final class Scoring
extends Object

Simple default scoring model, assembling all scoring calculations.

Author:
Workgroup DBIS, University of Konstanz 2005-10, ISC License, Christian Gruen

Field Summary
static int MP
          Scoring multiplier to store values as integers.
 
Constructor Summary
Scoring()
           
 
Method Summary
 double and(double o, double n)
          Combines two scoring values.
 double let(double s, int c)
          Returns a score for the let clause.
 double not(double d)
          Inverts the scoring value for FTNot.
 double or(double o, double n)
          Combines two scoring values.
static double phrase(double w1, double w2)
          Returns the scoring value for a phrase.
static double step(double sc)
          Returns a score for a single step.
static double textNode(double npv, double is, double tokl, double tl)
          Returns the score for a text node.
static int tfIDF(double freq, double mfreq, double docs, double tokens)
          Returns a tf-idf for the specified values.
static double union(double w1, double w2)
          Returns the union value.
 double word(int tl, double l)
          Calculates a score value, based on the token length and complete text length.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MP

public static final int MP
Scoring multiplier to store values as integers.

See Also:
Constant Field Values
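
The following is a minimal sketch of what "storing values as integers" with a multiplier can look like. It is not taken from the BaseX source: the class `FixedPointSketch`, the methods `encode` and `decode`, and the multiplier value 1000 are all assumptions made for illustration, since the actual value of MP is not stated on this page.

```java
/** Illustrative sketch only; not part of org.basex.query.ft. */
final class FixedPointSketch {
  /** Assumed stand-in for Scoring.MP; the real value is not given here. */
  static final int MULTIPLIER = 1000;

  /** Scales a fractional score and rounds it to the nearest integer. */
  static int encode(double score) {
    return (int) Math.round(score * MULTIPLIER);
  }

  /** Recovers an approximate fractional score from its integer form. */
  static double decode(int stored) {
    return stored / (double) MULTIPLIER;
  }
}
```

With a multiplier of 1000, three decimal places survive the round trip; anything finer is lost to rounding.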

Constructor Detail

Scoring

public Scoring()

Method Detail

word

public double word(int tl,
                   double l)
Calculates a score value, based on the token length and complete text length.

Parameters:
tl - token length
l - complete length
Returns:
result

and

public double and(double o,
                  double n)
Combines two scoring values.

Parameters:
o - old value
n - new value
Returns:
result

or

public double or(double o,
                 double n)
Combines two scoring values.

Parameters:
o - old value
n - new value
Returns:
result

not

public double not(double d)
Inverts the scoring value for FTNot.

Parameters:
d - scoring value
Returns:
inverted scoring value

let

public double let(double s,
                  int c)
Returns a score for the let clause.

Parameters:
s - summed up scoring values
c - number of values
Returns:
new score value

tfIDF

public static int tfIDF(double freq,
                        double mfreq,
                        double docs,
                        double tokens)
Returns a tf-idf score for the specified values. Definition used: freq(i, j) / max(l, freq(l, j)) * log(1 + N / n(i)), where max(l, freq(l, j)) denotes the maximum of freq(l, j) over all tokens l. The result is multiplied by the MP constant to yield integer values. The value 2 is used as the minimum score: 1 is subtracted from the total minimum value, so this lower bound avoids scores of 0.

Parameters:
freq - frequency of the token. TF: freq(i, j)
mfreq - maximum number of occurrences of any token. TF: max(l, freq(l, j))
docs - number of documents in the collection. IDF: N
tokens - number of documents containing the token. IDF: n(i)
Returns:
score value
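
The definition above can be sketched as follows. This is an illustration under stated assumptions, not the BaseX implementation: the class name `TfIdfSketch`, the multiplier value 1000 (standing in for MP), the use of the natural logarithm, and truncation when converting to int are all choices made here, as the page does not specify them.

```java
/** Illustrative sketch only; not the BaseX source. */
final class TfIdfSketch {
  /** Assumed stand-in for Scoring.MP; the real value is not given here. */
  static final int MULTIPLIER = 1000;

  /**
   * Computes freq / mfreq * log(1 + docs / tokens), scaled by the
   * multiplier and clamped to a minimum of 2.
   */
  static int tfIdf(double freq, double mfreq, double docs, double tokens) {
    double tf = freq / mfreq;                  // freq(i, j) / max(l, freq(l, j))
    double idf = Math.log(1 + docs / tokens);  // log(1 + N / n(i)), natural log assumed
    // Truncation toward zero is an assumption; the clamp enforces the minimum of 2.
    return (int) Math.max(2, MULTIPLIER * tf * idf);
  }
}
```

Under these assumptions, a token that occurs 3 times where the most frequent token occurs 6 times, in a collection of 9 documents of which 1 contains the token, scores 1151. A frequency of 0 yields the minimum score of 2.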

textNode

public static double textNode(double npv,
                              double is,
                              double tokl,
                              double tl)
Returns the score for a text node. Used when no index score is available.

Parameters:
npv - number of pos values
is - index size
tokl - token length
tl - text length
Returns:
score value

phrase

public static double phrase(double w1,
                            double w2)
Returns the scoring value for a phrase.

Parameters:
w1 - score of word1
w2 - score of word2
Returns:
score of the phrase

union

public static double union(double w1,
                           double w2)
Returns the union value.

Parameters:
w1 - score of word1
w2 - score of word2
Returns:
score of the union

step

public static double step(double sc)
Returns a score for a single step.

Parameters:
sc - current score value
Returns:
new score value