org.basex.index
Class FTTokenizer

java.lang.Object
  extended by org.basex.index.IndexToken
      extended by org.basex.index.FTTokenizer

public final class FTTokenizer
extends IndexToken

Full-text tokenizer. Iterates over the tokens of a text that is supplied as a byte array.

Author:
Workgroup DBIS, University of Konstanz 2005-08, ISC License, Christian Gruen

Nested Class Summary
 
Nested classes/interfaces inherited from class org.basex.index.IndexToken
IndexToken.Type
 
Field Summary
 boolean cs
          Case sensitivity flag.
 boolean dc
          Diacritics flag.
 boolean fz
          Fuzzy flag.
 boolean lc
          Lowercase flag.
 boolean lp
          Flag for loading ftposition data.
 int p
          Current character position.
 int para
          Current paragraph.
 int pos
          Current token position.
 int s
          Character start position.
 int sent
          Current sentence.
 boolean st
          Stemming flag.
 boolean uc
          Uppercase flag.
 boolean wc
          Wildcard flag.
 
Fields inherited from class org.basex.index.IndexToken
text, type
 
Constructor Summary
FTTokenizer()
          Empty constructor.
FTTokenizer(byte[] txt)
          Constructor, specifying the text to be tokenized.
 
Method Summary
 int count()
          Counts the number of tokens.
 byte[] get()
          Returns the current index token.
 TokenList getTokenList()
          Converts the tokens to a TokenList.
 void init()
          Initializes the iterator.
 void init(byte[] txt)
          Sets the text.
 boolean more()
          Checks if more tokens are to be returned.
 int size()
          Returns the text size.
 java.lang.String toString()
           
 
Methods inherited from class org.basex.index.IndexToken
range
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

st

public boolean st
Stemming flag.


dc

public boolean dc
Diacritics flag.


cs

public boolean cs
Case sensitivity flag.


uc

public boolean uc
Uppercase flag.


lc

public boolean lc
Lowercase flag.


wc

public boolean wc
Wildcard flag.


fz

public boolean fz
Fuzzy flag.


lp

public boolean lp
Flag for loading ftposition data.


sent

public int sent
Current sentence.


para

public int para
Current paragraph.


pos

public int pos
Current token position.


p

public int p
Current character position.


s

public int s
Character start position.

Constructor Detail

FTTokenizer

public FTTokenizer()
Empty constructor.


FTTokenizer

public FTTokenizer(byte[] txt)
Constructor, specifying the text to be tokenized.

Parameters:
txt - text

Method Detail

init

public void init(byte[] txt)
Sets the text.

Parameters:
txt - text

init

public void init()
Initializes the iterator.


more

public boolean more()
Checks if more tokens are to be returned.

Returns:
true if another token is available, false otherwise

get

public byte[] get()
Description copied from class: IndexToken
Returns the current index token. Can be overridden by an implementation to return other tokens, as is done in the FTTokenizer.

Overrides:
get in class IndexToken
Returns:
token
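
Example (illustrative sketch):
The init(), more(), and get() methods documented above suggest an iterator-style protocol: reset the iterator, test whether another token exists, then fetch it. The sketch below illustrates that calling pattern with a hypothetical stand-in class, SketchTokenizer, so that it runs without BaseX on the classpath. Its tokenization rule (maximal runs of ASCII letters and digits, with non-ASCII bytes kept inside tokens) and the helper tokens(String) are assumptions made for illustration and may differ from what FTTokenizer actually does.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in that mimics the init()/more()/get() protocol. */
final class SketchTokenizer {
  /** Text to be tokenized. */
  private byte[] text = new byte[0];
  /** Start position of the current token (assumed role of field s). */
  private int s;
  /** Current character position (assumed role of field p). */
  private int p;

  SketchTokenizer(final byte[] txt) {
    init(txt);
  }

  /** Sets the text and resets the iterator. */
  void init(final byte[] txt) {
    text = txt;
    init();
  }

  /** Resets the iterator to the start of the text. */
  void init() {
    s = 0;
    p = 0;
  }

  /** Advances to the next token; returns false if none is left. */
  boolean more() {
    // skip separator bytes
    while (p < text.length && !isTokenByte(text[p])) p++;
    if (p == text.length) return false;
    s = p;
    // consume the token
    while (p < text.length && isTokenByte(text[p])) p++;
    return true;
  }

  /** Returns a copy of the current token. */
  byte[] get() {
    return Arrays.copyOfRange(text, s, p);
  }

  /** Assumption: ASCII letters, digits, and non-ASCII bytes form tokens. */
  private static boolean isTokenByte(final byte b) {
    return b < 0 || b >= '0' && b <= '9' || b >= 'a' && b <= 'z'
        || b >= 'A' && b <= 'Z';
  }

  /** Hypothetical helper: runs the more()/get() loop over a string. */
  static List<String> tokens(final String input) {
    final SketchTokenizer tok =
        new SketchTokenizer(input.getBytes(StandardCharsets.UTF_8));
    final List<String> out = new ArrayList<>();
    while (tok.more()) out.add(new String(tok.get(), StandardCharsets.UTF_8));
    return out;
  }

  public static void main(final String[] args) {
    System.out.println(tokens("Hello, full-text world!"));
    // prints [Hello, full, text, world]
  }
}
```

In this sketch, get() is only meaningful after more() has returned true. Whether FTTokenizer imposes the same requirement is not stated on this page.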

count

public int count()
Counts the number of tokens.

Returns:
number of tokens

size

public int size()
Returns the text size.

Returns:
size

getTokenList

public TokenList getTokenList()
Converts the tokens to a TokenList.

Returns:
TokenList
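
Example (illustrative sketch):
Both count() and getTokenList() can be understood as a full pass over the tokens. The sketch below shows one way such helpers could be built on top of an init()/more()/get() loop. TokenSource, TokenSketch, and words(String) are hypothetical names introduced only for this example, and java.util.List stands in for the BaseX TokenList class. It is an assumption that a full pass re-initializes the iterator first; this page does not say whether FTTokenizer.count() preserves the current iteration state.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal view of the init()/more()/get() protocol. */
interface TokenSource {
  void init();
  boolean more();
  byte[] get();
}

/** Hypothetical helpers; not part of BaseX. */
final class TokenSketch {
  private TokenSketch() { }

  /** Counts tokens by restarting the iterator and walking all tokens. */
  static int count(final TokenSource src) {
    src.init();
    int c = 0;
    while (src.more()) c++;
    return c;
  }

  /** Collects all tokens into a list (stand-in for a TokenList). */
  static List<byte[]> toList(final TokenSource src) {
    src.init();
    final List<byte[]> list = new ArrayList<>();
    while (src.more()) list.add(src.get());
    return list;
  }

  /** Demo source that yields whitespace-separated words. */
  static TokenSource words(final String input) {
    final String trimmed = input.trim();
    final String[] parts =
        trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    return new TokenSource() {
      private int i = -1;

      @Override public void init() {
        i = -1;
      }

      @Override public boolean more() {
        // advance at most one step past the last word
        if (i < parts.length) i++;
        return i < parts.length;
      }

      @Override public byte[] get() {
        return parts[i].getBytes(StandardCharsets.UTF_8);
      }
    };
  }
}
```

Because both helpers call init() before iterating, they can be applied to the same source in any order, at the cost of discarding any iteration that was in progress.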

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object