java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
org.silverpeas.search.indexEngine.analysis.SilverTokenizer
public class SilverTokenizer extends org.apache.lucene.analysis.Tokenizer
A grammar-based tokenizer constructed with JFlex.
This should be a good tokenizer for most European-language documents.
Many applications have specific tokenizer needs. If this tokenizer does not suit your application, please consider copying this source code directory to your project and maintaining your own grammar-based tokenizer.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
| org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
| Field Summary |
|---|
| Fields inherited from class org.apache.lucene.analysis.Tokenizer |
| input |
| Constructor Summary | |
|---|---|
| SilverTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input) | Creates a new SilverTokenizer with a given AttributeSource.AttributeFactory. |
| SilverTokenizer(org.apache.lucene.util.AttributeSource source, Reader input) | Creates a new SilverTokenizer with a given AttributeSource. |
| SilverTokenizer(Reader input) | Creates a new instance of the SilverTokenizer. |
| Method Summary | |
|---|---|
| void | end() |
| int | getMaxTokenLength() |
| boolean | incrementToken() |
| boolean | isReplaceInvalidAcronym(): Deprecated. Remove in 3.X and make true the only valid value. |
| void | reset(Reader reader) |
| void | setMaxTokenLength(int length): Set the max allowed token length. |
| void | setReplaceInvalidAcronym(boolean replaceInvalidAcronym): Deprecated. Remove in 3.X and make true the only valid value. See https://issues.apache.org/jira/browse/LUCENE-1068 |
| Methods inherited from class org.apache.lucene.analysis.Tokenizer |
|---|
| close, correctOffset |

| Methods inherited from class org.apache.lucene.analysis.TokenStream |
|---|
| reset |

| Methods inherited from class org.apache.lucene.util.AttributeSource |
|---|
| addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |

| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public SilverTokenizer(Reader input)
Creates a new instance of the SilverTokenizer. Attaches the input to the newly created JFlex scanner.
Parameters: input - The input reader
See http://issues.apache.org/jira/browse/LUCENE-1068

public SilverTokenizer(org.apache.lucene.util.AttributeSource source, Reader input)
Creates a new SilverTokenizer with a given AttributeSource.

public SilverTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input)
Creates a new SilverTokenizer with a given AttributeSource.AttributeFactory.
| Method Detail |
|---|
public void setMaxTokenLength(int length)
Set the max allowed token length.

public int getMaxTokenLength()
See Also: setMaxTokenLength(int)
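
The effect of a maximum token length can be illustrated with a small standalone sketch. It assumes that SilverTokenizer keeps the behavior of the Lucene 3.x StandardTokenizer it appears to be derived from, where tokens longer than the limit are skipped rather than truncated; this page does not confirm that. The class `MaxTokenLengthSketch` and its `tokenize` helper are hypothetical and use plain whitespace splitting instead of the JFlex grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: not part of Silverpeas or Lucene.
class MaxTokenLengthSketch {

    /**
     * Splits text on whitespace and keeps only tokens whose length does not
     * exceed maxTokenLength. Longer tokens are dropped, not truncated
     * (assumed behavior, see the paragraph above).
     */
    static List<String> tokenize(String text, int maxTokenLength) {
        List<String> kept = new ArrayList<>();
        for (String candidate : text.trim().split("\\s+")) {
            if (!candidate.isEmpty() && candidate.length() <= maxTokenLength) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "incomprehensibilities" has 21 characters and exceeds the limit of 10.
        System.out.println(tokenize("a short incomprehensibilities word", 10));
        // prints [a, short, word]
    }
}
```

With the real class, the equivalent step would be a call to setMaxTokenLength(int) before the stream is consumed.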
public final boolean incrementToken() throws IOException
incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public final void end()
end in class org.apache.lucene.analysis.TokenStream

public void reset(Reader reader) throws IOException
reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException

@Deprecated public boolean isReplaceInvalidAcronym()

@Deprecated public void setReplaceInvalidAcronym(boolean replaceInvalidAcronym)
Parameters: replaceInvalidAcronym - Set to true to replace mischaracterized acronyms as HOST.
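
The incrementToken(), end(), and reset(Reader) methods documented above follow the usual Lucene 3.x consumer workflow: call incrementToken() until it returns false, call end(), and use reset(Reader) to reuse the same instance for another document. The sketch below models only that call sequence with a hypothetical `MiniTokenizer` that splits on whitespace. It is a simplified stand-in, not the SilverTokenizer implementation, and it omits Lucene's attribute classes entirely.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors only the method names documented above.
class TokenLoopSketch {

    /** Minimal whitespace tokenizer exposing incrementToken/end/reset. */
    static class MiniTokenizer {
        private Reader input;
        private String term = "";    // text of the current token
        private int offset = 0;      // characters consumed so far
        private int finalOffset = 0; // set by end()

        MiniTokenizer(Reader input) {
            this.input = input;
        }

        /** Advances to the next token; returns false once input is exhausted. */
        boolean incrementToken() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                offset++;
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) {
                        break; // whitespace after a token ends it
                    }
                } else {
                    sb.append((char) c);
                }
            }
            term = sb.toString();
            return sb.length() > 0;
        }

        /** Records the final offset after the last token. */
        void end() {
            finalOffset = offset;
        }

        /** Rebinds the tokenizer to a new reader so the instance can be reused. */
        void reset(Reader reader) {
            input = reader;
            term = "";
            offset = 0;
            finalOffset = 0;
        }

        String term() {
            return term;
        }

        int finalOffset() {
            return finalOffset;
        }
    }

    /** Drains the tokenizer: incrementToken() until false, then end(). */
    static List<String> consume(MiniTokenizer tokenizer) {
        List<String> terms = new ArrayList<>();
        try {
            while (tokenizer.incrementToken()) {
                terms.add(tokenizer.term());
            }
            tokenizer.end();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        MiniTokenizer tokenizer = new MiniTokenizer(new StringReader("hello silver world"));
        System.out.println(consume(tokenizer)); // prints [hello, silver, world]

        // Reuse the same instance for a second document.
        tokenizer.reset(new StringReader("second document"));
        System.out.println(consume(tokenizer)); // prints [second, document]
    }
}
```

Reusing one tokenizer through reset(Reader) avoids building a new scanner for each document, which is the usual reason the method exists in Lucene tokenizers.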