Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters). |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.in | Analysis components for Indian languages. |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex. |
org.apache.lucene.analysis.wikipedia | Tokenizer that is aware of Wikipedia syntax. |
org.apache.lucene.util | Some utility classes. |

Modifier and Type | Class | Description |
---|---|---|
static class | Token.TokenAttributeFactory | Expert: Creates a TokenAttributeFactory returning Token as instance for the basic attributes and for all other attributes calls the given delegate factory. |

Modifier and Type | Field | Description |
---|---|---|
static AttributeSource.AttributeFactory | Token.TOKEN_ATTRIBUTE_FACTORY | Convenience factory that returns Token as the implementation for the basic attributes and returns the default implementation (with "Impl" appended) for all other attributes. |

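
The delegation described above (one multi-purpose object for the basic attributes, a fallback factory for everything else) can be sketched in plain Java. This is only an illustration of the pattern, not Lucene's code: `Attr`, `TermAttr`, `OffsetAttr`, `PayloadAttr`, `TokenLike`, `AttrFactory`, and `TokenLikeFactory` are hypothetical stand-ins for the attribute interfaces, `Token`, and `AttributeSource.AttributeFactory`.

```java
// Hypothetical stand-ins for illustration only; none of these are Lucene types.
interface Attr {}
interface TermAttr extends Attr {}
interface OffsetAttr extends Attr {}
interface PayloadAttr extends Attr {}

// One object that implements several "basic" attribute interfaces at once,
// playing the role that Token plays in the description above.
class TokenLike implements TermAttr, OffsetAttr {}

// A separate implementation for an attribute that TokenLike does not cover.
class PayloadAttrImpl implements PayloadAttr {}

// Assumed factory shape: given an attribute interface, return an instance.
interface AttrFactory {
    Attr create(Class<? extends Attr> attClass);
}

// Returns a TokenLike for every interface TokenLike implements,
// and asks the delegate factory for all other attribute interfaces.
class TokenLikeFactory implements AttrFactory {
    private final AttrFactory delegate;

    TokenLikeFactory(AttrFactory delegate) {
        this.delegate = delegate;
    }

    @Override
    public Attr create(Class<? extends Attr> attClass) {
        return attClass.isAssignableFrom(TokenLike.class)
                ? new TokenLike()
                : delegate.create(attClass);
    }
}

class DelegatingFactoryDemo {
    public static void main(String[] args) {
        // The fallback here only knows how to build PayloadAttrImpl.
        AttrFactory fallback = attClass -> new PayloadAttrImpl();
        AttrFactory factory = new TokenLikeFactory(fallback);
        System.out.println(factory.create(TermAttr.class).getClass().getSimpleName());
        System.out.println(factory.create(PayloadAttr.class).getClass().getSimpleName());
    }
}
```

In this sketch the first request is served by `TokenLike` and the second falls through to the delegate, which mirrors the split between basic attributes and all other attributes.
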
Constructor | Description |
---|---|
CharTokenizer(AttributeSource.AttributeFactory factory, Reader input) | Deprecated. |
CharTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input) | Creates a new CharTokenizer instance |
KeywordTokenizer(AttributeSource.AttributeFactory factory, Reader input, int bufferSize) | |
LetterTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
LetterTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in) | Construct a new LetterTokenizer using a given AttributeSource.AttributeFactory. |
LowerCaseTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
LowerCaseTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in) | Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory. |
MockTokenizer(AttributeSource.AttributeFactory factory, Reader input, int pattern, boolean lowerCase, int maxTokenLength) | |
NumericTokenStream(AttributeSource.AttributeFactory factory, int precisionStep) | Expert: Creates a token stream for numeric values with the specified precisionStep using the given AttributeSource.AttributeFactory. |
TokenAttributeFactory(AttributeSource.AttributeFactory delegate) | Expert: Creates an AttributeFactory returning Token as instance for the basic attributes and for all other attributes calls the given delegate factory. |
Tokenizer(AttributeSource.AttributeFactory factory) | Deprecated. Use Tokenizer(AttributeSource.AttributeFactory, Reader) instead. |
Tokenizer(AttributeSource.AttributeFactory factory, Reader input) | Construct a token stream processing the given input using the given AttributeFactory. |
TokenStream(AttributeSource.AttributeFactory factory) | A TokenStream using the supplied AttributeFactory for creating new Attribute instances. |
WhitespaceTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
WhitespaceTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in) | Construct a new WhitespaceTokenizer using a given AttributeSource.AttributeFactory. |

Constructor | Description |
---|---|
ArabicLetterTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
ArabicLetterTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in) | Deprecated. Construct a new ArabicLetterTokenizer using a given AttributeSource.AttributeFactory. |

Constructor | Description |
---|---|
CJKTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |

Constructor | Description |
---|---|
ChineseTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |

Constructor | Description |
---|---|
SentenceTokenizer(AttributeSource.AttributeFactory factory, Reader reader) | |

Constructor | Description |
---|---|
IndicTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input) | Deprecated. |

Constructor | Description |
---|---|
EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input, String sideLabel, int minGram, int maxGram) | Creates EdgeNGramTokenizer that can generate n-grams in the sizes of the given range |
EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram) | Creates EdgeNGramTokenizer that can generate n-grams in the sizes of the given range |
NGramTokenizer(AttributeSource.AttributeFactory factory, Reader input, int minGram, int maxGram) | Creates NGramTokenizer with given min and max n-grams. |

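
To make the `minGram` and `maxGram` parameters concrete, here is a minimal plain-Java sketch of character n-gram generation. It does not use or reproduce the Lucene tokenizers: the class `NGramSketch` and its methods are hypothetical, they work on a `String` rather than a `Reader`, and the emission order (all grams of one size before the next size) is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene.
class NGramSketch {

    // All character n-grams of text whose length n satisfies
    // minGram <= n <= maxGram, grouped by size, smallest size first.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= text.length(); start++) {
                out.add(text.substring(start, start + n));
            }
        }
        return out;
    }

    // Only the n-grams anchored at the front edge of text,
    // i.e. its prefixes of length minGram..maxGram.
    static List<String> frontEdgeNgrams(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= text.length(); n++) {
            out.add(text.substring(0, n));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abc", 1, 2));          // [a, b, c, ab, bc]
        System.out.println(frontEdgeNgrams("abc", 1, 3)); // [a, ab, abc]
    }
}
```

The edge variant keeps only grams that start at one end of the input, which is why it needs a side argument in addition to the size range.
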
Constructor | Description |
---|---|
RussianLetterTokenizer(AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
RussianLetterTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in) | Deprecated. Construct a new RussianLetterTokenizer using a given AttributeSource.AttributeFactory. |

Constructor | Description |
---|---|
ClassicTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input) | Creates a new ClassicTokenizer with a given AttributeSource.AttributeFactory |
StandardTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input) | Creates a new StandardTokenizer with a given AttributeSource.AttributeFactory |
UAX29URLEmailTokenizer(AttributeSource.AttributeFactory factory, Reader input) | Deprecated. |
UAX29URLEmailTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input) | Creates a new UAX29URLEmailTokenizer with a given AttributeSource.AttributeFactory |

Constructor | Description |
---|---|
WikipediaTokenizer(AttributeSource.AttributeFactory factory, Reader input, int tokenOutput, Set<String> untokenizedTypes) | Creates a new instance of the WikipediaTokenizer. |

Modifier and Type | Field | Description |
---|---|---|
static AttributeSource.AttributeFactory | AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY | This is the default factory that creates AttributeImpls using the class name of the supplied Attribute interface class by appending Impl to it. |

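
The naming convention described above (take the attribute interface's class name, append `Impl`, and instantiate that class) can be sketched with plain reflection. This is an assumption-laden illustration rather than the library's implementation: `ColorAttribute`, `ColorAttributeImpl`, and `ImplSuffixFactory` are hypothetical names, and the sketch does no caching of resolved classes.

```java
// Hypothetical attribute interface and its implementation, named by the
// "interface name + Impl" convention. Neither is a Lucene type.
interface ColorAttribute {}

class ColorAttributeImpl implements ColorAttribute {}

// Hypothetical factory that resolves implementations purely by class name.
class ImplSuffixFactory {

    static <A> A create(Class<A> attClass) {
        String implName = attClass.getName() + "Impl";
        try {
            // Look the implementation up with the interface's own class loader.
            Class<?> implClass = Class.forName(implName, true, attClass.getClassLoader());
            // cast() fails with ClassCastException if the class found
            // does not actually implement the requested interface.
            return attClass.cast(implClass.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Could not find or instantiate " + implName, e);
        }
    }

    public static void main(String[] args) {
        ColorAttribute att = create(ColorAttribute.class);
        System.out.println(att.getClass().getName());
    }
}
```

A convention-based lookup like this means a new attribute needs no registration step, at the cost of requiring the implementation class to follow the naming rule and expose a no-argument constructor.
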
Modifier and Type | Method | Description |
---|---|---|
AttributeSource.AttributeFactory | AttributeSource.getAttributeFactory() | Returns the AttributeFactory in use. |

Constructor | Description |
---|---|
AttributeSource(AttributeSource.AttributeFactory factory) | An AttributeSource using the supplied AttributeSource.AttributeFactory for creating new Attribute instances. |

Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.