Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.bg | Analyzer for Bulgarian. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.ca | Analyzer for Catalan. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters). |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.cz | Analyzer for Czech. |
org.apache.lucene.analysis.da | Analyzer for Danish. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.en | Analyzer for English. |
org.apache.lucene.analysis.es | Analyzer for Spanish. |
org.apache.lucene.analysis.eu | Analyzer for Basque. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fi | Analyzer for Finnish. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ga | Analysis for Irish. |
org.apache.lucene.analysis.gl | Analyzer for Galician. |
org.apache.lucene.analysis.hi | Analyzer for Hindi. |
org.apache.lucene.analysis.hu | Analyzer for Hungarian. |
org.apache.lucene.analysis.hy | Analyzer for Armenian. |
org.apache.lucene.analysis.id | Analyzer for Indonesian. |
org.apache.lucene.analysis.it | Analyzer for Italian. |
org.apache.lucene.analysis.ja | Analyzer for Japanese. |
org.apache.lucene.analysis.lv | Analyzer for Latvian. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.no | Analyzer for Norwegian. |
org.apache.lucene.analysis.pl | Analyzer for Polish. |
org.apache.lucene.analysis.pt | Analyzer for Portuguese. |
org.apache.lucene.analysis.ro | Analyzer for Romanian. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex. |
org.apache.lucene.analysis.sv | Analyzer for Swedish. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.tr | Analyzer for Turkish. |

Modifier and Type | Class | Description |
---|---|---|
class | KeywordAnalyzer | "Tokenizes" the entire stream as a single token. |
class | SimpleAnalyzer | |
class | StopAnalyzer | |
class | StopwordAnalyzerBase | Base class for Analyzers that need to make use of stopword sets. |
class | WhitespaceAnalyzer | An Analyzer that uses WhitespaceTokenizer. |

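To make the difference between the KeywordAnalyzer and WhitespaceAnalyzer descriptions concrete, here is a minimal plain-Java sketch of the two tokenization ideas. It does not call Lucene; the class `TokenizeSketch` and its methods are hypothetical names used only for illustration, and the real analyzers produce `TokenStream`s with offsets and attributes that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeSketch {
    // Idea behind KeywordAnalyzer: the entire input becomes one token.
    static List<String> keywordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }

    // Idea behind WhitespaceTokenizer: split the input on runs of whitespace.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(keywordTokens("Quick Brown-Fox"));    // [Quick Brown-Fox]
        System.out.println(whitespaceTokens("Quick Brown-Fox")); // [Quick, Brown-Fox]
    }
}
```

Neither variant lowercases or removes punctuation, which is why identifiers and exact-match fields are a typical fit for the single-token approach.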
Modifier and Type | Method | Description |
---|---|---|
protected static CharArraySet | StopwordAnalyzerBase.loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment) | Creates a CharArraySet from a file resource associated with a class. |

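The `ignoreCase` and `comment` parameters of `loadStopwordSet` suggest a word list with one entry per line and comment lines marked by a prefix. The following plain-Java sketch shows that idea under stated assumptions: it parses text that has already been read, skips blank lines and lines starting with the comment prefix, and lowercases entries when `ignoreCase` is set. The class `StopwordLoadSketch` and the method `parseStopwords` are hypothetical, and the exact parsing rules of the Lucene method may differ.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class StopwordLoadSketch {
    // Builds a stopword set from word-list text, one entry per line.
    // Assumption: a line starting with the comment prefix is ignored entirely.
    static Set<String> parseStopwords(String content, String comment, boolean ignoreCase) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : content.split("\\R")) {
            String word = line.trim();
            if (word.isEmpty() || word.startsWith(comment)) {
                continue;
            }
            words.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return words;
    }

    public static void main(String[] args) {
        String content = "# sample list\nthe\nAnd\n\n# another comment\nof\n";
        System.out.println(parseStopwords(content, "#", true)); // [the, and, of]
    }
}
```

In the Lucene method the text comes from a resource located relative to `aClass`; the sketch takes a `String` so that it stays self-contained.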
Modifier and Type | Class | Description |
---|---|---|
class | ArabicAnalyzer | Analyzer for Arabic. |

Modifier and Type | Class | Description |
---|---|---|
class | BulgarianAnalyzer | Analyzer for Bulgarian. |

Modifier and Type | Class | Description |
---|---|---|
class | BrazilianAnalyzer | Analyzer for the Brazilian Portuguese language. |

Modifier and Type | Class | Description |
---|---|---|
class | CatalanAnalyzer | Analyzer for Catalan. |

Modifier and Type | Class | Description |
---|---|---|
class | CJKAnalyzer | An Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter. |

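The bigram step in the CJKAnalyzer description can be illustrated with a short plain-Java sketch. It forms overlapping pairs of adjacent code points from a run of Han characters, and, for contrast, the unigrams that the `cn` package description mentions. The class `CjkBigramSketch` is hypothetical and does not use Lucene; emitting a lone character as a single token is a choice made for this sketch. Characters are written as `\u` escapes so the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkBigramSketch {
    // Overlapping bigrams: each pair of adjacent code points becomes one token.
    static List<String> bigrams(String run) {
        int[] cps = run.codePoints().toArray();
        List<String> out = new ArrayList<>();
        if (cps.length == 1) {
            out.add(run); // sketch choice: a lone character is kept as-is
            return out;
        }
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2));
        }
        return out;
    }

    // Unigrams: each code point becomes its own token.
    static List<String> unigrams(String run) {
        int[] cps = run.codePoints().toArray();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < cps.length; i++) {
            out.add(new String(cps, i, 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String run = "\u4e2d\u6587\u5206\u8bcd"; // four Han characters
        System.out.println(bigrams(run).size());  // 3
        System.out.println(unigrams(run).size()); // 4
    }
}
```

A run of n characters yields n - 1 bigrams, so the index holds more distinct terms than with unigrams, while each term is more selective at query time.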
Modifier and Type | Class | Description |
---|---|---|
class | ChineseAnalyzer | Deprecated. Use StandardAnalyzer instead, which has the same functionality. |

Modifier and Type | Class | Description |
---|---|---|
class | CzechAnalyzer | Analyzer for the Czech language. |

Modifier and Type | Class | Description |
---|---|---|
class | DanishAnalyzer | Analyzer for Danish. |

Modifier and Type | Class | Description |
---|---|---|
class | GermanAnalyzer | Analyzer for the German language. |

Modifier and Type | Class | Description |
---|---|---|
class | GreekAnalyzer | Analyzer for the Greek language. |

Modifier and Type | Class | Description |
---|---|---|
class | EnglishAnalyzer | Analyzer for English. |

Modifier and Type | Class | Description |
---|---|---|
class | SpanishAnalyzer | Analyzer for Spanish. |

Modifier and Type | Class | Description |
---|---|---|
class | BasqueAnalyzer | Analyzer for Basque. |

Modifier and Type | Class | Description |
---|---|---|
class | PersianAnalyzer | Analyzer for Persian. |

Modifier and Type | Class | Description |
---|---|---|
class | FinnishAnalyzer | Analyzer for Finnish. |

Modifier and Type | Class | Description |
---|---|---|
class | FrenchAnalyzer | Analyzer for the French language. |

Modifier and Type | Class | Description |
---|---|---|
class | IrishAnalyzer | Analyzer for Irish. |

Modifier and Type | Class | Description |
---|---|---|
class | GalicianAnalyzer | Analyzer for Galician. |

Modifier and Type | Class | Description |
---|---|---|
class | HindiAnalyzer | Analyzer for Hindi. |

Modifier and Type | Class | Description |
---|---|---|
class | HungarianAnalyzer | Analyzer for Hungarian. |

Modifier and Type | Class | Description |
---|---|---|
class | ArmenianAnalyzer | Analyzer for Armenian. |

Modifier and Type | Class | Description |
---|---|---|
class | IndonesianAnalyzer | Analyzer for Indonesian (Bahasa). |

Modifier and Type | Class | Description |
---|---|---|
class | ItalianAnalyzer | Analyzer for Italian. |

Modifier and Type | Class | Description |
---|---|---|
class | JapaneseAnalyzer | Analyzer for Japanese that uses morphological analysis. |

Modifier and Type | Class | Description |
---|---|---|
class | LatvianAnalyzer | Analyzer for Latvian. |

Modifier and Type | Class | Description |
---|---|---|
class | PatternAnalyzer | Efficient Lucene analyzer/tokenizer that preferably operates on a String rather than a Reader, that can flexibly separate text into terms via a regular expression Pattern (with behaviour identical to String.split(String)), and that combines the functionality of LetterTokenizer, LowerCaseTokenizer, WhitespaceTokenizer, StopFilter into a single efficient multi-purpose class. |

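The PatternAnalyzer description combines three steps: splitting on a regular expression, optional lowercasing, and stopword removal. A minimal plain-Java sketch of that combination follows. It is an illustration only and does not use Lucene; the class `PatternSplitSketch`, the method `analyze`, and its parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

public class PatternSplitSketch {
    // Splits text on the separator pattern, optionally lowercases each term,
    // and drops terms found in the stopword set.
    static List<String> analyze(String text, Pattern separator,
                                boolean toLowerCase, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String term : separator.split(text)) {
            if (term.isEmpty()) {
                continue; // a leading separator produces an empty first piece
            }
            if (toLowerCase) {
                term = term.toLowerCase(Locale.ROOT);
            }
            if (!stopWords.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the"));
        Pattern nonWord = Pattern.compile("\\W+");
        // prints [quick, brown, fox]
        System.out.println(analyze("The quick, brown Fox", nonWord, true, stopWords));
    }
}
```

Stopwords are matched after lowercasing here, so the set is expected to hold lowercase entries when `toLowerCase` is true.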
Modifier and Type | Class | Description |
---|---|---|
class | DutchAnalyzer | Analyzer for the Dutch language. |

Modifier and Type | Class | Description |
---|---|---|
class | NorwegianAnalyzer | Analyzer for Norwegian. |

Modifier and Type | Class | Description |
---|---|---|
class | PolishAnalyzer | Analyzer for Polish. |

Modifier and Type | Class | Description |
---|---|---|
class | PortugueseAnalyzer | Analyzer for Portuguese. |

Modifier and Type | Class | Description |
---|---|---|
class | RomanianAnalyzer | Analyzer for Romanian. |

Modifier and Type | Class | Description |
---|---|---|
class | RussianAnalyzer | Analyzer for the Russian language. |

Modifier and Type | Class | Description |
---|---|---|
class | ClassicAnalyzer | Filters ClassicTokenizer with ClassicFilter, LowerCaseFilter and StopFilter, using a list of English stop words. |
class | StandardAnalyzer | Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words. |
class | UAX29URLEmailAnalyzer | Filters UAX29URLEmailTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words. |

Modifier and Type | Class | Description |
---|---|---|
class | SwedishAnalyzer | Analyzer for Swedish. |

Modifier and Type | Class | Description |
---|---|---|
class | ThaiAnalyzer | Analyzer for the Thai language. |

Modifier and Type | Class | Description |
---|---|---|
class | TurkishAnalyzer | Analyzer for Turkish. |

Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.