public class TeraInputFormat extends FileInputFormat<Text,Text>
Fields inherited from class FileInputFormat: LOG
Constructor and Description |
---|
TeraInputFormat() |
Modifier and Type | Method and Description |
---|---|
RecordReader<Text,Text> | getRecordReader(InputSplit split, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
InputSplit[] | getSplits(JobConf conf, int splits): Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big. |
static void | writePartitionFile(JobConf conf, Path partFile): Use the input splits to take samples of the input and generate sample keys. |
Methods inherited from class FileInputFormat: addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
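The summary of writePartitionFile says that it samples the input and generates sample keys. The sketch below shows one common way such samples can be turned into partition boundaries: sort the sampled keys and pick N-1 evenly spaced ones to delimit N partitions. It uses only the JDK, and the class name SampleKeySketch and the method pickSplitKeys are hypothetical names chosen for illustration. It is not the code behind writePartitionFile, and it does not read or write any Hadoop files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SampleKeySketch {
    /**
     * Hypothetical helper: picks (partitions - 1) split keys from sampled keys
     * so that the sorted samples fall into roughly equal-sized ranges.
     */
    public static List<String> pickSplitKeys(List<String> samples, int partitions) {
        if (samples.isEmpty() || partitions < 1) {
            throw new IllegalArgumentException("need at least one sample and one partition");
        }
        List<String> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        List<String> splitKeys = new ArrayList<>();
        for (int i = 1; i < partitions; i++) {
            // Index of the first sorted sample that belongs to partition i.
            int index = (int) ((long) i * sorted.size() / partitions);
            splitKeys.add(sorted.get(index));
        }
        return splitKeys;
    }

    public static void main(String[] args) {
        List<String> samples =
            List.of("pear", "apple", "fig", "kiwi", "lime", "plum", "date", "nut");
        // Eight samples and four partitions give three boundary keys.
        System.out.println(pickSplitKeys(samples, 4)); // prints [fig, lime, pear]
    }
}
```

A partitioner could then route each key to the partition whose boundary range contains it, which is what makes the partitions approximately equal in size when the samples are representative.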
public static void writePartitionFile(JobConf conf, Path partFile) throws IOException
Use the input splits to take samples of the input and generate sample keys.
Parameters:
conf - the job to sample
partFile - where to write the output file to
Throws:
IOException - if something goes wrong
public RecordReader<Text,Text> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by:
getRecordReader in interface InputFormat<Text,Text>
Specified by:
getRecordReader in class FileInputFormat<Text,Text>
Parameters:
split - the InputSplit
job - the job that this split belongs to
Returns:
a RecordReader
Throws:
IOException
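The requirement that a RecordReader respect record boundaries can be illustrated with a small, JDK-only sketch. It assumes newline-terminated records and the convention that a record belongs to the byte range in which its first byte lies. Both are assumptions made for illustration, as are the names BoundaryAwareReaderSketch and readRecords. This is not the reader that getRecordReader returns.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BoundaryAwareReaderSketch {
    /**
     * Returns the newline-terminated records whose first byte lies in
     * [start, end). A record that starts inside the range is read to its
     * end even if it runs past `end`.
     * Assumes 0 <= start <= end <= data.length.
     */
    public static List<String> readRecords(byte[] data, int start, int end) {
        int pos = start;
        // If the range begins mid-record, skip to the next record start;
        // the reader of the previous range owns the record we landed in.
        if (pos > 0 && data[pos - 1] != '\n') {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> records = new ArrayList<>();
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.UTF_8);
        // The boundary at byte 8 falls in the middle of "bravo".
        System.out.println(readRecords(data, 0, 8));           // prints [alpha, bravo]
        System.out.println(readRecords(data, 8, data.length)); // prints [charlie]
    }
}
```

With this convention, two readers over adjacent ranges return every record exactly once, even though the range boundary cuts through a record.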
public InputSplit[] getSplits(JobConf conf, int splits) throws IOException
Description copied from class: FileInputFormat
Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.
Specified by:
getSplits in interface InputFormat<Text,Text>
Overrides:
getSplits in class FileInputFormat<Text,Text>
Parameters:
conf - job configuration.
splits - the desired number of splits, a hint.
Returns:
an array of InputSplits for the job.
Throws:
IOException
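The idea of cutting files that are too big, with the requested number of splits acting only as a hint, can be sketched with plain arithmetic. The JDK-only example below assumes a split size rule of "aim for the goal size, cap it at the block size, never go below a minimum". That rule, the class name SplitSketch, and both method names are assumptions for illustration, not a statement of what getSplits does internally.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    /** Assumed rule: goal size, capped by the block size, never below the minimum. */
    public static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    /** Cuts a file of the given length into {offset, length} ranges of at most splitSize bytes. */
    public static List<long[]> splitFile(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            long length = Math.min(splitSize, fileLength - offset);
            ranges.add(new long[] {offset, length});
            offset += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        long fileLength = 250;
        int desiredSplits = 3;                  // the hint
        long goal = fileLength / desiredSplits; // 83 with integer division
        long size = splitSize(goal, 1, 100);    // 83: below the block size, above the minimum
        for (long[] range : splitFile(fileLength, size)) {
            System.out.println(range[0] + "+" + range[1]);
        }
        // prints 0+83, 83+83, 166+83 and 249+1 on separate lines
    }
}
```

The example asks for three splits but produces four, because 250 is not a multiple of the computed split size. This is one reason the requested count is best read as a hint rather than a guarantee.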
Copyright © 2010 The Apache Software Foundation