This repository has been archived by the owner on Jan 7, 2020. It is now read-only.

Driver and Job Configuration

Patrick Jaromin edited this page Apr 13, 2015 · 5 revisions

@Driver

Applied at the class level to the driver class, this annotation is used by the container to identify available drivers and to output helpful information when description, version, etc. are set. You can also use this annotation to apply one or more listeners to the driver. If you don't specify an id, the container generates one for you from the class name. The convention is to split the name on embedded capitals, insert a hyphen at each word split, and convert all letters to lower case. The words 'Driver' or 'Tool' are also trimmed from the end of the class name.

Example Usage:

```java
@Driver("annotated-wordcount-v1")
public class AnnotatedWordCount { ... }
```
Elements:

| element | type | default value | description | required? |
| --- | --- | --- | --- | --- |
| value | string | derived from driver class name | id (unique to job jar) used by container to identify and execute this driver | N |
| version | string | null | software version | N |
| description | string | none | user-friendly description | N |
| listener | Class or array of classes implementing ToolListener | null | for handling driver lifecycle events | N |
| context | Class | null | sets the context for the job; not typically used | N |
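The id-generation convention described above can be sketched as plain string manipulation. This is an illustrative reconstruction of the documented convention, not the container's actual code:

```java
public class DriverIdConvention {
    // Derive a driver id from a class name: trim a trailing "Driver" or "Tool",
    // insert a hyphen at each embedded capital, and lower-case the result.
    static String deriveId(String className) {
        String name = className.replaceAll("(Driver|Tool)$", "");
        return name.replaceAll("([a-z0-9])([A-Z])", "$1-$2").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(deriveId("AnnotatedWordCountDriver")); // annotated-word-count
        System.out.println(deriveId("FooTool"));                  // foo
    }
}
```

Note that the example above supplies an explicit id ("annotated-wordcount-v1") precisely because the generated default would differ.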

@DriverContext

A field-level annotation marking which Java class (a simple POJO) to use as the context. The context specifies and models command-line argument values.

Example Usage:

```java
// Specify the context to use throughout the job
@DriverContext
private DriverContextBase context;
```

@Option

A field-level annotation for configuring a command-line option.

Elements:

| element | type | default value | description | required? |
| --- | --- | --- | --- | --- |
| name | string | annotated property name | the long argument name | N |
| shortName | string | "" | the short flag name on the command line (e.g. -f) | N |
| description | string | "" | the help info shown for the command-line parameter | N |
| argName | string | "" | the display name for the option's argument in help output | N |
| argCount | int | 1 | the number of arguments expected after the flag | N |
| required | boolean | false | whether or not the parameter is required | N |
| defaultValue | string | "" | a default value for the parameter if none is entered | N |
Usage example:

```java
public class WordcountContext extends DriverContextBase {
	@Option(required=true, argName="local-file", description="Blacklist to apply to the wordcount.")
	private String blacklist; // the name of the variable to access via context.blacklist
}
```
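To make the elements and their defaults concrete, here is a minimal, hypothetical re-declaration of an @Option-style annotation and a reflective read of it. The real library's annotation lives in its own package and may declare these elements differently; this sketch only mirrors the table above:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

public class OptionSketch {
    // Hypothetical stand-in mirroring the documented @Option elements and defaults.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Option {
        String name() default "";
        String shortName() default "";
        String description() default "";
        String argName() default "";
        int argCount() default 1;
        boolean required() default false;
        String defaultValue() default "";
    }

    static class WordcountContext {
        @Option(required = true, argName = "local-file",
                description = "Blacklist to apply to the wordcount.")
        private String blacklist;
    }

    public static void main(String[] args) throws Exception {
        Field f = WordcountContext.class.getDeclaredField("blacklist");
        Option opt = f.getAnnotation(Option.class);
        // When name is unset, the long argument name falls back to the field name.
        String longName = opt.name().isEmpty() ? f.getName() : opt.name();
        System.out.println(longName + " required=" + opt.required()
                + " argCount=" + opt.argCount());
    }
}
```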

@JobInfo

The @JobInfo annotation is applied at the field level to the Job instance in your driver.

Example Usage:

```java
@JobInfo(value="Annotated Word Count v1")
@MapperInfo(WordCountMapper.class)
@ReducerInfo(WordCountReducer.class)
@FileInput(TextInputFormat.class)
@FileOutput(TextOutputFormat.class)
private Job job;
```
Elements:

| element | type | default value | description | required? |
| --- | --- | --- | --- | --- |
| name/value | string | null | a human-readable name for the job | N |
| numReducers | int | null | sets the number of reducers used | N |
| map | @MapperInfo | null | sets the mapper via the @MapperInfo annotation | N |
| reduce | @ReducerInfo | null | sets the reducer via the @ReducerInfo annotation | N |
| combine | @CombinerInfo | null | sets the combiner via the @CombinerInfo annotation | N |
| sorter | @Sorter | null | sets the sorter via the @Sorter annotation | N |
| grouping | @Grouping | null | sets the grouping class via the @Grouping annotation | N |
| partitioner | @Partitioner | null | sets the partitioner via the @Partitioner annotation | N |

@MapperInfo

Specifies the mapper class to use for this job. The output key/value types are read from the class's type parameters when available; otherwise you must provide a @KeyValue for the output element.

Elements:

| element | type | default value | description | required? |
| --- | --- | --- | --- | --- |
| value | Class<? extends Mapper> | Mapper.class | specifies the mapper class to be used by the job | Y |
| output | @KeyValue | null | specifies the output key/value types of the mapper | N |

@ReducerInfo

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=Reducer.class | specifies the reducer class to be used by the job | no | org.apache.hadoop.mapreduce.Reducer.class |
| output=@KeyValue(key=LongWritable.class, value=Text.class) | specifies the output types of the reducer | no | @KeyValue defaults (currently key=void.class, value=void.class) |
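Putting the two parameters together on the job field might look like the following sketch, built from the parameter syntax above (the reducer class name is illustrative):

```java
@ReducerInfo(value=WordCountReducer.class,
             output=@KeyValue(key=LongWritable.class, value=Text.class))
private Job job;
```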

@CombinerInfo

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=Combiner.class | sets a combiner class | no | org.apache.hadoop.mapreduce.Reducer.class |

@FileInput

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=TextInputFormat.class | sets the class to be used for handling file input | no | FileInputFormat.class |
| path="${context.input}" | sets where to get the file path from | no | "${context.input}" |

@FileOutput

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=TextOutputFormat.class | sets the class to be used for handling file output | no | FileOutputFormat.class |
| path="${context.output}" | sets where to get the output path from | no | "${context.output}" |
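Overriding the default context expressions explicitly might look like this sketch, assembled from the parameter syntax in the two tables above (the expressions shown are simply the documented defaults spelled out):

```java
@FileInput(value=TextInputFormat.class, path="${context.input}")
@FileOutput(value=TextOutputFormat.class, path="${context.output}")
private Job job;
```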

@MultiInput

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value={@Input(mapper=Mapper1.class, format=AvroKeyInputFormat.class), @Input(mapper=Mapper2.class, path="${context.input2}")} | use @Input annotations to specify which input format and mapper to use on a per-path basis | yes | none |

@Input

Only to be used within the @MultiInput annotation.

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| mapper=Mapper.class | sets the mapper class to be used by that input | no | Mapper.class |
| path="${context.input}" | sets where to get the input path from | no | "${context.input}" |
| format=AvroKeyInputFormat.class | sets the class to be used for handling file input | no | TextInputFormat.class |
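A multi-input configuration, laid out as a sketch from the @MultiInput value syntax above (mapper class names are illustrative; applying it to the Job field follows the pattern of the other job annotations):

```java
@MultiInput({
    @Input(mapper=Mapper1.class, format=AvroKeyInputFormat.class),
    @Input(mapper=Mapper2.class, path="${context.input2}")
})
private Job job;
```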

@Sorter

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=Sorter.class | sets a sorter class | no | NULLCOMPARATOR.class |

@Grouping

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=Grouping.class | sets a grouping class | no | NULLCOMPARATOR.class |

@Partitioner

| Parameters | Description | Required | Default |
| --- | --- | --- | --- |
| value=Partitioner.class | sets a partitioner class | no | NULLPARTITIONER.class |