Driver and Job Configuration

Mara's primary focus is simplifying job configuration, so the majority of its annotations pertain to specifying and configuring the various job components. The best way to learn about the available annotations is to review the mara-examples project and the com.conversantmedia.mapreduce.tool.annotations package in the source code.

A few of the most common configuration-related annotations are described below:

Job Configuration
Input/Output
- @FileInput
- @FileOutput

Job Configuration

`@Driver`

Applied at the class level to the driver class, this annotation is used by the container to identify available drivers and to output helpful information if description, version, etc. are set. You can also use this annotation to attach one or more listeners to the driver. If you don't specify an id, the container will generate one for you based on the class name. The convention is to trim the word 'Driver' or 'Tool' from the end of the class name, split the remainder on embedded capitals, insert a hyphen at each word split, and convert all letters to lower case.

Example Usage:

@Driver("annotated-wordcount-v1")
public class AnnotatedWordCount { ... }

Elements:

element	type	default value	description	required?
value	string	derived from driver class name	id (unique to job jar) used by container to identify and execute this driver	N
version	string	null	software version	N
description	string	none	user-friendly description	N
listener	Class or array of classes implementing `ToolListener`	null	For handling driver lifecycle events	N
context	Class	null	Sets the context for the job. Not typically used.	N
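The id-derivation convention described for `@Driver` can be sketched in a few lines of plain Java. This is only an illustration of the stated rules, not Mara's actual implementation: the class `DriverIdSketch` and the method `deriveId` are hypothetical names, and the handling of consecutive capitals (acronyms) and of a class named just 'Driver' or 'Tool' is an assumption.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of Mara's API.
final class DriverIdSketch {

    // Derives a default driver id from a simple class name using the
    // convention described above.
    static String deriveId(String simpleClassName) {
        // Trim a trailing "Driver" or "Tool".
        String trimmed = simpleClassName.replaceAll("(Driver|Tool)$", "");
        // Assumption: if nothing would remain, keep the original name.
        if (trimmed.isEmpty()) {
            trimmed = simpleClassName;
        }
        // Insert a hyphen wherever a capital follows a lowercase letter or digit.
        String hyphenated = trimmed.replaceAll("(?<=[a-z0-9])(?=[A-Z])", "-");
        // Convert everything to lower case.
        return hyphenated.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(deriveId("AnnotatedWordCountDriver")); // annotated-word-count
        System.out.println(deriveId("WordCountTool"));            // word-count
    }
}
```

Under these assumptions, a driver class named AnnotatedWordCountDriver with no explicit id would be addressed as annotated-word-count. Setting the `value` element explicitly, as in the example usage above, avoids relying on the derived name.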

`@JobInfo`

The @JobInfo annotation is applied at the field level to the Job instance in your driver.

Example Usage:

@JobInfo(value="Annotated Word Count v1")
@MapperInfo(WordCountMapper.class)
@ReducerInfo(WordCountReducer.class)
@FileInput(TextInputFormat.class)
@FileOutput(TextOutputFormat.class)
private Job job;

Elements:

element	type	default value	description	required?
name/value	string	null	A human-readable name of the job	N
numReducers	int	null	Set the number of reducers used	N
map	`@MapperInfo`	null	Sets the mapper via the `@MapperInfo` annotation	N
reduce	`@ReducerInfo`	null	Sets the reducer via the `@ReducerInfo` annotation	N
combine	`@CombinerInfo`	null	Sets the combiner via the `@CombinerInfo` annotation	N
sorter	`@Sorter`	null	Sets the sorter via the `@Sorter` annotation	N
grouping	`@Grouping`	null	Sets the grouping class via the `@Grouping` annotation	N
partitioner	`@Partitioner`	null	Sets the partitioner via the `@Partitioner` annotation	N

`@MapperInfo`

Specifies the mapper class to use for this job. The output key/value types are read from the class's type parameters if they are available; otherwise you must provide a @KeyValue for the output element.

Elements:

element	type	default value	description	required?
value	Class<? extends Mapper>	Mapper.class	specifies the mapper class to be used by the job	Y
output	@KeyValue	null	specifies the output key/value types of the mapper.	N

`@ReducerInfo`

Specifies the reducer class for this job. The output key/value types are derived by examining the class's type parameters where possible; otherwise you must provide an explicit @KeyValue for the output element.

Elements:

element	type	default value	description	required?
value	Class<? extends Reducer>	Reducer.class	specifies the reducer class to be used by the job	Y
output	@KeyValue	null	specifies the output key/value types of the reducer.	N
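To illustrate why the output types of a mapper or reducer can sometimes be inferred and sometimes need an explicit @KeyValue, here is a minimal, self-contained sketch of reading type parameters via reflection. It is not Mara's code: the nested `Mapper` class is a stand-in for Hadoop's `Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>` so the example compiles without Hadoop, and `OutputTypeSketch`, `outputTypes`, `WordCountMapper`, and `RawMapper` are hypothetical names. The sketch only inspects the direct superclass.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Optional;

// Hypothetical illustration only; not part of Mara's API.
final class OutputTypeSketch {

    // Stand-in for Hadoop's Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>.
    abstract static class Mapper<KI, VI, KO, VO> {}

    // Type parameters are declared, so they can be read reflectively.
    static class WordCountMapper extends Mapper<Long, String, String, Integer> {}

    // Raw subclass: no type parameters are recorded, so inference fails.
    @SuppressWarnings("rawtypes")
    static class RawMapper extends Mapper {}

    // Returns {outputKeyClass, outputValueClass}, or empty when the types
    // cannot be determined from the direct superclass declaration.
    static Optional<Class<?>[]> outputTypes(Class<?> componentClass) {
        Type superType = componentClass.getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            return Optional.empty();
        }
        Type[] typeArgs = ((ParameterizedType) superType).getActualTypeArguments();
        if (typeArgs.length < 4) {
            return Optional.empty();
        }
        Type key = typeArgs[2];   // third parameter: output key
        Type value = typeArgs[3]; // fourth parameter: output value
        if (key instanceof Class && value instanceof Class) {
            return Optional.of(new Class<?>[] {(Class<?>) key, (Class<?>) value});
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Class<?>[] types = outputTypes(WordCountMapper.class).get();
        System.out.println(types[0].getSimpleName() + ", " + types[1].getSimpleName());
        System.out.println(outputTypes(RawMapper.class).isPresent());
    }
}
```

In the sketch, `WordCountMapper` yields its output types directly, while `RawMapper` yields nothing. The second case corresponds to the situation in which the `output` element must be supplied explicitly.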

Input/Output

Along with the typical file-based input and output formats, Mara also includes support for HBase and Avro formats.

`@FileInput`

By default, Mara will configure your job to use TextInputFormat with the path provided by the value of your context's 'input' property (which, again, the built-in context supplies unless you override it). For many applications that use text-based file inputs, this annotation therefore doesn't need to be provided explicitly.

Elements:

element	type	default value	description	required?
value	Class<? extends FileInputFormat>	org.apache.hadoop.mapreduce.lib.input.TextInputFormat	input format to use	N
path	String	${context.input}	path or comma-separated list of paths	N

`@FileOutput`

Like @FileInput, if this annotation is absent Mara will automatically configure your job to use TextOutputFormat with the path specified by the value of your context's 'output' property.

Elements:

element	type	default value	description	required?
value	Class<? extends FileOutputFormat>	org.apache.hadoop.mapreduce.lib.output.TextOutputFormat	output format to use	N
path	String	${context.output}	path	N
