Adding new operators in Wayang

This guide walks through the four steps that developers need to follow to add a new operator to Wayang. We use the Map operator as an example.

Step 1: Add a Wayang operator

Wayang operators are located in the wayang-basic module, in the org.apache.wayang.basic.operators package.
An operator needs to extend one of the following abstract classes: UnaryToUnaryOperator, BinaryToUnaryOperator, UnarySource, UnarySink.
For an example of a unary-to-unary operator, see MapOperator.
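The shape of such an operator can be illustrated with a minimal, self-contained sketch. It does not use Wayang's classes: SketchUnaryToUnaryOperator and SketchMapOperator are hypothetical stand-ins, and the real UnaryToUnaryOperator and MapOperator have different constructors and more responsibilities. The point is only that the operator describes what to compute, not where it runs.

```java
import java.util.function.Function;

// Hypothetical stand-in for an abstract base class with one input and one output.
// It is NOT Wayang's UnaryToUnaryOperator; it only mirrors the idea.
abstract class SketchUnaryToUnaryOperator<I, O> {
    private final Class<I> inputType;
    private final Class<O> outputType;

    protected SketchUnaryToUnaryOperator(Class<I> inputType, Class<O> outputType) {
        this.inputType = inputType;
        this.outputType = outputType;
    }

    int getNumInputs() { return 1; }
    int getNumOutputs() { return 1; }
    Class<I> getInputType() { return inputType; }
    Class<O> getOutputType() { return outputType; }
}

// Platform-agnostic map operator: it records the user-defined function and the
// types, but contains no execution logic.
class SketchMapOperator<I, O> extends SketchUnaryToUnaryOperator<I, O> {
    private final Function<I, O> function;

    SketchMapOperator(Function<I, O> function, Class<I> inputType, Class<O> outputType) {
        super(inputType, outputType);
        this.function = function;
    }

    Function<I, O> getFunction() { return function; }
}
```

An operator created as new SketchMapOperator<>(String::length, String.class, Integer.class) carries the function and its types; a platform-specific subclass decides later how to apply it.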

To help Wayang's optimizer estimate costs more accurately, consider adding a cardinality estimator by overriding the createCardinalityEstimator() method, as in the linked example.
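As a rough illustration of what such an override expresses, the sketch below models a cardinality estimator as a function from input cardinalities to an output cardinality. All types here (CardinalityEstimatorSketch, EstimableOperatorSketch, EstimableMapSketch) are hypothetical and simplified; Wayang's actual createCardinalityEstimator() has a different signature and returns richer estimates. The sketch assumes that a map emits exactly one output element per input element.

```java
import java.util.Optional;

// Hypothetical, simplified estimator: input cardinalities in, output cardinality out.
interface CardinalityEstimatorSketch {
    long estimate(long... inputCardinalities);
}

// Hypothetical base class: by default an operator offers no estimator.
abstract class EstimableOperatorSketch {
    Optional<CardinalityEstimatorSketch> createCardinalityEstimator(int outputIndex) {
        return Optional.empty();
    }
}

// A map produces one output element per input element, so the estimated
// output cardinality equals the cardinality of its single input.
class EstimableMapSketch extends EstimableOperatorSketch {
    @Override
    Optional<CardinalityEstimatorSketch> createCardinalityEstimator(int outputIndex) {
        if (outputIndex != 0) {
            throw new IllegalArgumentException("A map has exactly one output (index 0).");
        }
        CardinalityEstimatorSketch estimator = inputCardinalities -> inputCardinalities[0];
        return Optional.of(estimator);
    }
}
```

An operator that changes the number of elements, such as a filter, would return a different function here, for example the input cardinality scaled by an assumed selectivity.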

Step 2: Add the (platform-specific) execution operators

Execution operators are located in the corresponding module of wayang-platforms. For instance, Java execution operators are located in the org.apache.wayang.java.operators package of the wayang-java module.
An execution operator needs to extend its corresponding Wayang operator and implement the corresponding platform operator interface.
For the above MapOperator, the corresponding Java execution operator is JavaMapOperator.
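The "extend the Wayang operator, implement the platform interface" pattern can be sketched as follows. This is a self-contained illustration, not the actual JavaMapOperator: LogicalMapSketch, JavaExecutionOperatorSketch, and the evaluate method on lists are assumptions made for the example, and Wayang's real platform interface works with channels and an executor rather than plain lists.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical platform-agnostic operator that only holds the function.
class LogicalMapSketch<I, O> {
    protected final Function<I, O> function;

    LogicalMapSketch(Function<I, O> function) {
        this.function = function;
    }
}

// Hypothetical platform interface: what the (Java) executor calls at runtime.
interface JavaExecutionOperatorSketch<I, O> {
    List<O> evaluate(List<I> input);
}

// Execution operator: inherits the logical description and adds the
// platform-specific implementation, here based on Java streams.
class JavaMapOperatorSketch<I, O> extends LogicalMapSketch<I, O>
        implements JavaExecutionOperatorSketch<I, O> {

    JavaMapOperatorSketch(Function<I, O> function) {
        super(function);
    }

    @Override
    public List<O> evaluate(List<I> input) {
        return input.stream().map(this.function).collect(Collectors.toList());
    }
}
```

An execution operator for another platform would extend the same logical operator but implement that platform's interface instead.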

To help Wayang's optimizer further, consider adding a load function as well.
To do so, override the getLoadProfileEstimatorConfigurationKey() method and return the key under which the load function is stored in a properties file. For JavaMapOperator, the key is wayang.java.map.load. Then, in the corresponding properties file (e.g., the one for the Java executor), add the template, which is the mathematical formula that represents the cost of the operator, together with an instantiation of it. See here for the example of the map operator.
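The indirection through a configuration key can be illustrated with the sketch below. Only the key wayang.java.map.load comes from this guide. The value format ("alpha, beta" for a linear CPU cost alpha * in0 + beta), the numbers used with it, and the class LoadKeySketch are assumptions made for the example; Wayang's actual properties files use their own template syntax.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class LoadKeySketch {

    // The operator only names the key; the cost formula lives in configuration.
    static String getLoadProfileEstimatorConfigurationKey() {
        return "wayang.java.map.load";
    }

    // Parses properties from text, standing in for a properties file on disk.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // Hypothetical value format "alpha, beta": estimated CPU load = alpha * in0 + beta,
    // where in0 is the cardinality of the operator's first input.
    static long estimateCpu(Properties properties, long in0) {
        String key = getLoadProfileEstimatorConfigurationKey();
        String value = properties.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("No load function configured under " + key);
        }
        String[] parts = value.split(",");
        long alpha = Long.parseLong(parts[0].trim());
        long beta = Long.parseLong(parts[1].trim());
        return alpha * in0 + beta;
    }
}
```

Keeping the formula in a properties file means the coefficients can be tuned per deployment without recompiling the operator.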

Step 3: Add mappings

Create mappings from the Wayang operator to the platform-specific execution operators.
The mappings are located in the corresponding execution module, in its mapping package (for the Java platform, org.apache.wayang.java.mapping).
For the mapping between the above MapOperator and JavaMapOperator, see here.

After that, declare the mapping in the corresponding Mappings class so that Wayang can find it.
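Conceptually, a mapping pairs a pattern that matches a Wayang operator with a factory that produces the execution operator, and the Mappings class collects all such pairs for a platform. The sketch below shows that idea only. Every class in it is hypothetical, and Wayang's real mappings are built from plan transformations and subplan patterns rather than a simple class check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical logical and execution operator hierarchies.
class LogicalOpSketch {}
class LogicalMapOpSketch extends LogicalOpSketch {}
class ExecOpSketch {}
class JavaMapExecOpSketch extends ExecOpSketch {}

// A mapping: which logical operator it matches, and how to build the replacement.
class MappingSketch {
    private final Class<? extends LogicalOpSketch> pattern;
    private final Function<LogicalOpSketch, ExecOpSketch> replacementFactory;

    MappingSketch(Class<? extends LogicalOpSketch> pattern,
                  Function<LogicalOpSketch, ExecOpSketch> replacementFactory) {
        this.pattern = pattern;
        this.replacementFactory = replacementFactory;
    }

    Optional<ExecOpSketch> apply(LogicalOpSketch operator) {
        if (pattern.isInstance(operator)) {
            return Optional.of(replacementFactory.apply(operator));
        }
        return Optional.empty();
    }
}

// Central registry: a mapping that is not declared here is never applied.
class MappingsSketch {
    static final List<MappingSketch> ALL = Arrays.asList(
            new MappingSketch(LogicalMapOpSketch.class, op -> new JavaMapExecOpSketch()));

    static Optional<ExecOpSketch> translate(LogicalOpSketch operator) {
        for (MappingSketch mapping : ALL) {
            Optional<ExecOpSketch> result = mapping.apply(operator);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }
}
```

The registry is why the declaration step matters: an operator with no declared mapping yields no execution operator for that platform.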

Step 4: Extend the Scala-like Java API

Once you have created a new operator, you need to expose it in the API so that users can call it as a function in the dataflow jobs they create. To do so, go to the wayang-api/wayang-api-scala-java module and extend JavaPlanBuilder.scala if the new operator is a source operator, or DataQuantaBuilder if it is not.
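The split between the two classes can be sketched as a small fluent API: source operators start a dataflow and therefore hang off the plan builder, while all other operators are methods on the object that represents an intermediate dataset. PlanBuilderSketch and DataQuantaBuilderSketch below are hypothetical and written in Java for consistency with the rest of this guide; the real classes are defined in Scala, and they build a plan lazily instead of evaluating eagerly as this sketch does.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical counterpart of a builder for intermediate datasets:
// non-source operators such as map are exposed as methods here.
class DataQuantaBuilderSketch<T> {
    private final List<T> data;

    DataQuantaBuilderSketch(List<T> data) {
        this.data = data;
    }

    // Exposes the map operator to users as a chained function call.
    // Evaluated eagerly for simplicity; a real builder would only record the operator.
    <R> DataQuantaBuilderSketch<R> map(Function<T, R> udf) {
        return new DataQuantaBuilderSketch<>(
                data.stream().map(udf).collect(Collectors.toList()));
    }

    List<T> collect() {
        return data;
    }
}

// Hypothetical counterpart of a plan builder: source operators are exposed here
// because they have no upstream dataset to be called on.
class PlanBuilderSketch {
    <T> DataQuantaBuilderSketch<T> loadCollection(List<T> data) {
        return new DataQuantaBuilderSketch<>(data);
    }
}
```

With these definitions, a user-facing job reads as new PlanBuilderSketch().loadCollection(values).map(x -> x * 2).collect(), which is the style of call a newly exposed operator would join.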