Adding new operators in Wayang
This guide walks through the four steps for adding a new operator to Wayang, using the Map operator as an example.
Step 1: Add a Wayang operator
Wayang operators are located in the wayang-basic module, in the org.apache.wayang.basic.operators package.
An operator needs to extend one of the following abstract classes: UnaryToUnaryOperator, BinaryToUnaryOperator, UnarySource, UnarySink.
MapOperator is an example of a unary-to-unary operator.
For better performance in Wayang, consider adding a cardinality estimator by overriding the createCardinalityEstimator() method.
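The shape of such an operator can be sketched as follows. This is a minimal, self-contained illustration and not Wayang's actual code: UnaryToUnaryOperatorStub and MapOperatorStub are hypothetical stand-ins, and the estimator is reduced to a plain function on the input cardinality, whereas Wayang's real createCardinalityEstimator() has a richer signature.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.LongUnaryOperator;

public class OperatorSketch {

    // Hypothetical stand-in for Wayang's UnaryToUnaryOperator base class.
    abstract static class UnaryToUnaryOperatorStub<I, O> {
        // By default, an operator offers no cardinality estimate.
        Optional<LongUnaryOperator> createCardinalityEstimator(int outputIndex) {
            return Optional.empty();
        }
    }

    // A map-like operator: one input, one output, one function applied per element.
    static class MapOperatorStub<I, O> extends UnaryToUnaryOperatorStub<I, O> {
        final Function<I, O> function;

        MapOperatorStub(Function<I, O> function) {
            this.function = function;
        }

        @Override
        Optional<LongUnaryOperator> createCardinalityEstimator(int outputIndex) {
            if (outputIndex != 0) {
                throw new IllegalArgumentException("A unary operator only has output 0.");
            }
            // A map emits exactly one element per input element.
            return Optional.of(inputCardinality -> inputCardinality);
        }
    }

    public static void main(String[] args) {
        MapOperatorStub<Integer, String> op =
                new MapOperatorStub<Integer, String>(i -> "value-" + i);
        long estimate = op.createCardinalityEstimator(0).get().applyAsLong(1000L);
        System.out.println("Estimated output cardinality: " + estimate);
    }
}
```

The point of the override is that the operator itself knows how its output size relates to its input size; for a map, the two are equal.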
Step 2: Add the (platform-specific) execution operators
Execution operators are located in the corresponding module of wayang-platforms. For instance, Java execution operators are located in the org.apache.wayang.java.operators package of the wayang-java module.
An execution operator must extend its corresponding Wayang operator and implement the corresponding platform operator interface.
For the MapOperator above, the corresponding execution operator is JavaMapOperator.
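The "extend the Wayang operator, implement the platform interface" pattern can be sketched as below. All types here are hypothetical stand-ins written for this example. In particular, the evaluate method works on plain lists, which is a simplification of what Wayang's Java platform interface actually requires.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ExecutionOperatorSketch {

    // Hypothetical stand-in for the platform-agnostic Wayang operator from Step 1.
    static class MapOperatorStub<I, O> {
        final Function<I, O> function;

        MapOperatorStub(Function<I, O> function) {
            this.function = function;
        }
    }

    // Hypothetical stand-in for the Java platform's operator interface.
    interface JavaExecutionOperatorStub<I, O> {
        List<O> evaluate(List<I> input);
    }

    // The execution operator inherits the logical definition
    // and adds the Java-specific evaluation.
    static class JavaMapOperatorStub<I, O> extends MapOperatorStub<I, O>
            implements JavaExecutionOperatorStub<I, O> {

        JavaMapOperatorStub(Function<I, O> function) {
            super(function);
        }

        @Override
        public List<O> evaluate(List<I> input) {
            return input.stream().map(this.function).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        JavaMapOperatorStub<Integer, Integer> doubler =
                new JavaMapOperatorStub<Integer, Integer>(x -> x * 2);
        System.out.println(doubler.evaluate(Arrays.asList(1, 2, 3))); // prints [2, 4, 6]
    }
}
```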
For better performance in Wayang, consider adding a load function as well.
To do so, override the getLoadProfileEstimatorConfigurationKey() method and return the key under which the load function is read from a properties file.
For JavaMapOperator, the key is wayang.java.map.load. Then add two entries to the corresponding properties file (for example, the one for the Java executor): the template, which is the mathematical formula that represents the cost of this operator, and an instantiation of that template. The entries for the map operator in that file serve as an example.
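How the key ties the operator to its properties entry can be sketched as follows. The interface and class are hypothetical stand-ins, the property values are placeholders rather than real cost formulas, and the wayang.java.map.load.template key name for the template entry is an assumption; check the existing entries in the properties file for the actual format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class LoadKeySketch {

    // Hypothetical stand-in for the part of the execution operator
    // interface that names the configuration key.
    interface ExecutionOperatorStub {
        String getLoadProfileEstimatorConfigurationKey();
    }

    static class JavaMapOperatorStub implements ExecutionOperatorStub {
        @Override
        public String getLoadProfileEstimatorConfigurationKey() {
            return "wayang.java.map.load";
        }
    }

    // Looks up the operator's load specification in the given properties text.
    static String lookUpLoadSpec(ExecutionOperatorStub operator, String propertiesText) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties.getProperty(operator.getLoadProfileEstimatorConfigurationKey());
    }

    public static void main(String[] args) {
        // Placeholder values; a real file holds the cost formula and its instantiation.
        // The ".template" key name is an assumption made for this sketch.
        String propertiesText =
                "wayang.java.map.load.template = <cost formula with free parameters>\n"
              + "wayang.java.map.load = <formula with concrete parameter values>\n";
        System.out.println(lookUpLoadSpec(new JavaMapOperatorStub(), propertiesText));
    }
}
```

If the key returned by the operator does not match an entry in the file, the lookup returns null, so the two must be kept in sync.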
Step 3: Add mappings
Create mappings from the Wayang operator to the platform-specific execution operators.
The mappings are located in the corresponding execution module; for the Java platform, this is the org.apache.wayang.java.mapping package.
For the MapOperator and JavaMapOperator above, the mapping is the MapMapping class.
After that, register the mapping with Wayang by adding it to the corresponding Mappings class.
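The two parts of this step, defining a mapping and registering it, can be sketched as below. This is a simplified model with hypothetical stand-in types: Wayang's real mappings are expressed as plan transformations over operator patterns, not as the two-method interface used here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class MappingSketch {

    // Hypothetical stand-ins for the operators from Steps 1 and 2.
    static class MapOperatorStub { }
    static class JavaMapOperatorStub extends MapOperatorStub { }

    // Hypothetical stand-in for a mapping: it matches a Wayang operator
    // and builds the execution operator that replaces it.
    interface MappingStub {
        boolean matches(Object operator);
        Object replace(Object operator);
    }

    static class MapMappingStub implements MappingStub {
        @Override
        public boolean matches(Object operator) {
            // Match the logical operator only, not an already translated one.
            return operator.getClass() == MapOperatorStub.class;
        }

        @Override
        public Object replace(Object operator) {
            return new JavaMapOperatorStub();
        }
    }

    // Stand-in for the Mappings class: a mapping only takes effect once it is listed here.
    static final List<MappingStub> MAPPINGS =
            Arrays.<MappingStub>asList(new MapMappingStub());

    // Returns the execution operator for the given operator, if any mapping matches.
    static Optional<Object> translate(Object operator) {
        return MAPPINGS.stream()
                .filter(mapping -> mapping.matches(operator))
                .findFirst()
                .map(mapping -> mapping.replace(operator));
    }

    public static void main(String[] args) {
        Object translated = translate(new MapOperatorStub()).get();
        System.out.println(translated.getClass().getSimpleName()); // prints JavaMapOperatorStub
    }
}
```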
Step 4: Extend the Scala-like Java API
Once you have created a new operator, you need to expose it in the API so that users can call it as a function in the dataflow jobs they create. To do so, go to the wayang-api/wayang-api-scala-java module and extend the JavaPlanBuilder.scala file for a new source operator, or the DataQuantaBuilder for a non-source operator.
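What "exposing an operator as a function" means for users can be sketched in plain Java. The builder below is a hypothetical stand-in written for this example: it evaluates eagerly on a list, whereas the real API builds a Wayang plan, and the actual files to edit are Scala sources.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ApiSketch {

    // Hypothetical stand-in for a builder such as DataQuantaBuilder.
    static class DataQuantaBuilderStub<T> {
        private final List<T> data;

        DataQuantaBuilderStub(List<T> data) {
            this.data = data;
        }

        // Exposes the map operator as a method so it can be chained in a dataflow job.
        <R> DataQuantaBuilderStub<R> map(Function<T, R> function) {
            List<R> mapped = data.stream().map(function).collect(Collectors.toList());
            return new DataQuantaBuilderStub<R>(mapped);
        }

        List<T> collect() {
            return data;
        }
    }

    // Hypothetical stand-in for a source method on a plan builder such as JavaPlanBuilder.
    static <T> DataQuantaBuilderStub<T> loadCollection(List<T> values) {
        return new DataQuantaBuilderStub<T>(values);
    }

    public static void main(String[] args) {
        List<String> result = loadCollection(Arrays.asList(1, 2, 3))
                .map(x -> "n" + x)
                .collect();
        System.out.println(result); // prints [n1, n2, n3]
    }
}
```

The split mirrors the one described above: source operators start a chain and therefore belong on the plan builder, while non-source operators continue a chain and belong on the data quanta builder.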