Installing and building Apache Wayang
Clone repository
git clone https://github.com/apache/incubator-wayang.git
Create binaries
Run the following commands to build Wayang and generate the tar.gz:
cd incubator-wayang
./mvnw clean install -DskipTests
./mvnw clean package -pl :wayang-assembly -Pdistribution
You can then find wayang-assembly-0.7.1-SNAPSHOT-dist.tar.gz in the wayang-assembly/target directory.
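To confirm that the build produced the archive, a small check like the following can help. It is a minimal sketch: the function name find_wayang_dist is hypothetical, and the glob only assumes the archive follows the wayang-assembly-<version>-dist.tar.gz naming shown above.

```shell
# Hypothetical helper: print the path of the first distribution archive
# found in the given directory (default: wayang-assembly/target).
# Returns 1 and prints a message to stderr if no archive is found.
find_wayang_dist() {
  target_dir="${1:-wayang-assembly/target}"
  for archive in "$target_dir"/wayang-assembly-*-dist.tar.gz; do
    # If the glob matches nothing it stays literal, so test for a real file.
    if [ -f "$archive" ]; then
      printf '%s\n' "$archive"
      return 0
    fi
  done
  echo "no distribution archive found in $target_dir" >&2
  return 1
}
```

Run from the repository root, find_wayang_dist prints the archive path, which can then be passed to tar.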
Prepare the environment
Wayang
tar -xvf wayang-assembly-0.7.1-SNAPSHOT-dist.tar.gz
cd wayang-0.7.1-SNAPSHOT
On Linux
echo "export WAYANG_HOME=$(pwd)" >> ~/.bashrc
# Single quotes keep the variables unexpanded until ~/.bashrc is sourced
echo 'export PATH=${PATH}:${WAYANG_HOME}/bin' >> ~/.bashrc
source ~/.bashrc
On macOS
echo "export WAYANG_HOME=$(pwd)" >> ~/.zshrc
# Single quotes keep the variables unexpanded until ~/.zshrc is sourced
echo 'export PATH=${PATH}:${WAYANG_HOME}/bin' >> ~/.zshrc
source ~/.zshrc
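The commands above append to the shell startup file every time they run, so repeating the setup leaves duplicate export lines. One way to avoid that is sketched below. The helper name append_once is hypothetical and not part of Wayang.

```shell
# Hypothetical helper: append a line to a file only if the file does not
# already contain exactly that line.
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example usage (run from the extracted Wayang directory):
#   append_once "export WAYANG_HOME=$(pwd)" ~/.bashrc
#   append_once 'export PATH=${PATH}:${WAYANG_HOME}/bin' ~/.bashrc
```

The first usage line is double-quoted so that $(pwd) is resolved immediately, while the second is single-quoted so that the variables are resolved only when the startup file is read.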
Others
- You need to install Apache Spark version 3 or higher. Don’t forget to set the SPARK_HOME environment variable.
- You need to install Apache Hadoop version 3 or higher. Don’t forget to set the HADOOP_HOME environment variable.
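Before running anything, it can be useful to verify that the variables named in this section are actually set. The following is a minimal sketch. The function name missing_env_vars is hypothetical, and it only checks that each variable is non-empty, not that it points to a valid installation.

```shell
# Hypothetical helper: print the name of every given variable that is
# unset or empty. Returns 0 only if all of them are set.
missing_env_vars() {
  status=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names must be trusted).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  return $status
}

# Example usage:
#   missing_env_vars WAYANG_HOME SPARK_HOME HADOOP_HOME || echo "set the variables listed above"
```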
Run the program
To run the WordCount example with Apache Wayang, submit the program with the 'wayang-submit' command:
cd wayang-0.7.1-SNAPSHOT
./bin/wayang-submit org.apache.wayang.apps.wordcount.Main java file://$(pwd)/README.md
You should then see the output of the WordCount example.
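The example passes its input as an absolute file:// URL built from $(pwd). To run WordCount on a different file given by a relative path, a helper along these lines may be convenient. It is a sketch only: to_file_url is a hypothetical name, and it assumes the path contains no characters that would need URL encoding.

```shell
# Hypothetical helper: turn a relative or absolute path to an existing
# file into an absolute file:// URL.
to_file_url() {
  # Resolve the containing directory to an absolute path.
  dir=$(cd "$(dirname -- "$1")" && pwd) || return 1
  # Avoid a doubled slash when the file sits directly under /.
  case "$dir" in /) dir="" ;; esac
  printf 'file://%s/%s\n' "$dir" "$(basename -- "$1")"
}

# Example usage:
#   ./bin/wayang-submit org.apache.wayang.apps.wordcount.Main java "$(to_file_url README.md)"
```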
Compiling Apache Wayang
Apache Wayang (incubating) has several dependencies. To compile it, some profiles must be added to the build so that Maven works properly.
mvn clean compile
The command above is needed because the ANTLR plugin is not required by every module, and the same applies to the Scala plugin.
When Maven compiles one or more modules that use these plugins at compile time, the corresponding profile must be added.
The modules are:
- wayang-api-scala-java
- wayang-core (Antlr)
- wayang-iejoin
- wayang-spark
- wayang-profiler
- wayang-tests-integration
Executing Coverage
mvn clean verify jacoco:report
The final report is placed in ./target/aggregate.exec/aggregate.exec