Building Apache Hudi 1.0.0 from Source

Building Hudi 1.0.0

1. Download Maven

https://maven.apache.org/download.cgi

Direct link: https://dlcdn.apache.org/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz

wget https://dlcdn.apache.org/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz

tar -zxvf apache-maven-3.9.9-bin.tar.gz

2. Add Maven to the environment variables

Edit the profile:

vi /etc/profile

Add the following lines:

export MAVEN_HOME=/usr/local/soft/apache-maven-3.9.9
export PATH=$PATH:$MAVEN_HOME/bin

Apply the changes:

source /etc/profile
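If this step is scripted rather than done in vi, a minimal sketch might look like the following. The helper name add_maven_env is made up for illustration; it appends the two export lines only when MAVEN_HOME is not already defined in the target file, so running it twice does not duplicate them.

```shell
# Append the Maven exports to a profile file unless MAVEN_HOME is already set there.
# add_maven_env is a hypothetical helper name used only for this sketch.
add_maven_env() {
  profile_file="$1"   # e.g. /etc/profile (needs root)
  maven_home="$2"     # e.g. /usr/local/soft/apache-maven-3.9.9
  if ! grep -q '^export MAVEN_HOME=' "$profile_file" 2>/dev/null; then
    {
      printf 'export MAVEN_HOME=%s\n' "$maven_home"
      # Single quotes keep $PATH and $MAVEN_HOME unexpanded in the file.
      printf '%s\n' 'export PATH=$PATH:$MAVEN_HOME/bin'
    } >> "$profile_file"
  fi
}
```

Example call (as root): add_maven_env /etc/profile /usr/local/soft/apache-maven-3.9.9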

3. Add Maven mirrors

Add the following entries inside the <mirrors> element of:

/usr/local/soft/apache-maven-3.9.9/conf/settings.xml

Both mirrors are needed; with the Aliyun mirror alone, some libraries cannot be downloaded.

<mirror>
  <id>alimaven</id>
  <name>aliyun maven</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
  <id>confluent</id>
  <name>confluent maven</name>
  <url>http://packages.confluent.io/maven/</url>
  <mirrorOf>confluent</mirrorOf>
</mirror>
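As an alternative to editing the global file, Maven also reads a per-user settings file at ~/.m2/settings.xml. The sketch below writes a minimal file with the same two mirrors to show where the entries sit in the document structure. The helper name write_user_settings is an assumption for illustration, and the bare <settings> root (without the XML namespace attributes) is a simplification.

```shell
# Write a minimal Maven settings file containing both mirrors.
# write_user_settings is a hypothetical helper; the target path is an argument
# so that an existing settings file is not overwritten by accident.
write_user_settings() {
  target_file="$1"    # e.g. "$HOME/.m2/settings.xml"
  mkdir -p "$(dirname "$target_file")"
  cat > "$target_file" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>alimaven</id>
      <name>aliyun maven</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
    <mirror>
      <id>confluent</id>
      <name>confluent maven</name>
      <url>http://packages.confluent.io/maven/</url>
      <mirrorOf>confluent</mirrorOf>
    </mirror>
  </mirrors>
</settings>
EOF
}
```

Example call: write_user_settings "$HOME/.m2/settings.xml" (back up any existing file first).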

4. Verify mvn

mvn -v
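To confirm that the mvn on the PATH is the version just installed, the first line of the mvn -v output can be parsed. This is a small sketch; maven_version is a made-up helper name, and it assumes the output starts with a line of the form "Apache Maven <version> (...)".

```shell
# Extract the version number from "mvn -v" output read on stdin.
# maven_version is a hypothetical helper name used only for this sketch.
maven_version() {
  sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

Example call: mvn -v | maven_version, which should print 3.9.9 for the installation described here.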


5. Download Hudi 1.0.0

Hudi download page:

Download | Apache Hudi

or Index of /hudi/1.0.0

Download the source archive:

wget https://downloads.apache.org/hudi/1.0.0/hudi-1.0.0.src.tgz
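Apache release pages publish a SHA-512 digest for each artifact, so the archive can be checked before it is unpacked. The following is a minimal sketch; verify_sha512 is a hypothetical helper, and the expected digest is passed in by hand (copied from the download page) rather than fetched, because the location and layout of the checksum file are not assumed here.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# verify_sha512 is a hypothetical helper; it returns 0 on match, non-zero otherwise.
verify_sha512() {
  file="$1"
  expected="$2"
  actual=$(sha512sum "$file" | awk '{print $1}')
  # Normalise case, since published digests may be upper- or lower-case.
  [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]
}
```

Example call: verify_sha512 hudi-1.0.0.src.tgz "<digest from the download page>" && echo OK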

6. Extract Hudi

tar -zxvf hudi-1.0.0.src.tgz

7. Modify the Hudi source

a. In /usr/local/soft/hudi-1.0.0/hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/testutils/HiveTestUtil.java, at line 250, change zkServer.shutdown(true); to zkServer.shutdown();
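The HiveTestUtil.java change can also be applied by pattern instead of by line number, which avoids depending on the exact position of the call. A minimal sketch, where patch_zk_shutdown is a made-up helper name:

```shell
# Replace zkServer.shutdown(true); with zkServer.shutdown(); in the given file.
# patch_zk_shutdown is a hypothetical helper name used only for this sketch.
patch_zk_shutdown() {
  java_file="$1"
  tmp_file="${java_file}.tmp"
  # Write to a temporary file first, then swap it in (portable alternative to sed -i).
  sed 's/zkServer\.shutdown(true);/zkServer.shutdown();/' "$java_file" > "$tmp_file" &&
    mv "$tmp_file" "$java_file"
}
```

Example call: patch_zk_shutdown /usr/local/soft/hudi-1.0.0/hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/testutils/HiveTestUtil.java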

b. In /usr/local/soft/hudi-1.0.0/pom.xml, comment out or remove line 410. Then install the Confluent 5.5.0 jars into the local Maven repository manually:

cd /usr/local/soft
wget http://packages.confluent.io/archive/5.5/confluent-5.5.0-2.12.zip
unzip confluent-5.5.0-2.12.zip
mvn install:install-file -DgroupId=io.confluent -DartifactId=common-config -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/confluent-common/common-config-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=common-utils -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/confluent-common/common-utils-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-avro-serializer -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-avro-serializer-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-schema-registry-client -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-schema-registry-client-5.5.0.jar
mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-json-schema-serializer -Dversion=5.5.0 -Dpackaging=jar -Dfile=./confluent-5.5.0/share/java/kafka-rest/kafka-json-schema-serializer-5.5.0.jar
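Since the install commands differ only in the artifact name and its subdirectory, they can be generated from a short table, which makes typos in individual lines less likely. The sketch below only prints the commands; print_confluent_installs is a hypothetical helper, and the directory passed in is assumed to be the share/java directory of the unpacked Confluent archive.

```shell
# Print one "mvn install:install-file" command per Confluent jar.
# print_confluent_installs is a hypothetical helper name used only for this sketch.
print_confluent_installs() {
  java_dir="$1"   # share/java directory of the unpacked Confluent archive
  version="$2"    # e.g. 5.5.0
  # Each table row is: <subdirectory> <artifactId>
  while read -r subdir artifact; do
    printf 'mvn install:install-file -DgroupId=io.confluent -DartifactId=%s -Dversion=%s -Dpackaging=jar -Dfile=%s/%s/%s-%s.jar\n' \
      "$artifact" "$version" "$java_dir" "$subdir" "$artifact" "$version"
  done <<'EOF'
confluent-common common-config
confluent-common common-utils
kafka-rest kafka-avro-serializer
kafka-rest kafka-schema-registry-client
kafka-rest kafka-json-schema-serializer
EOF
}
```

Review the printed commands first; piping the output to sh would then run them.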

c. Add the following dependencies inside the <dependencies> element of both of these POM files:

/usr/local/soft/hudi-1.0.0/packaging/hudi-spark-bundle/pom.xml

/usr/local/soft/hudi-1.0.0/packaging/hudi-utilities-bundle/pom.xml

<!-- Add Jetty at the version configured by Hudi -->
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-server</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-webapp</artifactId>
  <version>${jetty.version}</version>
</dependency>
<dependency>
  <groupId>org.eclipse.jetty</groupId>
  <artifactId>jetty-http</artifactId>
  <version>${jetty.version}</version>
</dependency>

8. Build Hudi

cd hudi-1.0.0
mvn clean package -DskipTests -Dspark3.5 -Dflink1.20 -Dscala-2.12 -Dhadoop.version=3.4.0 -Pflink-bundle-shade-hive3

Or, for Spark 3.4, Flink 1.14 and Hadoop 3.1.1:

mvn clean package -DskipTests -Dspark3.4 -Dflink1.14 -Dscala-2.12 -Dhadoop.version=3.1.1 -Pflink-bundle-shade-hive3
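The two build commands differ only in the Spark, Flink and Hadoop versions, so the command line can be assembled from parameters. A small sketch, assuming the flag pattern shown above holds for the chosen versions; hudi_build_cmd is a made-up helper that only prints the command.

```shell
# Print the Hudi build command for a given Spark/Flink/Scala/Hadoop combination.
# hudi_build_cmd is a hypothetical helper name used only for this sketch.
hudi_build_cmd() {
  spark="$1"    # e.g. 3.5
  flink="$2"    # e.g. 1.20
  scala="$3"    # e.g. 2.12
  hadoop="$4"   # e.g. 3.4.0
  printf 'mvn clean package -DskipTests -Dspark%s -Dflink%s -Dscala-%s -Dhadoop.version=%s -Pflink-bundle-shade-hive3\n' \
    "$spark" "$flink" "$scala" "$hadoop"
}
```

Example call from inside hudi-1.0.0: eval "$(hudi_build_cmd 3.5 1.20 2.12 3.4.0)"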
