HDFS + HMS on Raspberry Pi

Prerequisites

➜  ~ ssh-keygen -t ed25519

Generating public/private ed25519 key pair.
Enter file in which to save the key (/home/maksim/.ssh/id_ed25519): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/maksim/.ssh/id_ed25519
Your public key has been saved in /home/maksim/.ssh/id_ed25519.pub
The key fingerprint is:
SHA256:OoRvk0QoY3Vnwl9n4rfguWKQJVbhua5u8vVqlJKTyKE maksim@hadoop
The key's randomart image is:
+--[ED25519 256]--+
|    ..o +.       |
|   . o.=..o o    |
|  + . .ooo +     |
| . o.oo o.o .    |
|   o.+o*So + .   |
|  E o+B+o o .    |
|      B=o  .     |
|    ...=+..      |
|     =+o.o.      |
+----[SHA256]-----+
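
The key is generated because Hadoop's start-dfs.sh/stop-dfs.sh scripts launch the daemons over ssh, so the local user must be able to log into the node without a password. A minimal sketch for this single-node setup (assuming the daemons run as the same user, maksim):

# authorize the new key for the local user and test a password-less login
cat ~/.ssh/id_ed25519.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost true   # should return without prompting for a password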

Packages

  • sudo apt install openjdk-17-jdk openjdk-17-jre
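
A quick check that the JDK landed where the JAVA_HOME export below expects it (the arm64 path is the default layout on Debian/Raspberry Pi OS, which these notes assume):

java -version
ls /usr/lib/jvm/java-17-openjdk-arm64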

Downloads

  • wget -c https://dlcdn.apache.org/hive/hive-standalone-metastore-3.0.0/hive-standalone-metastore-3.0.0-bin.tar.gz
  • wget -c https://dlcdn.apache.org/hadoop/common/hadoop-3.4.1/hadoop-3.4.1.tar.gz
  • wget -c https://jdbc.postgresql.org/download/postgresql-42.7.7.jar
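
Apache publishes .sha512 files alongside the release tarballs, so the Hadoop download can be verified before unpacking (the exact checksum URL is an assumption based on the usual mirror layout):

wget -c https://dlcdn.apache.org/hadoop/common/hadoop-3.4.1/hadoop-3.4.1.tar.gz.sha512
sha512sum -c hadoop-3.4.1.tar.gz.sha512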

PostgreSQL

sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt install postgresql-17

psql

postgres=# CREATE USER hive WITH PASSWORD 'iddqd';
CREATE ROLE
postgres=# CREATE DATABASE metastore WITH OWNER hive;
CREATE DATABASE
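
Before wiring this into the metastore config, it is worth confirming that the hive role can log in over TCP with a password, since that is how the JDBC URL below connects (whether this works out of the box depends on the pg_hba.conf shipped with the PGDG packages):

psql "host=127.0.0.1 port=5432 dbname=metastore user=hive password=iddqd" -c '\conninfo'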

Dirs & Files

➜  ~ sudo mkdir /opt/hive
➜  ~ sudo mkdir /opt/hadoop
➜  ~ sudo mkdir /data
➜  ~ sudo chown -R maksim:maksim /opt/hive /opt/hadoop /data
➜  ~ mkdir -p /data/{namenode,datanode}
➜  ~ tar -xzf hadoop-3.4.1.tar.gz -C /opt/hadoop --strip-components=1
➜  ~ tar -xzf hive-standalone-metastore-3.0.0-bin.tar.gz -C /opt/hive --strip-components=1

Add to ~/.zshrc:

export JAVA_HOME="/usr/lib/jvm/java-17-openjdk-arm64"
export HADOOP_HOME="/opt/hadoop"
export HIVE_HOME="/opt/hive"
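
Optionally also put the Hadoop scripts on PATH and reload the shell, so hdfs and the start-*/stop-* scripts can be run without full paths (the rest of these notes keeps using full paths, so this is only a convenience):

export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
source ~/.zshrc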

HDFS

Configs

  • /opt/hadoop/etc/hadoop/hadoop-env.sh
...
export JAVA_HOME="/usr/lib/jvm/java-17-openjdk-arm64"
...
  • /opt/hadoop/etc/hadoop/core-site.xml
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://100.64.88.101:9000</value>
        </property>
</configuration>
  • /opt/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data/datanode</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/data/namenode</value>
        </property>
</configuration>
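
To confirm Hadoop actually resolves these files, hdfs getconf can echo the effective values back (getconf is part of the stock hdfs CLI):

/opt/hadoop/bin/hdfs getconf -confKey fs.defaultFS      # expect hdfs://100.64.88.101:9000
/opt/hadoop/bin/hdfs getconf -confKey dfs.replication   # expect 1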

Startup

  • Format the namenode
/opt/hadoop/bin/hdfs namenode -format
  • Π—Π°ΠΏΡƒΡΡ‚ΠΈΡ‚ΡŒ HDFS
/opt/hadoop/sbin/start-dfs.sh
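
After start-dfs.sh returns, jps (shipped with the JDK) should list the HDFS daemons; the NameNode web UI is on port 9870 by default in Hadoop 3.x:

jps   # expect NameNode, DataNode and SecondaryNameNode among the output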

Verification

➜  hadoop ./bin/hdfs dfs -ls /                      
2025-06-25 11:42:42,002 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
 
➜  hadoop ./bin/hdfs dfs -put LICENSE.txt /         
2025-06-25 11:43:00,056 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
 
➜  hadoop ./bin/hdfs dfs -ls /             
2025-06-25 11:43:19,030 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 maksim supergroup      15696 2025-06-25 11:43 /LICENSE.txt
➜  hadoop ./bin/hdfs dfs -rm -skipTrash /LICENSE.txt
2025-06-25 11:45:45,293 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /LICENSE.txt
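
The NativeCodeLoader warning is expected on this board: the stock Apache tarball ships x86_64 native libraries, so on ARM Hadoop falls back to the built-in Java implementations. checknative shows exactly what was (not) loaded, if you want to confirm nothing else is wrong:

/opt/hadoop/bin/hadoop checknative -a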

HMS

  • Put postgresql-42.7.7.jar into /opt/hive/lib.
  • /opt/hive/conf/metastore-site.xml
<configuration>
    <property>
        <name>hive.metastore.port</name>
        <value>9083</value>
    </property>
    <property>
        <name>metastore.thrift.uris</name>
        <value>thrift://100.64.88.101:9083</value>
    </property>
    <property>
        <name>metastore.task.threads.always</name>
        <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
    </property>
    <property>
        <name>metastore.expression.proxy</name>
        <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>org.postgresql.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:postgresql://127.0.0.1:5432/metastore</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>iddqd</value>
    </property>
    <property>
        <name>metastore.warehouse.dir</name>
        <value>hdfs://100.64.88.101:9000/warehouse</value>
    </property>
</configuration>
  • Prepare directories in HDFS
./bin/hdfs dfs -mkdir /warehouse
./bin/hdfs dfs -chown -R trino:supergroup /warehouse
  • Π—Π°ΠΏΡƒΡΡ‚ΠΈΡ‚ΡŒ HMS
./bin/start-metastore
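
Note on the first start: with a brand-new PostgreSQL database the metastore has no schema yet. The standalone metastore distribution ships a schematool for this; a likely invocation for this setup (treat the flags as an assumption and check ./bin/schematool -help first) is:

./bin/schematool -initSchema -dbType postgres

Once start-metastore is running, the Thrift endpoint can be checked from another shell by confirming that something is listening on 9083:

ss -ltn | grep 9083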