How to Install Apache Spark on Debian 12
Apache Spark is a free, open-source, general-purpose distributed computing framework designed to deliver fast computational results. It provides high-level APIs in Java, Scala, Python, and R, along with libraries for SQL, streaming, and graph processing. Spark is commonly run on Hadoop clusters, but it can also be installed in standalone mode.
In this tutorial, we will show you how to install the Apache Spark framework on Debian 12.
Useful Links:
VPS/VDS – https://www.mivocloud.com/
WARNING – ANGLED BRACKETS AREN’T ALLOWED IN THE DESCRIPTION, SO FOLLOW THE VIDEO FOR THE NANO EDITOR STEPS
Commands Used:
apt-get install default-jdk curl -y
java --version
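Before moving on, it can help to confirm that a JDK is actually on the PATH; a minimal sketch (note that the `--version` form needs JDK 9 or newer, which Debian 12's default-jdk provides):

```shell
# Sketch: confirm a JDK is present before proceeding (Spark 3.4 supports Java 8/11/17)
if command -v java >/dev/null 2>&1; then
  JAVA_STATUS="found"
  java --version
else
  JAVA_STATUS="missing"
  echo "no JDK found - run: apt-get install default-jdk -y"
fi
```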
wget https://dlcdn.apache.org/spark/spark-3.4.1/spark-3.4.1-bin-hadoop3.tgz
tar -xvzf spark-3.4.1-bin-hadoop3.tgz
mv spark-3.4.1-bin-hadoop3/ /opt/spark
nano ~/.bashrc
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
source ~/.bashrc
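After reloading ~/.bashrc, a quick way to confirm the new PATH entries took effect is to grep for them; a sketch, using the /opt/spark install path from the steps above:

```shell
# Sketch: verify that the Spark bin/sbin directories ended up on PATH
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
if echo "$PATH" | grep -q "/opt/spark/bin"; then
  echo "spark is on PATH"
else
  echo "spark is NOT on PATH - re-check ~/.bashrc"
fi
```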
start-master.sh
ss -tunelp | grep 8080
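If ss isn't installed, bash's built-in /dev/tcp pseudo-device can serve as a rough substitute for checking whether the master's web UI is up; a sketch assuming the default port 8080 on the local machine:

```shell
# Sketch: probe the master web UI port (8080 by default) using bash's /dev/tcp,
# in case ss/netstat aren't available
if (exec 3<>/dev/tcp/127.0.0.1/8080) 2>/dev/null; then
  UI_STATUS="listening"
else
  UI_STATUS="closed"
fi
echo "port 8080: $UI_STATUS"
```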
start-worker.sh spark://your-server-ip:7077
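Instead of typing your-server-ip by hand, the master URL can be derived from the machine's hostname; a sketch (7077 is the default master port, and start-worker.sh is the current name for the deprecated start-slave.sh in Spark 3.x — the call is commented out so it only runs when uncommented on a worker node):

```shell
# Sketch: build the master URL from the hostname instead of hard-coding the IP
# (assumption: the master listens on the default port 7077)
MASTER_URL="spark://$(hostname):7077"
echo "$MASTER_URL"
# start-worker.sh "$MASTER_URL"   # uncomment on the worker node
```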
spark-shell
apt-get install python3 -y
pyspark
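As a final smoke test, pyspark can also be driven non-interactively by piping a statement into it; a sketch, assuming the PATH changes above are in effect (the range/count job is just an illustration):

```shell
# Sketch: non-interactive PySpark smoke test (assumes pyspark is on PATH)
if command -v pyspark >/dev/null 2>&1; then
  PYSPARK_STATUS="present"
  echo 'print(spark.range(5).count())' | pyspark
else
  PYSPARK_STATUS="absent"
  echo "pyspark not on PATH - re-check SPARK_HOME and PATH"
fi
```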