Apache Spark is an open-source framework for large-scale data processing. It ships with higher-level libraries for SQL queries, streaming data, machine learning, and graph processing, and it can distribute large datasets across a cluster and process them in parallel. If you are a developer who needs to build complex data workflows, Apache Spark is a great place to start. This post shows how to install Apache Spark on Ubuntu.

Install Java JDK

Apache Spark requires a Java JDK. On Ubuntu, you can install one from the default repositories with apt, for example through the default-jdk package. After installing, run java -version to verify the version of Java installed. It should print a few lines that name the Java version and runtime.
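The Java step can be sketched as the following shell functions. This is a minimal sketch, assuming the default-jdk metapackage from the Ubuntu repositories is the JDK you want. The function names install_jdk and verify_jdk are made up for illustration, and nothing runs until you call them.

```shell
# Assumption: default-jdk is Ubuntu's metapackage for the distribution's
# default OpenJDK build. Pick a specific openjdk-* package if you need one.

# Refresh the package index and install the JDK.
install_jdk() {
  sudo apt update
  sudo apt install -y default-jdk
}

# Print the installed Java version (java writes this to stderr).
verify_jdk() {
  java -version
}
```

Call install_jdk once, then verify_jdk to confirm that a Java runtime is on your PATH.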

Install Scala

One package that you’ll also need to run Apache Spark is Scala. On Ubuntu, you can install it from the default repositories with apt. To verify the version of Scala installed, run scala -version, which prints a single line naming the installed version.
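The Scala step follows the same pattern. This is a minimal sketch, assuming the scala package from the Ubuntu repositories is recent enough for your Spark release. The function names are hypothetical, and nothing runs until you call them.

```shell
# Assumption: the "scala" package in the Ubuntu repositories is acceptable.
# Some Spark releases expect a particular Scala line, so check before relying on it.

# Install Scala from the default repositories.
install_scala() {
  sudo apt update
  sudo apt install -y scala
}

# Print the installed Scala version.
verify_scala() {
  scala -version
}
```

Call install_scala, then verify_scala to check the result.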

Install Apache Spark

Now that you have installed the required packages, you can install Apache Spark itself. Download the latest release archive from the Apache Spark downloads page, then extract it and move the extracted directory to the /opt directory. Next, create environment variables so that you can run Spark from any directory: open your shell profile (for example, ~/.bashrc), add lines at the bottom of the file that set SPARK_HOME and put Spark's bin and sbin directories on your PATH, and save. After that, reload the profile to apply your environment changes.
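The download, extract, and environment steps above could look roughly like this. Several values here are assumptions that do not come from the original post: the release number 3.5.1, the archive.apache.org URL, the /opt/spark target directory, and ~/.bashrc as the profile file. Check the Spark downloads page for the current release before using them. The function names are hypothetical.

```shell
# Assumed values; adjust to the release you actually want.
SPARK_VERSION="3.5.1"
SPARK_PACKAGE="spark-${SPARK_VERSION}-bin-hadoop3"
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_PACKAGE}.tgz"
SPARK_HOME="/opt/spark"

# Download the release archive into the current directory.
download_spark() {
  wget "$SPARK_URL"
}

# Extract the archive and move the resulting directory under /opt.
install_spark() {
  tar -xzf "${SPARK_PACKAGE}.tgz"
  sudo mv "$SPARK_PACKAGE" "$SPARK_HOME"
}

# Append the environment variables to a profile file (default: ~/.bashrc).
configure_spark_env() {
  profile="${1:-$HOME/.bashrc}"
  {
    echo "export SPARK_HOME=$SPARK_HOME"
    # Single quotes keep $PATH and $SPARK_HOME unexpanded in the profile.
    echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin'
  } >> "$profile"
}
```

Run download_spark, install_spark, and configure_spark_env in that order, then run source ~/.bashrc (or open a new terminal) so the new variables take effect. Note that configure_spark_env appends on every call, so run it only once per profile.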

Start Apache Spark

At this point, Apache Spark is installed and ready to use. Start the standalone master with the start-master.sh script. Next, start a Spark worker process and point it at the master; you can replace localhost in the master URL with the server hostname or IP address. Once the processes have started, open your browser and browse to the server hostname or IP address (the master web UI listens on port 8080 by default). If you wish to connect to Spark via its command shell, run spark-shell, which launches the Spark shell. That should do it!

Conclusion

This post showed you how to install Apache Spark on Ubuntu 20.04 and 18.04. If you find any errors above, please report them.
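As a recap of the Start Apache Spark section, the startup sequence can be sketched as follows. This assumes that Spark's bin and sbin directories are on your PATH and that the master uses its default port 7077. The function names are made up for illustration. The worker script is called start-worker.sh in Spark 3.1 and later; older releases ship it as start-slave.sh.

```shell
# Assumptions: Spark's scripts are on PATH and the master listens on the
# default port 7077. Replace localhost with the server hostname or IP address.
MASTER_HOST="localhost"
MASTER_URL="spark://${MASTER_HOST}:7077"

# Start the standalone master (its web UI defaults to port 8080).
start_master() {
  start-master.sh
}

# Start a worker and register it with the master.
start_worker() {
  start-worker.sh "$MASTER_URL"
}

# Open the interactive Spark shell against the standalone master.
open_spark_shell() {
  spark-shell --master "$MASTER_URL"
}
```

Call start_master, then start_worker, and check the web UI in a browser before running open_spark_shell.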