Monitoring for APACHE SPARK®

Installing AlertPro for Apache Spark

Introduction

Install one licensed instance of AlertPro to monitor each Spark History Server in your environment. That AlertPro installation can then monitor any Spark job run on any instance of Apache Spark that reports to the Spark History Server. AlertPro only needs network access to the API of your Spark History Server, so the mode in which you run Spark places no constraints on how you choose to deploy AlertPro. The diagram illustrates how AlertPro interacts with your Spark History Server.
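
Before installing, it can be useful to confirm that the Spark History Server REST API is reachable from wherever AlertPro will run. A minimal sketch, assuming a hypothetical hostname (the Spark History Server listens on port 18080 by default and exposes its monitoring API under /api/v1):

```shell
# Hypothetical History Server location; substitute your own host and port.
HS_URL="http://spark-history.example.com:18080"

# Spark's REST monitoring API lists applications under /api/v1/applications.
echo "${HS_URL}/api/v1/applications"

# From the machine (or cluster) that will run AlertPro, verify connectivity with:
#   curl -s "${HS_URL}/api/v1/applications"
```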

The deployment options for AlertPro are listed further down this page, each with quickstart installation and configuration instructions.

Kubernetes

Download our Kubernetes manifest and apply it from anywhere you can run kubectl commands against your cluster.
 
In this deployment mode, AlertPro runs as a scheduled Kubernetes job. It creates a 1 MB persistent volume using your cluster’s default storage class and stores Spark job state information on it. If you want to review the details and configuration of the AlertPro job before deploying it on your Kubernetes cluster, you can of course download the YAML file first, check and edit it according to your needs, and then kubectl apply your edited file.
 
Single line deploy command:

	kubectl apply -f https://alertplane.io/downloads/spark-alert-pro.yaml

To apply the license and edit SparkAlertPro configuration:

	kubectl -n alertplane edit cm sparkalertpro-cm

By default, the Kubernetes job runs every minute (cron schedule * * * * *). You can edit this to suit your needs. To see logs, run kubectl logs on the completed pods that are left behind after a run (the five most recent pods are retained).
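
If you prefer to change the schedule in the manifest itself, the job uses standard Kubernetes CronJob fields. A hypothetical excerpt is sketched below; the resource name is an assumption, so check the downloaded YAML for the real one:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sparkalertpro            # assumed name; confirm against the downloaded manifest
  namespace: alertplane
spec:
  schedule: "*/5 * * * *"        # e.g. run every 5 minutes instead of the default "* * * * *"
  successfulJobsHistoryLimit: 5  # keeps the 5 most recent completed pods for kubectl logs
```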

To increase the logging level, edit the config map and change logLevel from the default 1 (error logging only) up to 4 (debug logging).
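
For example, a hypothetical excerpt of the config map with debug logging enabled (only the logLevel key is documented here; your config map will contain other keys, including the license):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sparkalertpro-cm
  namespace: alertplane
data:
  logLevel: "4"   # 1 = error logging only (default), up to 4 = debug logging
```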

Linux

The Linux install of AlertPro works on RHEL, CentOS, Fedora, Ubuntu, Debian and SUSE. The self-extracting installer does not require any root privileges and can be run by any user.
 
  • First, the AlertPro binary, configuration file and persistent storage directory are installed under ~/.alertplane.
  • Next, the user is prompted to enter the license key, Spark History Server URL, optional authentication parameters for the Spark History Server, optional SMTP server configuration for sending email alerts via a corporate SMTP server, and an optional Slack webhook.
  • Finally, the installer sets up a cron job to run AlertPro every minute.
  • Logging is sent to the logger program by default. You can change this by editing the cron job (crontab -e). On most systems, you can then read the logs by running journalctl -t sparkalertpro -f.
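
The logging setup above can be sketched as follows. The cron entry shown is a hypothetical illustration of what the installer creates; the exact path and flags may differ on your system:

```shell
# Hypothetical cron entry piping output to logger, as the installer sets it up:
CRON_ENTRY='* * * * * $HOME/.alertplane/sparkalertpro 2>&1 | logger -t sparkalertpro'
echo "$CRON_ENTRY"

# To change where logs go, edit the entry with:
#   crontab -e
# and follow the journald-collected output with:
#   journalctl -t sparkalertpro -f
```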
 
Single line deploy command:

	wget https://alertplane.io/downloads/sparkalertpro-installer-amd64.tar && \
	tar -xvf sparkalertpro-installer-amd64.tar && \
	./sparkalertpro-installer-amd64*.bin

Docker

AlertPro can be run as a persistent Docker container on any host capable of running a Linux-based container (our base image is Alpine). The AlertPro container mounts a host directory, which you create and specify in the docker run command. You create the config.yaml file in a ‘config’ directory under your chosen host directory, and you can change its contents at any time without restarting the container – AlertPro will pick up your changes the next time it runs inside the container.
 
Commands:
				
LICENSE=<paste your license key here>
SPARKHS=<your Spark History Server URL>

mkdir -p ~/.alertplane/config ~/.alertplane/data

cat << EOF > ~/.alertplane/config/config.yaml
license: "$LICENSE"
hsURL: "$SPARKHS"
alertingURL: "https://alertplane.io:3002/alert"
emailFrom: "AlertPlane <sparkalert@datadoc.info>"
EOF

docker run -v ~/.alertplane/:/root/.alertplane --name sparkalertpro -d alertplane/sparkalertpro:latest
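
Because the config directory is bind-mounted, edits on the host take effect on AlertPro’s next run with no container restart. A sketch using a scratch copy (in practice you would edit ~/.alertplane/config/config.yaml directly; the logLevel key is borrowed from the Kubernetes config map and is an assumption for the Docker config file):

```shell
# Work on a scratch copy for illustration; edit the real mounted file in practice.
CFG=$(mktemp)
printf 'logLevel: 1\n' > "$CFG"              # assumed key, as in the Kubernetes config map
sed -i 's/^logLevel:.*/logLevel: 4/' "$CFG"  # raise logging to debug
cat "$CFG"                                   # prints: logLevel: 4

# Check that the container picked the change up on its next run:
#   docker logs --tail 50 sparkalertpro
```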