Running Java Spark in Docker – Part 1


I recently realized that Java Spark (translator's note: Java Spark is a web development framework inspired by Ruby's Sinatra, in the same family as Scalatra and Spring Boot) and Apache Spark (translator's note: Apache Spark is Apache's data processing engine) are easily confused, so to be clear up front: they are two different projects, and this article is about the former. Of all the web development I have done, Java Spark is the first Java web framework I have actually enjoyed using.

This article will show you how to run a Java Spark application in a Docker container. The tutorial is broken down into multiple Git branches that help illustrate the process.

This tutorial is thanks to Matthias from GiantSwarm. It skips over the basics of Java Spark itself. If you just want the finished code rather than building it up step by step, you can check out the add-java-spark-handler branch of the GitHub repository.


  • GitHub repository: bdparrish/docker-spark-tutorial
  • git clone https://github.com/bdparrish/docker-spark-tutorial.git
  • cd docker-spark-tutorial
  • Install Java 8 (translator's note: use Java 8 specifically)
  • Install Apache Maven

Step 1 – Build Dockerfile

Our first step is to launch a Docker container that will hold the Java Spark server and application. We need to create an initial Dockerfile and verify that it builds and runs correctly. Then we will install Apache Maven in the container.

Starting point:

  git clone https://github.com/bdparrish/docker-spark-tutorial.git

1. Add a Dockerfile to the project root directory

  FROM java:8
  RUN apt-get update
  RUN apt-get install -y maven

2. Build the Docker image

  docker build -t bdparrish/docker-spark-tutorial .

3. Verify that the Docker image is correct

  docker run -it bdparrish/docker-spark-tutorial

4. Check the Maven version

  mvn -version

If everything is correct, you will see output similar to the following:

  Apache Maven 3.0.5
  Maven home: /usr/share/maven
  Java version: 1.8.0_66-internal, vendor: Oracle Corporation
  Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
  Default locale: en, platform encoding: UTF-8
  OS name: "linux", version: "4.0.9-boot2docker", arch: "amd64", family: "unix"

5. To speed through this part of the tutorial, you can switch to the corresponding git branch

  git checkout build-dockerfile

Step 2 – Add pom.xml

Now we will set up a Maven project and add the Java Spark core dependency. As before, we then check that our Docker image still builds and runs.

Starting point:

  git checkout build-dockerfile

1. Add a pom.xml file to the project root directory

  <project xmlns="http://maven.apache.org/POM/4.0.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>

      <groupId>bdparrish</groupId>
      <artifactId>docker-spark-tutorial</artifactId>
      <version>1.0-SNAPSHOT</version>

      <dependencies>
          <dependency>
              <groupId>com.sparkjava</groupId>
              <artifactId>spark-core</artifactId>
              <version>2.0.0</version>
          </dependency>
      </dependencies>
  </project>

2. Update the Dockerfile by adding the following:

  WORKDIR /tutorial
  ADD pom.xml /tutorial/pom.xml
  RUN ["mvn", "dependency:resolve"]
  RUN ["mvn", "verify"]
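Taken together with Step 1, the Dockerfile at this point reads as follows (assembled from the fragments above). Note the ordering: copying pom.xml and resolving dependencies before any source code means Docker can cache the slow dependency-download layers, so later source-only changes rebuild quickly.

```dockerfile
FROM java:8
RUN apt-get update
RUN apt-get install -y maven

# Resolve Maven dependencies in their own cached layers,
# before any application source is added.
WORKDIR /tutorial
ADD pom.xml /tutorial/pom.xml
RUN ["mvn", "dependency:resolve"]
RUN ["mvn", "verify"]
```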

3. Rebuild the Docker image

  docker build -t bdparrish/docker-spark-tutorial .

4. Run a container instance to test our new Docker image

  docker run -it bdparrish/docker-spark-tutorial

5. Run mvn verify to confirm the previous steps worked

  [INFO] ------------------------------------------------------------------------
  [INFO] ------------------------------------------------------------------------
  [INFO] Total time: 2.912s
  [INFO] Finished at: Sun Oct 04 00:58:42 UTC 2015
  [INFO] Final Memory: 7M/31M
  [INFO] ------------------------------------------------------------------------

6. Run git checkout add-pom-xml to jump to the corresponding content in the git repository

Step 3 – Add a Java Spark server

In this step we start writing some code. We need a main method to start our Java Spark server.

Our Maven project will end up with roughly the following structure:

  src
   |- main/java/tutorial
   |- test/java/tutorial

Starting point:

  git checkout add-pom-xml

1. First, add the junit dependency to our project. Add the following to the dependencies section of pom.xml.

  <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
  </dependency>

I am a faithful TDD supporter, so our first step is a test case verifying that we can create a Server class. Create a new package named tutorial in your source directory (src/main/java), and then create the same package in the test source directory (src/test/java).
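These directories can also be created from the command line before writing any Java; a quick sketch, run from the project root:

```shell
# Create the standard Maven main and test source trees,
# each containing the `tutorial` package directory.
mkdir -p src/main/java/tutorial src/main/resources
mkdir -p src/test/java/tutorial src/test/resources

# Show the layout recursively (we repeat this check inside the container later).
ls -R src
```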

2. Add a new test class ServerTest to the tutorial package in the test directory, and include the following imports at the top.

  import org.junit.Assert;
  import org.junit.Test;

3. Add the following code to the test class. It is a trivial test, but it gets us started. If you run it now, it will fail.

  @Test
  public void canCreateServer() {
      // arrange / act
      Server server = new Server();

      // assert
      Assert.assertNotNull(server);
  }

4. The previous test fails because the Server class does not exist yet. Let's create it in the tutorial package.

  public class Server {
      public static void main(String[] args) {
      }
  }
5. Re-run the test; it should now pass.

6. Now update our Dockerfile to include the new source code. Add the following to the Dockerfile:

  ADD src /tutorial/src
  RUN ["mvn", "package"]

7. Build the Docker image again and run it

  docker build -t bdparrish/docker-spark-tutorial .
  docker run -it bdparrish/docker-spark-tutorial

8. Check that our source code was copied into the right place. We should see all of our files in the corresponding directories.

  ls -R src

  root@51049cdab012:/tutorial# ls -R src
  main  test

  src/main:
  java  resources

  src/main/java:
  tutorial

  src/main/java/tutorial:

  src/main/resources:

  src/test:
  java  resources

  src/test/java:
  tutorial

  src/test/java/tutorial:

  src/test/resources:

9. When finished, switch to the next git branch

  git checkout add-java-spark-server

Step 4 – Add a Java Spark handler

Now we add a request handler to our code. The handler will serve all web requests to the root path ("/").

Starting point:

  git checkout add-java-spark-server

1. Add a test class for the request handler. It provides a basic test of the handler class.

  import org.junit.Assert;
  import org.junit.Test;

  public class MainControllerTest {
      @Test
      public void canCreateMainController() {
          // arrange / act
          MainController controller = new MainController();

          // assert
          Assert.assertNotNull(controller);
      }
  }
2. Add the MainController class. This class will handle requests to the root path ("/").

  import static spark.Spark.get;

  public class MainController {
  }
3. So far this class does not do anything, so let's add real request handling to the MainController class.
First, add a new dependency to pom.xml.

  <dependency>
      <groupId>org.jsoup</groupId>
      <artifactId>jsoup</artifactId>
      <version>1.7.2</version>
      <scope>test</scope>
  </dependency>

Add the test case to the MainControllerTest class.

  // Additional imports needed for this test:
  // import java.io.IOException;
  // import org.jsoup.Jsoup;
  // import org.jsoup.nodes.Document;
  // import spark.Spark;

  @Test
  public void canRequestRootPage() throws IOException {
      // arrange
      Server.main(null);
      Spark.awaitInitialization();

      // act
      Document doc = Jsoup.connect("http://localhost:4567/").get();

      // assert
      Assert.assertNotNull(doc);
      Assert.assertTrue(doc.toString().contains("Hello from MainController"));

      // clean up
      Spark.stop();
  }

Add the code to the MainController class

  public MainController() {
      get("/", (req, res) -> {
          return "Hello from MainController.";
      });
  }
4. At this point we have a controller to handle our request, and test cases confirming that it behaves as we expect. But when we start our server, it does not yet know about the MainController class. Let's fix that by modifying the Server class:

  public class Server {
      public static void main(String[] args) {
          new MainController();
      }
  }
5. Finally, let's finish the pom.xml file. Add the following after the </dependencies> element.

  <build>
      <plugins>
          <plugin>
              <groupId>org.apache.maven.plugins</groupId>
              <artifactId>maven-compiler-plugin</artifactId>
              <version>3.1</version>
              <configuration>
                  <source>1.8</source>
                  <target>1.8</target>
              </configuration>
          </plugin>
          <plugin>
              <artifactId>maven-assembly-plugin</artifactId>
              <executions>
                  <execution>
                      <id>bin</id>
                      <phase>package</phase>
                      <goals>
                          <goal>single</goal>
                      </goals>
                  </execution>
              </executions>
              <configuration>
                  <descriptorRefs>
                      <descriptorRef>jar-with-dependencies</descriptorRef>
                  </descriptorRefs>
                  <archive>
                      <manifest>
                          <mainClass>tutorial.Server</mainClass>
                      </manifest>
                  </archive>
              </configuration>
          </plugin>
      </plugins>
  </build>

6. Now you should be able to run mvn clean install and build this project successfully.
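One detail the Dockerfile fragments above never show is how the container actually starts the server when run. Since the assembly plugin produces a jar-with-dependencies whose main class is tutorial.Server, a plausible final Dockerfile line is the following sketch; the exact jar name is an assumption derived from the artifactId and version in pom.xml, so check your target directory for the real name:

```dockerfile
# Hypothetical: start the assembled fat jar when the container runs.
# Jar name assumed from <artifactId>docker-spark-tutorial</artifactId>
# and <version>1.0-SNAPSHOT</version>.
CMD ["java", "-jar", "target/docker-spark-tutorial-1.0-SNAPSHOT-jar-with-dependencies.jar"]
```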

7. Rebuild the Docker image; it will now include the application.

  docker build -t bdparrish/docker-spark-tutorial .

Run a container from this image. Note that we added the -p flag to expose the application's port.

  docker run -it -p 4567:4567 bdparrish/docker-spark-tutorial

In the log output you should see that Java Spark started its embedded Jetty server on port 4567.

8. To browse the root page of the web application, first obtain the container's IP address.

  docker inspect <container-id> | grep IPAddress

Then browse to http://<container-ip>:4567/

We should see a page showing "Hello from MainController".

9. Finally, the completed project for this part can be found on this git branch: git checkout add-java-spark-handler

Concluding remarks

To get everything from this first part, you can execute git checkout add-java-spark-handler. In the second part, we will add a DAO and a database to this prototype.

See also: Running Java Spark in Docker – Part 2

Original link: RUNNING JAVA SPARK IN DOCKER – PART 1 (translated by Qiu Gachuan)
