Docker + Pentaho BA Server CE

I am a little late to the party, but I finally got a chance to play around with Docker. The technology was originally built on top of LXC (Linux Containers), an operating-system-level virtualization environment that offers "near-native" performance for applications. Unlike a traditional virtual machine, it does not run a separate kernel, nor does it simulate computer hardware. But like a VM, it provides resource isolation, giving applications independent, portable, self-contained environments with minimal overhead.

So why is this technology so important? To name a few reasons:

  • It can provide standardized and repeatable deployment models for applications
  • It can simplify management of development, testing and production environments
  • It can automate application assembly, workflow, dependency management, etc.
  • It can help streamline auto scaling of applications

There are many other use cases for Docker. It is particularly interesting when building microservices infrastructure. However, for this example, let's look at a more traditional development workflow.

Example Dev/QA Use Case

Developers and QA often need to test a fresh installation of a server application. In this example, I am using the Pentaho BA Server 5.3 Community Edition.

A typical workflow involves starting from a vanilla server, installing package dependencies, downloading the latest product binaries, loading the default database, etc.

But we all know this can get very time-consuming when testing different server configurations: perhaps you need to test the application against different database servers, or against an older version of the Java Runtime.

This is where a tool like Docker can really help, by automating the tasks required to quickly set up a test environment.

Consider this sample Dockerfile:

FROM tutum/ubuntu:trusty

MAINTAINER Rowell Belen developer@bytekast.com

# Add a repo where Oracle JDK7 can be found.
RUN apt-get update
RUN apt-get install -y software-properties-common
RUN add-apt-repository -y ppa:webupd8team/java

# Auto-accept the Oracle JDK license
RUN echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections

RUN apt-get update
RUN apt-get install -y oracle-java7-installer

# Install supervisor
RUN apt-get update && apt-get install -y supervisor
RUN mkdir -p /var/log/supervisor

# Install useful command line utilities
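# wget and unzip are used by the download step further down; installing them
# explicitly is harmless if an earlier package already pulled them in
RUN apt-get -y install man vim sudo wget unzip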
RUN apt-get install -y bash-completion
RUN apt-get install -y libwebkitgtk-1.0-0 libxtst6

# Install networking tools
RUN apt-get -y install net-tools dnsutils

# Install postgres
RUN apt-get -y install postgresql-9.3

# Add pentaho user
RUN useradd --create-home -s /bin/bash -G sudo pentaho
RUN sed -i.orig 's/%sudo.*/%sudo ALL=(ALL:ALL) NOPASSWD:ALL/' /etc/sudoers
RUN cp -rvT /root /home/pentaho
RUN chown -Rv pentaho:pentaho /home/pentaho

# Setup Pentaho Environment
RUN echo export JAVA_HOME=/usr/lib/jvm/java-7-oracle >>/etc/bash.bashrc
ADD psqlfix.sql /root/
RUN /etc/init.d/postgresql start && \
	sudo -u postgres psql </root/psqlfix.sql && rm /root/psqlfix.sql
ADD psqlfix.sh /root/
RUN sh /root/psqlfix.sh && rm /root/psqlfix.sh

# Download and extract Pentaho BA Server package
# (download, unzip and cleanup share one RUN so the zip is not kept in an image layer)
WORKDIR /home/pentaho/
RUN wget http://downloads.sourceforge.net/project/pentaho/Business%20Intelligence%20Server/5.3/biserver-ce-5.3.0.0-213.zip && \
	unzip biserver-ce-5.3.0.0-213.zip -d biserver-ce-5.3.0.0-213 && \
	rm biserver-ce-5.3.0.0-213.zip

# Add/run script to load default tables
ADD loaddb.sh /home/pentaho/
RUN chmod +x /home/pentaho/loaddb.sh
RUN /etc/init.d/postgresql start && \
	printf 'password\n' | /home/pentaho/loaddb.sh

# Copy supervisor config
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf

# Add script to start biserver
ADD start.sh /home/pentaho/
RUN chmod +x /home/pentaho/start.sh

# Redirect Tomcat output
ENV CATALINA_OUT /dev/stdout

# Start Service
EXPOSE 22 5432 8080
CMD /home/pentaho/start.sh && /usr/bin/supervisord

The sample Dockerfile above automates the following tasks:

  • Pull down a base Linux server image
  • Install the Oracle JDK 7 and Supervisor
  • Install command-line and networking utilities
  • Install and set up a Postgres database
  • Download and extract the Pentaho BA Server
  • Load the default database tables
  • Run the Postgres database and BA Server on startup
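If you would rather build the image yourself than pull a prebuilt one, the usual route is docker build, run against the directory that holds the Dockerfile and the files it ADDs. Here is a minimal sketch of that step. The pentaho-ce-local tag and the build_image helper are made-up names for illustration, and the DOCKER variable is only there so the command can be previewed with DOCKER=echo before running it for real.

```shell
#!/bin/sh
# Sketch: build the image from a local directory containing the Dockerfile.
# IMAGE_TAG and build_image are hypothetical names, not part of the original setup.
DOCKER="${DOCKER:-docker}"                  # set DOCKER=echo to preview the command
IMAGE_TAG="${IMAGE_TAG:-pentaho-ce-local}"  # assumed local tag

build_image() {
  # $1: build context directory (defaults to the current directory)
  $DOCKER build -t "$IMAGE_TAG" "${1:-.}"
}
```

Calling build_image with no arguments from inside that directory builds the image under the local tag. Expect the first build to take a while, since it downloads the JDK and the BA Server archive.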

You can find the Docker Repository and instructions here: https://registry.hub.docker.com/u/bytekast/docker-pentaho-ce-5.3/

If you already have Docker installed, just run the command below to grab the image from the Docker Registry Hub:

docker pull bytekast/docker-pentaho-ce-5.3

To start the Linux container and launch the BA Server application, run:

docker run -d -p 2222:22 -p 8888:8080 -e AUTHORIZED_KEYS="$(cat ~/.ssh/id_rsa.pub)" bytekast/docker-pentaho-ce-5.3:latest
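Because the -p flags map host ports to container ports (2222 to SSH on 22, 8888 to Tomcat on 8080), several containers can run side by side as long as each one gets its own host ports. That is handy for the "test different configurations" scenario described earlier. A rough sketch, assuming a hypothetical run_instance helper that offsets the host ports by an instance number; PUBKEY_FILE and the DOCKER=echo preview switch are likewise illustrative and not part of the image:

```shell
#!/bin/sh
# Sketch: start one container per instance number, each on its own host ports.
# run_instance, PUBKEY_FILE and the DOCKER override are hypothetical conveniences.
DOCKER="${DOCKER:-docker}"                            # set DOCKER=echo to preview
IMAGE="bytekast/docker-pentaho-ce-5.3:latest"
PUBKEY_FILE="${PUBKEY_FILE:-$HOME/.ssh/id_rsa.pub}"   # public key passed to the container

run_instance() {
  # $1: instance number (0, 1, 2, ...) added to the base host ports
  n=${1:-0}
  ssh_port=$((2222 + n))
  web_port=$((8888 + n))
  $DOCKER run -d \
    -p "${ssh_port}:22" \
    -p "${web_port}:8080" \
    -e AUTHORIZED_KEYS="$(cat "$PUBKEY_FILE")" \
    "$IMAGE"
}
```

run_instance 0 reproduces the port mapping from the command above, while run_instance 1 would publish SSH on 2223 and the web UI on 8889.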

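Wait a few seconds and the server should be ready. To check the logs, run the command below, replacing $CONTAINER_ID with the container ID that docker run printed (docker ps lists it as well):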

docker logs $CONTAINER_ID
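How long the server actually takes to come up depends on the machine, so scripts are better off polling than sleeping for a fixed time. The sketch below is one way to do that in plain shell. wait_for is a hypothetical helper, not something shipped with Docker or Pentaho; it simply retries a command until it succeeds or the attempts run out.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it exits 0, at most ATTEMPTS times, sleeping DELAY
# seconds after each failure. Returns 0 on success, 1 if it never succeeded.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for 30 5 curl -fsS http://localhost:8888/pentaho/ would poll the web UI for roughly two and a half minutes before giving up. That URL assumes the 8888 port mapping from the run command and the default /pentaho context path, so adjust it if your setup differs.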

You can find the sample configuration on GitHub: https://github.com/bytekast/docker-pentaho-ce-5.3

As you can see, this lets us quickly and easily provision a fresh installation of Pentaho BA Server CE on demand.

Pretty cool, huh?

Rowell Belen
