Entries from January 2008 ↓

Principle #2 - Environments should be consistent (Part 2)

In my previous post, I discussed the second of The Five Principles of Software Deployment, namely that Environments should be consistent. In that post, I looked at how developers could achieve consistency in their build environments. In this post, I’ll take a look at achieving consistency in other environments.

As I mentioned in the previous post, an environment includes all the components needed to build and run your application.

As an application is built, tested, and deployed, it generally passes through several environments. These environments can include:

  1. Developer Test environment - This is the place where a developer builds and tests an application. Typically this is an environment running on the developer’s workstation and is not shared by other team members.
  2. System Integration Test (SIT) environment - This is a more formal test environment for developers. This environment is used to test an application before it is released for others to use. On a large project, this environment is often integrated with environments from other projects.
  3. User Acceptance Test (UAT) environment - This environment allows end users to test an application in an environment that closely resembles the production environment. Often snapshots of production data are used to populate this environment to simulate actual usage.
  4. Production environment - This is the environment where the application is ultimately used. In some cases, this environment can span multiple machines for load balancing and redundancy.
  5. Disaster recovery (DR) environment - This is an environment that is ready to use if the production environment becomes unavailable.

Larger projects may contain additional environments, while smaller projects may contain only some of these. At a minimum, there should be one environment between the developer’s build environment and the ultimate production environment. Unfortunately, as I wrote about earlier, this is often not the case.

The goal is to have consistency between environments. The closer an environment is to the final production environment, the more important this consistency is. Achieving this consistency is not easy, as it involves resources, both human and financial, which are often in short supply on a project.

Here are some things that can help in achieving consistency throughout your deployment environments.

  1. Document your application dependencies - This was already mentioned in the previous post as a tip for achieving consistency in build environments, but it also applies to deployment environments. Armed with a list of the various third-party components (OS, databases, services, libraries) that an application needs, you can more easily set up a new environment or new machine to run the application.
  2. Package your applications for deployment - Creating complete packages for your application is an important method of achieving consistency across your environments. These packages can contain any libraries and frameworks needed to run the application. By including them along with the application, you are ensuring that the same version is used wherever the application runs. These packages can also include various scripts to install, upgrade, start, and stop your application. These scripts allow for greater control over the deploying and running of an application. It can also be very useful to package third-party applications to achieve the same level of control.
  3. Build your applications to run independently - It is often desirable or necessary to run multiple environments on a single machine. You need to consider this when building your applications. Some things to check for:
    • Does your application open any network sockets? If so, you need to make port numbers configurable to allow multiple instances of your application to run on the same machine.
    • Does your application use fixed paths? If these paths are absolute, there will be problems (unless you use chroot under UNIX). It is better to have your paths be relative to the application or rely on an environment variable.
    • Does your application rely on a well-known directory like ‘/tmp’ on UNIX boxes? This can often cause subtle problems. In Java applications, you might need to change the Java property, java.io.tmpdir, to avoid problems.
  4. Consider virtualization - The use of virtualization tools like VMware, Microsoft Virtual Server, and others can help in a number of ways. First, it allows you to run multiple independent environments on the same server. Second, it allows you to create a standard virtual machine and run this standard configuration on multiple servers. There are other benefits to virtualization which I’ll cover in a future post, when I discuss my own uses of virtualization.
  5. Track environment changes - Once you have an environment set up, it is very important to track any changes to that environment, and propagate these changes to other environments. For example, if you move to a newer version of Java on your production servers, you need to upgrade Java on any servers used for disaster recovery (DR). A good change management process is important.
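
As a rough illustration of tip 3, here is a minimal launcher sketch that lets several instances of an application share one machine. Everything named here is an assumption made for the example: the APP_HOME and APP_PORT variables, the app.jar file, and the app.port property are hypothetical, and the application itself would have to read that property. Only java.io.tmpdir is a standard Java system property.

```shell
#!/bin/sh
# Hypothetical launcher sketch. APP_HOME, APP_PORT, app.jar and the
# app.port property are illustrative names, not part of any real product.

# Resolve paths relative to an environment variable instead of hard-coding them.
APP_HOME="${APP_HOME:-$(pwd)}"

# Let each instance choose its own port; 8080 is only a default.
APP_PORT="${APP_PORT:-8080}"

# Give each instance a private temp directory instead of sharing /tmp.
APP_TMP="$APP_HOME/tmp"
mkdir -p "$APP_TMP"

# Assemble the command line. -Djava.io.tmpdir is a standard Java system
# property; -Dapp.port is a made-up property the application would read.
JAVA_CMD="java -Djava.io.tmpdir=$APP_TMP -Dapp.port=$APP_PORT -jar $APP_HOME/app.jar"

# Print the command rather than running it, so the sketch is safe to try.
echo "$JAVA_CMD"
```

Running the script twice with different APP_HOME and APP_PORT values would give two command lines that do not collide on ports or temporary files. The command string is not safe for paths that contain spaces, so a real launcher would invoke java directly with quoted arguments.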

In my next post, I’ll continue with the third of the Five Principles of Software Deployment, which is Packages should be autonomous. In the meantime, if you want to share your experiences with achieving consistent environments, feel free to post a comment here.

Principle #2 - Environments should be consistent (Part 1)

This post continues the series The Five Principles of Software Deployment. It covers Principle #2, Environments should be consistent.

An environment includes all the components needed to build and run your application. As an example, take a typical database-driven Java web application, which relies on the following components on the server side:

  • Operating system (e.g. Solaris)
  • System libraries (e.g. OpenSSL)
  • Service providers (e.g. Apache and Oracle)
  • Runtime environments (e.g. JDK or JRE)
  • Application containers (e.g. Tomcat)
  • Application frameworks and libraries (e.g. Struts, Spring)

One of the keys to successful deployment is keeping components like these consistent throughout the environments used to build and run your application.

This post discusses consistency in build environments. The next post will look at maintaining consistency across deployment environments.

Having a consistent build environment means two things. First, a build environment should use the same components used to run the application in production. Second, a build environment needs to be consistent between developers.

In an ideal world, each developer would use the same versions of every component involved in building and running an application. Looking at the list above, this means running the same versions of everything from the underlying operating system to the application libraries.

This is not always possible or practical. For example, a developer may use a Windows-based workstation for development, while deploying to a Linux or UNIX-based environment. In this case, consistency can still be maintained further up the technology stack, by running the same version of the JDK, application server, and application libraries.

The following are some practical tips for achieving consistent build environments:

  1. Document your application dependencies - The first step to consistency is knowing exactly what is needed to build and run your application. This may seem obvious, but good project documentation is often overlooked. This documentation is especially important for new members that join the team.
  2. Version control as many components as possible - This tip is related to the advice presented in previous blog entries about achieving repeatable builds. At a minimum, check in application frameworks and libraries like Struts and Spring and update your build scripts when the versions of these components change. This will ensure that all developers are building with the same libraries.
  3. Provide a repository for component downloads - For those components that are not checked into version control, it is useful to have a shared drive or web server where team members can retrieve standard component versions. For example, this repository could contain the standard version of Eclipse used by the team.
  4. Develop a strategy for running multiple versions of tools - It is sometimes necessary to switch versions of tools during a project. One example is moving to a newer version of Java - say from 1.5 to 1.6. To make this change cleanly, it is best to have both versions of the component available to allow new development to occur with the new version, while keeping the old version available for running and debugging an older version of the application. Some components, like the Java JDK, allow side-by-side installation, while others are not as easy to set up this way. Our BundleWorks product is designed to allow multiple versions of components to run side-by-side.
  5. Build open-source components from source - The thought of building a component from source may make some people cringe, and although it can sometimes be painful, there are benefits to doing this. The first is that you know exactly what version of a component you are running across all environments, since you are not relying on the version provided by the OS vendor. An example of this is the Apache web server. Different operating system versions include different versions of Apache. Building from source gives you a known, consistent version. Second, building from source allows you to install multiple versions of a component side-by-side. This is done by passing a --prefix argument to the configure script when you build the component (e.g. --prefix=/opt/apache/2.2.6). If you decide to do this, check the original source and the build script (with configure options) into your source code repository.
  6. Provide common scripts for setting up a build environment - This tip applies more to UNIX-based operating systems than to Windows. When a user logs into a UNIX-based machine, startup scripts are run based on the user’s choice of shell. In a team environment, these startup scripts should rely on a common script to set up the build environment. This common script should set the standard versions of any tools needed (compilers, runtime environments, etc). Any group scripts should also be checked into version control, as they can change over time.
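
To make tip 6 a little more concrete, here is a minimal sketch of what such a common script might look like. The file name buildenv.sh, the /opt/tools layout, and the version numbers are assumptions chosen for illustration. Each developer’s shell startup file would source it with a line such as ". /path/to/buildenv.sh".

```shell
# buildenv.sh - hypothetical shared script, meant to be sourced from each
# developer's shell startup file. The /opt/tools layout and the version
# numbers below are assumptions for illustration only.

TOOLS_ROOT="${TOOLS_ROOT:-/opt/tools}"

# Pin the standard tool versions for the whole team in one place.
JAVA_HOME="$TOOLS_ROOT/jdk/1.6.0"
ANT_HOME="$TOOLS_ROOT/ant/1.7.0"
export JAVA_HOME ANT_HOME

# Put the pinned tools ahead of anything installed by the OS.
PATH="$JAVA_HOME/bin:$ANT_HOME/bin:$PATH"
export PATH
```

Because the versions live in one file under version control, moving the team to a new JDK becomes a one-line change that everyone picks up at their next login.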

I’d like to make one final point about consistent build environments. This point involves Integrated Development Environments, or IDEs. I’ve used a number of IDEs over the course of my career and have found them to be helpful. Developers have differing opinions about which IDE, if any, is best. Deciding on a standard IDE for a team is not as important as standardizing versions of other build components. As long as the underlying components are consistent, a developer using vi and Ant can achieve the same end result as a developer using the latest and greatest version of IDEA.

Technology and Car Doors

There are days when working with technology feels like getting your fingers caught in a closing car door. Yesterday was one of those days.

The goal seemed simple: re-compile PHP on a 64-bit CentOS 5.1 Linux host to add support for IMAP. This blog post shows exactly how simple it was[n’t].

As a rule, I like to compile my development tools from source. This allows me to standardize on a particular version of the tool and not be tied to the version available in the Linux distribution. Additionally, it lets me run multiple versions of a tool side-by-side (with the help of our BundleWorks product). For system-level libraries, I prefer to install them using the packaging system provided with the OS (apt-get, yum/rpm, etc.).
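
For readers who have not built tools this way, the general pattern can be sketched as follows. This is not the exact script I use. The NAME and VERSION values, the /opt layout, and the DRY_RUN switch are assumptions for illustration, and the sketch only prints the commands unless DRY_RUN is set to 0 inside an unpacked source tree.

```shell
#!/bin/sh
# Hypothetical build wrapper. NAME, VERSION, the /opt layout and the
# DRY_RUN switch are illustrative assumptions.
NAME=apache
VERSION=2.2.6

# A version-specific prefix keeps each build in its own directory tree,
# so several versions can be installed side-by-side.
PREFIX="/opt/$NAME/$VERSION"

# With DRY_RUN=1 (the default here) commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run ./configure --prefix="$PREFIX"
run make
run make install
```

Checking a wrapper like this into version control alongside the source tarball records exactly which configure options produced a given installation.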

Anyway, I had already built Apache, MySQL, and PHP from source, when I discovered that I needed IMAP support in PHP. According to the PHP website, I needed to install the c-client IMAP library first.

Since I was running CentOS, I used

yum install libc-client-devel libc-client

to install the c-client library and the necessary headers. I then added --with-imap to the list of configure options.

This resulted in the following configure error:

This c-client library is built with Kerberos support.
Add --with-kerberos to your configure line. Check config.log for details

Fair enough. I added --with-kerberos to the configure command line.

This resulted in a new error:

error: Kerberos libraries not found.
Check the path given to --with-kerberos (if no path is given, searches in /usr/kerberos, /usr/local and /usr )

I checked my system, and found that the Kerberos libraries were installed in /usr/lib64. So I passed --with-kerberos=/usr/lib64 to the configure script, but the script still reported that the Kerberos libraries could not be found.

So I “googled the error message”, like any good programmer would do. After trying a couple of suggestions without any luck, I decided to do something which has helped me diagnose compile errors in the past:

sh -x ./configure ...configure options...

This provides debugging output for the configure script, by using the -x switch for the sh command (the configure script is a shell script, after all). It also provides more useful information than you would normally find in the config.log file.
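
Since the trace can run to many pages, it may help to save it to a file and search it. The sketch below shows one way to do that. It uses a tiny stand-in configure script, invented purely so the example can run anywhere, and the file names configure.out and configure.trace are my own choices. With a real source tree you would skip the stand-in and run the last two commands in the source directory.

```shell
#!/bin/sh
# Stand-in for a real configure script so the sketch runs anywhere.
# Its contents are invented for illustration and do not come from PHP.
WORK=$(mktemp -d)
cat > "$WORK/configure" <<'EOF'
#!/bin/sh
libdir=lib
for dir in /usr/kerberos /usr/local /usr; do
    if test -d "$dir/$libdir"; then
        echo "checking $dir/$libdir"
    fi
done
EOF

# sh -x writes each command to stderr as it is executed.
# Redirect stderr to a file so the trace can be searched afterwards.
(cd "$WORK" && sh -x ./configure > configure.out 2> configure.trace)

# Show the trace lines that mention the library directory being probed.
grep -n 'lib' "$WORK/configure.trace"
```

Searching the trace for a directory or variable name is usually quicker than scrolling through the terminal output.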

From the pages of output that filled my terminal, I found that the configure script was appending “lib” to the --with-kerberos path that I provided, so it was looking inside a non-existent “/usr/lib64/lib” directory. However, I found that I could change “lib” to “lib64” by passing --with-libdir=lib64 to the configure script. Victory seemed imminent.

It was not imminent enough. The configure script now complained that it couldn’t find the MySQL client libraries in the location where I had previously installed them. It was using this libdir value of “lib64” to look inside the MySQL installation, and the libraries instead were in a directory named “lib”. Exasperated, I ended up creating a softlink named “lib64” inside the MySQL installation to point to the “lib” directory.
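
The softlink workaround looks roughly like the sketch below. A temporary directory stands in for the MySQL installation here, and MYSQL_HOME is a name chosen for the example. On a real system it would be the prefix that MySQL was installed under, and the mkdir step would not be needed.

```shell
#!/bin/sh
# A temp directory stands in for the real MySQL install prefix.
# MYSQL_HOME is an illustrative name.
MYSQL_HOME=$(mktemp -d)
mkdir "$MYSQL_HOME/lib"

# Create lib64 as a relative symlink to lib, so that lookups under
# $MYSQL_HOME/lib64 resolve to the existing libraries.
ln -s lib "$MYSQL_HOME/lib64"

# Confirm what the link points to.
ls -ld "$MYSQL_HOME/lib64"
```

A relative link target keeps working if the whole installation directory is later moved or copied.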

The car door finally closed without catching my fingers. The configure and compile then ran successfully.