Simple ways to build web applications using servlets.
When I began to write server-side web applications, there were two mainstream choices: if you wanted the program to execute quickly, you chose C, and if you wanted to write the program quickly, you used Perl. C, as we all know, is great when the binary needs to be small, fast and efficient. But C's lack of automatic memory management and decent string handling, along with the extreme care that programmers need in order to use it, made it a poor second choice when compared with Perl.
But in the last few years, a number of programming languages have begun to challenge Perl for the server-side web programming mantle. In particular, Python has gained significant ground, thanks in no small part to the growth of the Zope web development environment.
But perhaps the largest groundswell of server-side programming is coming from the Java community. Java, as many of us might remember, began life as a client-side programming language. For the most part, applets are an unpleasant memory of what happens when you try to mix two client-side paradigms—a lesson that the increasingly common use of Flash seems to ignore.
The basic unit of server-side Java is the servlet, a small program that is executed in response to an HTTP request and that generates a legal HTTP response. Since servlets are written in Java, they are written as object classes, inherit from a servlet ancestor and can take advantage of Java's threading and exception handling. Moreover, servlets (like all Java programs) run inside of a Java virtual machine (JVM), an abstraction layer that can run on any operating platform. This means that the same servlets can run on nearly any operating system, providing even greater portability than CGI programs.
I have used servlets in a small number of projects so far, but this number is rising rapidly. Java is now the “in” language. This is partly because Sun has invested a large amount in its marketing, partly because it offers some technological and infrastructural advantages over its rivals and also because it poses a serious platform challenge to Windows. In addition, server-side Java is the cornerstone for an increasing number of Java-based application servers.
This month, we will begin to explore Java as a server-side programming language. As a beginning step, we will install the Jakarta-Tomcat environment for running servlets, as well as the associated Jasper environment for creating Java Server Pages (JSPs). In coming months, we will look at how to connect our servlets and JSPs to a relational database, as well as how to use Enterprise JavaBeans and the Enhydra application server for an even more powerful environment.
When I first started to work with Java on Linux a few years ago, the situation seemed fairly grim: while Linux was the best known, open-source operating system, and Sun was promoting Java as a universal programming language, it was difficult to impossible to get a good version of Java for Linux. Some volunteer porting efforts, particularly the one done by the Blackdown porters, were impressive, but installation was prone to problems and not nearly as stable as developers might have liked.
As I write this article in January 2001, the situation has changed dramatically: you can now download a Linux version of the latest Java development kit (JDK) directly from Sun's web site. Further, the Tomcat servlet/JSP system works just fine on Linux. As Linux picks up more steam, it is becoming an increasingly attractive platform on which to program in Java.
Because my main Linux box runs Red Hat 6.2, I downloaded the JDK 1.3 RPM from Sun's web site, http://java.sun.com/. In order to download the JDK, I had to sign up as a member of the “Java Developer Connection”; while I'm not thrilled by the notion of having to register in order to download software, it does not seem like a terrible price to pay. The RPM cannot be installed directly; first, you must agree to Sun's Java licensing agreement.
Once you have accepted the agreement, the RPM is unpacked and made available for install. You can then log in as root and install the JDK, which is placed in the /usr/java directory. By putting /usr/java/jdk1.3/bin/ in your PATH environment variable, you can execute the javac compiler and the java runtime environment without having to specify an explicit path.
Once you have installed the JDK, you should run at least one simple test to ensure that it works. Listing 1 contains a simple program that can be invoked without any parameters and prints “hello, world” to STDOUT. If the program is passed through any parameters, it prints those parameters separated by a pipe character (|).
To compile our test class (Test.java) into bytecodes (Test.class), use the Java compiler, javac:
javac Test.java
To run the program, we must invoke the Java runtime environment (java), giving it the name of our class (without the .class suffix):
java TestIf we don't pass any arguments, we get the following output:
Hello, worldWe can, however, pass arguments to our program:
java Test a b "q r s" 123In this case, we get the following output:
a|b|q r s|123In addition to setting your PATH correctly, you should set the environment variable JAVA_HOME to point to the location of the JDK. If you use bash, you can simply put the following line inside one of your startup files:
export JAVA_HOME=/usr/java/jdk1.3
Now that we have installed the JDK, we can install Jakarta-Tomcat. Jakarta is the overall name for Java-related projects sponsored by the Apache Software Foundation, and Tomcat is the ASF's project for servlets and JSPs. (JSPs, as we will see next month, are simply an easy way to create servlets.) Tomcat is meant to be the reference standard for servlets and JSPs on a variety of platforms, making it portable and easy to use Java in server-side web applications.
Unlike a CGI program, which executes within its own UNIX process, and unlike a mod_perl handler, which executes as a subroutine within Apache, servlets execute within a Java virtual machine. This JVM is known as a “servlet container” and can be the server itself (if the server is written in Java), embedded inside of the server or external to the server.
This article assumes that you will be using Apache, in which case the servlet container is external to the HTTP server. However, Tomcat is itself a full-fledged HTTP server, meaning that we can conduct some initial tests without having to configure Apache at all.
You can download and install the latest version of Tomcat from the Jakarta web site at http://jakarta.apache.org/. The Jakarta Project distributes software for a variety of platforms and on a number of schedules, in both source and binary formats. It may take a bit of looking, but you should be able to find a downloadable binary of the latest stable Tomcat release for Linux. As of this writing, the most recent stable version of Tomcat is 3.2.1, which I downloaded in the file jakarta-tomcat-3.2.1.tar.gz.
Once downloaded onto your computer, change to the directory where you want to install Tomcat, and open it up:
cd /usr/java tar -zxvf jakarta-tomcat-3.2.1.tar.gz
Your /usr/java directory will now contain two subdirectories, one named jdk1.3 and the other jakarta-tomcat-3.2.1.
Just as you had to set JAVA_HOME to indicate where your Java distribution is located, you must also set the TOMCAT_HOME variable to indicate where Tomcat was installed. Those using bash can add the following line to one of their startup files:
export TOMCAT_HOME=/usr/java/jakarta-tomcat-3.2.1
If you are planning to write your own servlets, you will also need to tell Java where to look for the servlet-related classes. These are located in a Java archive (.jar) file, $TOMCAT_HOME/lib/servlet.jar. If you use bash and don't otherwise set your CLASSPATH, you can set it as follows:
export CLASSPATH=$TOMCAT_HOME/lib/servlet.jar:.You don't need to modify CLASSPATH in this way if you are only planning to run servlets that other people have written. The runtime Java servlet engine knows where to look for the appropriate .jar files, and its CLASSPATH is set correctly when you install Tomcat.
Once you have performed all of these steps, Tomcat is ready to go. You can start it up using the shell script under $TOMCAT_HOME/bin:
$TOMCAT_HOME/bin/startup.sh
A number of diagnostic messages will appear on the screen. However, the main servlet.log log file is normally in $TOMCAT_HOME/logs.
You can check to see if Tomcat works by pointing your browser at port 8080 (the default) on the computer where it has been started. In other words, http://localhost:8080/ should give you a welcome message, indicating that “this is the Tomcat default home page” with some additional links to examples of servlets and JSPs installed on the system. The example servlets should execute correctly, providing you with a demonstration of some simple tasks that we can perform with Tomcat.
Servlet classes are normally installed under a directory named WEB-INF, underneath the directory named in the URL; that is, the example servlet RequestInfoExample, which comes with Tomcat, is available at http://localhost:8080/examples/servlet/RequestInfoExample.
The actual Java .class file (as well as the .java source file for that class) is stored in $TOMCAT_HOME/webapps/examples/WEB-INF/classes/RequestInfoExample.class.
We will soon see how to configure additional directories for servlets. However, we will always have to install our classes under the directory WEB-INF/classes, and the WEB-INF hierarchy will be hidden from public view.
Listing 2. HelloWorld.java, a simple applet that handles the GET request method.
We can test our Tomcat installation by placing a simple servlet, HelloWorld.java (see Listing 2), inside of the directory mentioned above, $TOMCAT_HOME/webapps/examples/WEB-INF/classes/.
Remember that Java requires filenames to match the class names. If you want to change the filename to ABC.java, you will have to change the class declaration inside of the source code to the same name. Otherwise, the Java compiler will complain with a fatal error.
To compile HelloWorld.java into an executable servlet, use the Java compiler, just as we would normally do:
javac HelloWorld.java
If your CLASSPATH environment variable was not set correctly, javac will complain that it cannot resolve the symbols HttpServletResponse, ServletException and a number of other classes. Rectify this by setting your CLASSPATH to include servlet.jar, as indicated above.
Once the servlet has been complied, you should be able to invoke it with http://localhost:8080/examples/servlet/RequestInfoExample.
If you attach a firstname parameter to the URL, the servlet should print your first name as well: http://localhost:8080/examples/servlet/RequestInfoExample?firstname=Reuven.
As you can see, this servlet is extremely simple. It imports a number of other useful Java packages, including the all-important javax.servlet.* and javax.servlet.http.* hierarchies. We then define our servlet as a subclass of HttpServlet. In doing so, we inherit all of the logic of HttpServlet.
Our HelloWorld servlet is particularly simple and includes a single method, doGet. doGet is invoked whenever the servlet is invoked with the GET method. HTTP supports a number of methods, but the most common are GET and POST; GET is typically used when a user directly requests a URL or clicks on a hyperlink, while POST is used when someone clicks on a “submit” button at the bottom of an HTML form. Because our servlet defines a doGet method but no doPost method, it can only handle GET requests.
The two arguments to doGet describe the HTTP request and response. If we want to retrieve information from the HTTP request, we use a method on our request object. For example, we can retrieve the value associated with the firstname parameter with the getParameter method:
String firstname = request.getParameter("firstname");
If no firstname parameter was passed in the request, the variable, “firstname”, will be assigned the null value. (This is distinct from the empty string, which indicates that the parameter was passed in the HTTP request but contained no value.)
We can similarly affect the HTTP response by invoking methods on our response object. For example, we can set the MIME type of the HTTP response with the setContentType method:
response.setContentType("text/html");
To send information to the user's browser, we use response.getWriter( ), which returns a PrintWriter:
PrintWriter out = response.getWriter();Assuming that we are sending content of type text/html, we can now use out.println to send HTML to the user's browser:
out.println("<HTML>"); out.println("<Head><Title>Hello, world</Title></Head>");
We could continue to use Tomcat as our main HTTP server. However, it is neither as fast nor as configurable as Apache or most other servers. For this reason, it's typical to use Apache for most HTTP requests and to forward servlet- or JSP-related requests to Tomcat.
In order for Apache to communicate with Tomcat, we must compile a module into our Apache server. The traditional way was with mod_jserv, based on a project called JServ. A new module, known as mod_jk, compiles into your Apache server similarly to mod_jserv but uses a more efficient and flexible protocol to communicate with Tomcat.
The easiest way to install mod_jk is to download the source code for Tomcat from the Jakarta web site. Even if you have downloaded the binary version of Tomcat for general use, you will need to retrieve the source code in order to compile and install mod_jk. After you unpack the source distribution, change to src/native/apache1.3. If you are using an early version of Apache 2.0, you should go to src/native/apache2.0 for the mod_jk source code.
The following instructions assume you have built your Apache server with the ability to handle DSO, dynamically loaded modules that were not originally compiled into the server. DSO is a wonderfully flexible mechanism for adding new modules; not all modules can always handle this flexibility, and you might find yourself compiling some or all of your Apache server statically. While mod_jk can certainly be compiled statically, the on-line documentation encourages users to install it as a DSO, both because it is easier to do so and because it means that you can update mod_jk without having to recompile Apache.
Compiling a module as a DSO means it must link against the Apache server using the names and addresses that were specified at the server's time of compilation. In order for us to compile modules with the same environment and information as the server, Apache provides apxs, a Perl program that ensures our modules are compiled correctly. apxs takes the same arguments as cc, as well as several of its own that allow us to install the module automatically.
From inside the apache1.3 directory, we can compile mod_jk in httpd.conf with the following command:
/usr/local/apache/bin/apxs -i -o mod_jk.so -I../jk \ -I$JAVA_HOME/include \ -I$JAVA_HOME/include/linux -c *.c ../jk/*.c
If you enter the above command in three lines, rather than one, remember to include the backslashes (\) as the final character on each of the first two lines. Also note that we did not include the -a option, which activates the module in the Apache configuration file, because (as we shall soon see) that is done from within another, automatically generated, configuration file.
Now that mod_jk is installed, we must get Tomcat and mod_jk to speak to each other. Normally, Tomcat expects to receive requests from Apache with the Ajpv12 protocol. However, mod_jk and Tomcat both understand the Ajpv13 protocol, which is more advanced in a number of ways. We will thus need to modify our Tomcat configuration so that it supports Ajpv13, and then configure Apache to use that protocol to communicate with Tomcat.
Tomcat has two configuration files, one for the HTTP server (web.xml) and one for the Java servlet container (server.xml) files. Even if you have never worked with XML, you should not have to worry very much; not only is XML easy to learn, but the Tomcat configuration files are heavily commented. Both configuration files are in the $TOMCAT_HOME/conf directory.
In order to tell Tomcat to use Ajpv13, we must find the section of server.xml that defines the Ajp12 connector. The section normally looks like this when you install Tomcat:
<ConnectorclassName="org.apache.tomcat.service. <Parameter name="handler" value="org.apache.tomcat.service.connector. <Parameter name="port" value="8007"/> </Connector>
As you can see, this defines the TCP/IP connector handler and indicates that Tomcat should use the Ajp12ConnectionHandler object in the org.apache.tomcat.service.connector package. We will add a similar block immediately following the one shown above:
<Connector className="org.apache.tomcat.service. <Parameter name="handler" value="org.apache.tomcat.service.connector. <Parameter name="port" value="8009"/> </Connector>Aside from changing the name of the handler object, you can see that we have modified the port number to be 8009.
The mod_jk instructions make it clear that we should add the new Ajp13 handler to server.xml, leaving the Ajp12 handler in place. Otherwise, there may be problems when you try to shut down Tomcat.
Shut down Tomcat with $TOMCAT_HOME/bin/shutdown.sh, and start it up again with $TOMCAT_HOME/bin/startup.sh, just to make sure that the configuration changes did not break anything. If all is fine, you should see messages indicating that the HttpConnectionHandler is running on port 8080, the Ajp12ConnectionHandler is running on port 8007 and the Ajp13ConnectionHandler is running on port 8009.
The easiest and fastest way to tell Apache how to connect to Tomcat is to use mod_jk.conf-auto, a file that Tomcat generates each time it is restarted. This file, which is located in $TOMCAT_HOME/conf, contains all of the Apache directives necessary to load and use Tomcat. You simply need to include this set of definitions from within your Apache configuration:
Include /usr/java/jakarta-tomcat-3.2.1/conf/mod_jk.conf-auto
mod_jk.conf-auto is not only useful and automatic, it also provides a good sense of how to configure mod_jk and how to create sophisticated interactions between Apache and Tomcat. One thing to remember when working with Apache and Tomcat is that Apache must always start up after Tomcat is running so it can connect to the appropriate socket.
To demonstrate how easy it is to write servlets, we will create a simple web application—a blog-creation tool. Blogs, or “web logs”, are increasingly popular web diaries in which the newest entries traditionally appear at the top. The first web log was Dave Winer's Scripting News (http://www.scripting.com/), but there are many thousands of web logs that provide useful news and commentary on a variety of topics.
We will use servlets to create a very simple web log. The actual log entries will be stored in a PostgreSQL database, which we can define as follows:
CREATE TABLE BlogEntries ( entry_id SERIAL NOT NULL PRIMARY KEY, entry_date DATETIME NOT NULL CHECK entry_headline TEXT NOT NULL CHECK entry_text TEXT NOT NULL CHECK UNIQUE(entry_date, entry_headline) );
Since we're going to be retrieving data by date and headline, we create an index on each of two columns:
CREATE INDEX headline_date_index ON BlogEntries CREATE INDEX entry_headline_index ON BlogEntries (entry_headline);Now that we have created our database table and indices, we will need to create two servlets: one servlet will receive input from an HTML form and use that input to insert a new row into the BlogEntries table. (Presumably, this servlet will only be available to the owner of the site, who is the editor of the web log.) The second servlet will retrieve all web log entries from the last three days, displaying them in the traditional last-in-first-printed order.
The servlet for adding a new web log entry, AddBlogEntry [see Listing 3 at ftp://ftp.linuxjournal.com/pub/lj/listings/issue84/], expects to receive two parameters from an HTML form. The first parameter (entry_headline) contains the headline, while the second (entry_text) contains the text associated with it.
The servlet in Listing 3 defines an instance variable con which contains the JDBC database connection. The servlet also defines three methods:
init, which is before the servlet is first executed. In init, we make an initial connection to the database, keeping the connection around for future use.
doGet, which prints an error message indicating that only POST requests will be honored by this servlet.
doPost, which uses the database connection established by init to INSERT a new row into the BlogEntries table.
Modifying a servlet is different from modifying a CGI program in that the servlet container must reload the servlet from disk. Apache and mod_perl do not reload Perl modules by default; so too does Tomcat ignore modified servlets by default. You can change this behavior by setting the “reloadble” attribute to “true”; if you fail to do this, you will need to restart Tomcat each time you modify and recompile a servlet. Of course, there is a performance penalty when servlets are reloadable, which is why the Tomcat documentation suggests keeping them nonreloadable in production systems.
Our doPost method is the real workhorse in this servlet, taking input from the user's HTML form and inserting them into our table in PostgreSQL.
First we make sure that we have received the entry_headline and entry_text parameters from the user and the parameters aren't empty. If one or more is empty, then we create a message that indicates what was missing. Otherwise, we go ahead and create a PreparedStatement for inserting a new row into the database.
Perl programmers will see many similarities between JDBC and Perl's DBI. JDBC requires that we create a statement based on the database connection:
PreparedStatement statement = con.prepareStatement( "INSERT INTO BlogEntries " + " (entry_date, entry_headline, entry_text) " + " VALUES " + " (NOW(), ?, ?)" );
Since we are using a PreparedStatement rather than a simple statement, we can use question marks (?) instead of variable values. The drivers for some databases, such as Oracle, take advantage of these placeholders and use them for greater speed. But even users of low-end databases can benefit from using placeholders because they ensure strings will be quoted correctly, even if they contain quotation marks or apostrophes:
statement.setString(1, entry_headline); statement.setString(2, entry_text);Notice how the first placeholder is numbered 1, rather than 0. Keep in mind that these two values are strings; if they were integers or floating-point numbers, we would have to use a different method on statement.
Next, we perform the actual insert:
int updateCount = statement.executeUpdate();
updateCount is assigned the number of rows that were affected by the executeUpdate() method. In this particular case, we were trying to insert a single row, so we compare updateCount with 1. If we were to use executeUpdate() to perform an SQL “UPDATE”, updateCount might contain a different number.
Finally, we catch exceptions that might have occurred during our use of SQL. We then print an error message, including the text of the exception. While printing such explicit messages to the end user might not be a good idea on a production web site, it is an excellent idea during development.
Now that we have seen how a servlet can be used to enter information into our web log, we will write another servlet to display the latest contents. This servlet will be relatively simple; it will take no parameters and will display the latest contents of the web log (see Listing 4 at ftp://ftp.linuxjournal.com/pub/lj/listings/issue84/).
Our ShowBlog servlet will only have two methods, init (which is identical to the “init” method from AddBlogEntry) and doGet. doGet will retrieve all of the entries in a web log, from the newest to the oldest. It displays each entry as a three-column row in an HTML table, showing the date and time at which it was added, the headline and the text associated with that headline.
Of course, a real web log will do things in a slightly more intelligent way, limiting the number of remarks and arranging them with a better sense of design. But that's easy enough to do once we have retrieved the information from the database in the correct order.
We create our query (inside of a “synchronized” block) and wrap it into a Statement. Notice how we need not use a PreparedStatement because we are not planning to instantiate any variable values into the statement.
We retrieve the results from the query into a ResultSet:
ResultSet rs = statement.executeQuery(query);
A ResultSet allows us to pull results out of the database one row at a time. We can iterate through each row inside of a while loop using the rs.next( ) method. Within each iteration, we can retrieve a column as a String value using the rs.getString( ) method, passing the name of the column as a parameter.
After compiling this servlet and placing it on my system, I was able to add some new web log entries and display them within a matter of minutes.
Servlets are the Java world's equivalent to the Perl world's modules for mod_perl. In many ways, they are actually better as they provide a great deal of power without endangering the web server with potentially risky programs. This month, we saw some simple ways to build web applications using servlets and open-source tools that we can download from the Web. Next month, we will continue our exploration of server-side Java by looking at some simple uses for Java Server Pages, also known as JSPs.