Quixote: a Python-Centric Web Application Framework

Greg Ward

Issue #0, linuxjournal.com

If you need to create dynamic web sites and don't want to learn the syntax and arbitrary limitations of yet another templating language, you should give Quixote a serious look.

Quixote is a web application framework for Python programmers. It was primarily developed by Andrew Kuchling, Neil Schemenauer and myself (Greg Ward) at the MEMS Exchange, in order to make our real job—the creation of a web-driven network of semiconductor fabrication sites—easier. For the development of our main web site (www.mems-exchange.org), we needed to concentrate on the complex business logic needed for such a network and draw a clear line between the backend and the user interface. We also wanted to use Python as much as possible, because in our opinion it is the most appropriate language for such a complex and rapidly changing application domain.

Quixote requires Python 2.0 or greater, a good understanding of Python and a web server that implements the CGI protocol. (Although your applications will be much happier using a mechanism, such as FastCGI or SCGI, that allows long-running web processes.)

Intended Audience

Quixote was written by and for Python programmers who need to develop dynamic web sites while using as much of their existing Python knowledge as possible. In particular, Quixote is not very accommodating of the commonly made distinction between “web designers” and “web developers”. If the web designers at your organization are keen to try out a real programming language, then Quixote might provide them with a good introduction to Python; but anyone who doesn't understand what “import a module” or “call a function” means isn't going to get very far with Quixote. Similarly, anyone who expects to use a dedicated, WYSIWYG HTML editor for creating web pages will be left out.

This, incidentally, is completely opposite to the stance taken by most other web application frameworks, which is precisely why we don't like most other web application frameworks. In our limited experience, they all invent an HTML templating language that embeds some sort of programming language in HTML, often with deliberate limitations to prevent naive users from shooting themselves in the feet. This usually ends up being painful and frustrating for programmers who want power and flexibility and are perfectly capable of aiming the gun away from their own feet.

Specifically, Quixote's templating language, PTL (Python Template Language), inverts the usual model by making it easier for Python code to generate long text strings such as HTML documents, rather than by embedding Python code in an HTML-like template language. We'll cover PTL in more detail later.

Quixote might be the tool for you if:

  • you need to develop dynamic web sites with complex programming needs, either in the backend or for presentation/user interfaces;

  • you're more concerned with providing good content and getting the logic behind the site right than you are with fancy design tricks;

  • you don't want to learn (and fight with) yet another HTML-templating language; and/or

  • you want to use everything you already know about Python (modules, packages, functions, classes and so forth) to develop web sites.

Using Quixote

Quixote is built on four core principles:

  • Publishing function results: Quixote's main job is using a URL to look up a Python callable (e.g., a function or method) and put its results on the web.

  • The URL is part of the user interface, and the organization of source code and URL-space should roughly correspond.

  • Embedding HTML in Python is cleaner and easier than embedding Python in HTML.

  • No magic: when Quixote can't figure out what to do, it refuses to guess the programmer's intent, preferring to raise an exception instead.

The usual way to develop a Quixote application is to write a set of classes that implement the fundamental logic of your system—variously referred to as domain classes, domain objects, business logic and so forth. Your domain classes ideally should have nothing to do with the type of user interface you're going to implement. Then you implement the web interface as a separate set of PTL modules. The relationship between these two bodies of code should be entirely one-way: the web interface will import and rely heavily on the domain classes, but the domain classes will be completely ignorant of the web interface.
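To make that one-way dependency concrete, here is a minimal sketch written for a current Python 3 interpreter. The Bug class and the format_bug_row() function are hypothetical stand-ins invented for illustration, not SPLAT!'s actual code; in a real application the two halves would live in separate modules, with only the web module importing the other.

```python
from html import escape


# --- domain layer: knows nothing about HTML or the web ---
class Bug:
    """Hypothetical domain object; not SPLAT!'s real bug class."""

    def __init__(self, bug_id, description, resolved=False):
        self.bug_id = bug_id
        self.description = description
        self.resolved = resolved


# --- web layer: depends on the domain layer, never the reverse ---
def format_bug_row(bug):
    """Render one bug as an HTML table row, escaping user-supplied text."""
    return "<tr><td>%04d</td><td>%s</td></tr>" % (
        bug.bug_id, escape(bug.description))


row = format_bug_row(Bug(134, "crash when title contains <b>"))
```

Because Bug never mentions HTML, the same class could sit behind a command-line tool or a test suite without change.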

As a real-world example, consider SPLAT!, a simple bug-tracking tool I wrote as a sample Quixote application (and also because we needed a simple bug tracker). SPLAT! (named for the sound of a bug being squashed) consists of a Python package, splat, with a sub-package called splat.web. The domain classes, SPLAT!'s idea of what a bug is, what a user is, how its bugs are stored, are in modules named things like splat.bug, splat.user, splat.database and so on.

The web interface to SPLAT! is implemented in the splat.web package, with the following modules:

splat.web.util      (splat/web/util.ptl)
splat.web.index     (splat/web/index.ptl)
splat.web.bug_ui    (splat/web/bug_ui.ptl)
splat.web.prefs     (splat/web/prefs.ptl)

From Python to PTL

Let's take a look at the code in the splat.web package. Every namespace (package or module) that Quixote uses must supply two special identifiers: _q_index() and _q_exports. We'll get to _q_exports momentarily. For now, let's concentrate on _q_index(); for the splat.web package, it is supplied through an import in splat/web/__init__.py:

from splat.web.index import _q_index

This fits with the recommended practice of putting as little code as possible in __init__.py files. Any functions or classes that need to be supplied by a package itself, as opposed to a module in that package, should simply be imported in the package's __init__.py.

Quixote's _q_index() is equivalent to index.html, but instead of a file containing the default web page for a directory, _q_index() is a Python callable (e.g., a function, a method or a PTL template) that returns the default web page for a namespace. In fact, there are many useful analogies between a traditional filesystem-based web server (such as Apache) and Quixote's Python-centric way of building a web site:

Filesystem (e.g., Apache)    Quixote
-------------------------    -------
directory                    Python namespace (module, package, ...)
file                         Python callable object (function, method, ...)
index.html                   _q_index()
file exists, is readable     callable object exists, is in _q_exports

Let's consider a simple _q_index() for SPLAT!, written as a Python function:

from quixote.html import html_quote
from splat.web.util import get_bug_database
def _q_index (request):
    result = ["""\
        <html>
        <head><title>SPLAT! Bug Index</title></head>
        <body>
        <table>
          <tr>
            <th>bug id</th>
            <th>description</th>
          </tr>
        """]
    bug_db = get_bug_database()
    for bug in bug_db.get_all_bugs():
        if bug.status != bug.ST_RESOLVED:
            result.append("""\
                <tr>
                  <td>%s</td>
                  <td>%s</td>
                </tr>
                """ % (bug, html_quote(bug.description)))
    result.append("""\
        </table>
        </body>
        </html>
        """)
    return "".join(result)

We build up the web page as a list of strings, which are concatenated at the end. (This is much more efficient than repeated string concatenation. In fact, a loop of result += ... probably qualifies as an antipattern in Python because of its quadratic running time.)
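Stripped of the bug database, the idiom looks something like the sketch below. The render_rows() function and its sample data are made up for illustration; the point is only that each piece is appended to a list and the list is joined exactly once.

```python
def render_rows(descriptions):
    """Build an HTML fragment by collecting pieces in a list and joining once."""
    parts = ["<table>\n"]
    for text in descriptions:
        parts.append("  <tr><td>%s</td></tr>\n" % text)
    parts.append("</table>\n")
    # A single join at the end copies each piece one time, instead of
    # recopying the growing result on every iteration.
    return "".join(parts)


page = render_rows(["first bug", "second bug"])
```

Some interpreters optimize simple cases of += on strings, so the measured difference may be smaller than the worst case suggests, but the list-and-join form does not depend on that.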

For this simple example, writing _q_index() as a Python function isn't too inconvenient, but there is a better way: PTL. PTL is simply Python with a different way of specifying function return values. In fact, the above function is quite easy to rewrite as a PTL template:

template _q_index (request):
    """\
    <html>
    <head><title>SPLAT! Bug Index</title></head>
    <body>
    <table>
      <tr>
        <th>bug id</th>
        <th>description</th>
      </tr>
    """
    bug_db = get_bug_database()
    for bug in bug_db.get_all_bugs():
        if bug.status != bug.ST_RESOLVED:
            """\
              <tr>
                <td>%s</td>
                <td>%s</td>
              </tr>
            """ % (bug, html_quote(bug.description))
    """\
    </table>
    </body>
    </html>
    """

At this stage, the differences are unremarkable: instead of explicitly accumulating and returning the HTML document, the PTL version does so implicitly. PTL works by accumulating the result of every statement that runs in a template; each non-None result is stored in an instance of TemplateIO (a class provided by Quixote). When the template returns, it actually returns str() of the TemplateIO object. This converts all of the accumulated statement results to strings (again, with str()) and returns the concatenation of those strings.
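The accumulation described above can be approximated in plain Python. This is only a sketch of the idea: SketchTemplateIO is a hypothetical stand-in written for this example, not Quixote's TemplateIO class, and index_by_hand() spells out by hand what a template would do implicitly.

```python
class SketchTemplateIO:
    """Hypothetical stand-in for the accumulator described above."""

    def __init__(self):
        self.parts = []

    def __iadd__(self, value):
        # Results that are None are dropped; everything else is kept.
        if value is not None:
            self.parts.append(value)
        return self

    def __str__(self):
        # Convert every accumulated result with str() and concatenate.
        return "".join(str(part) for part in self.parts)


def index_by_hand(bug_ids):
    """Roughly what a template does implicitly, written out explicitly."""
    out = SketchTemplateIO()
    out += "<ul>\n"
    for bug_id in bug_ids:
        out += "<li>bug %s</li>\n" % bug_id
    out += None  # a statement with no result contributes nothing
    out += "</ul>\n"
    return str(out)


html_text = index_by_hand([1, 2])
```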

PTL starts to get fun when you realize that you can refactor PTL templates just like you can Python functions. For example, we might break up our _q_index() template as follows:

template header (title):
    """\
    <html>
    <head><title>SPLAT! - %s</title></head>
    <body>
    """ % html_quote(title)
template footer ():
    """\
    </body>
    </html>
    """
template bug_row (bug):
    """\
      <tr>
        <td>%s</td>
        <td>%s</td>
      </tr>
    """ % (bug, html_quote(bug.description))
template _q_index (request):
    header("Bug Index")
    """\
    <table>
      <tr>
        <th>bug id</th>
        <th>description</th>
      </tr>
    """
    bug_db = get_bug_database()
    for bug in bug_db.get_all_bugs():
        if bug.status != bug.ST_RESOLVED:
            bug_row(bug)
    "</table>\n"
    footer()

Now we have reusable header() and footer() templates, and we have simplified the main loop of _q_index(). Any programmer recognizes the value of procedural abstraction; most web-templating languages, unfortunately, do not.

Writing a _q_index() function for our root namespace means Quixote can generate a response when a user requests the base URL corresponding to this application. For example, you might set things up so that /bugs/ is the base URL for SPLAT! at your site, i.e., /bugs/ corresponds to the splat.web package. When your server receives a request for /bugs/, Quixote will call splat.web._q_index() (which, thanks to that import in splat/web/__init__.py, is really splat.web.index._q_index()) and return the resulting HTML page. More generally, as long as you can implement something in Python (or PTL), you can use Quixote to associate it with a URL and put it on the web.

Beyond _q_index()

As the table above implies, Quixote can publish the results of any Python function or PTL template, as long as you list them in _q_exports. For instance, I might want to borrow a convention from GUI programming and add an “About” page to SPLAT!. The obvious place to put this in URL-space is /bugs/about, so I need to add a callable object about() to the splat.web package. One way to do this (although not necessarily a recommended practice) is to define a Python function in splat/web/__init__.py:

import splat  # for __version__
from splat.web.util import header, footer
[...]
def about (request):
    text = '''\
        <p>This bug database is brought to you by:</p>
        <p align="center"><font size="+3">SPLAT! %s</font></p>
        <p>For more information, please visit the
        <a href="http://www.mems-exchange.org/software/splat/">
        SPLAT! web page</a>.</p>
        ''' % splat.__version__
    return header("About") + text + footer()

This demonstrates how cleanly Python and PTL code mesh. I can import the header() and footer() templates shown above (which, incidentally, actually live in splat.web.util) and call them just like Python functions.

The about() function doesn't actually work yet, though. It would be dangerous for Quixote to trust that any random Python function that happens to be named by a URL should be published on the Web. Thus, you must explicitly declare that about() is meant to be exported to the world by listing it in the _q_exports list for this namespace, which also lives in splat/web/__init__.py:

_q_exports = ['about']
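The export check can be pictured with a few lines of plain Python. This is a sketch of the idea only, not Quixote's code: lookup_exported() is a hypothetical helper, and the throwaway module built with types.ModuleType merely stands in for a package such as splat.web.

```python
import types


def lookup_exported(namespace, name):
    """Return namespace.<name> only if the namespace explicitly exports it."""
    exports = getattr(namespace, "_q_exports", [])
    if name not in exports:
        # Refuse to guess: names that are not listed are treated as not found.
        raise LookupError("%r is not exported" % name)
    return getattr(namespace, name)


# A throwaway module standing in for a package such as splat.web.
web = types.ModuleType("web")
web.about = lambda request: "About page"
web.secret = lambda request: "internal helper"
web._q_exports = ["about"]

page = lookup_exported(web, "about")(None)

try:
    lookup_exported(web, "secret")
    blocked = False
except LookupError:
    blocked = True
```

Here secret() exists in the namespace but is never reachable, because it is not named in _q_exports.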

You can get pretty far with Quixote just writing Python functions and PTL templates, and having Quixote publish their results via the Web. However, making the URL part of the user interface means that the obvious way for SPLAT! to publish individual bugs is via URLs like /bugs/0001, /bugs/0134, etc. Quixote has a nice hook that lets you handle arbitrary URLs like this.

Object Publishing with _q_getname()

Object publishing is just a fancy term (shamelessly stolen from Zope) to mean that you can use Quixote to wrap a web interface around arbitrary objects. As always with Quixote, the trick is to map a URL onto Python code. But now, instead of providing a Python function that is named to match a URL component, you provide a Python function, _q_getname(), that Quixote uses as a fallback. For example, if Quixote is processing a request for the URL /bugs/0124, it's not going to find a function called 0124 in the splat.web package. Before giving up and raising an exception (which turns into an HTTP 404 error), Quixote looks for a function _q_getname() in that package. If it finds one, Quixote calls your _q_getname(), passing it the string “0124”, the URL component currently being examined.

Don't think of _q_getname() as being like _q_index() or about(). Quixote only calls functions like these at the end of URL traversal: e.g., when processing the URL /bugs/about, the bugs component corresponds to a Python namespace, splat.web, so Quixote doesn't have to call anything. Only when it's looking at the terminal component, about, does it call a Python function—the splat.web.about() function defined above. Likewise, in processing the URL /bugs/, Quixote only calls _q_index() because the terminal component of the URL (the part after the last slash) is the empty string.
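The rules in the last two paragraphs (an empty terminal component means _q_index(), and _q_getname() is a fallback for names found nowhere else) can be sketched as a toy traversal loop. Everything here is an assumption made for illustration: traverse(), NotFound and the app module are hypothetical, paths are taken relative to the application's base URL, and Quixote's real Publisher is considerably more involved.

```python
import types


class NotFound(Exception):
    """Hypothetical stand-in for a traversal failure (an HTTP 404)."""


def traverse(root, path, request=None):
    """Resolve a path such as '/', '/about' or '/0134' from a root namespace."""
    obj = root
    for name in path.lstrip("/").split("/"):
        if name == "":
            # Empty component: the default page for this namespace.
            found = getattr(obj, "_q_index", None)
        elif name in getattr(obj, "_q_exports", []):
            found = getattr(obj, name, None)
        elif hasattr(obj, "_q_getname"):
            # Fallback hook: let the application interpret the component.
            found = obj._q_getname(request, name)
        else:
            found = None
        if found is None:
            raise NotFound(path)
        obj = found
    # A callable at the end of the path is called; a string is the page itself.
    return obj(request) if callable(obj) else obj


def _q_getname(request, name):
    if not name.isdigit():
        raise NotFound(name)
    return "page for bug %d" % int(name)


app = types.ModuleType("app")
app._q_exports = ["about"]
app._q_index = lambda request: "index page"
app.about = lambda request: "about page"
app._q_getname = _q_getname

results = [traverse(app, p) for p in ("/", "/about", "/0134")]
```

Note that _q_getname() is consulted only after the exported names have been tried, which is what makes it a fallback rather than a replacement for ordinary lookup.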

However, _q_getname() can be called anywhere along a URL. For instance, SPLAT! actually implements per-bug URLs as namespaces (e.g., /bugs/0134/ calls a _q_index() method, /bugs/0134/edit calls an edit() method, etc.). In this case, the bug ID is not the terminal component of the URL, but Quixote handles it the same way, via _q_getname(). For this article, though, the bug ID will be the terminal URL component, and we're only going to deal with URLs like /bugs/0134. The easiest way to do this is to write a _q_getname() function that returns the HTML page for the requested bug. Again, assume this code lives in splat/web/__init__.py:

from quixote.errors import TraversalError
from splat.web.util import get_bug_database
[...]
def _q_getname (request, name):
    try:
        bug_id = int(name)
    except ValueError:
        raise TraversalError("invalid bug ID: %r (not an integer)" % name)
    bug_db = get_bug_database()
    bug = bug_db.get_bug(bug_id)
    if bug is None:
        raise TraversalError("no such bug: %r" % bug_id)
    return header("Bug %s" % bug) + format_bug_page(bug) + footer()

(I'm omitting the implementation of format_bug_page().) Most of this function is concerned with taking arbitrary user input (in the form of a URL component) and either fetching a bug object from the bug database or raising the appropriate exception. (Quixote exceptions generally correspond to HTTP error codes; TraversalError becomes a 404 “not found” error. The only time applications need to raise TraversalError is inside _q_getname() functions, because all other URL interpretation is handled by Quixote internally.)
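One way to picture the correspondence between exceptions and HTTP error codes is to let each exception class carry its own status. The sketch below uses hypothetical classes (SketchPublishError, SketchTraversalError) and a hypothetical respond() wrapper rather than Quixote's own; the 400 default on the base class is an assumption of this example.

```python
class SketchPublishError(Exception):
    """Hypothetical base class: each error carries an HTTP status code."""
    status_code = 400  # assumed default for this sketch
    title = "Bad request"


class SketchTraversalError(SketchPublishError):
    status_code = 404
    title = "Not found"


def respond(handler, *args):
    """Call a handler and translate any publishing error into (status, body)."""
    try:
        return 200, handler(*args)
    except SketchPublishError as exc:
        return exc.status_code, "%s: %s" % (exc.title, exc)


def get_bug_page(name):
    """Turn a URL component into a page, or raise a traversal error."""
    try:
        bug_id = int(name)
    except ValueError:
        raise SketchTraversalError(
            "invalid bug ID: %r (not an integer)" % name)
    return "Bug %d" % bug_id


ok = respond(get_bug_page, "0134")
missing = respond(get_bug_page, "oops")
```

Application code only raises the exception; turning it into a status line and an error page stays in one place.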

Using _q_getname() to publish a namespace for an object rather than a single page is even more fun, but beyond the scope of this article. Now that we've got a good feel for programming with Quixote, let's take a look at the bureaucracy necessary to get from your web server to Quixote application code.

Connecting Your Web Server to Your Quixote Application

Every Quixote application needs a bit of glue to connect the web server to the application; the nature of this glue depends on the nature of the connection. The simplest way to connect a web server to a Quixote application is CGI, in which case you need to supply a CGI driver script for your application. The CGI driver script for SPLAT! (which, incidentally, also works with FastCGI) looks something like this:

1. #!/usr/bin/python
2.
3. from quixote import enable_ptl, Publisher
4. from splat.config import OptionParser, get_config
5.
6. enable_ptl()
7. config = get_config()
8. config.read_file("/www/conf/splat.conf")
9. pub = Publisher("splat.web", config=config)
10. pub.setup_logs()
11. pub.publish_cgi()

The call to enable_ptl() in line 6 installs an import hook that makes Python's import statement treat PTL modules the same as Python modules. It only has to be done once in each Python process, so the driver script is the obvious place to do it.

Lines 7 and 8 create a standard SPLAT! configuration object and customize it by reading a local configuration file. (In reality, it's more complex than this because SPLAT! has several auxiliary command-line scripts that need to read the same config file.) Most Quixote applications will want to do something like this in order to customize Quixote's behaviour. In particular, Quixote's default settings prefer security and performance over ease of debugging, so for developing new applications, it's useful to override them by reading a local config file. The demo provided with Quixote has a simple example of doing this.

Line 9 is where we finally establish that the web interface for SPLAT! is implemented by the splat.web package. Every Quixote application is centered around an instance of the Publisher class, which is where all of Quixote's URL interpretation is done. Because this object needs to know the root namespace of your application, it is passed to the Publisher constructor as shown.

Every Quixote application can have up to three log files: the error log, the debug log and the access log. The names of these log files are specified via the configuration object passed to Publisher's constructor (they are a good thing to put in your local configuration file), but you need to call setup_logs(), as shown in line 10, to make sure the log files are actually opened and written to. Every HTTP request processed by Quixote is logged in the access log; every string written to sys.stdout is written to the debug log; and every string written to sys.stderr goes to the error log. This means that instrumenting a Quixote application for debugging is as simple as adding print statements.
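The effect of routing sys.stdout to a log can be imitated with the standard library. The sketch below is not how Quixote does it; it uses contextlib.redirect_stdout, and run_with_debug_log() and show_bug() are hypothetical names for this example. An in-memory buffer stands in for the debug log file.

```python
import contextlib
import io


def run_with_debug_log(handler, debug_log):
    """Run handler() while sending anything it prints to debug_log."""
    with contextlib.redirect_stdout(debug_log):
        return handler()


def show_bug():
    print("looking up bug 134")  # lands in the debug log, not the response
    return "<p>Bug 134</p>"


debug_log = io.StringIO()  # a real application would use an open log file
body = run_with_debug_log(show_bug, debug_log)
```

The handler's return value is still the page body; only its incidental output is diverted.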

Finally, in line 11, we pass control over to Quixote. If this driver script is used as a CGI script, then this whole process will repeat for every HTTP request; if it is handled as a FastCGI script, then the publish_cgi() method will process requests as long as the web server keeps the script running.

At this point, you can install the driver script wherever your web server is configured to look for CGI scripts. For example, with the standard Debian Apache package, you would put it in /usr/lib/cgi-bin. Now you can access SPLAT! via a URL like /cgi-bin/splat.cgi/, which will work but is rather ugly and exposes a lot of implementation details. If you use Apache with the rewrite engine enabled, it's trivial to add a rule that rewrites /bugs/ to /cgi-bin/splat.cgi/, so end users never have to see that ugly, over-informative URL. See doc/web-server.txt in the Quixote source distribution for more information.

Availability (and More Sample Code)

Quixote is available from www.mems-exchange.org/software/quixote. You can download the latest source distribution (the current version as of this writing is 0.5), browse the documentation, join the quixote-users@mems-exchange.org mailing list or browse the mailing list archive.

Installation instructions can be found on the web site and also are included in the source distribution, in doc/INSTALL.

The Quixote demo, included in the source distribution, is a much simpler example application than SPLAT!. The documentation for the Quixote demo goes over the code in great detail, explaining most of Quixote's important principles along the way.

You'll also find documentation for some interesting Quixote features I haven't covered here, notably Quixote's session management interface and its HTML form/widget library. Session management lets you maintain server-side information about individual users of your site via a session cookie, which has all sorts of useful applications for dynamic web sites. Quixote's form/widget library makes constructing and processing complex web forms (still the only portable, reliable way to interact with users over the web) much easier. Like the rest of Quixote, it wraps an object-oriented Python interface around a common web programming task.

Conclusion

Quixote originally was written because we were dissatisfied with the available options for writing web applications in Python. The only tool that came close to what we wanted was Zope, which turned out to be much bigger and more complex than we needed. Zope has the “web designer” vs. “web developer” distinction built in from the start, and works very hard to make a web site mostly editable through the web itself. This is an interesting idea, but it adds tremendous complexity to Zope. As programmers who are quite happy using text editors and the filesystem, we felt left out in the cold. Thus, in creating Quixote, we shamelessly stole Zope's best idea (mapping URLs to Python objects) and geared the whole thing towards Python programmers. The most obvious example of this is that where Zope maps URLs to arbitrary objects in an object database, Quixote maps them to Python packages, modules and functions—objects that are easily created and manipulated by Python programmers using nothing more than a text editor. The result is a web application framework that makes the creation of dynamic web pages so easy it almost feels like cheating.

email: gward@python.net