This article will help you understand print filters and how to create and install your own personalized filters.
In the December 1999 issue of Linux Journal, Michael Hughes introduced a few basics for setting up printing on your Linux computer. This article will go into greater detail on getting your print installation to work for a variety of printers. The biggest problem is that most applications, especially the graphics-based ones that run under X, print using the PostScript printer language, and most of us do not have a PostScript printer.
I'm basically lazy—I want my computer to make things easier for me. What's a computer for if not to automate tasks and simplify my life? It didn't take me long to become frustrated with having to deal with printing plaintext files vs. PostScript files from Netscape or other applications. So I wrote a printer filter to take care of these decisions for me, and you can, too. The BSD print system that comes with your Linux distribution has the capability to pipe print files through a filter, which can be any executable program. Since the filters usually have a simple job to do, they are normally written as a bash or Perl script.
The printer daemon sends the file to be printed to the standard input of the filter, and the filter modifies the file as needed. The filter can then write the modified file to standard output, which the print system sends on to the printer. Your filter can make any kind of modification to the file, and you don't even have to write the modified file to standard output, as we shall see later.
Specific printer capabilities are defined in the /etc/printcap file, which is normally created when Linux is installed. Each printer or printer configuration that you use must be defined in this file. The printers can be attached locally, attached to another computer on your network or to network printers. You can set up multiple configurations for the same printer to handle different file types, and the print system can send output to other devices or save it to a file. Virtually anything that can accept a computer file is fair game for the output device.
Each printer configuration in your printcap file defines a print queue. You print something from the command line with the lpr command or through a print button or menu in various applications. UNIX and Linux applications use the lpr command to submit the actual print request to the line-printer daemon, lpd, which spawns a copy of itself to handle each request. lpd is the main program and runs continuously in the background to handle all requests from lpr, lpq (for querying print job and printer status) and lprm (to remove jobs from the print queue).
Listing 1 is a typical printcap entry that defines a printer attached to your parallel printer port. Comments in the file start with a hash (#) character. Each entry defines a printer queue and consists of a series of fields delimited with the colon (:) character. The first field of the entry supplies one or more names for the print queue, separated by the pipe (|) character. You should include “lp” as the name of one of your print queues, since it will be the default queue when you print without specifying a queue. The last sub-field of this first field should contain a description of this print queue and may be displayed by some print management software. I gave this queue an alias of “text” so that the print command could be either lpr filename or lpr -Ptext filename.
For readability, I usually place each field on a separate line, with the backslash escape character at the end of each line except the last one. But you don't have to. It is traditional to place extra colons at the beginning and end of each field when they are given on separate lines, as in this example, with all continuation lines indented by a tab. The extra colons are ignored by lpd when it reads this file.
The fields after the first one tell lpd the properties for this queue. Each of these subsequent fields begins with a keyword, followed by an equals sign (=) for text values or a hash sign (#) for numeric values, and then the value for the field. Some keywords are switches and do not have values. In Listing 1, the second field, “lp”, defines the device to which the output will be sent. In this case, /dev/lp should be a symbolic link to /dev/lp0 or /dev/lp1 for your printer port. Don't confuse this field name with the default print queue name in the first field. The order of the fields does not matter, except for the first field that names the queue.
The lf keyword specifies the full path to the log file, where errors from the print daemon concerning this queue can be recorded. When something goes wrong, the contents of this file may give you a clue to the problem, although many print-related error messages will be sent to your system log, /var/log/messages. The sb boolean keyword tells lpd to print a short one-line “banner”. Finally, sd, the spool directory keyword, lets you specify the directory path in which to place a temporary copy of the file being printed. You must specify a spool directory for each print queue. Other controlling information for the queue and each print job is also placed in this directory, so each queue in your printcap file needs to have its own spool directory.
These are the most common of the many available parameters. The printcap man page lists all the parameters that can be used to configure your print queue entries.
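An entry built from the fields described above might look like the following sketch. It is not the article's Listing 1: the description text and the spool and log paths are assumptions, so adjust them for your system. Continuation lines are indented with a tab.

```
# Local printer on the parallel port
lp|text|Plain text printer on the parallel port:\
	:lp=/dev/lp:\
	:lf=/var/spool/lpd/lp/log:\
	:sb:\
	:sd=/var/spool/lpd/lp:
```

With this entry in place, lpr filename and lpr -Ptext filename would both send a job to the same queue.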
If we want to modify the print file that is sent to a queue, or perform some other task before printing, we use the output filter printcap keyword, of. The filter program is usually kept in the spool directory for the queue which uses it, such as /var/spool/lpd/lp1. Listing 2 shows a printcap entry that uses a simple filter. The printer port is defined by the lp= field as usual. When we use the of=/var/spool/lpd/filter field, the print daemon pipes the original file through this filter before it is sent to the device set by the lp field. After the lf field, I'm using the sf and sh boolean fields to tell the print daemon to suppress form feeds and suppress the header page for this queue.
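A filtered queue of the kind just described could be written roughly as follows. This is a sketch rather than the article's Listing 2: the queue name lp1, its alias and the log path are assumptions, while the of=, sf and sh fields are the ones discussed above.

```
# Queue whose output is piped through a filter
lp1|filtered|Parallel port printer with an output filter:\
	:lp=/dev/lp:\
	:of=/var/spool/lpd/filter:\
	:lf=/var/spool/lpd/lp1/log:\
	:sf:\
	:sh:\
	:sd=/var/spool/lpd/lp1:
```

The filter named in the of= field must be executable by the user the print daemon runs as.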
So what can we do with our printer filter? Just about anything you can imagine. Listing 3 is a filter for printing plain text on HP printers. When Hewlett-Packard printers are not sent data in their native print language (such as PCL, HP-GL or PostScript), they expect MS-DOS line endings consisting of an ASCII carriage return and line feed (newline). If you print a UNIX-style text file, which uses only a newline to mark the end of a line, the second and subsequent lines of your file will not start at the left edge of the paper. Without the carriage return to tell the printer to move back to the left edge, you get a stairstep effect. There is no line wrap, so once the text moves off the right edge of the paper, all is lost to the proverbial bit bucket.
This filter uses the awk command to insert a carriage-return and newline at the end of each line of UNIX text. As I mentioned above, the BSD printer daemon pipes the file being printed to standard input of the filter. It then passes the standard output of the filter to the print device specified on the lp= field of the printcap entry.
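A minimal sketch of such a filter is shown here. It is not the article's Listing 3; the function name to_crlf is an assumption made for illustration, and the awk one-liner is only one of several ways to append a carriage return to each line.

```shell
#!/bin/bash
# Sketch of a text filter for printers that expect CR/LF line endings.

# Read UNIX text on standard input and write it to standard output
# with a carriage return and line feed at the end of every line.
to_crlf() {
    awk '{ printf "%s\r\n", $0 }'
}
```

To use this as an of= filter, the script would end with a bare call to to_crlf, so that the data lpd supplies on standard input is converted and passed on to the printer device.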
The above example works great to fix a common printing problem, but it is good for text files only. If you accidentally send a PostScript or, heaven forbid, a graphics file to this queue, you will be rewarded with lots of useless paper and a scornful system administrator.
Listing 4 is the filter we want to use to solve this problem. First, we specify a log file where we can write debugging information from the script. Next, we capture standard input to a file with the cat command. We can then use the file command to determine what kind of file it is. If it is a text file, we process it as we did in our simple filter above.
If the file is a PostScript file, we run Ghostscript to format it for our particular type of non-PostScript printer. In this case, I have specified a Color Deskjet, which works for my HP Deskjet 660C. Of course, if you should be so lucky as to have a PostScript printer, you would not need this filter at all.
The last branch of our filter lets us print TIFF graphic files. We simply insert the tiff2ps command in front of the Ghostscript command. In this way, you can easily add the ability to process other types of files without manually formatting them. You might want to use the full path names to the commands, since the print daemon does not run with the same PATH that a login shell does.
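The three branches described above can be sketched as follows. This is an illustration, not the article's Listing 4: the log path, the function names and the Ghostscript device cdj550 are assumptions, and the commands are called without full paths for brevity.

```shell
#!/bin/bash
# Sketch of a type-aware print filter.

LOG=${LOG:-/tmp/filter.log}        # hypothetical debug log
GS_DEVICE=${GS_DEVICE:-cdj550}     # assumed Ghostscript device for a color DeskJet

# Map a description string from file(1) to a short type name.
# PostScript is tested first because its description also contains "text".
classify() {
    case "$1" in
        PostScript*) echo postscript ;;
        TIFF*)       echo tiff ;;
        *text*)      echo text ;;
        *)           echo unknown ;;
    esac
}

# Render PostScript on standard input into printer data on standard output.
render_ps() {
    gs -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE="$GS_DEVICE" -sOutputFile=- -
}

# Capture standard input, then convert it according to its type.
smart_filter() {
    local tmp type
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    type=$(classify "$(file -b "$tmp")")
    echo "$(date): file type is $type" >> "$LOG"
    case "$type" in
        text)       awk '{ printf "%s\r\n", $0 }' "$tmp" ;;
        postscript) render_ps < "$tmp" ;;
        tiff)       tiff2ps "$tmp" | render_ps ;;
        *)          echo "$(date): unknown type, nothing printed" >> "$LOG" ;;
    esac
    rm -f "$tmp"
}
```

A script used as the of= filter would end with a bare call to smart_filter. Sending unknown file types to the log instead of the printer is a design choice of this sketch; it avoids the pile of useless paper mentioned earlier.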
I also wanted my smart print queue to be able to deal with several different printers. You see, I use a laptop for all my e-mail and my personal computing. When I am at home, I use an old Epson-compatible printer for simple print tasks. If I need higher quality output, and my kids let me unplug it from their computer, I connect to the HP Deskjet. When I am at work, I print to an HP Laserjet 4MV PostScript printer on the network.
The change in printers is handled easily by adding information about the type of printer to the filter. To set the type of printer that is available, I added a line in $HOME/.profile to ask me which printer I will be using. This is run once when I first log in after booting up my laptop. The printer type I enter is written to the file /tmp/printer. I use the names that Ghostscript recognizes.
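The login prompt could be implemented along these lines. This is only a sketch: the file /tmp/printer comes from the text above, but the function name choose_printer, the default device epson and the check that skips the prompt when the file already exists are assumptions, and bash is assumed as the login shell.

```shell
# Sketch for $HOME/.profile: record which printer is attached.

# Ask for a Ghostscript device name and store it in the given file
# (default /tmp/printer). Does nothing if the file already has content.
choose_printer() {
    local file=${1:-/tmp/printer} ptype
    [ -s "$file" ] && return 0
    printf 'Printer type (Ghostscript device name) [epson]: ' >&2
    read -r ptype
    echo "${ptype:-epson}" > "$file"
}
```

A call to choose_printer at the end of .profile would then prompt at the first login. Skipping the prompt on later logins relies on /tmp being cleared at boot, which depends on the distribution.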
Listing 5 is the filter that uses the printer type information. I added the variable PTYPE to hold the name of the printer I am currently using. The first-level “if” branch checks for the file type, as in the previous filter. Within these branches, a second-level “if” branch checks for the different printers.
The other change in the filter is that instead of writing the output to standard output, I pipe it to another lpr print command. On the lpr command, I specify a print queue for the attached or network printer. This bypasses the lp= line in printcap. For this to work, I created another printcap entry for a print queue called “raw”. The raw printcap entry outputs to the actual printer. To create a raw print queue, just take Listing 1, remove the lp default name, and change the queue name to raw. I use a different raw print queue for the network printer, since it is a PostScript printer. Each raw printer has its own print queue in the printcap file. The “raw” queue is for whichever printer is attached to the parallel port, and “4mvraw” is for the network printer when I'm at work.
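The two-level decision and the hand-off to a raw queue might be sketched like this. It is not the article's Listing 5: the queue names raw and 4mvraw and the file /tmp/printer come from the text, but the function names, the printer type name postscript for the network printer and the fallback device epson are assumptions.

```shell
#!/bin/bash
# Sketch of a printer-aware filter that prints through a raw queue.

# Choose the raw queue for a printer type.
queue_for() {
    case "$1" in
        postscript) echo 4mvraw ;;   # network PostScript printer
        *)          echo raw ;;      # whatever is on the parallel port
    esac
}

# Convert file $1 of type $2 for printer type $3, writing to standard output.
convert_for() {
    case "$2:$3" in
        text:*)                awk '{ printf "%s\r\n", $0 }' "$1" ;;
        postscript:postscript) cat "$1" ;;   # printer understands PostScript
        postscript:*)          gs -q -dSAFER -dNOPAUSE -dBATCH \
                                   -sDEVICE="$3" -sOutputFile=- "$1" ;;
        *)                     return 1 ;;
    esac
}

# Filter entry point: capture the job, convert it, pass it to the raw queue.
print_job() {
    local tmp ptype ftype
    ptype=$(cat /tmp/printer 2>/dev/null || echo epson)
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    case "$(file -b "$tmp")" in
        PostScript*) ftype=postscript ;;
        *text*)      ftype=text ;;
        *)           ftype=unknown ;;
    esac
    if [ "$ftype" != unknown ]; then
        convert_for "$tmp" "$ftype" "$ptype" | lpr -P"$(queue_for "$ptype")"
    fi
    rm -f "$tmp"
}
```

Because print_job sends the converted data to lpr itself, it writes nothing to standard output. The filter script would end with a bare call to print_job.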
Listing 6 is the printcap entry for the above filter and its corresponding raw print queue. One thing that is different in this entry compared to that of a normal printer queue is using /dev/null as the printer device. Since our filter will be handling the output (by printing the filtered output to the raw device), we don't need lpd to send it to any device.
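A pair of entries matching this description might look like the sketch below. It is not the article's Listing 6: the alias smart, the descriptions and all spool, log and filter paths are assumptions. The raw entry follows the recipe given earlier, which is the first example queue without the lp name.

```
# Smart queue: the filter passes its output to a raw queue itself
lp|smart|Smart queue that converts by file type:\
	:lp=/dev/null:\
	:of=/var/spool/lpd/smart/filter:\
	:lf=/var/spool/lpd/smart/log:\
	:sh:\
	:sd=/var/spool/lpd/smart:

# Raw queue: sends data unchanged to the parallel port
raw|Unfiltered parallel port printer:\
	:lp=/dev/lp:\
	:lf=/var/spool/lpd/raw/log:\
	:sb:\
	:sd=/var/spool/lpd/raw:
```

Each queue has its own spool directory, as required for any pair of queues in the same printcap file.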
If you have more than one printer available to you, you could use this filter technique to send any of the different types of files to different printers. And if you have a FAX modem, you could even add a “print to FAX” feature to your printing system.
After adding a new queue in /etc/printcap, you will need to tell the print daemon to reread the printcap file. Listing 7 is a script I wrote years ago to take care of the additional work necessary to activate your new printcap entries. You may want to verify the path names to the various directories and commands for your system. You must run it as superuser.
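As an illustration of the kind of work such a script has to do, here is a sketch that creates any missing spool directories and restarts the queues. It is not the article's Listing 7: the function names, the daemon:lp ownership and the 775 mode are assumptions that vary between distributions.

```shell
#!/bin/bash
# Sketch of a printcap activation helper; run it as root.

# Print the directory named by every sd= field in a printcap file,
# ignoring comment lines.
spool_dirs() {
    grep -v '^[[:space:]]*#' "$1" | tr ':' '\n' | sed -n 's/^[[:space:]]*sd=//p'
}

# Create each missing spool directory, then restart the print queues.
activate_printcap() {
    local printcap=${1:-/etc/printcap} dir
    spool_dirs "$printcap" | while read -r dir; do
        [ -d "$dir" ] && continue
        mkdir -p "$dir"
        chown daemon:lp "$dir"    # assumed owner and group
        chmod 775 "$dir"          # assumed mode
    done
    lpc restart all
}
```

Running activate_printcap with no argument would process /etc/printcap. Check the ownership of an existing spool directory on your system before relying on the values assumed here.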
What if you do not always have a printer attached or available, but you want to send something to the print queue for printing at a later time? Will the job wait until you attach the printer or connect to the network? Yes, it will if your raw printer is attached to your parallel port, because /dev/lp waits for handshake signals from the printer.
If jobs fail to print, you need to determine if the problem lies with the print system or your filter. The first place to look is in your system log file, /var/log/messages, then look in the log file for the queue. Next, carefully check the syntax in your printcap file. Make sure each line except the last one for each entry ends in a “\”. Turn on some debugging output to check the operation of your filter. If the problem is with the print system, you can try to stop and restart printing. You should have a script in your system startup area that does this properly. In the SuSE distribution, the startup and shutdown script is /sbin/init.d/lpd. As with any System V-style boot script, this may be manually run as /sbin/init.d/lpd start or /sbin/init.d/lpd stop.
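The backslash check can be partly automated. The following sketch reports lines that are followed by an indented continuation line but do not end in a backslash. The function name check_continuations is an assumption, and the check relies on continuation lines being indented, as in the layout described earlier.

```shell
#!/bin/bash
# Sketch of a printcap syntax check for missing trailing backslashes.

# Print a message for each non-comment line in file $1 that lacks a
# trailing backslash although the next entry line is indented.
check_continuations() {
    awk '
        /^[ \t]*(#|$)/ { next }     # skip comments and blank lines
        prev != "" && /^[ \t]/ && prev !~ /\\$/ {
            printf "line %d: missing trailing backslash\n", prevnr
        }
        { prev = $0; prevnr = NR }
    ' "$1"
}
```

Running check_continuations /etc/printcap prints nothing when every continued line is terminated correctly. It does not detect other mistakes, such as whitespace after a backslash.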
A print queue can get stuck when the spool directory's disk gets full. This can happen for a very large job or if your disk partition is too small. You will have to remove the job with lprm to free it up, then clean up unnecessary files on that disk partition or move your spool directories to a larger partition. After copying all the directories and files to a new location, you can remove all the old directories (rm -r lpd) and replace the lpd directory with a symbolic link to the new location where you have copied the old lpd directory tree (ln -s /path/to/new/location/lpd /var/spool/lpd).
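The copy, remove and link steps can be combined as in this sketch. The function name move_spool is an assumption; stop the print daemon before running it, run it as root and give the new location as an absolute path so that the symbolic link resolves correctly.

```shell
#!/bin/bash
# Sketch: relocate a spool tree and leave a symbolic link behind.

# Copy directory $1 to $2, remove the original and link $1 to $2.
# Each step runs only if the previous one succeeded.
move_spool() {
    local old=$1 new=$2
    mkdir -p "$new" &&
    cp -a "$old/." "$new/" &&    # copy contents, keeping owners and modes
    rm -r "$old" &&
    ln -s "$new" "$old"
}
```

For the example in the text, the call would be move_spool /var/spool/lpd /path/to/new/location/lpd.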
We discussed how BSD printing works and some of the common options in the printcap configuration file. We showed you how to write a filter for altering the file that is sent to the actual printer and how to write a filter that automatically processes different types of input files for printing. We also learned that a filter can redirect the output to different devices or files. Hopefully, this will give you enough information to understand and create your own “smart” printer queues and simplify your printing tasks.
Mark Plimley (markp@blueneptune.com) started home-computing in 1978 with an Imsai 8080 (8-bit, 2MHz Intel 8080). In his former life as a mechanical engineer, he programmed in FORTRAN. He has been employed mainly as a UNIX systems administrator since 1992 and has been using Linux since January, 1995. When not in front of a computer screen, he can be found tinkering around the house, doing some activity with his wife and two teenagers or helping out at church.
All listings referred to in this article are available by anonymous download in the file ftp.linuxjournal.com/pub/lj/listings/issue73/3741.tgz.