Embedding Linux in a Commercial Product

Joel R. Williams

Issue #66, October 1999

A look at embedded systems and what it takes to build one.

Most Linux systems run on PC platforms; however, Linux can also be a reliable workhorse for embedded systems. This article gives an overview of embedded systems and demonstrates what is involved in using Linux in a commercial embedded system.

Embedded Systems—Older than Moses

The computers used to control equipment, or embedded systems, have been around for almost as long as computers themselves.

In communications, they were used back in the late 1960s to control electro-mechanical telephone switches and were called “Stored Program Control” systems. The word “computer” was not as ubiquitous back then, and the stored program referred to the memory that held the program and routing information. Storing this logic, instead of hard-wiring it into the hardware, was a real breakthrough concept. Today, we take it for granted that this is the way things work.

These computers were custom-designed for each application. By today's standards, they look like a collection of mutant deviants, with strange special-purpose instructions and I/O devices that were integrated with the main computing engine.

The microprocessor changed that by providing a small, low-cost, CPU engine that could be used as a building block in a larger system. It imposed a rigid hardware architecture based on peripherals connected by a bus and provided a general purpose programming model, which simplified programming.

Software also advanced along with the hardware. Initially, only simple program development tools were available for creating and testing software. The runtime software for each project was usually written entirely from scratch. This was almost always written in assembly language or macro languages, because compilers were often buggy and lacked decent debuggers. The idea of software building blocks and standardized libraries did not come into vogue until the mid 1970s.

Off-the-shelf operating systems for embedded systems began to appear in the late 1970s. Many of these were written in assembly language, and could be used only on the microprocessor for which they were written. When the microprocessor became obsolete, so did its operating system, unless it was rewritten to run on a newer microprocessor. Today, many of these early systems are just a faint memory; does anyone remember MTOS? When the C language came along, operating systems could be written in an efficient, stable and portable manner. This had instant appeal to management, because it held the hope of preserving the software investment when the current microprocessor became obsolete. This sounded like a good story in a marketing pitch. Operating systems written in C became the norm and remain so today. In general, reusability of software has taken hold and is doing rather nicely.

My favorite OS in the early 1980s was the Wendon operating system. For about $150, you received a library of C source code. It was a kit, and you built your own operating system by choosing components—kind of like ordering dinner from a Chinese menu. For example, you could pick a task scheduler algorithm and a memory management scheme from a list of possibilities in the library.

A number of commercial operating systems for embedded systems sprang to life in the 1980s. This primordial stew has evolved to the present-day stew of commercial operating systems. Today, there are a few dozen viable commercial operating systems from which to choose. A few big players have emerged, such as VxWorks, pSOS, Nucleus and Windows CE.

Many embedded systems do not have any operating system at all, just a control loop. This may be sufficient for very simple ones; however, as systems grow in complexity, an operating system becomes essential or the software grows unreasonably complex. Sadly, there are some horribly complex embedded systems that are complex only because the designers insisted they did not need an operating system.
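
As a minimal sketch of what "just a control loop" means, the C fragment below polls an input and drives an output on every pass, with no scheduler and no tasks. The thermostat logic and every name in it (board_io, control_step, control_loop) are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical hardware hooks; a real system would read and write
 * device registers here. */
struct board_io {
    int  (*read_temperature)(void);   /* degrees C */
    void (*set_heater)(bool on);
};

/* One pass of the loop: on/off control with a 1-degree dead band. */
bool control_step(int temperature, int setpoint, bool heater_on)
{
    if (temperature < setpoint - 1)
        return true;             /* too cold: heater on */
    if (temperature > setpoint + 1)
        return false;            /* too warm: heater off */
    return heater_on;            /* inside the dead band: leave as is */
}

/* The entire "operating system": poll, decide, act, repeat. */
void control_loop(const struct board_io *io, int setpoint)
{
    bool heater_on = false;

    for (;;) {
        heater_on = control_step(io->read_temperature(), setpoint, heater_on);
        io->set_heater(heater_on);
    }
}
```

Every additional duty has to be threaded into this one loop by hand, which is where the complexity starts to pile up as the system grows.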

Increasingly, more embedded systems need to be connected to some sort of network, and hence, require a networking stack. Even the doorknob in many hotels has an embedded microprocessor connected to a network.

For simple embedded systems that are just coded in a loop, adding a network stack may raise the complexity level to the point that an operating system is desirable.

In addition to a variety of commercial operating systems, there is an amazing number of proprietary operating systems. Many of these are created from scratch, such as Cisco's IOS; others are derived from some other operating system. For example, many network products are derived from the same version of the Berkeley UNIX system, because it has complete networking capability. Others are based on public domain operating systems such as KA9Q from Phil Karn.

Linux as an embedded OS is a new candidate with some attractive advantages. It is portable to many CPUs and hardware platforms, stable, scalable over a wide range of capabilities and easy to use for development.

Tools—Breaking the ICE Barrier

A key element in developing embedded systems is the set of available tools. Like any craft or profession, good tools help to get the job done faster and better. At different stages of development, different tools may be required.

Traditionally, the first tool used to develop embedded systems was the in-circuit emulator (ICE). This is a relatively expensive piece of equipment that typically hacks into the circuitry between the microprocessor and its bus, which allows the user to monitor and control all activity in and out of the microprocessor. These can be difficult to set up, and because of their invasive nature, can provoke erratic performance. However, they give a very clear picture of what is happening at the bus level, and eliminate a lot of guesswork at the very lowest level of the hardware/software interface.

In the past, some projects relied on this as the primary debugging tool, often through all stages of development. However, once the initial software works well enough to support a serial port, most debugging can be done without an ICE using other methods. Also, most newer embedded systems use a fairly cookbook microprocessor design. Often, corresponding working startup code is available that can be used to get the serial port working in short order. This means that one can often get along quite nicely without an ICE. Eliminating the ICE stage lowers the cost of development. Once the serial port is up, it can be used to support several layers of increasingly sophisticated development tools.

Linux is built with the GNU C compiler, which, as part of the GNU tool chain, works with the gdb source-level debugger. This provides all the software tools you need to develop an embedded Linux system. Here is a typical sequence of debug tools used to bring up a new embedded Linux system on new hardware.

  1. Write or port startup code. (We will talk more about this later.)

  2. Write code to print a string on the serial port, e.g., “Hello World”. (Actually, I prefer “Watson, come here I need you”, the first words spoken over a telephone.)

  3. Port the gdb target code to work over the serial port. This talks to another Linux “host” system which is running the gdb program. You simply tell gdb to debug the program via the serial port. It talks over the serial port to the gdb target code on your test computer, giving you full C source-level debugging. You may also want to use this same capability to download the additional code into RAM or flash memory.

  4. Use gdb to get the rest of the hardware and software initialization code to work, to the point where the Linux kernel starts up.

  5. Once the Linux kernel starts, the serial port becomes the Linux console port and can be used for subsequent development. Use kgdb, the kernel debug version of gdb. Often, this step is not required. If you have a network connection, such as 10BaseT, you will probably want to get it working next.

  6. Once you have a fully functional Linux kernel running on your target hardware, you can debug your application processes. Use either gdb or a graphical overlay on gdb such as xgdb.
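
For step 3, the target-side code has to speak gdb's remote serial protocol over the serial port. As a small illustration, the sketch below frames one packet in that protocol: a '$', the payload, a '#', and a two-digit hex checksum equal to the sum of the payload bytes modulo 256. The function name gdb_frame_packet and the buffer handling are assumptions made for this example. A complete stub also needs packet reception, acknowledgements, escaping of special characters and the register and memory commands.

```c
#include <stddef.h>
#include <string.h>

/* Frame `payload` as $<payload>#<checksum> into `out`.
 * Returns the framed length, or 0 if `out` is too small.
 * Assumes the payload contains no characters that need escaping. */
size_t gdb_frame_packet(const char *payload, char *out, size_t out_size)
{
    static const char hex[] = "0123456789abcdef";
    size_t len = strlen(payload);
    unsigned char sum = 0;
    size_t i;

    if (out_size < len + 5)      /* '$' + payload + '#' + 2 digits + NUL */
        return 0;

    out[0] = '$';
    for (i = 0; i < len; i++) {
        out[1 + i] = payload[i];
        sum = (unsigned char)(sum + (unsigned char)payload[i]);
    }
    out[1 + len] = '#';
    out[2 + len] = hex[sum >> 4];
    out[3 + len] = hex[sum & 0x0f];
    out[4 + len] = '\0';
    return len + 4;
}
```

The framed bytes would then be pushed out through whatever serial output routine was written in step 2.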

Real Time—Says Who?

Simply put, the majority of real-time systems aren't. Embedded systems are often misclassified as real-time systems. However, most systems simply do not require real-time capabilities. Real time is a relative term. Purists will often define hard real time as the need to respond to an event in a deterministic manner and in a short time, i.e., microseconds. Increasingly, hard real-time functions in this tight time range are being implemented in dedicated DSP (digital signal processor) chips or ASICs (application-specific ICs). Also, these requirements are often simply designed out through the use of a deeper hardware FIFO, scatter/gather DMA engines and custom hardware.

Many designers agonize over the need for real-time performance without a clear understanding of what their real requirements are. For most systems, near real-time response in the one- to five-millisecond range is sufficient. Also, a softer requirement may be quite acceptable, something like:

The Windows 98 Crashed_Yet monitor interrupt must be processed within 4 milliseconds 98% of the time, and within 20 milliseconds 100% of the time.

These soft requirements are much easier to achieve. Meeting them involves a discussion of context switch time, interrupt latency, task prioritization and scheduling. Context switch time was once a hot topic among OS folks. However, most CPUs handle this acceptably well, and CPU speeds have gotten fast enough that this has ceased to be a major concern.
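
A soft requirement of the kind quoted above is easy to check against measured data. The sketch below takes a set of latency samples in microseconds and tests them against the two limits from the example (4 milliseconds for 98% of events, 20 milliseconds for all of them). The function name and the idea of collecting samples into an array are assumptions for illustration; how the latencies are measured is a separate problem.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if at least 98% of the samples are <= 4000 us
 * and every sample is <= 20000 us. */
bool meets_soft_deadline(const unsigned *latency_us, size_t count)
{
    size_t within_4ms = 0;
    size_t i;

    if (count == 0)
        return false;            /* no measurements, no claim */

    for (i = 0; i < count; i++) {
        if (latency_us[i] > 20000)
            return false;        /* the 100% limit was missed */
        if (latency_us[i] <= 4000)
            within_4ms++;
    }
    /* Integer form of within_4ms / count >= 0.98 */
    return within_4ms * 100 >= count * 98;
}
```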

Tight real-time requirements should usually be handled by an interrupt routine or other kernel context driver functions in order to assure consistent behavior. Latency time, the time required to service the interrupt once it has occurred, is largely determined by interrupt priority and other software that may temporarily mask the interrupt.

Interrupts must be engineered and managed to assure that the timing requirements can be met, just as with any other operating system. On Intel x86 processors, this job can be handled quite nicely by the real-time extension to Linux (RTLinux, see http://www.rtlinux.org/). This essentially provides an interrupt processing scheduler that runs Linux as its background task. Critical interrupts can be serviced without the rest of Linux knowing about them. Thus, you get a lot of control over critical timing. Interfaces are then provided between the real-time level and the basic-Linux level with relaxed timing constraints. This provides a real-time framework similar to other embedded operating systems. In essence, the real-time critical code is isolated and “engineered” to meet the requirement, and the results of this code are handled in a more generic manner, perhaps at the application task (process) level.

Embedded System—a Definition

One view is that if an application does not have a user interface, it must be embedded, since the user does not directly interact with it. This is, of course, overly simplistic. An elevator-control computer is considered embedded, but has a user interface: buttons to select the floor and an indicator to show on which floor the elevator is now located. For embedded systems connected to a network, this distinction blurs even further if the system contains a web server for monitoring and control. A better definition might focus on the intended functions or primary purpose of the system.

Since Linux provides both a basic kernel for performing the embedded functions and also has all the user interface bells and whistles you could ever want, it is very versatile. It can handle both embedded tasks and user interfaces. Look at Linux as a continuum: scaling from a stripped-down micro-kernel with memory management, task switching and timer services and nothing else, to a full-blown server, supporting a full range of file system and network services.

A minimal embedded Linux system needs just these essential elements:

  • a boot utility

  • the Linux micro-kernel, composed of memory management, process management and timing services

  • an initialization process

To get it to do something useful and still remain minimal, you need to add:

  • drivers for hardware

  • one or more application processes to provide the needed functionality

As you add more capabilities, you might also need these:

  • a file system (perhaps in ROM or RAM)

  • TCP/IP network stack

  • a disk for storing semi-transient data and swap capability

Hardware Platforms

Choosing the best hardware is a complex job and fraught with tar pits of company politics, prejudices, legacies of other projects and a lack of complete or accurate information.

Cost is often a key issue. When looking at the costs, make sure you look at total product costs, not just the CPU. Sometimes a fast, cheap CPU can become an expensive dog of a product, once you add the bus logic and delays to make it work with your peripherals. If you are a software geek, chances are the hardware decisions have already been made. However, if you are the system designer, due diligence requires you to draw up a real-time budget and satisfy yourself that the hardware can handle the job.

Start with a realistic view of how fast the CPU needs to run to get the job done—then triple it. It is amazing how fast theoretical CPU capacity disappears in the real world. Don't forget to factor in how your application will utilize any cache.
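
The rule of thumb above is simple arithmetic. The sketch below estimates the instruction rate the workload needs and then applies the factor of three. The function name and the workload figures in the note that follows are hypothetical, not measurements from any real system.

```c
/* Estimated instructions per second to budget for: raw workload times 3.
 * Inputs are assumed small enough that the product does not overflow. */
unsigned long cpu_budget_ips(unsigned long events_per_sec,
                             unsigned long instructions_per_event)
{
    unsigned long needed = events_per_sec * instructions_per_event;

    return needed * 3;           /* headroom for real-world losses */
}
```

For example, 2,000 events per second at roughly 5,000 instructions each works out to 10 million instructions per second, so the budget would be 30 million.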

Also, figure out how fast the bus needs to run. If there are secondary buses such as a PCI bus, include them also. A slow bus or one that is saturated with DMA traffic can slow a fast CPU to a crawl.

CPUs with integrated peripherals are nice because there is less hardware to be debugged, and working drivers are frequently already available to support the popular CPUs. However, in my projects, these chips always seem to have the wrong combination of peripherals or don't have the capabilities we need. Also, just because the peripherals are integrated, don't assume this leads to the cheapest solution.

Squeezing 10 Pounds of Linux into a 5-Pound Bag

One of the common perceptions about Linux is that it is too bloated to use for an embedded system. This need not be true. The typical Linux distribution set up for a PC has more features than you need and usually more than the PC user needs also.

For starters, let's separate the kernel from the tasks. The standard Linux kernel is always resident in memory. Each application program that is run is loaded from disk to memory where it executes. When the program finishes, the memory it occupies is discarded, that is, the program is unloaded.

In an embedded system, there may be no disk. There are two ways to handle removing the dependence on a disk, depending on the complexity of the system and the hardware design.

In a simple system, the kernel and all application processes are resident in memory when the system starts up. This is how most traditional embedded systems work and can also be supported by Linux.

With Linux, a second possibility opens up. Since Linux already has the ability to “load” and “unload” programs, an embedded system can exploit this to save RAM. Consider a typical system that includes a flash memory, perhaps 8 to 16MB of flash, and 8MB of RAM. The flash memory can be organized as a file system. A flash driver is used to interface the flash to the file system. Alternatively, a flash disk can be used. This is a flash part that emulates a disk to the software. One example of this is the DiskOnChip from M-Systems (http://www.m-systems.com/) which can support up to 160MB. All of the programs are stored as files on the flash file system and are loaded into RAM as needed. This dynamic “load on demand” capability is a powerful feature that makes it easier to support a range of features:

  • It allows the initialization code to be discarded after the system boots. Linux typically uses a number of utility programs that run outside the kernel. These usually run once at initialization time, then never again. Furthermore, these utility programs can run sequentially, one after the other, in a mutually exclusive fashion. Thus, the same memory can be used over and over to “page in” each program, as the system boots. This can be a real memory saver, particularly for things like network stacks that are configured once and never changed.

  • If the Linux loadable module feature is included in the kernel, drivers can be loaded as well as the application programs. The software can check the hardware environment and adaptively load only the appropriate software for that hardware. This eliminates the complexity of having one program to handle many variations of the hardware at the expense of more flash memory.

  • Software upgrades are more modular. You can upgrade the application and loadable drivers on the flash, often while the system is running.

  • Configuration information and runtime parameters can be stored as data files on the flash.
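
The first point in the list, running one-shot utility programs strictly one after another so their memory can be reused, can be sketched with the standard fork, exec and wait calls. The function names and the idea of passing the programs as a list of argument vectors are assumptions for illustration; a real system would normally let its init process or startup script do this.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one program to completion.
 * Returns its exit status, or -1 if it could not be run or was killed. */
int run_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);    /* replace the child with the program */
        _exit(127);              /* reached only if execv failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Run a NULL-terminated list of programs in sequence, stopping at the
 * first failure. Each program's memory is released when it exits,
 * before the next one is loaded. */
int run_init_sequence(char *const *programs[])
{
    int i;

    for (i = 0; programs[i] != NULL; i++)
        if (run_and_wait(programs[i]) != 0)
            return -1;
    return 0;
}
```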

Un-Virtual Memory

Another feature of standard Linux is its virtual memory capability. This is that magical feature that enables application programmers to write code with reckless abandon, without regard to how big the program is. The program simply overflows onto the swap area of the disk. In an embedded system without a disk, this capability is usually unavailable.

This powerful feature is not needed in an embedded system. In fact, you probably do not want it in real-time critical systems, because it introduces uncontrolled timing factors. The software must be more tightly engineered to fit into the available physical memory, just like other embedded systems.

Note that depending on the CPU, it is usually advisable to keep the virtual memory code in Linux, because cutting it out entails quite a bit of work. Also, it is highly desirable for another reason—it supports shared text, which allows multiple processes to share one copy of the software. Without this, each program would need to have its own copy of library routines like printf.

The virtual-memory paging capability can be turned off simply by setting the swap space size down to zero. Then if you write programs that are bigger than actual memory, the system will behave the same way as it does when you run out of swap space; the program will not load, or perhaps a malloc will fail, if the program asks for too much memory.

On many CPUs, virtual memory also provides memory management isolation between processes to keep them from overwriting each other's address space. This is not usually available on embedded systems which just support a simple, flat address space. Linux offers this as a bonus feature to aid in development. It reduces the probability of a wild write crashing the system. Many embedded systems intentionally use “global” data, shared between processes for efficiency reasons. This is also supported in Linux via the shared memory feature, which exposes only the parts of memory intended to be shared.
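
As a sketch of the shared "global" data idea, the fragment below uses System V shared memory (shmget and shmat), which is one of the mechanisms Linux offers; whether this is the exact facility meant above is an assumption. The structure shared_status and the function names are hypothetical. Only the segment is shared: everything else in each process stays private.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical block of data shared between processes. */
struct shared_status {
    volatile int link_up;
    volatile unsigned long packets;
};

/* Create and map a private segment. Children created later with fork()
 * inherit the mapping. Returns NULL on failure. */
struct shared_status *shared_status_create(int *shmid_out)
{
    void *p;
    int id = shmget(IPC_PRIVATE, sizeof(struct shared_status),
                    IPC_CREAT | 0600);

    if (id < 0)
        return NULL;
    p = shmat(id, NULL, 0);
    if (p == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return NULL;
    }
    *shmid_out = id;
    return p;
}

/* Unmap the segment and mark it for removal. */
void shared_status_destroy(struct shared_status *s, int shmid)
{
    shmdt((const void *)s);
    shmctl(shmid, IPC_RMID, NULL);
}

/* Demonstration: a child process updates the counter, and the parent
 * sees the result because both map the same segment. */
unsigned long count_packets_in_child(struct shared_status *s, unsigned long n)
{
    pid_t pid = fork();

    if (pid == 0) {
        unsigned long i;
        for (i = 0; i < n; i++)
            s->packets++;
        _exit(0);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0);
    return s->packets;
}
```

Processes that run concurrently and both write to the segment would also need some form of locking, which this sketch leaves out.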

File Systems

Many embedded systems do not have a disk or a file system. Linux does not need either one to run. As mentioned before, the application tasks can be compiled along with the kernel and loaded as one image at boot time. This is sufficient for simple systems. However, it lacks the flexibility described previously.

In fact, if you look at many commercial embedded systems, you'll see that they offer file systems as options. Most are either a proprietary file system or an MS-DOS-compatible file system. Linux offers an MS-DOS-compatible file system, as well as a number of other choices. The other choices are usually recommended, because they are more robust and fault-tolerant. Linux also has check and repair utilities, generally missing in offerings from commercial vendors. This is especially important for flash systems which are updated over a network. If the system loses power in the middle of an upgrade, it can become unusable. A repair utility can usually fix such problems.

The file systems can be located on a traditional disk drive, on flash memory, or any other media for that matter. Also, a small RAM disk is usually desirable for holding transient files.

Flash memories are segmented into blocks. These may include a boot block containing the first software that runs when the CPU powers up. This could include the Linux boot code. The rest of the flash can be used as a file system. The Linux kernel can be copied from flash to RAM by the boot code, or alternatively, the kernel can be stored in a separate section of the flash and executed directly from there.

Another interesting alternative for some systems is to include a cheap CD-ROM drive. This can be cheaper than flash memory, and supports easy upgrades by swapping CD-ROMs. With this, Linux simply boots off the CD-ROM and gets all of its programs from the CD-ROM in the same way it would from a hard disk.

Finally, for networked embedded systems, Linux supports NFS (Network File System). This opens the door for implementing many of the value-added features in networked systems. First, it permits loading the application programs over a network. This is the ultimate in controlling software revisions, since the software for each embedded system can be loaded from a common server. It is also useful, while running, to import and export a plethora of data, configuration and status information. This can be a very powerful feature for user monitoring and control. For example, the embedded system can set up a small RAM disk, containing files which it keeps updated with current status information. Other systems can simply mount this RAM disk as a remote disk over the network and access status files on the fly. This allows a web server on another machine to access the status information via simple CGI scripts. Other application packages running on other computers can easily access the data. For more complex monitoring, an application package such as MatLab (http://www.mathworks.com/products/matlab/) can easily be used to provide graphical displays of system operation at an operator's PC or workstation.
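
The status-file idea can be sketched in a few lines of C. The function below writes a status file by way of a temporary file and rename(), so that a local reader sees either the old contents or the new, never a half-written file. The function name, the file naming scheme and the RAM disk directory are hypothetical, and how quickly a remote NFS client notices the new file depends on its caching.

```c
#include <stdio.h>

/* Write `text` to <dir>/<name> through a temporary file and rename().
 * Returns 0 on success, -1 on failure. */
int write_status_file(const char *dir, const char *name, const char *text)
{
    char tmp_path[256];
    char final_path[256];
    FILE *f;
    int n;

    n = snprintf(tmp_path, sizeof tmp_path, "%s/.%s.tmp", dir, name);
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;
    n = snprintf(final_path, sizeof final_path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof final_path)
        return -1;

    f = fopen(tmp_path, "w");
    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF) {
        fclose(f);
        remove(tmp_path);
        return -1;
    }
    if (fclose(f) != 0 || rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

The embedded application would call this periodically with the directory of its RAM disk, and a CGI script on another machine would simply read the resulting file over the network mount.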

Booting—Where's LILO and the BIOS?

When a microprocessor first powers up, it begins executing instructions at a predetermined address. Usually there is some sort of read-only memory at that location, which contains the initial start-up or boot code. In a PC, this is the BIOS. It performs some low-level CPU initialization and configures other hardware. The BIOS goes on to figure out which disk contains the operating system, copies the OS to RAM and jumps to it. Actually, it is significantly more complex than that, but this is sufficient for our purposes. Linux systems running on a PC depend on the PC's BIOS to provide these configuration and OS-loading functions.

In an embedded system, there often is no such BIOS. Thus, you need to provide the equivalent startup code. Fortunately, an embedded system does not need the flexibility of a PC BIOS boot program, since it usually needs to deal with only one hardware configuration. The code is simpler and tends to be fairly boring. It is just a list of instructions that jam fixed numbers into hardware registers. However, this is critical code, because these values need to be correct for your hardware and often must be done in a specific order. There is also, in most cases, a minimal power-on self-test module that sanity-checks the memory, blinks some LEDs, and may exercise some other hardware necessary to get the main Linux OS up and running. This startup code is highly hardware-specific and not portable.
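
One common way to keep such code readable is to put the "fixed numbers" in a table and apply them in order. The sketch below shows the idea in C. The register offsets, the values and the names are all hypothetical; real ones come from the CPU and board data sheets, and the very first instructions on many processors have to be written in assembly before C code can run at all.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of the startup sequence: store `value` at byte `offset`
 * from the base of the register block. */
struct reg_init {
    uint32_t offset;
    uint32_t value;
};

/* Hypothetical offsets and values, for illustration only. */
static const struct reg_init board_init_table[] = {
    { 0x00, 0x00000001 },        /* e.g. clock enable */
    { 0x04, 0x0000a5a5 },        /* e.g. memory timing */
    { 0x08, 0x00000003 },        /* e.g. chip selects */
};

/* Apply the table strictly in order, since the sequence often matters. */
void apply_reg_init(volatile uint32_t *base,
                    const struct reg_init *table, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        base[table[i].offset / sizeof(uint32_t)] = table[i].value;
}
```

Keeping the values in one table makes it easier to review them against the data sheet and to adapt the code when the board changes.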

Fortunately, most systems use a fairly cookbook hardware design for the core microprocessor and memory. Typically, the chip manufacturer has a demo board that can be used as a reference design—more or less copying it for the new design. Often, startup code is available for these cookbook designs, which can be modified for your needs fairly easily. Rarely will new startup code need to be written from scratch.

To test the code, you can use an in-circuit emulator containing its own “emulation memory”, which replaces the target memory. You load the code into the emulator and debug via the emulator. If this is not available, you may skip this step, but count on a longer debug cycle.

This code ultimately needs to run from some non-volatile memory, usually either a flash or an EPROM chip. You will need some way to get the code into this chip. How this is done will depend on the “target” hardware and tools.

One popular method is to take the flash or EPROM chip and plug it into an EPROM or flash “burner”. This will “burn” (store) your program into the chip. Then, plug the chip into a socket on your target board and turn on the power. This method requires the part to be “socketed” on the board; however, some device package formats cannot be socketed.

Another method is via a JTAG interface. Some chips include a JTAG interface which can be used to program the chip. This is the most convenient way to do it. The chip can be permanently soldered onto the board, and a small cable run from a JTAG connector on the board to a JTAG interface, usually a PC card. The downside is that some custom programming is usually required on the PC to operate the JTAG interface. This same facility can also be used in production for smaller-quantity runs.

Robustness—More Reliable than a Politician's Promise

Linux is generally considered to be very reliable and stable when running on PC hardware, particularly when compared to a popular alternative. How stable is the embedded kernel itself? For most microprocessors, Linux is quite good. A Linux kernel port to a new microprocessor family is usually done to more than just the microprocessor. Typically, it is ported to one or more specific target boards, which include some specific peripherals as well as the CPU.

Fortunately, much of the kernel code is processor-independent, so porting concentrates on the differences. Most of these are in the memory management and interrupt handling areas. Once these are ported, they tend to be fairly stable. As discussed before, boot strategies vary depending on the hardware specifics, and you should plan on doing some customization.

The device drivers are more of a wild card: some are more stable than others. Also, the selection is rather limited; once you leave the ubiquitous PC platform, you may need to create your own. Luckily, many device drivers are floating around, and you can probably find one close to what you need that can be modified. The driver interfaces are well-defined. Most drivers of a like kind are fairly similar, so migrating a disk, network or serial port driver from one device to another is usually not too difficult. I have found most drivers to be well-written and easy to understand, but keep a book on the kernel structures handy.

In my experience, Linux is at least as stable as the big-name commercial operating systems with which I have worked. Generally, the problems with these operating systems and Linux stem from a misunderstanding of the subtlety of how things work, rather than hard coding bugs or basic design errors. Plenty of war stories abound for any operating system and need not be repeated here. The advantage to Linux is that the source code is available, well-commented and very well-documented. As a result, you are in control of dealing with any problems that come up.

Along with the basic kernel and device drivers, some additional issues arise. If the system has a hard disk, the reliability of the file system comes into question. We have over two years of field experience with an embedded Linux system design employing a disk. These systems are almost never shut down properly. Power just gets disconnected at random times. The experience has been very good, using the standard (EXT2) file system. The standard Linux initialization scripts run the fsck program, which does an excellent job of checking and cleaning up any dangling inodes. One change that may be wise is to run the update program at a 5- or 10-second interval instead of the default 30 seconds. This shortens the time window that data sits in the local memory cache before being flushed to disk, thus lowering the probability of losing data.

Where Are the Skeletons?

Embedded Linux does have its drawbacks. For one, it can be a memory hog, although it's not much worse than some of its commercial competitors. Part of this can be whittled down by removing unused features, but this may take more time than it is worth and may induce some ugly bugs if not done carefully.

Most Linux applications are addicted to virtual memory, which is of dubious value in many embedded systems, so don't assume an embedded system with no disk will run any Linux application.

The kernel-level debug tools are not all that great, particularly at the lower levels. kgdb can have hard landings fairly easily, and you just have to reboot. Unfortunately, print statements get used more than we'd like.

The worst problem for me, however, is a psychological one. Linux is extremely flexible. Embedded systems are generally not flexible; rather, they are highly engineered to perform their intended function optimally. There is a tendency to preserve the flexibility, keep the general purpose capabilities and make as few changes as possible. These are noble goals, but they come at the expense of sharply tuning the system for the specific job. Keeping this flexibility can result in extra work and carrying extra software baggage around, and sometimes involves tradeoffs which degrade performance. One example that comes up repeatedly is configuration. Consider configuring the IP address on a network interface, which is usually done by running the ifconfig program from a startup script. This is a 28K program that could be replaced with a few lines of code to initialize the appropriate structures, using data from a configuration file. Nevertheless, while this is a reasonable thing to do, it still “hurts” because it is contorting the software in ways it was never meant to be used.
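
To make the ifconfig example concrete, here is a sketch of the "few lines of code" in C. It fills in a struct ifreq and hands it to the kernel with the SIOCSIFADDR ioctl. The function names are hypothetical, the address is passed in as a string that the caller would have read from its configuration file, and the call needs root privileges on real hardware. A complete replacement would also set the netmask and bring the interface up (SIOCSIFNETMASK and SIOCSIFFLAGS).

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill `ifr` with an interface name and an IPv4 address given in
 * dotted-quad form. Returns 0 on success, -1 on bad input. */
int fill_ifreq(struct ifreq *ifr, const char *ifname, const char *dotted_quad)
{
    struct sockaddr_in sin;

    if (strlen(ifname) >= IFNAMSIZ)
        return -1;

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, dotted_quad, &sin.sin_addr) != 1)
        return -1;

    memset(ifr, 0, sizeof *ifr);
    strcpy(ifr->ifr_name, ifname);
    memcpy(&ifr->ifr_addr, &sin, sizeof sin);
    return 0;
}

/* Ask the kernel to assign the address to the interface.
 * Returns 0 on success, -1 on failure. */
int set_ip_address(const char *ifname, const char *dotted_quad)
{
    struct ifreq ifr;
    int fd, rc;

    if (fill_ifreq(&ifr, ifname, dotted_quad) != 0)
        return -1;
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, SIOCSIFADDR, &ifr);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

A call such as set_ip_address("eth0", "192.168.1.10") would then stand in for the corresponding line of the startup script.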

The Bottom Line

Using Linux for an embedded system is possible and has been done. It works. It is reliable. The development costs are in line with the alternatives.

Joel R. Williams is the principal architect at emLinux. He develops embedded systems for networking applications and telecommunications. He can be reached at joel@emlinux.com, or visit http://www.emlinux.com/.