Best of Technical Support

Various

Issue #70, February 2000

Our experts answer your technical questions.

Configuring PPP

When I installed PPP for Internet connection, I changed some properties by using the linuxconf command in Linux configuration. Now whenever I start Linux, it gives me an error like:

starting system loggers:
ypbind[187]: clnt_create for server 127.0.0.1 failed
starting NFS Services: rpc.mountd rpc.nfsd
YPBINDPROC_DOMAIN: Domain not bound.

Please help me to solve this problem. —Kalpesh Vakharia, suryaksh@hotmail.com

You haven't hurt your PPP configuration; you've just activated a tool (Yellow Pages or yp) that you haven't fully configured. The error is harmless, but you can remove it by commenting out any entries that point to yp in your boot scripts. —Chad Robinson, Chad.Robinson@brt.com

Your message shows that you enabled yellow pages, but that you have no server to talk to. Run chkconfig --del ypbind and you should be fine. —Marc Merlin, merlin@varesearch.com
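Marc's suggestion can be wrapped in a small shell function. This is only a sketch, assuming a Red Hat-style system where chkconfig is installed; the function name disable_ypbind is made up for illustration, and it has to be run as root to have any effect.

```shell
# Sketch only: disable_ypbind is a hypothetical helper name.
# Assumes a Red Hat-style system that provides chkconfig.
disable_ypbind() {
    if command -v chkconfig >/dev/null 2>&1; then
        chkconfig --list ypbind   # show how ypbind is currently set per runlevel
        chkconfig --del ypbind    # remove ypbind from the boot sequence
    else
        echo "chkconfig not found; look for ypbind in your boot scripts instead" >&2
        return 1
    fi
}
```

Call disable_ypbind once as root, then reboot to confirm that the "Domain not bound" message is gone.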

Crash After Booting

My system just crashed and left me with the following message:

checking root filesystems
parallelizing fsck version 1.04 (16 may 96)
[/sbin/fsck.ext2] fsck.ext2 -a /dev/sda1
/dev/sda1 contains a file system with errors, check forced
Block 4294967295 of inode 131128 > Blocks (1208304)
/dev/sda1: UNEXPECTED INCONSISTENCY; Run fsck manually
an error occurred during the file system check.
Dropping you to a shell; the system will reboot when you leave the shell
(repair filesystem) #

The system is completely stalled at the moment. What should I do from here? Another person was using the computer when it crashed. What did they do to get it in such a situation? —Nathan Cutter, NCutter@ricegrowers.com.au

Like most modern operating systems, Linux uses write caching. Turning the computer off (it's hard to get it to crash without an actual hardware problem) without properly shutting it down can cause data errors on your hard drive. To solve this problem, you should do exactly what it says: run fsck manually on your hard drive. You should do this from a boot disk. The boot disks that you used to install Linux are usually fine for this purpose. Simply run fsck /dev/sda1 (the partition that is showing the errors). The program will prompt you to fix each error in exactly the same way that—Chad Robinson, Chad.Robinson@brt.com

No, it is not stalled, it just isn't yet booted. You must run fsck by hand (e2fsck /dev/sda1) and reply to questions. Most likely, you'll just answer yes to any questions, so you might even add the -y switch to e2fsck, although this is considered unsafe. When you are done, exit from the shell (exit) and the system will reboot. Since not every file system repair can be performed automatically in a fail-safe way, human intervention can be required when bad errors are detected. I can't tell what caused the problem, but I dare say the most likely reason is some hardware failure (the disk itself or a RAM chip), as fsck found an all-one word (0xffffffff, or 4294967295) where real data was expected. —Alessandro Rubini, rubini@linux.it
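The "all-one word" remark can be checked with a few lines of shell. The sketch below uses only the two numbers from the fsck message; the helper name to_hex is an illustrative assumption, and it presumes a shell with 64-bit arithmetic.

```shell
# Convert a decimal block number to hexadecimal (to_hex is a hypothetical helper).
to_hex() {
    printf '0x%x\n' "$1"
}

bad_block=4294967295     # block number reported by fsck
fs_blocks=1208304        # block count of the file system, from the same message

to_hex "$bad_block"      # shows the value with every one of its 32 bits set

# A block number is valid only if it is smaller than the block count.
if [ "$bad_block" -ge "$fs_blocks" ]; then
    echo "block $bad_block lies outside the file system"
fi
```

Because the reported block number is far larger than the number of blocks on the partition, fsck cannot treat it as real data and asks for a manual check.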

Kernel Won't Upgrade

I am trying to upgrade the kernel from 2.2.7 to 2.2.12. I have downloaded the kernel and it compiles fine. I installed the zImage and System.map files where required. However, when I try to boot up again, the kernel version is still 2.2.7. Subsequently, the system tries to load the 2.2.7 modules and not the 2.2.12 modules I require. —Michael Hoegen, m_hoegen@yahoo.com

Copying the image files to the correct disk locations isn't enough. In fact, your system may stop booting shortly. You need to tell your boot loader that you've done this. Linux isn't running yet when your system needs to find that file, so to work around that problem, your boot loader records its physical disk location, and copying one file over another always changes this location. If you are using LILO as your boot loader, you can simply type lilo at your shell prompt (as root) to force it to see this file. If you are using another boot loader, consult that program's documentation. —Chad Robinson, Chad.Robinson@brt.com

I don't think you reran /sbin/lilo after copying the kernel, and most likely you didn't even copy the kernel to the right place. If you truly overwrote the kernel and only forgot to run lilo, your system will stop booting very soon (I won't dig into the technical details here; please check LILO documentation and my article about booting in the June 1997 LJ). When upgrading the kernel on a working box, you should always keep a copy of the previous (working) kernel, to recover the computer in case the new kernel image doesn't work for you. To do that, you should add another “image=” stanza to /etc/lilo.conf and rerun lilo. If your lilo.conf is not well-commented, you'll need to refer to proper documentation (such as man lilo.conf). —Alessandro Rubini, rubini@linux.it
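As a rough illustration of the two-stanza setup described above, the following shell sketch prints a sample lilo.conf and counts its image= stanzas. The device names (/dev/hda, /dev/hda1), kernel file names, and labels are assumptions chosen for the example, not values taken from the reader's machine; the helper names are hypothetical as well.

```shell
# Print a sample lilo.conf with a new kernel and a fallback kernel.
# All paths, devices, and labels below are illustrative assumptions.
sample_lilo_conf() {
    cat <<'EOF'
boot=/dev/hda
prompt
timeout=50
default=linux

# new kernel
image=/boot/vmlinuz-2.2.12
    label=linux
    root=/dev/hda1
    read-only

# previous, known-good kernel kept as a fallback
image=/boot/vmlinuz-2.2.7
    label=old
    root=/dev/hda1
    read-only
EOF
}

# Count the lines that open an image= stanza in the file named by $1.
count_images() {
    grep -c '^[[:space:]]*image[[:space:]]*=' "$1"
}

conf=$(mktemp)
sample_lilo_conf > "$conf"
echo "image stanzas: $(count_images "$conf")"
rm -f "$conf"
```

After editing the real /etc/lilo.conf along these lines, rerun /sbin/lilo as root so that both kernels are recorded; typing the label "old" at the LILO prompt would then boot the previous kernel.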

Odd Booting Message

When Linux boots, I receive the message “LIL-”. Documentation said that it is a description table error. How can I fix this problem? —Adrian Lasso, alasso@baufest.com

LILO is looking for your kernel and can't find it. Usually this happens in one of two cases. The first is when you install a new kernel and forget to tell LILO it's there by running lilo as root before rebooting. The second is when LILO simply can't cope with your hard drive format. You can solve it either way by booting from a set of boot disks and rerunning LILO. If running it alone doesn't help, try running it as lilo -l. This often lets LILO work around certain hard drive formats that it otherwise might not be able to read. —Chad Robinson, Chad.Robinson@brt.com

Recovering from such problems is not trivial, as you need an alternate way to boot. I'd suggest booting from CD or floppy, getting into your installed system, and rerunning /sbin/lilo. Also, adding the “linear” keyword to the /etc/lilo.conf file is usually beneficial. These problems, however, are usually very hard to track down; you can find a lot of information on architectural problems related to system boot in Andries Brouwer's pages, at http://www.win.tue.nl/~aeb/. —Alessandro Rubini, rubini@linux.it

The Headless Serial Console

I would like to set up a serial console. I would like to be able to issue LILO boot commands on the serial console. Eventually, if all goes well, I'd like the machine to be totally headless. If you could give me a hands-on “how to do this”, it would be great. I have already read your “Serial Terminal as Console” (Issue #36, April 1997) article, but I don't get the LILO prompt on my terminal. —Rick McFarland, mbsrick@ctel.net

This is not a problem at all. If you run kernel 2.2 or 2.3, just configure serial console support and read Documentation/serial-console.txt. If you run 2.0, you must first apply the serial-console patch (ftp://ftp.cistron.nl/pub/people/miquels/kernel/) and then proceed as in the previous case. To use LILO on the serial port, just add serial=0,9600n8 or an equivalent line to the /etc/lilo.conf file. In order to interact with LILO, you'll need to send a “break” character. —Alessandro Rubini, rubini@linux.it

Here are the relevant two lines from my lilo.conf file for serial port 1:

append="panic=40 console=ttyS0,38400n8 console=tty0"
serial=0,38400n8

On my machine, I had to disable hardware flow control for minicom to interact with the lilo prompt over the serial port. —Marc Merlin, merlin@varesearch.com

Workstation and Server Connection

I am having problems understanding the concept of setting up a Linux box as a workstation and connecting it to the server. I have set up a server: DOMAIN:server.dungarvin.com IP:192.168.100.1, and a box with DOMAIN:w1.dungarvin.com IP:192.168.100.2; netmask is 255.255.255.0 on both machines. My network cards are working, and I can have them ping themselves, but I can't ping one box to the other—I'm stuck there.

Once I do get connectivity, I'd like to run StarOffice from w1 off of the server. I'm wondering how I go about creating a simple network between the two and the concepts involved. —Nick Anderson, neekolai1@hotmail.com

You are slightly confused in your use of the term DOMAIN; what you are giving are FQDNs (fully qualified domain names, that is, host names with the domain attached).

Being able to ping yourself doesn't mean much; it can work even if your network is non-functional. You may have a bad cable or some other link-layer problem. You should check the link light on both of your Ethernet cards to make sure it's lit. Maybe you connected the machines directly without using a crossover cable.

To run an application on a remote display, you need to do the following:

w1:~$ xhost server
w1:~$ rlogin server        # or: telnet server
server:~$ export DISPLAY=w1:0
server:~$ soffice &

This is not completely secure, since you're allowing anyone on the server to take over your X display and snoop on your keystrokes. Instead of going into detail about exporting an MIT-MAGIC-COOKIE to allow only one user (you) to access the display, I'd recommend using SSH (secure shell) version 1.2.x (not 2.x which isn't free), as SSH takes care of X security and the display exporting for you. —Marc Merlin, merlin@varesearch.com
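The SSH route might look like the sketch below. It assumes an OpenSSH-style client in which the -X option requests X11 forwarding; other SSH clients may handle forwarding differently, so check the man page for yours. The helper name run_remote_x is made up for illustration.

```shell
# Sketch only: run_remote_x is a hypothetical helper name.
# Assumes an ssh client where -X requests X11 forwarding.
run_remote_x() {
    host=$1
    shift
    # The remote program inherits a DISPLAY set up by ssh,
    # so no xhost or manual "export DISPLAY" step is needed.
    ssh -X "$host" "$@"
}
```

From w1, calling run_remote_x server soffice would then start StarOffice on the server with its windows shown on the local display.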

Every computer can ping itself, even if the Ethernet card is not working or not there at all. If you try running ifconfig and route, you'll be able to see what the problem is (errors transmitting, receiving, or just no route at all). I suspect you have no route associated with the interface; try: route add -net 192.168.100.0 dev eth0. —Alessandro Rubini, rubini@linux.it
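The network address used in that route command follows from the addresses and netmask given in the question. As a small sketch, the shell function below (network_of is a hypothetical name) combines each octet of an address with the matching octet of the netmask using a bitwise AND.

```shell
# network_of IP NETMASK: print the network address.
# The body runs in a subshell so the IFS change stays local.
network_of() (
    IFS=.
    set -- $1 $2      # split both dotted quads into eight positional parameters
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

server_net=$(network_of 192.168.100.1 255.255.255.0)
w1_net=$(network_of 192.168.100.2 255.255.255.0)

if [ "$server_net" = "$w1_net" ]; then
    echo "both hosts are on network $server_net"
fi
```

Both machines land on the same network, so no gateway is needed between them; each one only needs a route for that network through its Ethernet interface.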