An Intranet Filing System

Justin Seiferth

Issue #50, June 1998

Mr. Seiferth offers us a solution for keeping track of shared files on over an Intranet that utilizes several operating systems.

Like many of you, I often face challenges when recommending a UNIX-based solution at work. Despite its track record, many managers feel UNIX still has something to prove. So putting a UNIX system in place means working extra fast and decisively. I had just such an opening recently when I put together a system to publish the common file sharing areas of our Microsoft Windows NT-based network. I thought others might be interested in this system and created a distribution, so you too can have your own Intranet Filing System.

One of Microsoft's Windows NT failings is its archaic file system. While many of its inadequacies may be overcome, the users of our internal LAN were having problems finding information among the millions of files spread across thousands of directories located on hundreds of servers across the world. We used Linux, Sun's Solaris and some Netscape products to integrate and publish these file-sharing areas on our Intranet. You can use this system to publish any mountable file system to include FTP sites, Novell and Appleshare File Servers and NFS shares.

The key to this system working on a network utilizing Microsoft products is the ability of the Linux kernel (later 2.0 and 2.1 versions) to mount local NTFS and remote SMB volumes. There's nothing esoteric about enabling or using this option, just check it off when you're building the kernel. Once you've got an SMB-capable kernel installed, you're almost ready to go. The other components are the smbfs utilities, the wu-ftpd FTP server suite, a web server, a search engine and a javascript-capable browser. Your distribution has probably installed an operational FTP and web server, and most people nowadays have the Netscape browser installed, so all you need to do is compile the smbfs utilities and set up a search engine.

If most of the documents on the file areas you wish to publish are in text or HTML format, there are a number of search engines from which to choose: htdig and glimpse, for example. If you want to be able to search non-HTML documents then you might need one of the commercial search engines. We use Netscape's Catalog Server for Solaris, which has the ability to index binary files such as Adobe's PDF and popular office-automation application formats.

Before your “enterprise information warehouse” becomes operational, a few configuration files must be modified on your system. For instance, if you want your system to configure itself after a reboot, you'll need to modify your startup files. Also, make sure your computer is automatically mounting SMB shares into an area accessible by an anonymous FTP session when you boot your system and enable a few features of your FTP server.

First, let's contend with mounting shares automatically. I do this with a standard run-level 3/5 initscript; Listing 1 is an excerpt showing the most critical lines.

The smbmount(8) and umount(8) man pages have more details on what all those flags are about. Basically, we are mounting the shares into a directory accessible via anonymous FTP. Our common shares are read/write for everyone. If your site is more cautious you may want to review the implications of the file permissions and ownership or perhaps impose access controls using your file system and web server's security mechanisms.

Now, let's take a look at scripts used to configure your FTP server. If you start your FTP daemon using the -a option, the /etc/ftpaccess file will allow you to customize many aspects of the FTP server's performance and capabilities. Normally, you enable the -a option of your FTP server in your /etc/inetd.conf file. Some people run their FTP full time, in this case check the startup files in your /etc/rc.d/rc3.d or rc5.d directory and add the option there. Among the benefits of using ftpaccess is the ability to specify header and trailer messages for the directory listings generated by your FTP server.

A piece of our user interface is composed using an HTML file in the root directory of the FTP daemon's file area. The entry in ftpaccess is as follows:

...
message /welcome.html login

Now the contents of welcome.html are displayed at the beginning of FTP directory listings. The source of welcome.html is shown in Listing 2.

The best way to understand what this file does is to just load it up and go. In a nutshell, if the FTP listing is contained within a frame then the if part of the conditional is presented. The interface allows the user to press an “upload” button which will pop open another browser instance with the FTP directory as the root window. When welcome.html is displayed as “top” within this browser; it displays instructions on how to upload a file using the FTP capabilities of the browser.

This interface is not the first one we tried. We needed a design that allowed easy navigation around a complicated system and kept at least minimal help hints in front of the users all the time. I wanted to make the system intuitive, so we could spend less time answering questions and more time working on new ideas. The tests we conducted showed most people knew how to download files but were surprised to learn they could upload or view the contents of files. We tried HTTP uploads and downloads but settled on the combination of FTP and HTTP servers distributed across several machines. Encapsulating the FTP file display simplified uploads and downloads. Unlike a web server, our FTP server labels all files as a single MIME type allowing us to use a single helper application to easily display all files. Getting the preview function to work will require the association of MIME type(s) with applications on the user's computer. We use a universal viewer. You might investigate one of the many plug-ins which allow viewing files within the browser.

Now the majority of the work and trickery is done; all that remains is the remainder of the frame-based user interface, a few snazzy graphics and some help files. When you download the distribution, you may notice that within our frame definition document we are distributing this system across several machines. This is an important feature of the system. We make use of local proxy servers for FTP and HTTP traffic to keep down the loading of our MAN/WAN backbone. We place a proxy and web server on the department's subdomain. Since the proxy and web are local to users, we've found that lots of times the files retrieved from the central Linux-based FTP server (those ultimately stored on our SMB common file area) are served from the local proxy (cache) speeding up the file transfer dramatically and reducing our network traffic.

I mentioned before that we are using the SPARC Solaris version of the Netscape Catalog Server to allow users to expediently find any document or content within a wide variety of popular UNIX (in particular, Linux), Macintosh and Microsoft Windows application formats. We inserted a custom interface to this server which places the results into the frame normally utilized by the FTP directory display or opening splash screen. This feature provides some much needed help to users who must retrieve one of several hundred thousand documents stored on servers spread across the globe. Locating documents is absolutely not feasible using the Microsoft Windows NT file manager search feature previously recommended by the Microsoft Windows NT operators.

You can add many other enhancements such as browser access to multiple file system types (NFS, Appleshare, SMB, AFS, etc.) and Internet/Intranet FTP areas. We are also working on a configuration management add-on using PHP/FI and Postgres to present users with a fully graphical file upload facility which will also store meta data on documents such as the originator of the information, the originator's e-mail address and other information.

Resources

Credits

Justin is a Major in the US Air Force and Linux Journal is delivered to his office so his co-workers can also benefit from it. If you go to the Pentagon, drop by and say hello. Justin can be reached via e-mail at seifertj@www.disa.mil.