Computer virus

In computer security terminology, a virus is a piece of program code that, like a biological virus, makes copies of itself and spreads by attaching itself to a host, often damaging the host in the process. The host is another computer program, often part of a computer operating system, which in turn infects other applications; the virus spreads when those applications are transferred to other computers. The plural of virus is viruses; the form virii is sometimes used, both knowingly and otherwise, but is incorrect. See the usage note at virus (disambiguation).

Like all code, viruses consume the host's resources, such as memory and hard disk space. Some are deliberately destructive (erasing files or formatting hard disks), and some allow others to access the machine across a network without authorization.

The term is often used in common parlance to describe all kinds of malware (malicious software), including programs that are more properly classified as worms or trojans. Most popular antivirus software packages defend against all of these types of attack.

There are a few relatively "harmless" viruses that have been written to perform a simple task (such as flashing a single message onto the user's computer screen). A small percentage of viruses are the result of computer code that operates in an unexpected manner, but the majority of viruses are programs deliberately written to interfere with, or damage, other programs or computer systems.

The term "virus" was first used in this sense in print by Fred Cohen in his 1984 paper Experiments with Computer Viruses, where he credits Len Adleman with coining it. However, David Gerrold's 1972 science fiction novel When H.A.R.L.I.E. was One includes a description of a fictional computer program called "VIRUS" that worked just like a virus (and was countered by a program called "ANTIBODY"), and John Brunner's 1975 novel The Shockwave Rider describes programs known as "tapeworms" which spread through a network for the purpose of deleting data. The term "computer virus" in its current sense also appears in the comic book "Uncanny X-Men" No. 158, published in 1982. Thus, although Cohen's use of "virus" may have been the first academic use, the term had already appeared in popular fiction before then.

A program called "Elk Cloner" is credited with being the first computer virus to appear "in the wild" -- that is, outside the single computer or lab where it was created. Written in 1982 by Rich Skrenta, it attached itself to the Apple DOS 3.3 operating system and spread by floppy disk. Details of this virus, including source, may be found at [1].

Since the mid-1990s, viruses that infect operating systems or applications directly have been eclipsed by macro viruses. Written in the scripting languages of Microsoft programs such as Word and Outlook, these viruses spread in the Windows monoculture by infecting documents and sending infected e-mail. Although Windows is by far the most popular target for virus writers, a handful of viruses (numbering in the single digits) have been seen on Mac OS X, and some exist for other Unix-based operating systems. Any operating system that allows third-party programs to run can, in theory, run viruses. However, some operating systems are less secure than others. Unix-based operating systems (and NTFS-aware applications on Windows NT-based platforms) only allow ordinary users to run executables within their own protected space and directories, which limits what an infected program can modify.

Nature of viruses

Much bandwidth has been wasted arguing about the difference between a virus and a computer worm; the important thing about both is that they spread, and therefore can cause orders of magnitude more trouble than a direct attack or a typical non-spreading Trojan horse.

While viruses can be (and often are) malicious, destroying data, many are fairly benign or merely annoying (for example, displaying a message to the user). Many such viruses have a delayed payload: they display a message on a specific holiday, day of the month, or time of day; wait for a certain number of infections or reboots; or trigger at random with a small probability.

The predominant destructive effect of viruses is their uncontrolled self-reproduction, which wastes or overwhelms computer resources.

"Good" viruses have also appeared that spread improvements to the programs they infected, or delete other viruses. These are, however, quite rare, and still consume system resources.

Replication strategies

A virus requires several features from its host software to successfully duplicate itself. It must be permitted to execute code and write to memory. For this reason, many viruses attach themselves to useful programs, in the hope that users will run those programs (and therefore the virus).

Before computer networks became widespread, most viruses spread on removable media, particularly floppy disks. In the early days of personal computers, many users regularly exchanged information and programs on floppies. Some viruses spread by infecting programs stored on these disks, while others installed themselves into the disk boot sector, ensuring that they would be run when the user booted the computer from the disk.

As bulletin board systems and online software exchange became popular in the late 1980s and early 1990s, more viruses were written to infect popularly traded software. Shareware and bootleg software were equally common vectors for viruses on BBSes. Within the "pirate scene" of hobbyists trading illicit copies of commercial software, traders in a hurry to obtain the latest applications and games were easy targets for viruses.

Many personal computers are now connected to the Internet and to local-area networks. Today's viruses take advantage of standard network services such as the World Wide Web, e-mail, and file sharing systems to spread, blurring the line between viruses and worms.

Hiding strategies

To avoid detection, some well-written viruses employ various kinds of obfuscation. Some old viruses, especially on MS-DOS, alter the information recorded for the files they infect, such as the last-modified date and the file size, so that it appears unchanged. Antivirus software that only examines recently modified files, or files whose size has changed, will not notice the virus in this case. Note that under MS-DOS, changing the recorded size of a file is not the same as changing its actual size. This approach does not fool current antivirus software.
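One reason metadata tricks of this kind fail against later tools is that a scanner can compare file contents rather than file attributes. The following is a minimal sketch of that idea, not a description of how any particular antivirus product works: it records a SHA-256 digest per file and reports files whose digest has changed. The function names and the demonstration file are hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(baseline, paths):
    """Return the paths whose current digest differs from the baseline."""
    return [p for p in paths if baseline.get(p) != file_digest(p)]


def demo():
    """Modify a file without changing its size, then detect the change."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "program.bin")
        with open(path, "wb") as f:
            f.write(b"ORIGINAL")
        size_before = os.path.getsize(path)
        baseline = {path: file_digest(path)}

        # Same-length modification: the recorded size stays the same.
        with open(path, "wb") as f:
            f.write(b"MODIFIED")
        size_after = os.path.getsize(path)

        return size_before == size_after, changed_files(baseline, [path])
```

Calling `demo()` returns a pair: whether the size is unchanged, and the list of files whose contents no longer match the baseline. A check based only on size would miss the modification, while the digest comparison reports it.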

Another hiding technique, which was also an easy way for old viruses to spread, was to infect the boot sector of the hard disk rather than the files stored on it. At bootstrap, the computer runs the code located in the boot sector, which the virus has replaced with its own code. The virus loads itself from the hard disk into memory and makes itself memory-resident; it then loads the original boot sector into memory and transfers control to it. In this way, not even the operating system notices the presence of the virus.

As computers and operating systems grow larger and more complex, old hiding techniques need to be updated or replaced. Modern viruses often try to exploit weaknesses in the way antivirus software detects them. Most modern antivirus programs scan ordinary programs for virus patterns. If a scanner finds a byte pattern that matches a known virus pattern, it tries to remove the virus or to quarantine or delete the infected file.
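The pattern matching described above can be sketched in a few lines. This is an illustrative toy, assuming a signature is simply a fixed byte string; the signature names and byte values below are invented for the example and do not correspond to any real virus or product database.

```python
# Hypothetical signature database: name -> byte pattern (made-up values).
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": b"HYPOTHETICAL-MARKER",
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return a sorted list of (name, offset) for each signature found."""
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # -1 when the pattern is absent
        if offset != -1:
            hits.append((name, offset))
    return sorted(hits)


def scan_file(path, signatures=SIGNATURES):
    """Scan the full contents of a file for known signatures."""
    with open(path, "rb") as f:
        return scan_bytes(f.read(), signatures)
```

For example, `scan_bytes(b"ordinary program bytes")` returns an empty list, while data containing the bytes `de ad be ef 01 02` yields a hit for `Example.A` together with the offset at which the pattern starts. Real scanners are considerably more elaborate, but the limitation is already visible here: a pattern that is not in the database, or that has been altered, is not found.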

The CIH virus, also known as the Chernobyl virus, infected Portable Executable files. Because those files contain many empty gaps, the virus, which was about 1 kilobyte long, could hide in the gaps without increasing the size of the file.

Modern state-of-the-art viruses try to encrypt themselves so that antivirus scans do not detect them. This is often done with a combination of encryption and self-modifying code. A virus that uses this technique is said to be polymorphic.

A polymorphic virus usually has two parts: the encryption/decryption engine and the infector. The engine encrypts and decrypts the infector, using a different key each time the virus runs. The engine cannot encrypt itself, because there would then be no code left to decrypt it the next time the virus ran. Instead, the engine must use a form of self-modifying code that alters itself differently each time without losing any part of the original algorithm. This is possible with a good knowledge of assembly language and the use of polymorphic code.
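From the scanner's point of view, the effect of a changing key can be illustrated with a harmless placeholder string. The sketch below assumes the simplest possible cipher, a single-byte XOR, purely for illustration; the placeholder body, the signature, and the key values are all hypothetical. It shows only why a fixed byte pattern stops matching once the same bytes are stored under different keys.

```python
def xor_bytes(data, key):
    """XOR every byte with a one-byte key; applying it twice restores data."""
    return bytes(b ^ key for b in data)


# Harmless stand-in for the encrypted part, and a pattern a scanner
# might have recorded from an unencrypted sample.
BODY = b"HARMLESS-PLACEHOLDER-BODY"
SIGNATURE = b"PLACEHOLDER"

# The same bytes stored under two different keys.
copy_a = xor_bytes(BODY, 0x21)
copy_b = xor_bytes(BODY, 0x5A)

found_in_plain = SIGNATURE in BODY       # pattern matches the plain bytes
found_in_copy_a = SIGNATURE in copy_a    # pattern is absent after encoding
found_in_copy_b = SIGNATURE in copy_b
copies_differ = copy_a != copy_b         # no shared fixed pattern to record
round_trip_ok = xor_bytes(copy_a, 0x21) == BODY
```

Because the two stored copies have no fixed bytes in common with the plain form, a scanner cannot rely on a pattern taken from the encrypted part. The decryption code itself is not encrypted and remains a possible detection target, which is why the text above stresses that it has to change as well.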

Viruses and popular software

Another analogy to biological viruses is worth noting: just as genetic diversity in a population decreases the chance of a single disease wiping out a population, the diversity of software systems on a network similarly limits the destructive potential of viruses.

This became a particular concern in the 1990s, when Microsoft gained market dominance in desktop operating systems and office software. Users of Microsoft software (especially networking software such as Microsoft Outlook and Microsoft Internet Explorer) are particularly vulnerable to the spread of viruses.

Integrated applications, applications with scripting languages that have access to the file system (e.g., Visual Basic Script, or VBS), and applications with networking features are also particularly vulnerable.

Viruses and software development

Because software is often designed with security features to prevent unauthorized use of system resources, many viruses must exploit software bugs in a system or application to spread. Software development strategies which produce large numbers of bugs will generally also produce potential exploits.

Closed-source software development as practiced by Microsoft and other commercial software companies is also seen by some as a security weakness. Open source software such as Linux, for example, allows all users to find and fix security problems without relying on a single vendor. Some advocate that commercial software makers practice vulnerability disclosure to ameliorate this weakness.

Countermeasures

Many users install antivirus software that can detect and eliminate known viruses after the computer downloads or runs an executable. Such software works by examining the contents of the computer's memory (its RAM) and boot sectors, as well as the files stored on fixed or removable drives (hard drives, floppy drives), and comparing them against a database of known virus signatures. Some antivirus programs can also scan files as they are opened, and e-mail as it is sent and received, "on the fly" in a similar manner; this practice is known as "on-access scanning." Some virus scanners can also warn a user that a file is likely to contain a virus based on its file type, and some antivirus vendors claim effective use of other types of heuristic analysis. Some industry groups dislike heuristic detection because it often increases the number of false positives. Antivirus software does not change the underlying capability of host software to transmit viruses, so users must update their software regularly to patch security holes. Antivirus software itself also needs to be updated regularly so that it knows about the latest threats and hoaxes.

A well-patched and well-maintained Unix system is well secured against viruses. Windows has the same kind of scripting ability as Unix-based systems, but does not natively prevent normal users from executing scripts written by a third party, as Unix does for users who are not running as root. More recently, Microsoft's Outlook e-mail client (but not Outlook Express) has developed similar protections for executable file types that it may download as attachments. Ordinary users would do well to patch their operating systems and e-mail clients, so that viruses and worms cannot reproduce through security "holes" that prudence (and most virus scanners) are unable to close.

References

Fred Cohen's 1984 paper
An editorial on beneficial viruses (con)
For a thorough, hypothetical pro discussion, see: "Are Good Viruses still a Bad idea?"
Malicious Code & Viruses - Articles, Links, and Whitepapers