
Computer file

This article is about computer files and file systems in general terms. For a more detailed and technical article, see File system.

A computer file is a resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is "durable" in the sense that it remains available for other programs to use after the program that created it has finished executing. Computer files can be considered as the information technology counterpart of paper documents which traditionally are kept in office and library files, and this is the source of the term.
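
As a minimal sketch of the durability described above, the following Python snippet has one function store text in a file and a separate function read it back after the first has finished and closed the file. The function names and the file name note.txt are hypothetical, and a temporary directory is used only so the example cleans up after itself.

```python
import os
import tempfile


def write_note(path: str, text: str) -> None:
    """Create (or overwrite) a file and store text in it."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    # Leaving the with-block closes the file; the data stays on storage.


def read_note(path: str) -> str:
    """Open an existing file and return its contents."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    note_path = os.path.join(folder, "note.txt")
    write_note(note_path, "kept on file")
    # A different piece of code, run later, can still find the data.
    recovered = read_note(note_path)
    still_exists = os.path.exists(note_path)
```

In a real setting the writer and the reader would typically be different programs run at different times; the two functions here only stand in for them.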
1 Etymology

[Figure: The twin disk files of an IBM 305 system]

The word "file" was used publicly in the context of computer storage as early as February 1950. In an RCA (Radio Corporation of America) advertisement in Popular Science Magazine[1] describing a new memory vacuum tube it had developed, RCA stated:

"...the results of countless computations can be kept on file and taken out again. Such a file now exists in a memory tube developed at RCA Laboratories. Electronically it retains figures fed into calculating machines, holds them in storage while it memorizes new ones - speeds intelligent solutions through mazes of mathematics."

In 1952, "file" was used in referring to information stored on punched cards.[2]

In early usage, people regarded the underlying hardware (rather than the contents) as a file. For example, the IBM 350 disk drives were called "disk files".[3] In about 1961 the Burroughs MCP and the MIT Compatible Time-Sharing System introduced the concept of a "file system", which managed several virtual files on one storage device, giving the term its present-day meaning. Although the current term "register file" shows the early concept of files, that usage has largely disappeared.

The word ultimately comes from the Latin filum ("a thread").[4]

2 File contents

[Figure: A punched card file]

On most modern operating systems, files are organized into one-dimensional arrays of bytes. The format of a file is defined by its content, since a file is solely a container for data, although on some platforms the format is usually indicated by its filename extension, specifying the rules for how the bytes must be organized and interpreted meaningfully. For example, the bytes of a plain text file (.txt in Windows) are associated with either ASCII or UTF-8 characters, while the bytes of image, video, and audio files are interpreted otherwise. Most file types also allocate a few bytes for metadata, which allows a file to carry some basic information about itself.

Some file systems can store arbitrary (not interpreted by the file system) file-specific data outside of the file format, but linked to the file, for example extended attributes or forks. On other file systems this can be done via sidecar files or software-specific databases. All those methods, however, are more susceptible to loss of metadata than are container and archive file formats.

2.1 File size

Main article: File size

At any instant in time, a file might have a size, normally expressed as a number of bytes, that indicates how much storage is associated with the file. In most modern operating systems the size can be any non-negative whole number of bytes up to a system limit. Many older operating systems kept track only of the number of blocks or tracks occupied by a file on a physical storage device. In such systems, software employed other methods to track the exact byte count (e.g., CP/M used a special control character, Ctrl-Z, to signal the end of text files).

The general definition of a file does not require that its size have any real meaning, however, unless the data within the file happens to correspond to data within a pool of persistent storage. A special case is a zero byte file; these files can be newly created files that have not yet had any data written to them, or may serve as some kind of flag in the file system, or are accidents (the results of aborted disk operations). For example, the file to which the link /bin/ls points in a typical Unix-like system probably has a defined size that seldom changes. Compare this with /dev/null, which is also a file, but its size may be obscure. (This is misleading because /dev/null is not really a file. In Unix-like systems, all resources, including devices, are accessed like files, but there is still a real distinction between files and devices; at core, they behave differently, and the obscurity of the size of /dev/null is one manifestation of this. As a character device, /dev/null has no size.)

2.2 Organization of data in a file

Information in a computer file can consist of smaller packets of information (often called "records" or "lines") that are individually different but share some common traits. For example, a payroll file might contain information concerning all the employees in a company and their payroll details; each record in the payroll file concerns just one employee, and all the records have the common trait of being related to payroll. This is very similar to placing all payroll information into a specific filing cabinet in an office that does not have a computer. A text file may contain lines of text, corresponding to printed lines on a piece of paper. Alternatively, a file may contain an arbitrary binary image (a BLOB) or it may contain an executable.

The way information is grouped into a file is entirely up to how it is designed. This has led to a plethora of more or less standardized file structures for all imaginable purposes, from the simplest to the most complex. Most computer files are used by computer programs which create, modify or delete the files for their own use on an as-needed basis. The programmers who create the programs decide what files are needed, how they are to be used and (often) their names.

In some cases, computer programs manipulate files that are made visible to the computer user. For example, in a word-processing program, the user manipulates document files that the user personally names. Although the content of the document file is arranged in a format that the word-processing program understands, the user is able to choose the name and location of the file and provide the bulk of the information (such as words and text) that will be stored in the file.

Many applications pack all their data files into a single file called an archive file, using internal markers to discern the different types of information contained within. The benefits of the archive file are to lower the number of files for easier transfer, to reduce storage usage, or just to organize outdated files. The archive file must often be unpacked before it can be used again.

2.3 Operations

The most basic operations that programs can perform on a file are:

Create a new file
Change the access permissions and attributes of a file
Open a file, which makes the file contents available to the program
Read data from a file
Write data to a file
Close a file, terminating the association between it and the program

Files on a computer can be created, moved, modified, grown, shrunk, and deleted. In most cases, computer programs that are executed on the computer handle these operations, but the user of a computer can also manipulate files if necessary. For instance, Microsoft Word files are normally created and modified by the Microsoft Word program in response to user commands, but the user can also move, rename, or delete these files directly by using a file manager program such as Windows Explorer (on Windows computers) or from a command-line interface (CLI).

In Unix-like systems, user-space programs do not operate directly, at a low level, on a file. Only the kernel deals with files, and it handles all user-space interaction with files in a manner that is transparent to the user-space programs. The operating system provides a level of abstraction, which means that interaction with a file from user-space is simply through its filename (instead of its file handle). For example, rm filename will not delete the file itself, but only a link to the file. There can be many links to a file, but when they are all removed, the kernel considers that file's memory space free to be reallocated. This free space is commonly considered a security risk (due to the existence of file recovery software). Any secure-deletion program uses kernel-space (system) functions to wipe the file's data.

3 Identifying and organizing

[Figure: Files and folders arranged in a hierarchy]

In modern computer systems, files are typically accessed using names (filenames). In some operating systems, the name is associated with the file itself. In others, the file is anonymous, and is pointed to by links that have names. In the latter case, a user can identify the name of the link with the file itself, but this is a false analogue, especially where there exists more than one link to the same file.
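
Where names are links to an otherwise anonymous file, one file can carry several names, and removing one name does not remove the data. The Python sketch below illustrates this with a hard link; it assumes a file system that supports hard links (as most Unix-like file systems do), and the names report.txt and alias.txt are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    first_name = os.path.join(folder, "report.txt")   # hypothetical names
    second_name = os.path.join(folder, "alias.txt")

    with open(first_name, "w", encoding="utf-8") as handle:
        handle.write("payroll data")

    # Add a second name (hard link) for the same underlying file.
    os.link(first_name, second_name)
    same_file = os.path.samefile(first_name, second_name)
    link_count = os.stat(first_name).st_nlink

    # Removing one name leaves the file reachable through the other.
    os.remove(first_name)
    with open(second_name, "r", encoding="utf-8") as handle:
        survivor = handle.read()
    remaining_links = os.stat(second_name).st_nlink
```

Only when the last remaining link is removed does the system treat the file's storage as free.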

Files (or links to files) can be located in directories. However, more generally, a directory can contain either a list of files or a list of links to files. Within this definition, it is of paramount importance that the term "file" includes directories. This permits the existence of directory hierarchies, i.e., directories containing sub-directories. A name that refers to a file within a directory must typically be unique. In other words, there must be no identical names within a directory. However, in some operating systems, a name may include a specification of type that means a directory can contain an identical name for more than one type of object such as a directory and a file.
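
The uniqueness rule can be observed directly. The sketch below, written in Python and assuming a typical file system where a file and a directory cannot share one name, creates a file called Payroll and then tries to create a directory with the identical name in the same place. All names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    name = os.path.join(folder, "Payroll")  # hypothetical name

    # Create a regular file called "Payroll".
    with open(name, "x", encoding="utf-8") as handle:
        handle.write("records")

    # Try to create a sub-directory with the identical name.
    try:
        os.mkdir(name)
        clash_detected = False
    except FileExistsError:
        clash_detected = True

    # A directory is itself an entry in its parent directory.
    subfolder = os.path.join(folder, "Salaries")
    os.mkdir(subfolder)
    entries = sorted(os.listdir(folder))
```

The listing at the end contains both the file and the sub-directory, which reflects the point that directories are entries of the same kind as files.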

In environments in which a file is named, a file's name and the path to the file's directory must uniquely identify it among all other files in the computer system; no two files can have the same name and path. Where a file is anonymous, named references to it will exist within a namespace. In most cases, any name within the namespace will refer to exactly zero or one file. However, any file may be represented within any namespace by zero, one or more names.

Any string of characters may or may not be a well-formed name for a file or a link depending upon the context of application. Whether or not a name is well-formed depends on the type of computer system being used. Early computers permitted only a few letters or digits in the name of a file, but modern computers allow long names (some up to 255 characters) containing almost any combination of Unicode letters or Unicode digits, making it easier to understand the purpose of a file at a glance. Some computer systems allow file names to contain spaces; others do not.
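
Because well-formedness depends on the system, any check has to state its policy. The Python sketch below tests a name against one assumed, POSIX-like policy: the name is non-empty, is not "." or "..", contains neither a slash nor a NUL character, and fits in 255 bytes when encoded as UTF-8. The function name and the policy are illustrative assumptions, not a rule that holds on every system.

```python
def is_well_formed_name(name: str, max_bytes: int = 255) -> bool:
    """Check a single file name against an assumed POSIX-like policy.

    Assumptions (not universal): the name is non-empty, is not "." or "..",
    contains neither "/" nor the NUL character, and fits in max_bytes when
    encoded as UTF-8.
    """
    if name in ("", ".", ".."):
        return False
    if "/" in name or "\0" in name:
        return False
    return len(name.encode("utf-8")) <= max_bytes


spaces_ok = is_well_formed_name("Payroll records.txt")
slash_ok = is_well_formed_name("a/b")
too_long_ok = is_well_formed_name("x" * 256)
# 128 two-byte characters occupy 256 bytes, one more than the limit.
multibyte_ok = is_well_formed_name("é" * 128)
```

The last case shows why a limit expressed in bytes and a limit expressed in characters are not the same thing once names may contain Unicode.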

Case-sensitivity of file names is determined by the file system. Unix file systems are usually case sensitive and allow user-level applications to create files whose names differ only in the case of characters. Microsoft Windows supports multiple file systems, each with different policies regarding case-sensitivity. The common FAT file system can have multiple files whose names differ only in case if the user uses a disk editor to edit the file names in the directory entries. User applications, however, will usually not allow the user to create multiple files with the same name but differing in case.
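
Since the policy belongs to the file system rather than to the program, a program that cares usually has to probe for it. The following Python sketch shows one way to do so: it creates a temporary file with a lower-case name and checks whether the upper-case spelling of that name resolves to anything. The function name is hypothetical, and the result describes only the directory that was probed.

```python
import os
import tempfile


def is_case_sensitive(directory: str) -> bool:
    """Probe whether names in directory are treated case-sensitively."""
    # Create a file whose name contains lower-case letters.
    with tempfile.NamedTemporaryFile(prefix="case_probe_", dir=directory) as probe:
        upper_variant = os.path.join(directory, os.path.basename(probe.name).upper())
        # If the upper-case spelling also resolves, case is being ignored.
        return not os.path.exists(upper_variant)


with tempfile.TemporaryDirectory() as folder:
    result = is_case_sensitive(folder)
```

On a typical Unix file system the probe is expected to report case sensitivity, while on a default Windows or macOS volume it is expected to report the opposite.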

Most computers organize files into hierarchies using folders, directories, or catalogs. The concept is the same irrespective of the terminology used. Each folder can contain an arbitrary number of files, and it can also contain other folders. These other folders are referred to as subfolders. Subfolders can contain still more files and folders and so on, thus building a tree-like structure in which one "master folder" (or "root folder"; the name varies from one operating system to another) can contain any number of levels of other folders and files. Folders can be named just as files can (except for the root folder, which often does not have a name). The use of folders makes it easier to organize files in a logical way.

When a computer allows the use of folders, each file and folder has not only a name of its own, but also a path, which identifies the folder or folders in which a file or folder resides. In the path, some sort of special character, such as a slash, is used to separate the file and folder names. For example, in the illustration shown in this article, the path /Payroll/Salaries/Managers uniquely identifies a file called Managers in a folder called Salaries, which in turn is contained in a folder called Payroll. The folder and file names are separated by slashes in this example; the topmost or root folder has no name, and so the path begins with a slash (if the root folder had a name, it would precede this first slash).

Many (but not all) computer systems use extensions in file names to help identify what they contain, also known as the file type. On Windows computers, extensions consist of a dot (period) at the end of a file name, followed by a few letters to identify the type of file. An extension of .txt identifies a text file; a .doc extension identifies any type of document or documentation, commonly in the Microsoft Word file format; and so on. Even when extensions are used in a computer system, the degree to which the computer system recognizes and heeds them can vary; in some systems, they are required, while in other systems, they are completely ignored if they are present.

4 Protection

Many modern computer systems provide methods for protecting files against accidental and deliberate damage. Computers that allow for multiple users implement file permissions to control who may or may not modify, delete, or create files and folders. For example, a given user may be granted only permission to read a file or folder, but not to modify or delete it; or a user may be given permission to read and modify files or folders, but not to execute them. Permissions may also be used to allow only certain users to see the contents of a file or folder. Permissions protect against unauthorized tampering or destruction of information in files, and keep private information confidential from unauthorized users.

Another protection mechanism implemented in many computers is a read-only flag. When this flag is turned on for a file (which can be accomplished by a computer program or by a human user), the file can be examined, but it cannot be modified. This flag is useful for critical information that must not be modified or erased, such as special files that are used only by internal parts of the computer system. Some systems also include a hidden flag to make certain files invisible; this flag is used by the computer system to hide essential system files that users should not alter.

5 Storage

Any file that has any useful purpose must have some physical manifestation. That is, a file (an abstract concept) in a real computer system must have a real physical analogue if it is to exist at all.

In physical terms, most computer files are stored on some type of data storage device. For example, most operating systems store files on a hard disk. Hard disks have been the ubiquitous form of non-volatile storage since the early 1960s.[5] Where files contain only temporary information, they may be stored in RAM. Computer files can also be stored on other media in some cases, such as magnetic tapes, compact discs, Digital Versatile Discs, Zip drives, USB flash drives, etc. The use of solid state drives is also beginning to rival the hard disk drive.

In Unix-like operating systems, many files have no associated physical storage device. Examples are /dev/null and most files under directories /dev, /proc and /sys. These are virtual files: they exist as objects within the operating system kernel.

As seen by a running user program, files are usually represented either by a File control block or by a file handle. A File control block (FCB) is an area of memory which is manipulated to establish a filename etc. and then passed to the operating system as a parameter; it was used by older IBM operating systems and early PC operating systems including CP/M and early versions of MS-DOS. A file handle is generally either an opaque data type or an integer; it was introduced in around 1961 by the ALGOL-based Burroughs MCP running on the Burroughs B5000 but is now ubiquitous.

6 Back up

When computer files contain information that is extremely important, a back-up process is used to protect against disasters that might destroy the files. Backing up files simply means making copies of the files in a separate location so that they can be restored if something happens to the computer, or if they are deleted accidentally.

There are many ways to back up files. Most computer systems provide utility programs to assist in the back-up process, which can become very time-consuming if there are many files to safeguard. Files are often copied to removable media such as writable CDs or cartridge tapes. Copying files to another hard disk in the same computer protects against failure of one disk, but if it is necessary to protect against failure or destruction of the entire computer, then copies of the files must be made on other media that can be taken away from the computer and stored in a safe, distant location.

The grandfather-father-son backup method automatically makes three back-ups; the grandfather file is the oldest copy of the file and the son is the current copy.
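
The three-generation idea can be sketched as a simple rotation. The Python example below is a deliberately simplified reading of the method as described here (three copies, aged by one step on each run), not a full backup scheme with daily, weekly and monthly cycles. The function name, the suffixes and the file names are all hypothetical.

```python
import os
import shutil
import tempfile

GENERATIONS = ("son", "father", "grandfather")  # newest to oldest


def rotate_backup(source: str, backup_dir: str) -> None:
    """Copy source into backup_dir, ageing the existing copies by one step."""
    os.makedirs(backup_dir, exist_ok=True)
    base = os.path.basename(source)
    paths = [os.path.join(backup_dir, f"{base}.{gen}") for gen in GENERATIONS]
    # Oldest first: father overwrites grandfather, then son overwrites father.
    for newer, older in reversed(list(zip(paths, paths[1:]))):
        if os.path.exists(newer):
            os.replace(newer, older)
    shutil.copy2(source, paths[0])  # the fresh copy becomes the son


with tempfile.TemporaryDirectory() as folder:
    data_file = os.path.join(folder, "ledger.txt")
    backups = os.path.join(folder, "backups")
    for version in ("v1", "v2", "v3", "v4"):
        with open(data_file, "w", encoding="utf-8") as handle:
            handle.write(version)
        rotate_backup(data_file, backups)

    kept = {}
    for gen in GENERATIONS:
        copy_path = os.path.join(backups, f"ledger.txt.{gen}")
        with open(copy_path, encoding="utf-8") as handle:
            kept[gen] = handle.read()
```

After four runs only the three most recent versions remain, with the oldest surviving one held as the grandfather. In practice the backup directory would sit on separate media, as the preceding paragraph recommends.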

7 File systems and file managers

The way a computer organizes, names, stores and manipulates files is globally referred to as its file system. Most computers have at least one file system. Some computers allow the use of several different file systems. For instance, on newer MS Windows computers, the older FAT-type file systems of MS-DOS and old versions of Windows are supported, in addition to the NTFS file system that is the normal file system for recent versions of Windows. Each system has its own advantages and disadvantages. Standard FAT allows only eight-character file names (plus a three-character extension) with no spaces, for example, whereas NTFS allows much longer names that can contain spaces. You can call a file "Payroll records" in NTFS, but in FAT you would be restricted to something like payroll.dat (unless you were using VFAT, a FAT extension allowing long file names).
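
The eight-plus-three restriction is easy to express as a pattern. The Python sketch below checks a name against a simplified version of it: one to eight characters, optionally followed by a dot and one to three characters, with no spaces. The accepted character set (letters, digits, underscore and hyphen) is an assumption made for brevity; real FAT permits some further punctuation. The function name is hypothetical.

```python
import re

# Simplified rule: 1-8 characters, then optionally a dot and 1-3 characters.
# Only letters, digits, "_" and "-" are accepted here; real FAT permits a few
# more punctuation characters, so treat this character set as an assumption.
_SHORT_NAME = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def fits_8_3(name: str) -> bool:
    """Return True if name fits the simplified 8.3 pattern above."""
    return _SHORT_NAME.fullmatch(name) is not None


short_ok = fits_8_3("payroll.dat")
spaced_ok = fits_8_3("Payroll records")
long_ok = fits_8_3("quarterly_report.txt")
```

Under this sketch payroll.dat is accepted, while "Payroll records" is rejected for its space and length, matching the two examples in the text.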
File manager programs are utility programs that allow users to manipulate files directly. They allow you to move, create, delete and rename files and folders, although they do not actually allow you to read the contents of a file or store information in it. Most computer systems provide at least one file-manager program for their native file system. For example, File Explorer (formerly Windows Explorer) is commonly used in Microsoft Windows operating systems, and Nautilus is common under several distributions of Linux.
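
The operations a file manager offers (create, rename, move, delete) correspond to a handful of calls that any program can make. The Python sketch below runs through them with the standard pathlib and shutil modules inside a temporary directory; the folder and file names are hypothetical and borrowed from the path example earlier in the article.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    root_path = Path(root)

    # Create a folder and an empty file (hypothetical names).
    salaries = root_path / "Salaries"
    salaries.mkdir()
    draft = root_path / "draft.txt"
    draft.touch()

    # Rename the file in place, then move it into the folder.
    managers = draft.rename(root_path / "Managers.txt")
    moved = Path(shutil.move(str(managers), str(salaries)))

    listing_after_move = sorted(p.name for p in salaries.iterdir())

    # Delete the file, then the now-empty folder.
    moved.unlink()
    salaries.rmdir()
    folder_gone = not salaries.exists()
```

None of these steps reads or writes the file's contents, which mirrors the division of labour described above between a file manager and the applications that work with the data itself.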

8 See also
Block (data storage)
Computer file management
Data hierarchy
File camouflage
File copying
File conversion
File deletion
File directory
File manager
File system
Flat file database
Object composition
Soft copy

9 Notes
[1] Popular Science Magazine, February 1950, page 96. Retrieved 2014-03-07.
[2] Robert S. Casey, et al. Punched Cards: Their Applications to Science and Industry, 1952.
[3] Martin H. Weik. Ballistic Research Laboratories Report #1115. March 1961. pp. 314-331.
[4] Online Etymology Dictionary.
[5] Mee and Daniel. Magnetic Storage Handbook, 2nd ed., Section 2.1.1, "Disk File Technology". 1990.

10 External links
Data Formats Computer file at DMOZ





Content license

Creative Commons Attribution-Share Alike 3.0