
For texting language, see SMS language.

[Figure: A stylized iconic depiction of a CSV-formatted text file.]

A text file (sometimes spelled "textfile"; an old alternative name is "flatfile") is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists within a computer file system. The end of a text file is often denoted by placing one or more special characters, known as an end-of-file marker, after the last line in a text file. "Text file" refers to a type of container, while plain text refers to a type of content. Text files can contain plain text, but they are not limited to such.

At a generic level of description, there are two kinds of computer files: text files and binary files.[1]

Data storage

Because of their simplicity, text files are commonly used for storage of information. They avoid some of the problems encountered with other file formats, such as endianness, padding bytes, or differences in the number of bytes in a machine word. Further, when data corruption occurs in a text file, it is often easier to recover and continue processing the remaining contents. A disadvantage of text files is that they usually have a low entropy, meaning that the information occupies more storage than is strictly necessary.

A simple text file needs no additional metadata to assist the reader in interpretation, and therefore may contain no data at all, which is the case of a zero-byte file.

Formats

ASCII

The ASCII standard allows ASCII-only text files (unlike most other file types) to be freely interchanged and readable on Unix, Macintosh, Microsoft Windows, DOS, and other systems. These differ in their preferred line ending convention and their interpretation of values outside the ASCII range (their character encoding).
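The two differences mentioned above, line endings and the meaning of byte values outside the ASCII range, can be illustrated with a minimal Python sketch. The helper names `decode_everywhere` and `normalize_newlines` and the list of encodings are assumptions chosen for illustration; they are not part of any standard.

```python
# Hypothetical selection of ASCII-compatible encodings, for illustration only.
ASCII_COMPATIBLE = ["ascii", "utf-8", "latin-1", "cp437", "cp1252"]


def decode_everywhere(data: bytes) -> dict:
    """Decode data under each encoding; store None where decoding fails."""
    results = {}
    for name in ASCII_COMPATIBLE:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF pairs and bare CR characters to a single LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# ASCII-only bytes yield the same text under every encoding in the list.
plain = decode_everywhere(b"plain text")

# A byte above 0x7F is rejected by some encodings and read differently by others.
mixed = decode_everywhere(b"caf\xe9")

if __name__ == "__main__":
    for name in ASCII_COMPATIBLE:
        print(f"{name:8} {plain[name]!r:14} {mixed[name]!r}")
```

Here the ASCII-only input decodes identically everywhere, while the single byte 0xE9 fails to decode as ASCII or UTF-8 and maps to different characters under the single-byte encodings.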
UTF-8

In an English-language context, text files can be purely ASCII, whereas in an international context text files usually permit full 8-bit bytes, allowing storage of native-language text. In such international contexts, a byte order mark can appear at the start of the file to differentiate UTF-8 encoding from legacy regional encodings.[2]

MIME

Text files usually have the MIME type "text/plain", usually with additional information indicating an encoding. Prior to the advent of Mac OS X, the Mac OS system regarded the content of a file (the data fork) to be a text file when its resource fork indicated that the type of the file was "TEXT". Under the Microsoft Windows operating system, a file is regarded as a text file if the suffix of the name of the file (the "extension") is "txt". However, many other suffixes are used for text files with specific purposes. For example, source code for computer programs is usually kept in text files that have file name suffixes indicating the programming language in which the source is written.

.TXT

.txt is a file format for files consisting of text usually containing very little formatting (e.g., no bolding or italics). The precise definition of the .txt format is not specified, but typically matches the format accepted by the system terminal or simple text editor. Files with the .txt extension can easily be read or opened by any program that reads text and, for that reason, are considered universal (or platform independent).

The ASCII character set is the most common format for English-language text files, and is generally assumed to be the default file format in many situations. For accented and other non-ASCII characters, it is necessary to choose a character encoding. In many systems, this is chosen on the basis of the default locale setting on the computer it is read on. Common character encodings include ISO 8859-1 for many European languages.

Because many encodings have only a limited repertoire of characters, they are often only usable to represent text in a limited subset of human languages. Unicode is an attempt to create a common standard for representing all known languages, and most known character sets are subsets of the very large Unicode character set. Although there are multiple character encodings available for Unicode, the most common is UTF-8, which has the advantage of being backwards-compatible with ASCII; that is, every ASCII text file is also a UTF-8 text file with identical meaning.

Standard Windows .txt files

MS-DOS and Windows use a common text file format, with each line of text separated by a two-character combination: CR and LF, which have ASCII codes 13 and 10. It is common for the last line of text not to be terminated with a CR-LF marker, and many text editors (including Notepad) do not automatically insert one on the last line.

Most Windows text files use a form of ANSI, OEM or Unicode encoding. What Windows terminology calls "ANSI encodings" are usually single-byte ISO-8859 encodings, except in locales such as Chinese, Japanese and Korean that require double-byte character sets. ANSI encodings were traditionally used as default system locales within Windows, before the transition to Unicode. By contrast, OEM encodings, also known as MS-DOS code pages, were defined by IBM for use in the original IBM PC text mode display system. They typically include graphical and line-drawing characters common in (possibly full-screen) MS-DOS applications. Newer Windows text files may use a Unicode encoding such as UTF-16LE or UTF-8, with Byte Order Mark.

The main issue between pure ASCII and pure UTF-8 is limited to the presence or absence of the BOM. According to Microsoft, the Unicode protocol used for txt files is UTF-8.[3]

Standardisation

POSIX defines a text file as a file that contains characters organized into zero or more lines.
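
As a rough illustration of the Windows conventions described in this article (CR LF separators with ASCII codes 13 and 10, an optional UTF-8 byte order mark, and a last line that may lack a terminator), the following Python sketch inspects the raw bytes of a text file. The function name `describe_text` and the keys of the returned dictionary are hypothetical, and the sketch assumes the content is UTF-8.

```python
import codecs


def describe_text(data: bytes) -> dict:
    """Report BOM presence, line-ending counts, and final-line termination.

    Assumes the bytes are UTF-8 encoded (an assumption of this sketch).
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    crlf = body.count(b"\r\n")
    bare_lf = body.count(b"\n") - crlf  # LF bytes not preceded by CR
    return {
        "utf8_bom": has_bom,
        "crlf_count": crlf,
        "bare_lf_count": bare_lf,
        "final_line_terminated": body.endswith(b"\n"),
        "text": body.decode("utf-8"),
    }


# Two lines separated by CR LF, with a BOM and no terminator on the last line.
sample = codecs.BOM_UTF8 + "first\r\nsecond".encode("utf-8")
info = describe_text(sample)

# ASCII-only bytes read the same whether treated as ASCII or as UTF-8.
ascii_only = b"plain ASCII"
same_meaning = ascii_only.decode("ascii") == ascii_only.decode("utf-8")

if __name__ == "__main__":
    print(info)
    print(same_meaning)
```

Stripping the BOM before decoding keeps it out of the resulting text; a reader that did not expect a BOM would otherwise see an extra U+FEFF character at the start of the first line.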