
File format

A file format is a particular way to encode data for storage in a computer file.

Most well-known file formats have a published specification document (often with a reference implementation) that describes exactly how the data is to be encoded, and which can be used to determine whether or not a particular program treats a particular file format correctly. This is not always the case, however; some proprietary file formats may effectively be defined only implicitly, by the programs that implement support for them, due to the intentional withholding of the specifications by the developers, who consider the specifications to be trade secrets. As a general rule, file formats with publicly available specifications are supported by a large number of programs, while non-public formats are supported by only a few programs, since supporting non-public formats requires costly licensing or elaborate reverse engineering efforts.

Some file formats are designed to store very particular sorts of data; the JPEG format, for example, is designed only to store still images. Other file formats, however, are designed for storage of several different types of data; the GIF format supports storage of both pictures and simple animations, and the AVI format can support many different types of multimedia.

Since programs see files as streams of data, some method is needed to mark the format of a file. One way to indicate this metadata is with a file extension. Another is with out-of-band data, if the filesystem supports it. A third is in-band, within the file itself, using a distinctive byte sequence (often called the magic number).

For example, a GIF file can be recognized by its extension ".gif", by filesystem metadata recording its type, or by its first four bytes, "GIF8".
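The two checks that work on any filesystem, extension and magic number, can be sketched in a few lines of Python. This is a minimal illustration rather than a complete detector: the function names and the small signature table are assumptions made for this example, and real tools recognize far more formats.

```python
import os

# Leading byte signatures ("magic numbers") for a few well-known formats.
# This table is illustrative only and far from exhaustive.
MAGIC_NUMBERS = {
    b"GIF8": "gif",              # covers both GIF87a and GIF89a
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def format_from_extension(filename):
    """Guess the format from the file extension alone (hypothetical helper).

    Renaming the file is enough to fool this check.
    """
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    return ext or None


def format_from_magic(header):
    """Guess the format from the first bytes of the content (hypothetical helper)."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return None
```

In practice the header would come from reading the first few bytes of the file in binary mode. The two guesses can disagree, for instance when a GIF file has been renamed to end in ".jpg"; the in-band check then still reports "gif" because it looks at the content rather than the name.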

It is sometimes possible to cause a program to read a file encoded in one format as if it were encoded in another format. With a bit of work, for example, a music playing program can be used to play a (specially modified) Microsoft Word document as if it were a song. The result does not sound very musical, however, because a sensible arrangement of bits in one format is almost always nonsensical in another.
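One way to perform such a modification is to wrap the raw bytes of any file in an audio container, so that a player accepts them as sound samples. The sketch below uses Python's standard wave module. The function name, the choice of 8-bit mono samples and the 8000 Hz sample rate are all assumptions made for this example, not part of any particular format conversion tool.

```python
import io
import wave


def bytes_as_wav(data, sample_rate=8000):
    """Wrap arbitrary bytes in a WAV container as 8-bit mono PCM samples.

    Hypothetical helper: each input byte is reinterpreted as one audio sample.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # one byte per sample
        wav.setframerate(sample_rate)  # assumed playback rate
        wav.writeframes(data)          # the bytes are written unchanged
    return buffer.getvalue()
```

The returned bytes can be saved with a ".wav" extension and opened in an audio player. Nothing in the original data is altered; only a header is prepended, which is why the result is usually noise rather than music.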

It is very difficult to make a principled distinction between a file format and a programming language, or between a "normal program" and a programming language interpreter. A programming language can be seen as a file format for storing algorithms, while even a simple image file viewer can be seen as an "interpreter" for, say, the GIF "language."

The part of intellectual property law most useful for protecting ownership of a file format appears to be patent law. Although file formats themselves cannot be patented, some formats require data to be encoded with patented algorithms. For example, the GIF file format requires the use of a patented algorithm, the LZW compression algorithm; at first the patent owner did not collect fees for use of the algorithm, but it later began to do so. This has resulted in a significant decrease in the use of GIFs.

See also: list of file formats; graphics file format; audio file format; video file format

All Wikipedia text is available under the terms of the GNU Free Documentation License
