Design of the FAT file system

The FAT file system is a file system used on MS-DOS and the Windows 9x family of operating systems.[3] It continues to be used on mobile devices and embedded systems, and is thus well suited for data exchange between computers and devices of almost any type and age, from 1981 to the present.


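A FAT file system is composed of four regions: the reserved sectors area, the FAT region, the root directory region (FAT12 and FAT16 only) and the data region.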

The FAT file system uses little-endian format for all entries in the header (except, where explicitly mentioned, for some entries in Atari ST boot sectors) and in the FATs.[5] It is possible to allocate more FAT sectors than necessary for the number of clusters. The end of the last sector of each FAT copy can be unused if there are no corresponding clusters. The total number of sectors (as noted in the boot record) can be larger than the number of sectors used by data (clusters × sectors per cluster), FATs (number of FATs × sectors per FAT), the root directory (n/a for FAT32), and hidden sectors including the boot sector: this would result in unused sectors at the end of the volume. If a partition contains more sectors than the total number of sectors occupied by the file system, this would also result in unused sectors, at the end of the partition, after the volume.

Reserved sectors area

Boot Sector

On non-partitioned storage devices, such as floppy disks, the boot sector (VBR) is the first sector (logical sector 0 with physical CHS address 0/0/1 or LBA address 0). On partitioned storage devices, such as hard disks, the boot sector is the first sector of a partition, as specified in the partition table of the device.

BIOS Parameter Block

DOS 3.0 BPB:

The following extensions were documented since DOS 3.0; however, they were already supported by some issues of DOS 2.11.[28] MS-DOS 3.10 still supported the DOS 2.0 format, but could use the DOS 3.0 format as well.

DOS 3.2 BPB:

Officially, MS-DOS 3.20 still used the DOS 3.0 format, but SYS and FORMAT were adapted to support a format 6 bytes longer (of which not all entries were used).

DOS 3.31 BPB:

This format was officially introduced with DOS 3.31 and is not used by DOS 3.2, although some DOS 3.2 utilities were already designed to recognize it. Official documentation recommends trusting these values only if the logical sectors entry at offset 0x013 is zero.

A simple formula translates a volume's given cluster number CN to a logical sector number LSN:[24][25][26]

  1. Determine (once) SSA=RSC+FN×SF+ceil((32×RDE)/SS), where the reserved sector count RSC is stored at offset 0x00E, the number of FATs FN at offset 0x010, the sectors per FAT SF at offset 0x016 (FAT12/FAT16) or 0x024 (FAT32), the root directory entries RDE at offset 0x011, the sector size SS at offset 0x00B, and ceil(x) rounds up to a whole number.
  2. Determine LSN=SSA+(CN−2)×SC, where the sectors per cluster SC are stored at offset 0x00D.
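
The two steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the boot sector is assumed to be available as a bytes-like object, and treating a zero 16-bit sectors-per-FAT field as the signal to read the FAT32 field at 0x024 is an assumption of this sketch.

```python
import struct

def first_data_sector(boot_sector):
    """SSA = RSC + FN * SF + ceil((32 * RDE) / SS)."""
    (ss,) = struct.unpack_from("<H", boot_sector, 0x00B)   # SS: bytes per sector
    (rsc,) = struct.unpack_from("<H", boot_sector, 0x00E)  # RSC: reserved sectors
    fn = boot_sector[0x010]                                 # FN: number of FATs
    (rde,) = struct.unpack_from("<H", boot_sector, 0x011)  # RDE: root dir entries
    (sf,) = struct.unpack_from("<H", boot_sector, 0x016)   # SF on FAT12/FAT16
    if sf == 0:
        # Assumption: a zero 16-bit field means the FAT32 field at 0x024 applies.
        (sf,) = struct.unpack_from("<I", boot_sector, 0x024)
    root_dir_sectors = (32 * rde + ss - 1) // ss            # integer ceil()
    return rsc + fn * sf + root_dir_sectors

def cluster_to_lsn(boot_sector, cn):
    """LSN = SSA + (CN - 2) * SC."""
    sc = boot_sector[0x00D]                                 # SC: sectors per cluster
    return first_data_sector(boot_sector) + (cn - 2) * sc

# Hypothetical 1.44 MB floppy BPB values, for illustration only.
bs = bytearray(512)
struct.pack_into("<HBHBH", bs, 0x00B, 512, 1, 1, 2, 224)
struct.pack_into("<H", bs, 0x016, 9)
print(first_data_sector(bs), cluster_to_lsn(bs, 2), cluster_to_lsn(bs, 3))
```

With these sample values the first data cluster (cluster 2) starts at logical sector 33: one reserved sector, two FATs of nine sectors each, and 14 root directory sectors.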

On unpartitioned media the volume's number of hidden sectors is zero, and therefore LSN and LBA addresses are the same as long as a volume's logical sector size is identical to the underlying medium's physical sector size. Under these conditions, it is also simple to translate between CHS addresses and LSNs:

LSN=SPT×(HN+(NOS×TN))+SN−1, where the sectors per track SPT are stored at offset 0x018 and the number of sides NOS at offset 0x01A. Track number TN, head number HN and sector number SN correspond to cylinder-head-sector: the formula gives the known CHS to LBA translation.
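
As a small sketch of this translation (function names are illustrative, and the inverse mapping is simply derived from the formula; both assume unpartitioned media with matching logical and physical sector sizes):

```python
def chs_to_lsn(tn, hn, sn, spt, nos):
    """LSN = SPT * (HN + NOS * TN) + SN - 1 (sector numbers start at 1)."""
    return spt * (hn + nos * tn) + sn - 1

def lsn_to_chs(lsn, spt, nos):
    """Inverse of chs_to_lsn: returns (track, head, sector)."""
    tn, rest = divmod(lsn, spt * nos)
    hn, sn0 = divmod(rest, spt)
    return tn, hn, sn0 + 1

# Example geometry: 18 sectors per track, 2 sides.
print(chs_to_lsn(0, 0, 1, spt=18, nos=2))
print(lsn_to_chs(2879, spt=18, nos=2))
```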

Extended BIOS Parameter Block

Further structure used by FAT12 and FAT16 since OS/2 1.0 and DOS 4.0, also known as Extended BIOS Parameter Block (EBPB) (bytes below sector offset 0x024 are the same as for the DOS 3.31 BPB):

FAT32 Extended BIOS Parameter Block

In essence, FAT32 inserts 28 bytes into the EBPB, followed by the remaining 26 (or sometimes only 7) EBPB bytes as shown above for FAT12 and FAT16. Microsoft and IBM operating systems determine the type of FAT file system used on a volume solely by the number of clusters, not by the BPB format used or the indicated file system type. It is therefore technically possible to use a "FAT32 EBPB" for FAT12 and FAT16 volumes as well as a DOS 4.0 EBPB for small FAT32 volumes. Since such volumes were found to be created by Windows operating systems under some odd conditions,[nb 6] operating systems should be prepared to cope with these hybrid forms.


Versions of DOS before 3.2 totally or partially relied on the media descriptor byte in the BPB or the FAT ID byte in cluster 0 of the first FAT in order to determine FAT12 diskette formats even if a BPB was present. Depending on the FAT ID found and the drive type detected, they default to one of the following BPB prototypes instead of using the values actually stored in the BPB.[nb 1]

Originally, the FAT ID was meant to be a bit flag with all bits set except for bit 2 cleared to indicate an 80-track (vs. 40-track) format, bit 1 cleared to indicate a 9-sector (vs. 8-sector) format, and bit 0 cleared to indicate a single-sided (vs. double-sided) format,[7] but this scheme was not followed by all OEMs and became obsolete with the introduction of hard disks and high-density formats. Also, the various 8-inch formats supported by 86-DOS and MS-DOS do not fit this scheme.
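
The original bit-flag scheme can be illustrated with a short Python sketch. The function name is hypothetical, and the result is only meaningful for media that actually follow this scheme; as noted above, hard disks, high-density formats and the 8-inch formats do not.

```python
def decode_original_fat_id(fat_id):
    """Decode a FAT ID under the original bit-flag scheme (cleared bit = newer option)."""
    return {
        "tracks": 40 if fat_id & 0x04 else 80,           # bit 2 cleared: 80 tracks
        "sectors_per_track": 8 if fat_id & 0x02 else 9,  # bit 1 cleared: 9 sectors
        "sides": 2 if fat_id & 0x01 else 1,              # bit 0 cleared: single-sided
    }

for fat_id in (0xFF, 0xFE, 0xFD, 0xFC, 0xF9):
    print(hex(fat_id), decode_original_fat_id(fat_id))
```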

Microsoft recommends distinguishing between the two 8-inch formats for FAT ID 0xFE by trying to read a single-density address mark. If this results in an error, the medium must be double-density.[23]

The table does not list a number of incompatible 8-inch and 5.25-inch FAT12 floppy formats supported by 86-DOS, which differ either in the size of the directory entries (16 bytes vs. 32 bytes) or in the extent of the reserved sectors area (several whole tracks vs. one logical sector only).

The implementation of a single-sided 315 KB FAT12 format used in MS-DOS for the Apricot PC and F1e[34] had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS BPB parameters (offsets 0x00B-0x017 in the standard boot sector) were located at offset 0x050. The Portable, F1, PC duo and Xi FD supported a non-standard double-sided 720 KB FAT12 format instead.[34] The differences in the boot sector layout and media IDs made these formats incompatible with many other operating systems. The geometry parameters for these formats are:

Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one. These formats were also supported by DOS Plus 2.1e/g for the Apricot ACT series.

The DOS Plus adaptation for the BBC Master 512 supported two FAT12 formats on 80-track, double-sided, double-density 5.25" drives, which did not use conventional boot sectors at all. 800 KB data disks omitted a boot sector and began with a single copy of the FAT.[35] The first byte of the relocated FAT in logical sector 0 was used to determine the disk's capacity. 640 KB boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT.[35][36] Also, the 640 KB format differed by using physical CHS sector numbers starting with 0 (not 1, as common) and incrementing sectors in the order sector-track-head (not sector-head-track, as common).[36] The FAT started at the beginning of the next track. These differences make these formats unrecognizable by other operating systems. The geometry parameters for these formats are:

DOS Plus for the Master 512 could also access standard PC disks formatted to 180 KB or 360 KB, using the first byte of the FAT in logical sector 1 to determine the capacity.

The DEC Rainbow 100 (all variations) supported one FAT12 format on 80-track, single-sided, quad-density 5.25" drives. The first two tracks were reserved for the boot loader, but contained neither an MBR nor a BPB (MS-DOS used a static in-memory BPB instead). The boot sector (track 0, side 0, sector 1) was Z80 code beginning with DI 0xF3. The 8088 bootstrap was loaded by the Z80. Track 1, side 0, sector 2 starts with the Media/FAT ID byte 0xFA. Unformatted disks use 0xE5 instead. The file system starts on track 2, side 0, sector 1. There are 2 copies of the FAT and 96 entries in the root directory. In addition, there is a physical to logical track mapping to effect a 2:1 sector interleaving. The disks were formatted with the physical sectors in order numbered 1 to 10 on each track after the reserved tracks, but the logical sectors from 1 to 10 were stored in physical sectors 1, 6, 2, 7, 3, 8, 4, 9, 5, 10.[37]
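
The 2:1 interleave listed above amounts to a simple lookup. The sketch below uses hypothetical names and covers only the ten sectors of one track after the reserved tracks:

```python
# Physical sector holding logical sectors 1..10, as listed in the text.
PHYSICAL_FOR_LOGICAL = (1, 6, 2, 7, 3, 8, 4, 9, 5, 10)

def rainbow_physical_sector(logical):
    """Map a logical sector number (1..10) to its physical sector on the track."""
    if not 1 <= logical <= 10:
        raise ValueError("logical sector must be in 1..10")
    return PHYSICAL_FOR_LOGICAL[logical - 1]

def rainbow_physical_sector_formula(logical):
    """Closed form of the same table: odd sectors fill 1..5, even sectors 6..10."""
    return (logical + 1) // 2 if logical % 2 else 5 + logical // 2

print([rainbow_physical_sector(n) for n in range(1, 11)])
```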

FS Information Sector

The "FS Information Sector" was introduced in FAT32[38] for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a logical sector number specified in the FAT32 EBPB boot record at position 0x030 (usually logical sector 1, immediately after the boot record itself).

The sector's data may be outdated and not reflect the current media contents, because not all operating systems update or use this sector, and even if they do, the contents are not valid when the medium has been ejected without properly unmounting the volume or after a power failure. Therefore, operating systems should first inspect a volume's optional shutdown status bitflags residing in the FAT entry of cluster 1 or the FAT32 EBPB at offset 0x041 and ignore the data stored in the FS information sector if these bitflags indicate that the volume was not properly unmounted before. This does not cause any problems other than a possible speed penalty for the first free space query or data cluster allocation; see fragmentation.

If this sector is present on a FAT32 volume, the minimum allowed logical sector size is 512 bytes, whereas otherwise it would be 128 bytes. Some FAT32 implementations support a slight variation of Microsoft's specification by making the FS information sector optional by specifying a value of 0xFFFF[19] (or 0x0000) in the entry at offset 0x030.

FAT region

File Allocation Table

Cluster map

A volume's data area is divided into identically sized clusters—small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the drive; typical cluster sizes range from 2 to 32 KiB.[39]

Each file may occupy one or more clusters depending on its size. Thus, a file is represented by a chain of clusters (referred to as a singly linked list). These clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
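
Following such a chain can be sketched for a FAT16 table held in memory. This is a simplified illustration with hypothetical names; it assumes that any entry value below 2 or at or above 0xFFF0 (the reserved, bad-cluster and end-of-chain range) terminates the walk, and it guards against loops in a damaged FAT.

```python
import struct

def fat16_chain(fat, start):
    """Return the list of clusters in the chain beginning at cluster `start`."""
    chain, seen = [], set()
    cn = start
    while 2 <= cn < 0xFFF0:            # stop at free, reserved, bad or end-of-chain
        if cn in seen:
            raise ValueError("loop in cluster chain")
        seen.add(cn)
        chain.append(cn)
        (cn,) = struct.unpack_from("<H", fat, 2 * cn)  # next-cluster link
    return chain

# Tiny sample FAT: entries 0 and 1 are special; the file occupies 2 -> 3 -> 5.
sample_fat = struct.pack("<6H", 0xFFF8, 0xFFFF, 3, 5, 0, 0xFFFF)
print(fat16_chain(sample_fat, 2))
```

The sample also shows that the clusters of a file need not be adjacent: cluster 4 is free while the chain continues at cluster 5.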

Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT, but waste space in large partitions by needing to allocate in large clusters.

The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if those three bytes are considered as one little-endian 24-bit number, the 12 least significant bits represent the first entry (e.g. cluster 0) and the 12 most significant bits the second (e.g. cluster 1). In other words, while the low eight bits of the first cluster in the row are stored in the first byte, the top four bits are stored in the low nibble of the second byte, whereas the low four bits of the subsequent cluster in the row are stored in the high nibble of the second byte and its higher eight bits in the third byte.
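
The packing described above can be sketched as a pair of helper functions (hypothetical names, operating on an in-memory copy of the FAT):

```python
def fat12_get(fat, cn):
    """Read the 12-bit entry for cluster `cn` from a FAT12 table."""
    offset = cn + cn // 2                        # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)  # two bytes, little-endian
    return pair >> 4 if cn & 1 else pair & 0x0FFF

def fat12_set(fat, cn, value):
    """Write a 12-bit entry without disturbing the neighbouring entry."""
    offset = cn + cn // 2
    pair = fat[offset] | (fat[offset + 1] << 8)
    if cn & 1:
        pair = (pair & 0x000F) | ((value & 0x0FFF) << 4)  # odd: upper 12 bits
    else:
        pair = (pair & 0xF000) | (value & 0x0FFF)         # even: lower 12 bits
    fat[offset] = pair & 0xFF
    fat[offset + 1] = pair >> 8

fat = bytearray(6)
for cn, value in enumerate((0xFF9, 0xFFF, 0x003, 0x004)):
    fat12_set(fat, cn, value)
print(fat.hex(), [hex(fat12_get(fat, cn)) for cn in range(4)])
```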

The FAT16 file system uses 16 bits per FAT entry, thus one entry spans two bytes in little-endian byte order.

The FAT32 file system uses 32 bits per FAT entry, thus one entry spans four bytes in little-endian byte order. The four top bits of each entry are reserved for other purposes; they are cleared during formatting and should not be changed otherwise. They must be masked off before interpreting the entry as 28-bit cluster address.

The File Allocation Table (FAT) is a contiguous number of sectors immediately following the area of reserved sectors. It represents a list of entries that map to each cluster on the volume. Each entry records one of four things:

For very early versions of DOS to recognize the file system, the system must have been booted from the volume or the volume's FAT must start with the volume's second sector (logical sector 1 with physical CHS address 0/0/2 or LBA address 1), that is, immediately following the boot sector. Operating systems assume this hard-wired location of the FAT in order to find the FAT ID in the FAT's cluster 0 entry on DOS 1.0-1.1 FAT diskettes, where no valid BPB is found.

Special entries

The first two entries in a FAT store special values:

The first entry (cluster 0 in the FAT) holds the FAT ID since MS-DOS 1.20 and PC DOS 1.1 (allowed values 0xF0-0xFF with 0xF1-0xF7 reserved) in bits 7-0, which is also copied into the BPB of the boot sector, offset 0x015 since DOS 2.0. The remaining 4 bits (if FAT12), 8 bits (if FAT16) or 20 bits (if FAT32, the 4 MSB bits are zero) of this entry are always 1. These values were arranged so that the entry would also function as a "trap-all" end-of-chain marker for all data clusters holding a value of zero. Additionally, for FAT IDs other than 0xFF (and 0x00) it is possible to determine the correct nibble and byte order (to be) used by the file system driver, however, the FAT file system officially uses a little-endian representation only and there are no known implementations of variants using big-endian values instead. 86-DOS 0.42 up to MS-DOS 1.14 used hard-wired drive profiles instead of a FAT ID, but used this byte to distinguish between media formatted with 32-byte or 16-byte directory entries, as they were used prior to 86-DOS 0.42.

The second entry (cluster 1 in the FAT) nominally stores the end-of-cluster-chain marker as used by the formatter, but typically always holds 0xFFF / 0xFFFF / 0x0FFFFFFF, that is, with the exception of bits 31-28 on FAT32 volumes these bits are normally always set. Some Microsoft operating systems, however, set these bits if the volume is not the volume holding the running operating system (that is, use 0xFFFFFFFF instead of 0x0FFFFFFF here).[40] (In conjunction with alternative end-of-chain markers the lowest bits 2-0 can become zero for the lowest allowed end-of-chain marker 0xFF8 / 0xFFF8 / 0x?FFFFFF8; bit 3 should be reserved as well given that clusters 0xFF0 / 0xFFF0 / 0x?FFFFFF0 and higher are officially reserved. Some operating systems may not be able to mount some volumes if any of these bits are not set, therefore the default end-of-chain marker should not be changed.) For DOS 1 and 2, the entry was documented as reserved for future use.

Since DOS 7.1 the two most-significant bits of this cluster entry may hold two optional bitflags representing the current volume status on FAT16 and FAT32, but not on FAT12 volumes. These bitflags are not supported by all operating systems, but operating systems supporting this feature would set these bits on shutdown and clear the most significant bit on startup:
If bit 15 (on FAT16) or bit 27 (on FAT32)[41] is not set when mounting the volume, the volume was not properly unmounted before shutdown or ejection and thus is in an unknown and possibly "dirty" state.[27] On FAT32 volumes, the FS Information Sector may hold outdated data and thus should not be used. The operating system would then typically run SCANDISK or CHKDSK on the next startup[nb 9][41] (but not on insertion of removable media) to ensure and possibly reestablish the volume's integrity.
If bit 14 (on FAT16) or bit 26 (on FAT32)[41] is cleared, the operating system has encountered disk I/O errors on startup,[41] a possible indication for bad sectors. Operating systems aware of this extension will interpret this as a recommendation to carry out a surface scan (SCANDISK) on the next boot.[27][41] (A similar set of bitflags exists in the FAT12/FAT16 EBPB at BPB offset 0x1A (sector offset 0x025) or the FAT32 EBPB at BPB offset 0x36 (sector offset 0x041). While the cluster 1 entry can be accessed by file system drivers once they have mounted the volume, the EBPB entry is available even when the volume is not mounted and thus easier to use by disk block device drivers or partitioning tools.)
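
A check of these two bitflags in the cluster 1 entry might look like the following sketch. The function and key names are hypothetical; the bit positions are those given above.

```python
def volume_status(cluster1_entry, fat_bits):
    """Interpret the optional status bitflags in the FAT entry of cluster 1."""
    if fat_bits == 16:
        clean_bit, no_error_bit = 15, 14
    elif fat_bits == 32:
        clean_bit, no_error_bit = 27, 26
    else:
        raise ValueError("status bitflags exist only on FAT16 and FAT32 volumes")
    return {
        # Bit set: volume was properly unmounted; cleared: possibly "dirty".
        "cleanly_unmounted": bool((cluster1_entry >> clean_bit) & 1),
        # Bit set: no disk I/O errors seen; cleared: surface scan recommended.
        "no_io_errors": bool((cluster1_entry >> no_error_bit) & 1),
    }

print(volume_status(0xFFFF, 16))
print(volume_status(0x7FFF, 16))
print(volume_status(0x0BFFFFFF, 32))
```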

If the number of FATs in the BPB is not set to 2, the second cluster entry in the first FAT (cluster 1) may also reflect the status of a TFAT volume for TFAT-aware operating systems. If the cluster 1 entry in that FAT holds the value 0, this may indicate that the second FAT represents the last known valid transaction state and should be copied over the first FAT, whereas the first FAT should be copied over the second FAT if all bits are set.

Some non-standard FAT12/FAT16 implementations utilize the cluster 1 entry to store the starting cluster of a variable-sized root directory (typically 2[33]). This may occur when the number of root directory entries in the BPB holds a value of 0 and no FAT32 EBPB is found (no signature 0x29 or 0x28 at offset 0x042).[20] This extension, however, is not supported by mainstream operating systems,[20] as it conflicts with other possible uses of the cluster 1 entry. Most conflicts can be ruled out if this extension is only allowed for FAT12 with less than 0xFEF and FAT16 volumes with less than 0x3FEF clusters and 2 FATs.

Because these first two FAT entries store special values, there are no data clusters 0 or 1. The first data cluster (after the root directory if FAT12/FAT16) is cluster 2,[33] marking the beginning of the data area.

Cluster values

FAT entry values:

FAT32 uses 28 bits for cluster numbers. The remaining 4 bits in the 32-bit FAT entry are usually zero, but are reserved and should be left untouched. A standard conformant FAT32 file system driver or maintenance tool must not rely on the upper 4 bits to be zero and it must strip them off before evaluating the cluster number in order to cope with possible future expansions where these bits may be used for other purposes. They must not be cleared by the file system driver when allocating new clusters, but should be cleared during a reformat.
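
A minimal sketch of how a driver could honour both rules, masking the upper 4 bits on read and preserving them on write (helper names are hypothetical; the FAT is assumed to be an in-memory bytearray):

```python
import struct

FAT32_CLUSTER_MASK = 0x0FFFFFFF  # lower 28 bits hold the cluster number

def fat32_get(fat, cn):
    """Return the 28-bit value of entry `cn`, ignoring the reserved top bits."""
    (raw,) = struct.unpack_from("<I", fat, 4 * cn)
    return raw & FAT32_CLUSTER_MASK

def fat32_set(fat, cn, value):
    """Update entry `cn` while leaving its reserved top 4 bits untouched."""
    (raw,) = struct.unpack_from("<I", fat, 4 * cn)
    reserved = raw & 0xF0000000
    struct.pack_into("<I", fat, 4 * cn, reserved | (value & FAT32_CLUSTER_MASK))

# Entry 2 carries a (hypothetical) non-zero pattern in its reserved bits.
fat = bytearray(struct.pack("<3I", 0x0FFFFFF8, 0xFFFFFFFF, 0xA0000003))
print(hex(fat32_get(fat, 2)))
fat32_set(fat, 2, 0x0FFFFFFF)
print(hex(struct.unpack_from("<I", fat, 8)[0]))
```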

Root directory region

The root directory table in FAT12 and FAT16 file systems occupies the special Root Directory Region location.

Data region

Aside from the root directory table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all directory tables are stored in the data region. The actual number of entries in a directory stored in the data region can grow by adding another cluster to the chain in the FAT.

Directory table

A directory table is a special type of file that represents a directory (also known as a folder). Since 86-DOS 0.42,[46] each file or (since MS-DOS 1.40 and PC DOS 2.0) subdirectory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the address of the first cluster of the file/directory's data, the size of the file/directory, and the date[46] and (since PC DOS 1.1) also the time of last modification. Earlier versions of 86-DOS used 16-byte directory entries only, supporting no files larger than 16 MB and no time of last modification.[46]
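
The fields named above can be pulled out of a 32-byte entry as sketched below. The offsets, attribute bit values and date/time packing used here are the conventional DOS layout and are assumptions of this sketch, as are the function name and the choice of code page 437 for decoding; FAT32's high cluster word and long filename entries are not handled.

```python
import struct

ATTRIBUTE_BITS = {0x01: "read-only", 0x02: "hidden", 0x04: "system",
                  0x08: "volume", 0x10: "directory", 0x20: "archive"}

def parse_dir_entry(entry):
    """Decode the classic fields of one 32-byte directory entry."""
    name, ext, attr = struct.unpack_from("<8s3sB", entry, 0x00)
    time, date, first_cluster, size = struct.unpack_from("<HHHI", entry, 0x16)
    return {
        "name": name.decode("cp437").rstrip(" "),
        "ext": ext.decode("cp437").rstrip(" "),
        "attributes": [label for bit, label in ATTRIBUTE_BITS.items() if attr & bit],
        "first_cluster": first_cluster,
        "size": size,
        # (year, month, day, hour, minute, second); seconds have 2 s resolution.
        "modified": (1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
                     time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
    }

sample = (b"README  TXT" + bytes([0x20]) + bytes(10) +
          struct.pack("<HHHI", (12 << 11) | (34 << 5) | 28,
                      (15 << 9) | (8 << 5) | 24, 2, 1234))
print(parse_dir_entry(sample))
```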

The FAT file system itself does not impose any limits on the depth of a subdirectory tree for as long as there are free clusters available to allocate the subdirectories; however, the internal Current Directory Structure (CDS) under MS-DOS/PC DOS limits the absolute path of a directory to 66 characters (including the drive letter, but excluding the NUL byte delimiter),[24][25][26] thereby limiting the maximum supported depth of subdirectories to 32, whichever occurs earlier. Concurrent DOS, Multiuser DOS and DR DOS 3.31 to 6.0 (up to and including the 1992-11 updates) do not store absolute paths to working directories internally and therefore do not show this limitation.[47] The same applies to Atari GEMDOS, but the Atari Desktop does not support more than 8 sub-directory levels. Most applications aware of this extension support paths up to at least 127 bytes. FlexOS, 4680 OS and 4690 OS support a length of up to 127 bytes as well, allowing depths down to 60 levels.[48] PalmDOS, DR DOS 6.0 (since BDOS 7.1) and higher, Novell DOS, and OpenDOS sport an MS-DOS-compatible CDS and therefore have the same length limits as MS-DOS/PC DOS.

Each entry can be preceded by "fake entries" to support a VFAT long filename (LFN); see further below.

Legal characters for DOS short filenames include the following:

This excludes the following ASCII characters:

Character 229 (0xE5) was not allowed as first character in a filename in DOS 1 and 2 due to its use as free entry marker. A special case was added to circumvent this limitation with DOS 3.0 and higher.

The following additional characters are allowed on Atari's GEMDOS, but should be avoided for compatibility with MS-DOS/PC DOS:

The semicolon (;) should be avoided in filenames under DR DOS 3.31 and higher, PalmDOS, Novell DOS, OpenDOS, Concurrent DOS, Multiuser DOS, System Manager and REAL/32, because it may conflict with the syntax to specify file and directory passwords: "...\DIRSPEC.EXT;DIRPWD\FILESPEC.EXT;FILEPWD". The operating system will strip off one[47] (and also two—since DR-DOS 7.02) semicolons and pending passwords from the filenames before storing them on disk. (The command processor 4DOS uses semicolons for include lists and requires the semicolon to be doubled for password protected files with any commands supporting wildcards.[47])

The at-sign character (@) is used for filelists by many DR-DOS, PalmDOS, Novell DOS, OpenDOS and Multiuser DOS, System Manager and REAL/32 commands, as well as by 4DOS and may therefore sometimes be difficult to use in filenames.[47]

Under Multiuser DOS and REAL/32, the exclamation mark (!) is not a valid filename character since it is used to separate multiple commands in a single command line.[47]

Under IBM 4680 OS and 4690 OS, the following characters are not allowed in filenames:

Additionally, the following special characters are not allowed in the first, fourth, fifth and eighth character of a filename, as they conflict with the host command processor (HCP) and input sequence table build file names:

The DOS file names are in the current OEM character set: this can have surprising effects if characters handled in one way for a given code page are interpreted differently for another code page (DOS command CHCP) with respect to lower and upper case, sorting, or validity as file name character.

Directory entry

Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C-0x15 of the directory entry were used by other operating systems to store additional metadata; most notably, the operating systems of the Digital Research family stored file passwords, access rights, owner IDs, and file deletion data there. While Microsoft's newer extensions are not fully compatible with these extensions by default, most of them can coexist in third-party FAT implementations (at least on FAT12 and FAT16 volumes).

32-byte directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):

The FlexOS-based operating systems IBM 4680 OS and IBM 4690 OS support unique distribution attributes stored in some bits of the previously reserved areas in the directory entries:[62]

  1. Local: Don't distribute file but keep on local controller only.[nb 14]
  2. Mirror file on update: Distribute file to server only when file is updated.
  3. Mirror file on close: Distribute file to server only when file is closed.
  4. Compound file on update: Distribute file to all controllers when file is updated.
  5. Compound file on close: Distribute file to all controllers when file is closed.[63]

Some incompatible extensions found in some operating systems include:

Size limits

The FAT12, FAT16, FAT16B, and FAT32 variants of the FAT file systems have clear limits based on the number of clusters and the number of sectors per cluster (1, 2, 4, ..., 128). For the typical value of 512 bytes per sector:

FAT12 requirements : 3 sectors on each copy of FAT for every 1,024 clusters
FAT16 requirements : 1 sector on each copy of FAT for every 256 clusters
FAT32 requirements : 1 sector on each copy of FAT for every 128 clusters

FAT12 range : 1 to 4,084 clusters : 1 to 12 sectors per copy of FAT
FAT16 range : 4,085 to 65,524 clusters : 16 to 256 sectors per copy of FAT
FAT32 range : 65,525 to 268,435,444 clusters : 512 to 2,097,152 sectors per copy of FAT

FAT12 minimum : 1 sector per cluster × 1 cluster = 512 bytes (0.5 KiB)
FAT16 minimum : 1 sector per cluster × 4,085 clusters = 2,091,520 bytes (2,042.5 KB)
FAT32 minimum : 1 sector per cluster × 65,525 clusters = 33,548,800 bytes (32,762.5 KB)

FAT12 maximum : 64 sectors per cluster × 4,084 clusters = 133,824,512 bytes (≈ 127 MB)
[FAT12 maximum : 128 sectors per cluster × 4,084 clusters = 267,649,024 bytes (≈ 255 MB)]

FAT16 maximum : 64 sectors per cluster × 65,524 clusters = 2,147,090,432 bytes (≈2,047 MB)
[FAT16 maximum : 128 sectors per cluster × 65,524 clusters = 4,294,180,864 bytes (≈4,095 MB)]

FAT32 maximum : 8 sectors per cluster × 268,435,444 clusters = 1,099,511,578,624 bytes (≈1,024 GB)
FAT32 maximum : 16 sectors per cluster × 268,173,557 clusters = 2,196,877,778,944 bytes (≈2,046 GB)
[FAT32 maximum : 32 sectors per cluster × 134,152,181 clusters = 2,197,949,333,504 bytes (≈2,047 GB)]
[FAT32 maximum : 64 sectors per cluster × 67,092,469 clusters = 2,198,486,024,192 bytes (≈2,047 GB)]
[FAT32 maximum : 128 sectors per cluster × 33,550,325 clusters = 2,198,754,099,200 bytes (≈2,047 GB)]

Legend: 268435444+3 is 0x0FFFFFF7, because FAT32 version 0 uses only 28 bits in the 32-bit cluster numbers, cluster numbers 0x0FFFFFF7 up to 0x0FFFFFFF flag bad clusters or the end of a file, cluster number 0 flags a free cluster, and cluster number 1 is not used.[33] Likewise 65524+3 is 0xFFF7 for FAT16, and 4084+3 is 0xFF7 for FAT12. The number of sectors per cluster is a power of 2 fitting in a single byte, the smallest value is 1 (0x01), the biggest value is 128 (0x80). Lines in square brackets indicate the unusual cluster size 128, and for FAT32 the bigger than necessary cluster sizes 32 or 64.[64]

Because each FAT32 entry occupies 32 bits (4 bytes) the maximal number of clusters (268435444) requires 2097152 FAT sectors for a sector size of 512 bytes. 2097152 is 0x200000, and storing this value needs more than two bytes. Therefore, FAT32 introduced a new 32-bit value in the FAT32 boot sector immediately following the 32-bit value for the total number of sectors introduced in the FAT16B variant.
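
The sectors-per-FAT figures quoted in this section can be reproduced with a short calculation. The sketch below uses a hypothetical function name and assumes that each FAT copy holds one entry per data cluster plus the two special entries for clusters 0 and 1:

```python
def fat_sectors(clusters, entry_bits, sector_size=512):
    """Sectors needed per FAT copy for a given number of data clusters."""
    entries = clusters + 2                     # entries 0 and 1 are special
    fat_bytes = (entries * entry_bits + 7) // 8
    return -(-fat_bytes // sector_size)        # ceiling division

print(fat_sectors(4084, 12))        # largest FAT12
print(fat_sectors(65524, 16))       # largest FAT16
print(fat_sectors(268435444, 32))   # largest FAT32
```

For the largest FAT32 volume this yields 2097152 sectors, the value that no longer fits into the 16-bit sectors-per-FAT field.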

The boot record extensions introduced with DOS 4.0 start with a magic 40 (0x28) or 41 (0x29). Typically FAT drivers look only at the number of clusters to distinguish FAT12, FAT16, and FAT32: the human readable strings identifying the FAT variant in the boot record are ignored, because they exist only for media formatted with DOS 4.0 or later.

Determining the number of directory entries per cluster is straightforward. Each entry occupies 32 bytes; this results in 16 entries per sector for a sector size of 512 bytes. The DOS 5 RMDIR/RD command removes the initial "." (this directory) and ".." (parent directory) entries in subdirectories directly, therefore sector size 32 on a RAM disk is possible for FAT12, but requires 2 or more sectors per cluster. A FAT12 boot sector without the DOS 4 extensions needs 29 bytes before the first unnecessary FAT16B 32-bit number of hidden sectors, this leaves three bytes for the (on a RAM disk unused) boot code and the magic 0x55 0xAA at the end of all boot sectors. On Windows NT the smallest supported sector size is 128.

On Windows NT operating systems the FORMAT command options /A:128K and /A:256K correspond to the maximal cluster size 0x80 (128) with a sector size 1024 and 2048, respectively. For the common sector size 512 /A:64K yields 128 sectors per cluster.

Both editions of each ECMA-107[24] and ISO/IEC 9293[25][26] specify a Max Cluster Number MAX determined by the formula MAX=1+trunc((TS-SSA)/SC), and reserve cluster numbers MAX+1 up to 4086 (0xFF6, FAT12) and later 65526 (0xFFF6, FAT16) for future standardization.

Microsoft's EFI FAT32 specification[4] states that any FAT file system with less than 4085 clusters is FAT12, else any FAT file system with less than 65,525 clusters is FAT16, and otherwise it is FAT32. The entry for cluster 0 at the beginning of the FAT must be identical to the media descriptor byte found in the BPB, whereas the entry for cluster 1 reflects the end-of-chain value used by the formatter for cluster chains (0xFFF, 0xFFFF or 0x0FFFFFFF). The entries for cluster numbers 0 and 1 end at a byte boundary even for FAT12, e.g., 0xF9FFFF for media descriptor 0xF9.

The first data cluster is 2,[33] and consequently the last cluster MAX gets number MAX+1. This results in data cluster numbers 2...4085 (0xFF5) for FAT12, 2...65525 (0xFFF5) for FAT16, and 2...268435445 (0x0FFFFFF5) for FAT32.

The only available values reserved for future standardization are therefore 0xFF6 (FAT12) and 0xFFF6 (FAT16). As noted below "less than 4085" is also used for Linux implementations,[44] or as Microsoft's FAT specification puts it:[4]

"...when it says <, it does not mean <=. Note also that the numbers are correct. The first number for FAT12 is 4085; the second number for FAT16 is 65525. These numbers and the "<" signs are not wrong."
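
Expressed as code, the rule quoted above reduces to two strict comparisons on the cluster count. In this sketch the function names are hypothetical, and the cluster count is derived as trunc((TS-SSA)/SC), the term that appears in the MAX formula given earlier:

```python
def count_clusters(ts, ssa, sc):
    """Number of data clusters: trunc((TS - SSA) / SC)."""
    return (ts - ssa) // sc

def fat_type(cluster_count):
    """Classify a volume by cluster count only; note '<', not '<='."""
    if cluster_count < 4085:
        return "FAT12"
    if cluster_count < 65525:
        return "FAT16"
    return "FAT32"

# Example: 2880 total sectors, data area starting at sector 33, 1 sector per cluster.
clusters = count_clusters(2880, 33, 1)
print(clusters, fat_type(clusters), "MAX =", 1 + clusters)
```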


The FAT file system does not contain built-in mechanisms which prevent newly written files from becoming scattered across the partition.[65] On volumes where files are created and deleted frequently or their lengths often changed, the medium will become increasingly fragmented over time.

While the design of the FAT file system does not cause any organizational overhead in disk structures or reduce the amount of free storage space with increased amounts of fragmentation, as occurs with external fragmentation, the time required to read and write fragmented files will increase. The operating system has to follow the cluster chains in the FAT (parts of which may first have to be loaded into memory, in particular on large volumes) and read the corresponding data physically scattered over the whole medium. This reduces the chances for the low-level block device driver to perform multi-sector disk I/O or initiate larger DMA transfers, thereby effectively increasing I/O protocol overhead as well as arm movement and head settle times inside the disk drive. Also, file operations will become slower with growing fragmentation as it takes increasingly longer for the operating system to find files or free clusters.

Other file systems, e.g., HPFS or exFAT, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas. Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
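
The linear scan can be sketched for a FAT16 table held in memory (hypothetical function name; an entry value of zero is taken to mark a free cluster). The work grows with the number of clusters, which is the cost described above:

```python
import struct

def free_clusters_fat16(fat, cluster_count):
    """Scan the whole FAT and list every data cluster whose entry is zero."""
    # Data clusters start at entry 2, i.e. byte offset 4 in a FAT16 table.
    entries = struct.unpack_from("<%dH" % cluster_count, fat, 2 * 2)
    return [cn for cn, value in enumerate(entries, start=2) if value == 0]

sample_fat = struct.pack("<7H", 0xFFF8, 0xFFFF, 3, 0xFFFF, 0, 0xFFF7, 0)
free = free_clusters_fat16(sample_fat, 5)
print(free, len(free))
```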

In fact, searching for files in large subdirectories or computing the free disk space on FAT volumes is one of the most resource-intensive operations, as it requires reading the directory tables or even the entire FAT linearly. Since the total number of clusters and the size of their entries in the FAT were still small on FAT12 and FAT16 volumes, this could be tolerated most of the time, considering that more sophisticated disk structures would also have increased the complexity and memory footprint of the real-mode operating systems, with minimum total memory requirements of 128 KB or less (such as DOS), for which FAT was originally designed and optimized.
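
The linear scan described above can be sketched as follows for a FAT16 volume. The sketch assumes that the whole FAT has already been read into memory and converted to host byte order; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Count free clusters in an in-memory FAT16 table.
 * fat[0] and fat[1] are reserved, so data clusters start at index 2.
 * A value of 0x0000 marks a free cluster. */
static uint32_t fat16_count_free(const uint16_t *fat, uint32_t cluster_count)
{
    uint32_t free_count = 0;
    for (uint32_t c = 2; c < cluster_count + 2; c++) {
        if (fat[c] == 0x0000)
            free_count++;
    }
    return free_count;
}
```

Every entry is visited once, so the cost grows in proportion to the number of clusters on the volume.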

With the introduction of FAT32, long seek and scan times became more apparent, particularly on very large volumes. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line.[66] Displaying this line took longer and longer as the number of clusters increased. FAT32 therefore introduced a special file system information sector where the previously computed amount of free space is preserved over power cycles, so that the free space counter needs to be recalculated only when a removable FAT32 formatted medium gets ejected without first unmounting it or if the system is switched off without properly shutting down the operating system, a problem mostly visible with pre-ATX-style PCs, on plain DOS systems and some battery-powered consumer products.
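
Reading the cached value can be sketched as below. The offsets and signature values follow the commonly published FAT32 layout of the FS Information Sector (they are not given in this section), and the function names are hypothetical. The sketch does not check whether the volume was shut down cleanly, which a real driver would have to do before trusting the counter.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the cached free-cluster count if the 512-byte
 * sector looks like a valid FS Information Sector with a known count;
 * returns 0 if the caller has to rescan the FAT instead. */
static int fsinfo_free_count(const uint8_t sector[512], uint32_t *free_count)
{
    if (rd32le(sector + 0x000) != 0x41615252u) return 0; /* "RRaA" */
    if (rd32le(sector + 0x1E4) != 0x61417272u) return 0; /* "rrAa" */
    if (rd32le(sector + 0x1FC) != 0xAA550000u) return 0; /* trailing signature */

    uint32_t n = rd32le(sector + 0x1E8);
    if (n == 0xFFFFFFFFu) return 0;                      /* count unknown */

    *free_count = n;
    return 1;
}
```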

With the huge cluster sizes (16 KB, 32 KB, 64 KB) forced by larger FAT partitions, internal fragmentation in the form of disk space wasted by file slack due to cluster overhang (as file sizes are rarely exact multiples of the cluster size) starts to be a problem as well, especially when there are a great many small files.
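
The amount of slack for a single file follows directly from its size and the cluster size. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Bytes allocated but unused at the end of a file's last cluster. */
static uint32_t file_slack(uint32_t file_size, uint32_t cluster_size)
{
    uint32_t rem = file_size % cluster_size;
    return rem == 0 ? 0 : cluster_size - rem;
}
```

For example, a 1000-byte file on a volume with 32 KB clusters occupies one full cluster and leaves 31768 bytes unused.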

Various optimizations and tweaks to the implementation of FAT file system drivers, block device drivers and disk tools have been devised to overcome most of the performance bottlenecks in the file system's inherent design without having to change the layout of the on-disk structures.[67][68] They can be divided into on-line and off-line methods and work by trying to avoid fragmentation in the file system in the first place, deploying methods to better cope with existing fragmentation, and by reordering and optimizing the on-disk structures. With optimizations in place, the performance on FAT volumes can often reach that of more sophisticated file systems in practical scenarios, while at the same time retaining the advantage of being accessible even on very small or old systems.

DOS 3.0 and higher will not immediately reuse the disk space of deleted files for new allocations but will instead look for previously unused space before starting to use the disk space of previously deleted files as well. This not only helps to maintain the integrity of deleted files for as long as possible but also speeds up file allocations and avoids fragmentation, since disk space that has never been allocated is always unfragmented. DOS accomplishes this by keeping, for each mounted volume, an in-memory pointer to the last allocated cluster and searching for free space from this location upwards instead of from the beginning of the FAT, as was still done by DOS 2.x.[13] If the end of the FAT is reached, the search wraps around and continues at the beginning of the FAT until either free space has been found or the original position has been reached again without finding free space.[13] These pointers are initialized to point to the start of the FATs after bootup,[13] but on FAT32 volumes, DOS 7.1 and higher will attempt to retrieve the last position from the FS Information Sector. This mechanism is defeated, however, if an application often deletes and recreates temporary files, as the operating system would then try to maintain the integrity of void data, effectively causing more fragmentation in the end.[13] In some DOS versions, a special API function for creating temporary files can be used to avoid this problem.
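
The search strategy described in this paragraph (often called next-fit) can be sketched as follows for an in-memory FAT16 table in host byte order. This illustrates the idea only and is not DOS's actual code; the function name and the convention of returning 0 for a full volume are assumptions of the sketch.

```c
#include <stdint.h>

/* Next-fit search over an in-memory FAT16 table.
 * Starts just after last_alloc, wraps around at the end of the FAT, and
 * stops once the starting position has been reached again.
 * Returns the free cluster number, or 0 if the volume is full. */
static uint32_t fat16_find_free(const uint16_t *fat, uint32_t cluster_count,
                                uint32_t last_alloc)
{
    uint32_t first = 2, end = cluster_count + 2;  /* valid range [2, end) */

    if (last_alloc < first || last_alloc >= end)
        last_alloc = end - 1;       /* so the search begins at cluster 2 */

    uint32_t c = last_alloc;
    for (uint32_t i = 0; i < cluster_count; i++) {
        c = (c + 1 < end) ? c + 1 : first;        /* advance, wrapping */
        if (fat[c] == 0x0000)
            return c;
    }
    return 0;
}
```

Passing 0 as last_alloc models the state after bootup, when the search starts at the beginning of the FAT.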

Additionally, directory entries of deleted files will be marked 0xE5 since DOS 3.0.[42] DOS 5.0 and higher will start to reuse these entries only when previously unused directory entries have been used up in the table and the system would otherwise have to expand the table itself.[6]

Since DOS 3.3 the operating system provides means to improve the performance of file operations with FASTOPEN by keeping track of the position of recently opened files or directories in various forms of lists (MS-DOS/PC DOS) or hash tables (DR-DOS), which can reduce file seek and open times significantly. Before DOS 5.0 special care must be taken when using such mechanisms in conjunction with disk defragmentation software bypassing the file system or disk drivers.

Windows NT will allocate disk space to files on FAT in advance, selecting large contiguous areas, but in case of a failure, files that were being appended to will appear larger than the data actually written to them, with a lot of random data at the end.

Other high-level mechanisms may read in and process larger parts or the complete FAT on startup or on demand when needed and dynamically build up in-memory tree representations of the volume's file structures different from the on-disk structures.[67][68] This may, on volumes with many free clusters, occupy even less memory than an image of the FAT itself. In particular on highly fragmented or filled volumes, seeks become much faster than with linear scans over the actual FAT, even if an image of the FAT would be stored in memory. Also, operating on the logically high level of files and cluster-chains instead of on sector or track level, it becomes possible to avoid some degree of file fragmentation in the first place or to carry out local file defragmentation and reordering of directory entries based on their names or access patterns in the background.

Some of the perceived problems with fragmentation of FAT file systems also result from performance limitations of the underlying block device drivers, which become more visible the less memory is available for sector buffering and track blocking/deblocking:

While the single-tasking DOS had provisions for multi-sector reads and track blocking/deblocking, the operating system and the traditional PC hard disk architecture (only one outstanding input/output request at a time and no DMA transfers) originally did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching the next data while the application was processing the previous chunks. Such features became available later. Later DOS versions also provided built-in support for look-ahead sector buffering and came with dynamically loadable disk caching programs working on the physical or logical sector level. These often utilized EMS or XMS memory and sometimes provided adaptive caching strategies or even ran in protected mode through DPMS or Cloaking to increase performance by gaining direct access to the cached data in linear memory rather than through conventional DOS APIs.

Write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a power failure or crash, made easier by the lack of hardware protection between applications and the system.

VFAT long file names

FAT32 directory structure with three files, two of which use VFAT long file names.

VFAT Long File Names (LFNs) are stored on a FAT file system using a trick: adding additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), a combination that is not expected in the MS-DOS environment and is therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered empty and is allowed to be deleted; such a situation arises if files created with long names are deleted from plain DOS. This method is very similar to the DELWATCH method, used since DR DOS 6.0 (1991), of utilizing the volume attribute to hide pending delete files for possible future undeletion. It is also similar to a method publicly discussed in 1992 for storing long filenames on Ataris and under Linux.[69][70]

Because older versions of DOS could mistake LFN names in the root directory for the volume label, VFAT was designed to create a blank volume label in the root directory before adding any LFN name entries (if a volume label did not already exist).[nb 13]

Each phony entry can contain up to 13 UCS-2 characters (26 bytes) by using fields in the record which would otherwise contain the file size or time stamps. The starting cluster field is not used for this purpose: for compatibility with disk utilities, it is set to a value of 0 (see 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UCS-2 characters.[55]

If the position of the LFN's last character is not at a directory entry boundary (13, 26, 39, ...), then a 0x0000 terminator is added in the next character position. Then, if that terminator is also not at the boundary, remaining character positions are filled with 0xFFFF. No directory entry containing a lone terminator will exist.
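
The padding rule can be sketched as follows. The function names are hypothetical, and the name is assumed to be available as an array of UCS-2 code units.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 13-character LFN entries needed for a name of len characters. */
static size_t lfn_entry_count(size_t len)
{
    return (len + 12) / 13;
}

/* Copy name into out, which must hold lfn_entry_count(len) * 13 units.
 * Adds the 0x0000 terminator and the 0xFFFF fill only when the name does
 * not end exactly on an entry boundary. Returns the units written. */
static size_t lfn_pad(const uint16_t *name, size_t len, uint16_t *out)
{
    size_t total = lfn_entry_count(len) * 13;
    size_t i;

    for (i = 0; i < len; i++)
        out[i] = name[i];
    if (i < total)
        out[i++] = 0x0000;   /* terminator */
    while (i < total)
        out[i++] = 0xFFFF;   /* fill */
    return total;
}
```

A 32-character name thus needs three entries (39 character positions): position 32 holds the terminator and positions 33 to 38 hold 0xFFFF. A 13-character name fills one entry exactly and gets neither.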

LFN entries use the following format:

If there are multiple LFN entries required to represent a file name, the entry representing the end of the filename comes first. The sequence number of this entry has bit 6 (0x40) set to represent that it is the last logical LFN entry, and it has the highest sequence number. The sequence number decreases in the following entries. The entry representing the start of the filename has sequence number 1. A value of 0xE5 is used to indicate that the entry is deleted.
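
The numbering scheme can be sketched as follows. The constant and function names are hypothetical; the mask 0x1F used to extract the number is an assumption of this sketch, chosen because at most 20 entries are chained.

```c
#include <stddef.h>
#include <stdint.h>

#define LFN_LAST_FLAG 0x40  /* bit 6: last logical (first physical) entry */
#define LFN_DELETED   0xE5  /* entry has been deleted */

/* Fill seq[0..n-1] with the sequence bytes in on-disk order for a name
 * that needs n LFN entries (1 <= n <= 20). */
static void lfn_sequence_bytes(size_t n, uint8_t *seq)
{
    for (size_t i = 0; i < n; i++)
        seq[i] = (uint8_t)(n - i);   /* n, n-1, ..., 1 */
    if (n > 0)
        seq[0] |= LFN_LAST_FLAG;
}

/* Extract the plain sequence number from a sequence byte. */
static unsigned lfn_sequence_number(uint8_t seq)
{
    return seq & 0x1F;
}
```

For a name that needs three entries, the sequence bytes appear on disk as 0x43, 0x02, 0x01, followed by the regular 8.3 directory entry.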

On FAT12 and FAT16 volumes, testing for the values at 0x1A to be zero and at 0x1C to be non-zero can be used to distinguish between VFAT LFNs and pending delete files under DELWATCH.

For example, a filename like "File with very long filename.ext" would be formatted like this:

A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (pFCBName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README␠␠TXT".)

unsigned char lfn_checksum(const unsigned char *pFCBName)
{
    int i;
    unsigned char sum = 0;

    /* Rotate sum right by one bit, then add the next name byte (mod 256). */
    for (i = 11; i; i--)
        sum = ((sum & 1) << 7) + (sum >> 1) + *pFCBName++;
    return sum;
}
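
As a usage illustration, the following sketch builds the 11-byte name field for "Readme.txt" and computes the same checksum, written here as an explicit 8-bit rotate right followed by an addition. Both function names are hypothetical, and to_fcb_name assumes that its input already fits the 8.3 limits and contains only valid short-name characters.

```c
#include <ctype.h>
#include <string.h>

/* Build the 11-byte, space-padded, uppercase name field from a short
 * "NAME.EXT" string. Hypothetical helper without any validation. */
static void to_fcb_name(const char *name, unsigned char fcb[11])
{
    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t i;

    memset(fcb, ' ', 11);
    for (i = 0; i < base_len && i < 8; i++)
        fcb[i] = (unsigned char)toupper((unsigned char)name[i]);
    if (dot)
        for (i = 0; i < 3 && dot[1 + i] != '\0'; i++)
            fcb[8 + i] = (unsigned char)toupper((unsigned char)dot[1 + i]);
}

/* The same checksum written as an explicit 8-bit rotate right followed
 * by an add that wraps modulo 256. */
static unsigned char lfn_checksum_ror(const unsigned char fcb[11])
{
    unsigned char sum = 0;
    for (int i = 0; i < 11; i++) {
        unsigned char rotated = (unsigned char)((sum >> 1) | ((sum & 1) << 7));
        sum = (unsigned char)(rotated + fcb[i]);
    }
    return sum;
}
```

Each LFN entry belonging to a file stores this value, so a driver can detect long-name entries left behind when the 8.3 entry was replaced.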

If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions of Windows such as XP. Instead, two bits in byte 0x0C of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support it. This creates a backwards-compatibility problem with older Windows versions (Windows 95 / 98 / 98 SE / ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported between operating systems, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.[71]
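
Applying the two case bits when displaying a short name can be sketched as below. The names are hypothetical, and the sketch assumes a name field without embedded spaces.

```c
#include <ctype.h>
#include <stddef.h>

#define CASE_LOWER_BASE 0x08  /* bit 3 of byte 0x0C: lowercase basename */
#define CASE_LOWER_EXT  0x10  /* bit 4 of byte 0x0C: lowercase extension */

/* Render an 11-byte short name as "base.ext", applying the case bits.
 * out must have room for at least 13 bytes (8 + '.' + 3 + NUL). */
static void short_name_display(const unsigned char fcb[11],
                               unsigned char case_flags, char out[13])
{
    size_t n = 0;

    for (int i = 0; i < 8 && fcb[i] != ' '; i++)
        out[n++] = (case_flags & CASE_LOWER_BASE)
                       ? (char)tolower(fcb[i]) : (char)fcb[i];
    if (fcb[8] != ' ') {
        out[n++] = '.';
        for (int i = 8; i < 11 && fcb[i] != ' '; i++)
            out[n++] = (case_flags & CASE_LOWER_EXT)
                           ? (char)tolower(fcb[i]) : (char)fcb[i];
    }
    out[n] = '\0';
}
```

With the stored name "EXAMPLE␠TXT", a flag value of 0x08 yields "example.TXT" and 0x10 yields "EXAMPLE.txt". A system that ignores the byte, such as Windows 95, shows "EXAMPLE.TXT".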

  1. ^ a b c For maximum compatibility with MS-DOS/PC DOS and DR-DOS, operating systems trying to determine a floppy disk's format should test on all mentioned opcode sequences at sector offset 0x000 in addition to looking for a valid media descriptor byte at sector offset 0x015 before assuming the presence of a BPB. Although PC DOS 1.0 floppy disks do not contain a BPB, they start with 0xEB as well, but do not show a 0x90 at offset 0x002. PC DOS 1.10 floppy disks even start with 0xEB 0x?? 0x90, although they still do not feature a BPB. In both cases, a test for a valid media descriptor at offset 0x015 would fail (value 0x00 instead of valid media descriptors 0xF0 and higher). If these tests fail, DOS checks for the presence of a media descriptor byte in the first byte of the first FAT in the sector following the boot sector (logical sector 1 on FAT12/FAT16 floppies).
  2. ^ a b c d e The signature at offset 0x1FE in boot sectors is 0x55 0xAA, that is 0x55 at offset 0x1FE and 0xAA at offset 0x1FF. Since little-endian representation must be assumed in the context of IBM PC compatible machines, this can be written as 16-bit word 0xAA55 in programs for x86 processors (note the swapped order), whereas it would have to be written as 0x55AA in programs for other CPU architectures using a big-endian representation. Since this has been mixed up numerous times in books and even in original Microsoft reference documents, this article uses the offset-based byte-wise on-disk representation to avoid any possible misinterpretation.
  3. ^ a b c The checksum entry in Atari boot sectors holds the alignment value, not the magic value itself. The magic value 0x1234 is not stored anywhere on disk. In contrast to Intel x86 processors, the Motorola 680x0 processors as used in Atari machines use a big-endian memory representation and therefore a big-endian representation must be assumed when calculating the checksum. As a consequence of this, for checksum verification code running on x86 machines, pairs of bytes must be swapped before the 16-bit addition.
  4. ^ DR-DOS is able to boot off FAT12/FAT16 logical sectored media with logical sector sizes up to 1024 bytes.
  5. ^ a b The following DOS functions return these register values: INT 21h/AH=2Ah "Get system date" returned values: CX = year (1980..2099), DH = month (1..12), DL = day (1..31). INT 21h/AH=2Ch "Get system time" returned values: CH = hour (0..23), CL = minute (0..59), DH = second (0..59), DL = 1/100 seconds (0..99).
  6. ^ Windows XP has been observed to create such hybrid disks when reformatting FAT16B formatted ZIP-100 disks to FAT32 format. The resulting volumes were FAT32 by format, but still used the FAT16B EBPB. (It is unclear how Windows determines the location of the root directory on FAT32 volumes, if only a FAT16 EBPB was used.)
  7. ^ a b One utility providing an option to specify the desired format filler value for hard disks is DR-DOS' FDISK R2.31 with its optional wipe parameter /W:246. In contrast to other FDISK utilities, DR-DOS FDISK is not only a partitioning tool, but can also format freshly created partitions as FAT12, FAT16 or FAT32. This reduces the risk of accidentally formatting the wrong volume.
  8. ^ In order to support the coexistence of DR-DOS with PC DOS and multiple parallel installations of DR-DOS, the extension of the default "IBMBIO␠␠COM" boot file name can be changed using the SYS /DR:ext option, where ext represents the new extension. Other potential DR-DOS boot file names to be expected in special scenarios are "DRBIOS␠␠SYS", "DRDOS␠␠␠SYS", "IO␠␠␠␠␠␠SYS", "JO␠␠␠␠␠␠SYS".
  9. ^ If a volume's dirty shutdown flag is still cleared on startup, the volume was not properly unmounted. This would, for example, cause Windows 98 WIN.COM to start SCANDISK in order to check for and repair potential logical file system errors. If the bad sector flag is cleared, it will force a surface scan to be carried out as well. This can be disabled by setting AUTOSCAN=0 in the [OPTIONS] section in MSDOS.SYS file.
  10. ^ a b c d See other links for special precautions in regard to occurrences of a cluster value of 0xFF0 on FAT12 volumes under MS-DOS/PC DOS 3.3 and higher.
  11. ^ a b Some versions of FORMAT since MS-DOS 1.25 and PC DOS 2.0 supported an option /O (for old) to fill the first byte of all directory entries with 0xE5 instead of utilizing the end marker 0x00. Thereby, the volume remained accessible under PC DOS 1.0-1.1, while formatting took somewhat longer and newer versions of DOS could not take advantage of the considerable speed-up caused by using the end marker 0x00.
  12. ^ This is the reason why 0xE5 had a special meaning in directory entries.
  13. ^ a b To avoid potential misinterpretation of directory volume labels with VFAT LFN entries by non-VFAT aware operating systems, the DR-DOS 7.07 FDISK and FORMAT tools are known to explicitly write dummy "NO␠NAME␠␠␠␠" directory volume labels if the user skips entering a volume label. The operating system would internally default to return the same string if no directory volume label could be found in the root of a volume, but without a real volume label stored as the first entry (after the directory entries), older operating systems could erroneously pick up VFAT LFN entries instead.
  14. ^ This IBM 4680 OS and 4690 OS distribution attribute type must have an on-disk bit value of 0 as files fall back to this type when attributes get lost accidentally.


