File System Introduction

Before you start

Objectives: Learn about the different partitioning schemes, the difference between basic and dynamic disks, what partitions and volumes are, what a file system is, and which file systems can be used in Windows.

Prerequisites: no prerequisites.

Key terms: disk, hard disk, partition, volume, drive, file system, FAT, space, data, Windows


File System

A file system is used to organize and store data on a storage device. The file system and the operating system work together to ensure data availability, integrity, and accessibility. When we install a new disk and open the Disk Management console, we will see that the disk is not initialized. Before we can use the disk, we must initialize it and select one of the two partitioning schemes.

Partitioning Schemes

We partition hard drives to prepare them for storing our data. When partitioning, we can create multiple partitions instead of one. That way we can separate operating system files from user files, or we can have multiple operating systems installed on one computer. We can use the Disk Management tool to create new partitions or volumes and to manage existing ones. We also have a command line utility called “diskpart” that performs the same tasks.

The first partitioning scheme that we will talk about is the Master Boot Record (MBR) scheme. The MBR contains the partition table for the disk. The MBR is always the first sector of the physical disk, and it is created when we partition the disk. It stores the size and location of each partition on the disk. When we power on our computer, the BIOS reads the MBR to find the partition that is marked as active (the one that holds the boot files of the operating system). With MBR we can have only four primary partitions per physical disk, and each partition can have a maximum size of 2 TB. MBR also provides no redundancy: there is only one copy of the partition table, so if the MBR is damaged there is no backup copy to recover it from.
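The 2 TB limit can be checked with simple arithmetic. The sketch below assumes 512-byte sectors and a 32-bit sector count in each partition table entry. These details are not stated in this article, and the function name is only illustrative.

```python
# Sketch: why an MBR partition tops out at about 2 TB.
# Assumptions (not from the article): 512-byte sectors and
# a 32-bit sector count per partition table entry.
SECTOR_SIZE = 512
SECTOR_COUNT_BITS = 32


def mbr_max_partition_bytes(sector_size=SECTOR_SIZE, bits=SECTOR_COUNT_BITS):
    """Largest partition size that one MBR table entry can describe."""
    return (2 ** bits) * sector_size


max_bytes = mbr_max_partition_bytes()
print(max_bytes)            # size in bytes
print(max_bytes / 2 ** 40)  # same size in TiB -> 2.0
```

With larger sectors the same formula gives a larger limit, which is why the 2 TB figure applies to the common 512-byte sector size.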

Since disks are getting bigger and bigger, a new scheme was introduced to address the limitations of the MBR scheme. This second scheme is called the GUID Partition Table (GPT). Before we move on, we have to mention the biggest limitation of the GPT scheme: on BIOS-based computers, Windows can use a GPT disk only as a data disk. That means that we cannot boot such a machine from a GPT partition, and we have to use an MBR partition instead. Booting Windows from a GPT disk requires UEFI firmware and a 64-bit version of Windows. The GPT scheme has several advantages. The first one is that we can have 128 partitions per physical disk. The maximum size of each partition is 18 EB (exabytes). GPT also provides redundancy of the partition table information by storing it at both the beginning and the end of the disk. The GPT scheme cannot be used on removable media.

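We can convert our partitioning scheme from MBR to GPT and vice versa. When we do this in Disk Management or with diskpart, the disk must not contain any partitions or volumes, so we have to back up our data and delete them first.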

Basic and Dynamic Disks

We have two different types of disks that we can work with in Windows: Basic disks and Dynamic disks. By default, when we first install our disks, they will be Basic disks. A Basic disk can be accessed by all operating systems. Basic disks use the original MS-DOS style MBR partition table to store information about primary partitions, extended partitions, and logical drives. With the Basic disk type we can have only four partitions on our disk: either four primary partitions, or three primary partitions and one extended partition. The extended partition can be divided into multiple logical drives (up to 26). The active partition on a Basic disk is the one that contains the operating system that we boot from. The active primary partition is represented with one drive letter (typically C:). The system as a whole can have only one active partition. Note that Disk Management also lets us create Simple volumes on Basic disks. Those are actually partitions, since they reside on a Basic disk, but they are called Simple volumes because a partition on a Basic disk is the equivalent of a Simple volume on a Dynamic disk.

On newer operating systems (Windows 2000 and later) we also have support for Dynamic disks. Dynamic disks use a private area on the disk to maintain the Logical Disk Manager (LDM) database. The LDM database stores information about volume types, sizes, locations, drive letters, and configurations. LDM information is also redundant, as it is copied to the other dynamic disks on the computer. Dynamic disks provide features that Basic disks do not have, such as volumes that use space on more than one disk.

In order to use dynamic disks we have to upgrade our basic disks to dynamic disks. We can do this in Disk Management by right-clicking the disk and selecting the ‘Upgrade to Dynamic Disk’ option from the menu. We can also do it on the command line by using the “diskpart” utility with the “convert dynamic” command. Once the disk has been upgraded to dynamic, we can create volumes on it. All existing partitions and logical drives will be converted to Simple volumes. We cannot install the operating system on a dynamic disk. We can, however, upgrade a basic disk that contains the operating system to a dynamic disk after the installation. If we want to convert our Dynamic disk back to a Basic disk, we have to delete all existing volumes first. That means that our data will be deleted, so we have to make sure that we back up our data before we do that.
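As a rough sketch, the diskpart commands for such a conversion can be collected in a small script file. The helper below only builds the script text and does not touch any disk. The function name and the disk number 1 are hypothetical, while “select disk” and “convert” are regular diskpart commands. The resulting text could be saved to a file and passed to diskpart with its /s option.

```python
def diskpart_convert_script(disk_number, target):
    """Return diskpart script text that converts one disk.

    target is one of "dynamic", "basic", "gpt" or "mbr".
    This helper is illustrative: it only builds text and runs nothing.
    """
    allowed = {"dynamic", "basic", "gpt", "mbr"}
    if target not in allowed:
        raise ValueError(f"unsupported target: {target}")
    if disk_number < 0:
        raise ValueError("disk number must not be negative")
    # diskpart first selects the disk, then converts the selected disk.
    return f"select disk {disk_number}\nconvert {target}\n"


# Hypothetical example: upgrade disk 1 from basic to dynamic.
script = diskpart_convert_script(1, "dynamic")
print(script)
```

Keeping the commands in a script file makes it easier to review the disk number before anything is changed, which matters because some conversions require deleting all volumes first.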

Both Basic and Dynamic disks can be moved from one computer to another. Before we attempt to move disks, we should check their health in Disk Management. If they are not healthy, we can try to repair the volumes by using the repair option. Also, before we physically remove a disk, we should uninstall it from the system. For Basic disks we can go to Disk Management, right-click the disk and select the Uninstall option. For Dynamic disks we can right-click the disk and select the “Remove disk” option. To add an existing Dynamic disk to the new computer, we have to use the “Rescan and import foreign disk” option in the Disk Management console. When we install a Basic disk, it will simply show up with a new drive letter. If we can’t get our Dynamic disk to work on the new computer, we can try the “Reactivate disk” option in Disk Management.

Partition

When we buy a new hard drive and attach it to our computer, typically the entire hard drive will be unallocated. The first step is to define one or more partitions on the drive. A partition is a logical division of the disk. It carves out a portion of the disk and prepares it for saving data. We can have a partition that takes up the entire hard disk, or we can create multiple partitions on a single hard disk. In the case of multiple partitions, a drive letter is assigned to represent each partition. From that we can conclude that multiple drive letters do not always mean that there are multiple devices, just multiple partitions. Any space not assigned to a partition is labeled as unallocated space.
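To illustrate the idea, the following toy model treats a disk as a range of gigabytes and reports which parts are not covered by any partition. The disk size, the partition layout and the function name are made up for the example.

```python
def unallocated_regions(disk_size, partitions):
    """Return (start, length) gaps that no partition covers.

    partitions is a list of non-overlapping (start, length) tuples.
    All values use the same unit (GB in this example).
    """
    gaps = []
    cursor = 0
    for start, length in sorted(partitions):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < disk_size:
        gaps.append((cursor, disk_size - cursor))
    return gaps


# Hypothetical 500 GB disk with two partitions (for example C: and D:).
layout = [(0, 100), (100, 250)]
print(unallocated_regions(500, layout))  # [(350, 150)]
```

Here one physical disk carries two drive letters, and the last 150 GB would show up as unallocated space.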

Volume

A volume is a single storage area within a file system. There are several different types of volumes that are allowed on a Dynamic disk. The first type of volume is called a Simple volume. A Simple volume is very similar to a partition on a Basic disk. It is used to assign a drive letter to free space on one physical disk. It contains a single, contiguous block of space from a single hard disk.

The second type is called the Extended volume, which contains space from multiple areas located on the same hard drive. We also have a Spanned volume, which uses space on multiple hard drives to create a single volume. The free space on each of the disks can be of a different size. Typically, Spanned volumes are used if we need a single storage area that is larger than a single hard disk. The downside of the Spanned volume is that if we lose any of the hard drives, we lose the entire volume. That means that with a Spanned volume we increase the risk of data loss. So to recap, an Extended volume contains space from multiple areas on the same disk, while a Spanned volume consists of multiple areas located on different disks. However, an Extended volume is also sometimes referred to as a spanned volume on a single disk.

The third type of volume is called a Striped volume. In order to create a Striped volume we need a minimum of two hard drives that have an equal amount of space available. With a Striped volume the operating system writes data across all the disks in small blocks of data known as stripes. This process distributes the load across the disks in the volume, so the main reason for using Striped volumes is speed. We can increase read and write performance because we can write simultaneously to all drives in the Striped volume. It is commonly used for temporary files, and an example of this is the Page File. Since our Page File is closely associated with memory, we can optimize its performance by placing it on a Striped volume. A Striped volume does not provide fault tolerance. If one hard disk fails, we will lose the data on all disks that were part of the Striped volume. Also, it cannot contain system or boot files. To determine the total size of the Striped volume, we simply multiply the number of disks in the volume by the amount of free space used on each disk. A Striped volume uses the same logic as RAID 0.
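The way stripes are spread over the disks can be sketched in a few lines. This is a simplified model and not how Windows implements striping internally: the stripe size of 3 bytes and the function names are chosen only to keep the example readable.

```python
def stripe(data, disk_count, stripe_size):
    """Distribute data across disks round-robin, one stripe at a time."""
    disks = [bytearray() for _ in range(disk_count)]
    for offset in range(0, len(data), stripe_size):
        target = (offset // stripe_size) % disk_count
        disks[target] += data[offset:offset + stripe_size]
    return [bytes(d) for d in disks]


def striped_volume_size(disk_count, space_per_disk):
    """Total size = number of disks x space used on each disk."""
    return disk_count * space_per_disk


disks = stripe(b"ABCDEFGHIJKL", disk_count=2, stripe_size=3)
print(disks)                        # [b'ABCGHI', b'DEFJKL']
print(striped_volume_size(2, 100))  # 200 (for example, in GB)
```

Each disk ends up holding only every second stripe, which shows why losing one disk makes the data on the remaining disk unusable as well.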

Another type of volume that we can create is a Mirrored volume. This type is the opposite of the Striped volume. With a Mirrored volume we get fault tolerance by writing the same data to each of the hard drives in the volume, so the same data is duplicated onto two disks. This method does not increase the write performance of our storage system, since all data has to be written twice. We can use only two disks in a Mirrored volume. A Mirrored volume uses the same logic as RAID 1.

As we already mentioned, with a Striped volume we can increase performance, but we don’t have any fault tolerance. To deal with that we can create a Striped volume with Parity. With parity we add fault tolerance to regular Striped volumes. This type of volume requires at least three disks of equal size. In this case, striping works in the same manner, but we also add a small piece of information about the content of the other disks in the same volume. This piece of information is called parity. If one disk fails in this type of volume, the information on that disk is reconstructed from the data and parity sections on the other disks. While this is the case, performance will be degraded until we replace the failed disk. Once we install a new disk, the content of the original disk will be restored from the parity information, and in that way the volume will be repaired. To calculate the total size, we simply subtract the size of one disk from the combined size of all disks in the volume. For example, if we have 3 disks with 100 GB of space each, the total size of the volume will be 200 GB, and 100 GB will be used for parity information. A Striped volume with Parity uses the same logic as RAID 5. Windows offers this type of volume only in its server editions, so on workstation editions we have to use a hardware RAID solution.
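A minimal sketch of how parity allows a lost block to be rebuilt is shown below. It assumes XOR parity, which is the usual way RAID 5 parity is computed. The article does not describe the Windows implementation, and real RAID 5 also rotates the parity blocks across all disks, which this example leaves out.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def parity_volume_size(disk_count, space_per_disk):
    """Usable size: the space of one disk is consumed by parity."""
    return (disk_count - 1) * space_per_disk


data_a = b"ABCD"                      # stripe stored on disk 1
data_b = b"WXYZ"                      # stripe stored on disk 2
parity = xor_blocks(data_a, data_b)   # parity stored on disk 3

# Disk 2 fails: its stripe is rebuilt from disk 1 and the parity.
rebuilt_b = xor_blocks(data_a, parity)
print(rebuilt_b)                      # b'WXYZ'
print(parity_volume_size(3, 100))     # 200
```

Rebuilding requires reading every surviving disk, which is one reason performance drops while a disk is missing.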

Keep in mind that the RAID implementations that we mentioned, which can be used in Windows versions intended for workstations (such as Windows Vista, 7 or 8), are implemented through software. It is generally recommended to use a hardware RAID system, since it provides additional advantages when it comes to reliability and performance. The problem with software implementations is that if the software fails, we may lose the data on the RAID as well.

If we are running out of space on our volume, and we have free space on the same disk or on some other disk, we can extend the volume. We can also shrink a volume to free up space for other uses. The system volume can be extended only by using contiguous free space on the same disk. Remember that resizing is supported only on simple and spanned volumes.

Unallocated Space

Unallocated space is space on a disk that has not been assigned to a partition or volume. We cannot store or read data in unallocated space.

Drive Letter

When we define partitions and volumes, we also have to assign them a drive letter. This is done within the operating system, typically in the Disk Management console. Typical drive letters are C:, D:, E:, and so on. Drive letters help us keep track of the different volumes that have been defined within the operating system.

Formatting

Another thing that we need to do is format our hard drive. This must be done before we can save data on the disk. Formatting applies the rules for how data is saved. When we format a disk, we choose the file system type and the cluster size used to store data. Windows supports three different formats for hard disks: FAT, FAT32, and NTFS. We can install Windows 2000 or XP on a FAT32 partition, but we can install Windows Vista/7 only on an NTFS partition. NTFS offers some advantages over the FAT32 file system. For example, we have support for larger disk sizes, larger file sizes, disk compression, encryption, disk quotas, and folder and file permissions.

Logical Objects

Within a volume we can create logical objects called directories (folders) and files. A directory is a container in a volume that holds files or other directories. Directories help us organize files in a logical way so that we can easily find them on the drive. Directories take up very little space on the hard drive. They are simple entries that have files associated with them. A file contains the actual data that is created by some application. Files occupy space on the hard disk and they have a start and an end point. Files are the most basic component that a file system uses to organize raw bits of data on the storage device itself. Most file systems are able to store files as separate chunks. That means that we may have two or more parts of the same file in different places on the hard drive. The full name of a file is made up of the directory path plus the file name. An extension can also be added to the file name to identify the file type and the program used to create, view, and modify the file. The file system describes how files are identified on the disk, where a file starts and where it stops, and how the file chunks are linked together so that the entire file can be pulled together from different portions of the hard disk.

File Systems Supported on Windows

When we configure our hard drive we must choose a file system that will be implemented on the drive. Operating systems up to and including Windows 98/ME support only the FAT and FAT32 file systems. For newer Windows systems (Windows 2000/XP and later), we should choose NTFS to take advantage of additional features not supported by FAT.

FAT

FAT stands for File Allocation Table. This file system was originally developed for use with DOS, but it can be used with any Windows operating system. The File Allocation Table tells the system where files are saved on the hard drive. The hard drive is divided into clusters. In DOS, the maximum cluster size for FAT is 32 KB. This limits the partition size, which in DOS can’t be larger than 2 GB. In Windows XP a FAT partition can be up to 4 GB in size, but we have to increase the cluster size to 64 KB.
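The role of the table can be illustrated with a toy model in which every cluster has an entry that points to the next cluster of the same file. The cluster numbers, the contents and the end marker below are invented for the example. A real FAT uses reserved values for the end of a chain and stores the first cluster of a file in its directory entry.

```python
END_OF_CHAIN = -1  # hypothetical marker; real FAT uses reserved values


def read_file(fat, clusters, first_cluster):
    """Follow the cluster chain in the table and join the file's data."""
    data = b""
    current = first_cluster
    seen = set()
    while current != END_OF_CHAIN:
        if current in seen:
            raise ValueError("loop in cluster chain")
        seen.add(current)
        data += clusters[current]
        current = fat[current]  # table entry = next cluster of the file
    return data


# A file stored in three clusters that are not next to each other.
clusters = {2: b"Hel", 5: b"lo ", 3: b"FAT"}
fat = {2: 5, 5: 3, 3: END_OF_CHAIN}
print(read_file(fat, clusters, first_cluster=2))  # b'Hello FAT'
```

The file is read correctly even though its clusters are scattered, because the table records the order in which they belong together.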

FAT32

FAT32 was introduced with Windows 95 and was used extensively with Windows 98 and even with later operating systems. It allows us to use more clusters, and because of that it supports much larger hard drives. With FAT32 we also get more efficient usage of the hard disk. The maximum volume size is 2 TB, but we can make it 8 TB with 32 KB clusters or even 16 TB with 64 KB clusters (not widely supported).
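The size limits quoted for FAT and FAT32 follow from the number of clusters the table can address multiplied by the cluster size. The sketch below assumes 16-bit cluster numbers for FAT and 28 usable bits for FAT32. A few cluster numbers are reserved in practice, so the real limits are slightly lower than these round figures.

```python
KIB = 1024


def max_volume_bytes(cluster_address_bits, cluster_size):
    """Upper bound: addressable clusters x bytes per cluster."""
    return (2 ** cluster_address_bits) * cluster_size


# Assumed address widths: 16 bits for FAT, 28 usable bits for FAT32.
fat16_32k = max_volume_bytes(16, 32 * KIB)
fat16_64k = max_volume_bytes(16, 64 * KIB)
fat32_32k = max_volume_bytes(28, 32 * KIB)
fat32_64k = max_volume_bytes(28, 64 * KIB)

print(fat16_32k // 2 ** 30, "GiB")  # 2 GiB
print(fat16_64k // 2 ** 30, "GiB")  # 4 GiB
print(fat32_32k // 2 ** 40, "TiB")  # 8 TiB
print(fat32_64k // 2 ** 40, "TiB")  # 16 TiB
```

Doubling the cluster size doubles the maximum volume size, at the cost of more wasted space for small files.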

NTFS

New Technology File System (NTFS) is available only for the Windows NT family, which includes Windows 2000, Windows XP and later operating systems. NTFS allows us to have 16 EB (exabytes) of partition space. It can use smaller cluster sizes for more efficient storage with less wasted space. Some of the other advantages of NTFS are that it supports file compression, encryption, disk quotas and volume mount points (which map disk space on another partition into an existing volume). In addition to that, NTFS allows us to use file system security, which means that we can assign permissions to files and folders.
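The effect of cluster size on wasted space can be estimated with a short calculation. A file always occupies whole clusters, so the unused remainder of its last cluster is lost. The file size of 5,000 bytes and the two cluster sizes are example values only.

```python
def allocated_bytes(file_size, cluster_size):
    """Space a file occupies on disk: whole clusters only."""
    clusters_needed = (file_size + cluster_size - 1) // cluster_size
    return clusters_needed * cluster_size


def wasted_bytes(file_size, cluster_size):
    """Unused space in the last cluster of the file."""
    return allocated_bytes(file_size, cluster_size) - file_size


# Example: a 5,000-byte file on 4 KB and on 32 KB clusters.
print(wasted_bytes(5000, 4 * 1024))   # 3192
print(wasted_bytes(5000, 32 * 1024))  # 27768
```

With many small files the difference adds up quickly, which is why smaller clusters make storage more efficient.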

Because of all these advantages of NTFS over FAT and FAT32, if we have an older file system, we should convert it to NTFS. In order to convert to NTFS we can use a command line utility called ‘convert’. The syntax is ‘convert [drive] /fs:ntfs’. So, if we want to convert our C drive from FAT to NTFS, we would enter ‘convert c: /fs:ntfs’. Once the command is entered it will start the conversion. If we are converting a boot partition, all open system files have to be closed first. To do that we will have to restart our computer. When the system reboots, it checks the BootExecute value in the registry. This value lets the system know that it only needs to load the conversion process to convert the file system. Once the file system is converted to NTFS, the system will be allowed to boot normally. This conversion process is a one-way conversion. That means that it only allows us to go from FAT or FAT32 to NTFS. If we need to go back to FAT or FAT32, we will have to reformat our hard drive. Formatting will destroy all files on the hard drive, so before we go back to FAT or FAT32 we should back up our files.

Examples

We have several articles which will show you how to work with file systems and disks in Windows. See how to convert from FAT to NTFS in Windows XP, and how to manage hard disks in XP. Learn how to manage disks in Vista or how to use Disk Management and Diskpart to manage disks in Windows 7.