We’re all used to doing a disk check in Windows XP.  It’s easy.  Just double-click on “My Computer”, then select the drive you want to run the check on.  Right-click, Properties, Tools tab, then select “Check Now…” in the Error-checking section.  In almost every instance you’ll be told that the check will be done upon the next reboot.  Easy.

So how does one go about it on Linux?  Well… as you may have guessed, it’s not quite so straightforward.  Linux, by default, does actually have an intelligent disk-checking system already in place. By all accounts, you generally needn’t worry.  But if you have a reason to believe your disk may be slowly dying, and nothing is reporting in the SMART status of your drive, perhaps it’s worth checking the file system instead.

That’s where File System Check comes in (duh!).  Like all Linux tools, it’s painfully abbreviated to simply “fsck”.  Terse, to say the least.  Now the warning:

DO NOT.  I REPEAT, DO NOT EVER EVER EVER RUN THIS COMMAND WHILE YOUR DRIVE IS MOUNTED (I.E. IN USE).  I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA THAT YOU MAY CAUSE BY FOLLOWING THESE INSTRUCTIONS.
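If you'd rather not rely on memory for that warning, here is a minimal sketch of a pre-flight check. The function name `is_mounted` is my own invention, and it assumes the kernel's mount table is readable at /proc/mounts. It compares device names literally, so an LVM volume has to be given under the name the mount table uses (typically /dev/mapper/VolGroup00-LogVol00 rather than /dev/VolGroup00/LogVol00).

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Hypothetical helper: returns 0 if DEVICE is listed as a mounted source
# in MOUNTS_FILE (default: /proc/mounts), and 1 otherwise.
# Assumption: names are compared literally; symlinks are NOT resolved.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # The first whitespace-separated field of each line is the device.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

You could then guard the check with something like `is_mounted /dev/sda2 && echo "Still mounted, do not run fsck"`.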

To unmount your root (/) volume, follow these easy steps:

  1. Boot from a Live CD. Your root volume will not be mounted by default.
  2. Open a terminal and type:

    # dmesg | grep sda

    If you see output relating to a “SCSI” device, that disk in all likelihood contains your root partition. For example, amongst other output, I see this:
    sd 2:0:0:0: [sda] Assuming drive cache: write through
    sda: sda1 sda2
    sd 2:0:0:0: [sda] Attached SCSI disk

  3. In the example above, we see that the Linux kernel has registered the disk at SCSI address 2:0:0:0 as the first drive (“sda”) in the system.  We can also see it has only 2 partitions, sda1 and sda2.  If this is the only physical drive in the machine, we should strongly suspect that it uses one partition as /boot (formatted with ext4) and the other as an LVM physical volume holding both root (/) and swap as Logical Volumes. The smaller partition will almost certainly be /boot and the larger one will contain our swap and / volumes, so let’s proceed with accessing them.
  4. So, how do we access a “Logical Volume” within an equally mystical “Volume Group”?  Luckily, Linux LVM comes with a plethora of useful tools to make the job easy.

    # /sbin/vgscan
    Reading all physical volumes. This may take a while...
    Found volume group "VolGroup00" using metadata type lvm2
    Great. We have identified the volume group.  But before we can identify the logical volumes it contains, we need to activate it.
    # /sbin/vgchange -a y
    2 logical volume(s) in volume group "VolGroup00" now active

    Here, the -a flag indicates that we want to change the “active” status of the volume group, and the y means “yes”.
    # /sbin/lvdisplay
    --- Logical volume ---
    LV Name                /dev/VolGroup00/LogVol00
    VG Name                VolGroup00
    LV UUID                DG2WxJ-sKa5-20mg-NtjW-CsPW-t99V-Egqlja
    LV Write Access        read/write
    LV Status              available
    # open                 0
    LV Size                7.25 GB
    Current LE             232
    Segments               1
    Allocation             inherit
    Read ahead sectors     auto
    - currently set to     256
    Block device           253:2

    --- Logical volume ---
    LV Name                /dev/VolGroup00/LogVol01
    VG Name                VolGroup00
    LV UUID                HqKozT-16PQ-HUaT-Yyc7-lMCO-007m-Xcc2c8
    LV Write Access        read/write
    LV Status              available
    # open                 1
    LV Size                512.00 MB
    Current LE             16
    Segments               1
    Allocation             inherit
    Read ahead sectors     auto
    - currently set to     256
    Block device           253:3

    We can now see two logical volumes contained within the volume group. The first, although small by today’s standards, is a lot larger than the second.  We can also see that each logical volume has a device node (/dev/VolGroup00/LogVol01, for example).

    As we want to perform the disk check without the partition being mounted, we would not normally issue any mount command here.  However, if you want to double-check that this is the right volume to check, mount it and have a quick look around.  The following steps are offered only for that purpose; skip them if you are ready to perform the disk check.

    # mkdir /tmp/lv0

    For me, the first logical volume (the 7.25 GB one) would be the one to test.
    # mount -t ext4 /dev/VolGroup00/LogVol00 /tmp/lv0
    # cd /tmp/lv0
    # ls
    bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var

    Ok, that looks like the root partition, so let’s get out of it and unmount it before running the file system check on it.
    # cd /
    # umount /tmp/lv0

  5. An alternative to the above steps, if you have already booted into your main system, is to investigate /etc/fstab to see which is your / volume.  All you do is open a terminal and issue:

    # cat /etc/fstab

    On my CentOS 5 system, I see this:

    /dev/VolGroup00/LogVol00 /                      ext4    defaults        1 1
    LABEL=/boot1            /boot                   ext4    defaults        1 2
    tmpfs                   /dev/shm                tmpfs   defaults        0 0
    devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
    sysfs                   /sys                    sysfs   defaults        0 0
    proc                    /proc                   proc    defaults        0 0
    LABEL=SWAP-sdb1         swap                    swap    defaults        0 0

    So, /dev/VolGroup00/LogVol00 is my root volume.
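If you'd rather not pick the line out by eye, the same lookup can be sketched in a few lines of shell. The function name `root_device_from_fstab` is made up for illustration, and it assumes the usual fstab layout of whitespace-separated fields with the device first and the mount point second.

```shell
#!/bin/sh
# root_device_from_fstab [FSTAB_FILE]
# Hypothetical helper: prints the device (first field) of the entry whose
# mount point (second field) is "/". Defaults to reading /etc/fstab.
root_device_from_fstab() {
    fstab=${1:-/etc/fstab}
    # Skip comment lines, then stop at the first entry mounted at "/".
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$fstab"
}
```

Run against the fstab shown above, this should print /dev/VolGroup00/LogVol00. If your fstab identifies the root volume by LABEL= or UUID=, that is what gets printed instead, and you would still need to map it back to a device node.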

So, now that that’s out of the way, what next?  Well, assuming you now know which is your root partition, the most sensible thing to do would be to boot from a Live CD of some distribution (Ubuntu, Fedora, etc) if you haven’t done so, and then perform the disk check from that.

Once in the LiveCD desktop, we’ll need to fire up a Terminal window.
If you know your filesystem type, e.g. if it’s Ext4, which is the default on the most common distributions, you can run a modified version of the fsck command specifically for that file system.  Here’s what I run for a thorough disk check:


# fsck.ext4 -c -D -f -p -v /dev/VolGroup00/LogVol00

Alternatively, if your partition structure is slightly older and only contains physical partitions (not Logical Volumes), it may just be a case of finding the partition directly, by checking /etc/fstab on the running system. In that case, your command may look more like this (when / is unmounted!!):

# fsck.ext4 -c -D -f -p -v /dev/sda2


Here’s what the flags do:
-c  – forces a read-only bad block scan using the badblocks program.  Any bad blocks found are added to the file system’s bad block list so that they are never allocated to a file or directory.
-D  – performs a directory check and optimisation.  Doesn’t hurt, and can speed up directory listings of a large number of files.
-f  – forces the check itself to actually run.  As mentioned previously, the file system maintains itself quite well, and if you don’t force the check, fsck may see that the file system is marked clean and decide a check is not required.
-p  – (lowercase p) performs all safe file system fixes automatically, without asking.  This is usually a safe flag, but if your file system is potentially very corrupt, this may not be advisable.  In this situation, contact an expert – or restore your back-up… ;-)
-v  – verbose output. See what’s going on.
/dev/VolGroup00/LogVol00 or /dev/sda2 – this is the partition I want to perform the disk check on.
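
If you’re nervous about letting fsck change anything, a read-only dry run first is one option. The sketch below uses the -n flag, which (per the e2fsck man page) opens the file system read-only and answers “no” to every question; it cannot be combined with the automatic-repair flag. The function name `dry_run_check` and the `FSCK_BIN` variable are hypothetical, the latter added only so a stand-in command can be swapped in for testing.

```shell
#!/bin/sh
# dry_run_check DEVICE
# Hypothetical helper: runs a forced (-f), verbose (-v), read-only (-n)
# check, so problems are reported but nothing on disk is modified.
# FSCK_BIN is an assumed override for testing; it defaults to fsck.ext4.
dry_run_check() {
    "${FSCK_BIN:-fsck.ext4}" -n -f -v "$1"
}
```

For example, `dry_run_check /dev/sda2` (as root, with the partition unmounted) reports what a real run would want to fix, and you can then decide whether to go ahead with the repairing command.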

This little guide doesn’t explain how to perform a check on an encrypted logical volume… That one’s coming. :-)

Updated from post originally put here: http://onecool1.wordpress.com/2008/09/19/how-to-do-a-disk-check-in-linux/

5 thoughts on “How to do a disk check in Linux”

  1. The first option should be lower case, ‘-c’, as ‘-C’ will error out with;
    —>
    Invalid non-numeric argument to -C
    —>

    Regards, Rob (AU)

  2. Rob – many thanks for your comment. Yes, looks like I got it wrong. I’ll edit and re-post accordingly.

    Your comment also spurred me to check out the man page. The -C (uppercase option) is interesting, as it can print a progress bar to the screen while the check is happening.

    You can specify -C 0 (that’s a zero) to show this. The absence of the numeric argument caused the error you mentioned.

    Best wishes.

  3. I’m fairly computer literate, but have no formal training in Linux. I need to explore more awesome guides like this to learn more about Linux and how it works.