This is the BSDA Study Guide Book written via a
This is a work in progress. You may contribute to or discuss this specific page at http://bsdwiki.reedmedia.net/wiki/Determine_disk_capacity_and_which_files_are_consuming_the_most_disk_space.html.
Determine disk capacity and which files are consuming the most disk space
- Be able to combine common Unix command line utilities to quickly determine which files are consuming the most disk space.
As disk sizes have increased over the years, so has the amount of data that we seem to want to keep on them. At one time or another, you may be faced with the "too much data/not enough space" problem. How can you quickly find the "disk hogs"?
Use the tools!
The BSD systems are full of tools that can assist with this problem, including:
df(1) - "disk free"
du(1) - "disk usage"
find(1) - "walk a file hierarchy"
If you're using NetBSD, you can also get a df type reading from systat(1). And with any BSD variant, using common "Unix-fu" (in particular, find and shell pipes), these commands can quickly produce useful information about disk usage.
df and du
For a quick summary of disk space, simply call df. Using "-c" with df provides an "overall total"; using "-h" with either df or du produces "human readable" output: that is, sizes scaled to K, M, G (kilobytes, megabytes, gigabytes), etc., instead of "blocks" of the size indicated by the environment variable $BLOCKSIZE.
Unlike df, you probably don't want to simply call du. Without arguments, du lists the size of every subdirectory (and its subdirectories, ad infinitum) of your CWD, in the order it walks them; add "-a" and it lists every file as well. If you happen to be in "/", you'd be a long time reading the output of du. Usually it's better to use du with "-s", possibly even with a specific file or file "glob" argument, or with "-h" and maybe "-c", and pipe the output through sort(1); look for a rather convoluted (yet effective) example below.
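As a minimal sketch of the "-s plus sort(1)" approach, the following builds a throwaway directory (the names small, medium and large are made up for the illustration) and ranks its subdirectories. It uses "-k" rather than "-h" so that every size is in the same unit and sort -n can compare them directly.

```shell
# Scratch directory with three subdirectories of different sizes.
# (Hypothetical names; on a real system you would cd to the directory
# you are investigating instead.)
work=$(mktemp -d)
mkdir "$work/small" "$work/medium" "$work/large"
dd if=/dev/urandom of="$work/small/a"  bs=1024 count=10   2>/dev/null
dd if=/dev/urandom of="$work/medium/a" bs=1024 count=200  2>/dev/null
dd if=/dev/urandom of="$work/large/a"  bs=1024 count=3000 2>/dev/null

# -s: one summary line per argument; -k: sizes in kilobytes.
# sort -n puts the biggest entries last.
report=$(cd "$work" && du -sk * | sort -n)
printf '%s\n' "$report"

# The name on the last line is the biggest consumer.
biggest=$(printf '%s\n' "$report" | tail -n 1 | awk '{ print $2 }')
smallest=$(printf '%s\n' "$report" | head -n 1 | awk '{ print $2 }')
echo "biggest: $biggest"

rm -rf "$work"
```

On a real system the same pipeline is usually finished with "| tail" to show only the largest few entries.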
du also reports the sizes of any files named as arguments, which makes find (handing its results to du through "-exec" or xargs(1)) a fairly useful "frontend" to du on occasion (but see the section on find below before you scratch your head too hard on this).
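One way to use find as a frontend to du, sketched here under the assumption that the files of interest can be picked out by name, is to let find pass the matching files to du as arguments with "-exec ... {} +". The directory layout and file names below are invented for the demonstration.

```shell
# Throwaway tree with two "mp3" files and one unrelated file.
work=$(mktemp -d)
mkdir -p "$work/music/album" "$work/docs"
dd if=/dev/urandom of="$work/music/one.mp3"       bs=1024 count=300 2>/dev/null
dd if=/dev/urandom of="$work/music/album/two.mp3" bs=1024 count=500 2>/dev/null
dd if=/dev/urandom of="$work/docs/notes.txt"      bs=1024 count=50  2>/dev/null

# find selects the files; "-exec ... {} +" hands them to du as arguments
# in as few invocations as possible. -c appends a "total" line, -k reports
# kilobytes.
listing=$(find "$work" -type f -name '*.mp3' -exec du -ck {} +)
printf '%s\n' "$listing"

# Pull the grand total (in KB) out of the "total" line.
total_kb=$(printf '%s\n' "$listing" | awk '$2 == "total" { print $1 }')
echo "mp3 total: ${total_kb} KB"

rm -rf "$work"
```

With a very long file list, find may run du more than once, in which case each run prints its own "total" line.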
Note: under certain conditions, du and df may disagree somewhat about how much space is in use on a filesystem. Generally, this occurs when a program is holding an open file descriptor to a file that has been unlinked; in such a case, du wouldn't count the file's size, but the blocks are still unavailable as "free blocks" (df="disk free", remember?). In such cases, you can use fstat(1) to see currently open files.
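The unlinked-but-open situation can be reproduced in a few lines of shell. This is only an illustration with a made-up file name (big.log): the shell itself plays the part of the program holding the descriptor.

```shell
work=$(mktemp -d)
dd if=/dev/urandom of="$work/big.log" bs=1024 count=1024 2>/dev/null

exec 3<"$work/big.log"   # hold the file open on descriptor 3
rm "$work/big.log"       # unlink it: the name is gone, the blocks are not

# du walks directory entries, so it no longer finds the file...
remaining=$(ls -A "$work" | wc -l | tr -d ' ')
du -sk "$work"

# ...but the data is still readable through the descriptor, which is why
# df keeps counting its blocks as used.
held_bytes=$(wc -c <&3 | tr -d ' ')
echo "bytes still held open: $held_bytes"

exec 3<&-                # closing the descriptor finally releases the space
rmdir "$work"
```

On a live system the fix is the same: find the process holding the file and have it close or restart.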
find and the "size" primary
The complete use of find is beyond the scope of this section; please see Find a file with a given set of attributes for complete information. However, using the "size" primary and an expression representing a given filesize, you can quickly produce a list of "disk hogs". See the Examples below.
Examples
Are any partitions nearing "full"?
/dev/ad0s1a 1978 977 842 54% /
/dev/ad0s1e 67765 49502 12841 79% /usr
/dev/ad0s1d 3962 2182 1463 60% /var
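Reading the capacity column by eye works for three partitions; for more, a small filter can do it. The sketch below assumes the POSIX "-P" output format of df (one line per filesystem, capacity in the fifth column) and mount points without spaces; the function name nearly_full is made up for this example.

```shell
# Print "mountpoint capacity" for every filesystem at or above a threshold.
# Expects "df -P" style input on stdin; $1 is the threshold in percent.
nearly_full() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header line
            pct = $5
            sub(/%/, "", pct)         # "79%" -> "79"
            if (pct + 0 >= limit)
                print $6, $5
        }'
}

# Report anything that is at least 90% full on this machine.
df -Pk | nearly_full 90
```

Fed the three lines of df output shown above (plus a header line) with a threshold of 75, the filter would print only the /usr entry.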
Display all the *.mp3 files in my homedir, and their sizes with a total:
$ du -sc $HOME/*.mp3
List all directories below the current directory, in order of size (almost, because sort -n compares only the leading number and ignores the K/M/G suffix):
$ du -h | sort -n | more
Here's a pretty wild set of pipes for "du", showing the largest disk hogs (unless files are >999MB - if so change "M" to "G" in the regular expression); to see the smallest files, use "head" rather than "tail", or for a complete listing pipe it to $PAGER instead of either. The "-n" option to sort(1) ensures that the filesizes are in numeric rather than alphabetical order:
# du -hc * | sort -n | grep "[0-9]M" | tail
But this brings us to the relative power of find(1). A similar report could be produced like this ("find all files in the cwd greater than approximately 900MB in size"):
# find . -size +940000000c
The main difference between this statement's output and that of the "piped arrangement" above is that find doesn't report the actual sizes and the list isn't "sorted". Note that if you're using FreeBSD, you can use "[kMGTP]" with the size designation, thus: "find . -size +900M".
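If you want the sizes and the ordering as well, find can hand its matches to du and sort(1). A minimal sketch, using a scratch directory, invented file names, and a threshold of about 1MB so that it runs quickly:

```shell
work=$(mktemp -d)
dd if=/dev/urandom of="$work/tiny" bs=1024 count=4    2>/dev/null
dd if=/dev/urandom of="$work/big"  bs=1024 count=2048 2>/dev/null
dd if=/dev/urandom of="$work/huge" bs=1024 count=4096 2>/dev/null

# Files larger than 1,000,000 bytes, each with its size in KB, largest last.
hogs=$(find "$work" -type f -size +1000000c -exec du -k {} + | sort -n)
printf '%s\n' "$hogs"

# The path on the last line is the biggest file found.
last=$(printf '%s\n' "$hogs" | tail -n 1 | awk '{ print $2 }')
echo "largest: $last"

rm -rf "$work"
```

The "c" suffix counts bytes and is understood by find on all the BSDs, so this form does not depend on the FreeBSD-only unit letters.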
- Use df to see if your hard drives are nearing "full".
- Use find to find out who in /home/ is the biggest "disk hog". (Optional: Use grep to see if any of these files are "mp3"s).
- Use du along with grep(1) to produce lists of files by size.
du(1), df(1), find(1), sort(1), and, for NetBSD, systat(1)