Recovering Deleted Files
Perhaps the most common type of filesystem problem is files that are accidentally deleted. Users frequently delete the wrong files or delete a file only to discover that it’s actually needed. Windows system users may be accustomed to undelete utilities, which scour the disk for recently deleted files in order to recover them. Unfortunately, such tools are rare on Linux. You can make undeletion easier by encouraging the use of special utilities that don’t really delete files, but instead place them in temporary holding areas for deletion later. If all else fails, you may need to recover files from a backup.
Trash Can Utilities
One of the simplest ways to recover "deleted" files is to not delete them at all. This is the idea behind a trash can—a tool or procedure to hold onto files that are to be deleted without actually deleting them. These files can be deleted automatically or manually, depending on the tool or procedure. The most familiar form of trash can utility for most users, and the one from which the name derives, is the trash can icon that exists in many popular GUI environments, including KDE and GNOME. To use a GUI trash can, you drag files you want to delete to its icon. The icon is basically just a pointer to a specific directory that’s out of the way or hidden from view, such as ~/Desktop/Trash or ~/.gnome-desktop/Trash. When you drag a file to the trash can, you’re really just moving it to that directory. If you subsequently decide you want to undelete the file, you can click or double-click the trash can icon to open a file browser on the trash directory. This enables you to drag the files you want to rescue out of the trash directory. Typically, files are only deleted from the trash directory when you say so by right-clicking the trash can icon and selecting an option called Empty Trash or something similar.
When you’re working from the command line, the rm command is the usual method of deleting files, as in rm somefile.txt. This command doesn’t use anything akin to the trash directory by default, and depending on your distribution and its default settings, rm may not even prompt you to be sure you’re deleting the files you want to delete. You can improve rm’s safety considerably by forcing it to confirm each deletion by using the -i option, as in rm -i somefile.txt. In fact, you may want to make this the default by creating an alias in your shell startup scripts, "Mastering Shells and Shell Scripting." For instance, the following line in ~/.bashrc or /etc/profile will set up such an alias for bash:
alias rm='rm -i'
This configuration can become tedious if you use the -r option to delete an entire directory tree, though, or if you simply want to delete a lot of files by using wildcards. You can override the alias by specifying the complete path to rm (/bin/rm) when you type the command.
Forcing confirmation before deleting files can be a useful preventive measure, but it’s not really a way of recovering deleted files. One simple way to allow such recovery is to mimic the GUI environments’ trash cans—instead of deleting files with rm, move them to a holding directory with mv. You can then empty the holding directory whenever it’s convenient. In fact, if you use both a command shell and a GUI environment that implements a trash can, you can use the same directory for both.
If you or your users are already familiar with rm, you may find it difficult to switch to using mv. It’s also easy to forget how many files have been moved into the trash directory, and so disk space may fill up. One solution is to write a simple script that takes the place of rm, but that moves files to the trash directory. This script can simultaneously delete files older than a specified date or delete files if the trash directory contains more than a certain number of files. Alternatively, you could create a cron job to periodically delete files in the trash directory. An example of such a script is saferm, which is available from http://myocard.com/sites/linker/pages/linux/saferm.html. To use saferm or any similar script, you install it in place of the regular rm command, create an alias to call the script instead of rm, or call it by its true name. For instance, the following alias will work:
alias rm='saferm'
In the case of saferm, the script prompts before deleting files, but you can eliminate the prompt by changing the line that reads read answer to read answer=A and commenting out the immediately preceding echo lines. The script uses a trash directory in the user’s home directory, ~/.trash. When users need to recover "deleted" files, they can simply move them out of ~/.trash. This specific script doesn’t attempt to empty the trash bin, so users must do this themselves using the real rm; or you or your users can create cron jobs to do the task.
File Recovery Tools
Undelete utilities for Linux are few and far between. The Linux philosophy is that users shouldn’t delete files they really don’t want to delete, and if they do, they should be restored from backups. Nonetheless, in a pinch there are some tricks you can use to try to recover accidentally deleted files.
Note
Low-level disk accesses require full read (and often write) privileges to the partition in question. Normally, only root has this access level to hard disks, although ordinary users may have such access to floppies. Therefore, normally only root may perform low-level file recoveries.
One of these tricks is the recover utility, which is headquartered at http://recover.sourceforge.net/linux/recover/ and available with most Linux distributions. Unfortunately, this tool has several drawbacks. The first is that it was designed for ext2fs, and so it doesn’t work with most journaling filesystems. (It may work with ext3fs, though.) Another problem is that recover takes a long time to do anything, even on small partitions. I frequently see network programs such as web browsers and mail clients crash when recover runs. Finally, in my experience, recover frequently fails to work at all; if you type recover /dev/sda4, for instance, to recover files from /dev/sda4, the program may churn for a while, consume a lot of CPU time, and return with a Terminated notice. In sum, recover isn’t a reliable tool, but you might try it if you’re desperate. If you do try to run it, I recommend shutting down unnecessary network-enabled programs first.
Another method of file recovery is to use grep to search for text contained in the file. This approach is unlikely to work on anything but text files, and even then it may return a partial file or a file surrounded by text or binary junk. To use this approach, you type a command such as the following:
# grep -a -B5 -A100 "Dear Senator Jones" /dev/sda4 > recover.txt
This command searches for the text Dear Senator Jones on /dev/sda4 and returns the five lines before (-B5) and the 100 lines after (-A100) that string. The redirection operator stores the results in the file recover.txt. Because this operation involves a scan of the entire raw disk device, it’s likely to take a while. (You can speed matters up slightly by omitting the redirection operator and instead cutting and pasting the returned lines from an xterm into a text editor; this enables you to hit Ctrl+C to cancel the operation once it’s located the file. Another option is to use script to start a new shell that copies its output to a file, so you don’t need to copy text into an editor.) This approach also works with any filesystem. If the file is fragmented, though, it will only return part of the file. If you misjudge the size of the file in lines, you’ll either get just part of the file or too much—possibly including binary data before, after, or even within the target file.
Restoring Files from a Backup
"Protecting Your System with Backups," describes system backup procedures. That chapter also includes information on emergency recovery procedures—restoring most or all of a working system from a backup. Such procedures are useful after a disk failure, security breach, or a seriously damaging administrative blunder. System backups can also be very useful in restoring deleted files. In this scenario, an accidentally deleted file can be restored from a backup. One drawback to this procedure is that the original file must have existed prior to the last regular system backup. If your backups are infrequent, the file might not exist. Even if you make daily backups, this procedure is unlikely to help if a user creates a file, quickly deletes it, and then wants it back immediately. A trash can utility is the best protection against that sort of damage.
As an example, suppose you create backups to tape using tar. You can recover files from this backup by using the –extract (-x) command. Typically, you also pass the –verbose (-v) option so that you know when the target file has been restored, and you use –file (-f) to point to the tape device file. You must also pass the name of the file to be restored:
# tar -xvf /dev/st0 home/al/election.txt
This command recovers the file home/al/election.txt from the /dev/st0 tape device. A few points about this command require attention:
Permissions The user who runs the command must have read/write access to the tape device. This user must also have write permission to the restore directory (normally, the current directory). Therefore, root normally runs this command, although other users may have sufficient privileges on some systems. Ownership and permissions on the restored file may change if a user other than root runs the command.
Filename Specification The preceding command omitted the leading slash (/) in the target filename specification (home/al/election.txt). This is because tar normally strips this slash when it writes files, so when you specify files for restoration, the slash must also be missing. A few utilities and methods of creating a backup add a leading ./ to the filename. If your backups include this feature, you must include it in the filename specification to restore the file.
Restore Directory Normally, tar restores files to the current working directory. Thus, if you type the preceding command while in /root, it will create a /root/home/al/election.txt file (assuming it’s on the tape). I recommend restoring to an empty subdirectory and then moving the restored file to its intended target area. This practice minimizes the risk that you might mistype the target file specification and overwrite a newer file with an older one, or even overwrite the entire Linux installation with the backup.
Unfortunately, tar requires that you have a complete filename, including its path, ready in order to recover a file. If you don’t know the exact filename, you can try taking a directory of the tape by typing tar tvf /dev/st0 (substituting another tape device filename, if necessary). You may want to pipe the result through less or grep to help you search for the correct filename, or redirect it to a file you can search.
Tip
You can keep a record of files on a tape at backup time to simplify searches at restore time. Using the –verbose option and redirecting the results to a file will do the trick. Some incremental backup methods automatically store information on a backup’s contents, too. Some backup tools, such as the commercial Backup/Recover Utility (BRU; http://www.bru.com), store an index of files on the tape. This index enables you to quickly scan the tape and select files for recovery from the index