Friday, January 28, 2011

Find and Delete certain files in CentOS Linux

This command can be scripted (it needs to be run as root), but the script itself is not covered here.

Basically, this command searches a specified directory for files matching a name/filetype and then deletes them.
If you run it from the top-level directory (/), chances are something important will get deleted. Be sure to specify your folder; the command will traverse its subfolders as well.

My particular reason for using this is that I need to clean up over 3000 courses in Moodle and remove any old course backups that the site admin used for creating course copies. This involves sorting through tens of thousands of folders for course backups.

Bear in mind that this action cannot be undone, so if Lecturers/Teachers manage their own Moodle course backups, this approach is not for you, since their backups would be deleted as well. As always, you should have backups of all necessary files before attempting this sort of activity.


In order to ensure that no unnecessary files get deleted, first run the command without the delete section:
find /path/to/your/folder -name \*backup\*.zip

A list of files matching your search string will be displayed.

*edit: to pipe the list, including the size of the files, to a text file (the -size +5k test limits it to files larger than 5 KB), see below:
find /path/to/your/folder -size +5k -name \*searchstring\*.zip -exec ls -lh {} \; | awk '{ print $9 ": " $5 }' > /tmp/report.txt

Much thanks to http://www.cyberciti.biz/faq/find-large-files-linux/ for the size addition.
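One caveat with the size report above: the awk step splits the ls output on whitespace, so a path containing spaces gets cut off at the first space. A minimal sketch of an alternative, assuming GNU find (the default on CentOS) and its -printf action; the sandbox directory and file names here are made up purely for illustration:

```shell
# Sandbox demo so the sketch can be run without touching real data
report_dir=$(mktemp -d)
# One 10 KB file with a space in its name, one empty file below the 5 KB cutoff
dd if=/dev/zero of="$report_dir/my backup.zip" bs=1024 count=10 2>/dev/null
touch "$report_dir/smallbackup.zip"

# GNU find prints the path and size itself (%p = path, %s = size in bytes),
# so no ls/awk column parsing is needed
report=$(find "$report_dir" -size +5k -name '*backup*.zip' -printf '%p: %s bytes\n')
echo "$report"

# Remove the sandbox again
rm -rf "$report_dir"
```

To write the report to a file instead, redirect the find command with > as in the original version.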






Breakdown:
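find - Linux search command
/path/to/your/folder - the directory to search, including its subfolders
-name - tells the command to match on file names
\*backup\*.zip - search string (any name containing "backup" and ending in ".zip"; the backslashes stop the shell from expanding the asterisks), will return the following: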
mybackup.zip
mybackup2001.zip
backup2002.zip
will not return:
backup.txt
mybackup.doc
backup1.log
| xargs - passes the list of files from the previous command as arguments to the next command
/bin/rm -f - delete command (-f deletes without prompting)
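
If you want to convince yourself of what the search string matches before pointing it at real data, the example file names from the breakdown can be tried out in a throwaway directory. This is only a sketch; the temporary directory comes from mktemp and is not part of the original procedure:

```shell
# Sandbox demo: create the six example files in a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir/mybackup.zip" "$demo_dir/mybackup2001.zip" "$demo_dir/backup2002.zip" \
      "$demo_dir/backup.txt" "$demo_dir/mybackup.doc" "$demo_dir/backup1.log"

# Same pattern as in the breakdown (quoted instead of backslash-escaped);
# only the three .zip files should be listed
matches=$(find "$demo_dir" -name '*backup*.zip' | sort)
echo "$matches"

# Remove the sandbox again
rm -rf "$demo_dir"
```

Quoting the pattern with single quotes has the same effect as the backslashes: the shell leaves the asterisks alone and find does the matching.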


Now for the cleanup:

find /path/to/your/folder -name \*backup\*.zip | xargs /bin/rm -f

Congrats! All your base are belong to us..err wait, I meant the files should have been deleted.
Rerun the original find command with your search string and it should no longer return any results.
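
One thing to watch out for: plain xargs splits its input on whitespace, so the cleanup command above can trip over file or folder names that contain spaces. A hedged variant, assuming your find and xargs support -print0 and -0 (the GNU versions on CentOS do), is sketched below against a made-up sandbox directory. The -type f test is my addition so that only regular files are handed to rm.

```shell
# Sandbox demo: never point this at real data before previewing the matches
clean_dir=$(mktemp -d)
mkdir -p "$clean_dir/course 101"
touch "$clean_dir/course 101/old backup.zip" "$clean_dir/mybackup.zip" "$clean_dir/notes.txt"

# -print0 and -0 pass names NUL-separated, so spaces in paths cannot split them
find "$clean_dir" -type f -name '*backup*.zip' -print0 | xargs -0 /bin/rm -f

# Count what is left: only notes.txt should remain
remaining=$(find "$clean_dir" -type f | wc -l | tr -d ' ')
echo "files remaining: $remaining"

# Remove the sandbox again
rm -rf "$clean_dir"
```

GNU find also has a built-in -delete action that avoids the pipe altogether, but the xargs form stays closer to the command used in this post.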



Cheers,
-n