stevepedwards.com/DebianAdmin linux mint IT admin tips info

Cool Command #2 – Find with -exec or -delete or | xargs rm -v

man find

FIND(1)

NAME

find - search for files in a directory hierarchy

SYNOPSIS

find [-H] [-L] [-P] [-D debugopts] [-Olevel] [path...] [expression]

DESCRIPTION

This manual page documents the GNU version of find. GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name.

If you are using find in an environment where security is important (for example if you are using it to search directories that are writable by other users), you should read the "Security Considerations" chapter of the findutils documentation, which is called Finding Files and comes with findutils. That document also includes a lot more detail and discussion than this manual page, so you may find it a more useful source of information.

In most Linux commands, the form is usually in the order of:

CMD [OPTION]... [DIR/FILE]...

The find command is unusual in that its tests and actions (such as -name) are placed AFTER the directory to be searched, e.g.

find /home/ -name libflashplayer.so

This recursively searches /home for a file named libflashplayer.so, and finds it in my Downloads folder:

root@LinuxLaptop:~# find /home/ -name libflashplayer.so

/home/stevee/Downloads/libflashplayer.so

So its default behaviour is to search recursively, with no switch required.
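As a small sketch of that recursive default, the following builds a throwaway tree under a temporary directory (the directory and file names are made up for illustration) and compares a plain search with one limited by GNU find's -maxdepth option:

```shell
# Build a throwaway tree; all names below are illustrative only.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/a/b"
touch "$sandbox/top.so" "$sandbox/a/b/deep.so"

# Default: find descends into every subdirectory.
all=$(find "$sandbox" -name '*.so' | wc -l)

# -maxdepth 1 (a GNU find option) keeps the search in the starting directory.
shallow=$(find "$sandbox" -maxdepth 1 -name '*.so' | wc -l)

echo "recursive: $all, top level only: $shallow"
rm -rf "$sandbox"
```

Note that -maxdepth is placed before -name, since it applies to the whole search rather than to one test.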

A very useful addition can be made to this command by adding the -exec action. It is part of find itself rather than a program in its own right, but it takes its name from the exec family of C library functions, whose description is a bit cryptic:

man exec

NAME

execl, execlp, execle, execv, execvp, execvpe - execute a file

SYNOPSIS

#include <unistd.h>

extern char **environ;

int execl(const char *path, const char *arg, ...);

int execlp(const char *file, const char *arg, ...);

int execle(const char *path, const char *arg, ..., char * const envp[]);

int execv(const char *path, char *const argv[]);

int execvp(const char *file, char *const argv[]);

int execvpe(const char *file, char *const argv[], char *const envp[]);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

execvpe(): _GNU_SOURCE

DESCRIPTION

The exec() family of functions replaces the current process image with a new process image. The functions described in this manual page are front-ends for execve(2). (See the manual page for execve(2) for further details about the replacement of the current process image.)

The initial argument for these functions is the name of a file that is to be executed.

The key thing to take from that is the execution of a "file" that follows it. In Linux, almost everything is presented as a "file", even hardware, which shows up as device files under /dev. So -exec can start another program file. This is how the find man page describes it:

"-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command until
an argument consisting of `;' is encountered. The string `{}'
is replaced by the current file name being processed everywhere
it occurs in the arguments to the command, not just in arguments
where it is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `\') or quoted to
protect them from expansion by the shell. See the EXAMPLES section
for examples of the use of the -exec option. The specified
command is run once for each matched file. The command is
executed in the starting directory. There are unavoidable security
problems surrounding use of the -exec action; you should
use the -execdir option instead.

-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations
of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of `{}'
is allowed within the command. The command is executed in the
starting directory."
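The difference between the two forms can be seen in a rough sketch like the one below. It uses echo as a harmless stand-in for a real command and a temporary directory with made-up file names, so nothing outside the sandbox is touched:

```shell
# Illustrative sandbox with three files.
sandbox=$(mktemp -d)
touch "$sandbox/one.txt" "$sandbox/two.txt" "$sandbox/three.txt"

# With \; echo runs once per file, so one line is printed per match.
per_file=$(find "$sandbox" -type f -exec echo {} \; | wc -l)

# With + the names are appended to a single echo, so one line is printed in total.
batched=$(find "$sandbox" -type f -exec echo {} + | wc -l)

echo "lines with per-file form: $per_file, with batched form: $batched"
rm -rf "$sandbox"
```

On a large tree the batched form starts far fewer processes, which is why it tends to be noticeably quicker for commands like cp or rm.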

In this example it starts the copy command (cp), which copies the libflashplayer.so file that find found to another directory:

find /home/ -name libflashplayer.so -exec cp -vt /etc/chromium/plugins/ {} \;

The cp command could be replaced with any other command, such as remove (rm), which makes this a useful line for housekeeping your backups.

I used it to finally sort out my messy /Files subdirectories on my 1TB BU drive. I created a directory for each Windows file type (.doc, .docx, .rtf, .txt, .pdf, .PDF, .jpg, .jpeg, .mp3, .mp4, .mov, .xls, .html, and any others I wanted) and moved the files into the folder named after their type, so they could be checked for content, then deleted to save space as required.

To match extensions in mixed case, like .m4p and .M4P, use -iname instead of -name, and quote the pattern so the shell does not expand it before find sees it, e.g.

find /1500/MP3 -iname '*.m4p' -exec cp -vt /1500/M4P/ {} \;

This took time but I plodded on, letting Linux sort it all out in the background as I did other things, by amending each line for each file type, over a day or two.
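Amending the line by hand for each file type could also be done with a loop. The following is only a sketch under assumptions: the source and destination are temporary directories with made-up file names, and the list of extensions is a shortened example rather than the full set used on the real drive.

```shell
# Illustrative sandbox; on a real drive src and dest_root would be your own paths.
src=$(mktemp -d)
dest_root=$(mktemp -d)
touch "$src/a.pdf" "$src/b.PDF" "$src/c.doc" "$src/d.mp3"

# One pass per extension: make a folder named after the type,
# then copy every case-insensitive match into it.
for ext in pdf doc mp3; do
    mkdir -p "$dest_root/$ext"
    find "$src" -type f -iname "*.$ext" -exec cp -vt "$dest_root/$ext" {} +
done

pdf_count=$(find "$dest_root/pdf" -type f | wc -l)
echo "PDFs copied: $pdf_count"
rm -rf "$src" "$dest_root"
```

One thing to watch: files with the same name in different source folders will overwrite each other in the destination. GNU cp's --backup=numbered option is one way to keep both copies.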

Once done though, I could check through everything, finding some old useful/important but forgotten documents in the process, and deleting the no longer relevant. The original messy folder containing all the mixed up originals could then be deleted saving a load of space.

Once all the file types to be found and copied were decided on and folders created, the command was amended to find and copy them there using the wildcard *:

find /Storebird/Files -name '*.pdf' -exec cp -vt /Storebird/PDFs {} \;

The -vt after cp is for "verbose" and "target directory". The "t" MUST come last, i.e. right before the destination folder, because -t takes the next word as its directory: with "-tv /DestFolder", cp would treat "v" as the destination and fail.
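A side benefit of naming the destination first with -t is that {} can sit at the very end of the command, which the batched "{} +" form of -exec requires. A minimal sketch, assuming GNU cp and a temporary sandbox with made-up file names:

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/src" "$sandbox/dest"
touch "$sandbox/src/a.pdf" "$sandbox/src/b.pdf"

# -t takes the very next word as the target directory, so in -vt it is the last letter.
# With the target named first, {} is free to be the final argument before +.
find "$sandbox/src" -name '*.pdf' -exec cp -vt "$sandbox/dest" {} +

# The long spelling does the same thing and is harder to get wrong.
find "$sandbox/src" -name '*.pdf' -exec cp -v --target-directory="$sandbox/dest" {} +

copied=$(find "$sandbox/dest" -name '*.pdf' | wc -l)
echo "files in dest: $copied"
rm -rf "$sandbox"
```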

Note that when file names mix UPPER and lower case letters, the -iname switch finds ALL files that end in either .pdf or .PDF, or .doc and .DOC, for example:

find /Storebird/PDF/ -iname '*PDF'

/Storebird/PDF/5.pdf
/Storebird/PDF/514168_NIKOLA_TESLA.PDF
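A pattern passed to -name or -iname is safest inside quotes, so that the shell hands it to find unexpanded. The sketch below (temporary directory, made-up file names) shows what can go wrong when the current directory happens to contain a matching file:

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/sub"
touch "$sandbox/one.PDF" "$sandbox/sub/two.pdf"

# Unquoted: the shell expands *PDF to one.PDF before find starts,
# so find only ever looks for that one name.
unquoted=$(cd "$sandbox" && find . -iname *PDF | wc -l)

# Quoted: find receives the pattern itself and matches both files.
quoted=$(cd "$sandbox" && find . -iname '*PDF' | wc -l)

echo "unquoted: $unquoted, quoted: $quoted"
rm -rf "$sandbox"
```

If two or more files in the current directory match, the unquoted form expands to several words and find stops with an error instead.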

Experiment with other examples using this syntax from the book The Linux Command Line (TLCL-13.07.pdf) by William Shotts:

"Table 4-1: Wildcards

Wildcard         Meaning
*                Matches any characters
?                Matches any single character
[characters]     Matches any character that is a member of the set characters
[!characters]    Matches any character that is not a member of the set characters
[[:class:]]      Matches any character that is a member of the specified class

Table 4-2 lists the most commonly used character classes:

Table 4-2: Commonly Used Character Classes

Character Class  Meaning
[:alnum:]        Matches any alphanumeric character
[:alpha:]        Matches any alphabetic character
[:digit:]        Matches any numeral
[:lower:]        Matches any lowercase letter
[:upper:]        Matches any uppercase letter

Using wildcards makes it possible to construct very sophisticated selection criteria for filenames. Here are some examples of patterns and what they match:

Table 4-3: Wildcard Examples

Pattern                  Matches
*                        All files
g*                       Any file beginning with “g”
b*.txt                   Any file beginning with “b” followed by any characters and ending with “.txt”
Data???                  Any file beginning with “Data” followed by exactly three characters
[abc]*                   Any file beginning with either an “a”, a “b”, or a “c”
BACKUP.[0-9][0-9][0-9]   Any file beginning with “BACKUP.” followed by exactly three numerals
[[:upper:]]*             Any file beginning with an uppercase letter
[![:digit:]]*            Any file not beginning with a numeral
*[[:lower:]123]          Any file ending with a lowercase letter or the numerals “1”, “2”, or “3”

Wildcards can be used with any command that accepts filenames as arguments, but we’ll talk more about that in Chapter 7.

Character Ranges

If you are coming from another Unix-like environment or have been reading some other books on this subject, you may have encountered the [A-Z] or the [a-z] character range notations. These are traditional Unix notations and worked in older versions of Linux as well. They can still work, but you have to be very careful with them because they will not produce the expected results unless properly configured. For now, you should avoid using them and use character classes instead."
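The same wildcards can be given to find's -name test, as long as they are quoted. Here is a rough sketch that tries a few of the patterns from Table 4-3 against a temporary directory; the file names are invented purely to exercise the patterns:

```shell
sandbox=$(mktemp -d)
touch "$sandbox/BACKUP.001" "$sandbox/BACKUP.12" "$sandbox/Data123" \
      "$sandbox/Report.txt" "$sandbox/notes.txt" "$sandbox/9lives"

# "BACKUP." followed by exactly three numerals
three_digits=$(find "$sandbox" -type f -name 'BACKUP.[0-9][0-9][0-9]' | wc -l)

# "Data" followed by exactly three characters
data=$(find "$sandbox" -type f -name 'Data???' | wc -l)

# Names beginning with an uppercase letter
upper=$(find "$sandbox" -type f -name '[[:upper:]]*' | wc -l)

# Names not beginning with a numeral
not_digit=$(find "$sandbox" -type f -name '[![:digit:]]*' | wc -l)

echo "three digits: $three_digits, Data???: $data, uppercase: $upper, no leading digit: $not_digit"
rm -rf "$sandbox"
```

The -type f test keeps the sandbox directory itself out of the counts, since -name is checked against directory names too.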

Here's a good house cleaner for removing those pesky Windows "Thumbs.db", ".tmp" or "desktop.ini" files that get everywhere:

find /Quadra/Test/ -name Thumbs.db -exec rm -v {} \;

CAUTION: You may want to test it on a known directory first before you unleash it on a whole drive, such as the root directory, or before using a wildcard like *.db, as you might delete all your system's databases!

You could also use find's own -delete action, or pipe the results to xargs, for the same job:

find ~ -type f -name 'Thumbs.db' -delete

find ~ -type f -name 'desktop.ini' -print0 | xargs -0 rm -v

BUT!! Beware - many Linux files ALSO end in .ini, so be careful which directories you use wildcards in! Check the results first without the -delete. For example, this is a legitimate system file:

/usr/share/themes/Mint-X-Sand/gtk-3.0/settings.ini
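A cautious routine is to run the search with -print first, read the list, and only then swap in the destructive action. The sketch below assumes GNU find and xargs, and works entirely inside a temporary directory with made-up names (including one folder with a space in it):

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/My Photos" "$sandbox/app"
touch "$sandbox/My Photos/desktop.ini" "$sandbox/My Photos/Thumbs.db" \
      "$sandbox/app/settings.ini"

# 1. Dry run: list what would go, and read the list before going further.
find "$sandbox" -type f -name 'Thumbs.db' -print

# 2. Same tests, with -delete swapped in for -print.
find "$sandbox" -type f -name 'Thumbs.db' -delete

# 3. xargs variant: -print0 and -0 keep names with spaces in one piece,
#    and -r (GNU xargs) skips rm altogether when nothing matched.
find "$sandbox" -type f -name 'desktop.ini' -print0 | xargs -0 -r rm -v

left=$(find "$sandbox" -type f | wc -l)
survivor=$(find "$sandbox" -type f -name '*.ini')
echo "files left: $left"
rm -rf "$sandbox"
```

Because the exact name desktop.ini was used rather than '*.ini', settings.ini is left alone.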

If you have a very large ext4 backup drive - say 1TB or more, it may pay you to defrag it before doing any housecleaning to speed up finds. Word tmp files alone could account for much space.

man e4defrag

My main 1500GB backup drive has this many (mainly Win) .tmp files:

find . -name '*.tmp' | wc -l
333
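Counting the files says nothing about how much space they take up. As a sketch, GNU find's -printf action can print each file's size in bytes for awk to total; the files and sizes here are invented for the demonstration:

```shell
sandbox=$(mktemp -d)
printf 'aaaa' > "$sandbox/one.tmp"        # 4 bytes
printf 'bbbbbb' > "$sandbox/two.tmp"      # 6 bytes
printf 'keep' > "$sandbox/letter.doc"     # not a .tmp file, so not counted

# -printf '%s\n' (GNU find) prints each file's size in bytes; awk adds them up.
total=$(find "$sandbox" -type f -name '*.tmp' -printf '%s\n' |
        awk '{ sum += $1 } END { print sum + 0 }')

echo "bytes held by .tmp files: $total"
rm -rf "$sandbox"
```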

ALSO! Beware of mounted file systems on backup and/or SMB networked drives! Use the -mount option, which man find describes as:

-mount Don't descend directories on other file systems. An alternate name for -xdev, for compatibility with some other versions of find.
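As a sketch of where the option goes on the command line: -xdev applies to the whole search, so it is written before tests such as -name (GNU find may print a warning otherwise). The temporary directory below sits on a single file system, so the result is the same with or without the option; the example only illustrates placement.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/docs"
touch "$sandbox/docs/old.tmp"

# -xdev comes straight after the starting directory, ahead of the tests.
found=$(find "$sandbox" -xdev -type f -name '*.tmp' | wc -l)

echo "matches on the starting file system: $found"
rm -rf "$sandbox"
```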
