How to exclude files on copy (cp) on GNU / Linux / Linux copy and exclude files and directories (cp -r) exclusion

Saturday, 3rd March 2012

I've recently had to make a copy of one /usr/local/nginx directory under /usr/local/nginx-bak, in order to have a working copy of nginx, just in case if during my nginx update to new version from source mess ups.

I did not check the size of /usr/local/nginx , so just run the usual:

nginx:~# cp -rpf /usr/local/nginx /usr/local/nginx-bak
...

Execution took more than 20 seconds, so I check the size and figured out /usr/local/nginx/logs has grown to 120 gigabytes.

I didn't wanted to extra load the production server with copying thousands of gigabytes so I asked myself if this is possible with normal Linux copy (cp) command?. I checked cp manual e.g. man cp, but there is no argument like –exclude or something.

Even though the cp command exclude feature is not implemented by default there are a couple of ways to copy a directory with exclusion of subdirectories of files on G / Linux.

Here are the 3 major ones:

1. Copy directory recursively and exclude sub-directories or files with GNU tar

Maybe the quickest way to copy and exclude directories is through a littke 'hack' with GNU tar nginx:~# mkdir /usr/local/nginx-new;
nginx:~# cd /usr/local/nginx#
nginx:/usr/local/nginx# tar cvf - . --exclude=/usr/local/nginx/logs/*
| (cd /usr/local/nginx-new; tar -xvf - )

Copying that way however is slow, in my case it fits me perfectly but for copying large chunks of data it is better not to use pipe and instead use regular tar operation + mv

# cd /source_directory
# tar cvf test.tar --exclude=dir_to_exclude/*--exclude=dir_to_exclude1/* .
# mv test.tar /destination_directory
# cd /destination# tar xvf test.tar

2. Copy folder recursively excluding some directories with rsync

P>eople who has experience with rsync , already know how invaluable this tool is. Rsync can completely be used as for substitute=de.a# rsync -av –exclude='path1/to/exclude' –exclude='path2/to/exclude' source destination

This example, can also be used as a solution to my copy nginx and exclude logs directory casus like so:

nginx:~# rsync -av --exclude='/usr/local/nginx/logs/' /usr/local/nginx/ /usr/local/nginx-new

As you can see for yourself, this is a way more readable for the tar, however it will not work on servers, where rsync is not installed and it is unusable if you have to do operations as a regular users on such for that case surely the GNU tar hack is more 'portable' across systems.
rsync has also Windows version and therefore, the same methodology should be working on MS Windows and good for batch scripting.
I've not tested it myself, yet as I've never used rsync on Windows, if someone has tried and it works pls drop me a short msg in comments.
3. Copy directory and exclude sub directories and files with find

Find in collaboration with cp can also be used to exclude certain directories while copying. Actually this method is better than the GNU tar hack and surely more efficient. For machines, where rsync is not installed it is just a perfect way to copy files from location to location, while excluding some directories, here is an example use of find and cp, for the above nginx case:

nginx:~# cd /usr/local/nginx
nginx:~# mkdir /usr/local/nginx
nginx:/usr/local/nginx# find . -type d ( ! -name logs ) -print -exec cp -rpf '{}' /usr/local/nginx-bak ;

This will find all directories inside /usr/local/nginx with find command print them on the screen, then execute recursive copy over each found directory and copy to /usr/local/nginx-bak

This example will work fine in the nginx case because /usr/local/nginx does not contain any files but only sub-directories. In other occwhere the directory does contain some files besides sub-directories the files had to also be copied e.g.:

# for i in $(ls -l | egrep -v '^d'); do
cp -rpf $i /destination/directory

This will copy the files from source directory (for instance /usr/local/nginx/my_file.txt, /usr/local/nginx/my_file1.txt etc.), which doesn't belong to a subdirectory.

The cmd expression:

# ls -l | egrep -v '^d'

Lists only the files while excluding all the directories and in a for loop each of the files is copied to /destination/directory

If someone has better ideas, please share with me 🙂

Share this on:

Download PDFDownload PDF

Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

2 Responses to “How to exclude files on copy (cp) on GNU / Linux / Linux copy and exclude files and directories (cp -r) exclusion”

  1. Morten Pedersen says:
    Google Chrome 21.0.1180.79 Google Chrome 21.0.1180.79 Mac OS X 10.6.8 Mac OS X 10.6.8
    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.79 Safari/537.1

    Why not just use the cp command and exclude the “logs” subdirectory like this:

    cp -rpf /usr/local/nginx/!(logs) /usr/local/nginx-bak

    View CommentView Comment

Leave a Reply

CommentLuv badge