Solved find - does not seem to return filenames as they are found?

We have a server with an excessively large /var/spool/clientmqueue. I am trying to clear this out. The technique I tried was:
Code:
find /var/spool/clientmqueue -type f -exec rm -f {} \;

This command exhibits behaviour that I was not expecting. I anticipated that this would remove files as find encountered them. This does not seem to be the case. There was an interruption due to a network error and when I checked the directory size it had not changed although the find had been running for a significant amount of time.

I then used this command to see what was going on:
Code:
find /var/spool/clientmqueue -type f -ok rm {} \;

This command eventually, after a very, very long time, began to request permission to delete files. The problem is that it took an extraordinary length of time before the deletions began. I had previously believed that find acted upon files as it encountered them, so I expected the first confirmation request to be nearly instantaneous. Either my belief is unfounded or something else is going on. Does find not execute commands as it encounters qualifying files? If so, what is causing the inordinate delay? If not, is there a utility that does exactly that?
 
No matter what is going on, my take would be to forget find's -exec rm in favor of xargs,

Code:
find /var/spool/clientmqueue -type f -print0 | xargs -0 -r rm

which saves a ton of forks.
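An even lighter-weight option, if your find supports it (both GNU and BSD find do), is the built-in -delete primary, which unlinks each match inside find itself with no external process at all. A minimal sketch, using a throwaway directory in place of the real queue:

```shell
# Scratch directory stands in for /var/spool/clientmqueue.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# -delete unlinks each match inside find itself (note it implies
# -depth), so no rm processes are forked at all.
find "$dir" -type f -delete

ls -A "$dir"    # nothing left
rmdir "$dir"
```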
 
If you try find /var/spool/clientmqueue -type f -exec echo {} \; you will see that files are echoed immediately. Something else must be going on. However, as schweikh says, pipe to xargs instead of using -exec.
 
my WAG: filesystem operations are probably being cached. Don't look at the filesystem from another shell and expect 100% synchronous behaviour.

I see this all the time when processing large datasets. It is in line with the concept that if you delete a file that someone else has open, the dirent may go away, but the file still occupies cluster space until the other user closes it.

And in the OP's use case they are messing with a queue directory, which undoubtedly has other daemons concurrently accessing that directory tree.
 
The technique I tried was:
As others have said, this is just about the least efficient technique. Doing "rm -r ..." would be much more efficient, and find | xargs would still be better. If you have to be selective (only delete certain things; for example, your find selects only files, not directories): run ls -l into a file, filter that file, and use the filtered file as input to xargs, or turn it into a script. When doing this, minimize the number of times the rm executable needs to be started, so every time you run rm, give it hundreds or thousands of files at once.
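A sketch of that "list, filter, delete in batches" idea, assuming (hypothetically) that only queue files whose names begin with df should go:

```shell
# Work inside a scratch directory standing in for the queue directory.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch dfAAA dfBBB qfCCC    # hypothetical queue-file names

# List once, filter the listing, then feed rm large batches of names.
# -n 1000 caps the arguments per rm invocation: one fork per thousand
# files instead of one fork per file.
ls -1 | grep '^df' | xargs -n 1000 rm -f

ls -1    # only qfCCC remains
```

Note that this pipeline assumes filenames without whitespace, which holds for sendmail queue files; for arbitrary names, prefer find -print0 | xargs -0 as above.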

and when I checked the directory size it had not changed
What do you mean by "directory size"? Did you do an "ls -l" on the directory itself, and look at its filesize in bytes? Be careful with that, since the file size of a directory is not a well defined concept. In ZFS, the size in bytes is roughly the number of entries in the directory (typically +2 for . and .., and if any of the directory entries are directories themselves, there might be an extra +1 for each subdirectory's backlink, but I'm not sure). On UFS, the size is measured in sectors, and the rules for when directories are compacted are interestingly complex, so don't expect directories to shrink immediately.
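Given all that, a more reliable way to watch progress is to count the remaining entries rather than watching the directory's own byte size, which may not shrink even as files disappear:

```shell
# Count remaining queue files; this number should fall as deletions
# proceed, even when the directory's size in bytes does not change.
find /var/spool/clientmqueue -type f | wc -l
```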

This command eventually, after a very, very long time, began to request permission to delete files. The problem is that it took an extraordinary length of time before it began the delete process.
Something is going on with your underlying file system. As you said, the find should start pretty much immediately. You didn't by chance run top while it was running?

my WAG: filesystem operations are probably being cached. Don't look at the filesystem from another shell and expect 100% syncronous behaviour.
Yes and no. Deleting a file is synchronous from the viewpoint of the name space: when rm finishes (or equivalently when the unlink system call returns), the directory entry (the name of the file) shall be gone. Any ls (or readdir() or equivalent operation, such as the find command) shall not see the directory entry. From the viewpoint of the space usage of the file, the story is more complex. As you said, if the file is open by another process (or has another name = hardlink), then the underlying storage cannot be released. And cleaning up storage is theoretically allowed to happen later. I think UFS will release it immediately in the unlink call. With ZFS, the log cleaner adds a lot of complexity, and I don't know whether delete operations are synchronous or not by default. Even if they are synchronous, a large number of deletes can create a huge backlog for log cleaners to work on, which will slow them down. Now add dedup and snapshots, and the answer is complex.
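The name-space half of this can be demonstrated directly: the directory entry vanishes as soon as unlink returns, even while an open descriptor keeps the data alive:

```shell
# Create a file, hold it open on fd 3, then unlink it.
f=$(mktemp)
echo "still here" > "$f"
exec 3< "$f"

rm "$f"                        # the name is gone immediately...
[ ! -e "$f" ] && echo "dirent gone"

# ...but the data remains readable through the open descriptor;
# the storage is only released once fd 3 is closed.
read -r line <&3
echo "$line"                   # prints: still here
exec 3<&-
```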
 
you can rm -fr /var/spool/clientmqueue and recreate it
I was thinking about doing that. The problem arose from an inadvertent start of the sendmail service on a remote host. Sendmail was not configured, so a lot of local warning messages were generated before the problem was noticed. I thought it best to leave unfamiliar parts of the file system alone and just clean out the files. The unexpected results turned this into a learning experience for me.
 
If you try find /var/spool/clientmqueue -type f -exec echo {} \; you will see that files are echoed immediately. Something else must be going on. However, as schweikh says, pipe to xargs instead of using -exec.
Sadly, this does not work for me as suggested. This also results in a long delay with no visible output.
 
my WAG: filesystem operations are probably being cached. Don't look at the filesystem from another shell and expect 100% synchronous behaviour.

I see this all the time when processing large datasets. It is in line with the concept that if you delete a file that someone else has open, the dirent may go away, but the file still occupies cluster space until the other user closes it.

And in the OP's use case they are messing with a queue directory, which undoubtedly has other daemons concurrently accessing that directory tree.
Sendmail and its associated daemons are not running. We use dma on that host. I believe that /var/spool/clientmqueue is specific to sendmail. I was only checking whether or not there were any changes happening at all.
 