We have a regular job that runs du summaries over a number of subdirectories, picks out the worst offenders, and uses the output to spot anything that is growing rapidly, so we can catch potential problems early. We diff against snapshots to compare runs.
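For concreteness, the job is essentially a sketch like the following (the throwaway tree built with mktemp stands in for our real top-level directory, and the file sizes are purely illustrative):

```shell
#!/bin/sh
# Build a small throwaway tree so the sketch runs anywhere;
# in practice TOP would be the real top-level directory.
TOP=$(mktemp -d)
mkdir -p "$TOP/proj-a" "$TOP/proj-b"
dd if=/dev/zero of="$TOP/proj-a/big"   bs=1024 count=64 2>/dev/null
dd if=/dev/zero of="$TOP/proj-b/small" bs=1024 count=8  2>/dev/null

snap=$(mktemp)
# Per-subdirectory totals, largest first ("worst offenders" on top).
du -s "$TOP"/*/ | sort -rn > "$snap"
cat "$snap"

# Later runs would compare against a saved snapshot, e.g.:
# diff "$old_snap" "$snap"
```

The real job keeps dated snapshot files and diffs consecutive ones to flag directories whose totals jump between runs.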
There is a top-level directory with a few hundred subdirectories, each of which may contain tens of thousands of files (or more).
A "du -s" in this context can be very IO-aggressive: it causes the server to evict its cache and then produces massive IO spikes, which is a very unwelcome side effect.
What strategy can we use to get the same data without the unwanted side effects?