I have a very simple test program that prints the fields returned by statvfs:
#include <stdio.h>
#include <sys/statvfs.h>

int main(int argc, char* argv[])
{
    struct statvfs vfs;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return 1;
    }
    if (statvfs(argv[1], &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    /* %lu assumes the statvfs fields are unsigned long, as on 64-bit Linux. */
    printf("f_bsize (block size): %lu\n"
           "f_frsize (fragment size): %lu\n"
           "f_blocks (size of fs in f_frsize units): %lu\n"
           "f_bfree (free blocks): %lu\n"
           "f_bavail free blocks for unprivileged users): %lu\n"
           "f_files (inodes): %lu\n"
           "f_ffree (free inodes): %lu\n"
           "f_favail (free inodes for unprivileged users): %lu\n"
           "f_fsid (file system ID): %lu\n"
           "f_flag (mount flags): %lu\n"
           "f_namemax (maximum filename length)%lu\n",
           vfs.f_bsize,
           vfs.f_frsize,
           vfs.f_blocks,
           vfs.f_bfree,
           vfs.f_bavail,
           vfs.f_files,
           vfs.f_ffree,
           vfs.f_favail,
           vfs.f_fsid,
           vfs.f_flag,
           vfs.f_namemax);
    return 0;
}
Prints out:
f_bsize (block size): 4096
f_frsize (fragment size): 4096
f_blocks (size of fs in f_frsize units): 10534466
f_bfree (free blocks): 6994546
f_bavail free blocks for unprivileged users): 6459417
f_files (inodes): 2678784
f_ffree (free inodes): 2402069
f_favail (free inodes for unprivileged users): 2402069
f_fsid (file system ID): 12719298601114463092
f_flag (mount flags): 4096
f_namemax (maximum filename length)255
df prints out for the root fs:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 42137864 14159676 25837672 36% /
But here is where I'm confused.
25837672 + 14159676 != 42137864 (the sum is actually 39997348)
Therefore if I were to do the calc 14159676 / 42137864 * 100 I get 33% not 36% as df prints.
But if I calc
14159676 / 39997348 * 100 I get 35%.
Why all the discrepancies, and where is df getting the number 42137864? Is it related to some conversion to 1K blocks vs. the actual system block size, which is 4K?
This will be integrated into my caching app to tell me when the drive reaches some threshold, e.g. 90%, before I start freeing fixed-size blocks whose sizes are powers of two (2^n). So what I'm after is a function that gives me a reasonably accurate %used.
EDIT: I can now match what df prints, except for the %Used. It makes me wonder how accurate all this is. What is the fragment size?
unsigned long total = vfs.f_blocks * vfs.f_frsize / 1024;
unsigned long available = vfs.f_bavail * vfs.f_frsize / 1024;
unsigned long free = vfs.f_bfree * vfs.f_frsize / 1024;
printf("Total: %luK\n", total);
printf("Available: %luK\n", available);
printf("Used: %luK\n", total - free);
EDIT2:
unsigned long total = vfs.f_blocks * vfs.f_frsize / 1024;
unsigned long available = vfs.f_bavail * vfs.f_frsize / 1024;
unsigned long free = vfs.f_bfree * vfs.f_frsize / 1024;
unsigned long used = total - free;
printf("Total: %luK\n", total);
printf("Available: %luK\n", available);
printf("Used: %luK\n", used);
printf("Free: %luK\n", free);
// Calculate % used based on f_bavail not f_bfree. This is still giving out a different answer to df???
printf("Use%%: %f%%\n", (vfs.f_blocks - vfs.f_bavail) / (double)(vfs.f_blocks) * 100.0);
f_bsize (block size): 4096
f_frsize (fragment size): 4096
f_blocks (size of fs in f_frsize units): 10534466
f_bfree (free blocks): 6994182
f_bavail (free blocks for unprivileged users): 6459053
f_files (inodes): 2678784
f_ffree (free inodes): 2402056
f_favail (free inodes for unprivileged users): 2402056
f_fsid (file system ID): 12719298601114463092
f_flag (mount flags): 4096
f_namemax (maximum filename length)255
Total: 42137864K
Available: 25836212K
Used: 14161136K
Free: 27976728K
Use%: 38.686470%
matth@kubuntu:~/dev$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 42137864 14161136 25836212 36% /
I get 38% not 36. If calculated by f_bfree I get 33%. Is df wrong or is this just never going to be accurate? If this is the case then I want to lean on the side of being conservative.
df's data may be based on f_bavail, not f_bfree. You may find it helpful to look at the source code to df to see how it does things. It has a number of edge cases it needs to deal with (e.g., when the used space exceeds the amount of space available to non-root users), but the relevant code for the normal case is here:
uintmax_t u100 = used * 100;
uintmax_t nonroot_total = used + available;
pct = u100 / nonroot_total + (u100 % nonroot_total != 0);
In other words, 100 * used / (used + available), rounded up. Plugging in the values from your df output gives 100 * 14159676 / (14159676 + 25837672) = 35.4015371, which rounded up is 36%, just as df calculated.
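The rounding step above can be sketched as a standalone C helper. This is only an illustration: the name df_style_percent is hypothetical (it is not a coreutils function), it handles just the normal case quoted above, and it assumes used * 100 fits in a uintmax_t.

```c
#include <stdint.h>

/*
 * Sketch of df-style rounding: 100 * used / (used + available), rounded up.
 * df_style_percent is a hypothetical name, not part of coreutils.
 * Returns -1 when used + available is 0 (nothing to divide by).
 */
int df_style_percent(uintmax_t used, uintmax_t available)
{
    uintmax_t nonroot_total = used + available;
    if (nonroot_total == 0)
        return -1;

    uintmax_t u100 = used * 100;  /* assumes this multiplication does not overflow */

    /* Integer division truncates, so add 1 whenever there is a remainder. */
    return (int)(u100 / nonroot_total + (u100 % nonroot_total != 0));
}
```

Called with the Used and Available columns from the first df output, df_style_percent(14159676, 25837672) evaluates to 36. The units do not matter as long as both arguments use the same one.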
On your Edit #2, the Usage% calculation needs to be updated to this to match df output:
100.0 * (double) (vfs.f_blocks - vfs.f_bfree) / (double) (vfs.f_blocks - vfs.f_bfree + vfs.f_bavail)
Reasoning:
Used = f_blocks - f_bfree
Avail = f_bavail
df % = Used / (Used + Avail)
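As a worked check, here is the same arithmetic with the block counts from the EDIT2 output. It is a sketch of the calculation only; the ceiling reflects df rounding the percentage up.

```latex
\begin{align*}
\text{Used}  &= \mathtt{f\_blocks} - \mathtt{f\_bfree} = 10534466 - 6994182 = 3540284 \\
\text{Avail} &= \mathtt{f\_bavail} = 6459053 \\
\text{df \%} &= \left\lceil 100 \cdot \frac{3540284}{3540284 + 6459053} \right\rceil
              = \lceil 35.405\ldots \rceil = 36
\end{align*}
```

The formula in the question, (f_blocks - f_bavail) / f_blocks, gives 38.69% because it counts the 535129 root-reserved blocks (f_bfree - f_bavail) as used and also keeps them in the total.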
This is the closest I've got to matching the output of df -h for used percentage:
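#include <cmath>
#include <string>
#include <sys/statvfs.h>

// Hypothetical wrapper: the function name and signature are assumptions.
double usedPercentage(const std::string& diskMountPoint)
{
    const double GB = 1024.0 * 1024.0 * 1024.0;
    struct statvfs buffer;
    if (statvfs(diskMountPoint.c_str(), &buffer) != 0)
        return -1.0; // statvfs failed
    // Round sizes up to whole GB before taking the ratio.
    const double total = ceil((double)(buffer.f_blocks * buffer.f_frsize) / GB);
    const double available = ceil((double)(buffer.f_bfree * buffer.f_frsize) / GB);
    const double used = total - available;
    return ceil((used / total) * 100.0);
}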
It seems I get confused whenever I deal with this issue. I hope the following C code is helpful to someone looking for percentage of used space:
/*
 * It is helpful to use a picture to aid the calculation of disk space.
 *
 * |<------------------------- f_blocks -------------------------->|
 *                 |<------------------ f_bfree ------------------>|
 *
 * -----------------------------------------------------------------
 * | USED          | f_bavail                  | Reserved for root |
 * -----------------------------------------------------------------
 *
 * We want the percentage of used blocks vs. all the
 * non-reserved blocks: USED / (USED + f_bavail)
 */
fsblkcnt_t used = fs_stats.f_blocks - fs_stats.f_bfree;
double fraction_used = (double) used / ((double) used + (double) fs_stats.f_bavail);
uint8_t percent_used = (uint8_t) ((fraction_used * 100.0) + 0.5); // Add 0.5 for rounding
statvfs metrics are kinda confusing. You can use psutil source code as an example on how to get meaningful values in bytes: https://github.com/giampaolo/psutil/blob/f4734c80203023458cb05b1499db611ed4916af2/psutil/_psposix.py#L119
Here is an implementation that mimics the behavior of df:
#include <string>
#include <sys/statvfs.h>
double amountOfDiskSpaceUsed(const std::string& filePath)
{
// Based on the implementation in https://github.com/coreutils/coreutils/blob/master/src/df.c
// See how PCENT_FIELD and IPCENT_FIELD are calculated.
struct statvfs diskInfo;
statvfs(filePath.c_str(), &diskInfo);
const auto total = static_cast<unsigned long>(diskInfo.f_blocks);
const auto available = static_cast<unsigned long>(diskInfo.f_bavail);
const auto availableToRoot = static_cast<unsigned long>(diskInfo.f_bfree);
const auto used = total - availableToRoot;
const auto nonRootTotal = used + available;
return 100.0 * static_cast<double>(used) / static_cast<double>(nonRootTotal);
}
E.g. it may return 39.623889 while df outputs 40% (rounded value).
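For the threshold check the question is ultimately after (e.g. "free space once the drive is 90% full"), the percentage does not need to be rounded at all. The C sketch below compares used / (used + f_bavail) against a threshold by cross-multiplying. Both function names are hypothetical, and treating an empty total as "over the threshold" is an assumption made to stay on the conservative side.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: returns 1 if usage, measured as
 * used / (used + f_bavail), is at or above threshold_percent, else 0.
 */
int usage_at_or_above(const struct statvfs *vfs, unsigned threshold_percent)
{
    uintmax_t used = (uintmax_t)vfs->f_blocks - (uintmax_t)vfs->f_bfree;
    uintmax_t nonroot_total = used + (uintmax_t)vfs->f_bavail;

    if (nonroot_total == 0)
        return 1;  /* assumption: treat an empty total as full */

    /* Cross-multiply instead of dividing, so no rounding is involved. */
    return used * 100 >= (uintmax_t)threshold_percent * nonroot_total;
}

/* Convenience wrapper: -1 if statvfs fails, otherwise 0 or 1. */
int path_usage_at_or_above(const char *path, unsigned threshold_percent)
{
    struct statvfs vfs;
    if (statvfs(path, &vfs) != 0)
        return -1;
    return usage_at_or_above(&vfs, threshold_percent);
}
```

Because df rounds up, it can display a percentage one higher than the exact value. With the EDIT2 numbers the exact usage is about 35.4%, so this check is true for a threshold of 35 and false for 36, even though df shows 36%.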