Ordering file location on a Linux partition

Developer https://www.devze.com 2023-04-10 08:03 Source: web
I have a process that works through a lot of files (~96,000 files, ~12 TB of data). Several runs of the process have left the files scattered about the drive. Each iteration of the process uses several files, which leads to a lot of whipsawing around the disk to collect them.

Ideally, I would like the process to write the files it uses in order, so that the next run will read them in order (file sizes change). Is there a way to hint at a physical ordering/grouping, short of writing to the raw partition?

Any other suggestions would be helpful.

Thanks


There are two system calls you might look up: fadvise64 and fallocate. They tell the kernel how you intend to read or write a given file.
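
As a rough illustration of those two calls, here is a minimal sketch using the portable glibc wrappers posix_fadvise (which sits on top of fadvise64) and posix_fallocate. The helper names open_sequential and create_preallocated are made up for this example. Note that both calls are hints or reservations only: preallocating gives the filesystem a chance to pick contiguous space, but nothing here guarantees a particular physical layout.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise / posix_fallocate */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and advise the kernel that
 * the whole file will be read sequentially. Returns the fd, or -1. */
int open_sequential(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* Offset 0 with length 0 means "to the end of the file".
     * This is advice only, so a failure here is deliberately ignored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    return fd;
}

/* Hypothetical helper: create `path` and reserve `size` bytes up front,
 * before any data is written. Returns the fd, or -1 on failure. */
int create_preallocated(const char *path, off_t size) {
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    /* posix_fallocate returns an errno value directly (0 on success);
     * it does not return -1 and set errno. */
    if (posix_fallocate(fd, 0, size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Since your file sizes change between runs, the size passed to create_preallocated would have to be an estimate made before writing.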

Another tip: the "Orlov block allocator" (Wikipedia, LWN) affects the way the kernel allocates new directories and file entries.


In the end I decided not to worry about writing the files in any particular ordering. Instead, before I started a run, I would figure out where the first block of each file was located, and then sort the file processing order by first block location. Not perfect, but it did make a big difference in processing times.

Here's the C code I used to get the first block of each file in a supplied list. I adapted it from example code I found online (I can't seem to find the original source).

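#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>

#include <linux/fs.h>   /* FIBMAP */

//
// Get the first block for each file named on stdin (one path per line),
// write first block & filename for each file to stdout.
// Note: FIBMAP normally requires root privileges (CAP_SYS_RAWIO).
//

int main(void) {
    int     fd;
    int     block;
    char    fname[512];

    while (fgets(fname, sizeof fname, stdin) != NULL) {

        // Strip the trailing newline, if there is one.
        fname[strcspn(fname, "\n")] = '\0';

        // open() returns -1 on failure, which assert() would treat as
        // success, so check the result explicitly.
        fd = open(fname, O_RDONLY);
        if (fd < 0) {
            fprintf(stderr, "open failed for %s: %s\n", fname, strerror(errno));
            continue;
        }

        // In: logical block index within the file (0 = first block).
        // Out: the corresponding physical block number.
        block = 0;
        if (ioctl(fd, FIBMAP, &block) != 0) {
            fprintf(stderr, "FIBMAP ioctl failed for %s: %s\n", fname, strerror(errno));
        } else {
            printf("%010d, %s\n", block, fname);
        }
        close(fd);
    }
    return 0;
}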
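
For the sorting step itself, something along these lines would do if you want to stay in C. This is only a sketch: the struct file_entry type and the sort_by_first_block function are names invented for the example, and it assumes the block numbers and file names have already been collected into an array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: one "first block, filename" pair, as printed by
 * the program above. */
struct file_entry {
    long        first_block;
    const char *name;
};

/* qsort comparator: ascending by first block. Uses comparisons rather
 * than subtraction so large block numbers cannot overflow. */
static int by_first_block(const void *a, const void *b) {
    const struct file_entry *x = a;
    const struct file_entry *y = b;
    return (x->first_block > y->first_block) - (x->first_block < y->first_block);
}

/* Reorder `entries` in place so the files can be processed in the order
 * their first blocks appear on disk. */
void sort_by_first_block(struct file_entry *entries, size_t n) {
    assert(n == 0 || entries != NULL);
    qsort(entries, n, sizeof entries[0], by_first_block);
}
```

Because the program prints the block number zero-padded to a fixed width at the start of each line, piping its output through the standard sort utility should give the same ordering without any extra code.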