List directory filenames by file word count

Developer https://www.devze.com 2023-01-04 01:14 Source: Internet

I'd like to:

  1. Check the word count for a folder full of text files.
  2. Output a list of the files arranged by word count in the format - FILENAME is WORDCOUNT

I know str_word_count is used to get the word count of an individual file, but I'm not sure how to rearrange the output.

Thanks in advance.


Adapted from here.

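<?php
    $files = array();
    $it = new DirectoryIterator("/tmp");
    foreach ($it as $fileinfo) {
        // Skip ".", ".." and subdirectories
        if (!$fileinfo->isFile()) continue;
        // getPathname() includes the directory; getFilename() alone would be
        // resolved against the current working directory
        $count = str_word_count(file_get_contents($fileinfo->getPathname()));
        // The zero-padded count in the key lets ksort() order by word count
        $files[sprintf("%010d", $count) . $fileinfo->getFilename()] =
            array($count, $fileinfo->getFilename());
    }

    ksort($files);
    foreach ($files as $tup) {
        printf("%s is %d\n", $tup[1], $tup[0]);
    }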

EDIT It would be more elegant to have each key of $files be the file name and each value be the word count, and then sort by value.
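A minimal sketch of that idea, assuming asort() is an acceptable way to sort by value while keeping the file-name keys. The function name word_counts_by_file and its $dir parameter are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): returns an array of
// filename => word count, sorted by word count in ascending order.
function word_counts_by_file(string $dir): array
{
    $counts = array();
    foreach (new FilesystemIterator($dir) as $fileinfo) {
        // Skip subdirectories and other non-files
        if (!$fileinfo->isFile()) continue;
        // File names are unique within one directory, so keys cannot collide
        $counts[$fileinfo->getFilename()] =
            str_word_count(file_get_contents($fileinfo->getPathname()));
    }
    // asort() sorts by value and preserves the keys
    asort($counts);
    return $counts;
}
```

To print the list in the requested format, loop over the result with foreach (word_counts_by_file("/tmp") as $name => $count) and call printf("%s is %d\n", $name, $count) for each entry.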


I don't use PHP, but I would:

  1. create array to hold filename and wordcount
  2. read through the folder full of text files and for each save the filename and wordcount to the array
  3. sort the array by wordcount
  4. output the array

To store the information (#2) I would put it into a 2D array. There is more information about 2D arrays here at Free PHP Tutorial. Thus array[0][0] would be the name of the first file and array[0][1] would be its word count; array[1][0] and array[1][1] would be the name and word count of the next file.

To sort the array (#3) you can follow the tutorial at firsttube.com.
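As a rough illustration of the sorting step, one option is PHP's built-in usort() with a comparison on the second column. This is only a sketch and not necessarily what the linked tutorial does; the sample rows are made up.

```php
<?php
// Sample rows are made up for illustration: each row is [filename, word count].
$array = array(
    array("notes.txt", 120),
    array("todo.txt", 15),
    array("essay.txt", 980),
);

// Compare rows by their second column (the word count), ascending.
// The <=> operator requires PHP 7 or later.
usort($array, function ($a, $b) {
    return $a[1] <=> $b[1];
});
```

Swapping $a and $b in the comparison would sort in descending order instead.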

Then, to output (#4), I would loop through the array and print the first and second element of each row.

for ($i = 0; $i < count($array); ++$i) {
    // $array[$i][0] is the filename, $array[$i][1] is the word count
    echo $array[$i][0] . " is " . $array[$i][1] . "\n";
}


If you would like to keep the iterator-style approach (yet still do essentially the same as Artefacto's answer) then something like the following would suffice.

$dir_it = new FilesystemIterator("/tmp");
// Build array iterator with word counts
$arr_it = new ArrayIterator();
foreach ($dir_it as $fileinfo) {
    // Skip non-files
    if ( ! $fileinfo->isFile()) continue;
    // Keep the count next to the file info rather than setting an undeclared
    // property on SplFileInfo (dynamic properties are deprecated as of PHP 8.2)
    $arr_it->append(array(
        'file'       => $fileinfo,
        'word_count' => str_word_count(file_get_contents($fileinfo->getPathname())),
    ));
}
// Sort by word count descending
$arr_it->uasort(function($a, $b){
    return $b['word_count'] - $a['word_count'];
});

// Display sorted files and their word counts
foreach ($arr_it as $entry) {
    printf("%10d %s\n", $entry['word_count'], $entry['file']->getFilename());
}

Aside: If the files are particularly large (read: loading each one entirely into memory just to count the words is too much) then you could loop over the file line-by-line (or byte-by-byte if you really wanted to) with the SplFileObject.
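A possible sketch of the line-by-line variant mentioned in the aside, assuming that summing per-line counts is acceptable. Line breaks are not word characters for str_word_count, so no word can span two lines. The function name count_words_streaming is hypothetical.

```php
<?php
// Hypothetical helper: counts words one line at a time, so only a single
// line of the file is held in memory at once.
function count_words_streaming(string $path): int
{
    $file = new SplFileObject($path, 'r');
    $count = 0;
    // Iterating an SplFileObject yields the file one line at a time
    foreach ($file as $line) {
        $count += str_word_count($line);
    }
    return $count;
}
```

In either of the loops above, the call to str_word_count(file_get_contents(...)) could be replaced with count_words_streaming($fileinfo->getPathname()).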
