Delete similar lines - PHP

Developers https://www.devze.com 2023-04-05 06:24 Source: Internet
Is it possible to delete all lines that share the same first 30 characters, keeping only the first line that has them?

Example:

xx2 Lorem ipsum dolor sit amet, fdsfdsfs
xx2 Lorem ipsum dolor sit amet, 43434343

The second should be deleted... Hope it's possible... Thanks


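$page = explode( "\n", $file );  // $file holds the file contents as a string
$search = array();               // prefixes seen so far
foreach( $page as $index => $line )
{
  $prefix = substr( $line, 0, 30 );
  if( in_array( $prefix, $search, true ) ){
    unset( $page[$index] );  // delete the duplicate
  }else{
    $search[] = $prefix;
  }
}
$file = implode( "\n", $page );  // rebuild the text without the duplicates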

Basically, it takes a file's contents or any multi-line string and loops through it line by line. If the first 30 characters of a line have already been encountered, the line is deleted; otherwise the prefix is added to the list to check against. When the loop finishes, only the first line for each unique 30-character prefix remains. Give it a try. Good luck.
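As a rough sketch, the same loop can be wrapped in a function and run against the example from the question. The name `dedupe_by_prefix` and its `$length` parameter are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: keeps only the first line for each distinct prefix.
function dedupe_by_prefix(string $text, int $length = 30): string
{
    $seen = array();   // prefixes encountered so far
    $kept = array();   // lines that survive, in original order

    foreach (explode("\n", $text) as $line) {
        $prefix = substr($line, 0, $length);
        if (in_array($prefix, $seen, true)) {
            continue;  // a line with this prefix was already kept
        }
        $seen[] = $prefix;
        $kept[] = $line;
    }

    return implode("\n", $kept);
}

$input = "xx2 Lorem ipsum dolor sit amet, fdsfdsfs\n"
       . "xx2 Lorem ipsum dolor sit amet, 43434343";

// Prints only the first of the two lines.
echo dedupe_by_prefix($input), "\n";
```

Both example lines start with the same 30 characters ("xx2 Lorem ipsum dolor sit amet"), so only the first one is printed.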


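If you need to deal with really large files, reading one line at a time and writing to a temp file consumes far less memory than loading the whole file. Writing to a temp file and renaming it over the input file when done means the original is never left half-written; the swap is atomic when both files are on the same filesystem. Checking array keys instead of values gives fast lookups, because PHP arrays are hash tables: a key lookup takes roughly constant time, whereas in_array() scans every stored value. You also need to handle blank lines: fgets() returns each line with its line ending attached, and before PHP 8.0 substr() returns false for an empty string.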

<?php
$infile_name = "infile.txt";

$seen = array();
$infile = fopen($infile_name, "r");
if ( $infile !== false ) {
    // Temporary file in the same directory, so rename() stays on one filesystem
    $outfile_name = tempnam(dirname($infile_name), 'tmp');
    $outfile = fopen($outfile_name, "w");

    // fgets() returns false at end of file, which ends the loop
    while ( ($line = fgets($infile)) !== false ) {
        // fgets() keeps the line ending, so strip it before testing for blank
        $text = rtrim($line, "\r\n");
        if ( $text === '' ) {
            // blank line, just write it
            fwrite($outfile, $line);
        }
        else {
            $prefix = substr( $text, 0, 30 );

            if ( !array_key_exists($prefix, $seen) ) {
               fwrite($outfile, $line);

               // Store the prefix as a key for fast hashed lookup
               $seen[$prefix] = true;
            }
        }
    }

    fclose($infile);
    fclose($outfile);

    // Put the new file in place of the old one; rename() overwrites the target
    rename($outfile_name, $infile_name);
}
?>
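To try the line-by-line approach without touching a real file, it can be run against in-memory streams. This is only a sketch: the function name `dedupe_stream` and its `$length` parameter are hypothetical, `php://memory` stands in for the input and output files, and the `void` return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: copies $in to $out, dropping any line whose
// prefix has already been written. Blank lines are always kept.
function dedupe_stream($in, $out, int $length = 30): void
{
    $seen = array();   // prefixes stored as array keys

    while (($line = fgets($in)) !== false) {
        $text = rtrim($line, "\r\n");   // fgets() keeps the line ending
        if ($text === '') {
            fwrite($out, $line);
            continue;
        }
        $prefix = substr($text, 0, $length);
        if (!isset($seen[$prefix])) {
            $seen[$prefix] = true;
            fwrite($out, $line);
        }
    }
}

// In-memory streams stand in for the real input and output files.
$in  = fopen('php://memory', 'w+');
$out = fopen('php://memory', 'w+');
fwrite($in, "alpha 1\n\nalpha 2\n\nbeta\n");
rewind($in);

dedupe_stream($in, $out, 5);   // compare only the first 5 characters

rewind($out);
echo stream_get_contents($out);
```

With a prefix length of 5, "alpha 2" is dropped because "alpha 1" was seen first, while both blank lines and "beta" pass through unchanged.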