I have a space or newline after a closing ?> (PHP tag) that is breaking my application.
How can I find it easily? I have thousands of files and about 100,000 lines of code in this app.
Ideally I'm after some regex combined with find and grep to run on a Unix box.
The problem here is that normal grep doesn't match across multiple lines, so I would install pcregrep
and try the following command:
pcregrep -rMl '\?>[\s\n]+\z' *
This will match all files in the folder and subfolders (the -r part) using PCRE multiline matching (the -M part), and only list their filenames (the -l part).
As for the pattern, it matches ?> followed by one or more whitespace or newline characters, followed by the end of the file (\z). When I ran this on my folder, though, I found that many of the PHP files do in fact end with a single newline. So you can update the regex to '\?>[\s\n]+\n\z' to match only files with whitespace over and above the single \n terminator.
Lastly, you can always use od -c filename
to print an unambiguous representation of the file if you need to check exactly which characters it ends with.
Use Perl:
perl -0777 -i -pe 's/\s*$//s' *.php
- -0777 slurps the whole file (-0 will work too)
- -i edits in place, so the file is replaced with the result
- -p loops over the input and prints it
- -e gives the Perl expression to run
- s/\s*$//s treats the whole file as a single string and replaces any whitespace at the end with nothing
This is possible with regular grep:
grep -Pz '\?>[\s]+$' -Rl
This will search all files starting from the current directory and list those that have a ?> followed by white space at the end of the file.
- -P interprets the pattern as a Perl-compatible regular expression (PCRE).
- -z treats the input file as one long line; this is in part what makes it work.
- [\s]+ matches at least one white-space character, including newlines.
If you want to match PHP files only:
find -name '*.php' | xargs grep -Pz '\?>[\s]+$' -l
To search for white space at the beginning of the file, before the opening <? tag:
find -name '*.php' | xargs grep -Pz '^[\s]+<\?' -l
This works on my box:
for i in `find . -name "*.php"`; do (echo -n "$i: "; tail -c 3 $i) | grep -v "[?]>"; done
The idea is that you take just the last 3 characters with tail, then discard the files where those are '?', '>' and a newline. If there's a space or another newline, you won't get the '?' character, so the file is listed.
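The same idea can be written so that file names containing spaces survive and the last bytes are compared exactly. This is a rough sketch, assuming tail, od and tr behave as on a typical Linux box; list_suspect_endings is a made-up name. Like the loop above, it also reports files that have no closing tag at all.

```shell
# Hypothetical helper: print PHP files whose last bytes are not "?>" or "?>\n".
list_suspect_endings() {
  find "${1:-.}" -type f -name '*.php' | while IFS= read -r f; do
    # Hex dump of the last 3 bytes, e.g. "3f3e0a" for "?>" plus a newline.
    last=$(tail -c 3 "$f" | od -An -tx1 | tr -d ' \n')
    case "$last" in
      3f3e0a|*3f3e) ;;            # ends with "?>\n" or "?>": looks fine
      *) printf '%s\n' "$f" ;;    # anything else: report the file
    esac
  done
}
```

Comparing hex digits avoids the problem that shell command substitution drops trailing newlines, which are exactly what is being looked for here.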
sed -e :a -e '/^[ \n]*$/{$d;N;ba' -e '}' -e '$ s/\([^ ]\)*?>[ ]*/\1?>/' file.php > new_file.php
This needs to be run for each file, and it is not completely tested.
Remember to write to a temporary file and, after the sed operation, copy the new file over the original one.
This works for me as a search pattern:
<\?php | \?>
If you need to use it in a sublime-settings file or something similar that treats backslashes specially, you might have to add an extra backslash for each of them, like so:
<\\?php | \\?>
Hope that helps!
This worked for me to find white space after the closing tag at the end of PHP files:
find -name '*.php' | xargs grep -Pz '\?>[\s]+$' -l
grep '?> ' *.php
Of course, it may not be a space; it could be a line break or a tab, so you may want to try other characters.
Using Notepad++ you can easily replace across all documents at the same time: drag and drop the folder of files into it and open the Replace dialog (Ctrl+H). You can also use regex there.