I need a bash script for Mac OS X that works like this:
./script.sh * folder/to/files/
#
# or #
#
./script.sh xx folder/to/files/
This script should:
- read a list of files
- open each file and read each line
- if a line ends with repeated letters ('*' mode) or with custom letters ('xx'), remove the line and re-save the file
- back up the original file
My first approach:
#!/bin/bash
# ck init params
if [ $# -le 0 ]
then
echo "Usage: $0 <letters>"
exit 0
fi
# list files in current dir
list=`ls BRUTE*`
for i in $list
do
# prepare regex
case $1 in
"*") REGEXP="^.*(.)\1+$";;
*) REGEXP="^.*[$1]$";;
esac
FILE=$i
# backup file
cp $FILE $FILE.bak
# removing line with same letters
sed -Ee "s/$REGEXP//g" -i '' $FILE
cat $FILE | grep -v "^$"
done
exit 0
But it doesn't work as I want.
What's wrong?
How can I fix this script?
Example:
$cat BRUTE02.dat BRUTE03.dat
aa
ab
ac
ad
ee
ef
ff
hhh
$
If I use '*', I want every line that ends with repeated letters to be removed from all files.
If I use 'ff', I want every line that ends with 'ff' to be removed from all files.
Ah, and this is on Mac OS X. Remember that its sed is a little different from the classic Linux sed.
man sed
sed [-Ealn] command [file ...]
sed [-Ealn] [-e command] [-f command_file] [-i extension] [file ...]

DESCRIPTION
The sed utility reads the specified files, or the standard input if no files are specified, modifying the input as specified by a list of commands. The input is then written to the standard output.
A single command may be specified as the first argument to sed.
Multiple commands may be specified by using the -e or -f options. All commands are applied to the input in the order they are specified regardless of their origin.

The following options are available:
-E  Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
-a  The files listed as parameters for the ``w'' functions are created (or truncated) before any processing begins, by default. The -a option causes sed to delay opening each file until a command containing the related ``w'' function is applied to a line of input.
-e command  Append the editing commands specified by the command argument to the list of commands.
-f command_file  Append the editing commands found in the file command_file to the list of commands. The editing commands should each be listed on a separate line.
-i extension  Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved. It is not recommended to give a zero-length extension when in-place editing files, as you risk corruption or partial content in situations where disk space is exhausted, etc.
-l  Make output line buffered.
-n  By default, each line of input is echoed to the standard output after all of the commands have been applied to it. The -n option suppresses this behavior.

The form of a sed command is as follows:
[address[,address]]function[arguments]
Whitespace may be inserted before the first address and the function portions of the command.
Normally, sed cyclically copies a line of input, not including its terminating newline character, into a pattern space, (unless there is something left after a ``D'' function), applies all of the commands with addresses that select that pattern space, copies the pattern space to the standard output, appending a newline, and deletes the pattern space.
Some of the functions use a hold space to save all or part of the pattern space for subsequent retrieval.
Anything else?
Is my problem clear? Thanks.
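A side note on the -i option quoted from the man page: the script above substitutes matching lines with an empty string (s/…//), which leaves blank lines behind in the file, and it copies the backup by hand. As a rough sketch, the d function can delete the matching lines outright, and -i with an extension keeps the backup in the same step. The function name delete_doubled_endings is made up for illustration, and the pattern is written as a basic regular expression, where backreferences are standard.

```bash
#!/bin/bash
# delete_doubled_endings is a hypothetical name used only for this sketch.
# It deletes, in place, every line whose last two characters are identical
# and leaves the untouched original next to it as FILE.bak.
delete_doubled_endings() {
    # -i.bak (extension attached to the flag) is accepted by both BSD and
    # GNU sed; the d function removes the whole line instead of blanking it.
    sed -i.bak -e '/\(.\)\1$/d' "$1"
}

# Usage:
#   delete_doubled_endings BRUTE02.dat
```

Because the lines are deleted rather than emptied, no extra grep -v "^$" pass is needed afterwards.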
I don't know the bash shell well, so I can't tell what the failure is.
This is just an observation about the regexes as I understand them (I may be wrong).
The '*' mode regex looks OK:
^.*(.)\1+$ matches lines that end with the same letter repeated.
But the literal mode might not do what you think.
Current: ^.*[$1]$ is meant to match lines that end with the literal string.
This shouldn't use a character class, which matches any single one of the listed characters.
Change it to: ^.*$1$
Note, though, that the string in $1 should be escaped before it goes into the regex, in case it contains any regex metacharacters.
Otherwise, do you intend to have a character class?
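To illustrate the escaping point, here is a minimal sketch that assumes the suffix will be used in an extended regular expression (sed -E or grep -E). The helper name escape_ere is my own invention, not something from the original script. It puts a backslash in front of the characters that are special in an ERE, so a suffix such as a.b matches only a literal "a.b".

```bash
#!/bin/bash
# escape_ere is a hypothetical helper name, not part of the original script.
# It backslash-escapes the characters that are special in an extended
# regular expression: \ . * ^ $ + ? ( ) { | and [
# A closing ] or } needs no escaping once its opening partner is escaped.
escape_ere() {
    # A comma is used as the s delimiter so no character in the bracket
    # expression can be mistaken for it.
    printf '%s' "$1" | sed 's,[[\.*^$+?(){|],\\&,g'
}

# Example: print only the lines that do NOT end with the literal suffix.
#   grep -Ev "$(escape_ere 'a.b')\$" file
```

With this in place the literal mode could build its pattern as "$(escape_ere "$1")\$" instead of interpolating $1 directly.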
perl -ne '
BEGIN {$arg = shift; $re = $arg eq "*" ? qr/([[:alpha:]])\1$/ : qr/$arg$/}
/$re/ && next || print
'
Example:
echo "aa
ab
ac
ad
ee
ef
ff" | perl -ne '
BEGIN {$arg = shift; $re = $arg eq "*" ? qr/([[:alpha:]])\1$/ : qr/$arg$/}
/$re/ && next || print
' '*'
produces
ab
ac
ad
ef
A possible issue:
- When you put * on the command line unquoted, the shell replaces it with the names of all the files in your current directory, so your $1 will never equal *. Quote it ('*') to pass it through literally.
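A quick way to see this for yourself, as a small sketch (show_first_arg is a throwaway name made up for the demonstration):

```bash
#!/bin/bash
# show_first_arg is a hypothetical helper used only for this demonstration.
# It prints the first argument exactly as the shell delivered it.
show_first_arg() {
    printf '%s\n' "$1"
}

# Quoted: the function receives a literal asterisk.
#   show_first_arg '*'
# Unquoted: the shell expands * to the file names in the current directory
# before the function runs, so $1 is the first of those names (when the
# directory is not empty).
#   show_first_arg *
```

The same applies to the script itself: calling it as ./script.sh '*' folder/to/files/ is what lets the "*" branch of the case statement match.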
And some tips:
- You can replace this:
# list files in current dir
list=`ls BRUTE*`
for i in $list
with:
for i in BRUTE*
- And this:
cat $FILE | grep -v "^$"
with:
grep -v "^$" $FILE
Besides the possible issue, I can't see anything jumping out at me. What do you mean by "clean"? Can you give an example of what a file should look like before and after, and what the command would look like?
This was the problem!
grep '\(.\)\1[^\r\n]$' *
On Mac OS X, ( ) { } etc. must be quoted with a backslash!
Solved, thanks.
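For anyone landing here later, a rough sketch that pulls the points from this thread together: quote the '*', delete lines instead of blanking them, keep a backup, and use backslash-escaped groups in a basic regular expression. The function name clean_file and its argument order are assumptions of mine, not the original poster's final script.

```bash
#!/bin/bash
# Sketch only: clean_file and its argument order are hypothetical.
# Removes from FILE every line that ends with SUFFIX, or with a doubled
# character when SUFFIX is '*'. The original is kept as FILE.bak.
clean_file() {
    local suffix="$1" file="$2" pattern status

    if [ "$suffix" = '*' ]; then
        # BRE: any character followed by itself at the end of the line.
        pattern='\(.\)\1$'
    else
        # Backslash-escape the BRE metacharacters [ \ . * ^ $ so that the
        # suffix is matched literally, then anchor it to the end of line.
        pattern="$(printf '%s' "$suffix" | sed 's,[[\.*^$],\\&,g')\$"
    fi

    cp -- "$file" "$file.bak" || return 1

    # grep -v keeps the lines that do NOT match. Exit status 1 only means
    # that no line was kept, so it is not treated as an error here.
    grep -v -- "$pattern" "$file.bak" > "$file"
    status=$?
    [ "$status" -le 1 ]
}

# Possible use inside script.sh, called as: ./script.sh '*' folder/to/files/
#   for f in "$2"/BRUTE*; do clean_file "$1" "$f"; done
```

With the sample data from the question, '*' mode would leave ab, ac, ad and ef, while 'ff' mode would remove only the ff line.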