Folks,
I have a file that contains LDAP entries and I want to remove the "version: 1" lines from the second occurrence onward. I know sed can do things like this, but since I am very new to it, I don't know how to proceed. This is a Solaris 10 machine and the file looks as follows:
version: 1
dn: uid=tuser1,ou=people,o=example.com,o=isp
cn: tuser1
uidNumber: 3
gidNumber: 3
homeDirectory: /export/home/tuser1
loginShell: /bin/sh
objectClass: posixAccount
objectClass: shadowAccount
objectClass: account
objectClass: top
uid: tuser1
shadowLastChange:
userPassword:
version: 1
dn: uid=tuser2,ou=people,o=example.com,o=isp
uidNumber: 20
cn: tuser1
gidNumber: 3
homeDirectory: /export/home/tuser2
loginShell: /bin/sh
objectClass: posixAccount
objectClass: shadowAccount
objectClass: account
objectClass: top
uid: tuser1
shadowLastChange:
userPassword:
version: 1
dn: uid=tuser3,ou=people,o=example.com,o=isp
uidNumber: 10
cn: tuser3
gidNumber: 3
homeDirectory: /export/home/tuser3
loginShell: /bin/sh
objectClass: posixAccount
objectClass: shadowAccount
objectClass: account
objectClass: top
uid: tuser3
shadowLastChange:
userPassword:
version: 1
dn: uid=loperp,ou=people,o=example.com,o=isp
uid: loperp
userPassword:
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top
sn: pop
cn: loper
version: 1
dn: uid=tuser4,ou=people,o=example.com,o=isp
userPassword:
uid: tuser4
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top
sn: User4
cn: Test
With GNU sed
sed -ni '0,/version: 1/{p; d}; /version: 1/!p' ldap.txt
EDIT: This was initially wrong. When the first line wasn't version, it printed duplicates.
The GNU version is simpler. It prints (p) from the beginning until the first line matching the version regex, both inclusive. Also, for each line in that range, after printing we delete the pattern space and start a new cycle (d). Basically, this means go to the beginning of the script and to the next line (this avoids double printing). Unlike the standard 1,/regex/, if the first line matches, 0,/regex/ will not continue to another matching line.
If we haven't d'ed (so we're after the first version: 1), we then simply print every line that doesn't match the regex (!).
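The difference between the two range forms can be checked on a small sample before touching the real file. This is only a sketch: the sample data and variable names are made up for illustration, -i is left out so nothing is modified, and it assumes GNU sed is the sed on the PATH.

```shell
# Hypothetical sample: the very first line is already "version: 1".
sample='version: 1
dn: uid=a
version: 1
dn: uid=b'

# GNU-only 0,/re/ may end its range on line 1, so later matches are dropped.
gnu_out=$(printf '%s\n' "$sample" | sed -n '0,/version: 1/{p; d}; /version: 1/!p')

# With 1,/re/ the end regex is first tested on line 2, so the range runs on
# to the second "version: 1" and prints it as well.
std_out=$(printf '%s\n' "$sample" | sed -n '1,/version: 1/{p; d}; /version: 1/!p')

printf 'with 0,/re/:\n%s\n\nwith 1,/re/:\n%s\n' "$gnu_out" "$std_out"
```

With 0,/re/ only the first "version: 1" survives; with 1,/re/ the sample comes back unchanged, which is why the GNU form is needed when the file starts with the version line.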
With standard sed:
sed -ni 'p; /version: 1/ b nov; d; :nov /version: 1/!p; n; b nov' ldap.txt
This begins by simply printing every line (p). After that print, if we match the regex, we branch to the nov (no version) label; the label name is up to us. If we do not branch, we delete the pattern space (d) and start a new cycle (next line, beginning of script). In nov, we print the line if it does not match (same as GNU). We then read the next line (n) and branch back to nov. This loop continues until the end.
I (Jonathan Leffler) can confirm @kuti's observations on Solaris 10 standard 'sed'; what works is:
/bin/sed -n 'p
/version: 1/ b nov
d
:nov
/version: 1/!p
n
b nov' ldap.txt
The 'semi-colons in lieu of newlines' trick does not seem to work universally with Solaris 'sed'. Specifically, at the least, there cannot be a semi-colon after any use of a label.
This seems to work:
/bin/sed -n 'p; /version: 1/ b nov
d; :nov
/version: 1/!p; n; b nov' ldap.txt
(I can't think how to present the fix in a comment - the multiline formatting is crucial here.)
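For a quick sanity check of the newline-separated form, something like the following can be run on a throwaway sample. The sample lines and the variable names are invented for illustration; the first line is deliberately not a version line, which is the case the earlier EDIT mentions.

```shell
# Hypothetical sample in which the first line is NOT "version: 1".
sample='# exported entries
version: 1
dn: uid=a
version: 1
dn: uid=b'

# Same script as above, one command per line, reading from a pipe
# instead of ldap.txt so no file is touched.
result=$(printf '%s\n' "$sample" | sed -n 'p
/version: 1/ b nov
d
:nov
/version: 1/!p
n
b nov')

printf '%s\n' "$result"
```

The leading line and the first "version: 1" are kept once each, and the second "version: 1" is dropped.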
A simple answer uses awk:
awk '{ if ($0 ~ /^version: 1$/) { if (count++ == 0) print; }
else print;
}'
This assumes that you really mean you want only the first 'version: 1' line and don't mind keeping multiple 'version: 2' lines, etc.
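To illustrate that assumption, here is the same awk script fed a made-up sample that also contains 'version: 2' lines (the sample and variable names are hypothetical):

```shell
# Hypothetical sample mixing "version: 1" and "version: 2" lines.
sample='version: 1
dn: uid=a
version: 2
version: 1
dn: uid=b
version: 2'

# Only exact "version: 1" lines are counted; everything else is printed.
result=$(printf '%s\n' "$sample" | awk '{ if ($0 ~ /^version: 1$/) { if (count++ == 0) print; }
else print;
}')

printf '%s\n' "$result"
```

Both 'version: 2' lines come through untouched, and only the second 'version: 1' is removed.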
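Here's another awk version. Note that it blanks the later "version: 1" lines rather than deleting them, so an empty line is left in their place: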
awk '/version: 1/{c++}c>1{gsub("version: 1","")}1' file
Using ed (see man 1 ed), we can mark the line containing the first match and add 1 to that mark to get the start of the range to clean up:
# 'm+1,$
# ... which creates a line address space of:
# /first line matched + 1/,/last line/
# http://wiki.bash-hackers.org/doku.php?id=howto:edit-ed
[[ $(grep -c -m 1 '^version: 1' file) -eq 1 ]] && \
cat <<-'EOF' | sed -e 's/^ *//' -e 's/ *$//' | ed -s file
H
/^version: 1/km
'm+1,$g/^version: 1/d
wq
EOF