I have files with the format:
ATOM 3736 CB THR A 486 -6.552 153.891 -7.922 1.00115.15 C
ATOM 3737 OG1 THR A 486 -6.756 154.842 -6.866 1.00114.94 O
ATOM 3738 CG2 THR A 486 -7.867 153.727 -8.636 1.00115.11 C
ATOM 3739 OXT THR A 486 -4.978 151.257 -9.140 1.00115.13 O
HETATM10351 C1 NAG B 203 33.671 87.279 39.456 0.50 90.22 C
HETATM10483 C1 NAG Z 702 28.025 104.269 -27.569 0.50 92.75 C
ATOM 3736 CB THR X 486 -6.552 86.240 7.922 1.00115.15 C
ATOM 3737 OG1 THR X 486 -6.756 85.289 6.866 1.00114.94 O
ATOM 3738 CG2 THR X 486 -7.867 86.404 8.636 1.00115.11 C
ATOM 3739 OXT THR X 486 -4.978 88.874 9.140 1.00115.13 O
HETATM10351 C1 NAG Y 203 33.671 152.852 -39.456 0.50 90.22 C
HETATM10639 C2 FUC C 402 -48.168 162.221 -22.404 0.50103.03 C
For each block of lines starting with HETATM, I would like to change column 5 to match that of the preceding ATOM block. That is, in the first HETATM block both B and Z should become A, and in the second HETATM block both Y and C should become X.
A second question, purely out of curiosity since I do not really need it: how would I split the file after each line starting with HETATM, but only if the next line starts with ATOM?
Try this:
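awk '{
    if ($1 == "ATOM") {
        col5 = $5    # remember column 5 of the current ATOM block
    }
    else if ($1 ~ /^HETATM/) {
        # A five-digit serial is fused to the record name (e.g. HETATM10351),
        # which shifts the target column from field 5 to field 4.
        if ($1 == "HETATM") $5 = col5; else $4 = col5
    }
    print
}' < infile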
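A one-liner version (the target is field 4 rather than field 5 when a five-digit serial number is fused to the record name, as in HETATM10351):
awk '$1=="ATOM"{c=$5} /^HETATM/{if ($1=="HETATM") $5=c; else $4=c} 1' file
Note that assigning to a field makes awk rebuild the line with single spaces between fields, so the original column alignment is lost. Passing -F" " does not prevent this, because a single space is already the default field separator.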
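If the input is a standard fixed-width PDB file, one way to keep the spacing intact is to skip field assignment and edit by character position instead. The sketch below assumes the value to copy sits in column 22 on both ATOM and HETATM lines, which is where the PDB layout puts the chain ID; the sample in the question looks whitespace-collapsed, so check this against the real file first. `fix_chain` is a hypothetical helper name, not something from the question.

```shell
# Copy the character in column 22 (assumed to be the PDB chain ID) from each
# ATOM block onto the HETATM lines that follow it, leaving all spacing as is.
fix_chain() {
    awk '
        # Remember the chain of the most recent ATOM line
        /^ATOM/ { chain = substr($0, 22, 1) }
        # Overwrite column 22 only; skip HETATM lines seen before any ATOM line
        /^HETATM/ && chain != "" { $0 = substr($0, 1, 21) chain substr($0, 23) }
        { print }
    ' "$@"
}
```

It can be used as `fix_chain infile > outfile`, or at the end of a pipeline, since awk reads standard input when no file is given.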
Here is my solution, which solves the first problem (replacing the fifth field) while preserving white spaces:
$1 == "ATOM" {
    fifthField = $5
    # Find the character position at which field 5 starts on this line
    fifthField_index = 1
    for (i = 0; i < 4; i++) {
        # Skip to the end of the current field (stop at end of line on short lines)
        for (; fifthField_index <= length($0) && substr($0, fifthField_index, 1) != " "; fifthField_index++) { }
        # Skip the white space that follows it
        for (; substr($0, fifthField_index, 1) == " "; fifthField_index++) { }
    }
    print; next
}
/^HETATM/ {
    # Overwrite the single character at that position; everything else is untouched
    before_fifthField = substr($0, 1, fifthField_index - 1)
    after_fifthField = substr($0, fifthField_index + 1)
    print before_fifthField fifthField after_fifthField
    next
}
1
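It is not the most elegant solution, but it works. It assumes that the fifth field is a single character and that it starts at the same character position on each HETATM line as on the preceding ATOM line, as is the case when the columns are fixed-width.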
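The second question (splitting the file after a HETATM block when the next line is ATOM) is not covered by the answers above. A minimal sketch of one way to do it: remember whether the previous line was a HETATM line, and switch to a new output file whenever an ATOM line follows one. The helper name `split_blocks` and the output names `part1.pdb`, `part2.pdb`, ... are assumptions made for illustration.

```shell
# Usage: split_blocks FILE [PREFIX]
# Writes PREFIX1.pdb, PREFIX2.pdb, ... into the current directory.
split_blocks() {
    awk -v prefix="${2:-part}" '
        BEGIN { n = 1; out = prefix n ".pdb" }
        # An ATOM line directly after a HETATM line starts the next piece
        /^ATOM/ && prev_hetatm { close(out); n++; out = prefix n ".pdb" }
        # Write the line, then record whether it was a HETATM line
        { print > out; prev_hetatm = ($0 ~ /^HETATM/) }
    ' "$1"
}
```

Keeping the file name in a variable (`out`) avoids the ambiguity of writing `print > prefix n ".pdb"`, which different awk implementations parse differently.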