Remove a variety of lines in a text file

Developer https://www.devze.com 2022-12-09 21:21 Source: Internet
I've been trying to implement a bash script that reads from WordNet's online database, and I've been wondering whether there is a way to remove several different lines from a text file with one command.

Example FileDump:

**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
**** Verb ****
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
**** Adjective ****
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

I just need to remove the lines which describe aspects of grammar e.g.

**** Noun ****
**** Verb ****
**** Adjective ****

So that I have a clean file with only definitions of the words:

(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"

The * symbols around the grammatical terms are tripping me up in sed.


If you want to select whole lines from a file based only on the content of those lines, grep is probably the most suitable tool available. However, some characters, such as your stars, have special meanings to grep, so they need to be "escaped" with a backslash. This will print just the lines starting with four stars and a space:

grep "^\*\*\*\* " textfile

However, you want to keep the lines that don't match, so you need grep's -v option, which inverts the match and prints only the lines that do not match the pattern. Keep the ^ anchor so that only lines starting with the stars are dropped:

grep -v "^\*\*\*\* " textfile

That should give you what you want.
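
As a minimal sketch of the approach above, the following writes a shortened copy of the sample dump to a temporary file and filters it. The temporary file and the variable names (dump, cleaned) are assumptions for illustration only and are not part of the original question.

```shell
# Create a small sample file (a shortened stand-in for the WordNet dump).
dump=$(mktemp)
cat > "$dump" <<'EOF'
**** Noun ****
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting)
**** Verb ****
(v)run (move fast by using one's feet)
EOF

# -v inverts the match: keep every line that does NOT start with "**** ".
# Single quotes stop the shell from touching the backslashes or stars.
cleaned=$(grep -v '^\*\*\*\* ' "$dump")
printf '%s\n' "$cleaned"

rm -f "$dump"
```

Redirecting the grep output to a new file (rather than to the file being read) should leave you with the cleaned definitions on disk.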


sed '/^\*\{4\} .* \*\{4\}$/d'

or a bit looser

sed '/^*\{4\}/d'
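
To see where the two expressions differ, here is a small sketch run against inline sample text. The line beginning "****footnote" is a hypothetical addition, not something from the original dump, and the looser pattern is written with an escaped star, which should behave the same as the unescaped form above.

```shell
# Sample input; the "****footnote" line is hypothetical, added only to
# show the difference between the strict and the loose pattern.
sample=$(printf '%s\n' \
  '**** Noun ****' \
  '(n)hello, hullo (an expression of greeting)' \
  '****footnote without a closing marker' \
  '(v)run (move fast)')

# Strict: delete only lines of the form "**** <text> ****".
strict=$(printf '%s\n' "$sample" | sed '/^\*\{4\} .* \*\{4\}$/d')

# Loose: delete every line that starts with four stars.
loose=$(printf '%s\n' "$sample" | sed '/^\*\{4\}/d')

printf 'strict:\n%s\n\nloose:\n%s\n' "$strict" "$loose"
```

The strict form keeps the hypothetical footnote line, while the loose form removes it along with the headings.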


 sed 's/^*.*//g' test | grep .


# awk '!/^\*\*+/' file
(n)hello, hullo, hi, howdy, how-do-you-do (an expression of greeting) "every morning they exchanged polite hellos"
(v)run (move fast by using one's feet, with one foot off the ground at any given time) "Don't run--you'll be out of breath"; "The children ran to the store"
(adj)running ((of fluids) moving or issuing in a stream) "as mountain stream with freely running water"; "hovels without running water"