I have the following data on one line:
<a href="#page-metadata-start" class="assistive">Go to start of metadata</a>
<div id="page-metadata-end" class="assistive"></div>
<fieldset class="hidden parameters">
<input type="hidden" title="browsePageTreeMode" value="view">
</fieldset>
<div class="wiki-content">
<p>(openissues)81(/openissues)</p><p>(assignstoday)0(/assignstoday)</p><p>(assignsweek)2(/assignsweek)</p><p>(replyissues)6(/replyissues)</p><p>(wrapissues)26(/wrapissues)</p>
</div>
I'd like to grab the value for "openissues", for example, but I can't figure out how to retrieve it properly. One of the things I tried is the following command:
sed -n '/(assignstoday)/,/(\/assignstoday)/p' ~/test.txt
Any help?
sed 's/.*(openissues)\(.*\)(\/openissues).*/\1/' test.txt
A quick hack that may meet your edited requirement:
sed -n '/openissues/p' test.txt | sed 's/.*(openissues)\(.*\)(\/openissues).*/\1/'
But regexes are really not the way to go when parsing HTML.
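The filter-then-substitute pipeline above can also be collapsed into a single sed call: `-n` suppresses the default printing, and the `p` flag on the substitution prints only the lines where it matched. Below is a minimal sketch, assuming the data looks like the sample in the question. The `sample` and `open_issues` variables are illustrative names, and `[^(]*` is swapped in for `.*` so the capture cannot run past the next parenthesis.

```shell
#!/bin/sh
# Sample line copied from the question (shortened); "sample" is an illustrative name.
sample='<p>(openissues)81(/openissues)</p><p>(assignstoday)0(/assignstoday)</p>'

# -n          : do not print lines by default
# trailing p  : print only lines where the substitution succeeded
# [^(]*       : capture everything up to the next "(" (the closing marker)
open_issues=$(printf '%s\n' '<div class="wiki-content">' "$sample" '</div>' |
    sed -n 's/.*(openissues)\([^(]*\)(\/openissues).*/\1/p')

echo "$open_issues"   # prints 81
```

Lines that do not contain the marker, such as the surrounding `<div>` tags, produce no output at all, so no separate filtering step is needed.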
I'd try
VALUE=openissues
sed 's@.*('"$VALUE"')\([^(]\+\).*@\1@' test.txt
That is, replace the whole line with just the value you are searching for.
Edit: Now I see Neil's answer, which is practically the same, so accept his. I'm leaving my answer up because it lets you choose which value to extract.
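The parameterized approach can be wrapped in a small shell function so the field name is passed as an argument. This is only a sketch: `extract_value` is a hypothetical helper name, it assumes the name contains no regex metacharacters and no `@`, and it uses `[^(]*` rather than the GNU-specific `\+` so it should also work with other POSIX sed implementations.

```shell
#!/bin/sh
# extract_value NAME [FILE...]
# Hypothetical helper: prints the text between (NAME) and (/NAME).
# Assumes NAME contains no regex metacharacters and no "@" (the s delimiter).
extract_value() {
    name=$1
    shift
    # With no FILE arguments left, sed reads standard input.
    sed -n 's@.*('"$name"')\([^(]*\)(/'"$name"').*@\1@p' "$@"
}

# Sample line copied from the question (shortened).
sample='<p>(openissues)81(/openissues)</p><p>(assignstoday)0(/assignstoday)</p><p>(assignsweek)2(/assignsweek)</p>'

printf '%s\n' "$sample" | extract_value openissues    # prints 81
printf '%s\n' "$sample" | extract_value assignsweek   # prints 2
```

Against a file it would be called as `extract_value openissues test.txt`.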