Removing multiple tags in SGML

Developer https://www.devze.com 2023-01-15 05:06 Source: Internet
I have an SGML file like this:

<p><p><data>sdlksdskdmskdmsamdakmdksam<p></data>...

My question is how to remove one <p> tag and keep the other one intact. Which regular expression would be suitable?


If your SGML can be processed as XML, then XProc is a good technology for this kind of thing, using a single step such as:

<p:unwrap match="p[parent::p]"/>

(This assumes you want to unwrap every p element whose parent is also a p, so that no p directly wraps another p.)
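If XProc is not available, the same unwrap idea can be sketched in Python with the standard library's xml.etree.ElementTree. This is only a rough illustration, not the XProc implementation: the function names unwrap and unwrap_self_nested are made up for this example, and it assumes the input is already well-formed XML (the snippet in the question, as shown, would not parse because one <p> is never closed).

```python
import xml.etree.ElementTree as ET


def unwrap(parent, child):
    """Replace `child` inside `parent` with the child's own text and sub-elements."""
    idx = list(parent).index(child)
    prev = parent[idx - 1] if idx > 0 else None

    def append_text(text):
        # Text goes on the previous sibling's tail, or on the parent's text
        # when the child was the first element.
        if not text:
            return
        if prev is None:
            parent.text = (parent.text or "") + text
        else:
            prev.tail = (prev.tail or "") + text

    append_text(child.text)
    parent.remove(child)
    kids = list(child)
    for offset, kid in enumerate(kids):
        parent.insert(idx + offset, kid)
    if kids:
        # Text that followed the child now follows its last sub-element.
        kids[-1].tail = (kids[-1].tail or "") + (child.tail or "")
    else:
        append_text(child.tail)


def unwrap_self_nested(root, tag="p"):
    """Unwrap every `tag` element whose parent is also `tag` (hypothetical helper)."""
    changed = True
    while changed:
        changed = False
        for parent in root.iter(tag):
            nested = [c for c in parent if c.tag == tag]
            if nested:
                unwrap(parent, nested[0])
                changed = True
                break  # the tree changed, so restart the traversal
    return root


if __name__ == "__main__":
    doc = ET.fromstring("<doc><p><p><data>text</data></p></p></doc>")
    unwrap_self_nested(doc)
    print(ET.tostring(doc, encoding="unicode"))
```

Restarting the traversal after each change is slow on large documents, but it avoids modifying the tree while iterating over it.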

You definitely do not want to process SGML/XML with regexps unless you are 100% certain you will be dealing with a subset which has a certain well-specified lexical form. Think for example how you'd process stuff with comments using a regexp:

<p><!-- <p> commented out--><foo><p/><p/></foo></p>

!!
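To make the comment problem concrete, here is a small Python comparison on the sample above. The regular expression is just one plausible naive attempt, not something from the question, and the variable names are made up for the example. The regexp counts the commented-out <p> as if it were a real tag, while an XML parser does not.

```python
import re
import xml.etree.ElementTree as ET

sample = "<p><!-- <p> commented out--><foo><p/><p/></foo></p>"

# Naive attempt: match every opening or self-closing <p> tag as plain text.
naive_hits = re.findall(r"<p\s*/?>", sample)

# Parser-based count: ElementTree's default parser skips comments,
# so only real p elements are seen.
parsed_hits = list(ET.fromstring(sample).iter("p"))

print(len(naive_hits))   # 4, including the <p> inside the comment
print(len(parsed_hits))  # 3, the outer p and the two empty p elements
```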
