Extract contents of paragraph tag using a Perl one liner

Developer https://www.devze.com 2023-02-11 21:08 Source: Internet

I would like to match the contents of a paragraph tag using a Perl regex one-liner. The paragraph is something like this:

<p style="font-family: Calibri,Helvetica,serif;">Text I want to extract</p>

so I have been using something like this:

perl -nle 'm/<p>($.)<\/p>/ig; print $1' file.html

Any ideas appreciated

thanks

Mandatory link to what happens when you try to parse HTML with regular expressions.

David Dorward's comment suggesting HTML::TreeBuilder is a good one. Another good way to do this is with HTML::DOM:

perl -MHTML::DOM -e 'my $dom = HTML::DOM->new(); $dom->parse_file("file.html"); my @p = $dom->getElementsByTagName("p"); print $p[0]->innerText();'

In the pattern, $ means end-of-string, so ($.) does not capture the text. You also need to allow for the attributes inside the opening p tag, matching them non-greedily:

perl -nle 'print $1 while m/<p\b.*?>(.+?)<\/p>/ig' test.html
