
Strip JavaScript on(whatever) events from code using PHP

Developer https://www.devze.com 2022-12-29 04:35 Source: Internet

I want to strip out all JavaScript from a small snippet (4-6 lines) of HTML. I've read on here before that it's best not to use regex on HTML, so if anybody knows a better way, please advise.

So, for example, I have the following code:

<a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a>
<p onfocus="doSomethingAmazing();"></p>

Now, in PHP, I want to replace the on(whatever event it is) attribute with just an empty space.


Use the HTML Purifier library to strip things like JavaScript and plugins from the code. It's much better than a blacklist-based regex approach because it uses a full HTML parser and a whitelist to clean the HTML.
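
To make the parser-plus-whitelist idea concrete, here is a minimal sketch that uses PHP's built-in DOM extension rather than HTML Purifier itself. The function name `keepAllowedAttributes` and the default allow-list are assumptions made for illustration. It only filters attributes: it does not remove `<script>` elements or check `href` values for `javascript:` URLs, so treat it as an illustration of the approach, not a replacement for a real sanitizer.

```php
<?php
// Hypothetical helper: parse the fragment, then drop every attribute
// whose name is not on the allow-list (so all on* handlers disappear).
function keepAllowedAttributes(string $html, array $allowed = ['href', 'title', 'alt']): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a div so it has one known root element.
    // Note: loadHTML() assumes ISO-8859-1 unless the markup declares a charset.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        // Walk backwards because removing an attribute shrinks the map.
        for ($i = $element->attributes->length - 1; $i >= 0; $i--) {
            $name = $element->attributes->item($i)->nodeName;
            if (!in_array(strtolower($name), $allowed, true)) {
                $element->removeAttribute($name);
            }
        }
    }

    // Serialize only the children of the wrapper div, not the implied html/body.
    $root = $doc->getElementsByTagName('div')->item(0);
    $output = '';
    foreach ($root->childNodes as $child) {
        $output .= $doc->saveHTML($child);
    }
    return $output;
}

echo keepAllowedAttributes('<a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a>'), "\n";
```

Because the list names what is allowed rather than what is forbidden, an attribute nobody thought of is removed by default, which is the advantage a whitelist has over a blacklist.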


I built a regex for this some time ago; it looks a bit scary, though :). Here is the pure regex; you may need to additionally escape special characters to match your language's requirements.

(\son[a-z]+\s*=\s*"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"(?=[^<]*?>))|(\son[a-z]+\s*=\s*'[^'\\\r\n]*(?:\\.[^'\\\r\n]*)*'(?=[^<]*?>))

Here is the escaped version (following Java string conventions), which you should be able to use as a string literal.

(\\son[a-z]+\\s*=\\s*\"[^\"\\\\\\r\\n]*(?:\\\\.[^\"\\\\\\r\\n]*)*\"(?=[^<]*?>))|(\\son[a-z]+\\s*=\\s*'[^'\\\\\\r\\n]*(?:\\\\.[^'\\\\\\r\\n]*)*'(?=[^<]*?>))

It looks only inside tags and takes escaped quotes inside event handlers into account. I'm sure it is not 100% bulletproof, though.
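
Since the question asks about PHP, here is a rough sketch of how the unescaped pattern above might be applied with `preg_replace`. The function name `stripEventAttributes`, the `~` delimiters, and the case-insensitive `i` flag are additions for this sketch, not part of the original regex. A nowdoc string is used so the backslashes need no extra escaping in PHP.

```php
<?php
// Hypothetical helper name; the pattern body is the unescaped regex from the answer.
function stripEventAttributes(string $html): string
{
    // Nowdoc keeps every backslash literal, so no PHP-level escaping is needed.
    // The "~" delimiters and the "i" flag are assumptions added for this sketch.
    $pattern = <<<'REGEX'
~(\son[a-z]+\s*=\s*"[^"\\\r\n]*(?:\\.[^"\\\r\n]*)*"(?=[^<]*?>))|(\son[a-z]+\s*=\s*'[^'\\\r\n]*(?:\\.[^'\\\r\n]*)*'(?=[^<]*?>))~i
REGEX;

    // Replace each matched handler (including its leading whitespace) with nothing.
    $result = preg_replace($pattern, '', $html);
    if ($result === null) {
        // preg_replace returns null on a PCRE error; fail loudly rather than
        // hand back unsanitized markup.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result;
}

echo stripEventAttributes('<p onfocus="doSomethingAmazing();"></p>'), "\n";
// prints: <p></p>
```

Note that the pattern only matches quoted attribute values, so an unquoted handler such as `onclick=doIt()` would pass through untouched.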
