Any regex to replace broken HTML attribute like this?

Developer https://www.devze.com 2023-01-29 17:18 Source: Internet
I am using PHP and would love to make some automated functions which will replace broken HTML attributes like

title="TV 40" is better"

with

title="TV 40&quot; is better"

So, my question is: How can I use a regex to find the second double quote?


You could use this instead of a regex:

$value = "HTML CODE";
$value = htmlentities($value, ENT_QUOTES, 'UTF-8');

I hope this helps you. Correct me if I'm wrong.
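
As a minimal sketch of that suggestion, the escaping can be applied to the attribute value at the point where the HTML is generated, so the broken attribute never appears. The helper name `title_attr` is hypothetical and not part of the original post; `htmlentities` and `ENT_QUOTES` are standard PHP.

```php
<?php
// Hypothetical helper (an assumption for illustration): builds a title
// attribute from raw text, escaping quotes so the attribute stays well-formed.
function title_attr(string $text): string
{
    // ENT_QUOTES makes htmlentities convert both double and single quotes.
    return 'title="' . htmlentities($text, ENT_QUOTES, 'UTF-8') . '"';
}

echo title_attr('TV 40" is better'), "\n";
// prints: title="TV 40&quot; is better"
```

Note that this only helps when you control the raw value. Running it over markup that is already assembled would also escape the quotes that delimit the attribute.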


I am somewhat confused about what you are trying to accomplish. Maybe a bigger example would be helpful.

  • Do you have an html document that you wrote that has mistakes in it that you want to fix?
  • Are you trying to write a program that will fix any broken html?

Some extra information on the context of your question could be helpful.


There are many cases that you might be asking about but in vim this works for me (for the example you provided):

:%s/"\(.*\)"\(.*\)"/"\1\&quot;\2"/g

It will change this:

title="TV 40" is better" title="TV 40" is better"

title="TV of 40 inch, spelled also as, 40" is better"

title="TV 40 is better"

To this:

title="TV 40" is better" title="TV 40&quot; is better"

title="TV of 40 inch, spelled also as, 40&quot; is better"

title="TV 40 is better"

However it will break something like this (that is already working):

title="TV 40 is better" title="TV 40 is better"

As I mentioned before, I think giving us some more context on what you are trying to solve would be helpful.
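
Since the question is about PHP, the same idea can be tried there. This is only a sketch under a strong assumption: each input line holds a single title attribute whose value runs from the first double quote to the last one on the line. The function name `fix_title_quotes` is made up for illustration.

```php
<?php
// Sketch only: assumes $line contains exactly one title="..." attribute and
// that the attribute value ends at the last double quote on the line.
function fix_title_quotes(string $line): string
{
    $fixed = preg_replace_callback(
        '/title="(.*)"/',
        function (array $m): string {
            // The greedy capture holds everything between the outer quotes;
            // any double quote left inside it is a stray one, so escape it.
            return 'title="' . str_replace('"', '&quot;', $m[1]) . '"';
        },
        $line
    );

    // preg_replace_callback returns null on failure; fall back to the input.
    return $fixed ?? $line;
}

echo fix_title_quotes('title="TV 40" is better"'), "\n";
// prints: title="TV 40&quot; is better"
```

Like the Vim substitution, this mangles a line that contains two well-formed attributes, because the greedy match treats everything up to the last quote as one value.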


On a more general note, it is usually a bad idea to try to parse HTML with regex. There are too many things that can go wrong. HTML is not a regular language, so regular expressions cannot parse it in the general case. The only way around this is to know something special about the HTML, for example that it is always generated in a fixed, predetermined format, or to look only for very specific things in a page formatted that way.

According to Jeff Atwood, if you try to parse HTML with regex, "you're succumbing to the temptations of the dark god Cthulhu's … er … code". See this page.

This answer also gives some good examples of why it is a bad idea to parse html with regex.
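
For HTML that is already well-formed, a parser is the usual alternative to a regex. The sketch below uses PHP's bundled DOM extension to read title attributes; the function name `titles_from_html` and the sample markup are assumptions for illustration. It does not repair the broken attribute from the question, since a parser has to guess where a value with a stray quote ends.

```php
<?php
// Sketch: collect title attribute values with the DOM extension, not a regex.
function titles_from_html(string $html): array
{
    $doc = new DOMDocument();

    // Keep libxml parser warnings internal instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    // Select every element that carries a title attribute.
    foreach ((new DOMXPath($doc))->query('//*[@title]') as $node) {
        $titles[] = $node->getAttribute('title');
    }
    return $titles;
}

print_r(titles_from_html('<p><img src="tv.png" title="TV 40&quot; is better"></p>'));
```

The parser decodes `&quot;` back to a plain double quote, so the value comes out as the text the author intended.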
