lxml cleaner with a custom tag?

Posted on https://www.devze.com, 2023-01-20 12:02. Source: the web
I want to use the lxml Cleaner to get rid of all HTML, and then use a regex to autolink something:

[ABC] -> <a href="bah bah bah">ABC</a>

What is the right way to handle this without introducing XSS or similar vulnerabilities?


Maybe using Markdown with inline HTML disabled would be suitable? The Python-Markdown module is quite mature.

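Check out the "safe mode" section in the docs for more info on stripping out inline HTML. Note that the safe_mode option was deprecated in Python-Markdown 2.6 and removed in 3.0, so with current versions the rendered output should instead be passed through a dedicated HTML sanitizer.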

Depending on what you want, something like py-wikimarkup may be more appropriate.

Using a custom regexp is probably not a great idea, because

  • you'll have to explain the rules to people who might already be familiar with markdown/WikiText
  • you'll have to provide a way to escape text, e.g. for people who really want to write [ABC]
  • you'll have to fix any bugs, including security issues
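
If you go with a custom rule anyway, the points above suggest an order of operations: escape all of the input first, then apply the link rule to the escaped text, and give users a way to write a literal [ABC]. Below is a minimal sketch of that idea using only the standard library. The names autolink and make_url, the /wiki/ URL scheme, and the backslash escape convention are assumptions made up for illustration; they are not part of lxml or of the original question.

```python
import html
import re

# Hypothetical rule: [ABC] becomes a link, \[ABC] stays literal text.
# Only ASCII letters and digits are accepted inside the brackets, so
# nothing that html.escape() produces (such as &quot;) can end up
# inside a generated tag.
LINK_RE = re.compile(r"(\\)?\[([A-Za-z0-9]+)\]")


def make_url(name):
    # Hypothetical URL scheme, purely for illustration.
    return "/wiki/" + name


def autolink(raw):
    # Escape first, so any markup in the input becomes inert text.
    escaped = html.escape(raw, quote=True)

    def repl(match):
        name = match.group(2)
        if match.group(1):
            # A leading backslash marks a literal "[ABC]"; drop the backslash.
            return "[" + name + "]"
        return '<a href="{}">{}</a>'.format(html.escape(make_url(name)), name)

    return LINK_RE.sub(repl, escaped)


if __name__ == "__main__":
    print(autolink("<script>alert(1)</script> see [ABC]"))
```

The key design choice is that the only HTML in the result is the anchor tag the function builds itself, from a name restricted to a narrow character set. This still leaves the maintenance burden described above, which is why an established markup library may be the better option.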