Does anyone have any experience with this?
You could first get all the DOM elements and strip their content and attributes. Once that is done, convert all tag names to a single case (lower or upper), and then run any of the well-known string matching algorithms, such as Knuth-Morris-Pratt, over the resulting tag sequence.
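A rough sketch of that idea in Python, assuming the goal is to find where the tag structure of one fragment occurs inside a larger page. The names `TagSkeleton`, `skeleton`, and `kmp_find` are made up for illustration, and the standard library's `html.parser` stands in for a real DOM. It also matches on whole tag names as tokens rather than on one long string, which is a small variation on the suggestion above.

```python
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Collect tag names only, dropping text content and attributes."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # attrs is ignored on purpose; HTMLParser already lowercases tag
        # names, lower() is kept to make the normalization step explicit.
        self.tokens.append(tag.lower())

    def handle_endtag(self, tag):
        self.tokens.append("/" + tag.lower())


def skeleton(html):
    """Return the sequence of opening/closing tag names in html."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    return parser.tokens


def kmp_find(text, pattern):
    """Knuth-Morris-Pratt: index of the first match of pattern in text, or -1."""
    if not pattern:
        return 0
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # Scan the text, reusing the failure table instead of backtracking.
    k = 0
    for i, token in enumerate(text):
        while k and token != pattern[k]:
            k = failure[k - 1]
        if token == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


page = skeleton('<DIV class="a"><ul><li>one</li><li>two</li></ul><p id="x">hi</p></DIV>')
fragment = skeleton('<UL><LI>x</LI><LI>y</LI></UL>')
position = kmp_find(page, fragment)
```

Here `position` is an index into the tag sequence, not a character offset in the original markup. If you need the latter, you would have to record offsets alongside the tokens (for example with `HTMLParser.getpos()`).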