Help write regex that will surround certain text with tags, only if the tag isn't present

Developer https://www.devze.com 2022-12-31 02:09 Source: Internet

I have several posts on a website; all these posts are chat conversations of this type:

AD: Hey!

BC: What's up?

AD: Nothing

BC: Okay

They're marked up as simple paragraphs surrounded by <p> tags.

Using the JavaScript replace function, I want all instances of "AD" at the beginning of a conversation line (i.e., all instances of "AD" at the start of a line followed by a ":") to be surrounded by <strong> tags, but only if the instance isn't already surrounded by a <strong> tag.

What regex should I use to accomplish this? Am I trying to do what the "don't use regex to parse HTML" advice warns against?

The code I'm using is like this:

var posts = document.getElementsByClassName('entry-content');

for (var i = 0; i < posts.length; i++) {
    posts[i].innerHTML = posts[i].innerHTML.replace(/some regex here/,
    'replaced content here');
}

If AD: is always at the start of a line, then the following regex should work, using the m (multiline) flag:

.replace(/^AD:/gm, "<strong>AD:</strong>");

You don't need to check for the existence of <strong> because ^ matches the start of the line, so the regex only matches when the characters immediately following the start of the line are AD:. A line that already begins with <strong>AD: starts with "<" and is skipped.
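As a quick check of that claim, here is a minimal sketch that runs the same replacement on a plain multi-line string rather than on innerHTML. The helper name boldSpeaker and the sample text are made up for illustration, and the sketch assumes each conversation line really does begin right after a line break in the string.

```javascript
// Hypothetical helper: wraps "AD:" in <strong> only where it begins a line.
function boldSpeaker(text) {
    // With the m flag, ^ matches at the start of every line. A line that
    // already starts with "<strong>AD:" begins with "<", so it is not matched.
    return text.replace(/^AD:/gm, "<strong>AD:</strong>");
}

// Made-up sample: one line is already wrapped, one mentions AD: mid-line.
var sample = [
    "AD: Hey!",
    "BC: What's up?",
    "<strong>AD:</strong> Nothing",
    "BC: I asked AD: no reply"
].join("\n");

var result = boldSpeaker(sample);
console.log(result);
```

Only the first line changes here; the already wrapped line and the mid-line "AD:" are left alone, so running the helper a second time returns the same text.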

You're not going against the "Don't use regex to parse HTML" advice because you're not parsing HTML; you're simply replacing one string with another.

An alternative to regex would be to work with ranges, creating a range selecting the text and then using execCommand to make the text bold. However, I think this would be much more difficult and you would likely face differences in browser implementations. The regex way should be enough.

After seeing your comment, the following regex would work fine:

.replace(/<(p|br)>AD:/gm, "<$1><strong>AD:</strong>");
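To see how this second pattern behaves, the sketch below applies it to an HTML string instead of a live page. The function name boldSpeakerInHtml and the sample markup are assumptions for illustration only. Note that the pattern matches only a bare <p> or <br>; an opening tag that carries attributes is left untouched.

```javascript
// Hypothetical helper: bolds "AD:" when it directly follows <p> or <br>.
function boldSpeakerInHtml(html) {
    // $1 re-inserts whichever tag name (p or br) was captured.
    return html.replace(/<(p|br)>AD:/gm, "<$1><strong>AD:</strong>");
}

// Made-up markup: the second paragraph is already wrapped in <strong>.
var html = "<p>AD: Hey!<br>BC: What's up?<br>AD: Nothing</p>" +
           "<p><strong>AD:</strong> Okay</p>";

var out = boldSpeakerInHtml(html);
console.log(out);

// A paragraph with attributes does not match the pattern at all.
var withAttributes = boldSpeakerInHtml('<p class="x">AD: Hi</p>');
console.log(withAttributes);
```

The already wrapped paragraph is skipped because <p> is followed by <strong> rather than by AD:, which is what makes the existing-tag check unnecessary.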

Wouldn't it be easier to set the style property of the matching paragraph to font-weight: bold, or give it a class that does roughly the same? That way you wouldn't have to worry about adding tags or searching for existing ones. It might perform better, too, since you wouldn't have to do any string replacements.

If you really want to add the strong tags anyway, I'd suggest using DOM functions to find childNodes of your paragraph that are <strong> elements, and if you don't find one, add it and move the original (text) childNode of the paragraph into it.

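Using regular expressions on innerHTML isn't reliable: the pattern can match inside attributes or existing markup, and reassigning innerHTML rebuilds the child elements, which discards any event listeners attached to them. The correct way, walking the DOM and editing text nodes directly, is more tedious but much more reliable.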

E.g.

for (var i = 0, l = posts.length; i < l; i++) {

    findAndReplaceInDOM(posts[i], /^AD:/g, function(match, node){

        // Make sure the current node does not have a <strong> as its parent
        if (node.parentNode.nodeName.toLowerCase() === 'strong') {
            return false;
        }

        // Create and return new <strong>
        var s = document.createElement('strong');
        s.appendChild(document.createTextNode(match[0]));
        return s;

    });

}

And the findAndReplaceInDOM function:

function findAndReplaceInDOM(node, regex, replaceFn) {

    // Note: regex MUST have global flag
    if (!regex || !regex.global || typeof replaceFn !== 'function') {
        return;
    }

    var start, end, match, parent, leftNode,
        rightNode, replacementNode, text,
        d = document;

    // Loop through all childNodes of "node"
    if (node = node && node.firstChild) do {

        if (node.nodeType === 1) {

            // Regular element, recurse:
            findAndReplaceInDOM(node, regex, replaceFn);

        } else if (node.nodeType === 3) {

            // Text node, introspect

            parent = node.parentNode;
            text = node.data;

            regex.lastIndex = 0;

            // Index in "text" just past the last replaced match
            var offset = 0;

            while (match = regex.exec(text)) {

                replacementNode = replaceFn(match, node);

                if (!replacementNode) {
                    continue;
                }

                end = regex.lastIndex;
                start = end - match[0].length;

                // Insert the text preceding the match, then the replacement,
                // in front of the original node. The original node stays in
                // the document, so later matches and the sibling walk at the
                // bottom of the loop keep working.

                leftNode = d.createTextNode( text.substring(offset, start) );

                parent.insertBefore(leftNode, node);
                parent.insertBefore(replacementNode, node);

                offset = end;

            }

            // Shrink the original node to the text after the last replacement
            if (offset > 0) {
                node.data = text.substring(offset);
            }

        }
    } while (node = node.nextSibling);

}