开发者

How to search for a word and then replace text after it using regular expressions in python?

开发者 https://www.devze.com 2023-04-02 07:27 出处:网络
I\'m trying to write a script that will search through a html file and then replace the form action. So in this basic code:

I'm trying to write a script that will search through a html file and then replace the form action. So in this basic code:

<html>
    <head>
        <title>Forms</titl开发者_如何学Goe>
    </head>
    <body>
    <form action="login.php" method="post">
        Username: <input type="text" name="username" value="" />
        <br />
        Password: <input type="password" name="password" value="" /> 
        <br />
        <input type="submit" name="submit" value="Submit">
    </form>
    </body>
</html>

I would like the script to search for form action="login.php" but then only replace the login.php, with say newlogin.php. The key thing is that the form action might change from file to file, i.e. on another html file the login.php might be something totally different, so the regular expression has to search for the form action= and replace the text after it (maybe using the " as limiters?)

My knowledge of regular expressions is pretty basic, for example I'd know how to replace just login.php:

(re.sub('login.php', 'newlogin.php', line))

but obviously it's no use as mentioned above if the login.php changes from file to file.

Any help is much appreciated!

Thanks all =)


You can use regex, or just simple string manipulation. Just a test case.

for line in open("file"):
    if "form action" in line:
       line=line.rstrip()
       a=line.split('<form action="')
       a[-1] = '"newlogin" ' + a[-1].split()[-1]
       line = '<form action='.join(a)
    print line


Make the re catch 2 groups, the form and everything leading up to the 1st quote after action, and the action content.

Use the 1st group for the replacement, followed by the new action:

re.sub(r'(<form.*?action=")([^"]+)', r'\1newlogin.php',  content)


You cant try this technique:

(<form[^>]*action=")[^"]*

pseudo-code:

regex.replace(input, pattern, concat(\1, new_value))

You can use this regex:

(?<=<form[^>]*action=")[^"]*
0

精彩评论

暂无评论...
验证码 换一张
取 消