Which Java Regex can I use to match a URL that has a capital letter before the query string?

Developer https://www.devze.com 2023-01-16 04:28 Source: Internet
I'm trying to create a regex that matches a URL that has a capital letter before the query string. I want to capture the query string, including the question mark, and I want to capture the non-query-string part. If there is no query string but there is a capital letter, then only the non-query-string part should be captured.

A few examples:

/contextroot/page.html?param1=value1&param2=value2 NO MATCH
/contextroot/page.html?param=VALUE&param2=value2   NO MATCH

/contextroot/Page.html?param=value                 MATCH
/contextroot/Page.html                             GROUP 1
?param=value                                       GROUP 2

/contextroot/page.HTML                             MATCH
/contextroot/page.HTML                             GROUP 1

Here's my first cut at the regex:

^(.*[A-Z].*)(\??.*)$

It's busted. This never captures the query string.
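
For anyone who wants to see the failure, here is a small sketch of the first-cut pattern run from Java. The class and method names (FirstCutDemo, groups) are made up for this illustration. It suggests two problems: the greedy .* in group 1 swallows the query string, so group 2 ends up empty, and a capital letter inside the query string is enough to produce a match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
final class FirstCutDemo {
    // The first-cut regex from the question, unchanged.
    static final Pattern FIRST_CUT = Pattern.compile("^(.*[A-Z].*)(\\??.*)$");

    /** Returns {group 1, group 2}, or null when the input does not match. */
    static String[] groups(String url) {
        Matcher m = FIRST_CUT.matcher(url);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("/contextroot/Page.html?param=value");
        // Group 1 is greedy and may contain '?', so it takes the whole input.
        System.out.println("group 1: " + g[0]); // group 1: /contextroot/Page.html?param=value
        // "\\??" and ".*" can both match nothing, so group 2 is left empty.
        System.out.println("group 2: [" + g[1] + "]"); // group 2: []
        // A capital letter in the query string alone also matches (unwanted).
        System.out.println(groups("/contextroot/page.html?param=VALUE") != null); // true
    }
}
```

Since nothing stops group 1 from consuming the question mark, the fix in both answers is to exclude ? from the non-query part.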


^/contextroot/([^?]*[A-Z][^?]*)(\?.*)?$

Explanation:

^/contextroot/  # literal start of URL
(               # match group 1
  [^?]*         # anything except `?` (zero or more)
  [A-Z]         # one capital letter
  [^?]*         # see above
)
(               # match group 2
  \?            # one ?
  .*            # anything that follows
)?              # optionally
$               # end of string    
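
A minimal sketch of how this pattern might be used from Java, run against the examples in the question. The class and method names (CapitalBeforeQuery, split) are assumptions for illustration. Note that with this pattern group 1 holds only the part after /contextroot/, since the prefix sits outside the parentheses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class CapitalBeforeQuery {
    // Backslashes are doubled because this is a Java string literal.
    static final Pattern URL =
            Pattern.compile("^/contextroot/([^?]*[A-Z][^?]*)(\\?.*)?$");

    /** Returns {group 1, group 2}; group 2 is null when there is no query string. */
    static String[] split(String url) {
        Matcher m = URL.matcher(url);
        if (!m.matches()) {
            return null; // no capital letter before the query string
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] a = split("/contextroot/Page.html?param=value");
        System.out.println(a[0] + " | " + a[1]); // Page.html | ?param=value

        String[] b = split("/contextroot/page.HTML");
        System.out.println(b[0] + " | " + b[1]); // page.HTML | null

        // The capital letter appears only in the query string: no match.
        System.out.println(split("/contextroot/page.html?param=VALUE&param2=value2")); // null
    }
}
```

If group 1 should include the /contextroot/ prefix, as in the question's examples, moving the opening parenthesis to just after ^ should do it.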


(^/contextroot/(?=[^?A-Z]*[A-Z])[^?]*)(\?.*)?

Explanation:

(                 # match group 1
  ^/contextroot/  #   literal start of URL (optional, remove if not needed)
  (?=             #   positive look-ahead...
    [^?A-Z]*      #     anything but a question mark or upper-case letters
    [A-Z]         #     a mandatory upper-case letter
  )               #   end look-ahead
  [^?]*           #   match anything but a question mark
)                 # end group 1
(                 # match group 2
  \?.*            #   a question mark and the rest of the query string
)?                # end group 2, make optional
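
Here is a rough Java sketch of the look-ahead version applied to a single URL. The names (LookaheadUrlMatch, capture) are invented for the example, and find() is used rather than matches() because the pattern has no trailing $.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class LookaheadUrlMatch {
    // The look-ahead asserts a capital letter occurs before any '?'.
    static final Pattern URL =
            Pattern.compile("(^/contextroot/(?=[^?A-Z]*[A-Z])[^?]*)(\\?.*)?");

    /** Returns {group 1, group 2}; group 2 is null when there is no query string. */
    static String[] capture(String url) {
        Matcher m = URL.matcher(url);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] a = capture("/contextroot/Page.html?param=value");
        System.out.println(a[0] + " | " + a[1]); // /contextroot/Page.html | ?param=value

        String[] b = capture("/contextroot/page.HTML");
        System.out.println(b[0] + " | " + b[1]); // /contextroot/page.HTML | null

        System.out.println(capture("/contextroot/page.html?param=VALUE&param2=value2")); // null
    }
}
```

Here group 1 includes the /contextroot/ prefix, which lines up with the GROUP 1 values listed in the question.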

Note that this is intended to check a single URL and does not work when run against a multi-line string.

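To make it work with multi-line input (one URL per line), exclude line breaks from both character classes. In Java, also compile the pattern with Pattern.MULTILINE so that ^ matches at the start of every line: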

(^/contextroot/(?=[^?A-Z\r\n]*[A-Z])[^?\r\n]*)(\?.*)?
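
A sketch of the multi-line variant in Java, assuming one URL per line. The class and method names (MultiLineUrlScan, scan) and the sample input are made up for illustration. Pattern.MULTILINE is passed so that ^ can match after each line break; without that flag, Java only lets ^ match at the very start of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class MultiLineUrlScan {
    static final Pattern URL_PER_LINE = Pattern.compile(
            "(^/contextroot/(?=[^?A-Z\\r\\n]*[A-Z])[^?\\r\\n]*)(\\?.*)?",
            Pattern.MULTILINE); // ^ now matches at the start of each line

    /** Collects {group 1, group 2} for every matching line. */
    static List<String[]> scan(String text) {
        List<String[]> hits = new ArrayList<>();
        Matcher m = URL_PER_LINE.matcher(text);
        while (m.find()) {
            hits.add(new String[] { m.group(1), m.group(2) });
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        String text = String.join("\n",
                "/contextroot/plain.html",
                "/contextroot/Page.html?param=value",
                "/contextroot/page.HTML",
                "/contextroot/page.html?param=VALUE");
        for (String[] hit : scan(text)) {
            System.out.println(hit[0] + " | " + hit[1]);
        }
        // Prints:
        // /contextroot/Page.html | ?param=value
        // /contextroot/page.HTML | null
    }
}
```

The first sample line shows why the line breaks have to be excluded: without \r\n in the character classes, the look-ahead for plain.html could run on into the next line and find the capital P there.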