jQuery URL parser plugin fails on '@' in query string - What URL parsing regex will work?

开发者 https://www.devze.com 2023-01-19 21:23 Source: web

Background

I use the jQuery URL parser plugin by Mark Perkins for extracting query string values from the current URL.

The parsing process fails when query string values contain the '@' character, most notably when there is an email address in the query string. This applies to the latest version of the plugin, taken from the GitHub project page today.

Working and non-working examples

The parsing process populates the internal parsed.queryKey object with key:value pairs from the query string.

Two modes are offered: 'loose' and 'strict'. Both return the same result.

// Parse URL that works
jQuery.url.setUrl("http://example.com/?email=example.example.com");

// Examine result
parsed.queryKey = {
    'email':'example.example.com'
}


// Parse URL that fails
jQuery.url.setUrl("http://example.com/?email=example@example.com");

// Examine result
parsed.queryKey = {
}

Problem

I'd like to modify one (or both) of the regular expressions so that query string parsing no longer fails when an '@' is present.

The parser uses regular expressions to extract information from the URL. These are defined on (what is currently) line 27:

parser: {
    strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/, //less intuitive, more accurate to the specs
    loose: /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/ // more intuitive, fails on relative paths and deviates from specs
}

I don't sufficiently understand the workings of these regular expressions to be able to make the required modifications.

How can I modify the regular expressions to allow the parsing process to work when there is an '@' present in the query string?


Use encodeURIComponent

var url = "http://example.com/?email=";
var email = encodeURIComponent("example@example.com");
jQuery.url.setUrl(url + email);

This will replace @ with %40.
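A slightly fuller sketch of the same idea, showing the round trip. It assumes you control how the URL is built, since the encoding has to happen before the URL reaches the parser. Whether the plugin decodes values for you is not something I have checked, so the decode step below is done by hand, and the variable names are only for illustration.

```javascript
// Build a URL whose query value contains '@', encoding the value first.
var base = "http://example.com/?email=";
var rawEmail = "example@example.com";
var encodedEmail = encodeURIComponent(rawEmail); // '@' becomes '%40'
var fullUrl = base + encodedEmail;

// Assumption: the parser hands back the value still encoded.
// For illustration the value is cut straight out of the string here.
var valueFromQuery = fullUrl.split("?email=")[1];
var decodedEmail = decodeURIComponent(valueFromQuery);

console.log(fullUrl);      // http://example.com/?email=example%40example.com
console.log(decodedEmail); // example@example.com
```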

enjoy!


Update:

Using Regex Coach, I stepped through the pattern and arrived at this suggested expression:

^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)

Another attempt:

^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:]*):?([^:]*))?)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)
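To see where the match goes wrong, it may help to run the 'strict' pattern from the question directly and look at the capture groups. In the sketch below, the user group swallows everything up to the '@', including the path and the start of the query, so nothing is left for the query group. The second pattern is a hypothetical variant (my own adjustment, not taken from the plugin) in which the user and password classes also exclude '/', '?' and '#', which are the characters that end the authority part of a URL. The group numbers and their labels (user, host, query) come from counting parentheses in the pattern, and the variable names are made up for the example.

```javascript
// The plugin's 'strict' pattern, copied from the question.
var strict = /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;

// Hypothetical variant: [^:@]* becomes [^:@\/?#]* in the user and password
// groups, so the userinfo part cannot run past the end of the authority.
var strictAdjusted = /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@\/?#]*):?([^:@\/?#]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;

var url = "http://example.com/?email=example@example.com";
var before = strict.exec(url);
var after = strictAdjusted.exec(url);

// Group 4 = user, group 6 = host, group 12 = query string.
console.log(before[4]);  // example.com/?email=example
console.log(before[12]); // undefined
console.log(after[6]);   // example.com
console.log(after[12]);  // email=example@example.com

// A URL with real userinfo still splits as expected with the variant.
var withUser = strictAdjusted.exec("http://user:pass@example.com:8080/path?x=1@2");
console.log(withUser[4], withUser[5], withUser[6], withUser[7], withUser[12]);
// user pass example.com 8080 x=1@2
```

The same change to the two `[^:@]*` classes could presumably be tried in the 'loose' pattern as well, but I have not verified that here.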

Maybe this RegEx can be of use to you:

(?<protocol>(http|ftp|https|ftps):\/\/)?(?<site>[\w\-_\.]+\.(?<tld>([0-9]{1,3})|([a-zA-Z]{2,3})|(aero|arpa|asia|coop|info|jobs|mobi|museum|name|travel))+(?<port>:[0-9]+)?\/?)((?<resource>[\w\-\.,@^%:/~\+#]*[\w\-\@^%/~\+#])(?<queryString>(\?[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*=[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*)+(&[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*=[a-zA-Z0-9\[\]\-\._+%\$#\~',/]*)*)?)?
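If a regex is not a hard requirement and the goal is only to read query string values, the standard `URL` interface (available in current browsers and in Node.js) is an alternative to consider. This is a different technique from the plugin, shown only as a sketch; the variable names are illustrative.

```javascript
// Parse the URL with the built-in URL interface instead of a regex.
var parsedUrl = new URL("http://example.com/?email=example@example.com");

// searchParams exposes the query string as key/value pairs.
var email = parsedUrl.searchParams.get("email");

console.log(parsedUrl.hostname); // example.com
console.log(email);              // example@example.com
```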