HttpWebRequest Url escaping

开发者 (https://www.devze.com), 2023-01-29 12:56. Source: web
I know, the title sounds like this question has been addressed many times. But I am struggling with a specific case and I am very confused over it. Hopefully a seasoned C#'er could point me in the correct direction.

I have the code:

string serviceURL = "https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports";
HttpWebRequest dataRequest = (HttpWebRequest)WebRequest.Create(serviceURL);

Now when I quickwatch dataRequest, I see that:

RequestUri: {https://www.domain.com/service/tables/bucketname/tables/testtable/imports}

So it looks like HttpWebRequest has changed both occurrences of %2F to /. However, the server needs the requested Uri to be exactly as serviceURL is written, including the %2F.

Is there any way to get the HttpWebRequest class to call the Url:

https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports

Many thanks! I am at a complete loss here... -Brett


Kyle posted the answer in a comment, so to make it official: see "GETting a URL with an url-encoded slash".

It's a weird workaround, but it nevertheless gets the job done.


As long as the problem is only %2F being unescaped to "/", there are solutions out there: one involves a hack, and for newer versions of .NET there is an app.config setting. Check here: How to make System.Uri not to unescape %2f (slash) in path?
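For reference, the app.config setting in question is, as far as I know, the schemeSettings element available on .NET Framework 4.0 and later. This is only a sketch of what that fragment usually looks like; which schemes you list is up to you, and some setups may also need the "uri" section declared under configSections.

```xml
<configuration>
  <uri>
    <schemeSettings>
      <!-- Keeps escaped dots and slashes (e.g. %2F) in the path as written. -->
      <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
      <add name="https" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
    </schemeSettings>
  </uri>
</configuration>
```

As the option name suggests, this only covers dots and slashes in the path.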

However, I have yet to figure out how to prevent it from unescaping other specifically escaped characters, like '(' and ')' (%28 and %29). I have tried all the settings and hacks I could find to stop the Uri class from delivering a partially unescaped path to the WebRequest. Those solutions will happily prevent %2F from being unescaped, but not %28 and %29, and possibly most of the other characters that were specifically escaped.

It seems that WebRequest asks the Uri object for exactly one value to build the "GET /path HTTP/1.1" request line: Uri.PathAndQuery, which in turn calls its UriParser.GetComponents.
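A quick way to check what your own runtime does is to print Uri.PathAndQuery next to Uri.OriginalString. This is only a small sketch; the UriInspection class and its Describe method are names I made up for illustration, and what PathAndQuery shows depends on the .NET version and configuration.

```csharp
using System;

public static class UriInspection
{
    // Puts the string the Uri was built from next to the PathAndQuery value,
    // so that any unescaping done by the Uri class becomes visible.
    public static string Describe(string url)
    {
        var uri = new Uri(url);
        return "OriginalString: " + uri.OriginalString + Environment.NewLine +
               "PathAndQuery:   " + uri.PathAndQuery;
    }
}
```

Call it with the URL you are having trouble with, for example Console.WriteLine(UriInspection.Describe(serviceURL)); and compare the two lines.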

If you want to download from mediafire and the URL contains %28 and %29, you will get into an infinite redirect loop, as .NET keeps changing %28 and %29 to '(' and ')' and following the redirect (exception: "Too many automatic redirections were attempted").

So this is a solution for those who are stuck and have not been able to find a way to prevent the unescape of some characters.

The only way I have found to override this (currently using .NET 4.6) and deliver my own PathAndQuery is a combination of inheriting from UriParser and hacking its use.

using System;
using System.Reflection;
using System.Text.RegularExpressions;

public sealed class MyUriParser : System.UriParser
{
    private readonly UriParser _originalParser;
    private readonly MethodInfo _getComponentsMethod;

    // Splits a URI string into its components without unescaping anything.
    private static readonly Regex rx = new Regex(@"^(?<Scheme>[^:]+):(?://((?<User>[^@/]+)@)?(?<Host>[^@:/?#]+)(:(?<Port>\d+))?)?(?<Path>([^?#]*)?)?(\?(?<Query>[^#]*))?(#(?<Fragment>.*))?$",RegexOptions.Compiled | RegexOptions.ExplicitCapture | RegexOptions.Singleline);

    public MyUriParser(UriParser originalParser) : base()
    {
        _originalParser = originalParser;

        // GetComponents is not public, so the original parser's implementation
        // has to be reached through reflection.
        _getComponentsMethod = typeof(UriParser).GetMethod("GetComponents", BindingFlags.NonPublic | BindingFlags.Instance);
        if (_getComponentsMethod == null)
        {
            throw new MissingMethodException("UriParser", "GetComponents");
        }
    }

    protected override string GetComponents(Uri uri, UriComponents components, UriFormat format)
    {
        if (components == UriComponents.PathAndQuery)
        {
            // Take path and query from the original string, so that escapes
            // such as %28 and %29 are sent exactly as they were written.
            var match = rx.Match(uri.OriginalString);
            if (match.Success)
            {
                var path = match.Groups["Path"].Value;
                var query = match.Groups["Query"];
                // Group.Value is never null, so check Success to see whether
                // the URL actually had a query part.
                return query.Success ? $"{path}?{query.Value}" : path;
            }
        }

        // Everything else is delegated to the parser the Uri originally used.
        return (string)_getComponentsMethod.Invoke(_originalParser, new object[] { uri, components, format });
    }
}

And then hacking it into the Uri instance by replacing its UriParser with this one.

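    public static Uri CreateUri(string url)
    {
        var uri = new Uri(url);
        if (url.Contains("%28") || url.Contains("%29"))
        {
            // m_Syntax is the private field of System.Uri that holds its parser.
            var originalParser = ReflectionHelper.GetValueByReflection(uri, "m_Syntax") as UriParser;
            var parser = new MyUriParser(originalParser);
            // Copy scheme name and default port from the original parser instead of
            // hard-coding "http" and 80, so that https URLs keep their own values.
            ReflectionHelper.SetValueByReflection(parser, "m_Scheme", ReflectionHelper.GetValueByReflection(originalParser, "m_Scheme"));
            ReflectionHelper.SetValueByReflection(parser, "m_Port", ReflectionHelper.GetValueByReflection(originalParser, "m_Port"));
            ReflectionHelper.SetValueByReflection(uri, "m_Syntax", parser);
        }
        return uri;
    }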

Due to the way UriParser works, it normally needs to be registered to have its port and scheme name set, so these two values have to be set by reflection, as we are not registering it the correct way. I have not found a way to register "http", as it already exists. The ReflectionHelper is just a class I have, but it can be quickly replaced with normal reflection code.
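For anyone who does not have such a helper, here is a minimal sketch of what it could look like with plain reflection. Only the two method names come from the code above; the implementation is my assumption, not the original class. It walks up the base types because private fields such as m_Scheme and m_Port are declared on UriParser, not on the derived parser. DemoBase and DemoDerived are hypothetical types that only illustrate that point.

```csharp
using System;
using System.Reflection;

public static class ReflectionHelper
{
    private const BindingFlags Flags =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Looks for the field on the type itself and then on each base type,
    // since GetField does not return private fields declared on a base class.
    private static FieldInfo FindField(Type type, string fieldName)
    {
        for (var current = type; current != null; current = current.BaseType)
        {
            var field = current.GetField(fieldName, Flags);
            if (field != null) return field;
        }
        throw new MissingFieldException(type.FullName, fieldName);
    }

    public static object GetValueByReflection(object target, string fieldName)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        return FindField(target.GetType(), fieldName).GetValue(target);
    }

    public static void SetValueByReflection(object target, string fieldName, object value)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        FindField(target.GetType(), fieldName).SetValue(target, value);
    }
}

// Hypothetical types: the private field lives on the base class and is
// read and written through an instance of the derived class.
public class DemoBase
{
    private int _port = 80;
    public int Port { get { return _port; } }
}

public sealed class DemoDerived : DemoBase { }
```

With these in place, ReflectionHelper.SetValueByReflection(new DemoDerived(), "_port", 443) changes the field declared on DemoBase, which is the same situation as setting m_Port on MyUriParser. Keep in mind that field names like m_Syntax are internals of the framework and can differ between .NET versions.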

Then call it like this:

HttpWebRequest dataRequest = (HttpWebRequest)WebRequest.Create(CreateUri(serviceURL));


string serviceURL = Uri.EscapeUriString("https://www.domain.com/service/tables/bucketname%2Ftables%2Ftesttable/imports");