Should I store spaces in my URLs in the database? If so, how do I encode them when putting them into <a href="...">?

Developer https://www.devze.com 2023-03-06 05:19 Source: web
In my blog, I store URIs on entities to allow them to be customised (and friendly). Originally, they could contain spaces (e.g. "/tags/ASP.NET MVC"), but the W3C validator says spaces are not valid.

The System.Uri class accepts spaces and seems to encode them as I want (e.g. /tags/ASP.NET MVC becomes /tags/ASP.NET%20MVC), but I don't want to create a Uri just to throw it away; that feels dirty!

Note: None of Html.Encode, Html.AttributeEncode and Url.Encode will encode "/tags/ASP.NET MVC" to "/tags/ASP.NET%20MVC".


Edit: I edited the DataType part out of my question, as it turns out DataType does not directly provide any validation and there's no built-in URI validation. I found some extra validators at dataannotationsextensions.org, but they only support absolute URIs, and it looks like spaces may be valid there too.


It seems that the only sensible thing to do is not allow spaces in URLs. Support for encoding them correctly seems flaky in .NET :(

I'm going to instead replace spaces with a dash when I auto-generate them, and validate they only contain certain characters (alphanumeric, dots, dashes, slashes).
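The approach described above could be sketched roughly as follows. This is only an illustration: the class and method names (SlugRules, Generate, IsValid) are hypothetical, and the allow-list is my reading of "alphanumeric, dots, dashes, slashes" as ASCII letters and digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugRules
{
    // Assumed allow-list: ASCII letters, digits, dots, dashes and slashes.
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    private static readonly Regex Allowed = new Regex(@"\A[A-Za-z0-9./-]+\z");

    // Hypothetical helper: replace each run of whitespace with a single dash.
    public static string Generate(string name)
    {
        return Regex.Replace(name.Trim(), @"\s+", "-");
    }

    // Hypothetical helper: true only when every character is on the allow-list.
    public static bool IsValid(string uri)
    {
        return !string.IsNullOrEmpty(uri) && Allowed.IsMatch(uri);
    }

    public static void Main()
    {
        string uri = "/tags/" + Generate("ASP.NET MVC");
        Console.WriteLine(uri);                          // /tags/ASP.NET-MVC
        Console.WriteLine(IsValid(uri));                 // True
        Console.WriteLine(IsValid("/tags/ASP.NET MVC")); // False (contains a space)
    }
}
```

Validating on save and generating on create keeps anything that needs encoding out of the database in the first place.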

I think the best way to use them would be to store %20 in the DB, as the space is "unsafe" and it seems non-trivial to then encode them in a way that will pass the W3C validator in .NET.


URIs and URLs are two different things, URLs being a subset of URIs. As such, a URL has different restrictions from a URI in general.

To encode a path string to proper W3C URL encoding standards, use HttpUtility.UrlPathEncode(string). It'll add the encoded spaces you're after.

You should store your URLs in whatever form that is most useful for you to work with them. It can be useful to refer to them as URIs until the point at which you encode them into a URL-compliant format, but that's just semantics to help your design be a little clearer.
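As a rough sketch of "store the raw form, encode at the point of use": the snippet below keeps the unencoded path and percent-encodes it only when building the link. It uses Uri.EscapeDataString on each segment rather than HttpUtility.UrlPathEncode, purely so the example needs no System.Web reference. The PathEncoder class and EncodePath method are hypothetical names.

```csharp
using System;
using System.Linq;

public static class PathEncoder
{
    // Hypothetical helper: percent-encode each path segment separately,
    // so the '/' separators themselves are never escaped.
    public static string EncodePath(string rawPath)
    {
        return string.Join("/", rawPath.Split('/').Select(s => Uri.EscapeDataString(s)));
    }

    public static void Main()
    {
        string stored = "/tags/ASP.NET MVC";   // raw form, e.g. as read from the database
        string href = EncodePath(stored);      // encoded only when rendering the link

        Console.WriteLine(href);                          // /tags/ASP.NET%20MVC
        Console.WriteLine(Uri.UnescapeDataString(href));  // /tags/ASP.NET MVC
    }
}
```

The encoded value would still need the usual HTML attribute encoding when written into the href.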

EDIT:

If your encoder also escapes the slashes (as far as I can tell UrlPathEncode leaves them alone, whereas UrlEncode does escape them), it's pretty simple to "decode" them by replacing the encoded %2f with the simpler /:

var path = "/tags/ASP.NET MVC";
var url = HttpUtility.UrlPathEncode(path).Replace("%2f", "/");


I haven't used it, but UrlPathEncode sounds like it may give what you want.

You can encode a URL using the UrlEncode() method or the UrlPathEncode() method. However, the methods return different results. The UrlEncode() method converts each space character to a plus character (+). The UrlPathEncode() method converts each space character into the string "%20", which represents a space in hexadecimal notation.
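A minimal way to see the difference described in that passage, assuming System.Web.HttpUtility is available (a System.Web reference on .NET Framework; part of the base library on current .NET). The class name EncodeComparison is just for illustration.

```csharp
using System;
using System.Web;

public static class EncodeComparison
{
    public static void Main()
    {
        string segment = "ASP.NET MVC";

        // UrlEncode targets query-string/form values: a space becomes '+'.
        Console.WriteLine(HttpUtility.UrlEncode(segment));      // ASP.NET+MVC

        // UrlPathEncode targets the path portion: a space becomes "%20".
        Console.WriteLine(HttpUtility.UrlPathEncode(segment));  // ASP.NET%20MVC
    }
}
```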

EDIT: The JavaScript function encodeURI uses %20 instead of +. Add a reference to Microsoft.JScript and call GlobalObject.encodeURI. I tried this approach and it gives the result you're looking for.


I asked a similar question a while ago. The short answer was to replace spaces with "-" and then convert them back again on the way out. This is the source I used:

// Requires: using System.Text.RegularExpressions;
private static string EncodeTitleInternal(string title)
{
        if (string.IsNullOrEmpty(title))
                return title;

        // Search engine friendly slug routine with help from http://www.intrepidstudios.com/blog/2009/2/10/function-to-generate-a-url-friendly-string.aspx

        // remove invalid characters
        title = Regex.Replace(title, @"[^\w\d\s-]", "");  // this is unicode safe, but may need to revert back to 'a-zA-Z0-9', need to check spec

        // convert multiple spaces/hyphens into one space       
        title = Regex.Replace(title, @"[\s-]+", " ").Trim(); 

        // If it's over 75 chars, take the first 75.
        title = title.Substring(0, title.Length <= 75 ? title.Length : 75).Trim();

        // hyphenate spaces
        title = Regex.Replace(title, @"\s", "-");

        return title;
}