
JavaScript slugifier to C#

Developer https://www.devze.com 2023-03-12 04:28 Source: Internet
I am looking at converting the JS slugify function by diegok, and he is using this JavaScript construct:

function turkish_map() {
    return {
        'ş':'s', 'Ş':'S', 'ı':'i', 'İ':'I', 'ç':'c', 'Ç':'C', 'ü':'u', 'Ü':'U',
        'ö':'o', 'Ö':'O', 'ğ':'g', 'Ğ':'G'
    };
}

It is a map of char-to-char translations. However, I don't know which JS construct this is, or how it could be rewritten in C#, preferably without spending too much time on the rewrite. (There's more to it; this is just one of the functions.)

Should I make an array, dictionary, something else?

Dictionary<char, char> turkish_map() {
    return new Dictionary<char, char> {
        {'ş','s'}, {'Ş','S'}, {'ı','i'}, {'İ','I'}, {'ç','c'}, {'Ç','C'}, {'ü','u'}, {'Ü','U'}, {'ö','o'}, {'Ö','O'}, {'ğ','g'}, {'Ğ','G'} };
}

Then use it like:

turkish_map()['İ'] // returns I

Or you can save it in a field and reuse it instead of creating it on every call.
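
As a minimal sketch of the field-based variant, the map can be built once in a static readonly field and applied character by character. The class name TurkishSlug and the method name Fold are hypothetical, chosen only for illustration; they are not part of the original slugify code.

```csharp
using System.Collections.Generic;
using System.Text;

public static class TurkishSlug // hypothetical name, for illustration only
{
    // Built once when the class is first used, not on every call.
    private static readonly Dictionary<char, char> TurkishMap = new Dictionary<char, char>
    {
        {'ş','s'}, {'Ş','S'}, {'ı','i'}, {'İ','I'}, {'ç','c'}, {'Ç','C'},
        {'ü','u'}, {'Ü','U'}, {'ö','o'}, {'Ö','O'}, {'ğ','g'}, {'Ğ','G'}
    };

    // Replaces every mapped character; anything not in the map passes through unchanged.
    public static string Fold(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            char mapped;
            sb.Append(TurkishMap.TryGetValue(c, out mapped) ? mapped : c);
        }
        return sb.ToString();
    }
}
```

With this in place, TurkishSlug.Fold("şŞıİçÇüÜöÖğĞ") gives sSiIcCuUoOgG. TryGetValue is used rather than the indexer because the indexer throws for characters that are not in the map.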

Use these methods to remove diacritics; for the sample string below, the result will be sSıIcCuUoOgG.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

namespace Test
{
    public class Program
    {
        public static IEnumerable<char> RemoveDiacriticsEnum(string src, bool compatNorm, Func<char, char> customFolding)
        {
            foreach (char c in src.Normalize(compatNorm ? NormalizationForm.FormKD : NormalizationForm.FormD))
                switch (CharUnicodeInfo.GetUnicodeCategory(c))
                {
                    case UnicodeCategory.NonSpacingMark:
                    case UnicodeCategory.SpacingCombiningMark:
                    case UnicodeCategory.EnclosingMark:
                        // combining mark left over from decomposition: do nothing
                        break;
                    default:
                        yield return customFolding(c);
                        break;
                }
        }

        public static IEnumerable<char> RemoveDiacriticsEnum(string src, bool compatNorm)
        {
            return RemoveDiacriticsEnum(src, compatNorm, c => c);
        }

        public static string RemoveDiacritics(string src, bool compatNorm, Func<char, char> customFolding)
        {
            StringBuilder sb = new StringBuilder();
            foreach (char c in RemoveDiacriticsEnum(src, compatNorm, customFolding))
                sb.Append(c);
            return sb.ToString();
        }

        public static string RemoveDiacritics(string src, bool compatNorm)
        {
            return RemoveDiacritics(src, compatNorm, c => c);
        }

        static void Main(string[] args)
        {
            var str = "şŞıİçÇüÜöÖğĞ";

            Console.Write(RemoveDiacritics(str, false));
            // output: sSıIcCuUoOgG
        }
    }
}

For characters such as ı, which the method above does not convert, and others such as the @ you mentioned, you can first remove diacritics and then use a regex to strip the remaining invalid characters. If you care about particular characters, you can put them in a Dictionary<char, char> and replace each of them explicitly.

Then you can do this:

var input = "Şöme-p@ttern"; // text to convert into a slug
var replaces = new Dictionary<char, char> { { '@', 'a' } }; // list of chars you care
var pattern = @"[^A-Z0-9_-]+"; // regex to remove invalid characters

var result = new StringBuilder(RemoveDiacritics(input, false)); // convert Ş to S
                                                                // and so on

foreach (var item in replaces)
    result = result.Replace(item.Key, item.Value); // replace @ with a and so on

// remove invalid characters which weren't converted
var slug = Regex.Replace(result.ToString(), pattern, String.Empty,
    RegexOptions.IgnoreCase);

// output: Some-pattern
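
The steps above can also be folded into a single pass. The following is only a sketch under stated assumptions: the names SlugSketch, Extra and Slugify are hypothetical, and the contents of the Extra map (ı and @) are an assumed starting list, not something prescribed by the original slugify code.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class SlugSketch // hypothetical name, for illustration only
{
    // Characters that Unicode decomposition leaves untouched (assumed list; extend as needed).
    private static readonly Dictionary<char, char> Extra = new Dictionary<char, char>
    {
        { 'ı', 'i' }, { '@', 'a' }
    };

    public static string Slugify(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input.Normalize(NormalizationForm.FormD))
        {
            // Drop the combining marks that decomposition split off (Ş becomes S plus a cedilla mark).
            UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(c);
            if (cat == UnicodeCategory.NonSpacingMark ||
                cat == UnicodeCategory.SpacingCombiningMark ||
                cat == UnicodeCategory.EnclosingMark)
                continue;

            // Apply the explicit replacements; other characters pass through unchanged.
            char mapped;
            sb.Append(Extra.TryGetValue(c, out mapped) ? mapped : c);
        }

        // Remove whatever is still not a letter, digit, underscore or hyphen.
        return Regex.Replace(sb.ToString(), "[^A-Za-z0-9_-]+", string.Empty);
    }
}
```

Here SlugSketch.Slugify("Şöme-p@ttern") gives Some-pattern. The character class spells out both A-Z and a-z instead of using RegexOptions.IgnoreCase, because case-insensitive matching follows the current culture by default, which is worth avoiding when Turkish i/I casing may be in play.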


