How can I convert a Cyrillic string into English in C#?

Developer https://www.devze.com 2023-01-04 15:13 Source: Internet
Is it possible to convert a Cyrillic string to English (Latin) in C#? For example, I need to convert "Петролеум" to "Petroleum". I also forgot to mention that if I have a Cyrillic string, it needs to stay like that, so can I somehow check for that?


I'm not familiar with Cyrillic, but if it's just a 1-to-1 mapping of Cyrillic characters to Latin characters that you're after, you can use a dictionary of character pairs and map each character individually:

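// Requires: using System.Collections.Generic; using System.Linq;
var map = new Dictionary<char, string>
{
    { 'П', "P" },
    { 'е', "e" },
    { 'т', "t" },
    { 'р', "r" },
    // ... remaining letters
};

// Characters missing from the map are passed through unchanged
// instead of throwing a KeyNotFoundException.
var result = string.Concat("Петролеум".Select(
    c => map.TryGetValue(c, out var latin) ? latin : c.ToString()));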


You can of course map the letters to their Latin transcription, but in most cases you won't get an English word out of it. E.g. Российская Федерация transcribes to Rossiyskaya Federatsiya. Wikipedia offers an overview of the mapping. You are probably looking for a translation service; Google probably offers an API for that.
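
To make the difference between transcription and translation concrete, here is a minimal sketch of a letter-by-letter transcription. The `Transcriber` class and its partial table are hypothetical and cover only the letters in the example above; the result is a romanized spelling, not an English translation.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Transcriber
{
    // Partial table (an assumption for this example): only the letters
    // that occur in "Российская Федерация".
    private static readonly Dictionary<char, string> Table = new Dictionary<char, string>
    {
        { 'Р', "R" }, { 'о', "o" }, { 'с', "s" }, { 'и', "i" }, { 'й', "y" },
        { 'к', "k" }, { 'а', "a" }, { 'я', "ya" }, { 'Ф', "F" }, { 'е', "e" },
        { 'д', "d" }, { 'р', "r" }, { 'ц', "ts" }
    };

    public static string Transcribe(string text)
    {
        var sb = new StringBuilder(text.Length * 2);
        foreach (char c in text)
        {
            // Unmapped characters (spaces, digits, Latin text) pass through.
            sb.Append(Table.TryGetValue(c, out string latin) ? latin : c.ToString());
        }
        return sb.ToString();
    }
}
```

With this table, `Transcriber.Transcribe("Российская Федерация")` returns "Rossiyskaya Federatsiya", which is still not the English name "Russian Federation".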


If you're using Windows 7, you can take advantage of the new ELS (Extended Linguistic Services) API, which provides transliteration functionality for you. Have a look at the Windows 7 API Code Pack: it's a set of managed wrappers on top of many of the new APIs in Windows 7 (such as the new Taskbar). Look in the Samples folder for the Transliterator example; you'll find it's exactly what you're looking for.


You can use the text.Replace(pair.Key, pair.Value) method with a dictionary of letter pairs.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;

namespace Transliter
{
    public partial class Form1 : Form
    {
        Dictionary<string, string> words = new Dictionary<string, string>();

        public Form1()
        {
            InitializeComponent();
            words.Add("а", "a");
            words.Add("б", "b");
            words.Add("в", "v");
            words.Add("г", "g");
            words.Add("д", "d");
            words.Add("е", "e");
            words.Add("ё", "yo");
            words.Add("ж", "zh");
            words.Add("з", "z");
            words.Add("и", "i");
            words.Add("й", "j");
            words.Add("к", "k");
            words.Add("л", "l");
            words.Add("м", "m");
            words.Add("н", "n");
            words.Add("о", "o");
            words.Add("п", "p");
            words.Add("р", "r");
            words.Add("с", "s");
            words.Add("т", "t");
            words.Add("у", "u");
            words.Add("ф", "f");
            words.Add("х", "h");
            words.Add("ц", "c");
            words.Add("ч", "ch");
            words.Add("ш", "sh");
            words.Add("щ", "sch");
            words.Add("ъ", "j");
            words.Add("ы", "i");
            words.Add("ь", "j");
            words.Add("э", "e");
            words.Add("ю", "yu");
            words.Add("я", "ya");
            words.Add("А", "A");
            words.Add("Б", "B");
            words.Add("В", "V");
            words.Add("Г", "G");
            words.Add("Д", "D");
            words.Add("Е", "E");
            words.Add("Ё", "Yo");
            words.Add("Ж", "Zh");
            words.Add("З", "Z");
            words.Add("И", "I");
            words.Add("Й", "J");
            words.Add("К", "K");
            words.Add("Л", "L");
            words.Add("М", "M");
            words.Add("Н", "N");
            words.Add("О", "O");
            words.Add("П", "P");
            words.Add("Р", "R");
            words.Add("С", "S");
            words.Add("Т", "T");
            words.Add("У", "U");
            words.Add("Ф", "F");
            words.Add("Х", "H");
            words.Add("Ц", "C");
            words.Add("Ч", "Ch");
            words.Add("Ш", "Sh");
            words.Add("Щ", "Sch");
            words.Add("Ъ", "J");
            words.Add("Ы", "I");
            words.Add("Ь", "J");
            words.Add("Э", "E");
            words.Add("Ю", "Yu");
            words.Add("Я", "Ya");
        }

        private void button1_Click(object sender, EventArgs e)
        {
            string source = textBox1.Text;
            foreach (KeyValuePair<string, string> pair in words)
            {
                source = source.Replace(pair.Key, pair.Value);
            }
            textBox2.Text = source;
        }
    }
}

To change direction, swap the arguments of Replace inside the loop.

Cyrillic to Latin:

source = source.Replace(pair.Key, pair.Value);

Latin to Cyrillic:

source = source.Replace(pair.Value, pair.Key);
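
One caveat for the Latin-to-Cyrillic direction: the replacements run one pair at a time, so with the table above "s" is replaced before "sch" ever gets a chance to match. A possible workaround, sketched here with a hypothetical `ReverseTransliterator` class and a deliberately partial table, is to scan the text once and always try the longest Latin spelling first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ReverseTransliterator
{
    // Hypothetical partial table: Latin spelling -> Cyrillic letter.
    private static readonly Dictionary<string, string> LatinToCyrillic = new Dictionary<string, string>
    {
        { "sch", "щ" }, { "sh", "ш" }, { "ch", "ч" }, { "s", "с" },
        { "c", "ц" }, { "h", "х" }, { "u", "у" }, { "k", "к" }, { "a", "а" }
    };

    // Keys sorted longest first so that "sch" is tried before "s".
    private static readonly string[] KeysByLength =
        LatinToCyrillic.Keys.OrderByDescending(k => k.Length).ToArray();

    public static string ToCyrillic(string text)
    {
        var sb = new StringBuilder(text.Length);
        int i = 0;
        while (i < text.Length)
        {
            // Find the longest key that matches at position i.
            string match = KeysByLength.FirstOrDefault(
                k => string.CompareOrdinal(text, i, k, 0, k.Length) == 0);
            if (match != null)
            {
                sb.Append(LatinToCyrillic[match]);
                i += match.Length;
            }
            else
            {
                sb.Append(text[i]); // no rule: copy the character unchanged
                i++;
            }
        }
        return sb.ToString();
    }
}
```

With this table, `ReverseTransliterator.ToCyrillic("schuka")` returns "щука". The round trip is still lossy for the full table above, because й, ъ and ь are all written as "j" and cannot be told apart afterwards.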


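This method is very fast because it indexes directly into lookup arrays instead of searching a dictionary:

// Requires: using System.Text;

// Latin spellings for U+0430..U+044F (а..я), in code-point order.
static readonly string[] CyrilicToLatinL =
  "a,b,v,g,d,e,zh,z,i,j,k,l,m,n,o,p,r,s,t,u,f,kh,c,ch,sh,sch,j,y,j,e,yu,ya".Split(',');
// Latin spellings for U+0410..U+042F (А..Я), in code-point order.
static readonly string[] CyrilicToLatinU =
  "A,B,V,G,D,E,Zh,Z,I,J,K,L,M,N,O,P,R,S,T,U,F,Kh,C,Ch,Sh,Sch,J,Y,J,E,Yu,Ya".Split(',');

public static string CyrilicToLatin(string s)
{
  var sb = new StringBuilder((int)(s.Length * 1.5));
  foreach (char c in s)
  {
    if (c >= '\u0430' && c <= '\u044F') sb.Append(CyrilicToLatinL[c - '\u0430']);
    else if (c >= '\u0410' && c <= '\u042F') sb.Append(CyrilicToLatinU[c - '\u0410']);
    else if (c == '\u0401') sb.Append("Yo"); // Ё lies outside the contiguous range
    else if (c == '\u0451') sb.Append("yo"); // ё lies outside the contiguous range
    else sb.Append(c);                       // everything else passes through
  }
  return sb.ToString();
}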
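
The same kind of range check can also address the follow-up in the question, namely how to tell whether a string is Cyrillic before deciding what to do with it. The following is a minimal sketch; the `CyrillicDetector` class is hypothetical, and it assumes that "Cyrillic" means the basic Unicode Cyrillic block U+0400..U+04FF (the supplementary Cyrillic blocks are ignored).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CyrillicDetector
{
    // True for characters in the basic Unicode Cyrillic block (U+0400..U+04FF).
    public static bool IsCyrillic(char c) => c >= '\u0400' && c <= '\u04FF';

    // True if at least one character of the string is Cyrillic.
    public static bool ContainsCyrillic(string s) => s.Any(c => IsCyrillic(c));

    // True if the string has at least one letter and every letter is Cyrillic.
    // Digits, spaces and punctuation are ignored.
    public static bool IsAllCyrillic(string s)
    {
        List<char> letters = s.Where(c => char.IsLetter(c)).ToList();
        return letters.Count > 0 && letters.All(c => IsCyrillic(c));
    }
}
```

For example, `CyrillicDetector.ContainsCyrillic("Петролеум")` is true and `CyrillicDetector.ContainsCyrillic("Petroleum")` is false, so the caller can decide per string whether to convert it or leave it alone.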


http://code.google.com/apis/ajaxlanguage/documentation/#Transliteration

Google offers this AJAX-based transliteration service. This way you can avoid computing transliterations yourself and let Google do them on the fly. Because the request to Google is made from the client side, your app needs some kind of web-based output for this solution to work.


Use a Dictionary with Russian and English words as a lookup table. It'll be a lot of typing to build it, but it's foolproof.
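
A minimal sketch of such a word-level lookup, assuming a hypothetical `WordLookup` class and a table with a single sample entry taken from the question; unknown words are returned unchanged.

```csharp
using System;
using System.Collections.Generic;

public static class WordLookup
{
    // Sample entry only; a real table would have to be filled in by hand.
    private static readonly Dictionary<string, string> Words =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Петролеум", "Petroleum" }
        };

    // Returns the English word if it is in the table, otherwise the input itself.
    public static string Translate(string word) =>
        Words.TryGetValue(word, out string english) ? english : word;
}
```

The case-insensitive comparer is a design choice of this sketch, so that "петролеум" and "Петролеум" hit the same entry.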


Are you looking for a way to transliterate Russian words written in Cyrillic (in whatever encoding, e.g. ISO 8859-5, the Latin/Cyrillic part of ISO 8859) into the Latin alphabet (with accents)?

I don't know whether .NET has anything built in for transliteration, but I dare say it doesn't (like many other good frameworks). This Wikipedia link could give you some ideas for implementing transliteration, but it is not the only way. Remember that the Cyrillic writing system is not used by Russian only, and the way you apply transliteration may vary with the language that uses it; e.g. see the same for Bulgarian. This link (also from Wikipedia) may be interesting too if you want to program the transliterator yourself.
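
As an illustration of the language dependence, here is a minimal sketch with per-language override tables. The `LanguageAwareTransliterator` class, the "ru"/"bg" codes and the tiny tables are assumptions made for this example. The letter щ is commonly romanized as "shch" for Russian and "sht" for Bulgarian, and ъ is a vowel in Bulgarian but a hard sign in Russian (simply dropped here as a simplification).

```csharp
using System.Collections.Generic;
using System.Text;

public static class LanguageAwareTransliterator
{
    // Letters treated the same way in both languages (partial, illustrative).
    private static readonly Dictionary<char, string> Shared = new Dictionary<char, string>
    {
        { 'и', "i" }, { 'т', "t" }, { 'г', "g" }, { 'л', "l" }
    };

    // Letters whose romanization depends on the language (partial, illustrative).
    private static readonly Dictionary<string, Dictionary<char, string>> Overrides =
        new Dictionary<string, Dictionary<char, string>>
        {
            { "ru", new Dictionary<char, string> { { 'щ', "shch" }, { 'ъ', "" } } },
            { "bg", new Dictionary<char, string> { { 'щ', "sht" }, { 'ъ', "a" } } }
        };

    public static string Transliterate(string text, string language)
    {
        // Throws KeyNotFoundException for a language code without a table.
        Dictionary<char, string> overrides = Overrides[language];
        var sb = new StringBuilder(text.Length * 2);
        foreach (char c in text)
        {
            // Language-specific rules win over the shared table.
            if (overrides.TryGetValue(c, out string latin) || Shared.TryGetValue(c, out latin))
                sb.Append(latin);
            else
                sb.Append(c); // unmapped characters pass through
        }
        return sb.ToString();
    }
}
```

With these tables, "щит" becomes "shchit" for "ru" and "shtit" for "bg".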


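This is a solution for Serbian Cyrillic-Latin transliteration, for a form with two text boxes and two buttons:

using System;
using System.Collections.Generic;
using System.Windows.Forms;

namespace WindowsFormsApplication1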
{
    public partial class Form1 : Form
    {
        Dictionary<string, string> slova = new Dictionary<string, string>();

        public Form1()
        {
            InitializeComponent();
            slova.Add("Љ", "Lj");
            slova.Add("Њ", "Nj");
            slova.Add("Џ", "Dž");
            slova.Add("љ", "lj");
            slova.Add("њ", "nj");
            slova.Add("џ", "dž");
            slova.Add("а", "a");
            slova.Add("б", "b");
            slova.Add("в", "v");
            slova.Add("г", "g");
            slova.Add("д", "d");
            slova.Add("ђ", "đ");
            slova.Add("е", "e");
            slova.Add("ж", "ž");
            slova.Add("з", "z");
            slova.Add("и", "i");
            slova.Add("ј", "j");
            slova.Add("к", "k");
            slova.Add("л", "l");
            slova.Add("м", "m");
            slova.Add("н", "n");
            slova.Add("о", "o");
            slova.Add("п", "p");
            slova.Add("р", "r");
            slova.Add("с", "s");
            slova.Add("т", "t");
            slova.Add("ћ", "ć");
            slova.Add("у", "u");
            slova.Add("ф", "f");
            slova.Add("х", "h");
            slova.Add("ц", "c");
            slova.Add("ч", "č");
            slova.Add("ш", "š");
        }

        // Method for cyrillic to latin
        private void button1_Click(object sender, EventArgs e)
        {
            string source = textBox1.Text;
            foreach (KeyValuePair<string, string> pair in slova)
            {
                source = source.Replace(pair.Key, pair.Value);
                // For upper case
                source = source.Replace(pair.Key.ToUpper(), 
                                        pair.Value.ToUpper());                             
            }
            textBox2.Text = source;
        }

        // Method for latin to cyrillic
        private void button2_Click(object sender, EventArgs e)
        {
            string source = textBox2.Text;
            foreach (KeyValuePair<string, string> pair in slova)
            {
                source = source.Replace(pair.Value, pair.Key);
                // For upper case
                source = source.Replace(pair.Value.ToUpper(),  
                                        pair.Key.ToUpper());
            }
            textBox1.Text = source;
        }
    }
}


Why do you want to do this? Changing characters one-for-one generally doesn't even produce a reasonable transliteration, much less a translation. You may find this post to be of interest.
