开发者

How to decipher an unknown substitution cipher

开发者 https://www.devze.com 2023-01-23 10:16 出处:网络
You are given a file containing a list of strings (one per line). The strings are sorted and then encrypted using an unknown substitution cipher (e.g. a < c, b < r, c < d). How do you determi

You are given a file containing a list of strings (one per line). The strings are sorted and then encrypted using an unknown substitution cipher (e.g. a < c, b < r, c < d). How do you determine what the mapping is for the substitution cipher? The unencrypted strings can be in any language.

I'd like to know if that question is hard or not, I was applying for a new gra开发者_StackOverflowduate position, and I couldn't solve it that good, and he stayed about 45 mins with me on that question.


I guess the key fact is that the strings were sorted before encryption, so you need not worry about language at all.

First solution that comes to my mind is just creating a brute-force backtracking algorithm, but this is probably not good.

Second solution I can think of is to extract all known relationships from the file, eg. this file:

xtw
yaw
yay

will tell you that x < y (because xtw < yaw) and w < y (because yaq < yay). After you have the directed graph of relationships, you just need to topologically sort this graph, and your solution is there.


I wouldn't label it as an easy question for sure. I think whatever solution you use, you will need to have some strong heuristics, and preferably some experience doing cipher puzzles.

You mention that the strings are sorted, I'm not sure if you could use that as a heuristic; if yes, then you can look at the first character of each word, and use that to figure out what the cypher is. I'm going to assume you can't do that, though.

Start out with a dictionary of all words for the chosen language, and, if possible, a frequency of how often those words tend to occur. Start with whichever size word occurs least frequently (For most languages, English included, this will be the 1 letter words), and start plugging away trying different possibilities doing a breadth first search.

There might be a better solution, though.


Well, usually substitution ciphers are 'broken' with statistical methods (Frequency distribution) which is probably not really feasible here since you don't know the language of the unenctypted string (might be vulcan?). Still you could at least try some known distributions. The hint is that the input is sorted (assuming that it is in latin alphabet). Still I can not judge how hard this would be.

But just a thougt: Why do you assume that to succed in this test you would have to 'solve' the actual problem? I think the interviewer just wanted to test your general problem-solving skills, how you go about solving problems, where you are not qualified (the classic from Google-Interviews: You are tiny and sit in a huge blender: What to you do?)

Regards Andreas


Substitution is an Symmetric encryption method, here we change/replace any character by another character according to the key selected, If you have a key of value 5 then the word "piran" is change like in this manner: p is replaced by 'u', i by 'j' and so on, means that we are replacing our each character by the character whose ASCII value is greater by 5 than the original character, as ASCII value of 'p' is 112 and if we add 5 then it becomes 112+5=117, and here 117 is the ASCII value of 'u' so in this manner each character is changed, and one thing should be noted that if you have 'z' value and you have key of 5 then no english alphabates will comes in this case, because there are too much character possible in computer, even if you will select key in million like "349833720" then you will get character of some other languages like chinees etc, because in computer there is a capacity for each type of character, which is predefined, Fire this link and see the practical example/apps on encryption which i have describe above https://play.google.com/store/apps/details?id=com.ifahja.secretsms#?t=W251bGwsMSwxLDIxMiwiY29tLmlmYWhqYS5zZWNyZXRzbXMiXQ..

0

精彩评论

暂无评论...
验证码 换一张
取 消