I was wondering, what characters are accepted in .Net identifiers?
Not C# or VB.Net, but the CLR.
The reason I ask is that I was looking at how yield return
statements are implemented (C# In Depth), and saw that they compile into code like:
public int <count>5__1;
Are there any other identifier characters that I could use? This code would not be public.
This is governed by the CLS specification, chapter 8.5.1 "Valid names":
CLS Rule 4: Assemblies shall follow Annex 7 of Technical Report 15 of the Unicode Standard 3.0 governing the set of characters permitted to start and be included in identifiers, available on-line at http://www.unicode.org/unicode/reports/tr15/tr15-18.html. Identifiers shall be in the canonical format defined by Unicode Normalization Form C. For CLS purposes, two identifiers are the same if their lowercase mappings (as specified by the Unicode locale-insensitive, one-to-one lowercase mappings) are the same. That is, for two identifiers to be considered different under the CLS they shall differ in more than simply their case. However, in order to override an inherited definition the CLI requires the precise encoding of the original declaration be used.
Or in other words, it doesn't specify a list of verboten characters; it is only concerned with being able to compare strings without surprises, which is all the CLR ever has to do. The job of a compiler is much harder: it must be able to recognize tokens in the program, which is the job of the lexer. Practical lexer implementations set rules on which characters are valid in an identifier, for example not being able to start an identifier with a digit.
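The comparison described in CLS Rule 4 can be sketched roughly as follows. This is only an illustration, not how the CLR or any compiler actually implements it: SameUnderCls is a hypothetical helper, and ToLowerInvariant is used here as an approximation of the "locale-insensitive, one-to-one lowercase mappings" the rule refers to.

```csharp
using System;
using System.Text;

static class ClsNames
{
    // Hypothetical helper: treats two identifiers as "the same" in the CLS sense
    // if they match after Normalization Form C and invariant lowercasing.
    // Assumption: ToLowerInvariant stands in for the mapping named in the rule.
    public static bool SameUnderCls(string a, string b)
    {
        string na = a.Normalize(NormalizationForm.FormC).ToLowerInvariant();
        string nb = b.Normalize(NormalizationForm.FormC).ToLowerInvariant();
        return string.Equals(na, nb, StringComparison.Ordinal);
    }
}

class Program
{
    static void Main()
    {
        // Names that differ only in case
        Console.WriteLine(ClsNames.SameUnderCls("Count", "count"));

        // Precomposed e-acute versus 'e' followed by a combining acute accent
        Console.WriteLine(ClsNames.SameUnderCls("caf\u00E9", "cafe\u0301"));

        // Genuinely different names
        Console.WriteLine(ClsNames.SameUnderCls("count", "<count>5__1"));
    }
}
```

Note that nothing in this check rejects a character such as < or >. It only decides whether two names collide, which matches the point that the rule is about comparison rather than about a list of banned characters.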
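The C# spec defines which characters can be used in a C# identifier.
The CLR, however, allows much more. That is why the C# compiler can emit names like <count>5__1 for the members it generates: they are valid to the CLR but can never clash with a name written in C# source.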
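If you want to try this yourself, one way is to bypass the C# lexer and hand the name to the runtime directly through System.Reflection.Emit. The following is a minimal sketch, assuming a runtime where Reflection.Emit is available (.NET Framework or a recent .NET); the assembly name IdentifierDemo and the type name Holder are made up for the example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

class Program
{
    static void Main()
    {
        // Build an in-memory assembly and module (names are arbitrary for this demo).
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("IdentifierDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("IdentifierDemo");

        // Define a class with a public int field whose name C# source would reject.
        TypeBuilder typeBuilder = module.DefineType(
            "Holder", TypeAttributes.Public | TypeAttributes.Class);
        typeBuilder.DefineField("<count>5__1", typeof(int), FieldAttributes.Public);

        // Bake the type; a default constructor is supplied because none was defined.
        Type holder = typeBuilder.CreateType();
        object instance = Activator.CreateInstance(holder);

        // The field can only be reached by reflection, since C# cannot spell its name.
        FieldInfo field = holder.GetField("<count>5__1");
        field.SetValue(instance, 42);
        Console.WriteLine(field.Name + " = " + field.GetValue(instance));
    }
}
```

The same name written in a .cs file would fail to compile, so anything named this way is only reachable through reflection or generated IL. That is fine for code that is not public, but it does make the members awkward to use from ordinary C#.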