Why was String designed as a reference type instead of a value type?
From a modeling perspective I would have made it a value type, since it represents something without identity. It has no distinguishing attributes. (E.g., I can't tell one string "a" apart from another string "a".)
I know I would run into serious performance problems if long strings were stored on the stack. It's probably impossible anyway, since strings can get very long and the stack is limited in size.
If it weren't for the performance, why would you design System.String as a reference type? (Assume any possible string is at most 16 bytes long.)
As you point out, a value type that can grow very large may be prohibitive, both because stack space is limited and because value types are copied whenever they are assigned or passed by value.
Also, the way strings are implemented in .NET adds a couple of elements to the equation. Strings are not only reference types, they are also immutable (at least outside the System namespace), and the runtime uses interning to do neat tricks with them.
All this adds up to a couple of benefits: duplicate literal strings are stored only once, and comparing such strings becomes extremely efficient, because you can compare references instead of sequences of Unicode characters. Those options would not be possible for value types.
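To make the interning point concrete, here is a minimal sketch. The class and method names (InterningSketch, LiteralsShareInstance, BuildAtRuntime) are made up for illustration; only object.ReferenceEquals, string.Intern and the string(char[]) constructor are real API. It assumes the default behaviour where identical literals compiled into the same assembly resolve to one instance.

```csharp
using System;

public static class InterningSketch
{
    // Two identical literals in the same assembly refer to one string object.
    public static bool LiteralsShareInstance()
    {
        string a = "hello";
        string b = "hello";
        return object.ReferenceEquals(a, b);
    }

    // A string assembled at run time is a fresh object and is not interned automatically.
    public static string BuildAtRuntime()
    {
        return new string(new[] { 'h', 'e', 'l', 'l', 'o' });
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareInstance());             // True

        string x = BuildAtRuntime();
        string y = BuildAtRuntime();
        Console.WriteLine(object.ReferenceEquals(x, y));         // False: two separate objects

        // string.Intern maps equal contents to a single shared instance.
        Console.WriteLine(object.ReferenceEquals(string.Intern(x), string.Intern(y))); // True
    }
}
```

Once two strings are known to be interned, a reference check is enough to tell whether they are equal, which is the cheap comparison mentioned above.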
Structs need to be fixed size. Think of a string[], for example. The only way you could have string as a value type would be to store just the pointer, which is essentially what we achieve by using a reference type.
Of course, it is also hugely beneficial that we don't copy the string every time we assign it.
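A rough illustration of the difference in copy behaviour, under the question's assumption of a 16-byte value. Sixteen and CopySketch are hypothetical names invented for this sketch; nothing here is how System.String is actually laid out.

```csharp
using System;

// Hypothetical fixed-size value type: two 8-byte fields, 16 bytes in total.
public struct Sixteen
{
    public long Low;
    public long High;
}

public static class CopySketch
{
    // Assigning a struct copies every field, so the copy is independent of the original.
    public static long OriginalAfterCopyIsModified()
    {
        var original = new Sixteen { Low = 1, High = 2 };
        var copy = original;   // all 16 bytes are copied here
        copy.Low = 99;         // changes only the copy
        return original.Low;
    }

    // Assigning a string copies only the reference, however long the string is.
    public static bool AssignmentSharesInstance()
    {
        string big = new string('x', 1000000);
        string alias = big;    // no characters are copied
        return object.ReferenceEquals(big, alias);
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterCopyIsModified()); // 1
        Console.WriteLine(AssignmentSharesInstance());    // True
    }
}
```

Because strings are immutable, sharing one instance between both variables is safe: neither holder can change what the other sees.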
My understanding is that strings are immutable classes rather than structs purely for performance reasons.
Strings tend to be created and then passed to many objects for rendering to a user or handing to other systems. After their creation, strings tend not to change, so copying the entire character array as a unique value in each object has little practical value and creates a lot of temporary objects.
Simple -- because I don't want to make copies of strings every time I pass one into a method. It takes more memory, and it takes more time.
In terms of equality, you can still treat it like a value type by using the == operator.
So if anything, it's just an advantage to have it as a reference, no?
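A small sketch of what that looks like in practice. EqualitySketch and its method names are made up for the example; the behaviour of == on string operands and of a reference comparison through object is standard C#.

```csharp
using System;

public static class EqualitySketch
{
    // Builds "abc" at run time so it is a different object from the literal "abc".
    private static string Fresh()
    {
        return new string(new[] { 'a', 'b', 'c' });
    }

    // == on two string operands compares contents, like a value type would.
    public static bool OperatorComparesContents()
    {
        string a = "abc";
        string b = Fresh();
        return a == b;
    }

    // Casting both sides to object forces a reference comparison instead.
    public static bool ObjectComparisonSeesTwoInstances()
    {
        string a = "abc";
        string b = Fresh();
        return (object)a == (object)b;
    }

    public static void Main()
    {
        Console.WriteLine(OperatorComparesContents());          // True
        Console.WriteLine(ObjectComparisonSeesTwoInstances());  // False
    }
}
```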
- ~ Edited to answer question more accurately
One point is that the String type, as in many languages, is encoded as Unicode, so it's illogical to treat it as a primitive type (like int), as there is no direct correspondence between its binary encoding and its human-readable form.
The Unicode layer automatically qualifies string types to be abstracted away from binary, whereas numbers are interchangeable between base 2 (binary) and base 10 (decimal) forms with relative ease.
The reason that primitive variables can reside on the stack is that there is plenty of room available for a lot of numbers. This isn't the case for the more data-heavy String type.
The types of operations carried out on strings are not really arithmetical but more Boolean-logic based (except when counting strings, when they are treated like a vector or array), so it makes sense to optimise the data structure for its primary uses, via the System.String class.