Doubts regarding memory management in .NET

I'm learning about memory management in C# from the book "Professional C#":

The presence of the garbage collector means that you will usually not worry about objects that you no longer need; you will simply allow all references to those objects to go out of scope and allow the garbage collector to free memory as required. However, the garbage collector does not know how to free unmanaged resources (such as file handles, network connections, and database connections). When managed classes encapsulate direct or indirect references to unmanaged resources, you need to make special provision to ensure that the unmanaged resources are released when an instance of the class is garbage collected.

When defining a class, you can use two mechanisms to automate the freeing of unmanaged resources.

Declaring a destructor (or finalizer) as a member of your class.

Implementing the System.IDisposable interface in your class.

I didn't understand a few things:

1. "unmanaged resources (such as file handles, network connections, and database connections)". What's the big deal about them? How come they are unmanaged? Or, why can't the GC manage these resources?
2. What code would we place in the finalizer or Dispose() method of a class, and what exactly would that code look like? Some examples using these resources would be a lot of help.

Some classes in the .NET Framework are just wrappers around Windows APIs or third-party assemblies. These APIs are not managed code (they may be written in C++, or they may be old COM components), and the garbage collector does not know when the application no longer needs them.

For example, when you open a file on disk, it remains open until you tell it to close. If you drop the reference to the file (e.g. by leaving the scope) without closing it, the file stays open and locked, at least until a finalizer runs or the process exits.

The Dispose method that the Framework implements for these classes calls the inner Close method needed to shut the instance down cleanly. So every class that wraps unmanaged code should implement the IDisposable interface to ensure that such a closing method exists.

When you instantiate such a class, it is good practice to do so in a using statement, because the Dispose method is then called automatically when you leave the scope.
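A minimal sketch of that behavior, assuming a hypothetical `TrackedResource` class that only records whether `Dispose` was called (a real wrapper would close its file, socket, or handle there). It is meant to show that `using` calls `Dispose` even when the body throws:

```csharp
using System;

// Hypothetical wrapper, used only to make the Dispose call visible.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        // A real wrapper would close its file, socket, or handle here.
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns true if Dispose ran even though the using body threw.
    public static bool DisposeRunsOnException()
    {
        // Usually written as: using (var resource = new TrackedResource()) { ... }
        // The variable is declared outside here only so it can be inspected afterwards.
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // The using block has already called Dispose by the time we get here.
        }
        return resource.IsDisposed;
    }
}
```

The `using` block is shorthand for a try/finally that calls `Dispose` in the `finally` part, which is why the cleanup still happens on the exception path.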

The real question here is about urgency. Because the garbage collector explicitly tracks memory, it knows when memory needs to be freed by cleaning up unreferenced objects. This can happen several times a minute, or once an hour, or even never (if no new objects need to be created). But the important thing is that it does happen when needed.

But memory isn't the only resource that is limited. Take files. Usually only one application at a time can open a file for writing, as it can become messy if several people try to write to the same file. Databases have a limited number of connections. And so on. The garbage collector doesn't track any of these resources, and it has no idea how urgent it is to close them.

Sure, you could open a FileStream and read from it without closing it afterwards. If you null out the reference to the object, the garbage collector will probably decide to collect the FileStream object eventually, its finalizer will run, and the file will get properly closed. But that could take a long time, and in the meantime the file is locked.
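To make the "file is locked in the meantime" point concrete, here is a rough sketch. `FileLockDemo` and its method names are made up for illustration; the file is a temporary one and is opened with `FileShare.None`, so a second open attempt is expected to fail while the first stream is alive. How strictly that sharing mode is enforced depends on the platform, so treat the first result as indicative only:

```csharp
using System;
using System.IO;

public static class FileLockDemo
{
    // Tries to open the file exclusively; returns false if the open is refused.
    public static bool CanOpenExclusively(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            return false;
        }
    }

    // Reports whether the file could be opened while another stream held it,
    // and again after that stream was disposed.
    public static (bool WhileOpen, bool AfterDispose) Run()
    {
        string path = Path.GetTempFileName();
        try
        {
            bool whileOpen;
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                whileOpen = CanOpenExclusively(path);
            }
            // The using block has disposed the stream, so the handle is released now,
            // not at some later garbage collection.
            bool afterDispose = CanOpenExclusively(path);
            return (whileOpen, afterDispose);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```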

With database connections it is far more urgent, as there is a very limited number of connections available. If you open too many connections without disposing of them, you will eventually get an error, because you will have a bunch of database objects holding open connections while they wait to be garbage collected.
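The exhaustion scenario can be imitated without a database. The sketch below uses a hypothetical `TinyPool` with two slots as a stand-in for a connection pool; it is not how any real database driver is implemented, and it is not thread-safe. It only contrasts abandoning leases with disposing them:

```csharp
using System;

// Hypothetical stand-in for a connection pool; not a real database API.
// Not thread-safe: a sketch for single-threaded illustration only.
public sealed class TinyPool
{
    private readonly int _size;
    private int _inUse;

    public TinyPool(int size) { _size = size; }

    public Lease Open()
    {
        if (_inUse >= _size)
            throw new InvalidOperationException("pool exhausted");
        _inUse++;
        return new Lease(this);
    }

    public sealed class Lease : IDisposable
    {
        private readonly TinyPool _pool;
        private bool _disposed;

        internal Lease(TinyPool pool) { _pool = pool; }

        public void Dispose()
        {
            if (_disposed) return;   // a second Dispose call is a no-op
            _disposed = true;
            _pool._inUse--;          // hand the slot back to the pool
        }
    }
}

public static class PoolDemo
{
    // Opens leases and abandons them; returns how many opens succeeded.
    public static int OpenWithoutDisposing(int attempts)
    {
        var pool = new TinyPool(2);
        int succeeded = 0;
        for (int i = 0; i < attempts; i++)
        {
            try { pool.Open(); succeeded++; }
            catch (InvalidOperationException) { break; }
        }
        return succeeded;
    }

    // Opens each lease in a using block, so the slot is returned every time.
    public static int OpenWithUsing(int attempts)
    {
        var pool = new TinyPool(2);
        int succeeded = 0;
        for (int i = 0; i < attempts; i++)
        {
            using (pool.Open()) { succeeded++; }
        }
        return succeeded;
    }
}
```

With a pool of two slots, the abandoning loop stops after two opens, while the `using` loop can run for as many iterations as requested.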

Properly disposing of IDisposable objects is therefore good practice. Sometimes you could get away with not doing so, but it is poor style. If an object implements IDisposable, it is because it wants you to clean it up when you are done using it.

1.) The GC does not know how to close external resources properly. Of course it could kill a network connection (which is, in fact, what happens if you don't disconnect, e.g., a database connection). But the database is not notified that the connection is being closed.

The same goes for file streams. Is there still something in a buffer? Does that have to be written to the file before the file handle is closed? The GC does not know about this; the accessing code does.

2.) This follows from the first point. If you have an open file stream and an internal buffer, then in the Dispose method you would flush the buffer, write it to the file, and close the file handle.
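As a sketch of that flush-then-close sequence, consider a hypothetical `BufferedLineWriter` (the class and its members are invented for this example) that keeps lines in memory and only writes them out in `Dispose`:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical class: collects lines in memory and owns the underlying file writer.
public sealed class BufferedLineWriter : IDisposable
{
    private readonly StreamWriter _writer;
    private readonly List<string> _pending = new List<string>();
    private bool _disposed;

    public BufferedLineWriter(string path)
    {
        _writer = new StreamWriter(path); // overwrites an existing file
    }

    public void Add(string line)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(BufferedLineWriter));
        _pending.Add(line);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        try
        {
            // Only this class knows there is unwritten data; the GC does not.
            foreach (string line in _pending) _writer.WriteLine(line);
            _pending.Clear();
        }
        finally
        {
            // Disposing the StreamWriter flushes it and closes the file handle.
            _writer.Dispose();
        }
    }
}
```

If the object were simply abandoned, the lines still held in `_pending` would never reach the file, which is the kind of knowledge the garbage collector cannot have.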

Usually, you don't access databases directly. You use libraries that manage this for you.

In most cases it is enough to dispose of those external resource managers (DB connection, file stream, network classes) when your own class is being disposed.

This is a good question and one that a lot of developers don't seem to understand.

At a high level, managed resources are resources that are allocated and tracked by .NET. The memory used by the resource comes from a pool allocated to .NET, and the .NET runtime tracks all references between managed resources. This tracking (I'm sure this is the wrong term, but it will suffice here) allows the .NET runtime to know when a given resource is no longer being used and is thus eligible to be released. Unmanaged resources, therefore, are resources allocated outside of that .NET managed pool and not tracked by the runtime. Most often, these are references to OS or external application resources. There are all sorts of complicated reasons why the .NET runtime cannot "see" into an unmanaged resource, but I like to think of it like this: .NET is a walled development garden that you must enter to use. You can poke a hole in that wall to see outside (i.e., P/Invoke), but you cannot own a resource on the other side.

Now, on to the second part of your question. Bill Wagner has a great discussion on how to implement Dispose methods and why in his book Effective C#. There are also some really good answers about this here and here.

Hope this helps.

For 2) see here: http://msdn.microsoft.com/en-us/library/fs2xkftw.aspx

Unmanaged resources are handles to resources owned and controlled by the operating system (other than memory of course).

The GC doesn't clean up memory immediately at the point where there are no longer any references to an object; it may leave it for a long time. If it did this with file, network, and graphics handles, it would potentially tie up a lot of operating system resources and release them only occasionally.

In order to release these unmanaged resources back to the operating system you need to explicitly release them by disposing them. Hence the use of IDisposable and the using keyword.
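A sketch of how a class that directly owns an unmanaged resource is commonly written, combining `IDisposable` with a finalizer as a safety net. `NativeBuffer` is a hypothetical class, and native memory from `Marshal.AllocHGlobal` stands in for whatever handle the class would really own. Microsoft's guidance generally favors wrapping handles in a `SafeHandle` rather than hand-writing a finalizer, so read this as an illustration of the mechanism rather than a recommendation:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class owning a block of native memory that the GC does not track.
public class NativeBuffer : IDisposable
{
    private readonly int _size;
    private IntPtr _memory;
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _size = size;
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    public void WriteByte(int offset, byte value)
    {
        CheckAccess(offset);
        Marshal.WriteByte(_memory, offset, value);
    }

    public byte ReadByte(int offset)
    {
        CheckAccess(offset);
        return Marshal.ReadByte(_memory, offset);
    }

    private void CheckAccess(int offset)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        if (offset < 0 || offset >= _size) throw new ArgumentOutOfRangeException(nameof(offset));
    }

    // Deterministic cleanup, called directly or by a using statement.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, so the finalizer is not needed
    }

    // Safety net: runs at some unspecified time if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        if (disposing)
        {
            // Other managed IDisposable fields would be disposed here.
            // They must not be touched when called from the finalizer.
        }
        Marshal.FreeHGlobal(_memory); // the unmanaged part is freed on both paths
        _memory = IntPtr.Zero;
        _disposed = true;
    }
}
```

Callers would normally write `using (var buffer = new NativeBuffer(16)) { ... }` so that the memory is returned as soon as the block ends.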

I don't like the way the cited text uses the term "unmanaged resources", since it suggests that the term refers primarily to objects the OS knows about. In fact, I think it is much more helpful to think of an "unmanaged resource" as being something outside the present object (possibly outside the computer!), whose useful life may exceed that of the present object, whose state may have been altered in a way that would cause problems if not cleaned up, and which the present object is expected to clean up. A "managed resource" is a reference to an object which holds one or more "unmanaged resources", but which will usually manage to take care of those resources (at least eventually) even if it is abandoned.

It is possible to have unmanaged resources even within entirely managed code. As a simple example, an enumerator for a collection might subscribe to an event so it will get notified if the collection changes. The collection's event-subscription list is an unmanaged resource. Unless the enumerator unsubscribes from the event before it is abandoned, the useful life of the collection holding the event subscription is likely to exceed that of the enumerator. While an occasional abandoned event subscription may not cause too much harm, a routine which creates many enumerators and abandons them without cleaning the subscriptions could wreak substantial havoc.
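The event-subscription case can be sketched as follows. `ObservableBag` and `BagWatcher` are hypothetical types invented for this example; the watcher plays the role of the enumerator and unsubscribes in `Dispose`:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection that raises an event on every change.
public sealed class ObservableBag
{
    private readonly List<int> _items = new List<int>();

    public event EventHandler Changed;

    public void Add(int item)
    {
        _items.Add(item);
        Changed?.Invoke(this, EventArgs.Empty);
    }

    // Number of handlers currently attached to the event.
    public int SubscriberCount => Changed?.GetInvocationList().Length ?? 0;
}

// Hypothetical enumerator-like object that wants to know when the bag changes.
public sealed class BagWatcher : IDisposable
{
    private readonly ObservableBag _bag;

    public bool Invalidated { get; private set; }

    public BagWatcher(ObservableBag bag)
    {
        _bag = bag;
        _bag.Changed += OnChanged; // the bag now holds a reference to this watcher
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Invalidated = true;
    }

    public void Dispose()
    {
        // Removing a handler that is no longer attached is harmless,
        // so calling Dispose twice is safe.
        _bag.Changed -= OnChanged;
    }
}
```

Each watcher that is abandoned without `Dispose` stays in the bag's subscription list for as long as the bag lives, so `SubscriberCount` keeps growing; watchers created in a `using` block leave the count unchanged.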

In practice I have done a lot of coding in both native code (C++, also known as unmanaged) and managed code (C#). I'm still not sure why C# was invented in the first place. There are a lot of improvements for the developer, of course, but there are quite a few hidden rocks in the C# architecture.

To the best of my understanding, a lot of professional developers at Microsoft were given the task of making C# happen, and, as normally goes with any new platform, the developers got overexcited about their technology.

The first move was of course to claim that "we are doing the right thing" and everyone else is doing it wrong. I guess the term "unmanaged" appeared because of that. It's as if we have "managed" code here and, over there, something that was not designed properly: "un"-something. :-)

Unmanaged resources like file handles, network connections, and database connections have been managed for quite a long time already: if you terminate a process, the OS closes all of its file handles.

In C++ you have malloc and free; in C# you have new (or gcnew in C++/CLI), but then you fight with questions such as what references what, why an object won't go away from memory, and what is eating RAM, and most of those questions are quite difficult to answer.

Constructors and destructors got replaced with finalizers, destructors, and disposable objects, where it's relatively difficult to test what gets invoked, in which order, and whether you remembered to release all resources. "Oh, we are managed, but we don't know how to manage these objects..." :)

C# is fun as long as the application is small and simple. Once you add 3D objects, a huge number of allocations, and a lot of functionality, compilation starts to become slow, you are no longer satisfied with C#, and you start thinking about going back to C++.

On quite a few forums you can find debates about whether C# or C++ is better, faster, or easier, and most people try to defend C# at all costs. The reality is that it's an over-complex, over-abstracted, and too heavy monster, and no one can control it anymore.

The intermediate language (IL) is one additional abstraction layer between source code and executable code (assembly), which makes it more difficult to optimize and speed up your program.

But I'm not a big fan of C++ either: the complexity of the language, with its pointers, references, and objects / classes, does not make C++ easier to code in or easier to learn.

Basically, when you design a new language, you need to take the target infrastructure into account (low-level assembly, plus smooth integration with C++ code) as well as high-level improvements: easy to use, easy to understand, easy to develop in, easy to maintain and improve.

Creating more "words" might in theory make a language richer, since you can express yourself more efficiently with less text, but this does not prevent pollution of the language itself (as with IDisposable).

I'm intentionally comparing programming languages with natural language, because I'm writing this answer in natural language, and it's more native to me than a programming language. A programming language, however, is better structured than a natural one and does not have 2016 years of history.

2 - finalizer / Dispose: I have read this chapter at least 5 times, but I still don't understand it. I typically create one function (close) and call it from both the finalizer and Dispose. Why bother trying to understand something that is not important?

My recommendation to you is, in any case, to try everything in code: see how it looks and how it feels. Books tend to become something like a Bible: they drag you into a religion that you don't necessarily want to be in.