开发者

Why is free() not allowed in garbage-collected languages?

开发者 https://www.devze.com 2022-12-29 17:13 出处:网络
I was开发者_如何转开发 reading the C# entry on Wikipedia, and came across: Managed memory cannot be explicitly freed; instead, it is automatically garbage collected.

I was开发者_如何转开发 reading the C# entry on Wikipedia, and came across:

Managed memory cannot be explicitly freed; instead, it is automatically garbage collected.

Why is it that in languages with automatic memory management, manual management isn't even allowed? I can see that in most cases it wouldn't be necessary, but wouldn't it come in handy where you are tight on memory and don't want to rely on the GC being smart?


Languages with automatic memory management are designed to provide substantial memory safety guarantees that can't be offered in the presence of any manual memory management.

Among the problems prevented are

  • Double free()s
  • Calling free() on a pointer to memory that you do not own, leading to illegal access in other places
  • Calling free() on a pointer that was not the return value of an allocation function, such as taking the address of some object on the stack or in the middle of an array or other allocation.
  • Dereferencing a pointer to memory that has already been free()d

Additionally, automatic management can result in better performance when the GC moves live objects to a consolidated area. This improves locality of reference and hence cache performance.


Garbage collection enforces the type safety of a memory allocator by guaranteeing that memory allocations never alias. That is, if a piece of memory is currently being viewed as a type T, the memory allocator can guarantee (with garbage collection) that while that reference is alive, it will always refer to a T. More specifically, it means that the memory allocator will never return that memory as a different type.

Now, if a memory allocator allows for manual free() and uses garbage collection, it must ensure that the memory you free()'d is not referenced by anyone else; in other words, that the reference you pass in to free() is the only reference to that memory. Most of the time this is prohibitively expensive to do given an arbitrary call to free(), so most memory allocators that use garbage collection do not allow for it.

That isn't to say it is not possible; if you could express a single-referrent type, you could manage it manually. But at that point it would be easier to either stop using a GC language or simply not worry about it.


Calling GC.Collect is almost always the better than having an explicit free method. Calling free would make sense only for pointers/object refs that are referenced from nowhere. That is something that is error prone, since there is a chance that your call free for the wrong kind of pointer.

When the runtime environment does reference counting monitoring for you, it knows which pointers can be freed safely, and which not, so letting the GC decide which memory can be freed avoids a hole class of ugly bugs. One could think of a runtime implementation with both GC and free where explicitly calling free for a single memory block might be much faster than running a complete GC.Collect (but don't expect freeing every possible memory block "by hand" to be faster than the GC). But I think the designers of C#, CLI (and other languages with garbage collectors like Java) have decided to favor robustness and safety over speed here.


In systems that allow objects to be manually freed, the allocation routines have to search through a list of freed memory areas to find some free memory. In a garbage-collection-based system, any immediately-available free memory is going to be at the end of the heap. It's generally faster and easier for the system to ignore unused areas of memory in the middle of the heap than it would be to try to allocate them.


Interestingly enough, you do have access to the garbage collector through System.GC -- Though from everything I've read, it's highly recommended that you allow the GC manage itself.

I was advised once to use the following 2 lines by a 3rd party vendor to deal with a garbage collection issue with a DLL or COM object or some-such:

// Force garbage collection (cleanup event objects from previous run.)
GC.Collect();         // Force an immediate garbage collection of all generations
GC.GetTotalMemory(true);

That said, I wouldn't bother with System.GC unless I knew exactly what was going on under the hood. In this case, the 3rd party vendor's advice "fixed" the problem that I was dealing with regarding their code. But I can't help but wonder if this was actually a workaround for their broken code...


If you are in situation that you "don't want to rely on the GC being smart" then most probably you picked framework for your task incorrectly. In .net you can manipulate GC a bit (http://msdn.microsoft.com/library/system.gc.aspx), in Java not sure.

I think you can't call free because you start doing one task of GC. GC's efficiency can be somehow guaranteed overall when it does things the way it finds it best and it does them when it decides. If developers will interfere with GC it might decrease it's overall efficiency.


I can't say that it is the answer, but one that comes to mind is that if you can free, you can accidentally double free a pointer/reference or even worse, use one after free. Which defeats the main point of using languages like c#/java/etc.

Of course one possible solution to that, would be to have your free take it's argument by reference and set it to null after freeing. But then, what if they pass an r-value like this: free(whatever()). I suppose you could have an overload for r-value versions, but don't even know if c# supports such a thing :-P.

In the end, even that would be insufficient because as has been pointed out, you can have multiple references to the same object. Setting one to null would do nothing to prevent the others from accessing the now deallocated object.


Many of the other answers provide good explanations of how the GC work and how you should think when programming against a runtime system which provides a GC.

I would like to add a trick that I try to keep in mind when programming in GC'd languages. The rule is this "It is important to drop pointers as soon as possible." By dropping pointers I mean that I no longer point to objects that I no longer will use. For instance, this can be done in some languages by setting a variable to Null. This can be seen as a hint to the garbage collector that it is fine to collect this object, provided there are no other pointers to it.


Why would you want to use free()? Suppose you have a large chunk of memory you want to deallocate.

One way to do it is to call the garbage collector, or let it run when the system wants. In that case, if the large chunk of memory can't be accessed, it will be deallocated. (Modern garbage collectors are pretty smart.) That means that, if it wasn't deallocated, it could still be accessed.

Therefore, if you can get rid of it with free() but not the garbage collector, something still can access that chunk (and not through a weak pointer if the language has the concept), which means that you're left with the language's equivalent of a dangling pointer.

The language can defend itself against double-frees or trying to free unallocated memory, but the only way it can avoid dangling pointers is by abolishing free(), or modifying its meaning so it no longer has a use.


Why is it that in languages with automatic memory management, manual management isn't even allowed? I can see that in most cases it wouldn't be necessary, but wouldn't it come in handy where you are tight on memory and don't want to rely on the GC being smart?

In the vast majority of garbage collected languages and VMs it does not make sense to offer a free function although you can almost always use the FFI to allocate and free unmanaged memory outside the managed VM if you want to.

There are two main reasons why free is absent from garbage collected languages:

  1. Memory safety.
  2. No pointers.

Regarding memory safety, one of the main motivations behind automatic memory management is eliminating the class of bugs caused by incorrect manual memory management. For example, with manual memory management calling free with the same pointer twice or with an incorrect pointer can corrupt the memory manager's own data structures and cause non-deterministic crashes later in the program (when the memory manager next reaches its corrupted data). This cannot happen with automatic memory management but exposing free would open up this can of worms again.

Regarding pointers, the free function releases a block of allocated memory at a location specified by a pointer back to the memory manager. Garbage collected languages and VMs replace pointers with a more abstract concept called references. Most production GCs are moving which means the high-level code holds a reference to a value or object but the underlying location in memory can change as the VM is capable of moving allocated blocks of memory around without the high-level language knowing. This is used to compact the heap, preventing fragmentation and improving locality.

So there are good reasons not to have free when you have a GC.


Manual management is allowed. For example, in Ruby calling GC.start will free everything that can be freed, though you can't free things individually.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号