开发者

Why is NULL/0 an illegal memory location for an object?

开发者 https://www.devze.com 2023-01-01 19:09 出处:网络
I understand the purpose of the NULL constant in C/C++, a开发者_开发问答nd I understand that it needs to be represented some way internally.

I understand the purpose of the NULL constant in C/C++, a开发者_开发问答nd I understand that it needs to be represented some way internally.

My question is: Is there some fundamental reason why the 0-address would be an invalid memory-location for an object in C/C++? Or are we in theory "wasting" one byte of memory due to this reservation?


The null pointer does not actually have to be 0. It's guaranteed in the C spec that when a constant 0 value is given in the context of a pointer it is treated as null by the compiler, however if you do

char *foo = (void *)1;
--foo;
// do something with foo

You will access the 0-address, not necessarily the null pointer. In most cases this happens to actually be the case, but it's not necessary, so we don't really have to waste that byte. Although, in the larger picture, if it isn't 0, it has to be something, so a byte is being wasted somewhere

Edit: Edited out the use of NULL due to the confusion in the comments. Also, the main message here is "null pointer != 0, and here's some C/pseudo code that shows the point I'm trying to make." Please don't actually try to compile this or worry about whether the types are proper; the meaning is clear.


This has nothing to do with wasting memory and more with memory organization.

When you work with the memory space, you have to assume that anything not directly "Belonging to you" is shared by the entire system or illegal for you to access. An address "belongs to you" if you have taken the address of something on the stack that is still on the stack, or if you have received it from a dynamic memory allocator and have not yet recycled it. Some OS calls will also provide you with legal areas.

In the good old days of real mode (e.g., DOS), all the beginning of the machine's address space was not meant to be written by user programs at all. Some of it even mapped to things like I/O. For instance, writing to the address space at 0xB800 (fairly low) would actually let you capture the screen! Nothing was ever placed at address 0, and many memory controller would not let you access it, so it was a great choice for NULL. In fact, the memory controller on some PCs would have gone bonkers if you tried writing there.

Today the operating system protects you with a virtual address space. Nevertheless, no process is allowed to access addresses not allocated to it. Most of the addresses are not even mapped to an actual memory page, so accessing them will trigger a general protection fault or the equivalent in your operating system. This is why 0 is not wasted - even though all the processes on your machine "have an address 0", if they try to access it, it is not mapped anywhere.


There is no requirement that a null pointer be equal to the 0-address, it's just that most compilers implement it this way. It is perfectly possible to implement a null pointer by storing some other value and in fact some systems do this. The C99 specification §6.3.2.3 (Pointers) specifies only that an integer constant expression with the value 0 is a null pointer constant, but it does not say that a null pointer when converted to an integer has value 0.

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

On some embedded systems the zero memory address is used for something addressable.


The zero address and the NULL pointer are not (necessarily) the same thing. Only a literal zero is a null pointer. In other words:

char* p = 0; // p is a null pointer

char* q = 1;
q--; // q is NOT necessarily a null pointer

Systems are free to represent the null pointer internally in any way they choose, and this representation may or may not "waste" a byte of memory by making the actual 0 address illegal. However, a compiler is required to convert a literal zero pointer into whatever the system's internal representation of NULL is. A pointer that comes to point to the zero address by some way other than being assigned a literal zero is not necessarily null.

Now, most systems do use 0 for NULL, but they don't have to.


It is not necessarily an illegal memory location. I have stored data by dereferencing a pointer to zero... it happens the datum was an interrupt vector being stored at the vector located at address zero.

By convention it is not normally used by application code since historically many systems had important system information starting at zero. It could be the boot rom or a vector table or even unused address space.


On many processors address zero is the reset vector, wherein lies the bootrom (BIOS on a PC), so you are unlikely to be storing anything at that physical address. On a processor with an MMU and a supporting OS, the physical and logical address addresses need not be the same, and the address zero may not be a valid logical address in the executing process context.


NULL is typically the zero address, but it is the zero address in your applications virtual address space. The virtual addresses that you use in most modern operating systems have exactly nothing to do with actual physical addresses, the OS maps from the virtual address space to the physical addresses for you. So, no, having the virtual address 0 representing NULL does not waste any memory.

Read up on virtual memory for a more involved discussion if you're curious.


I don't see the answers directly addressing what i think you were asking, so here goes:

Yes, at least 1 address value is "wasted" (made unavailable for use) because of the constant used for null. Whether it maps to 0 in linear map of process memory is not relevant.

And the reason that address won't be used for data storage is that you need that special status of the null pointer, to be able to distinguish from any other real pointer. Just like in the case of ASCIIZ strings (C-string, NUL-terminated), where the NUL character is designated as end of character string and cannot be used inside strings. Can you still use it inside? Yeah but that will mislead library functions as of where string ends.

I can think of at least one implementation of LISP i was learning, in which NIL (Lisp's null) was not 0, nor was it an invalid address but a real object. The reason was very clever - the standard required that CAR(NIL)=NIL and CDR(NIL)=NIL (Note: CAR(l) returns pointer to the head/first element of a list, where CDR(l) returns ptr to the tail/rest of the list.). So instead of adding if-checks in CAR and CDR whether the pointer is NIL - which will slow every call - they just allocated a CONS (think list) and assigned its head and tail to point to itself. There! - this way CAR and CDR will work and that address in memory won't be reused (because it is taken by the object devised as NIL)

ps. i just remembered that many-many years ago i read about some bug of Lattice-C that was related to NULL - must have been in the dark MS-DOS segmentation times, where you worked with separate code segment and data segment - so i remember there was an issue that it was possible for the first function from a linked library to have address 0, thus pointer to it will be considered invalid since ==NULL


But since modern operating systems can map the physical memory to logical memory addresses (or better: modern CPUs starting with the 386), not even a single byte is wasted.


As people already have pointed out, the bit representation of the NULL pointer has not to be the same as the bit represention of a 0 value. It is though in nearly all cases (the old dinosaur computers that had special addresses can be neglected) because a NULL pointer can also be used as a boolean and by using an integer (of suffisent size) to hold the pointer value it is easier to represent in the common ISAs of modern CPU. The code to handle it is then much more straight forward, thus less error prone.


You are correct in noting that the address space at 0 is not usable storate for your program. For a number of reasons a variety of systems do not consider this a valid address space for your program anyway.

Allowing any valid address to be used would require a null value flag for all pointers. This would exceed the overhead of the lost memory at address 0. It would also require additional code to check and see if the address were null or not, wasting memory and processor cycles.

Ideally, the address that NULL pointer is using (usually 0) should return an error on access. VAX/VMS never mapped a page to address 0 so following the NULL pointer would result in a failure.


The memory at that address is reserved for use by the operating system. 0 - 64k is reserved. 0 is used as a special value to indicate to developers "not a valid address".

0

精彩评论

暂无评论...
验证码 换一张
取 消