Today I had to use the basename()
function, and the man 3 basename
(here) gave me some strange message:
Notes
There are two different versions of basename() - the POSIX version described above, and the GNU version, which one gets after
#define _GNU_SOURCE
#include <string.h>
I'm wondering what this #define _GNU_SOURCE
means: is it tainting the code I write with a GNU-related license? Or is it simply used to tell the compiler something like "Well, I know, this set of functions is not POSIX, thus not portable, but I'd like to use it anyway".
If so, why not give people different headers, instead of having to define some obscure macro to get one function implementation or the other?
Something also bugs me: how does the compiler know which function implementation to link with the executable? Does it use this #define
as well?
Anybody have some pointers to give me?
Defining _GNU_SOURCE
has nothing to do with license and everything to do with writing (non-)portable code. If you define _GNU_SOURCE
, you will get:
- access to lots of nonstandard GNU/Linux extension functions
- access to traditional functions which were omitted from the POSIX standard (often for good reason, such as being replaced with better alternatives, or being tied to particular legacy implementations)
- access to low-level functions that cannot be portable, but that you sometimes need for implementing system utilities like mount, ifconfig, etc.
- broken behavior for lots of POSIX-specified functions, where the GNU folks disagreed with the standards committee on how the functions should behave and decided to do their own thing.
As long as you're aware of these things, it should not be a problem to define _GNU_SOURCE
, but you should avoid defining it and instead define _POSIX_C_SOURCE=200809L
or _XOPEN_SOURCE=700
when possible to ensure that your programs are portable.
In particular, the things from _GNU_SOURCE
that you should never use are #2 and #4 above.
For the exact details of what _GNU_SOURCE enables, see the documentation.
From the GNU documentation:
Macro: _GNU_SOURCE
If you define this macro, everything is included: ISO C89, ISO C99, POSIX.1, POSIX.2, BSD, SVID, X/Open, LFS, and GNU extensions. In the cases where POSIX.1 conflicts with BSD, the POSIX definitions take precedence.
From the Linux man page on feature test macros:
_GNU_SOURCE
Defining this macro (with any value) implicitly defines _ATFILE_SOURCE, _LARGEFILE64_SOURCE, _ISOC99_SOURCE, _XOPEN_SOURCE_EXTENDED, _POSIX_SOURCE, _POSIX_C_SOURCE with the value 200809L (200112L in glibc versions before 2.10; 199506L in glibc versions before 2.5; 199309L in glibc versions before 2.1) and _XOPEN_SOURCE with the value 700 (600 in glibc versions before 2.10; 500 in glibc versions before 2.2). In addition, various GNU-specific extensions are also exposed.
Since glibc 2.19, defining _GNU_SOURCE also has the effect of implicitly defining _DEFAULT_SOURCE. In glibc versions before 2.20, defining _GNU_SOURCE also had the effect of implicitly defining _BSD_SOURCE and _SVID_SOURCE.
Note: _GNU_SOURCE
needs to be defined before including header files so that the respective headers enable the features. For example:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
...
_GNU_SOURCE
can also be enabled per compilation using the -D
flag:
$ gcc -D_GNU_SOURCE file.c
(-D
is not specific to _GNU_SOURCE
; any macro can be defined this way).
Let me answer two further points:
Something also bugs me: how does the compiler know which function implementation to link with the executable? Does it use this #define as well?
A common approach is to conditionally #define
identifier basename
to different names, depending on whether _GNU_SOURCE
is defined. For instance:
#ifdef _GNU_SOURCE
# define basename __basename_gnu
#else
# define basename __basename_nongnu
#endif
Now the library simply needs to provide both behaviors under those names.
If so, why not give people different headers, instead of having to define some obscure macro to get one function implementation or the other?
Often the same header had slightly different contents in different Unix versions, so there is no single right content for, say, <string.h>
— there are many standards (xkcd).
There's a whole set of macros to pick your favorite one, so that if your program expects one standard, the library will conform to that.
From some mailing list via google:
Look at glibc's include/features.h:
_GNU_SOURCE All of the above, plus GNU extensions.
Which means it enables all this:
__STRICT_ANSI__, _ISOC99_SOURCE, _POSIX_SOURCE, _POSIX_C_SOURCE, _XOPEN_SOURCE, _XOPEN_SOURCE_EXTENDED, _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, _FILE_OFFSET_BITS=N, _BSD_SOURCE, _SVID_SOURCE
So it turns on a whole set of feature test macros that the glibc headers check; they are not compiler flags for gcc itself.
why not give people different headers
They already got them; the headers are divided by topic into files, so it takes another dimension to filter.
I was looking for a signal number-to-name conversion. I found strsignal()
, in <string.h>
. The man page says:
sigabbrev_np(), sigdescr_np():
_GNU_SOURCE <<< not default
strsignal():
From glibc 2.10 to 2.31:
_POSIX_C_SOURCE >= 200809L <<< default, cf. XOPEN2K8 below
Before glibc 2.10:
_GNU_SOURCE
I had never really cared for this part at all. sigabbrev_np()
is not included in the default "features". string.h
shows how:
#ifdef __USE_XOPEN2K8
/* Return a string describing the meaning of the signal number in SIG. */
extern char *strsignal (int __sig) __THROW;
# ifdef __USE_GNU
/* Return an abbreviation string for the signal number SIG. */
extern const char *sigabbrev_np (int __sig) __THROW;
/* Return a string describing the meaning of the signal number in SIG,
the result is not translated. */
extern const char *sigdescr_np (int __sig) __THROW;
# endif
__USE_GNU
can/should be set via _GNU_SOURCE
, at compilation or top of the file. But that "activates" all other such ifdeffed declarations in all headers, too. (Unless you define-undefine per header)
So to explicitly import just one (or the other) special function, I go like this for now (copy-paste. I left the "THROW" and changed "__sig"):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
extern const char *sigabbrev_np(int sig) __THROW; /* __USE_GNU / _GNU_SOURCE */
#include <errno.h>
#include <elf.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
...
Now sigabbrev_np(wstate >> 8)
gives me TRAP
etc. without #defines.
I had a hard time realizing that 0x57f
means OK because 5
is TRAP
, but 0xb7f
and 0x77f
are SEGV
and BUS
--- which I got depending on where I set the breakpoint, sometimes after thousands of instructions. Because I did not step back the instruction pointer...