开发者

String manipulation & memory allocation - C

开发者 https://www.devze.com 2023-01-05 19:42 出处:网络
I am in the process of learning C. I have a method that takes 3 strings and combines them to do some operation. Following was my first implementation using a GCC compiler.

I am in the process of learning C. I have a method that takes 3 strings and combines them to do some operation. Following was my first implementation using a GCC compiler.

void foo(const char *p1, const char *p2, const char *p3)
{
    size_t length = strlen(p1) + strlen(p2) + strlen(p3);
    char combined[length + 1];
    memset(combined, 0, length + 1);
    strcat(combined, p1);
    strcat(combined, p2);
    strcat(combined, p3);
    printf("Result : %s", combined);
}

int main()
{
    foo("hello ", "world ", "how");
    return 0;
}

This works well. But when I compiled this using, cc -Wall -pe开发者_JAVA技巧dantic -g foo.c -o foo, I started getting warnings like ISO C90 forbids variable length array ‘combined’. MSVC was not compiling this code. Changed the code like

void foo(const char *p1, const char *p2, const char *p3)
{
    size_t length = strlen(p1) + strlen(p2) + strlen(p3);
    char *combined = (char *) malloc(length + 1);
    memset(combined, 0, length + 1);
    strcat(combined, p1);
    strcat(combined, p2);
    strcat(combined, p3);
    printf("Result : %s", combined);
    free(combined);
}

Questions

  1. Is this the correct implementation?
  2. If variable length arrays are not part of standard, why GCC implemented it? If the code is expected to compile only on GCC, using variable arrays will be a better alternative than using malloc?
  3. I think the thumb rule is, if memory required is knows at compile time, use arrays else use malloc to allocate required memory. Is this correct?
  4. My code is expected to compile on GCC and MSVC. I will be developing on GCC usually. So what are the compiler flags that ensures maximum portability? Currently I am using -Wall -pedantic. Should I use -ansi too? What would be the equivalent flags available in MSVC?
  5. What are the other common things to consider when writing portable C code?


This works well. But when I compiled this using, cc -Wall -pedantic -g foo.c -o foo, I started getting warnings like ISO C90 forbids variable length array ‘combined’.

Try compiling with -std=c99 option (gcc).

MSVC was not compiling this code. Changed the code like

If variable length arrays are not part of standard, why GCC implemented it?

VLAs are part of ISO C99(gcc and g++(as an extension) support VLAs). MSVC still only supports C89.

My code is expected to compile on GCC and MSVC.

Then you should not use VLAs in your code IMHO.


  1. Yes, it is. There's no specific violations of the standard there. The memset is a waste of time however since it's al going to be overwritten anyway (make your first strcat into a strcpy). And you should always check for malloc returning NULL. No matter what!
  2. C89/90 is not the current standard, C99 is. And C1x is not that far away. GCC is keeping up with the bleeding edge.
  3. Only use local arrays if you don't need them to survive beyond the end of the function. Otherwise malloc is your best bet, particularly if you want to return the combined string.
  4. I think gcc has the -std=c89 flag or something similar. In any case, MSVC does not always follow the standard :-)
  5. Compile and test it on both platforms frequently. That's the only way to be sure.

I would opt for:

void foo (const char *p1, const char *p2, const char *p3) {
    size_t length = strlen(p1) + strlen(p2) + strlen(p3);
    char *combined = (char *) malloc(length + 1);
    if (combined == NULL) {
        printf("Result : <unknown since I could't get any memory>\n");
    } else {
        strcpy(combined, p1);
        strcat(combined, p2);
        strcat(combined, p3);
        printf("Result : %s", combined);
        free(combined);
    }
}

or, since you're not actually doing anything with the string except printing it:

void foo (const char *p1, const char *p2, const char *p3) {
    printf("Result : %s%s%s", p1, p2, p3);
}

:-)

Another strategy I've seen is the "only allocate if you have to" strategy:

void foo (const char *p1, const char *p2, const char *p3) {
    char str1k[1024];
    char *combined;
    size_t length = strlen (p1) + strlen (p2) + strlen (p3) + 1;
    if (length <= sizeof(str1k))
        combined = str1k;
    else
        combined = malloc (length);
    if (combined == NULL) {
        printf ("Result : <unknown since I couldn't get any memory>\n");
    } else {
        strcpy (combined, p1);
        strcat (combined, p2);
        strcat (combined, p3);
        printf ("Result : %s", combined);
    }
    if (combined != str1k)
        free (combined);
}

which uses stack storage if the combined string will fit and only allocates memory if it won't. This can often lead to a substantial speed improvement if the bulk of strings combine into less than the limit.


Variable-length arrays were not part of the first ISO C standard (variously referred to as "C89", "C90" or "ANSI C"). However, they are a part of the latest ISO C standard (known as "C99").

GCC can compile your code in several modes, including "strict C90", "C90-with-GNU-C-extensions", and "C99" (although it does not fully implement C99, it is close enough for most practical purposes).

By default, GCC uses "C90-with-GNU-C-extensions", which is why your code compiles without complaint. Using -pedantic tells it to emit all required warnings by the relevant standard (in this case, C90), and such a warning is required by your code. If you give GCC the -std=c99 -pedantic flags, to tell it to compile against the C99 base standard and emit all required warnings, your code compiles fine.

If you want to ensure your code is compatible with the base C90 standard, then use -std=c90 -pedantic (or -ansi -pedantic: -ansi is a synonym for -std=c90 when compiling C code). Note that MSVC does not support C99.


A very common idiom to work around these issues is to let the caller manage the memory. So instead of allocating memory yourself (either using a variable-length array on the stack or by malloc'ing something, or whatever) you expect the caller to provide memory. Consider this:

int foo(const char *p1, const char *p2, const char *p3, char *buf, size_t bufsize)
{
    size_t requiredSize = strlen(p1) + strlen(p2) + strlen(p3) + 1;
    if (!buf)
        return requiredSize;
    if (requiredSize > bufsize)
        return -1;
    buf[0] = '\0';
    strcat(buf, p1);
    strcat(buf, p2);
    strcat(buf, p3);
    return requiredSize;
}

int main()
{
  /* simple case: caller knows that the buffer is large enough. */
  char buf[ 1024 ];
  foo( "Hello", "World", "Bar", buf, sizeof(buf) );
  printf("Result : %s\n", buf);

  /* complicated case: caller wants to allocate buffer of just the right size */
  size_t bufsize = foo( "Hello", "World", "Bar", NULL, 0 );
  char *buf2 = (char *)malloc(bufsize);
  foo( "Hello", "World", "Bar", buf2, bufsize );
  free( buf2 );
}

The advantage of this approach is that foo will never leak. In addition to that, the caller can use a simple stack-based array in case it works for him. If he wants to know the exact size he can call foo and pass NULL as the fourth argument.

0

精彩评论

暂无评论...
验证码 换一张
取 消