char *s = "hello ppl.";
for (i = 0; i < strlen(s); i++) {
开发者_开发技巧 char c = s[i];
if (c >= 97 && c <= 122) {
c += 2;
s[i] = c;
}
}
I want to rotate the string by two characters: "hello ppl." -> "jgnnq rrn."
I am getting a segmentation fault
. What is wrong with the code?
The code:
char *s = "hello ppl.";
gives you a pointer to what it probably read-only memory. That's because string constants in C are non-modifiable. When you try to write to that memory, you'll most likely get a segmentation violation. The relevant part of the standard (C99 6.4.5/6
on String literals
) states:
It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
So, while the memory doesn't have to be read-only, you're still breaking the rules by trying to modify it.
Try this instead:
char s[] = "hello ppl.";
which is conceptually the same as:
char s[11]; // for the whole string plus null terminator
strcpy (s, "hello ppl.");
In other words, it puts the string you want to change into writable memory. The following code:
#include <stdio.h>
#include <string.h>
int main(void) {
int i;
char s[] = "hello ppl.";
for (i = 0; i < strlen(s); i++) {
char c = s[i];
if (c >= 97 && c <= 122) {
c += 2;
s[i] = c;
}
}
printf("%s\n",s);
return 0;
}
gives you "jgnnq rrn."
as you desired.
A few other things I'd like to point out which are not fatal:
- It's not usually a good idea to use 'magic' numbers like
97
and122
. It's just as easy, and clearer in intent, to use 'a' and 'z'. - If you really want to rotate, you can't blindly add 2 to 'y' and 'z'. You have to treat them specially (subtract 24) so that they map correctly to 'a' and 'b'.
- The C standard doesn't guarantee that alpha characters are contiguous. If you know you're using ASCII, you're probably okay but I thought I'd just mention that. As an aside, it does guarantee that for the numeric characters.
Having said that, I'd rather use a mapping table as follows:
#include <stdio.h>
#include <string.h>
int main (void) {
char *lkupPtr, *strPtr;
char str[] = "hello ppl.";
const char * const from = "abcdefghijklmnopqrstuvwzyz";
const char * const to = "cdefghijklmnopqrstuvwzyzab";
for (strPtr = str; *strPtr != '\0'; strPtr++)
if (lkupPtr = strchr (from, *strPtr)) != NULL)
*strPtr = to[(int)(lkupPtr - from)];
printf("%s\n",str);
return 0;
}
This takes care of all the points I raised above and you can, if necessary, add more mappings if you're in an internationalized environment (rather than just plain ASCII or EDCDIC).
This should be, in my opinion, fast enough for all but the most demanding of requirements (I clocked it at over 3 million characters per second on my PC). If you have a near-insatiable need for performance over and above that, yet don't want to opt for hand-crafted assembly targeted to your specific CPU, you could try something like the following.
It's still fully compliant with the C standard but may deliver better performance by virtue of the fact all heavy calculation work is done once at the start. It creates a table holding all possible character values, initializes it so that every character translates to itself by default, then changes the specific characters you're interested in.
That removes any checking for characters from the translation itself.
#include <stdio.h>
#include <string.h>
#include <limits.h>
static char table[CHAR_MAX + 1];
static void xlatInit (void) {
int i;
char * from = "abcdefghijklmnopqrstuvwzyz";
char * to = "cdefghijklmnopqrstuvwzyzab";
for (i = 0; i <= CHAR_MAX; i++) table[i] = i;
while (*from != '\0') table[*from++] = *to++;
}
int main (void) {
char *strPtr;
char str[] = "hello ppl.";
xlatInit(); // Do this once only, amortize the cost.
for (strPtr = str; *strPtr != '\0'; strPtr++)
*strPtr = table[*strPtr];
printf("%s\n",str);
return 0;
}
The variable s points to read-only memory. This means that it cannot be modified. You'll want to use:
char varname[] = "...";
Gotchas to watch out for:
char varname[] = "...";
Places the data on the stack. Make sure you are not returning a pointer to a functions local data. If that's the case you'll want to look at malloc to allocate memory in the heap.
Another gotcha:
for (i = 0; i < strlen(s); i++) {...} is O(N^2)
The reason is that strlen(s) is an O(N) operation you do each time through the loop. An improvement would be:
int len = strlen(s);
for(i=0;i<len;i++) { ... }
This way we only do the strlen(s) computation once and reuse the result.
In char *s = "hello ppl."
you are not allocating any memory, instead you are pointing to a string which may be residing in read-only memory of your program. Ideally it should be const char*
. Now if you try to modify it, it will crash.
The code:
char *s = "hello ppl.";
creates an entry in the string table, typically in the code segment (read only space of the program). Any attempts to change it will cause the segfault by trying to modify read only memory. The appropriate way to create/initialize a string to be modified is:
char s[] = "hello ppl.";
As others have mentioned,
char *s = "hello ppl.";
points to read-only memory because it is a string literal. It should be
char s[] = "hello ppl.";
which creates an array in read-write memory and copies the string into it.
Ignoring non-ASCII character sets, the problem can be solved most efficiently like this:
void Convert(char *s)
{
for(char *sp = s; *sp; sp++)
if(*sp >= 'a' && *sp <= 'z')
*sp = (*sp - 'a' + 2) % 26 + 'a';
}
If you're dealing with EBCDIC or any other charset that doesn't have contiguous alphabetic characters, you can use a map:
char *from = "abcdefghijklmnopqrstuvwxyz";
char *to = "cdefghijklmnopqrstuvwxyzab";
char map[CHAR_MAX+1];
void Initialize()
{
for(int i=0; from[i]; i++)
map[from[i]] = to[i];
}
void Convert(char *s)
{
for(char *sp = s; *sp; sp++)
if(map[*sp])
*sp = map[*sp];
}
The compiler will optimize each of these to nearly optimal assembly language.
Update In the original problem there was no separate Initialize() call, so I optimized the code to make "Initialize(); Convert(s);" as fast as possible. If you are able to call Initialize() ahead of time and only care about how fast "Convert(s);" runs, the optimal code will fill the array first, like this:
char *from = "abcdefghijklmnopqrstuvwxyz";
char *to = "cdefghijklmnopqrstuvwxyzab";
char map[CHAR_MAX+1];
void Initialize()
{
int i;
for(i=0; i<=CHAR_MAX; i++) // New code here fills the array
map[i] = i;
for(i=0; from[i]; i++)
map[from[i]] = to[i];
}
void Convert(char *s)
{
for(char *sp = s; *sp; sp++) // 'if' removed
*sp = map[*sp];
}
This modified code is 375% slower if you are calling "Initialize(); Convert(s);", but it is 3% faster if you have already called Initialize() and you are only timing "Convert(s);".
char *s = "hello ppl.";
in this s is pointing to string literal which is constant string
as constant strings are stored on data segment of memory area (read only memory area)
so you can access anything from that area but if you want to modify
os doesn't allow you and trap an segfault
modified code could be ::
char s[] = "hello ppl.";
for (i = 0; i < strlen(s); i++) {
char c = s[i];
if (c >= 97 && c <= 122) {
c += 2;
s[i] = c;
}
}
in this s string stored on stack so you can modify,access there is no restriction here.
精彩评论