How to write a segmentation fault handler, so that the faulty instruction is not restarted? (C and Linux)

I wrote a segmentation fault handler, but the problem is that the instruction at which the fault happens is restarted after the handler returns, and this sends the program into an infinite loop of faults.

I want the handler to work such that, once it has run, execution continues with the instruction following the faulting one, so that it does not loop forever. Can anyone please help me with a code snippet?

I am using C and Linux.

Warning: I do not recommend doing this. Listen to the comments telling you to find some other way to solve your problem.

I'd also like to reiterate Henning Makholm's warning that it will be extremely architecture-specific and nonportable. It will be maintenance hell, and you will have to manually handle lots of different instructions unless it is one specific instruction sequence you're looking for (as in the example below).

With that said if you still want to do it, it can be done in the following manner:

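// _GNU_SOURCE must be defined before any include so that glibc
// exposes REG_RIP / REG_EIP (defining __USE_GNU by hand is not supported)
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>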

void action(int sig, siginfo_t* siginfo, void* context)
{
    (void)sig; (void)siginfo; // silence unused-parameter warnings

    // get execution context
    mcontext_t* mcontext = &((ucontext_t*)context)->uc_mcontext;

    // find out what instruction faulted
#if defined(__x86_64)
    uint8_t* code = (uint8_t*)mcontext->gregs[REG_RIP];
    if (code[0] == 0x88 && code[1] == 0x10) { // mov %dl,(%rax)
        mcontext->gregs[REG_RIP] += 2; // skip it!
        return;
    }
#elif defined(__i386)
    uint8_t* code = (uint8_t*)mcontext->gregs[REG_EIP];
    if (code[0] == 0x88 && code[1] == 0x10) { // mov %dl,(%eax)
        mcontext->gregs[REG_EIP] += 2; // skip it!
        return;
    }
#else
#error "Unsupported system"
#endif
    // unknown/unhandled instruction failed...

    // only for debugging, shouldn't print stuff in a signal handler
    int i = 0; 
    for (i = 0; i < 16; i++) {
        fprintf(stderr, "%2.2X ", code[i]);
    }
    fprintf(stderr, "\n");
    exit(1);
}

int main(void)
{
    // install SIGSEGV handler
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = action;
    act.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &act, NULL) < 0) {
        perror("sigaction");
        return 1;
    }

    // cause fault
    int i;
    for (i = 0; i < 10; i++) {
        ((unsigned char*)0)[i] = i;
    }
    return EXIT_SUCCESS;
}

Here I have only handled one specific instruction sequence for x86 32- and 64-bit, though it should be trivial (if tedious) to support more architectures and instructions.
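To illustrate what supporting more instructions could look like, here is a minimal sketch of a whitelist lookup that the handler might consult instead of the hard-coded comparison. The names known_insn and known_insn_length are hypothetical, and the table only lists a few x86 encodings whose lengths are fixed; anything not listed is reported as unknown so that it is never skipped blindly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical whitelist entry: the fixed leading bytes of one x86
// encoding plus the total length of that instruction in bytes.
struct known_insn {
    uint8_t prefix[2];
    size_t  prefix_len;
    size_t  total_len;
};

static const struct known_insn known_insns[] = {
    { {0x88, 0x10}, 2, 2 }, // mov %dl,(%rax)   (the one handled in the answer)
    { {0x89, 0x10}, 2, 2 }, // mov %edx,(%rax)
    { {0xC6, 0x00}, 2, 3 }, // movb $imm8,(%rax): opcode, ModRM, 1-byte immediate
};

// Returns the length of the instruction at `code` if it is whitelisted,
// or 0 if it is unknown and therefore must not be skipped.
size_t known_insn_length(const uint8_t *code)
{
    for (size_t i = 0; i < sizeof known_insns / sizeof known_insns[0]; i++) {
        if (memcmp(code, known_insns[i].prefix, known_insns[i].prefix_len) == 0)
            return known_insns[i].total_len;
    }
    return 0;
}
```

In the handler, a non-zero result would be added to gregs[REG_RIP] (or gregs[REG_EIP]), and a zero result would fall through to the "unknown instruction" path.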

Update: You (now) mention that you are on an ARM machine. That should actually make it easier, as the instructions are always 32 bits wide (except in Thumb mode), if I'm not mistaken. I don't have an ARM machine to test this on, so you will have to dig into sys/ucontext.h to check whether I got the names right. Of course, you should also check the faulting instruction in a similar fashion. My best guess for ARM is the following (placed alongside the other #if defined(...) statements):

    #elif defined(__arm) // or use what your GCC defines, also check for 32-bit arm mode or whatever...
    uint8_t* code = (uint8_t*)mcontext->arm_pc;
    if (*(uint32_t*)code == /*some instruction*/) {
        mcontext->arm_pc += 4; // skip it!
        return;
    }

Simply skipping a failing instruction sounds like a recipe for extremely hard-to-track-down errors. However, if you really want to, you can rewrite the IP in the ucontext struct that the handler receives as its third parameter when it is installed with SA_SIGINFO (but don't try any of this without reading the manpages closely anyway). You'll need to disassemble the faulting instruction yourself in order to find out how long it is -- which is good, because you shouldn't be skipping instructions that you don't understand.

Whatever you do, the result will be extremely architecture-specific and nonportable.

I have figured out an alternative and perhaps easier way to write a segmentation fault handler. Although user78863's answer and effort are praiseworthy, given the complexity of the code and the difficulty of porting it, I think my solution is better, so I will accept my own answer.

Here is the link to the code: Can we reset sigsetjmp to return "0" again (Reset sigsetjmp)?