Unix command to find non-ASCII chars

Developer https://www.devze.com 2023-01-29 00:28 Source: the web
I have a 500 MB file that contains some non-ASCII characters. I want to find those characters using a Unix command. Ideally I would also get the line number and the position within each line.

Thanks :)


Use the grep command from the other answer, but add -n so that grep prefixes each matching line with its line number.


You know, it's weird. Sometimes I find it faster to code up some quick and dirty C than it is to try and navigate the wilderness of UNIX utility command line options :-)

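#include <stdio.h>

int main (void) {
    size_t ln = 1;      /* current line number, 1-based */
    size_t chpos = 0;   /* byte position within the current line */
    int chr;
    while ((chr = fgetc (stdin)) != EOF) {
        if (chr == '\n') {
            /* Newline: move to the next line and restart the position. */
            ln++;
            chpos = 0;
            continue;
        }
        chpos++;
        if (chr > 127) {
            /* size_t must be printed with %zu, not %d. */
            printf ("Non-ASCII %02x found at line %zu, offset %zu\n",
                (unsigned) chr, ln, chpos);
        }
    }
    return 0;
}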

This gives you both the line number and the position within that line (counted in bytes, starting at 1) of every byte outside the ASCII range.
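
If you want to check the logic on small inputs before running it over the whole 500 MB file, the same scan can be pulled out into a function that works on an in-memory buffer. This is only a sketch: the name report_non_ascii and its parameters are made up for illustration and are not part of the program above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original answer): scans len bytes of buf
   and counts every byte above 127. If out is not NULL, each hit is also
   printed with its 1-based line number and byte offset within that line. */
size_t report_non_ascii (const unsigned char *buf, size_t len, FILE *out) {
    size_t ln = 1;      /* current line number */
    size_t chpos = 0;   /* byte position within the current line */
    size_t found = 0;   /* number of non-ASCII bytes seen so far */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            ln++;
            chpos = 0;
            continue;
        }
        chpos++;
        if (buf[i] > 127) {
            found++;
            if (out != NULL) {
                fprintf (out, "Non-ASCII %02x found at line %zu, offset %zu\n",
                    (unsigned) buf[i], ln, chpos);
            }
        }
    }
    return found;
}
```

Since it works byte by byte, a multi-byte UTF-8 character is reported once for each of its bytes. Passing NULL as out just returns the count without printing anything.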
