Getting zeros between data while reading a binary file in C

Developer https://www.devze.com 2023-02-04 06:25 Source: web

I have a binary data file which I am reading into an array of long integers using a C program.

A hexdump of the file shows that after the first 16 bytes, which are all zero, the contents pick up again at offset 0x20000. The hexdump output is shown below.

0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000 0000 0000 0053 0000 0064 0000 006b 0000
0020010 0066 0000 0068 0000 0066 0000 005d 0000
0020020 0087 0000 0059 0000 0062 0000 0066 0000

... and so on. But when I read it into an array 'data' of long integers with the usual fread call:

fread(data,sizeof(*data),filelength/sizeof(*data),fd);

my data array fills with zeros until it reaches offset 0x20000. After that it reads the data correctly. Why is it reading regions where my file is not there? And how can I make it read only my file's data, not the regions in between that are not in the file?

I know it looks like a trivial problem, but I could not figure it out even after a night of Googling. Can anyone suggest where I am going wrong?

Other Info: I am working on a GNU/Linux machine. (slax-atma distro to be specific). My C compiler is gcc.

The hex dump output shows that the first line (16 bytes) of data is all zeroes; the '*' indicates that the following lines are identical until you reach offset 0x0020000. So, the start of your file is all zeroes.

The read call reads the file as if the zeroes were present on disk because the Unix/Linux interface is defined to do that. Whether they are stored on disk or not is immaterial; as far as your program is concerned, they are there.
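The behaviour described above can be reproduced with a small experiment. The sketch below assumes a POSIX system, where seeking past the end of a file and then writing leaves a gap that reads back as zeros. The helper names `write_value_at` and `count_leading_zero_bytes` are hypothetical and are not taken from the original program.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: create `path` holding one 32-bit value at byte
   `offset`. Nothing is written before that offset, so on POSIX systems
   the gap is a "hole" that may not occupy any disk blocks. */
int write_value_at(const char *path, long offset, uint32_t value)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    int ok = fseek(fp, offset, SEEK_SET) == 0 &&
             fwrite(&value, sizeof value, 1, fp) == 1;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: count how many bytes at the start of `path`
   read back as zero. Returns -1 if the file cannot be opened. */
long count_leading_zero_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long count = 0;
    while (fgetc(fp) == 0)   /* stops at the first non-zero byte or EOF */
        count++;

    fclose(fp);
    return count;
}
```

Writing a non-zero value at offset 0x20000 and then counting the leading zero bytes should report 0x20000 of them, even though the program never wrote a single zero.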

If you want to skip the 'all zero' part of the file, then preferably don't write the file with all zeroes at the start. Failing that, you'll have to decide how to read the data in chunks until you start finding non-zero information - or use a fixed offset to jump over the zeroes.
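Reading in chunks until non-zero data appears might look like the following sketch. The function name `first_nonzero_offset` and the 4096-byte chunk size are illustrative assumptions, not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: return the byte offset of the first non-zero
   byte in `path`, or -1 if the file cannot be opened or holds only
   zeros. */
long first_nonzero_offset(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    unsigned char buf[4096];  /* chunk size is an arbitrary choice */
    long base = 0;            /* file offset of buf[0] */
    long result = -1;
    size_t got;

    while (result < 0 && (got = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < got; i++) {
            if (buf[i] != 0) {
                result = base + (long)i;
                break;
            }
        }
        base += (long)got;
    }

    fclose(fp);
    return result;
}
```

The offset found this way may fall in the middle of a multi-byte value, so before reading whole elements it is probably safest to round it down to a multiple of the element size.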

So, the file system abstraction on Unix and Linux means that the zeroes are read, whether they are physically stored on disk or not. To skip them, you have to know how you want to do that - either by knowing how many there are and seeking past them, or by reading and discarding data.
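If the size of the zero region is known in advance, seeking past it is the simpler option. The sketch below is one way to do that. The name `read_from_offset` is hypothetical, and `uint32_t` is an assumption based on the hexdump, which suggests 4-byte values; the question uses `long`, which is 8 bytes on most 64-bit Linux systems.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical helper: read every 32-bit value in `path` from byte
   `offset` to the end of the file. Returns a malloc'd array (the
   caller frees it) and stores the element count in *count.
   Returns NULL on failure. */
uint32_t *read_from_offset(const char *path, long offset, size_t *count)
{
    *count = 0;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    uint32_t *data = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long length = ftell(fp);          /* total file size in bytes */
        if (length > offset && fseek(fp, offset, SEEK_SET) == 0) {
            size_t n = (size_t)(length - offset) / sizeof *data;
            data = malloc(n * sizeof *data);
            if (data != NULL)
                *count = fread(data, sizeof *data, n, fp);
        }
    }

    fclose(fp);
    return data;
}
```

For the file in the question, a call such as `read_from_offset("data.bin", 0x20000, &n)` (with a made-up file name) would start at the line of the dump labelled 0020000.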

You ask "Why is it reading regions where my file is not there?"

But that premise is wrong. Zeros are valid data in a file, so fread reads them. It is behaving correctly.

If you want to skip zeros, you have to fread one number at a time and discard it if it is zero. Alternatively, you can read the whole dataset into memory and then shrink it, which needs more memory but is faster than reading numbers one by one from disk.
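The read-then-shrink approach can be sketched as an in-place compaction pass over the array. The function name `drop_zeros` is hypothetical, and `uint32_t` is again an assumption about the element type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: remove every zero element from data[0..n)
   in place, keeping the remaining values in their original order.
   Returns the new element count. */
size_t drop_zeros(uint32_t *data, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] != 0)
            data[kept++] = data[i];
    }
    return kept;
}
```

After the call, the first `kept` elements hold the non-zero values, and the buffer can be shrunk with `realloc` if memory matters. Note that this also discards any genuine zero measurements inside the data, so it only suits datasets where zero never occurs as a real value.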