I was working on a piece of code to do some compression, and I wrote a bitstream class.
My bitstream class kept track of the current bit we are reading and the current byte (unsigned char).
I noticed that reading the next unsigned character from the file was done differently if I used the >> operator vs get() method in the istream class.
I was just curious why I was getting different results?
ex:
this->m_inputFileStream.open(inputFile, std::ifstream::binary);
unsigned char currentByte;
this->m_inputFileStream >> currentByte;
vs.
this->m_inputFileStream.open(inputFile, std::ifstream::binary);
unsigned char currentByte;
this->m_inputFileStream.get((char&)currentByte);
Additional Info:
To be specific, the byte I was reading was 0x0A; however, when using >> it would be read as 0x6F.
I'm not sure how they're even related (they're not the two's complement of each other).
However, the >> operator is also defined to work for unsigned char (see the C++ istream class reference).
operator>> is for formatted input. It'll read "23" as an integer if you stream it into an int, and it'll eat whitespace between tokens. get(), on the other hand, is for unformatted, byte-wise input.
If you aren't parsing text, don't use operator>> or operator<<. You'll get weird bugs that are hard to track down. These bugs also tend to slip past unit tests unless you know what to look for. Reading a uint8, for instance, will fail on the value 9 (the tab character, which is skipped as whitespace).
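One plausible explanation for the 0x0A vs. 0x6F result in the question: 0x0A is the newline character, so >> skips it as whitespace and returns the next non-whitespace byte instead (0x6F is 'o'). The sketch below reproduces that with an in-memory stream. The two-byte input and the helper names read_formatted and read_unformatted are assumptions made for illustration; the contents of the original file are not known.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read one byte with formatted input (operator>>).
// Leading whitespace, including '\n' (0x0A), is skipped before extraction.
unsigned char read_formatted(const std::string& bytes) {
    std::istringstream in(bytes, std::ios::binary);
    unsigned char b = 0;
    in >> b;
    return b;
}

// Hypothetical helper: read one byte with unformatted input (get()).
// No whitespace skipping happens here; the first byte is returned as-is.
unsigned char read_unformatted(const std::string& bytes) {
    std::istringstream in(bytes, std::ios::binary);
    char c = 0;
    in.get(c);
    return static_cast<unsigned char>(c);
}
```

For the input "\no", read_formatted returns 0x6F while read_unformatted returns 0x0A. Note that the binary open mode does not change how >> treats whitespace.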
edit:
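#include <iostream>
#include <sstream>
#include <cstdint>

// Write one char with operator<<, read it back with operator>>, and compare.
void test(char r) {
    std::cout << "testing " << r << std::endl;
    char t = '!';  // sentinel: stays '!' if the extraction fails
    std::ostringstream os(std::ios::binary);
    os << r;
    if (!os.good()) std::cout << "os not good" << std::endl;
    std::istringstream is(os.str(), std::ios::binary);
    is >> t;  // formatted input: skips leading whitespace first
    if (!is.good()) std::cout << "is not good" << std::endl;
    std::cout << std::hex << (uint16_t)r
              << " vs " << std::hex << (uint16_t)t << std::endl;
}

int main(int argc, char ** argv) {
    test('z');   // ordinary character: round-trips
    test('\n');  // whitespace: skipped by >>, so the read fails
    return 0;
}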
produces:
testing z
7a vs 7a
testing
is not good
a vs 21
I suppose that would never have been evident a priori.
C++'s formatted input (operator >>) treats char and unsigned char as a character, rather than an integer. This is a little annoying, but understandable.
You have to use get, which returns the next byte, instead.
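As a rough illustration of the character-versus-integer distinction, the sketch below extracts the same text "23" into an int and into an unsigned char, and also shows get() returning the next byte as an int. The function names are hypothetical and exist only for this example.

```cpp
#include <sstream>
#include <string>

// Formatted extraction into int parses the digits: "23" becomes 23.
int read_as_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return value;
}

// Formatted extraction into unsigned char takes a single character:
// "23" yields '2' (0x32), not the number 23.
unsigned char read_as_uchar(const std::string& text) {
    std::istringstream in(text);
    unsigned char value = 0;
    in >> value;
    return value;
}

// get() with no arguments returns the next byte as an int,
// or a negative end-of-file value when nothing is left.
int first_byte_via_get(const std::string& bytes) {
    std::istringstream in(bytes, std::ios::binary);
    return in.get();
}
```

Here read_as_int("23") gives 23, read_as_uchar("23") gives '2', and first_byte_via_get("\n") gives 0x0A because get() does not skip whitespace.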
However, if you open a file with the binary flag, you should not be using formatted I/O. You should be using read, write and related functions. Formatted I/O won't behave correctly, as it's intended to operate on text formats, not binary formats.
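A minimal sketch of the read-based approach might look like the following. It assumes the bytes are wanted in a std::vector<unsigned char>, and the names read_bytes and read_bytes_from are made up for this example; an in-memory stream stands in for the file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read up to `count` raw bytes with unformatted input.
// Whitespace bytes such as 0x0A or 0x09 are returned like any other byte.
std::vector<unsigned char> read_bytes(std::istream& in, std::size_t count) {
    std::vector<unsigned char> buffer(count);
    in.read(reinterpret_cast<char*>(buffer.data()),
            static_cast<std::streamsize>(count));
    // gcount() reports how many bytes were actually read,
    // which can be fewer than requested at end of input.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}

// Convenience wrapper over an in-memory stream, for demonstration only.
std::vector<unsigned char> read_bytes_from(const std::string& data,
                                           std::size_t count) {
    std::istringstream in(data, std::ios::binary);
    return read_bytes(in, count);
}
```

The same read_bytes function should work with a std::ifstream opened with std::ios::binary, since it only relies on the std::istream interface.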