I'm about to make a design decision that could have visible performance implications. Generally speaking, how do libraries handle unzipping? Is it cheaper to unzip a file from memory or from hard disk?
I imagine this varies from library to library, but what about zlib (just as an example of a popular library)? When it extracts from hard disk, does it first copy the data to memory anyway (meaning there's no performance difference between the two approaches), or is it able to extract directly from the hard disk?
By default, zlib reads a file "chunk by chunk", with the chunk size determined by a predefined buffer size; this allows it to compress or decompress data larger than available system memory.
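To make the chunk-by-chunk idea concrete, here is a minimal sketch using Python's standard `zlib` module. The function name `inflate_stream` and the 16 KiB `CHUNK_SIZE` are assumptions picked for illustration, not values taken from zlib itself.

```python
import zlib

CHUNK_SIZE = 16 * 1024  # assumed buffer size, chosen only for illustration


def inflate_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Decompress a zlib stream from file object src into file object dst.

    Only one compressed chunk is read at a time, so the compressed input
    never has to fit in memory as a whole. Returns the number of
    decompressed bytes written.
    """
    decompressor = zlib.decompressobj()
    total = 0
    while True:
        chunk = src.read(chunk_size)  # one read per buffer-sized chunk
        if not chunk:
            break
        out = decompressor.decompress(chunk)
        dst.write(out)
        total += len(out)
    # Emit anything the decompressor is still holding back.
    tail = decompressor.flush()
    dst.write(tail)
    total += len(tail)
    return total
```

It could be called with two files opened in binary mode, for example `inflate_stream(open("in.z", "rb"), open("out.bin", "wb"))`, where both file names are hypothetical.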
Since reads from disk are expensive compared to reads from memory, loading a file into memory first should improve performance for files that are larger than the default buffer size but smaller than available memory. The gain grows as the file becomes a larger multiple of the buffer size, and it is greater the less fragmented the file is on disk.
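The load-into-memory approach might look like the sketch below, again using Python's standard `zlib` module. The helper name `inflate_whole_file` is hypothetical, and the sketch assumes both the compressed file and its decompressed contents fit in available memory. Whether it actually beats chunked reads depends on the file, the disk, and OS caching, so it is worth measuring on your own data.

```python
import zlib
from pathlib import Path


def inflate_whole_file(path):
    """Read an entire zlib-compressed file into memory, then decompress it.

    Assumes the compressed data and the decompressed result both fit in
    available memory.
    """
    compressed = Path(path).read_bytes()  # a single large read from disk
    return zlib.decompress(compressed)    # decompression works purely in memory
```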