What is the smallest valid jpeg file size (in bytes)

Developer (https://www.devze.com), 2022-12-20 13:00. Source: web

I'd like to screen some JPEGs for validity before I send them across the network for more extensive inspection. It is easy enough to check for a valid header and footer, but what is the smallest size (in bytes) a valid JPEG could be?


A 1x1 grey pixel in 125 bytes using arithmetic coding, still in the JPEG standard even if most decoders can't decode it:

ff d8 ; SOI
ff e0 ; APP0
 00 10
 4a 46 49 46 00 01 01 01 00 48 00 48 00 00
ff db ; DQT
 00 43
 00
 03 02 02 02 02 02 03 02
 02 02 03 03 03 03 04 06
 04 04 04 04 04 08 06 06
 05 06 09 08 0a 0a 09 08
 09 09 0a 0c 0f 0c 0a 0b
 0e 0b 09 09 0d 11 0d 0e
 0f 10 10 11 10 0a 0c 12
 13 12 10 13 0f 10 10 10
ff c9 ; SOF
 00 0b
 08 00 01 00 01 01 01 11 00
ff cc ; DAC
 00 06 00 10 10 05
ff da ; SOS
 00 08
 01 01 00 00 3f 00 d2 cf 20
ff d9 ; EOI

I don't think the 134-byte example mentioned in another answer is standard, as it is missing an EOI. Decoders will generally handle this, but the standard says the file should end with one.

That file can be generated with:

#!/usr/bin/env bash
printf '\xff\xd8' # SOI
printf '\xff\xe0' # APP0
printf  '\x00\x10'
printf  '\x4a\x46\x49\x46\x00\x01\x01\x01\x00\x48\x00\x48\x00\x00'
printf '\xff\xdb' # DQT
printf  '\x00\x43'
printf  '\x00'
printf  '\x03\x02\x02\x02\x02\x02\x03\x02'
printf  '\x02\x02\x03\x03\x03\x03\x04\x06'
printf  '\x04\x04\x04\x04\x04\x08\x06\x06'
printf  '\x05\x06\x09\x08\x0a\x0a\x09\x08'
printf  '\x09\x09\x0a\x0c\x0f\x0c\x0a\x0b'
printf  '\x0e\x0b\x09\x09\x0d\x11\x0d\x0e'
printf  '\x0f\x10\x10\x11\x10\x0a\x0c\x12'
printf  '\x13\x12\x10\x13\x0f\x10\x10\x10'
printf '\xff\xc9' # SOF
printf  '\x00\x0b'
printf  '\x08\x00\x01\x00\x01\x01\x01\x11\x00'
printf '\xff\xcc' # DAC
printf  '\x00\x06\x00\x10\x10\x05'
printf '\xff\xda' # SOS
printf  '\x00\x08'
printf  '\x01\x01\x00\x00\x3f\x00\xd2\xcf\x20'
printf '\xff\xd9' # EOI

and opened fine with GNOME Image Viewer 3.38.0 and GIMP 2.10.18 on Ubuntu 20.10.
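
The 125-byte count can be cross-checked by walking the marker segments. The Python sketch below is not part of the original answer: `JPEG_125` and `list_segments` are names made up here, and the walker only assumes the simple layout of this file (length-prefixed segments, a single scan, an optional EOI).

```python
# Bytes of the 125-byte file, copied from the hex dump above.
JPEG_125 = bytes.fromhex("""
ff d8
ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 00 48 00 00
ff db 00 43 00
03 02 02 02 02 02 03 02 02 02 03 03 03 03 04 06
04 04 04 04 04 08 06 06 05 06 09 08 0a 0a 09 08
09 09 0a 0c 0f 0c 0a 0b 0e 0b 09 09 0d 11 0d 0e
0f 10 10 11 10 0a 0c 12 13 12 10 13 0f 10 10 10
ff c9 00 0b 08 00 01 00 01 01 01 11 00
ff cc 00 06 00 10 10 05
ff da 00 08 01 01 00 00 3f 00 d2 cf 20
ff d9
""")


def list_segments(data):
    """Return a list of (marker_byte, size_in_bytes) tuples (hypothetical helper)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI")
    segments = [(0xD8, 2)]
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xD9:  # EOI has no length field
            segments.append((marker, 2))
            break
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        end = pos + 2 + length
        if marker == 0xDA:
            # Entropy-coded data follows the SOS header; skip to the next
            # real marker (FF 00 is a stuffed byte, FF D0..D7 are restarts).
            while end < len(data):
                nxt = data[end + 1] if end + 1 < len(data) else 0
                if data[end] == 0xFF and nxt != 0x00 and not 0xD0 <= nxt <= 0xD7:
                    break
                end += 1
        segments.append((marker, end - pos))
        pos = end
    return segments


for marker, size in list_segments(JPEG_125):
    print(f"FF {marker:02X}: {size} bytes")
print("total:", len(JPEG_125))
```

The per-segment sizes add up to the file length, and the APP0 (JFIF) segment accounts for 18 of those bytes.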

Here's an upload on Imgur. Note that Imgur processes the file and makes it larger, so a copy downloaded from there will not match the bytes above. Displayed at width=100, the image shows as white on Chromium 87.


It occurs to me that you could make a progressive JPEG with only the DC coefficients, so that a single grey pixel could be encoded in 119 bytes. This reads just fine in the few programs I've tried it in (Photoshop, GNOME Image Viewer 3.38.0, GIMP 2.10.18, and others).

ff d8 ; SOI
ff db ; DQT
 00 43
 00
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
 01 01 01 01 01 01 01 01
ff c2 ; SOF
 00 0b
 08 00 01 00 01 01 01 11 00
ff c4 ; DHT
 00 14
 00
 01 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00
 03
ff da ; SOS
 00 08
 01 01 00 00 00 01 3F
ff d9 ; EOI

The main space saving comes from having only one Huffman table. Although this is slightly smaller than the 125-byte arithmetic-coded file given in another answer, that file without its JFIF header would be smaller still (107 bytes), so it should still be considered the smallest known.

The above file can be generated with:

#!/usr/bin/env bash
printf '\xff\xd8' # SOI
printf '\xff\xdb' # DQT
printf  '\x00\x43'
printf  '\x00'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf  '\x01\x01\x01\x01\x01\x01\x01\x01'
printf '\xff\xc2' # SOF
printf  '\x00\x0b'
printf  '\x08\x00\x01\x00\x01\x01\x01\x11\x00'
printf '\xff\xc4' # DHT
printf  '\x00\x14'
printf  '\x00'
printf  '\x01\x00\x00\x00\x00\x00\x00\x00'
printf  '\x00\x00\x00\x00\x00\x00\x00\x00'
printf  '\x03'
printf '\xff\xda' # SOS
printf  '\x00\x08'
printf  '\x01\x01\x00\x00\x00\x01\x3F'
printf '\xff\xd9' # EOI
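
To see where the 119 bytes go, the same file can be assembled segment by segment. This is only a sketch in Python: the `segment` helper and the variable names are invented for illustration, and the payload bytes are copied from the dump above. The last lines redo the arithmetic behind the 107-byte figure, assuming the JFIF APP0 segment of the 125-byte file is dropped whole.

```python
def segment(marker, payload):
    """Build FF <marker> <2-byte length> <payload>; the length counts itself."""
    return bytes([0xFF, marker]) + (len(payload) + 2).to_bytes(2, "big") + payload


SOI = b"\xff\xd8"
EOI = b"\xff\xd9"

dqt = segment(0xDB, b"\x00" + b"\x01" * 64)          # table 0, 64 entries of 1
sof2 = segment(0xC2, bytes.fromhex("08 0001 0001 01 01 11 00"))  # 8-bit, 1x1, one component
# One Huffman table: a single code of length 1, mapped to symbol 0x03.
dht = segment(0xC4, b"\x00" + b"\x01" + b"\x00" * 15 + b"\x03")
sos = segment(0xDA, bytes.fromhex("01 01 00 00 00 01"))
scan_data = b"\x3f"                                   # the only entropy-coded byte

jpeg_119 = SOI + dqt + sof2 + dht + sos + scan_data + EOI

for name, part in [("SOI", SOI), ("DQT", dqt), ("SOF2", sof2), ("DHT", dht),
                   ("SOS + data", sos + scan_data), ("EOI", EOI)]:
    print(f"{name}: {len(part)} bytes")
print("total:", len(jpeg_119))

# Size of the 125-byte arithmetic-coded file without its JFIF APP0 segment:
app0_size = 2 + 0x10   # marker + declared segment length
print("125 - APP0 =", 125 - app0_size)
```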


Try the following (134 bytes):

FF D8 FF E0 00 10 4A 46 49 46 00 01 01 01 00 48 00 48 00 00
FF DB 00 43 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF C2 00 0B 08 00 01 00 01 01 01
11 00 FF C4 00 14 10 01 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 FF DA 00 08 01 01 00 01 3F 10

Source: Worlds Smallest, Valid JPEG? by Jesse_hz
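
As another answer points out, this dump stops right after the SOS header. A quick Python check (the name `JPEG_134` is made up here; the bytes are copied from the dump above) confirms the length and the missing EOI marker:

```python
JPEG_134 = bytes.fromhex("""
FF D8 FF E0 00 10 4A 46 49 46 00 01 01 01 00 48 00 48 00 00
FF DB 00 43 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF C2 00 0B 08 00 01 00 01 01 01
11 00 FF C4 00 14 10 01 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 FF DA 00 08 01 01 00 01 3F 10
""")

SOI = b"\xff\xd8"
EOI = b"\xff\xd9"

print("length:", len(JPEG_134))
print("starts with SOI:", JPEG_134.startswith(SOI))
print("ends with EOI:", JPEG_134.endswith(EOI))

# Appending the two-byte EOI marker would bring the file to 136 bytes.
with_eoi = JPEG_134 + EOI
print("length with EOI:", len(with_eoi))
```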


Found "the tiniest GIF ever" with only 26 bytes.

47 49 46 38 39 61 01 00 01 00 
00 ff 00 2c 00 00 00 00 01 00 
01 00 00 02 00 3b

Python literal:

b'GIF89a\x01\x00\x01\x00\x00\xff\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x00;'
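
A short sketch to confirm that the literal matches the hex dump and to read the basic header fields. The variable names are illustrative only; the offsets used are the GIF signature (6 bytes) followed by the little-endian width and height.

```python
import struct

TINY_GIF = b'GIF89a\x01\x00\x01\x00\x00\xff\x00,\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x00;'

# The same bytes, taken from the hex dump above.
FROM_HEX = bytes.fromhex(
    "47 49 46 38 39 61 01 00 01 00 "
    "00 ff 00 2c 00 00 00 00 01 00 "
    "01 00 00 02 00 3b"
)

signature = TINY_GIF[:6]
width, height = struct.unpack("<HH", TINY_GIF[6:10])  # little-endian 16-bit fields

print("same bytes:", TINY_GIF == FROM_HEX)
print("length:", len(TINY_GIF))
print("signature:", signature)
print("size:", width, "x", height)
```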


While I realize this is far from the smallest valid JPEG and has little or nothing to do with your actual question, I felt I should share it: I had been looking for a very small JPEG that actually looked like something, to do some testing with, when I found your question. I'm sharing it here because it's valid, it's small, and it makes me ROFL.

Here is a 384-byte JPEG image that I made in Photoshop. It is the letters ROFL, hand drawn by me and then saved with maximum compression settings while still being sort of readable.

Hex sequences:

use URI::Escape;  # provides uri_escape

my @image_hex = qw{
 FF D8 FF E0 00 10 4A 46 49 46 00 01 02 00 00 64
 00 64 00 00 FF EC 00 11 44 75 63 6B 79 00 01 00
 04 00 00 00 00 00 00 FF EE 00 0E 41 64 6F 62 65
 00 64 C0 00 00 00 01 FF DB 00 84 00 1B 1A 1A 29
 1D 29 41 26 26 41 42 2F 2F 2F 42 47 3F 3E 3E 3F
 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
 47 47 47 47 47 47 47 47 47 47 47 47 01 1D 29 29
 34 26 34 3F 28 28 3F 47 3F 35 3F 47 47 47 47 47
 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
 47 47 47 47 47 47 47 47 47 47 47 47 47 FF C0 00
 11 08 00 08 00 19 03 01 22 00 02 11 01 03 11 01
 FF C4 00 61 00 01 01 01 01 00 00 00 00 00 00 00
 00 00 00 00 00 00 04 02 05 01 01 01 01 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 02 04 10 00 02
 02 02 02 03 01 00 00 00 00 00 00 00 00 00 01 02
 11 03 00 41 21 12 F0 13 04 31 11 00 01 04 03 00
 00 00 00 00 00 00 00 00 00 00 00 00 21 31 61 71
 B1 12 22 FF DA 00 0C 03 01 00 02 11 03 11 00 3F
 00 A1 7E 6B AD 4E B6 4B 30 EA E0 19 82 39 91 3A
 6E 63 5F 99 8A 68 B6 E3 EA 70 08 A8 00 55 98 EE
 48 22 37 1C 63 19 AF A5 68 B8 05 24 9A 7E 99 F5
 B3 22 20 55 EA 27 CD 8C EB 4E 31 91 9D 41 FF D9
}; # A very tiny JPEG: an image of the letters "ROFL", hand drawn in Photoshop and saved at the lowest quality setting at which the letters could still be made out :)

my $image_data = pack('H2' x scalar(@image_hex), @image_hex);
my $url_escaped_image = uri_escape( $image_data );

URL-escaped binary image data (can be pasted right into a URL):

%FF%D8%FF%E0%00%10JFIF%00%01%02%00%00d%00d%00%00%FF%EC%00%11Ducky%00%01%00%04%00%00%00%00%00%00%FF%EE%00%0EAdobe%00d%C0%00%00%00%01%FF%DB%00%84%00%1B%1A%1A)%1D)A%26%26AB%2F%2F%2FBG%3F%3E%3E%3FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%01%1D))4%264%3F((%3FG%3F5%3FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%FF%C0%00%11%08%00%08%00%19%03%01%22%00%02%11%01%03%11%01%FF%C4%00a%00%01%01%01%01%00%00%00%00%00%00%00%00%00%00%00%00%00%04%02%05%01%01%01%01%00%00%00%00%00%00%00%00%00%00%00%00%00%00%02%04%10%00%02%02%02%02%03%01%00%00%00%00%00%00%00%00%00%01%02%11%03%00A!%12%F0%13%041%11%00%01%04%03%00%00%00%00%00%00%00%00%00%00%00%00%00!1aq%B1%12%22%FF%DA%00%0C%03%01%00%02%11%03%11%00%3F%00%A1~k%ADN%B6K0%EA%E0%19%829%91%3Anc_%99%8Ah%B6%E3%EAp%08%A8%00U%98%EEH%227%1Cc%19%AF%A5h%B8%05%24%9A~%99%F5%B3%22%20U%EA'%CD%8C%EBN1%91%9DA%FF%D9
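
For comparison, the same kind of escaping can be sketched in Python with `urllib.parse`. The `escape_bytes` helper is hypothetical, and the `safe` set is an assumption chosen so that the output matches the escaped string above; other escapers may leave a different set of characters untouched.

```python
from urllib.parse import quote, unquote_to_bytes


def escape_bytes(data):
    """Percent-encode raw bytes, leaving letters, digits and -_.~!*'() as-is."""
    return quote(data, safe="!*'()")


# First 16 bytes of the 384-byte image above.
head = bytes.fromhex("FF D8 FF E0 00 10 4A 46 49 46 00 01 02 00 00 64")

escaped = escape_bytes(head)
print(escaped)

# Decoding the escaped text gives the original bytes back.
print(unquote_to_bytes(escaped) == head)
```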


Here's the C++ routine I wrote to do this:

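#include <cstddef>  // size_t
#include <cstring>  // memcmp

// Quick screen: SOI marker, then a "JFIF" or "Exif" identifier at offset 6.
bool is_jpeg(const unsigned char* img_data, size_t size)
{
    return img_data &&
           (size >= 10) &&
           (img_data[0] == 0xFF) &&
           (img_data[1] == 0xD8) &&
           ((memcmp(img_data + 6, "JFIF", 4) == 0) ||
            (memcmp(img_data + 6, "Exif", 4) == 0));
}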

img_data points to a buffer containing the JPEG data.

I'm sure you need more bytes to have a JPEG that will decode to a useful image, but it's a fair bet that if the first 10 bytes pass this test, the buffer probably contains a JPEG.

EDIT: You can, of course, replace the 10 above with a higher value once you decide on one. 134, as suggested in another answer, for example.


It is not a requirement that JPEGs contain either a JFIF or Exif marker. But they must start with FF D8, and they must have a marker following that, so you can check for FF D8 FF.
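
A minimal pre-screen along those lines, sketched in Python. The function name and parameters are hypothetical. The default `min_size` of 107 is simply the smallest size claimed in this thread, not a figure from the standard, and `require_eoi` is optional because files without a trailing EOI (such as the 134-byte example) are reported to decode anyway.

```python
def looks_like_jpeg(data, min_size=107, require_eoi=False):
    """Cheap screen before a full decode; it does not prove the file is valid."""
    if len(data) < max(min_size, 3):
        return False
    # SOI (FF D8) must be followed by another marker, so the third byte is FF.
    if data[:3] != b"\xff\xd8\xff":
        return False
    # Optionally insist on the EOI marker at the very end.
    if require_eoi and not data.endswith(b"\xff\xd9"):
        return False
    return True


# Usage with made-up buffers:
candidate = b"\xff\xd8\xff" + bytes(102) + b"\xff\xd9"   # 107 bytes
print(looks_like_jpeg(candidate, require_eoi=True))      # passes the screen
print(looks_like_jpeg(b"GIF89a" + bytes(200)))           # wrong signature
```

A buffer that passes still has to go through a real decoder; the screen only rejects data that cannot be a JPEG.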
