How to store binary data in a Lua string

Developer https://www.devze.com 2023-01-18 10:50 Source: Internet

I needed to create a custom file format with embedded meta information. Instead of whipping up my own format I decided to just use Lua.

texture
{
   format=GL_LUMINANCE_ALPHA;
   type=GL_UNSIGNED_BYTE;
   width=256;
   height=128;
   pixels=[[
<binary-data-here>]];
}

texture is a function that takes a table as its sole argument. It then looks up the various parameters by name in the table and forwards the call on to a C++ routine. Nothing out of the ordinary I hope.

Occasionally the files fail to parse with the following error:

my_file.lua:8: unexpected symbol near ']'

What's going on here?

Is there a better way to store binary data in Lua?


Update

It turns out that storing binary data in a Lua string is non-trivial, but it is possible if you take care with three kinds of sequences.

  • Long-format-string-literals cannot have an embedded closing-long-bracket (]], ]=], etc).

    This one is pretty obvious.

  • Long-format-string-literals cannot end with a partial closing bracket such as ]== that would combine with the chosen closing-long-bracket and close the string early.

    This one is more subtle. Luckily the script will fail to compile if done wrong.

  • The data cannot embed \n or \r.

    Lua's built-in end-of-line processing alters these. This problem is much more subtle: the script compiles fine but yields the wrong data. 0x0D => 0x0A, 0x0A 0x0D => 0x0A, etc.

To get around these limitations I split the binary data on \r and \n, pick a long-bracket level that works for each piece, and finally emit Lua that concatenates the pieces back together. I used a script that does this for me.

input: XXXX\nXX]]XX\r\nXX]]XX]=

texture
{
  --other fields omitted      
  pixels= '' ..
     [[XXXX]] ..
     '\n' ..
     [=[XX]]XX]=] ..
     '\r\n' ..
     [==[XX]]XX]=]==];
}
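
The script itself is not shown, so the following is only a sketch of how such an encoder might look in Lua. The names bracket_level and encode_binary are hypothetical. It splits the data on runs of \r and \n, wraps each remaining piece in the lowest long-bracket level that is safe for it, and emits the line ends as quoted escape sequences. Avoiding a matching opening bracket inside a piece is an extra precaution of mine, on the assumption that some Lua 5.1 builds reject nested opening brackets.

```lua
-- Smallest long-bracket level that is safe for this piece of data.
local function bracket_level(piece)
  local level = 0
  while true do
    local eqs = string.rep("=", level)
    local closing = "]" .. eqs .. "]"   -- would end the literal early
    local opening = "[" .. eqs .. "["   -- avoided as a precaution (assumption)
    local partial = "]" .. eqs          -- at the end, merges with the real closer
    local unsafe = piece:find(closing, 1, true)
      or piece:find(opening, 1, true)
      or piece:sub(-#partial) == partial
    if not unsafe then return level end
    level = level + 1
  end
end

local escapes = { ["\r"] = "\\r", ["\n"] = "\\n" }

-- Turn a binary string into a Lua expression that evaluates to that string.
local function encode_binary(data)
  local parts = {}
  local pos = 1
  while pos <= #data do
    local first, last = data:find("[\r\n]+", pos)
    local piece_end = first and first - 1 or #data
    if piece_end >= pos then
      local piece = data:sub(pos, piece_end)
      local eqs = string.rep("=", bracket_level(piece))
      parts[#parts + 1] = "[" .. eqs .. "[" .. piece .. "]" .. eqs .. "]"
    end
    if not first then break end
    -- Line ends go into a quoted string as escape sequences.
    local line_ends = data:sub(first, last):gsub("[\r\n]", escapes)
    parts[#parts + 1] = "'" .. line_ends .. "'"
    pos = last + 1
  end
  if #parts == 0 then return "''" end
  return table.concat(parts, " ..\n")
end
```

Calling encode_binary on the input shown above should give the same pieces as the hand-written example, minus the leading ''. This sketch does not treat byte 26 specially, so it does not address the Windows text-mode problem with that character.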


Lua is able to encode most characters in long bracket format including nulls. However, Lua opens the script file in text mode and this causes some problems. On my Windows system the following characters have problems:

Char code(s)      Problem
--------------    -------------------------------
13 (CR)           Is translated to 10 (LF)
13 10 (CR LF)     Is translated to 10 (LF)
26 (EOF)          Causes "unfinished long string near '<eof>'"

If you are not using Windows then these may not cause problems, but there may be different text-mode based problems.
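
One possible way to sidestep the file layer's text-mode translation is to read the script yourself in binary mode and pass its contents to Lua as a string. This is a minimal sketch under that assumption, not something verified on the original poster's setup; load_script_binary is a hypothetical helper name. It only avoids translation done while reading the file (such as the byte 26 problem). As far as I know the Lua lexer still normalizes CR and CR LF inside long strings, so those bytes need separate handling.

```lua
-- Load a Lua script without going through text-mode file I/O.
-- Returns the compiled chunk, or nil plus an error message.
local function load_script_binary(path)
  local file, err = io.open(path, "rb")   -- "rb": read the bytes untranslated
  if not file then return nil, err end
  local source = file:read("*a")
  file:close()
  -- loadstring exists in Lua 5.1; load accepts strings from Lua 5.2 on.
  local loader = loadstring or load
  return loader(source, "@" .. path)
end
```

A caller would run the returned chunk, for example `local chunk = assert(load_script_binary("my_file.lua")); chunk()`.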


I was only able to produce the error you received by encoding multiple close brackets:

a=[[
]]] --> a.lua:2: unexpected symbol near ']'

But, this was easily fixed with the following:

a=[==[
]]==]


The binary data needs to be encoded into printable characters. The simplest method for decoding purposes would be to use C-like escape sequences for all bytes. For example, hex bytes 13 41 42 1E would be encoded as '\19\65\66\30'. Of course, then the encoded data is three to four times larger than the source binary.
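
As a rough illustration of the escape-sequence approach, here is a small Lua sketch that writes every byte as a decimal escape. The function name escape_all_bytes is hypothetical. Because every byte is escaped, each escape is always followed by another backslash or the closing quote, so no zero padding is needed.

```lua
-- Encode a binary string as a quoted Lua literal made only of decimal escapes.
local function escape_all_bytes(data)
  local escaped = data:gsub(".", function(c)
    return "\\" .. c:byte()
  end)
  return "'" .. escaped .. "'"
end

-- Hex bytes 13 41 42 1E from the example above.
print(escape_all_bytes(string.char(0x13, 0x41, 0x42, 0x1E)))
```

The result contains only printable ASCII, so it is not affected by line-end or text-mode translation.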

Alternatively, you could use something like Base64, but that would have to be decoded at runtime instead of relying on the Lua interpreter. Personally, I'd probably go the Base64 route. There are Lua examples of Base64 encoding and decoding.
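
For the Base64 route, a decoder can be written in plain Lua. The sketch below is one possible version, not taken from any particular library; base64_decode is a hypothetical name, and it assumes the standard alphabet (A-Z, a-z, 0-9, +, /). It uses only arithmetic so that it does not depend on a bit library, and it does no validation beyond ignoring characters outside the alphabet.

```lua
local alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
local value_of = {}
for i = 1, #alphabet do
  value_of[alphabet:sub(i, i)] = i - 1
end

-- Decode standard Base64 text into a binary string.
local function base64_decode(text)
  -- Drop padding, whitespace and anything else outside the alphabet.
  text = text:gsub("[^A-Za-z0-9+/]", "")
  local out = {}
  for i = 1, #text, 4 do
    local group = text:sub(i, i + 3)
    -- Pack up to four 6-bit values into one 24-bit number.
    local n = 0
    for j = 1, 4 do
      n = n * 64 + (value_of[group:sub(j, j)] or 0)
    end
    local b1 = math.floor(n / 65536)
    local b2 = math.floor(n / 256) % 256
    local b3 = n % 256
    -- A group of k characters carries k - 1 bytes.
    out[#out + 1] = string.char(b1, b2, b3):sub(1, #group - 1)
  end
  return table.concat(out)
end
```

In the texture file this might be used as `pixels=base64_decode("E0FCHg==");`, provided the function is defined before the file is run.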

Another alternative would be to have two files. Use a well-defined image format file (e.g. TGA) that is pointed to by a separate Lua script with the additional metadata. If you don't want two files to move around then they could be combined in an archive.
