I have a hex dump of a message in a file, and I want to read it into an array so I can run my decoding logic on it.
I was wondering if there is an easier way to parse a message that looks like this:
37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35
3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15
6C 71 34 34 73 69 6D 31 5F 33 30 33 31 00 00 00
00 00 01 28 40 00 00 15 74 65 6C 63 6F 72 64 69
74 65 6C 63 6F 72 64 69
Note that a row holds at most 16 bytes, but any row can contain fewer (minimum: 1).
Is there a nice, elegant way to do this in Perl, rather than reading two characters at a time?
Perl has a hex operator that performs the decoding logic for you.
hex EXPR
hex
Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Remember that the default behavior of split chops up a string at whitespace separators, so for example
$ perl -le '$_ = "a b c"; print for split'
a
b
c
For every line of the input, separate it into hex values, convert the values to numbers, and push
them onto an array for later processing.
#! /usr/bin/perl
use warnings;
use strict;
my @values;
while (<>) {
    push @values => map hex($_), split;
}
# for example
my $sum = 0;
$sum += $_ for @values;
print $sum, "\n";
Sample run:
$ ./sumhex mtanish-input
4196
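If the input might contain stray text, a slightly stricter variant of the same idea could check each token before converting it. This is only a sketch: parse_hex_dump is a made-up name, and it assumes every token should be exactly two hex digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): convert hex-dump
# text into a list of byte values, dying on any token that is not
# exactly two hex digits.
sub parse_hex_dump {
    my ($text) = @_;
    my @values;
    # split ' ' splits on runs of whitespace, including newlines,
    # and skips leading whitespace.
    for my $token (split ' ', $text) {
        die "bad token '$token'\n" unless $token =~ /\A[0-9A-Fa-f]{2}\z/;
        push @values, hex $token;
    }
    return @values;
}

my @values = parse_hex_dump("37 39 30\n3B 00 15\n");
print "@values\n";   # prints "55 57 48 59 0 21"
```

Without the check, hex would warn about an illegal digit and carry on with a partial value, which is easy to miss in a long dump.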
I would read a line at a time, strip the whitespace, and use pack 'H*' to convert it.
It's hard to be more specific without knowing what kind of "decoding logic" you're trying to apply. For example, here's a version that converts each byte to decimal:
while (<>) {
    s/\s+//g;
    my @bytes = unpack('C*', pack('H*', $_));
    print "@bytes\n";
}
Output from your sample file:
55 57 48 53 50 52 53 52 59 50 49 54 57 51 52 53
59 50 49 54 57 51 52 54 0 0 1 8 64 0 0 21
108 113 52 52 115 105 109 49 95 51 48 51 49 0 0 0
0 0 1 40 64 0 0 21 116 101 108 99 111 114 100 105
116 101 108 99 111 114 100 105
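If the decoding works on fields rather than single bytes, it may be handier to keep the whole message as one byte string and let unpack pull the fields out. A rough sketch follows; hex_dump_to_bytes is a made-up name, and the field layout (nine ASCII characters followed by a big-endian 32-bit integer) is purely an assumption for illustration, not something I know about your message format.

```perl
use strict;
use warnings;

# Hypothetical helper: turn hex-dump text into a raw byte string.
sub hex_dump_to_bytes {
    my ($text) = @_;
    (my $digits = $text) =~ s/\s+//g;   # drop spaces and newlines
    return pack 'H*', $digits;
}

my $bytes = hex_dump_to_bytes("37 39 30 35 32 34 35 34 3B\n00 00 01 08\n");

# ASSUMED layout, for illustration only: 9 ASCII characters, then one
# big-endian unsigned 32-bit integer.
my ($text, $number) = unpack 'a9 N', $bytes;
print "$text $number\n";   # prints "79052454; 264"
```

Swap the unpack template for whatever your real message layout is.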
I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units.
Is there some reason you think that's ugly?
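For what it's worth, reading two characters at a time does not have to mean a manual loop. A global match in list context can hand back every pair in one go. A minimal sketch, with hex_pairs being a made-up name, and assuming the dump contains nothing but two-digit tokens and whitespace:

```perl
use strict;
use warnings;

# Hypothetical helper: grab every run of two hex digits, skipping
# whatever separates them, and convert each pair to a number.
sub hex_pairs {
    my ($text) = @_;
    return map { hex $_ } $text =~ /([0-9A-Fa-f]{2})/g;
}

my @bytes = hex_pairs("6C 71 34\n34 73\n");
print "@bytes\n";   # prints "108 113 52 52 115"
```

Note that this silently skips a lone single digit, so it is only safe if the dump is as well-formed as the sample.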
If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.
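One way to read "whitespace-insensitive" here is a pattern that allows any run of whitespace, including a line break, between consecutive tokens. A sketch, where sequence_regex is a made-up name and the byte sequence searched for is only an example taken from the sample dump:

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex that matches the given hex tokens
# in order, with any amount of whitespace (newlines too) between them.
sub sequence_regex {
    my @tokens = @_;
    my $pattern = join '\s+', map { quotemeta $_ } @tokens;
    return qr/\b$pattern\b/i;
}

my $dump = "33 34 36 00 00 01\n08 40 00 00 15 6C\n";
my $re   = sequence_regex(qw(00 00 01 08));

if ($dump =~ $re) {
    # $-[0] is the offset in the text where the match starts
    print "found at offset $-[0]\n";
}
```

The match works even though the sequence straddles two rows of the dump, which is the main reason to match on the text rather than row by row.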