Reading dataset value into a gnuplot variable (start of X series)

Developer https://www.devze.com 2023-04-08 17:11 Source: Internet
I originally thought this might be the same as "gnuplot - start of X series" (Stack Overflow), but I think this is slightly more specific.

Since I'm interested in finding the "start of X series", so to speak, I'll try to clarify with an example. Say you have this script:

# generate data
system "cat > ./inline.dat <<EOF\n\
10.0 1 a 2\n\
10.2 2 b 2\n\
10.4 3 a 2\n\
10.6 4 b 2\n\
10.8 5 c 7\n\
11.0 5 c 7\n\
EOF\n"

# ranges 
set yrange [0:8]
set xrange [0:11.5]

plot "inline.dat" using 1:2 with impulses linewidth 2

If you plot it, you'll notice the data starts from 10 on x-axis:

[Image: plot of inline.dat, with the impulses starting at x = 10]

Now, of course you can adjust the xrange, but sometimes you're interested in "relative positions" which start "from 0", so to speak. In that case, one would like to see the data "moved left" on the x-axis, so that it starts at 0. Since we know the data starts at 10.0, we could subtract that from the first column explicitly:

plot "inline.dat" using ($1-10.0):2 with impulses linewidth 2

... and that basically does the trick.

But say you don't want to specify the "10.0" explicitly in the plot command above. Knowing that it is the first element of the first column of data that is already loaded, one would hope there is a way to read this value into a variable, say, with something like the following pseudocode:

varval = "inline.dat"(1,1) # get first element of first column in variable
plot "inline.dat" using ($1-varval):2 with impulses linewidth 2

... and with something like this, one wouldn't have to specify this "x offset" value, so to speak, manually in the plot command.

So - to rephrase - is there a way to read the start of x series (the first value of a given column in a dataset) as a variable in gnuplot?


Two ways:

1. Plot the data first and let gnuplot tell you the minimum x value:

plot "inline.dat" using 1:2 with impulses linewidth 2

xmin = GPVAL_DATA_X_MIN
plot "inline.dat" using ($1-xmin):2 with impulses linewidth 2

2. Use an external command to find the minimum x value:

xmin = `sort -nk 1 inline.dat | head -n 1 | awk '{print $1}'`
plot "inline.dat" using ($1-xmin):2 with impulses linewidth 2
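Note that the sort pipeline above yields the minimum of column 1, which coincides with the first value only when the file is already sorted by x. A minimal shell sketch of the difference follows; the file name unsorted.dat and its contents are made up for illustration and are not from the post.

```shell
# Hypothetical, deliberately unsorted sample file (not from the post).
cat > ./unsorted.dat <<'EOF'
10.4 3
10.0 1
10.2 2
EOF

# Minimum of column 1: the same idea as the sort pipeline above.
xmin=$(LC_ALL=C sort -n -k 1,1 ./unsorted.dat | head -n 1 | awk '{print $1}')

# First value of column 1 in file order: no sorting needed.
xfirst=$(awk 'NR == 1 {print $1; exit}' ./unsorted.dat)

echo "min=$xmin first=$xfirst"
```

If the "start of the series" means the first row rather than the smallest x, the awk-only variant is the one to wrap in backticks.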


OK, I keep coming back to this - so I think I needed the following clarification here:

Given that gnuplot, well, plots datasets as 2D plots - it's a given that it somehow deals with 2D structures or arrays. That is why someone coming from C, Perl, Python etc. would naturally think it is possible to somehow index the dataset, and be able to retrieve a value at a given row and column position; say, something like the following pseudocode:

my_val = "inline.dat"[1][2]     # get value in 1st row and 2nd column

Or alternately, pseudocode:
my_dataset = parse_dataset("inline.dat")
my_val = get_value(my_dataset, 1, 2)

And I spent a ton of time looking for something like this in gnuplot, and could not find anything like it (direct variable access to dataset values through row and column index). It seems that the only thing one can do is plot the dataset, and possibly access values there via a function called in the using part.

That means that if I want to find some dataset values from gnuplot, I have to iterate through the dataset by calling plot, even if I need those values precisely to construct a proper plot statement :) And I kind of dislike that, thinking that the first plot may somehow screw up the second afterwards :) However, as the comp.graphics.apps.gnuplot thread "finding maximum value in a data set and subtracting from plot" (Google Groups) points out, one can plot to a file (also stdout or /dev/null) and get a plain ASCII formatted table. So at least I can redirect the first call in that way, so that it doesn't interfere with the actual plotting terminal of the second call to plot.

So, below is another code example, where the first element of first column in the "inline.dat" dataset is retrieved via:

# print and get (in _drcv) first value of first data column:
eval print_dataset_row_column("inline.dat",0,1)

# must "copy" the "return" value manually:
first = _drcv

... so then the plot can be offset by first directly in the plot call.

Note again that print_dataset_row_column calls plot (redirected via set table to /dev/null) -- and as such, each time you call it to retrieve a single value, it will cause iteration through the entire dataset! So if you need first element and last element (and possibly other stuff, like some basic statistics with gnuplot), it's probably better to rewrite print_dataset_row_column so it retrieves all of those in one go.

Also, a rewrite of print_dataset_row_column would be needed if you use some special formats in your dataset and the using line. Note that in this example, the third column is a string, which is not accepted by default as a plot data column; as such, calls to the print_dataset_* functions will fail if they have to deal with it (see also "gnuplot plot from string").
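As a point of comparison for the "all in one go" idea, here is a rough shell sketch (an alternative to the gnuplot-only approach, not part of it) that collects the first and last x values in a single pass with awk. It also reads the string in the third column without complaint. The variable name summary is an illustrative choice.

```shell
# Recreate the sample data from the post.
cat > ./inline.dat <<'EOF'
10.0 1 a 2
10.2 2 b 2
10.4 3 a 2
10.6 4 b 2
10.8 5 c 7
11.0 5 c 7
EOF

# One pass over the file: remember columns 1 and 3 of the first row, and
# overwrite "last" on every row so that it holds the final row at END.
summary=$(awk 'NR == 1 {first = $1; third = $3}
               {last = $1}
               END {print first, last, third}' ./inline.dat)

echo "$summary"
```

In a gnuplot script, such a command could presumably be wrapped in backticks or system(), in the same way as the sort pipeline earlier in this thread.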

 

So here is the example code - let's call it test.gp:

# generate data
system "cat > ./inline.dat <<EOF\n\
10.0 1 a 2\n\
10.2 2 b 2\n\
10.4 3 a 2\n\
10.6 4 b 2\n\
10.8 5 c 7\n\
11.0 5 c 7\n\
EOF\n"

### "dry-run" functions:
# iterate through dataset by calling
# `plot`, redirected to file output (via `set table`)
#
# note: eval (not print) cannot be inside a user-defined function:
#  a(b) = eval('c=3') ; print a(4) ==> "undefined function: eval"
# nor can you make multistatement functions with semicolon:
#  f(x) = (2*x ; x=x+2) ==> ')' expected (at ';')
#
# so these functions are defined as strings - and called through eval
#
# through a single column spec in `using`:
# (`plot` prints table to stdout)
#
print_dataset_column(filename,col) = "set table '/dev/stdout' ;\
plot '".filename."' using ".col." ;\
unset table"
#
# through two column spec in `using`:
# (`plot` prints table to stdout)
#
print_dataset_twocolumn(filename,colA,colB) = "set table '/dev/stdout' ;\
plot '".filename."' using ".colA.":".colB." ;\
unset table"
#
# print value of row:column in dataset, saving it as _drcv variable
#
# init variable
#
_drcv = 0
#
# create _drc helper function; note assign and "return" in
# true branch of ternary clause
#
_drc(ri, colval, col) = (ri == _row) ? _drcv = colval : colval
#
# define the callable function:
#
print_dataset_row_column(filename,row,col) = "_row = ".row." ;\
set table '/dev/null' ;\
plot '".filename."' using (_drc($0, $".col.", ".col.")) ;\
unset table ;\
print '".filename."[r:".row.",c:".col."] = ',_drcv"
#
#
### end dry run functions


#
# test print_dataset_* functions:
#

eval print_dataset_column("inline.dat",0)
eval print_dataset_twocolumn("inline.dat",0,0)

# string column - cannot directly:
# set table '/dev/stdout' ;plot 'inline.dat' using 3 ;unset table
#                                                  ^
# line 69: warning: Skipping data file with no valid points
# line 69: x range is invalid
#~ eval print_dataset_column("inline.dat",3)

eval print_dataset_column("inline.dat",1)
eval print_dataset_twocolumn("inline.dat",1,2)

eval print_dataset_row_column("inline.dat",4,1)
eval print_dataset_row_column("inline.dat",4,2)

# will fail - 3 is string column
# line 82: warning: Skipping data file with no valid points
# line 82: x range is invalid
#~ eval print_dataset_row_column("inline.dat",4,3)


#
# do a plot offset by first element position
#

# print and get (in _drcv) first value of first data column:
eval print_dataset_row_column("inline.dat",0,1)
# must "copy" the "return" value manually:
first = _drcv

# ranges
set yrange [0:8]
set xrange [0:11.5]

# plot finally:
plot "inline.dat" using ($1-first):2 with impulses linewidth 2

When this script is called, the dataset from the original post is plotted shifted so that it starts at 0, and the following is output in the terminal (the first few table printouts are the actual output from plot, redirected via set table to stdout):

gnuplot> load './test.gp'

# Curve 0 of 1, 6 points
# Curve title: "'inline.dat' using 0"
# x y type
 0  0  i
 1  1  i
 2  2  i
 3  3  i
 4  4  i
 5  5  i


# Curve 0 of 1, 6 points
# Curve title: "'inline.dat' using 0:0"
# x y type
 0  0  i
 1  1  i
 2  2  i
 3  3  i
 4  4  i
 5  5  i


# Curve 0 of 1, 6 points
# Curve title: "'inline.dat' using 1"
# x y type
 0  10  i
 1  10.2  i
 2  10.4  i
 3  10.6  i
 4  10.8  i
 5  11  i


# Curve 0 of 1, 6 points
# Curve title: "'inline.dat' using 1:2"
# x y type
 10  1  i
 10.2  2  i
 10.4  3  i
 10.6  4  i
 10.8  5  i
 11  5  i

inline.dat[r:4,c:1] = 10.8
inline.dat[r:4,c:2] = 5.0
inline.dat[r:0,c:1] = 10.0


To read a single value from a data file, consider the following user-defined function:

at(file, row, col) = system( sprintf("awk -v row=%d -v col=%d 'NR == row {print $col}' %s", row, col, file) )
file="delta-fps" ; row=2 ; col=2
print at(file,row,col)

Of course, the input to awk has to be cleared of ignored/invalid input (blank lines, comments, etc). For example:

at(file, row, col) = system( sprintf("grep -E -v '^#|^$' %s | awk -v row=%d -v col=%d 'NR == row {print $col}'", file, row, col) )

Still, this function cannot read an arbitrary dataset; it is restricted to files. However, this limitation can be overcome by checking for the redirection character '<' at the start of the file name argument and replacing it in a sensible way (see the ternary operator):

at(file, row, col)=system( sprintf("%s | grep -v '^#\\|^$' | awk -v row=%d -v col=%d 'NR == row {print $col}'", (file[:1] eq '<') ? file[2:] :'cat '.file, row, col) )

A good place to define such a function may be your .gnuplot init file.
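The shell pipeline inside at() can be tried on its own before embedding it in gnuplot. The sketch below assumes a grep that supports -E for the alternation; the shell function at and the file sample.dat are hypothetical stand-ins that mirror the gnuplot definition above.

```shell
# Hypothetical shell counterpart of the gnuplot at() function above:
# print the field at the given 1-based data row and column,
# after dropping comment lines and blank lines.
at() {
    grep -E -v '^#|^$' "$1" | awk -v row="$2" -v col="$3" 'NR == row {print $col; exit}'
}

# Hypothetical sample file containing a comment and a blank line.
cat > ./sample.dat <<'EOF'
# x y
10.0 1

10.2 2
EOF

# Second data row, first column (the comment and blank line are not counted).
at ./sample.dat 2 1
```

Row numbers here count only the lines that survive the filter, so they may differ from the line numbers seen in a text editor.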


Hmmm... OK, I got something:

initer(x) = (!exists("first")) ? first = x : first ;
plot "inline.dat" using ($1-initer($1)):2 with impulses linewidth 2

... but this looks more like "capturing" a variable, than reading it (as the function initer is being used to scan a stream of numbers, detect the first one, and return its value) :) Hope there is a better way of doing this ....
