Given the following code:
require 'optparse'

options = {}
optparse = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
    options[:things] = t
  end
end
If THING1 has a comma in it, how can I prevent OptionParser from splitting on it?
Sample case: ./scrit.rb -t 'foo,bar',baz. In this case I want options[:things] to be ['foo,bar', 'baz'].
Is this even possible?
If you run:
./scrit.rb -t 'foo,bar',baz
the shell passes this ARGV:
["-t", "foo,bar,baz"]
The shell converts 'foo,bar',baz to foo,bar,baz before the script ever sees it:
$ strace -e trace=execve ./scrit.rb -t 'foo,bar',baz
execve("./scrit.rb", ["./scrit.rb", "-t", "foo,bar,baz"], [/* 52 vars */]) = 0
execve("/home/scuawn/bin/ruby", ["ruby", "./scrit.rb", "-t", "foo,bar,baz"], [/* 52 vars */]) = 0
You can use another delimiter:
opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
  options[:things] = t
  options[:things][0] = options[:things][0].split(":")
end
$ ./scrit.rb -t foo:bar,baz
[["foo", "bar"], "baz"]
Or:
opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
  options[:things] = t
  options[:things] = options[:things].length == 3 ? [[options[:things][0],options[:things][1]],options[:things][2]] : options[:things]
end
$ ./scrit.rb -t foo,bar,baz
[["foo", "bar"], "baz"]
First of all, the shell (see note 1) passes the same final value to the script for all of the following quoting variations:
./scrit.rb -t 'foo,bar',baz
./scrit.rb -t foo,'bar,baz'
./scrit.rb -t 'foo,bar,baz'
./scrit.rb -t foo,bar,baz
./scrit.rb -t fo"o,b"ar,baz
./scrit.rb -t foo,b\ar,baz
# obviously many more variations are possible
You can verify this like so:
ruby -e 'f=ARGV[0];ARGV.each_with_index{|a,i|puts "%u: %s <%s>\n" % [i,a==f,a]}'\
'foo,bar',baz foo,'bar,baz' 'foo,bar,baz' foo,bar,baz fo"o,b"ar,baz foo,b\ar,baz
Note 1: I am assuming a Bourne-like shell (some sh-variant like zsh, bash, ksh, dash, et cetera).
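If you would rather check this without leaving Ruby, here is a small sketch using the standard library's Shellwords module, which splits a string according to Bourne-shell word rules. It only approximates what a real shell does, and the variable names (commands, results) are just for illustration.

```ruby
require 'shellwords'

# Each line is the argument part of one of the command lines above.
# The quoted heredoc (<<~'EOS') keeps quotes and backslashes untouched.
commands = <<~'EOS'.lines.map(&:chomp)
  -t 'foo,bar',baz
  -t foo,'bar,baz'
  -t 'foo,bar,baz'
  -t foo,bar,baz
  -t fo"o,b"ar,baz
  -t foo,b\ar,baz
EOS

# Split each line the way a Bourne-like shell would build ARGV.
results = commands.map { |line| Shellwords.split(line) }

results.each { |argv| p argv }
```

Every variation should come out as the same two words, which is why the script has no way to tell where the quotes were.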
If you want to switch to some other separator, you might do it like this:
split_on_semicolons = Object.new
OptionParser.accept split_on_semicolons do |s,|
  s.split ';'
end
⋮
opts.on('-t', '--thing [THING1;THING2]', split_on_semicolons, 'Set THING1, THING2 (semicolon must be quoted to protect it from the shell)') do |t|
  options[:things] = t
end
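Put together as one runnable script, that might look like the sketch below. The hard-coded argument list stands in for what the shell would pass after quote removal, and the parser variable name is my own choice.

```ruby
require 'optparse'

# Any object can serve as the "type" key for a custom conversion.
split_on_semicolons = Object.new
OptionParser.accept(split_on_semicolons) do |s,|
  s.split(';')
end

options = {}
parser = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1;THING2]', split_on_semicolons,
          'Set THING1, THING2 (quote the semicolon for the shell)') do |t|
    options[:things] = t
  end
end

# Simulates: ./scrit.rb -t 'foo,bar;baz'
parser.parse(['-t', 'foo,bar;baz'])
p options[:things]
```

Note that OptionParser.accept registers the conversion globally for the process; that is fine for a small script but worth knowing in a larger program.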
The shell gives special meaning to the semicolon, so it must be escaped or quoted; otherwise it serves as an unconditional command separator (e.g. echo foo; sleep 2; echo bar):
./scrit.rb -t foo,bar\;baz
./scrit.rb -t foo,bar';'baz
./scrit.rb -t 'foo,bar;baz'
# et cetera
The “parsing” done when you specify Array is almost exactly a basic str.split(',') (empty values in the middle of the list come back as nil, and split itself drops trailing empty ones), so there is no way to directly specify an escape character.
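To see the comma splitting in isolation, here is a minimal sketch that feeds the parser the ARGV the shell would produce for the sample case. The argument list is hard-coded for illustration, and parser is just a name I picked.

```ruby
require 'optparse'

options = {}
parser = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
    options[:things] = t
  end
end

# Simulates: ./scrit.rb -t 'foo,bar',baz  (the shell has already removed the quotes)
parser.parse(['-t', 'foo,bar,baz'])
p options[:things]

# For comparison, a plain split on the same string.
p 'foo,bar,baz'.split(',')
```

Both lines should print the same three-element list, so any "escape" has to be layered on top after the fact.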
If you want to stick with commas but introduce an “escape character”, then you could post-process the values a bit in your OptionParser#on
block to stitch certain values back together:
# use backslash as an after-the-fact escape character
# in a sequence of string values,
# if a value ends with an odd number of backslashes, then
# the last backslash should be replaced with
# a comma concatenated with the next value
# a backslash before any other single character is removed
#
# basic unsplit: (note doubled backslashes due to writing these as Ruby values)
# %w[foo\\ bar baz] => %w[foo,bar baz]
#
# escaped, trailing backslash is not an unsplit:
# %w[foo\\\\ bar baz] => %w[foo\\ bar baz]
#
# escaping [other, backslash, split], also consecutive unsplits
# %w[f\\o\\\\o\\ \\\\\\bar\\\\\\ baz] => %w[fo\\o,\\bar\\,baz]
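def unsplit_and_unescape(orig_values)
  values = []
  incomplete_value = nil
  orig_values.each do |val|
    # an odd number of trailing backslashes means the comma was escaped
    incomplete = /\\*$/.match(val)[0].length.odd?
    # unescape without mutating the caller's string
    val = val.gsub(/\\(.)/, '\1')
    val = incomplete_value + ',' + val if incomplete_value
    if incomplete
      # drop the trailing backslash; the next value gets appended after a comma
      incomplete_value = val[0..-2]
    else
      values << val
      incomplete_value = nil
    end
  end
  if incomplete_value
    raise ArgumentError, 'Incomplete final value'
  end
  values
end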
⋮
opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2 (use \\, to include a comma)') do |t|
  options[:things] = unsplit_and_unescape(t)
end
You could then run it from the shell like this (the backslash is also special to the shell, so it must be escaped or quoted; see note 2):
./scrit.rb -t foo\\,bar,baz
./scrit.rb -t 'foo\,bar,baz'
./scrit.rb -t foo'\,'bar,baz
./scrit.rb -t "foo\\,bar,baz"
./scrit.rb -t fo"o\\,ba"r,baz
# et cetera
Note 2: Unlike in Ruby, the shell’s single quote is completely literal (e.g. no backslashes are interpreted), so it is often a good choice when you need to embed any other shell-special characters (like backslashes and double quotes).
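For an end-to-end check of the backslash approach, here is a self-contained sketch. It restates the helper in a non-mutating form (my own variation, including the pending name and the to_s guard, not necessarily how you would want it in production) and parses a hard-coded ARGV equivalent to ./scrit.rb -t 'foo\,bar,baz'.

```ruby
require 'optparse'

# Rejoin values that the Array type split at an escaped comma, then unescape.
def unsplit_and_unescape(orig_values)
  values = []
  pending = nil
  orig_values.each do |orig|
    orig = orig.to_s # guard in case an empty value arrives as nil
    # An odd number of trailing backslashes means the comma was escaped.
    incomplete = orig[/\\*\z/].length.odd?
    val = orig.gsub(/\\(.)/, '\1')
    val = "#{pending},#{val}" if pending
    if incomplete
      pending = val[0..-2] # drop the trailing backslash, wait for the next value
    else
      values << val
      pending = nil
    end
  end
  raise ArgumentError, 'Incomplete final value' if pending
  values
end

options = {}
parser = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1,THING2]', Array,
          'Set THING1, THING2 (use \\, to include a comma)') do |t|
    options[:things] = unsplit_and_unescape(t)
  end
end

# Simulates: ./scrit.rb -t 'foo\,bar,baz'
parser.parse(['-t', 'foo\\,bar,baz'])
p options[:things]
```

This should give the two values the question asked for, with the comma kept inside the first one.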