Check if a string variable is in a set of strings

开发者 https://www.devze.com 2023-02-16 21:09 Source: web

Related topics: ruby

Which one is better:

x == 'abc' || x == 'def' || x == 'ghi'
%w(abc def ghi).include? x
x =~ /abc|def|ghi/

Which one is better? The question can't be easily answered, because they don't all do the same things.

x == 'abc' || x == 'def' || x == 'ghi'
%w(abc def ghi).include? x

both compare x against fixed strings for equality: x has to be exactly one of those values. Between those two I tend to go with the second because it's easier to maintain. Imagine what the first would look like if you had to compare against twenty, fifty or one hundred strings.
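
To illustrate the maintainability point, here is a minimal sketch that keeps the candidate strings in one place. The names ALLOWED_LIST, ALLOWED_SET and allowed? are made up for this example, and using Set from Ruby's standard library is only one option for long lists, not something the answer itself relies on:

```ruby
require 'set'

# Hypothetical names, for illustration only.
ALLOWED_LIST = %w(abc def ghi).freeze  # built once, not on every check
ALLOWED_SET  = ALLOWED_LIST.to_set     # hash-based lookup, handy for long lists

# True only when x is exactly one of the listed strings.
def allowed?(x)
  ALLOWED_SET.include?(x)
end

allowed?('def')     # => true
allowed?('xyzghi')  # => false, there is no substring matching here
```

Adding a twenty-first string then means touching only the list, not the condition.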

The third test:

x =~ /abc|def|ghi/

matches substrings:

x = 'xyzghi'
(x =~ /abc|def|ghi/) # => 3

so it isn't the same as the first two.
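
If an exact match is what you want from the regex, it needs anchors. A minimal sketch, assuming whole-string equality is the goal; \A and \z are Ruby's start-of-string and end-of-string anchors, and the constant name EXACT is made up here:

```ruby
# Hypothetical constant name, for illustration only.
EXACT = /\A(?:abc|def|ghi)\z/

x = 'xyzghi'
(x =~ /abc|def|ghi/)  # => 3, matches the substring starting at index 3
(x =~ EXACT)          # => nil, the whole string must be one of the alternatives
('ghi' =~ EXACT)      # => 0
```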

EDIT: There are some things in the benchmarks done by nash that I'd do differently. Using Ruby 1.9.2-p180 on a MacBook Pro, this runs 1,000,000 loops and compares the effect of anchoring the regex, of using grouping, and of building the %w() array once instead of on every pass through the loop:

require 'benchmark'
str = "test"

n = 1_000_000
Benchmark.bm do |x|
  x.report { n.times { str == 'abc' || str == 'def' || str == 'ghi' } }
  x.report { n.times { %w(abc def ghi).include? str } }
  x.report { ary = %w(abc def ghi); n.times { ary.include? str } }
  x.report { n.times { str =~ /abc|def|ghi/ } }
  x.report { n.times { str =~ /^abc|def|ghi$/ } }
  x.report { n.times { str =~ /^(abc|def|ghi)$/ } }
  x.report { n.times { str =~ /^(?:abc|def|ghi)$/ } }
  x.report { n.times { str =~ /\b(?:abc|def|ghi)\b/ } }
end
# >>       user     system      total        real
# >>   1.160000   0.000000   1.160000 (  1.165331)
# >>   1.920000   0.000000   1.920000 (  1.920120)
# >>   0.990000   0.000000   0.990000 (  0.983921)
# >>   1.070000   0.000000   1.070000 (  1.068140)
# >>   1.050000   0.010000   1.060000 (  1.054852)
# >>   1.060000   0.000000   1.060000 (  1.063909)
# >>   1.060000   0.000000   1.060000 (  1.050813)
# >>   1.050000   0.000000   1.050000 (  1.056147)
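
When reading these timings, keep in mind that the regex variants do not all accept the same strings. A small sketch of the differences; the constant names and sample strings are made up for illustration. Without grouping, the anchors in /^abc|def|ghi$/ attach only to the first and last alternatives:

```ruby
# Hypothetical constant names, for illustration only.
LOOSE   = /^abc|def|ghi$/        # parsed as (^abc) | (def) | (ghi$)
GROUPED = /^(?:abc|def|ghi)$/    # anchors apply to every alternative
BOUNDED = /\b(?:abc|def|ghi)\b/  # whole-word match anywhere in the string

('abcxyz'  =~ LOOSE)    # => 0, it only has to start with abc
('xxdefxx' =~ LOOSE)    # => 2, def is not anchored at all
('abcxyz'  =~ GROUPED)  # => nil
('def'     =~ GROUPED)  # => 0
('say def' =~ BOUNDED)  # => 4
('xxdefxx' =~ BOUNDED)  # => nil, no word boundary around def
```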

The first might be a tad quicker, since there are no method calls and you're doing straight string comparisons, but it's also probably the least readable and least maintainable.

The second is definitely the grooviest, and the Ruby way of going about it. It's the most maintainable, and probably the easiest to read.

The last way uses old-school Perl regex syntax. It's fairly fast, not as annoying to maintain as the first, and fairly readable.

I guess it depends what you mean by "better".

Some benchmarks:

require 'benchmark'
str = "test"
Benchmark.bm do |x|
  x.report {100000.times {if str == 'abc' || str == 'def' || str == 'ghi'; end}}
  x.report {100000.times {if %w(abc def ghi).include? str; end}}
  x.report {100000.times {if str =~ /abc|def|ghi/; end}}
end

    user     system      total        real
0.250000   0.000000   0.250000 (  0.251014)
0.374000   0.000000   0.374000 (  0.402023)
0.265000   0.000000   0.265000 (  0.259014)

So as you can see, the first way is faster than the others. And the longer str is, the slower the last way gets:

str = "testasdasdasdasdasddkmfskjndfbdkjngdjgndksnfg"
    user     system      total        real
0.234000   0.000000   0.234000 (  0.248014)
0.405000   0.000000   0.405000 (  0.403023)
1.046000   0.000000   1.046000 (  1.038059)