Simple way for removing all non word characters

Developer https://www.devze.com 2023-04-06 20:34 Source: Internet
I'd like to remove all non-letter characters from a string in the simplest way possible. For example, to go from "a,sd3 31ds" to "asdds" I can do something like this:

"a,sd3 31ds".gsub(/\W/, "").gsub(/\d/,"")
# => "asdds"

but it looks a little awkward. Is it possible to merge these regexes into one?


"a,sd3 31ds".gsub(/(\W|\d)/, "")


I would go for the regexp /[\W\d]+/. It is potentially faster than e.g. /(\W|\d)/.

require 'benchmark'

N = 500_000
# Candidate patterns, kept as strings so they can double as report labels
Regexps = [ "(\\W|\\d)", "(\\W|\\d)+", "(?:\\W|\\d)", "(?:\\W|\\d)+",
            "\\W|\\d", "[\\W\\d]", "[\\W\\d]+" ]

Benchmark.bm(15) do |x|
  Regexps.each do |re_str|
    # Compile once, outside the timed block
    re = Regexp.new(re_str)
    x.report("/#{re_str}/:") { N.times { "a,sd3 31ds".gsub(re, "") } }
  end
end

This gives (with ruby 2.0.0p195 [x64-mingw32]):

                      user     system      total        real
/(\W|\d)/:        1.950000   0.000000   1.950000 (  1.951437)
/(\W|\d)+/:       1.794000   0.000000   1.794000 (  1.787569)
/(?:\W|\d)/:      1.857000   0.000000   1.857000 (  1.855515)
/(?:\W|\d)+/:     1.638000   0.000000   1.638000 (  1.626698)
/\W|\d/:          1.856000   0.000000   1.856000 (  1.865506)
/[\W\d]/:         1.732000   0.000000   1.732000 (  1.754596)
/[\W\d]+/:        1.622000   0.000000   1.622000 (  1.617705)
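
One plausible reason the variants with + come out ahead in the table above is that they match a whole run of unwanted characters at once, so gsub performs fewer replacements. The sketch below only counts matches on the sample string. It does not measure time, and match_count is a hypothetical helper name introduced for illustration.

```ruby
# Hypothetical helper: how many separate matches a pattern produces in a string.
def match_count(str, pattern)
  str.scan(pattern).length
end

sample = "a,sd3 31ds"

# Without the quantifier, every unwanted character is its own match.
puts match_count(sample, /[\W\d]/)   # prints 5

# With "+", each run of adjacent unwanted characters is a single match.
puts match_count(sample, /[\W\d]+/)  # prints 2

# Both patterns produce the same result string.
puts sample.gsub(/[\W\d]/, "") == sample.gsub(/[\W\d]+/, "")  # prints true
```

On a string this short the difference is small, which is consistent with the modest spread in the timings.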


You can do this with regex alternation (the "or" operator, |):

"205h2n0bn  r0".gsub(/\W|\d/, "")

will do the trick :)


What about

"a,sd3 31ds".gsub(/\W|\d/,"")

You can always join regular expressions by | to express an "or".
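
If the two patterns already exist as separate Regexp objects, the same "or" can be built programmatically with Regexp.union instead of writing the | by hand. This is a minimal sketch. The names UNWANTED and strip_unwanted are made up for illustration.

```ruby
# Regexp.union joins its arguments with "|" into a single pattern.
UNWANTED = Regexp.union(/\W/, /\d/)

# Hypothetical helper: remove everything matched by either sub-pattern.
def strip_unwanted(str)
  str.gsub(UNWANTED, "")
end

puts strip_unwanted("a,sd3 31ds")     # prints "asdds"
puts strip_unwanted("205h2n0bn  r0")  # prints "hnbnr"
```

For just two short patterns the literal /\W|\d/ is easier to read. Regexp.union is mostly useful when the list of patterns is built at runtime.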


You can try this regex:

\P{L}

It matches any character that is not a Unicode letter, but I don't know whether Ruby supports this class.
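
As far as I know, Ruby's regex engine has accepted Unicode property classes such as \p{L} and \P{L} since 1.9, so this can be tried directly. The sketch below uses a hypothetical helper name, letters_only, and shows two places where it behaves differently from /[\W\d]/: underscores and accented letters.

```ruby
# Hypothetical helper: keep only characters that are Unicode letters.
def letters_only(str)
  str.gsub(/\P{L}/, "")
end

puts letters_only("a,sd3 31ds")  # prints "asdds"

# "_" counts as a word character, so /[\W\d]/ keeps it; \P{L} removes it.
puts "a_b".gsub(/[\W\d]/, "")    # prints "a_b"
puts letters_only("a_b")         # prints "ab"

# \w is ASCII-only in Ruby, so /[\W\d]/ drops the accented letter; \P{L} keeps it.
accented = "caf\u00e9 42"
puts accented.gsub(/[\W\d]/, "")  # prints "caf"
puts letters_only(accented)       # prints "café"
```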


A non regex solution:

"a,sd3 31ds".delete('^A-Za-z')
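
String#delete takes a character-set specification in the same style as tr, and a leading ^ negates the set, so the call above deletes everything except ASCII letters. A small sketch with a hypothetical helper name, ascii_letters_only, shows one difference from the \W-based patterns: the underscore is removed too.

```ruby
# Hypothetical helper: delete every character that is not in A-Z or a-z.
def ascii_letters_only(str)
  str.delete("^A-Za-z")
end

puts ascii_letters_only("a,sd3 31ds")    # prints "asdds"
puts ascii_letters_only("snake_case_1")  # prints "snakecase"
```

Because the set is limited to A-Z and a-z, non-ASCII letters are deleted as well.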