Split on different newlines

Developer https://www.devze.com 2023-03-16 15:23 Source: Internet
Right now I'm doing a split on a string and assuming that the newline from the user is \r\n like so:

string.split(/\r\n/)

What I'd like to do is split on either \r\n or just \n.

So what would the regex be to split on either of those?


Did you try /\r?\n/ ? The ? makes the \r optional.

Example usage: http://rubular.com/r/1ZuihD0YfF


Ruby has the methods String#each_line and String#lines

String#each_line returns an enumerator: http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line

String#lines returns an array: http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines

I didn't test it against your scenario but I bet it will work better than manually choosing the newline chars.
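A minimal sketch of what these two methods do with mixed line endings (the sample string is made up for illustration). By default both split on "\n" and keep the terminator on each line, so a "\r" from Windows input stays attached until you strip it:

```ruby
# Hypothetical sample input mixing Windows (\r\n) and Unix (\n) line endings.
text = "one\r\ntwo\nthree"

# String#lines returns an array; each element keeps its line terminator.
raw_lines = text.lines
# => ["one\r\n", "two\n", "three"]

# String#each_line without a block returns an enumerator over the same pieces.
enum = text.each_line

# chomp with no argument removes a trailing "\r\n" as well as a trailing "\n".
clean_lines = enum.map(&:chomp)
# => ["one", "two", "three"]
```

Note that a lone "\r" is not treated as a line break by either method.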


# Split on \r\n or just \n
string.split( /\r?\n/ )

Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
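To illustrate the difference, here is a small sketch with made-up sample strings. A string separator behaves like the equivalent literal regex, but it cannot express "optional \r", so it misses bare "\n" line endings:

```ruby
# Hypothetical input that uses only Windows line endings.
windows_only = "a\r\nb\r\nc"

by_string = windows_only.split("\r\n")  # plain string separator
by_regex  = windows_only.split(/\r\n/)  # equivalent literal regex
# both => ["a", "b", "c"]

# Hypothetical input with mixed endings: the string separator misses the bare "\n".
mixed = "a\r\nb\nc"
mixed_by_string = mixed.split("\r\n")   # => ["a", "b\nc"]
mixed_by_regex  = mixed.split(/\r?\n/)  # => ["a", "b", "c"]
```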


\n is for Unix
\r is for classic Mac OS (before OS X)
\r\n is for Windows

To be safe across operating systems, I would use /\r?\n|\r\n?/

"1\r2\n3\r\n4\n\n5\r\r6\r\n\r\n7".split(/\r?\n|\r\n?/)
=> ["1", "2", "3", "4", "", "5", "", "6", "", "7"]


The alternation operator in Ruby Regexp is the same as in standard regular expressions: |

So, the obvious solution would be

/\r\n|\n/

which is the same as

/\r?\n/

i.e. an optional \r followed by a mandatory \n.
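A quick way to convince yourself the two patterns behave the same is to run both over a few inputs. The sample strings below are made up for this sketch:

```ruby
# Hypothetical sample inputs covering \r\n, \n, blank lines, no newline, and a lone \r.
samples = ["a\r\nb\nc", "a\n\nb", "a\r\n\r\nb", "no newline", "a\rb"]

# Compare the explicit alternation with the optional-\r form on every sample.
equivalent = samples.all? do |s|
  s.split(/\r\n|\n/) == s.split(/\r?\n/)
end

puts equivalent
```

Neither pattern splits on a lone "\r", so "a\rb" comes back as a single element from both.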


Are you reading from a file, or from standard in?

If you're reading from a file in text mode rather than binary mode, or from standard input, you won't have to deal with \r\n on Windows: text mode translates it, so it'll just look like \n.

C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"
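If you want to check what your own platform does, a rough sketch like the following compares a binary read with a text-mode read of the same file. The helper name raw_and_text_reads and the file name sample.txt are made up for this example; the text-mode result depends on the platform, so no output is shown for it:

```ruby
require "tmpdir"

# Hypothetical helper: writes content byte-for-byte, then reads it back twice.
def raw_and_text_reads(content)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.txt")
    File.binwrite(path, content)               # write the bytes exactly as given
    raw  = File.binread(path)                  # binary mode: no newline translation
    text = File.open(path, "r") { |f| f.read } # text mode: translation is platform-dependent
    [raw, text]
  end
end

raw, text = raw_and_text_reads("foo\r\nbar\n")
p raw   # the original bytes, including "\r\n"
p text  # may or may not still contain "\r", depending on the platform
```

Splitting with /\r?\n/ gives the same lines either way, so it is a safe default when you don't control how the data was read.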


Perhaps do a split on only '\n' and remove the '\r' if it exists?
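A minimal sketch of that idea, using a made-up sample string:

```ruby
# Hypothetical input with mixed line endings.
string = "one\r\ntwo\nthree\r\n"

# Split on "\n" only, then strip a trailing "\r" from each piece if present.
lines = string.split("\n").map { |line| line.chomp("\r") }
# => ["one", "two", "three"]
```

Passing "\r" to chomp removes only a trailing carriage return, so lines that never had one are left untouched.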


Another option is to use String#chomp, which also handles newlines intelligently by itself.

You can accomplish what you are after with something like:

lines = string.lines.map(&:chomp)

Or if you are dealing with something large enough that memory use is a concern:

<string|io>.each_line do |line|
  line.chomp!
  #  do work..
end

Performance isn't always the most important thing when solving this kind of problem, but it is worth noting the chomp solution is also a bit faster than using a regex.

On my machine (i7, ruby 2.1.9):

Warming up --------------------------------------
           map/chomp    14.715k i/100ms
  split custom regex    12.383k i/100ms
Calculating -------------------------------------
           map/chomp    158.590k (± 4.4%) i/s -    794.610k in   5.020908s
  split custom regex    128.722k (± 5.1%) i/s -    643.916k in   5.016150s
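The output above looks like it came from the benchmark-ips gem, but the script itself isn't shown. To get a rough comparison on your own machine, here is a sketch using the standard library's Benchmark module instead; the input data and iteration count are arbitrary assumptions, and the numbers will differ from those quoted:

```ruby
require "benchmark"

# Hypothetical input: 1,000 short lines alternating between \r\n and \n endings.
input = Array.new(1_000) do |i|
  ending = i.even? ? "\r\n" : "\n"
  "line #{i}#{ending}"
end.join

iterations = 200 # arbitrary; raise it for more stable timings

Benchmark.bm(20) do |bm|
  bm.report("map/chomp")          { iterations.times { input.lines.map(&:chomp) } }
  bm.report("split custom regex") { iterations.times { input.split(/\r?\n/) } }
end
```

Both approaches produce the same array for this input, so the comparison is like for like.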