
How to get text between two strings in ruby?

Developer https://www.devze.com 2023-03-26 15:57 Source: Internet

I have a text file that contains this text:

What's New in this Version
==========================
-This is the text I want to get 
-It can have 1 or many lines
-These equal signs are repeated throughout the file to separate sections

Primary Category
================

I just want to get everything between ========================== and Primary Category and store that block of text in a variable. I thought the following match method would work, but it gives me NoMethodError: undefined method `match'

    f = File.open(metadataPath, "r")
    line = f.readlines
    whatsNew = f.match(/==========================(.*)Primary Category/m).strip

Any ideas? Thanks in advance.


f is a File object, not a string, so it has no match method. You want to match on the text in the file, which you read into line. Instead of reading the text into an array (which is hard to run a regex on), I prefer to read it into one string:

    contents = File.open(metadataPath) { |f| f.read }
    contents.match(/==========================(.*)Primary Category/m)[1].strip

The last line produces your desired output:

    "-This is the text I want to get \n-It can have 1 or many lines\n-These equal signs are repeated throughout the file to separate sections"


    f = File.open(metadataPath, "r")
    text = f.read   # one string; readlines returns an array, which =~ cannot match
    f.close
    text =~ /==========================(.*)Primary Category/m
    whatsNew = $1

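You may want to refine the .* though: it is greedy, so if Primary Category appears more than once in the file, the match runs up to the last occurrence. Using .*? instead stops at the first one.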
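
To make the greedy point concrete, here is a small sketch. The sample text, with a second Primary Category heading, is made up for illustration and is not from the asker's file; the names rule, greedy and lazy are mine too.

```ruby
# The 26-character rule of equal signs used in the question.
rule = "=" * 26

# Hypothetical sample text containing "Primary Category" twice.
text = <<~TEXT
  What's New in this Version
  #{rule}
  -First note
  -Second note

  Primary Category
  ================
  Games

  Primary Category
  ================
  Utilities
TEXT

# Greedy: .* runs to the LAST "Primary Category" in the text.
greedy = text[/#{rule}(.*)Primary Category/m, 1].strip

# Lazy: .*? stops at the FIRST "Primary Category" after the rule.
lazy = text[/#{rule}(.*?)Primary Category/m, 1].strip

puts greedy
puts "---"
puts lazy
```

With this sample, greedy also swallows the first Primary Category section, while lazy holds only the two note lines.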


Your problem is that readlines gives you an array of strings (one for each line), but the regular expression you're using needs a single string. You could read the file as one string:

    contents = File.read(metadataPath)
    puts contents[/^=+(.*?)Primary Category/m]
    # => ==========================
    # => -This is the text I want to get
    # => -It can have 1 or many lines
    # => -These equal signs are repeated throughout the file to separate sections
    # =>
    # => Primary Category

or you could join the lines into a single string before applying the regular expression:

    lines = File.readlines(metadataPath)
    puts lines.join[/^=+(.*?)Primary Category/m]
    # => ==========================
    # => -This is the text I want to get
    # => -It can have 1 or many lines
    # => -These equal signs are repeated throughout the file to separate sections
    # =>
    # => Primary Category
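
Note that string[regex] returns the whole match, delimiters included. If you only want the text in between, String#[] also takes a capture group index. A minimal sketch, using an inline copy of the question's text in place of File.read(metadataPath), with whats_new as my own variable name:

```ruby
# Sample text copied from the question; in practice this would come from
# File.read(metadataPath).
contents = <<~TEXT
  What's New in this Version
  ==========================
  -This is the text I want to get
  -It can have 1 or many lines
  -These equal signs are repeated throughout the file to separate sections

  Primary Category
  ================
TEXT

# The second argument (1) asks for capture group 1 instead of the whole match.
# &.strip avoids a NoMethodError on nil when the pattern does not match.
whats_new = contents[/^=+\n(.*?)^Primary Category/m, 1]&.strip

puts whats_new
```

Here whats_new holds just the three dash lines, without the row of equal signs or the Primary Category heading.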


The approach I'd take is read in the lines, find out which line numbers are a series of equal signs (using Array#find_index), and group the lines into chunks from the line after the equal signs to the line before (or two lines before) the next lot of equal signs (probably using Enumerable#each_cons(2) and map). That way I don't have to modify much if the section headings change.
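
A rough sketch of that idea, under a few assumptions of mine: every row of equal signs has its heading on the line directly above it, the sample text stands in for the real file, and the Games line is made up. Array#find_index only returns the first match, so the sketch swaps in each_index.select to collect all the rule positions.

```ruby
# Sample lines; in practice: File.readlines(metadataPath, chomp: true)
lines = <<~TEXT.lines(chomp: true)
  What's New in this Version
  ==========================
  -This is the text I want to get
  -It can have 1 or many lines

  Primary Category
  ================
  Games
TEXT

# Indexes of every line made up only of equal signs.
rule_indexes = lines.each_index.select { |i| lines[i].match?(/\A=+\z/) }

# lines.size + 1 is a sentinel "next rule" so the last section runs to the
# end of the file.
boundaries = rule_indexes + [lines.size + 1]

pairs = boundaries.each_cons(2).map do |rule, next_rule|
  heading = lines[rule - 1]
  # Body: from the line after this rule to two lines before the next rule
  # (the line right before the next rule is the next section's heading).
  body = lines[(rule + 1)..(next_rule - 2)]
  [heading, body.join("\n").strip]
end

sections = pairs.to_h
puts sections["What's New in this Version"]
```

Because the sections are keyed by whatever heading sits above each rule, nothing here hard-codes Primary Category.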
