I want to select blocks of text within a given string. The blocks all follow nearly the same pattern. For example, in the text below I want to capture each block that starts with a line beginning with "Client", i.e. the information for the 3 clients. Sometimes a block may not start with the word "Client"; it may start with "Customer", "Project Title" or "Employer" instead.
1. Client Name
The XXX Company
Title
Application Dev Office
Period
September 2008 Till date
Role
Quality Analyst Lead
Responsibilities
Testing
Client Name
The XYZ Company
Title
Application web
Period
September 2009 Till date
Role
Quality Tester Lead
Responsibilities
Testing and destroying
3) Client Name
The 1234 Company
Title
Application web RIA
Period
September 2209 Till date
Role
Quality Lead
Responsibilities
Developer
I have created the following regular expression for this:
(\n|\r|\a|\f)(\s|\d|\.)*?
(?<id>(Client|Customer|Role|Organi(s|z)ation|Vendor|Company|Employer))
(\s|\S)*?(?=(\n|\r|\a|\f)(\s|\d|\.)*?(\k'id'))*?
I used (\n|\r|\a|\f) because, when I load the text from a file into a string variable, ^ does not match at these characters.
The problem with this regular expression is that it identifies the information for the first two clients, but not for the last client.
Does anyone know how to develop a regular expression for this? I am using C#.
Thanks in advance.
------------------ EDITED PART -------------------
I need a regex that works like this: if client-related information starts with the word "client", then check whether words like "role", "environment" or "vendor" are present. Only if such words are present can we say that it is client-related information. In some cases this information can start with other words, like "employer". In that case we still have to search for words like "role", "environment" or "vendor". That is why I wrote my regex like this:
(?<id>(Client|Customer|Role|Organi(s|z)ation|Vendor|Company|Employer))
If the word "client" matches, then any of the words except "client" should match in the subsequent text. If any of those words is found, then start looking for "client" again.
Your regex may be falling over due to the ")" on item 3. Add \) to the prefix group of the regex and it seems to work fine:
(\n|\r|\a|\f)(\s|\d|\.|\))*?(?<id>(Client|Customer|Role|Organi(s|z)ation|Vendor|Company|Employer))(\s|\S)*?(?=(\n|\r|\a|\f)(\s|\d|\.)*?(\k'id'))*?
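To see the effect of the extra \) in isolation, here is a minimal sketch that tests only the prefix part of both patterns against the line "3) Client Name". The keyword group is cut down to just "Client" for brevity, and the class and method names (PrefixCheck, Matches) are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class PrefixCheck
{
    // Prefix groups copied from the two patterns above; the keyword
    // alternation is reduced to "Client" to keep the example short.
    const string Original  = @"(\n|\r|\a|\f)(\s|\d|\.)*?Client";
    const string WithParen = @"(\n|\r|\a|\f)(\s|\d|\.|\))*?Client";

    // Hypothetical helper: true if the pattern matches anywhere in the input.
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    static void Main()
    {
        string line = "\n3) Client Name";
        Console.WriteLine(Matches(Original, line));   // False: ")" is not allowed in the prefix
        Console.WriteLine(Matches(WithParen, line));  // True
    }
}
```

A line numbered as "1." matches with either prefix, so only the "3)" style of numbering is affected.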
This will grab each client block, if that's what you want:
Regex regexObj = new Regex("^[^A-Za-z]*Client(?:(?!^[^A-Za-z]*Client).)*", RegexOptions.Singleline | RegexOptions.Multiline);
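As a rough sketch of how this pattern might be used, the snippet below runs it over a shortened version of the sample text and collects each block. It also shows one possible way to handle the other requirements from the question: accepting other starting words ("Customer", "Employer", "Project Title") and keeping only blocks that contain a confirming word such as "Role" or "Vendor". The BlockFinder class, the FindBlocks method and both keyword lists are assumptions made for this example, so adjust them to your data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BlockFinder
{
    // Assumed keyword lists, taken from the words mentioned in the question.
    const string Start = "(?:Client|Customer|Employer|Project Title)";
    const string Confirm = @"\b(?:Role|Environment|Vendor)\b";

    // A block starts at a line whose first letters are a start keyword
    // (optionally preceded by numbering such as "1." or "3)") and runs
    // until the next such line or the end of the text.
    static readonly Regex BlockRegex = new Regex(
        "^[^A-Za-z]*" + Start + "(?:(?!^[^A-Za-z]*" + Start + ").)*",
        RegexOptions.Singleline | RegexOptions.Multiline);

    public static List<string> FindBlocks(string text)
    {
        var blocks = new List<string>();
        foreach (Match m in BlockRegex.Matches(text))
        {
            // Keep only blocks that also mention a confirming keyword.
            if (Regex.IsMatch(m.Value, Confirm, RegexOptions.IgnoreCase))
                blocks.Add(m.Value.Trim());
        }
        return blocks;
    }

    static void Main()
    {
        string text =
            "1. Client Name\nThe XXX Company\nRole\nQuality Analyst Lead\n" +
            "Client Name\nThe XYZ Company\nRole\nQuality Tester Lead\n" +
            "3) Client Name\nThe 1234 Company\nRole\nQuality Lead\n";

        foreach (string block in FindBlocks(text))
        {
            Console.WriteLine("--- block ---");
            Console.WriteLine(block);
        }
    }
}
```

Note that the start keyword is only recognized at the beginning of a line, so a word like "Client" in the middle of a "Responsibilities" line does not split a block. A line inside a block that happens to begin with one of the start words would still start a new block.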