Regex to extract the search term in search phrase

Developer https://www.devze.com 2022-12-12 10:55 Source: web
I have the following search phrase and I need to extract

  1. ABC XYZ
  2. Mobile Accessories
  3. Samsung 250

whenever they occur in the string, in any order. The application is C# on .NET.

Search Phrase
__________________________________________________________
ABC XYZ
ABC XYZ category:"Mobile Accessories"
category:"Mobile Accessories" ABC XYZ
ABC XYZ Model:"Samsung 250"
Model:"Samsung 250" ABC XYZ
ABC XYZ category:"Mobile Accessories" Model:"Samsung 250"
Model:"Samsung 250" category:"Mobile Accessories" ABC XYZ
category:"Mobile Accessories" Model:"Samsung 250" ABC XYZ
__________________________________________________________

Thanks in advance.

Example 1 Input - ABC XYZ category:"Mobile Accessories" Output - ABC XYZ and Mobile Accessories

Example 2 Input - Model:"Samsung 250" category:"Mobile Accessories" ABC XYZ Output - Samsung 250, Mobile Accessories and ABC XYZ

Example 3 Input - ABC XYZ Output - ABC XYZ

Example 4 Input - Model:"Samsung 250" ABC XYZ Output - Samsung 250 and ABC XYZ


If you're literally trying to find explicit strings, the IndexOf method will work for you (e.g. s.IndexOf("ABC XYZ"), which returns -1 when the string is not found).

The syntax you show looks like a field:"value" syntax, though, so perhaps you want a regex like "([A-Za-z]+):\"([^\"]+)\"" (written here as an escaped C# string literal), which captures each field name and its value as a pair.

If that's not what you're after, sorry; the question is a bit vague.
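To make the field:"value" idea concrete, here is a minimal sketch of how such a pattern could be applied in C#. The class name FieldParser and the method ParseFields are hypothetical names made up for this illustration, and lower-casing the field names is an assumption, chosen so that Model and category can be looked up uniformly.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // Matches field:"value"; group 1 is the field name, group 2 the quoted value.
    private static readonly Regex FieldPattern =
        new Regex("([A-Za-z]+):\"([^\"]+)\"");

    // Hypothetical helper: returns every field/value pair found in the phrase,
    // keyed by the lower-cased field name (an assumption for this sketch).
    public static Dictionary<string, string> ParseFields(string phrase)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(phrase))
        {
            fields[m.Groups[1].Value.ToLowerInvariant()] = m.Groups[2].Value;
        }
        return fields;
    }
}
```

For the input Model:"Samsung 250" category:"Mobile Accessories" ABC XYZ, this should give two entries: model mapped to Samsung 250 and category mapped to Mobile Accessories. The free text ABC XYZ is not captured by this pattern.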


As for Model and Category, you can capture them using something like this:

category:"([^"]*)"

This searches for the string category:" followed by your category (which presumably can change), followed by another ". Of course, in C# the quotes have to be escaped; as a verbatim string this is @"category:""([^""]*)""".
Similarly, you can extract the Model: Model:"([^"]*)".

I'm not sure about the rest, but if you remove these two matches from the string, you are left with the free-text term.
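As a rough illustration of the verbatim-string form mentioned above, the following sketch pulls the category out of a phrase. CategoryExample and TryExtractCategory are hypothetical names, and the use of RegexOptions.IgnoreCase is an assumption, added so that Category:"..." would match as well as category:"...".

```csharp
using System;
using System.Text.RegularExpressions;

public static class CategoryExample
{
    // Verbatim string: each literal double quote is written as "".
    private static readonly Regex CategoryPattern =
        new Regex(@"category:""([^""]*)""", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns true and sets 'category' to the quoted
    // value when the phrase contains a category field; otherwise returns
    // false and sets 'category' to an empty string.
    public static bool TryExtractCategory(string phrase, out string category)
    {
        Match m = CategoryPattern.Match(phrase);
        category = m.Success ? m.Groups[1].Value : "";
        return m.Success;
    }
}
```

The same shape should work for the model by swapping the field name in the pattern.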


It seems like you want to extract a few different patterns from the same string. One approach would be to find each match and then remove it from your working string.

Example:

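using System.Text.RegularExpressions;

string workingString = "ABC XYZ category:\"Mobile Accessories\"";

// IgnoreCase so that both "model:" and "Model:" are matched
Regex categoryMatch = new Regex("category:\"([^\"]+)\"", RegexOptions.IgnoreCase);
Regex modelMatch = new Regex("model:\"([^\"]+)\"", RegexOptions.IgnoreCase);

// Group 1 holds the quoted value; it is an empty string when there is no match
string category = categoryMatch.Match(workingString).Groups[1].Value;
string model = modelMatch.Match(workingString).Groups[1].Value;

// Remove the matched fields from the working string
workingString = categoryMatch.Replace(workingString, "");
workingString = modelMatch.Replace(workingString, "");

string name = workingString.Trim(); // I assume that the extra data is the name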

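This will extract the category, model and name regardless of the order in which they appear in the string. Note that for a malformed string, where a field sits in the middle of the name, such as:

ABC Model:"Samsung 250" XYZ

the name will come back with a doubled space where the field was removed:

ABC  XYZ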
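If the doubled space left behind in that case is unwanted, one option is to collapse whitespace after the fields have been removed. This is only a sketch: NameCleaner and NormalizeSpaces are hypothetical names, and it assumes that runs of whitespace inside the name carry no meaning.

```csharp
using System.Text.RegularExpressions;

public static class NameCleaner
{
    // Collapses each run of whitespace to a single space and trims both ends.
    public static string NormalizeSpaces(string text)
    {
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Applied to the leftover above, NameCleaner.NormalizeSpaces("ABC  XYZ") returns "ABC XYZ".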