
Getting a list out of an extractor -- or even a Match

Developer https://www.devze.com 2023-02-01 02:09 Source: Internet

First, I was like cool...

scala> var nameRE = """\W*(\w+)\W+(\w+)\W*""".r
nameRE: scala.util.matching.Regex = \W*(\w+)\W+(\w+)\W*

scala> var nameRE(first, last) = "Will Smith " 
first: String = Will
last: String = Smith

Then I was like darn...

scala> var listOfVowels = "(([aeiou])*)".r
listOfVowels: scala.util.matching.Regex = (([aeiou])*)

scala> var listOfVowels(vowels:List[String]) = "uoiea"
<console>:7: error: scrutinee is incompatible with pattern type;
 found   : List[String]
 required: java.lang.String
       var listOfVowels(vowels:List[String]) = "uoiea"

Now I'm like huh ...

scala> (listOfVowels findFirstMatchIn "uoiea" get) subgroups
res35: List[String] = List(a)

[ In case my question isn't obvious: how do I get a list of all the subgroups actually matched by a pattern, ideally in an extractor but in any case without writing a second-level matcher? The correct answer here would be List(u, o, i, e, a), of course. ]


This will let you extract the pattern:

scala> var listOfVowels(vowels @ _*) = "uoiea"
vowels: Seq[String] = List(uoiea, a)

However, the pattern doesn't do what you expect: it doesn't generate multiple groups. The regex library's rule is one pair of parentheses, one group, and every regex engine I know works that way.
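To make the "one pair of parentheses, one group" rule concrete, here is a minimal sketch that reuses nameRE from the question. The object name GroupsAsSeq and the value parts are only illustrative. Binding with `@ _*` hands back exactly as many strings as the pattern has groups, never one per repetition.

```scala
object GroupsAsSeq {
  // The two-group pattern from the question.
  val nameRE = """\W*(\w+)\W+(\w+)\W*""".r

  // `groups @ _*` binds every capture group as a sequence:
  // one entry per pair of parentheses in the pattern.
  val parts: List[String] = "Will Smith " match {
    case nameRE(groups @ _*) => groups.toList
    case _                   => Nil
  }
}
```

Here parts holds two elements because the pattern has two groups; adding a `*` after a group would not add more entries.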


As far as I know, a subpattern of the form (E)* in a regular expression cannot be turned into a List of Strings. Scala's regular expressions are implemented on top of the JDK's regex classes (java.util.regex.Pattern, java.util.regex.Matcher, etc.), and that implementation simply does not support capturing every repetition of a subpattern. On a match, the subpattern (E)* captures only the last repetition. As far as I know, most regex implementations behave this way.
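A small sketch of that behavior, calling the JDK classes directly with the pattern from the question. The object and value names (LastCaptureDemo, wholeRun, lastIteration) are made up for the illustration.

```scala
import java.util.regex.Pattern

object LastCaptureDemo {
  // Group 1 is the whole run of vowels; group 2 is the repeated single vowel.
  private val matcher = Pattern.compile("(([aeiou])*)").matcher("uoiea")

  // matches() requires the entire input to match the pattern.
  val matched: Boolean = matcher.matches()

  // Two pairs of parentheses give two groups, however often the inner one repeats.
  val groupCount: Int = matcher.groupCount()

  // Group 1 spans the complete run.
  val wholeRun: String = matcher.group(1)

  // Group 2 only remembers its final repetition; earlier ones are overwritten.
  val lastIteration: String = matcher.group(2)
}
```

After the match, wholeRun is "uoiea" and lastIteration is just "a", which is why no extractor built on these groups can return all five vowels.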

Although I assume your case is a simplified version of your actual problem, a simple solution for the case as described does exist:

scala> "[aeiou]".r findAllIn "hello, world!" toList
res1: List[String] = List(e, o, o)
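Note that findAllIn on its own skips anything that is not a vowel, so it does not check that the input is a run of vowels the way (([aeiou])*) would. If that check matters and a second pass is acceptable, one possible sketch is below. VowelRun and vowelsOf are hypothetical names, not library API.

```scala
object VowelRun {
  private val WholeRun = "[aeiou]*".r // the entire input must consist of vowels
  private val OneVowel = "[aeiou]".r  // a single repetition

  // Some(list of vowels) only when the whole string is a run of vowels,
  // None as soon as any other character is present.
  def vowelsOf(s: String): Option[List[String]] =
    if (WholeRun.pattern.matcher(s).matches()) Some(OneVowel.findAllIn(s).toList)
    else None
}
```

The first regex validates the shape of the input, the second splits it into repetitions. This is a second-level matcher in the sense of the question, so it is a workaround rather than a way around the limitation.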

Hopefully that helps.


This is a bit stylistically sloppy.

scala> implicit def mkRr(regex: String) = new { def rr = new { def unapply(s: String) = (regex.r findAllIn s toList) match { case Nil => None ; case xs => Some(xs) } } }
mkRr: (regex: String)java.lang.Object{def rr: java.lang.Object{def unapply(s: String): Option[List[String]]}}

scala> val ListOfVowels = "[aeiou]".rr
ListOfVowels: java.lang.Object{def unapply(s: String): Option[List[String]]} = $anon$1$$anon$2@49f2afad

scala> val ListOfVowels(vowels) = "uoiea"
vowels: List[String] = List(u, o, i, e, a)
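If the anonymous classes bother you, roughly the same idea can be written as a small named class. This is only a sketch: AllMatches is a hypothetical helper, not something in the standard library, and like the version above it returns every match of the regex without checking what lies between matches.

```scala
import scala.util.matching.Regex

// Hypothetical helper: an extractor that yields every match of `regex` in the input.
final class AllMatches(regex: Regex) {
  def unapply(s: String): Option[List[String]] =
    regex.findAllIn(s).toList match {
      case Nil => None     // no match at all: the pattern does not apply
      case xs  => Some(xs) // one element per match, in order of appearance
    }
}

object AllMatchesDemo {
  // Capitalized so it can be used as a stable identifier in patterns.
  val ListOfVowels = new AllMatches("[aeiou]".r)

  val vowels: List[String] = "uoiea" match {
    case ListOfVowels(vs) => vs
    case _                => Nil
  }
}
```

A named class avoids the structural types in the one-liner, which are invoked reflectively, and it makes the return type of unapply easy to see.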