Scala - get unique values from List with a twist

I have a list like this:

val l= List(("Agent", "PASS"), ("Agent", "FAIL"), ("Agent 1", "FAIL"), ("Agent", "PASS"), ("Agent 2", "PASS") )

and I need to end up with a list like this:

val filteredList= List(("Agent", "FAIL"), ("Agent 1", "FAIL"), ("Agent 2", "PASS") )

What happened?

("Agent", "PASS"), ("Agent", "FAIL")

becomes

("Agent", "FAIL")

(because if there is at least one FAIL, I need to keep that entry)

The entries for Agent 1 and Agent 2 stay the same because there is just one entry for each.

The closest answer I found is "How in Scala to find unique items in List", but I cannot tell how to keep the entries with FAIL.

I hope the question is clear, if not, I can give you a better example.

Thanks

Preamble

It occurred to me that the status could be seen as having a priority: given a sequence of (agent, status) pairs, the task is to select only the highest-priority status for each agent. Unfortunately, status isn't strongly typed with an explicit ordering, but as it's a string with only two values, we can safely treat natural string ordering as corresponding 1:1 to the priority.

Both my answers take advantage of two useful facts:

In natural string ordering, "FAIL" < "PASS", so:

List("PASS", "FAIL", "PASS").sorted.head = "FAIL"

For two tuples (x,a) and (x,b), (x,a) > (x, b) if (a > b)
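If relying on string ordering feels too implicit, the priority idea from the preamble could also be spelled out with a small status type. This is only a sketch: the names TypedStatus, Status, Fail, Pass, parse and worstPerAgent are made up for illustration, and parse assumes every status other than "FAIL" is a pass.

```scala
// Hypothetical names throughout; none of them come from the original answer.
object TypedStatus {
  sealed abstract class Status(val priority: Int)
  case object Fail extends Status(0) // smaller number = higher priority
  case object Pass extends Status(1)

  // Explicit ordering, replacing the reliance on "FAIL" < "PASS" as strings.
  implicit val statusOrdering: Ordering[Status] =
    Ordering.by((s: Status) => s.priority)

  // Assumption: anything that is not "FAIL" counts as a pass.
  def parse(s: String): Status = if (s == "FAIL") Fail else Pass

  // Keep the highest-priority (smallest) status seen for each agent.
  def worstPerAgent(results: List[(String, String)]): Map[String, Status] =
    results
      .groupBy(_._1)
      .map { case (agent, rs) => agent -> rs.map(r => parse(r._2)).min }
}
```

Calling TypedStatus.worstPerAgent(l) on the list from the question should map Agent and Agent 1 to Fail and Agent 2 to Pass.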

UPDATED REPLY

val solution = l.sorted.reverse.toMap

When converting a Seq[(A,B)] to a Map[A,B] via the .toMap method, each "key" in the original sequence of tuples can only appear in the resulting Map once. As it happens, the conversion uses the last such occurrence.

l.sorted.reverse = List(
  (Agent 2,PASS),  // <-- Last "Agent 2"
  (Agent 1,FAIL),  // <-- Last "Agent 1"
  (Agent,PASS),
  (Agent,PASS),
  (Agent,FAIL))    // <-- Last "Agent"

l.sorted.reverse.toMap = Map(
  Agent 2 -> PASS,
  Agent 1 -> FAIL,
  Agent -> FAIL)

ORIGINAL REPLY

Starting with the answer...

val oldSolution = (l groupBy (_._1)) mapValues {_.sorted.head._2}

...and then showing my working :)

//group
l groupBy (_._1) = Map(
  Agent 2 -> List((Agent 2,PASS)),
  Agent 1 -> List((Agent 1,FAIL)),
  Agent -> List((Agent,PASS), (Agent,FAIL), (Agent,PASS))
)

//extract values
(l groupBy (_._1)) mapValues {_.map(_._2)} = Map(
  Agent 2 -> List(PASS),
  Agent 1 -> List(FAIL),
  Agent -> List(PASS, FAIL, PASS))

//sort
(l groupBy (_._1)) mapValues {_.map(_._2).sorted} = Map(
  Agent 2 -> List(PASS),
  Agent 1 -> List(FAIL),
  Agent -> List(FAIL, PASS, PASS))

//head
(l groupBy (_._1)) mapValues {_.map(_._2).sorted.head} = Map(
  Agent 2 -> PASS,
  Agent 1 -> FAIL,
  Agent -> FAIL)

However, you can directly sort the agent -> status pairs without needing to first extract _2:

//group & sort
(l groupBy (_._1)) mapValues {_.sorted} = Map(
  Agent 2 -> List((Agent 2,PASS)),
  Agent 1 -> List((Agent 1,FAIL)),
  Agent -> List((Agent,FAIL), (Agent,PASS), (Agent,PASS)))

//extract values
(l groupBy (_._1)) mapValues {_.sorted.head._2} = Map(
  Agent 2 -> PASS,
  Agent 1 -> FAIL,
  Agent -> FAIL)

In either case, feel free to convert back to a List of Pairs if you wish:

l.sorted.reverse.toMap.toList = List(
  (Agent 2, PASS),
  (Agent 1, FAIL),
  (Agent, FAIL))

Is this what you want?

jem@Respect:~$ scala
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) Client VM, Java 1.6.0_21).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val l= List(("Agent", "PASS"), ("Agent", "FAIL"), ("Agent 1", "FAIL"), ("Agent", "PASS"), ("Agent 2", "PASS") )
l: List[(java.lang.String, java.lang.String)] = List((Agent,PASS), (Agent,FAIL), (Agent 1,FAIL), (Agent,PASS), (Agent 2,PASS))

scala> l.foldLeft(Map.empty[String, String]){(map,next) =>
     |   val (agent, result) = next
     |   if ("FAIL" == result) map.updated(agent, result)
     |   else {           
     |     val maybeExistingResult = map.get(agent)
     |     if (maybeExistingResult.map(_ == "FAIL").getOrElse(false)) map
     |     else map.updated(agent, result)
     |   }
     | }
res0: scala.collection.immutable.Map[String,String] = Map((Agent,FAIL), (Agent 1,FAIL), (Agent 2,PASS))

scala> res0.toList
res1: List[(String, String)] = List((Agent 2,PASS), (Agent 1,FAIL), (Agent,FAIL))

Or here is a shorter and more obscure solution:

scala> l.groupBy(_._1).map(pair => (pair._1, pair._2.reduceLeft((a,b) => if ("FAIL" == a._2 || "FAIL" == b._2) (a._1, "FAIL") else a))).map(_._2).toList
res2: List[(java.lang.String, java.lang.String)] = List((Agent 2,PASS), (Agent 1,FAIL), (Agent,FAIL))

Plenty of good solutions, but here is mine anyway. :-)

l
.groupBy(_._1) // group by key
.map { 
    case (key, list) => 
        if (list.exists(_._2 == "FAIL")) (key, "FAIL") 
        else (key, "PASS")
}

Here's another that just came to me in a sudden epiphany:

def booleanToString(b: Boolean) = if (b) "PASS" else "FAIL"
l
.groupBy(_._1)
.map {
    case (key, list) => key -> booleanToString(list.forall(_._2 == "PASS"))
}

Here is my take. First a functional solution:

l.map(_._1).toSet.map((n: String) => (n, if (l.contains((n, "FAIL"))) "FAIL" else "PASS"))

First we isolate the names, uniquely (toSet), then we map each name to a tuple with itself as first element, and either "FAIL" as second element if a fail is contained in l, or otherwise it must obviously be a "PASS".

The result is a set. Of course you can do toList at the end of the call chain if you really need a list.

Here is an imperative solution:

var l = List(("Agent", "PASS"), ("Agent", "FAIL"), ("Agent 1", "FAIL"), ("Agent", "PASS"), ("Agent 2", "PASS"))
l.foreach(t=>if(t._2=="FAIL") l=l.filterNot(_ == (t._1,"PASS")))
l=l.toSet.toList

I don't like it as much because it is imperative, but hey. In some sense, it better reflects what you would actually do if you solved this by hand: for each "FAIL" you see, you remove all corresponding "PASS"es. After that, you ensure uniqueness (.toSet.toList).

Note that l is a var in the imperative solution, which is necessary because it gets reassigned.

Look at "Aggregate list values in Scala".

In your case you'd group by Agent and aggregate by folding PASS+PASS=>PASS and ANY+FAIL=>FAIL.
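A minimal sketch of that group-then-fold idea, assuming the only statuses are "PASS" and "FAIL". The names AggregateStatus, combine and aggregate are hypothetical, chosen just for this example.

```scala
// Hypothetical helper names; a sketch of the folding rule described above.
object AggregateStatus {
  // ANY + FAIL => FAIL, PASS + PASS => PASS
  def combine(a: String, b: String): String =
    if (a == "FAIL" || b == "FAIL") "FAIL" else "PASS"

  def aggregate(results: List[(String, String)]): Map[String, String] =
    results
      .groupBy(_._1) // agent -> all (agent, status) pairs for that agent
      .map { case (agent, rs) =>
        // "PASS" is the neutral starting value for combine
        agent -> rs.map(_._2).foldLeft("PASS")(combine)
      }
}
```

AggregateStatus.aggregate(l) gives a Map; call .toList on it if a List of pairs is needed.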

Perhaps it is more efficient to group first, then find the disjunction of PASS/FAIL:

l.filter(_._2 == "PASS").toSet -- l.filter(_._2 == "FAIL").map(x => (x._1, "PASS"))

This is based on your output of ("Agent", "PASS") but if you just want the agents:

l.filter(_._2 == "PASS").map(x => x._1).toSet -- l.filter(_._2 == "FAIL").map(x => x._1)

Somehow I expected that second one to be shorter.
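Note that the first expression above keeps only the agents that never failed; for the sample list it evaluates to Set((Agent 2,PASS)). If the goal is the full filteredList from the question, the same set-difference idea can be extended by adding the FAIL entries back. A sketch, with SetDifference and dedupe as made-up names:

```scala
// Hypothetical names; extends the set-difference approach to include the FAILs.
object SetDifference {
  def dedupe(results: List[(String, String)]): Set[(String, String)] = {
    val fails  = results.filter(_._2 == "FAIL").toSet
    val passes = results.filter(_._2 == "PASS").toSet
    // Drop every PASS whose agent also has a FAIL, then add the FAILs back.
    (passes -- fails.map(x => (x._1, "PASS"))) ++ fails
  }
}
```

The result is a Set, so duplicates are gone but the original order is not kept.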

So as I understand it, you want to:

1. Group the tuples by their first entry ("key")
2. For each key, check all tuple second entries for the value "FAIL"
3. Produce (key, "FAIL") if you find "FAIL" or (key, "PASS") otherwise

Since I still find foldLeft, reduceLeft, etc. hard to read, here's a direct translation of the steps above into for comprehensions:

scala> for ((key, keyValues) <- l.groupBy{case (key, value) => key}) yield {
     |   val hasFail = keyValues.exists{case (key, value) => value == "FAIL"}
     |   (key, if (hasFail) "FAIL" else "PASS")                              
     | }
res0: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map((Agent 2,PASS), (Agent 1,FAIL), (Agent,FAIL))

You can call .toList at the end there if you really want a List.

Edit: slightly modified to use the exists idiom suggested by Daniel C. Sobral.

Do you need to preserve the original order? If not, the shortest solution I know of (also quite straightforward) is:

{
  val fail = l.filter(_._2 == "FAIL").toMap        // Find all the fails
  l.filter(x => !fail.contains(x._1)) ::: fail.toList // All nonfails, plus the fails
}

but this won't remove extra passes. If you want that, then you need an extra map:

{
  val fail = l.filter(_._2 == "FAIL").toMap
  l.toMap.filter(x => !fail.contains(x._1)).toList ::: fail.toList
}

On the other hand, you might want to take the elements in the same order you originally found them. This is trickier because you need to keep track of when the first interesting item appeared:

{
  val fail = l.filter(_._2 == "FAIL").toMap
  val taken = new scala.collection.mutable.HashMap[String,String]
  val good = (List[Boolean]() /: l)((b,x) => {
    val okay = (!taken.contains(x._1) && (!fail.contains(x._1) || x._2=="FAIL"))
    if (okay) taken += x
    okay :: b
  }).reverse
  (l zip good).collect{ case (x,true) => x }
}
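For comparison, here is a shorter order-preserving sketch that avoids the mutable map. It is an alternative, not the code above rewritten: it first drops every PASS that belongs to an agent with a FAIL, then lets distinct keep the first occurrence of whatever survives. KeepOrder and dedupe are hypothetical names, and the sketch assumes the only statuses are "PASS" and "FAIL".

```scala
// Hypothetical names; an order-preserving variant using filter + distinct.
object KeepOrder {
  def dedupe(results: List[(String, String)]): List[(String, String)] = {
    // Agents that have at least one FAIL.
    val failed = results.collect { case (agent, "FAIL") => agent }.toSet
    results
      // Keep FAILs, and PASSes only for agents that never failed.
      .filter { case (agent, status) => status == "FAIL" || !failed(agent) }
      // After the filter each agent has a single status, so distinct
      // leaves one entry per agent, at its first surviving position.
      .distinct
  }
}
```

On the list from the question, KeepOrder.dedupe(l) returns List((Agent,FAIL), (Agent 1,FAIL), (Agent 2,PASS)).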