I would like to write a SparseVector[T]
class where T
can be a开发者_开发技巧 double, an int or a boolean.
The class will not be backed by an array (because I want a sparse data structure) but I have seen that when I build an empty array of an AnyVal
type, the elements are initialized to the default value. For instance:
scala> new Array[Int](10)
res0: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
scala> new Array[Boolean](10)
res1: Array[Boolean] = Array(false, false, false, false, false, false, false, false, false, false)
scala> new Array[Double](10)
res2: Array[Double] = Array(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
How can I include this default value in my class ? The behaviour I would like to get is:
val v = new SparseVector[Double](100)
println( v(12) ) // should print '0.0'
val w = new SparseVector[Boolean](100)
println( v(85) ) // should print 'false'
Thanks
You could leverage the fact that Scala already provides you a way of getting the default value of a type. When you write var x: Int = _
, this initialises x
to 0
. Similar for all AnyVal
types. All AnyRef
types are initialised to null
.
Bearing that in mind, you could rewrite your sparse vector class as following:
class SparseVector[T](val size: Int) {
import scala.collection.mutable.Map
private var default: T = _
private[this] val storage = Map[Int, T]()
def apply(key: Int) =
if(key < size)
storage.getOrElse(key, default)
else
throw new IllegalArgumentException("Index " + key + " out of bounds")
def update(key: Int, value: T) { storage(key) = value }
}
Now code like the following works as expected:
scala> val b = new SparseVector[Boolean](10)
b: SparseVector[Boolean] = SparseVector@cfd22a
scala> b(1)
res20: Boolean = false
scala> b(1) = true
scala> b(1)
res22: Boolean = true
scala> val i = new SparseVector[Int](10)
i: SparseVector[Int] = SparseVector@1813c12
scala> i(1)
res23: Int = 0
scala> i(1) = 10
scala> i(1)
res25: Int = 10
scala> i(10)
java.lang.IllegalArgumentException: Index 10 out of bounds
A couple of improvements I might make to this class:
- Have a `toString` method printing the collection in a reasonable way
- Provide a companion object which can change the default value of the vector if required (see the code below).
object SparseVector {
def apply[T](size: Int) = new SparseVector[T](size)
def apply[T](size: Int, default: T) = {
val result = new SparseVector[T](size)
result.default = default
result
}
}
Now this works:
scala> val b = SparseVector[Boolean](10, true)
b: SparseVector[Boolean] = SparseVector@126f29f
scala> b(4)
res28: Boolean = true
scala> val i = SparseVector[Int](10, 42)
i: SparseVector[Int] = SparseVector@b9979b
scala> i(3)
res30: Int = 42
EDIT: The code I have written works with Scala 2.7.6.final. Mitch Blevins has pointed out that the code yields null
as a default value for AnyVal
types when run with Scala 2.8r.19890. As explained in the comments, this should not be possible as Null
is not a subtype of AnyVal
.The general idea should be similar if using 2.8, as var b: Boolean = _
should still give you the default value of the Boolean
type. The usage of collections to store the sparse vector might be different, but as I said in the comment, I am not familiar with the 2.8 collection redesign.
EDIT2: ... the null
behaviour should not be possible, but unfortunately it is. Doing some more research into the problem it seems that due to type erasure the field default
always gets initialised to null
. And after that... weirdness ensues. See Mitch's post for a discussion and some bear-bones code reproducing the problem.
Things I have tried and failed in order to make the code work as it should:
null.asInstanceOf[T]
- nope, Java doesn't have reified generics. This still yieldsnull
@specialised
- nope, it seems that even if the compiler generates specialised code for primitives, you still get the null behaviour- Casting the result to an
AnyVal
, which should not benull
. Nope. Stillnull
.
So conceptually, my solution should work. But it doesn't due to very strange behaviour which I have reported in the Scala Trac.
See also this blog post for a nice discussion of null
able AnyVal
s.
-- Flaviu Cipcigan
You could add an implicit argument as a second parameter to the constructor:
class SparseVector[A](size: Int) (implicit default: () => A) {
private var storage = scala.collection.mutable.Map[Int, A]()
def apply(i: Int) = storage.getOrElse(i, default())
def update(i: Int, v: A) = storage.update(i, v)
}
implicit def strDefault(): String = "default"
And provide implicits for the types you care about. This also allows callers to provide their own defaults, by passing their own default values in:
val sparseWithCustomDefault = new SparseVector[String](10) (() => "dwins rules!");
You can use a manifest to get the same default as for Array
, which avoids the need to provide your own implicits. Borrowing the rest of the code again from David Winslow,
class SparseVector[T](size: Int)(implicit manifest: Manifest[T]) {
private val default = manifest.newArray(1)(0)
private var storage = scala.collection.mutable.Map[Int, T]()
def apply(i: Int) = storage.getOrElse(i, default)
def update(i: Int, v: T) = storage.update(i, v)
}
Then just,
val v = new SparseVector[Int](100)
println( v(12) ) // prints '0'
etc.
Re-using David's SparseVector class, you could use something like this:
class SparseVector[T](size: Int, default: T = 0) {
private var storage = scala.collection.mutable.Map[Int, T]()
def apply(i: Int) = storage.getOrElse(i, default)
def update(i: Int, v: T) = storage.update(i, v)
}
object SparseVector {
implicit def svInt2String(i: Int) = "default"
implicit def svInt2Boolean(i: Int = false
}
You need to Import the implicits, which is a shame, but this gives you:-
import SparseVector._
val v = new SparseVector[Int](100)
println( v(12) ) // prints '0'
val w = new SparseVector[Double](100)
println( w(12) ) // prints '0.0'
val x = new SparseVector[Boolean](100)
println( x(85) ) // prints 'false'
val y = new SparseVector[String](100)
println( y(85) ) // prints 'default'
精彩评论