I think I have implemented some of my code wrong. I cannot figure out why my sort (using arrays.sort) is taking longer in the "parallel" version than in the non-parallel version (it's obviously in putting the two arrays back together, but I didn't think it would add that much more time on). If someone could point out any mistakes that I am making or any tips to improve the parallel version over the non-parallel version I would appreciate it. Am I able to do the array merge more efficiently, or maybe even in parallel? If so, what is the best practice for implementation. Any help would be greatly appreciated.
import java.util.Arrays
import scala.concurrent._
import scala.collection._
trait Sorts {
def doSort(a: Array[Double]): Array[Double]
}
object Simple extends Sorts {
def doSort(a: Array[Double]) = {
Arrays.sort(a)
a
}
}
object Parallel extends Sorts开发者_高级运维 {
def doSort(a: Array[Double]) = {
val newArray = new Array[Double](a.length)
val aLength = (a.length)
val aSplit = ((a.length / 2).floor).toInt
ops.par(Arrays.sort(a, 0, aSplit), Arrays.sort(a, (aSplit + 1), aLength))
def merge(w: Int, x: Int, y: Int) {
var i = w
var j = x
var k = y
while (i <= aSplit && j <= aLength) {
if (a(i) <= a(j)) {
newArray(k) = a(i)
i = i + 1
} else {
newArray(k) = a(j)
j = j + 1
}
k = k + 1
}
if (i < aSplit) {
for (i <- i until aSplit) {
newArray(k) = a(i)
k = k + 1
}
} else {
for (j <- j until aLength) {
newArray(k) = a(j)
k = k + 1
}
}
}
merge(0, (aSplit + 1), 0)
newArray
}
}
object Main {
def main(args: Array[String]): Unit = {
val simpleNumbers = Array.fill(10000)(math.random)
println(simpleNumbers.toList + "\n")
val simpleStart = System.nanoTime()
Simple.doSort(simpleNumbers)
val simpleEnd = System.nanoTime()
println(simpleNumbers.toList + "\n")
val simpleDifference = ((simpleEnd - simpleStart) / 1e9).toDouble
val parallelNumbers = Array.fill(10000)(math.random)
println(parallelNumbers.toList + "\n")
val parallelStart = System.nanoTime()
Parallel.doSort(parallelNumbers)
val parellelEnd = System.nanoTime()
println(parallelNumbers.toList + "\n")
val parallelDifference = ((parellelEnd - parallelStart) / 1e9).toDouble
println("\n Simple Time Taken: " + simpleDifference + "\n")
println("\n Parallel Time Taken: " + parallelDifference + "\n")
}
}
Output on an Intel Core i7:
Simple Time Taken: 0.01314
Parallel Time Taken: 0.05882
I think you've got a couple of different things going on here. First, on my system the ops.par(Arrays.sort(...))
line by itself takes longer than all of Simple.doSort()
. So there must be some overhead (thread creation?) that dominates the performance gain for a smallish array. Try it for 100,000 or a million elements. Second, Arrays.sort
is an in-place sort, so it doesn't have to incur the cost of creating a new 10k element array for the results.
To avoid creating the second array, you can do the partition first and then sort the two halves in parallel, as recommended here
def doSort(a: Array[Double]) = {
val pivot = a(a.length-1)
var i = 0
var j = a.length-2
def swap(i: Int, j: Int) {
val temp = a(i)
a(i) = a(j)
a(j) = temp
}
while(i < j-1) {
if(a(i) <= pivot) {
i+=1
}
else {
swap(i,j)
j-=1
}
}
swap(j-1, a.length-1)
ops.par(Arrays.sort(a,0,a.length/2), Arrays.sort(a,a.length/2+1,a.length))
a
}
After upping the array size to 100k, I do see the parallel version performing around twice as fast on an Intel E5300 CPU.
精彩评论