Possible optimization in cython: numpy array_问答_开发者

Possible optimization in cython: numpy array

开发者 https://www.devze.com 2023-01-26 17:29 出处：网络

The following is my Cython code for drawing from multivariate normal distribution. I am using loop because each time I have different density. (conLSigma is the Cholesky factor)

This is taking a lot of time because I am taking inverse and Cholesky decomposition for each loop. It is faster than pure python code, but I was wondering if there is any way I can boost the speed more.

from __future__ import division

import numpy as np 

cimport numpy as np 

ctypedef np.float64_t dtype_t

cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)

def drawMetro(np.ndarray[dtype_t, ndim = 2] beta,
              np.ndarray[dtype_t, ndim = 3] H,
              np.ndarray[dtype_t, ndim = 2] Sigma,
              float s):

    cdef int ncons = betas.shape[0]
    cdef int nX = betas.shape[1]
    cdef int con

    cdef np.ndarray betas_cand = np.zeros([ncons, nX], dtype = np.float64)
    cdef np.ndarray conLSigma = np.zeros([nX, nX], dtype = np.float64)

    for con in xrange(ncons):开发者_运维知识库
        conLSigma = np.linalg.cholesky(np.linalg.inv(H[con] + Sigma))
        betas_cand[con] = betas[con] + s * np.dot(conLSigma, np.random.standard_normal(size = nX))

    return(betas_cand)

The Cholesky decomposition creates a lower-triangular matrix. This means that close to half of the multiplications done in the np.dot don't need to be done. If you change the line

betas_cand[con] = betas[con] + s * np.dot(conLSigma, np.random.standard_normal(size = nX))

into

tmp = np.random.standard_normal(size = nX)
for i in xrange(nX):
    for j in xrange(i+1):
        betas_cand[con,i] += s * conLSigma[i,j] * tmp[j]

However, you'll also need to change

cdef np.ndarray betas_cand = np.zeros([ncons, nX], dtype = np.float64)

into

cdef np.ndarray betas_cand = np.array(betas)

You could of course use slices for the multiplications, but I'm not sure if it will be faster than the way that I've suggested. Anyway, hopefully you get the idea. I don't think that there is much else that you can do to speed this up.

what about computing the cholesky decomposition first and inverting the lower triangular matrix after by back substitution. This should be faster than linalg.cholesky(linalg.inv(S)).