gpgpu

Why does CUDA Profiler indicate replayed instructions: 82% != global replay + local replay + shared replay?

I got this information from the CUDA Profiler. I am confused about why Replayed Instructions != Global memory replay + Local memory replay + Shared bank conflict replay. [...]

2023-03-30 09:27 Category: Q&A
OpenCL - How do I query for a device's SIMD width?

In CUDA, there is a concept of a warp, which is defined as the maximum number of threads that can execute the same instruction simultaneously within a single processing element. For NVID[...]

2023-03-29 05:18 Category: Q&A
Sparse Cholesky factorization algorithm for GPU [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. [...]

2023-03-29 01:56 Category: Q&A
cpu vs gpu - when cpu is better [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, a[...]

2023-03-28 21:32 Category: Q&A
How does the OpenCL command queue work, and what can I ask of it?

I'm working on an algorithm that does pretty much the same operation a bunch of times. Since the operation consists of some linear algebra (BLAS), I thought I would try using the GPU for this. [...]

2023-03-27 03:24 Category: Q&A
Why does padding the shared memory array by one column increase the speed of the kernel by 40%?

Why is this matrix transpose kernel faster when the shared memory array is padded by one column? I found the kernel at PyCuda/Examples/MatrixTranspose. [...]
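
A likely explanation for a speedup of this kind is shared-memory bank conflicts. The sketch below is not the PyCuda kernel itself; it only models which bank each thread of a warp would hit when reading one column of a row-major tile. The bank count (32), the 4-byte word size, the 32x32 tile, and the helper name max_conflict_degree are all assumptions made for illustration.

```python
# Illustrative model only: 32 banks of 4-byte words and a 32-row tile are
# assumptions, not values taken from the original question.
NUM_BANKS = 32
TILE_ROWS = 32


def max_conflict_degree(row_stride, column=0):
    """Return the largest number of threads that land on the same bank when
    thread t reads tile[t][column] from a row-major array whose rows are
    row_stride words long."""
    hits = {}
    for t in range(TILE_ROWS):
        word_index = t * row_stride + column   # flat index of tile[t][column]
        bank = word_index % NUM_BANKS          # consecutive words map to consecutive banks
        hits[bank] = hits.get(bank, 0) + 1
    return max(hits.values())


unpadded = max_conflict_degree(32)  # tile declared as [32][32]
padded = max_conflict_degree(33)    # tile declared as [32][33], one extra column
print("unpadded:", unpadded, "padded:", padded)
```

Under these assumptions, every thread in the unpadded case maps to the same bank, so the column read is serialized, while the extra column shifts each row by one bank and spreads the accesses across all banks.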

2023-03-27 01:51 Category: Q&A
Is there a way to independently task and use heterogeneous multiple GPUs in a Windows 7 system?

Can I have two mixed chipset/generation AMD GPUs in my desktop, a 6950 and a 4870, and dedicate one GPU (the 4870) to OpenCL/GPGPU purposes only, eliminating the device from video output or display driving[...]

2023-03-26 07:19 Category: Q&A
How much time does it take to make a call to OpenCL?

I'm currently implementing an algorithm that does a lot of linear algebra on small matrices and vectors. The code is fast, but I'm wondering if it would make sense to implement it on a GPGPU instead[...]

2023-03-25 21:51 Category: Q&A
CUDA limit seems to be reached, but what limit is that?

I have a CUDA program that seems to be hitting some sort of limit on some resource, but I can't figure out what that resource is. Here is the kernel function: [...]

2023-03-24 21:32 Category: Q&A
CUDA - copy to array within array of Objects

I have a CUDA application I'm working on with an array of Objects; each object has a pointer to an array of std::pair<int, double>. I'm trying to cudaMemcpy the array of objects over, then cuda[...]

2023-03-24 19:26 Category: Q&A