
Work-items, Work-groups and Command Queues organization and memory limit in OpenCL

Developer https://www.devze.com 2023-01-07 02:03 Source: web
Okay, I have already been through most of the ATI and Nvidia guides to OpenCL. There are some things that I just want to be sure of, and some that need clarification. Nothing in the documentation gives a clear-cut answer.

I have a Radeon 4650, and on querying my device I got:

  CL_DEVICE_MAX_COMPUTE_UNITS:  8
  CL_DEVICE_ADDRESS_BITS:  32
  CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
  CL_DEVICE_MAX_WORK_ITEM_SIZES: 128 / 128 / 128 
  CL_DEVICE_MAX_WORK_GROUP_SIZE: 128
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:  256 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:  256 MByte

First, my card has 1 GB of memory, so why am I allowed to use only 256 MB?

Second, I don't understand the work-item dimension part. Does that mean I can have up to 128*3 or 128^3 work-items?

When I calculated this before running the query, I got 8 cores * 16 stream processors * 4 work-items = 512. Why is this wrong?

I also got the same three-dimension work-item values for my Intel Core 2 Duo CPU. Do the same calculations apply?

As for the command queues: when I tried accessing my Core Duo CPU as a device using OpenCL, everything was processed on one core only. I tried creating multiple queues and enqueueing several entries, but the work was still processed on one core only. I used a global_work_size of 128*128*128*8 for a simple write program where each work-item writes its own global ID to the buffer, and I got only zeros.

And what about Nvidia cards? On an Nvidia 9500 GT with 32 CUDA cores, are the work-item limits calculated similarly?

Thanks a lot. I've been all over the place trying to find answers.


First, my card has 1 GB of memory, so why am I allowed to use only 256 MB?

This is an ATI driver bug/limitation, AFAIK. I'll check on my 5850 to see whether I can reproduce it.

http://devforums.amd.com/devforum/messageview.cfm?catid=390&threadid=124142&messid=1069111&parentid=0&FTVAR_FORUMVIEWTMP=Branch

Second, I don't understand the work-item dimension part. Does that mean I can have up to 128*3 or 128^3 work-items?

No. It means you can have at most 128 work-items along any one dimension, since CL_DEVICE_MAX_WORK_ITEM_SIZES is 128 / 128 / 128. And since CL_DEVICE_MAX_WORK_GROUP_SIZE is 128, you can have, e.g., work_group_size(128, 1, 1), work_group_size(1, 128, 1), work_group_size(64, 1, 2), or work_group_size(8, 4, 4), etc. As long as the product of the sizes across all dimensions is <= 128, it will be fine.
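To make the rule concrete, here is a minimal sketch in plain C (no OpenCL headers needed) that checks a candidate work-group size against the two limits reported by the query. The `device_limits` struct and the `local_size_is_valid` function are hypothetical helpers invented for illustration, not part of the OpenCL API; in a real program the limits would come from `clGetDeviceInfo`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DIMS 3 /* CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS from the query */

/* Hypothetical container for the two limits discussed above. */
typedef struct {
    size_t max_work_item_sizes[MAX_DIMS]; /* CL_DEVICE_MAX_WORK_ITEM_SIZES */
    size_t max_work_group_size;           /* CL_DEVICE_MAX_WORK_GROUP_SIZE */
} device_limits;

/* Returns true if a work-group of shape local[0] x local[1] x local[2]
 * respects both the per-dimension limit and the total-size limit. */
bool local_size_is_valid(const device_limits *lim, const size_t local[MAX_DIMS])
{
    assert(lim != NULL && local != NULL);

    size_t product = 1;
    for (size_t d = 0; d < MAX_DIMS; ++d) {
        /* Each dimension must be at least 1 and within its own limit. */
        if (local[d] == 0 || local[d] > lim->max_work_item_sizes[d])
            return false;

        /* The running product must never exceed the work-group limit. */
        product *= local[d];
        if (product > lim->max_work_group_size)
            return false;
    }
    return true;
}
```

With the limits from the question (128 / 128 / 128 and 128), the shapes (128, 1, 1), (64, 1, 2) and (8, 4, 4) pass, while (128, 128, 128), i.e. 128^3, and (128, 3, 1), i.e. 128*3, are both rejected because their products exceed 128.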

When I calculated this before running the query, I got 8 cores * 16 stream processors * 4 work-items = 512. Why is this wrong?

I also got the same three-dimension work-item values for my Intel Core 2 Duo CPU. Do the same calculations apply?

I don't understand what you are trying to compute here.
