OpenCV: looking for a less CPU-intensive way to capture a frame, resize it, and copy it into a buffer. How can I optimize my code?

So I created a function (C++)

void CaptureFrame(char* buffer, int w, int h, int bytespan)
{
 /* get a frame */
 if(!cvGrabFrame(capture)){              // capture a frame 
  printf("Could not grab a frame\n\7");
  //exit(0);
 }
 CVframe =cvRetrieveFrame(capture);           // retrieve the captured frame

 /* always check */
 if (!CVframe)
 {
  printf("No CV frame captured!\n");
  cin.get();
 }

 /* resize buffer for current frame */
 IplImage* destination = cvCreateImage(cvSize(w, h), CVframe->depth, CVframe->nChannels);

 //use cvResize to resize source to a destination image
 cvResize(CVframe, destination);

 IplImage* redchannel = cvCreateImage(cvGetSize(destination), 8, 1);
 IplImage* greenchannel = cvCreateImage(cvGetSize(destination), 8, 1);
 IplImage* bluechannel = cvCreateImage(cvGetSize(destination), 8, 1);

 cvSplit(destination, bluechannel, greenchannel, redchannel, NULL);
 for(int y = 0; y < destination->height; y++)
 {
  char* line = buffer + y * bytespan;
  for(int x = 0; x < destination->width; x++)
  {
   line[0] = cvGetReal2D(redchannel, y, x);
   line[1] = cvGetReal2D(greenchannel, y, x);
   line[2] = cvGetReal2D(bluechannel, y, x);
   line += 3;
  }
 }
 cvReleaseImage(&redchannel);
 cvReleaseImage(&greenchannel);
 cvReleaseImage(&bluechannel);
 cvReleaseImage(&destination);
}

So generally it captures a frame from the device, creates an image to resize into, and copies the result into the buffer (RGB or YUV420P output is a requirement for me).

So I wonder what I'm doing wrong, because my function is way too CPU intensive, and what can be done to fix it?

Update:

My function is run in a thread:

    void ThreadCaptureFrame()
    {
        while(1){
            t.restart();
            CaptureFrame((char *)frame->data[0], videoWidth, videoHeight, frame->linesize[0]);
            AVFrame* swap = frame;
            frame = readyFrame;
            readyFrame = swap;
            spendedTime = t.elapsed();
            if(spendedTime < desiredTime){
                Sleep(desiredTime - spendedTime);
            }
        }
    }

which is started at the beginning of int main (after some initialization):

boost::thread workerThread(ThreadCaptureFrame);

So when it can, it runs 24 times per second, and it eats 28% of a quad-core CPU. The camera resolution I capture is around 320x240. So: how can I optimize it?

Things you can do:

Instead of taking images from the camera at the default resolution, choose what resolution you want.
I think you can simply set buffer = destination->imageData

These articles might be helpful:

http://aishack.in/tutorials/efficiently-accessing-matrices/
http://aishack.in/tutorials/memory-layout-of-matrices-of-multidimensional-objects/

First, don't allocate and release the images on every frame! That probably takes the most time. Have all your IplImages pre-allocated and release them only when your app is done. You can use boost::shared_ptr with a custom deleter to avoid needing to remember to release the images.
I don't get why you're splitting and why you're copying like that. If you must copy, then just copy the whole of destination->imageData into buffer. If it is the padding that is bugging you, then do it in a loop like you did, but directly from destination->imageData. You don't need to separate the color channels.
Use cvResize with CV_INTER_NN. That will reduce the image quality but is faster.
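
The "loop directly over destination->imageData" suggestion can be sketched as below. This is only an illustration, not code from the thread: copyBgrToRgb is a hypothetical helper name, and it assumes the resized image is 8-bit, 3-channel, interleaved in BGR order (which is what the cvSplit call in the question implies). It uses plain pointers so it does not depend on OpenCV itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copy an interleaved 8-bit BGR image into an RGB
// buffer row by row, with no per-channel images and no per-pixel accessor.
// srcStep and dstStep are row sizes in bytes and may include padding; they
// play the role of IplImage::widthStep and the question's bytespan.
void copyBgrToRgb(const unsigned char* src, std::size_t srcStep,
                  unsigned char* dst, std::size_t dstStep,
                  int width, int height)
{
    assert(width >= 0 && height >= 0);
    assert(srcStep >= static_cast<std::size_t>(width) * 3);
    assert(dstStep >= static_cast<std::size_t>(width) * 3);

    for (int y = 0; y < height; ++y) {
        const unsigned char* in = src + y * srcStep;  // start of source row
        unsigned char* out = dst + y * dstStep;       // start of target row
        for (int x = 0; x < width; ++x) {
            out[0] = in[2];  // red
            out[1] = in[1];  // green
            out[2] = in[0];  // blue
            in += 3;
            out += 3;
        }
    }
}
```

In the question's function this would presumably be called with destination->imageData and destination->widthStep as the source (cast to unsigned char*), and buffer and bytespan as the target, replacing the cvSplit call and the three cvGetReal2D calls per pixel.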

I'm not familiar with OpenCV, but if I'm reading your code correctly, you're:

reading from the camera's buffer to memory (1 copy)
resizing the image (1 copy)
splitting the image into RGB channels (3 copies)
re-merging the channels into the buffer (1 copy)

I think that's a lot of unnecessary copying: for each frame you make 6 copies of the image. If your image is 320x240 with 24-bit color at 24 fps, you'd be moving around at least 32 MB/sec; with a 1000x1000 frame you're talking about half a gigabyte per second. Note that this is a very crude back-of-the-envelope underestimate: depending on the resizing algorithm, extra copying may be done, reading from or writing to non-aligned memory locations may incur some overhead, and so on.
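
The back-of-the-envelope figure above can be reproduced with a few lines of arithmetic. This is just a sketch of that estimate: bytesMovedPerSecond is a hypothetical name, and it assumes every copy moves a full frame of the stated size, which is the same simplification the estimate makes.

```cpp
#include <cassert>
#include <cstdint>

// Rough estimate of memory traffic: frame size in bytes, times the number
// of full-frame copies made per frame, times the frame rate.
std::uint64_t bytesMovedPerSecond(int width, int height, int bytesPerPixel,
                                  int copiesPerFrame, int fps)
{
    assert(width >= 0 && height >= 0 && bytesPerPixel >= 0);
    assert(copiesPerFrame >= 0 && fps >= 0);
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel
           * copiesPerFrame * fps;
}
```

With 320x240, 3 bytes per pixel, 6 copies and 24 fps this gives 33,177,600 bytes per second, in line with the rough 32 MB/sec figure; with 1000x1000 it gives 432,000,000 bytes per second, close to half a gigabyte.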

You can probably skip step #3 and/or #4, though I'm not familiar enough with OpenCV to suggest how.