What is a 10.6-compatible means of recording video frames to a movie without using the QuickTime API?_问答_开发者

I'm updating an applicati开发者_如何学JAVAon to be 64-bit-compatible, but I'm having a little difficulty with our movie recording code. We have a FireWire camera that feeds YUV frames into our application, which we process and encode out to disk within an MPEG4 movie. Currently, we are using the C-based QuickTime API to do this (using Image Compression Manager, etc.), but the old QuickTime API does not have support for 64 bit.

My first attempt was to use QTKit's QTMovie and encode individual frames using -addImage:forDuration:withAttributes:, but that requires the creation of an NSImage for each frame (which is computationally expensive) and it does not do temporal compression, so it doesn't generate the most compact files.

I'd like to use something like QTKit Capture's QTCaptureMovieFileOutput, but I can't figure out how to feed raw frames into that which aren't associated with a QTCaptureInput. We can't use our camera directly with QTKit Capture because of our need to manually control the gain, exposure, etc. for it.

On Lion, we now have the AVAssetWriter class in AVFoundation which lets you do this, but I still have to target Snow Leopard for the time being, so I'm trying to find a solution that works there as well.

Therefore, is there a way to do non-QuickTime frame-by-frame recording of video that is more efficient than QTMovie's -addImage:forDuration:withAttributes: and produces file sizes comparable to what the older QuickTime API can?

In the end, I decided to go with the approach suggested by TiansHUo, and use libavcodec for the video compression here. Based on the instructions by Martin here, I downloaded the FFmpeg source and built a 64-bit compatible version of the necessary libraries using

./configure --disable-gpl --arch=x86_64 --cpu=core2 --enable-shared --disable-amd3dnow --enable-memalign-hack --cc=llvm-gcc
make
sudo make install

This creates the LGPL shared libraries for the 64-bit Core2 processors in the Mac. ~~Unfortunately, I haven't yet figured a way to make the library run without crashing when the MMX optimizations are enabled, so that is disabled right now. This slows down encoding somewhat.~~ After some experimentation, I found that I could build a 64-bit version of the library which had MMX optimizations enabled and was stable on the Mac by using the above configuration options. This is much faster when encoding than the library built with MMX disabled.

Note that if you use these shared libraries, you should make sure you follow the LGPL compliance instructions on FFmpeg's site to the letter.

In order to get these shared libraries to function properly when placed in proper folder within my Mac application bundle, I needed to use install_name_tool to adjust the internal search paths in these libraries to point to their new location in the Frameworks directory within the application bundle:

install_name_tool -id @executable_path/../Frameworks/libavutil.51.9.1.dylib libavutil.51.9.1.dylib

install_name_tool -id @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavcodec.53.7.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavcodec.53.7.0.dylib

install_name_tool -id @executable_path/../Frameworks/libavformat.53.4.0.dylib libavformat.53.4.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavformat.53.4.0.dylib
install_name_tool -change /usr/local/lib/libavcodec.dylib @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavformat.53.4.0.dylib

install_name_tool -id @executable_path/../Frameworks/libswscale.2.0.0.dylib libswscale.2.0.0.dylib
install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libswscale.2.0.0.dylib

Your specific paths may vary. This adjustment lets them work from within the application bundle without having to install them in /usr/local/lib on the user's system.

I then had my Xcode project link against these libraries, and I created a separate class to handle the video encoding. This class takes in raw video frames (in BGRA format) through the videoFrameToEncode property and encodes them within the movieFileName file as MPEG4 video in an MP4 container. The code is as follows:

SPVideoRecorder.h

#import <Foundation/Foundation.h>

#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libswscale/swscale.h"

uint64_t getNanoseconds(void);

@interface SPVideoRecorder : NSObject
{
    NSString *movieFileName;
    CGFloat framesPerSecond;
    AVCodecContext *codecContext;
    AVStream *videoStream;
    AVOutputFormat *outputFormat;
    AVFormatContext *outputFormatContext;
    AVFrame *videoFrame;
    AVPicture inputRGBAFrame;

    uint8_t *pictureBuffer;
    uint8_t *outputBuffer;
    unsigned int outputBufferSize;
    int frameColorCounter;

    unsigned char *videoFrameToEncode;

    dispatch_queue_t videoRecordingQueue;
    dispatch_semaphore_t frameEncodingSemaphore;
    uint64_t movieStartTime;
}

@property(readwrite, assign) CGFloat framesPerSecond;
@property(readwrite, assign) unsigned char *videoFrameToEncode;
@property(readwrite, copy) NSString *movieFileName;

// Movie recording control
- (void)startRecordingMovie;
- (void)encodeNewFrameToMovie;
- (void)stopRecordingMovie;


@end

SPVideoRecorder.m

#import "SPVideoRecorder.h"
#include <sys/time.h>

@implementation SPVideoRecorder

uint64_t getNanoseconds(void)
{
    struct timeval now;
    gettimeofday(&now, NULL);
    return now.tv_sec * NSEC_PER_SEC + now.tv_usec * NSEC_PER_USEC;
}

#pragma mark -
#pragma mark Initialization and teardown

- (id)init;
{
    if (!(self = [super init]))
    {
        return nil;     
    }

    /* must be called before using avcodec lib */
    avcodec_init();

    /* register all the codecs */
    avcodec_register_all();
    av_register_all();

    av_log_set_level( AV_LOG_ERROR );

    videoRecordingQueue = dispatch_queue_create("com.sonoplot.videoRecordingQueue", NULL);;
    frameEncodingSemaphore = dispatch_semaphore_create(1);

    return self;
}

#pragma mark -
#pragma mark Movie recording control

- (void)startRecordingMovie;
{   
    dispatch_async(videoRecordingQueue, ^{
        NSLog(@"Start recording to file: %@", movieFileName);

        const char *filename = [movieFileName UTF8String];

        // Use an MP4 container, in the standard QuickTime format so it's readable on the Mac
        outputFormat = av_guess_format("mov", NULL, NULL);
        if (!outputFormat) {
            NSLog(@"Could not set output format");
        }

        outputFormatContext = avformat_alloc_context();
        if (!outputFormatContext)
        {
            NSLog(@"avformat_alloc_context Error!");
        }

        outputFormatContext->oformat = outputFormat;
        snprintf(outputFormatContext->filename, sizeof(outputFormatContext->filename), "%s", filename);

        // Add a video stream to the MP4 file 
        videoStream = av_new_stream(outputFormatContext,0);
        if (!videoStream)
        {
            NSLog(@"av_new_stream Error!");
        }


        // Use the MPEG4 encoder (other DiVX-style encoders aren't compatible with this container, and x264 is GPL-licensed)
        AVCodec *codec = avcodec_find_encoder(CODEC_ID_MPEG4);  
        if (!codec) {
            fprintf(stderr, "codec not found\n");
            exit(1);
        }

        codecContext = videoStream->codec;

        codecContext->codec_id = codec->id;
        codecContext->codec_type = AVMEDIA_TYPE_VIDEO;
        codecContext->bit_rate = 4800000;
        codecContext->width = 640;
        codecContext->height = 480;
        codecContext->pix_fmt = PIX_FMT_YUV420P;
//      codecContext->time_base = (AVRational){1,(int)round(framesPerSecond)};
//      videoStream->time_base = (AVRational){1,(int)round(framesPerSecond)};
        codecContext->time_base = (AVRational){1,200}; // Set it to 200 FPS so that we give a little wiggle room when recording at 50 FPS
        videoStream->time_base = (AVRational){1,200};
//      codecContext->max_b_frames = 3;
//      codecContext->b_frame_strategy = 1;
        codecContext->qmin = 1;
        codecContext->qmax = 10;    
//      codecContext->mb_decision = 2; // -mbd 2
//      codecContext->me_cmp = 2; // -cmp 2
//      codecContext->me_sub_cmp = 2; // -subcmp 2
        codecContext->keyint_min = (int)round(framesPerSecond); 
//      codecContext->flags |= CODEC_FLAG_4MV; // 4mv
//      codecContext->flags |= CODEC_FLAG_LOOP_FILTER;
        codecContext->i_quant_factor = 0.71;
        codecContext->qcompress = 0.6;
//      codecContext->max_qdiff = 4;
        codecContext->flags2 |= CODEC_FLAG2_FASTPSKIP;

        if(outputFormat->flags & AVFMT_GLOBALHEADER)
        {
            codecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
        }

        // Open the codec
        if (avcodec_open(codecContext, codec) < 0) 
        {
            NSLog(@"Couldn't initialize the codec");
            return;
        }

        // Open the file for recording
        if (avio_open(&outputFormatContext->pb, outputFormatContext->filename, AVIO_FLAG_WRITE) < 0) 
        { 
            NSLog(@"Couldn't open file");
            return;
        } 

        // Start by writing the video header
        if (avformat_write_header(outputFormatContext, NULL) < 0) 
        { 
            NSLog(@"Couldn't write video header");
            return;
        } 

        // Set up the video frame and output buffers
        outputBufferSize = 400000;
        outputBuffer = malloc(outputBufferSize);
        int size = codecContext->width * codecContext->height;

        int pictureBytes = avpicture_get_size(PIX_FMT_YUV420P, codecContext->width, codecContext->height);
        pictureBuffer = (uint8_t *)av_malloc(pictureBytes);

        videoFrame = avcodec_alloc_frame();
        videoFrame->data[0] = pictureBuffer;
        videoFrame->data[1] = videoFrame->data[0] + size;
        videoFrame->data[2] = videoFrame->data[1] + size / 4;
        videoFrame->linesize[0] = codecContext->width;
        videoFrame->linesize[1] = codecContext->width / 2;
        videoFrame->linesize[2] = codecContext->width / 2;

        avpicture_alloc(&inputRGBAFrame, PIX_FMT_BGRA, codecContext->width, codecContext->height);

        frameColorCounter = 0;

        movieStartTime = getNanoseconds();
    });
}

- (void)encodeNewFrameToMovie;
{
//  NSLog(@"Encode frame");

    if (dispatch_semaphore_wait(frameEncodingSemaphore, DISPATCH_TIME_NOW) != 0)
    {
        return;
    }

    dispatch_async(videoRecordingQueue, ^{
//      CFTimeInterval previousTimestamp = CFAbsoluteTimeGetCurrent();
        frameColorCounter++;

        if (codecContext == NULL)
        {       
            return;
        }

        // Take the input BGRA texture data and convert it to a YUV 4:2:0 planar frame
        avpicture_fill(&inputRGBAFrame, videoFrameToEncode, PIX_FMT_BGRA, codecContext->width, codecContext->height);
        struct SwsContext * img_convert_ctx = sws_getContext(codecContext->width, codecContext->height, PIX_FMT_BGRA, codecContext->width, codecContext->height, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL); 
        sws_scale(img_convert_ctx, (const uint8_t* const *)inputRGBAFrame.data, inputRGBAFrame.linesize, 0, codecContext->height, videoFrame->data, videoFrame->linesize);

        // Encode the frame
        int out_size = avcodec_encode_video(codecContext, outputBuffer, outputBufferSize, videoFrame);  

        // Generate a packet and insert in the video stream
        if (out_size != 0) 
        {
            AVPacket videoPacket;
            av_init_packet(&videoPacket);

            if (codecContext->coded_frame->pts != AV_NOPTS_VALUE) 
            {
                uint64_t currentFrameTime = getNanoseconds();

                videoPacket.pts = av_rescale_q(((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull/*codecContext->coded_frame->pts*/, AV_TIME_BASE_Q/*codecContext->time_base*/, videoStream->time_base);

//              NSLog(@"Frame time %lld, converted time: %lld", ((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull, videoPacket.pts);
            }

            if(codecContext->coded_frame->key_frame)
            {
                videoPacket.flags |= AV_PKT_FLAG_KEY;
            }
            videoPacket.stream_index = videoStream->index;
            videoPacket.data = outputBuffer;
            videoPacket.size = out_size;

            int ret = av_write_frame(outputFormatContext, &videoPacket);
            if (ret < 0) 
            { 
                av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing frame.\n"); 
                av_free_packet(&videoPacket);
                return;
            } 

            av_free_packet(&videoPacket);
        }

//      CFTimeInterval frameDuration = CFAbsoluteTimeGetCurrent() - previousTimestamp;
//      NSLog(@"Frame duration: %f ms", frameDuration * 1000.0);

        dispatch_semaphore_signal(frameEncodingSemaphore);
    });
}

- (void)stopRecordingMovie;
{
    dispatch_async(videoRecordingQueue, ^{
        // Write out the video trailer
        if (av_write_trailer(outputFormatContext) < 0) 
        { 
            av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing trailer.\n"); 
            exit(1); 
        } 

        // Close out the file
        if (!(outputFormat->flags & AVFMT_NOFILE)) 
        {
            avio_close(outputFormatContext->pb);
        }

        // Free up all movie-related resources
        avcodec_close(codecContext);
        av_free(codecContext);
        codecContext = NULL;

        free(pictureBuffer);
        free(outputBuffer);

        av_free(videoFrame);
        av_free(outputFormatContext);
        av_free(videoStream);       
    });

}

#pragma mark -
#pragma mark Accessors

@synthesize framesPerSecond, videoFrameToEncode, movieFileName;

@end

This works under Lion and Snow Leopard in a 64-bit application. It records at the same bitrate as my previous QuickTime-based approach, with overall lower CPU usage.

Hopefully, this will help out someone else in a similar situation.

I asked a very similar question of a QuickTime engineer last month at WWDC and they basically suggested using a 32-bit helper process... I know that's not what you wanted to hear. ;)

Yes, there is (at least) a way to do non-QuickTime frame-by-frame recording of video that is more efficient and produces files comparable to Quicktime.

The open-source library libavcodec is perfect for your case of video-encoding. It is used in very popular open-source and commercial software and libraries (For example: mplayer, google chrome, imagemagick, opencv) It also provides a huge amount of options to tweak and numerous file formats (all important formats and lots of exotic formats). It is efficient and produces files at all kinds of bit-rates.

From Wikipedia:

libavcodec is a free software/open source LGPL-licensed library of codecs for encoding and decoding video and audio data.[1] It is provided by FFmpeg project or Libav project.[2] [3] libavcodec is an integral part of many open-source multimedia applications and frameworks. The popular MPlayer, xine and VLC media players use it as their main, built-in decoding engine that enables playback of many audio and video formats on all supported platforms. It is also used by the ffdshow tryouts decoder as its primary decoding library. libavcodec is also used in video editing and transcoding applications like Avidemux, MEncoder or Kdenlive for both decoding and encoding. libavcodec is particular in that it contains decoder and sometimes encoder implementations of several proprietary formats, including ones for which no public specification has been released. This reverse engineering effort is thus a significant part of libavcodec development. Having such codecs available within the standard libavcodec framework gives a number of benefits over using the original codecs, most notably increased portability, and in some cases also better performance, since libavcodec contains a standard library of highly optimized implementations of common building blocks, such as DCT and color space conversion. However, even though libavcodec strives for decoding that is bit-exact to the official implementation, bugs and missing features in such reimplementations can sometimes introduce compatibility problems playing back certain files.

You can choose to import FFmpeg directly into your XCode project.
Another solution is to directly pipe your frames into the FFmpeg executable.

The FFmpeg project is a fast, accurate multimedia transcoder which can be applied in a variety of scenarios on OS X.

FFmpeg (libavcodec included) can be compiled in mac
http://jungels.net/articles/ffmpeg-howto.html

FFmpeg (libavcodec included) can be also compiled in 64 bits on snow leopard
http://www.martinlos.com/?p=41

FFmpeg supports a huge number of video and audio codecs:
http://en.wikipedia.org/wiki/Libavcodec#Implemented_video_codecs

Note that libavcodec and FFmpeg is LGPL, which means that you will have to mention you've used them, and you don't need to open source your project.