Faster Alternative to std::ofstream

Developer https://www.devze.com 2023-03-11 08:15 Source: Internet
I generate a set of data files. As the files are supposed to be human-readable, they are text files (as opposed to binary files).

To output information to my files, I used the convenient std::ofstream object.

In the beginning, when there was less data to export, the time needed to write the files was not noticeable. However, as the information to be exported has accumulated, it now takes around 5 minutes to generate them.

Since the wait has started to bother me, my question is obvious: is there a faster alternative to std::ofstream? If there is, would it be worth rewriting my application? In other words, could it cut the time by 50% or more? Thank you.


Update:

I was asked to show you my code that generates the above files, so here you are - the most time consuming loop:

ofstream fout;
fout.open(strngCollectiveSourceFileName,ios::out);

fout << "#include \"StdAfx.h\"" << endl;
fout << "#include \"Debug.h\"" << endl;
fout << "#include \"glm.hpp\"" << endl;
fout << "#include \"" << strngCollectiveHeaderFileName.substr( strngCollectiveHeaderFileName.rfind(TEXT("\\")) + 1) << "\"" << endl << endl;

fout << "using namespace glm;" << endl << endl << endl;


for (unsigned int nSprite = 0; nSprite < vpTilesetSprites.size(); nSprite++ )
{
    for(unsigned int nFrameSet = 0; nFrameSet < vpTilesetSprites[nSprite]->vpFrameSets.size(); nFrameSet++)
    {

        // display index definition
        fout << "// Index Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";
        string strngIndexSignature = strngIndexDefinitionSignature;
        strngIndexSignature.replace(strngIndexSignature.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("IndexData") );
        strngIndexSignature.replace(strngIndexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );        
        fout << strngIndexSignature << "[4] = {0, 1, 2, 3};\t\t" << "// " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << ": Index Definition\n\n";


        // display vertex definition
        fout << "// Vertex Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";

        string strngVertexSignature = strngVertexDefinitionSignature;
        strngVertexSignature.replace(strngVertexSignature.find(TEXT("#aVertexArrayName#")), strlen(TEXT("#aVertexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("VertexData") );
        strngVertexSignature.replace(strngVertexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );
        fout << strngVertexSignature << "[" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFramesCount() << "] =\n";
        fout << "{\n";

        for (int nFrameNo = 0; nFrameNo < vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFramesCount(); nFrameNo++)
        {
            fout << "\t" << "{{ vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[0].vTextureUV.fv << "f) },  // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 1: vec4(x, y, z, w), vec2(u, v) \n";
            fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[1].vTextureUV.fv << "f) },  // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 2: vec4(x, y, z, w), vec2(u, v) \n";
            fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[2].vTextureUV.fv << "f) },  // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 3: vec4(x, y, z, w), vec2(u, v) \n";
            fout << "\t" << " { vec4(" << fixed << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fx << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fy << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fz << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vPosition.fw << "f), vec2(" << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vTextureUV.fu << "f, " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->vpFrames[nFrameNo]->aVertices[3].vTextureUV.fv << "f) }},  // " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetShortDescription() << " vertex 4: vec4(x, y, z, w), vec2(u, v) \n\n";
        }
        fout << "};\n\n\n\n";
    }
}

fout.close();


If you don't want to use C file I/O, you can give FastFormat a try. Look at the comparison for more info.


How are vpTilesetSprites and vpTilesetSprites[nSprite] stored? Are they implemented with lists or arrays? There is a lot of indexed access to them, and if they are list-like structures, you'll spend a lot of extra time following needless pointers. Ed S.'s comment is right: giving the long indexed expressions temporary variables and line breaks could make the code easier to read, and maybe faster, too:

fout << "// Index Definition: " << vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetLongDescription() << "\n";
string strngIndexSignature = strngIndexDefinitionSignature;
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")), TEXT("a") + vpTilesetSprites[nSprite]->GetObjectName() + vpTilesetSprites[nSprite]->vpFrameSets[nFrameSet]->GetFrameSetName() + TEXT("IndexData") );
strngIndexSignature.replace(strngIndexSignature.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")), strngCollectiveArrayClassName );        

vs

string idxsig = strngIndexDefinitionSignature;
// The element types are not shown in the question, so auto is used here.
const auto& sp = vpTilesetSprites[nSprite];
const auto& fs = sp->vpFrameSets[nFrameSet];

fout << "// Index Definition: " << fs->GetLongDescription() << "\n";
idxsig.replace(idxsig.find(TEXT("#aIndexArrayName#")), strlen(TEXT("#aIndexArrayName#")),
    TEXT("a") + sp->GetObjectName() + fs->GetFrameSetName() + TEXT("IndexData"));
idxsig.replace(idxsig.find(TEXT("#ClassName#")), strlen(TEXT("#ClassName#")),
    strngCollectiveArrayClassName);

But the much bigger problem is how you're using strings as templates: you're searching for a given text string (and computing the length of your needle string every single time you need it!) over and over again.

Consider this: you're performing the find and replace operations once for every frame set of every sprite. Each time through, the loop:

  • makes a copy of strngIndexDefinitionSignature
  • creates four temporary string objects when concatenating static and dynamic strings
  • computes strlen(TEXT("#ClassName#"))
  • computes strlen(TEXT("#aIndexArrayName#"))
  • finds the start point of both placeholders
  • replaces both placeholders with the new text

And that's just the first four lines of your loop.

Can you replace your strngIndexDefinitionSignature with a format string? I assume it currently looks like this:

"flubber #aIndexArrayName# { blubber } #ClassName# blorp"

If you re-write it like this:

"flubber a%s%sIndexData { blubber } %s blorp"

Then your two find and replace lines can be replaced with:

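// %s expects a C string; this assumes the getters return std::string,
// as the + concatenation in the original code suggests.
sprintf(out, index_def_sig, sp->GetObjectName().c_str(), fs->GetFrameSetName().c_str(),
    strngCollectiveArrayClassName.c_str());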

This would remove two find() operations, two replace() operations, creating and destroying four temporary string objects, a string duplicate that was promptly over-written with two replace() calls, and two strlen() operations that return the same result every time (but aren't actually needed anyway).

You can then output your string with << as usual. Or you can change sprintf(3) to fprintf(3) and avoid even the temporary C string, provided you write to a FILE* instead of the ofstream.
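A minimal sketch of the format-string idea, assuming the names are available as std::string. FormatIndexSignature and its parameters are hypothetical and do not appear in the question. It calls snprintf twice, first to measure the result and then to fill a buffer of exactly that size, so no fixed-size output array is needed:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: fills a printf-style template such as
// "flubber a%s%sIndexData { blubber } %s blorp" in a single pass,
// without any find() or replace() calls.
std::string FormatIndexSignature(const char* format,
                                 const std::string& objectName,
                                 const std::string& frameSetName,
                                 const std::string& className)
{
    // First call only measures the length (excluding the terminating NUL).
    int length = std::snprintf(nullptr, 0, format, objectName.c_str(),
                               frameSetName.c_str(), className.c_str());
    if (length < 0)
        return std::string();   // formatting error: return an empty string

    // Second call writes into a buffer with room for the text plus the NUL.
    std::string result(static_cast<std::size_t>(length) + 1, '\0');
    std::snprintf(&result[0], result.size(), format, objectName.c_str(),
                  frameSetName.c_str(), className.c_str());
    result.resize(static_cast<std::size_t>(length));   // drop the NUL
    return result;
}
```

The returned string can be streamed with << like any other. The template must contain exactly three %s conversions, since printf-style functions cannot check that at run time.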


Assuming you do it in large enough chunks, calling write() directly might be faster; that said, it's more likely that your biggest bottleneck doesn't have anything directly to do with std::ofstream. The most obvious thing is to make sure you aren't using std::endl (because flushing the stream frequently will kill performance). Beyond that, I would suggest profiling your app to see where it's actually spending the time.


The performance of ostream is probably not your actual issue; I suggest using a profiler to determine where your real bottlenecks are. If ostream turns out to be your actual problem, you can drop down to <cstdio> and use fprintf(FILE*, const char*, ...) for formatted output to a file handle.


The best answer will depend on what sort of text you are generating, and how you are generating it. C++ streams can be slow, but that is mostly because they can also do a lot more for you, such as locale-dependent formatting.

You may find speed ups with streams by bypassing some of the formatting (eg. ostream::write), or by writing characters directly to a streambuf instead (streambuf::sputn). Sometimes increasing the buffer size on the relevant streambuf helps (via streambuf::pubsetbuf).
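A rough sketch of these stream-level options, assuming the text has already been formatted into a std::string. WriteBlock and the 1 MiB default buffer size are illustrative choices, not recommendations from the thread. Whether pubsetbuf has any effect is implementation-defined, so it is worth measuring:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes pre-formatted text through a larger stream buffer.
bool WriteBlock(const std::string& path, const std::string& text,
                std::size_t bufferSize = 1 << 20)
{
    // Declared before the stream so the buffer outlives it.
    std::vector<char> buffer(bufferSize);
    std::ofstream out;
    // Request the buffer before opening the file; a later call may be ignored.
    out.rdbuf()->pubsetbuf(buffer.data(),
                           static_cast<std::streamsize>(buffer.size()));
    out.open(path.c_str());
    if (!out)
        return false;

    // Unformatted output: the bytes are copied without any formatting work.
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    // Going straight to the stream buffer would look like this instead:
    // out.rdbuf()->sputn(text.data(), static_cast<std::streamsize>(text.size()));
    out.close();
    return !out.fail();
}
```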

If this isn't good enough, you might want to try C-style stdio files, eg fopen, fprintf, etc. It takes a little while to get used to the way the text is formatted if you're not used to that method but the performance is usually pretty good.

For the absolute top performance you usually have to go to OS-specific routines. Sometimes the direct low-level file routines are significantly better than the C stdio, but sometimes not: for example, I've seen some people say WriteFile on Win32 is the fastest method on Windows, whereas some Google hits report it as being slower than stdio. Another approach might be a memory-mapped file, e.g. mmap + msync. This maps the file into memory and lets the OS write the actual data to disk in large blocks, which is likely to be near optimal. However, you run the risk of losing all the data if a crash occurs halfway through, which may or may not be a problem for you.
