In C#, I'm reading a moderately sized file (100 KB ~ 1 MB), modifying some parts of its content, and finally writing the result to a different file. All content is text. The modifications are done on string objects with string operations. My current approach is:
- Read each line from the original file using StreamReader.
- Create a StringBuilder for the contents of the new file.
- Modify the string object and call AppendLine on the StringBuilder (until the end of the file).
- Open a new StreamWriter and write the StringBuilder to the write stream.
However, I've found that StreamWriter.Write truncates the output at 32768 bytes (2^15), even though the length of the StringBuilder is greater than that. I could write a simple loop to guarantee that the entire string is written to the file, but I'm wondering what the most efficient way to do this task in C# would be.
To summarize, I'd like to modify only some parts of a text file and write to a different file. But, the text file size could be larger than 32768 bytes.
== Answer == I'm sorry for the confusion! The problem was simply that I didn't call Flush. StreamWriter.Write does not have a short (e.g., 2^16) limit.
StreamWriter.Write does not truncate the string and has no such limitation. Internally it uses String.CopyTo, which in turn uses unsafe code (using fixed) to copy chars, so it is efficient.
The problem is most likely related to not closing the writer. See http://msdn.microsoft.com/en-us/library/system.io.streamwriter.flush.aspx.
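To illustrate the point about closing the writer, here is a minimal sketch (not the asker's actual code). The class name FlushDemo, the method RoundTrips, and the temporary file path are all hypothetical. It writes a string much longer than 32768 characters in a single Write call and relies on the using block to flush and close the writer before the file is read back.

```csharp
using System;
using System.IO;
using System.Text;

static class FlushDemo
{
    // Writes a long string with one Write call and checks that the
    // whole string reaches the file once the writer has been disposed.
    public static bool RoundTrips()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < 20000; i++)
        {
            sb.AppendLine("line " + i); // well over 32768 characters in total
        }
        string text = sb.ToString();

        string path = Path.GetTempFileName(); // hypothetical output location
        try
        {
            // Leaving the using block disposes the writer, which flushes
            // its internal buffer and closes the underlying file.
            using (var writer = new StreamWriter(path))
            {
                writer.Write(text);
            }
            return File.ReadAllText(path) == text;
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrips());
    }
}
```

If the using block is removed and neither Flush nor Close is called, the tail of the text can stay in the writer's buffer, which looks like truncation.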
But I would suggest not loading the whole file in memory if that can be avoided.
Can you try this:
// Requires: using System.IO;
void Test()
{
    using (var inputFile = File.OpenText(@"c:\in.txt"))
    {
        using (var outputFile = File.CreateText(@"c:\out.txt"))
        {
            string current;
            // ReadLine returns null at the end of the file.
            while ((current = inputFile.ReadLine()) != null)
            {
                outputFile.WriteLine(Process(current));
            }
        }
    }
}

// Example modification: replace with whatever change each line needs.
string Process(string current)
{
    return current.ToLower();
}
It avoids having the full file loaded in memory by processing it line by line and writing each line out directly.
Well, that entirely depends on what you want to modify. If your modifications to one part of the text file depend on another part of the text file, you obviously need to have both of those parts in memory. If, however, you only need to modify the text file on a line-by-line basis, then use something like this:
using (StreamReader sr = new StreamReader(@"test.txt"))
{
    using (StreamWriter sw = new StreamWriter(@"modifiedtest.txt"))
    {
        while (!sr.EndOfStream)
        {
            string line = sr.ReadLine();
            // do some modifications
            sw.WriteLine(line);
            sw.Flush(); // force line to be written out now; disposing sw also flushes at the end
        }
    }
}
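For the other case mentioned above, where a change in one part depends on another part, a file of about 1 MB can simply be read into a single string. The sketch below is only an illustration under assumed rules: the class WholeFileDemo, the method Transform, and the "{title}" placeholder convention are all hypothetical. It takes the first line as a title and substitutes it wherever the placeholder appears.

```csharp
using System;
using System.IO;

static class WholeFileDemo
{
    // Hypothetical rule: the first line is a title, and every "{title}"
    // placeholder anywhere in the text is replaced with that title.
    public static string Transform(string content)
    {
        int eol = content.IndexOf('\n');
        if (eol < 0)
        {
            return content; // single line, nothing to substitute from
        }
        string title = content.Substring(0, eol).TrimEnd('\r');
        return content.Replace("{title}", title);
    }

    static void Main()
    {
        string input = Path.GetTempFileName();  // hypothetical input file
        string output = Path.GetTempFileName(); // hypothetical output file
        File.WriteAllText(input, "Report\nSee {title} here\n");

        // Read everything, transform in memory, write to a different file.
        File.WriteAllText(output, Transform(File.ReadAllText(input)));
        Console.Write(File.ReadAllText(output));

        File.Delete(input);
        File.Delete(output);
    }
}
```

File.WriteAllText closes the file when it returns, so there is no separate flush step to forget.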
Instead of running through the whole document, I would use a regex to find what you are looking for. Sample:
// Requires: using System.Collections.Generic; using System.IO;
//           using System.Linq; using System.Text.RegularExpressions;
public List<string> GetAllProfiles()
{
    List<string> profileNames = new List<string>();
    using (StreamReader reader = new StreamReader(_folderLocation + "profiles.pg"))
    {
        string profiles = reader.ReadToEnd();
        // Capture everything after "name=" up to the next carriage return.
        var regex = new Regex("\nname=([^\r]{0,})", RegexOptions.IgnoreCase);
        var regexMatchs = regex.Matches(profiles);
        profileNames.AddRange(from Match regexMatch in regexMatchs select regexMatch.Groups[1].Value);
    }
    return profileNames;
}
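The sample above only finds the values. To change them and write the result to a different file, the same kind of pattern can be passed to Regex.Replace with a match evaluator. This is a rough sketch, assuming CRLF line endings as the pattern above does. The class RegexModifyDemo, the method UppercaseNames, and the choice of uppercasing the value are hypothetical.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class RegexModifyDemo
{
    // Group 1 keeps the "\nname=" prefix as written in the file;
    // group 2 is the value up to the next carriage return.
    static readonly Regex NameLine =
        new Regex("(\nname=)([^\r]*)", RegexOptions.IgnoreCase);

    // Hypothetical modification: uppercase each name value.
    public static string UppercaseNames(string text)
    {
        return NameLine.Replace(
            text,
            m => m.Groups[1].Value + m.Groups[2].Value.ToUpperInvariant());
    }

    static void Main()
    {
        string input = Path.GetTempFileName();  // hypothetical input file
        string output = Path.GetTempFileName(); // hypothetical output file
        File.WriteAllText(input, "[p]\r\nname=alpha\r\nx=1\r\nname=beta\r\n");

        File.WriteAllText(output, UppercaseNames(File.ReadAllText(input)));
        Console.Write(File.ReadAllText(output));

        File.Delete(input);
        File.Delete(output);
    }
}
```

Only the matched spans are rewritten; everything else in the text passes through unchanged.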