Sorry about the lame title.
I have a project I'm working on and I would appreciate any suggestions as to how I should go about doing the IO stuff.
Ok, I have 3 text files. One file contains many many lines of text. This is the same for the other 2 files. Let's call them File1, File2 and File3.
I need to create a text file, for the sake of explaining, I'll name it Result.txt.
Here's what needs to be done:
1. Extract the first line of text from File1 and append it to Result.txt.
2. Extract the first line of text from File2 and append it to the end of the first line in Result.txt.
3. Extract the first line of text from File3 and append it to the end of the first line in Result.txt.
4. Create a new line in Result.txt.
5. Repeat steps 1 to 4.
Note: These files can be quite large.
Anyone have any ideas as to how to best approach this?
Thank you
Thank you all for your very helpful answers. I've learned a lot from your advice and code samples!
I think you can use the producer/consumer approach here. One thread (the producer) reads each line from your 3 source files, concatenates the 3 lines and puts the result in an in-memory queue. Meanwhile, another thread (the consumer) constantly reads from this queue and writes to your result.txt file.
1: Producer thread
- Reads line n from files 1, 2 and 3.
- Concatenates the contents of the 3 lines and pushes the result onto the queue.
2: Consumer thread
- Checks whether the queue is empty.
- If not, pops the first item in the queue and writes it to result.txt.
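The two roles above could be sketched roughly as follows. This is only one way to do it, not necessarily what the answer had in mind: it assumes .NET 4 or later so that BlockingCollection<string> can act as the queue, and the LineMerger class, the Merge method and the capacity of 1000 are made-up names and values for illustration. It also assumes merging should stop as soon as the shortest file runs out.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public static class LineMerger
{
    // Hypothetical helper: merges three files line by line into outputPath.
    public static void Merge(string path1, string path2, string path3, string outputPath)
    {
        // Bounded queue (capacity is an arbitrary choice): the producer blocks
        // when 1000 merged lines are waiting, so memory use stays flat.
        using (var queue = new BlockingCollection<string>(1000))
        {
            // Consumer: writes queued lines until the producer signals completion.
            Task consumer = Task.Run(() =>
            {
                using (var result = new StreamWriter(outputPath))
                {
                    foreach (string line in queue.GetConsumingEnumerable())
                        result.WriteLine(line);
                }
            });

            try
            {
                // Producer: reads line n from each file and queues the concatenation.
                using (var file1 = new StreamReader(path1))
                using (var file2 = new StreamReader(path2))
                using (var file3 = new StreamReader(path3))
                {
                    string a, b, c;
                    while ((a = file1.ReadLine()) != null
                        && (b = file2.ReadLine()) != null
                        && (c = file3.ReadLine()) != null)
                    {
                        queue.Add(a + b + c);
                    }
                }
            }
            finally
            {
                // Always release the consumer, even if reading failed.
                queue.CompleteAdding();
                consumer.Wait();
            }
        }
    }
}
```

Calling LineMerger.Merge("file1.txt", "file2.txt", "file3.txt", "result.txt") would run both sides and return once the writer has finished. Using a blocking queue means the consumer does not have to poll for an empty queue; it simply waits until a line arrives.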
int i = 0;
// Stacked using statements dispose the writer and all three readers.
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{
    // file1 drives the loop: stop once it has no more characters to read.
    while (file1.Peek() != -1)
    {
        result.Write(file1.ReadLine());
        result.Write(file2.ReadLine());
        result.WriteLine(file3.ReadLine());
        // Flush every 100 lines.
        if (i++ % 100 == 0) result.Flush();
    }
}
Here we go:
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{
    // Continue until all three files are exhausted; a file that ends
    // early contributes an empty string to the remaining lines.
    while (!file1.EndOfStream || !file2.EndOfStream || !file3.EndOfStream)
    {
        result.Write(file1.ReadLine() ?? "");
        result.Write(file2.ReadLine() ?? "");
        result.WriteLine(file3.ReadLine() ?? "");
    }
}
I built something similar a few months ago, but using a slightly different approach:
- Create two threads, one for reading, another for writing
- On the first thread, read your input files, format a result line and append it to a StringBuilder
- If the StringBuilder is larger than n bytes, place it on a queue and signal the write thread
- On the write thread, take your buffer from the queue and start to write async
- Do this until both threads finish their jobs
You'll need to learn how to synchronize two threads, but it's fun and, in my specific case, we got a good performance boost.
EDIT: A new version of Yuriy's code:
// Needs System.Collections.Generic, System.IO, System.Text and System.Threading.
object locker = new object();
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{
    const int SOME_MAGICAL_NUMBER = 102400; // 100k?
    Queue<string> packets = new Queue<string>();
    StringBuilder buffer = new StringBuilder();

    Thread writer = new Thread(new ThreadStart(() =>
    {
        string packet = null;
        while (true)
        {
            lock (locker)
            {
                // Monitor.Wait must be called while holding the lock.
                // Re-check the queue in a loop so a pulse sent before
                // this thread started waiting is not lost.
                while (packets.Count == 0) Monitor.Wait(locker);
                packet = packets.Dequeue();
            }
            if (packet == null) return; // null marks the end of input
            result.Write(packet);
        }
    }));
    writer.Start();

    while (!file1.EndOfStream || !file2.EndOfStream || !file3.EndOfStream)
    {
        buffer.Append(file1.ReadLine() ?? "");
        buffer.Append(file2.ReadLine() ?? "");
        buffer.AppendLine(file3.ReadLine() ?? "");
        // Hand a full buffer over to the writer thread.
        if (buffer.Length > SOME_MAGICAL_NUMBER)
        {
            lock (locker)
            {
                packets.Enqueue(buffer.ToString());
                buffer.Length = 0;
                Monitor.PulseAll(locker);
            }
        }
    }

    // Queue whatever is left, followed by the end marker.
    lock (locker)
    {
        packets.Enqueue(buffer.ToString());
        packets.Enqueue(null); // done
        Monitor.PulseAll(locker);
    }
    writer.Join();
}
This looks pretty straightforward. Reading the files as binary data instead of as text (line by line) might speed the process up.