Parallel downloading

Developer https://www.devze.com 2023-03-17 03:59 Source: web
I am trying to download files in parallel over HTTP in C#. I have tried a couple of different approaches, but none of them seems to work correctly. No matter what I do, the downloads end up being queued and do not run in a truly parallel fashion.

Could anyone give me some direction, or link to an article that describes a method that actually works?


I just wrote some code and haven't tested it yet; any observations are welcome. Thanks, everyone:

public class DownloadFile
{
    public string Url { get; set; }

    public string PathToSave { get; set; }
}


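// Required namespaces (place at the top of the file):
// using System.Collections.Concurrent;
// using System.Collections.Generic;
// using System.Net;
// using System.Threading.Tasks;
// using System.Timers;

public class ParallelDownloading
{
    private readonly ConcurrentQueue<DownloadFile> _queueToDownload;
    private readonly IList<Task> _downloadingTasks;
    private readonly Timer _downloadTimer;   // System.Timers.Timer
    private readonly int _parallelDownloads;

    public ParallelDownloading(int parallelDownloads)
    {
        _queueToDownload = new ConcurrentQueue<DownloadFile>();
        _downloadingTasks = new List<Task>();
        _downloadTimer = new Timer();
        _parallelDownloads = parallelDownloads;

        // Allow one HTTP connection per download slot. Set this before the
        // first request; otherwise extra requests to the same host are queued.
        ServicePointManager.DefaultConnectionLimit = parallelDownloads;

        // Poll the queue once per second.
        _downloadTimer.Elapsed += DownloadTimer_Elapsed;
        _downloadTimer.Interval = 1000;
        _downloadTimer.Start();
    }

    public void EnqueueFileToDownload(DownloadFile file)
    {
        _queueToDownload.Enqueue(file);
    }

    private void DownloadTimer_Elapsed(object sender, ElapsedEventArgs e)
    {
        StartDownload();
    }

    private void StartDownload()
    {
        lock (_downloadingTasks)
        {
            // Fill every free slot on each tick rather than starting one download per second.
            DownloadFile next;
            while (_downloadingTasks.Count < _parallelDownloads && _queueToDownload.TryDequeue(out next))
            {
                // Copy to a per-iteration local so each lambda captures its own file.
                DownloadFile fileToDownload = next;

                var task = new Task(() =>
                {
                    // WebClient is IDisposable; release it when the download ends.
                    using (var client = new WebClient())
                    {
                        client.DownloadFile(fileToDownload.Url, fileToDownload.PathToSave);
                    }
                }, TaskCreationOptions.LongRunning);

                // Free the slot when the download finishes, whether it succeeded or failed.
                task.ContinueWith(DownloadOverCallback, TaskContinuationOptions.None);

                _downloadingTasks.Add(task);
                task.Start();
            }
        }
    }

    public void DownloadOverCallback(Task downloadingTask)
    {
        lock (_downloadingTasks)
        {
            _downloadingTasks.Remove(downloadingTask);
        }
    }
}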

You can test it with this:

var p = new ParallelDownloading(5);

// Queue 19 copies of the same 20 MB test file as c:\file1.f through c:\file19.f.
for (int i = 1; i <= 19; i++)
{
    p.EnqueueFileToDownload(new DownloadFile
    {
        PathToSave = string.Format(@"c:\file{0}.f", i),
        Url = @"http://download.thinkbroadband.com/20MB.zip"
    });
}


Is this because you are running on a single core machine?

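By default the TPL schedules tasks on the thread pool, which starts with about one worker thread per core and adds more only gradually as work backs up. There are ways to make it use more threads if you want to, such as creating tasks with TaskCreationOptions.LongRunning or raising the pool minimum with ThreadPool.SetMinThreads.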
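As a rough sketch of one way to get more threads up front, the helper below raises the thread pool's minimum worker count. ThreadPool.GetMinThreads and ThreadPool.SetMinThreads are real framework calls; the class name ThreadPoolTuning and the method EnsureMinWorkerThreads are made up for this illustration.

```csharp
using System.Threading;

public static class ThreadPoolTuning
{
    // Hypothetical helper: raises the pool's minimum worker-thread count to at
    // least `desiredWorkers`. Returns true if the minimum is now high enough.
    public static bool EnsureMinWorkerThreads(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int workers, out int completionPorts);
        if (workers >= desiredWorkers)
        {
            return true; // already high enough, nothing to change
        }

        // SetMinThreads returns false if the pool rejects the value
        // (for example, when it exceeds the configured maximum).
        return ThreadPool.SetMinThreads(desiredWorkers, completionPorts);
    }
}
```

Calling ThreadPoolTuning.EnsureMinWorkerThreads(8) once at startup would let eight blocking downloads start without waiting for the pool to grow.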


Downloading a file is an I/O-bound operation, so before worrying about parallelism, make sure each individual download is a threadless async call. Methods like Task.Run and task.Start are thread-based and should not be used for I/O-bound work: you will kick the downloads off in parallel, but each one ties up a thread that sits idle waiting for the download call to return.

Instead, use the async/await pattern and await an asynchronous download method. This is threadless as long as the download method is truly asynchronous, which most libraries provide.
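A minimal sketch of such a download method, assuming HttpClient is available (the question used WebClient). The names AsyncDownloader and DownloadToFileAsync are hypothetical; the HttpClient, Stream and FileStream calls are standard framework APIs.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class AsyncDownloader
{
    // Streams one HTTP response to disk. No thread is blocked while the
    // method waits on the network or on the file system.
    public static async Task DownloadToFileAsync(HttpClient client, string url, string pathToSave)
    {
        // ResponseHeadersRead returns as soon as the headers arrive, so the
        // body is streamed instead of being buffered in memory first.
        using (HttpResponseMessage response =
            await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();

            using (Stream body = await response.Content.ReadAsStreamAsync())
            using (var file = new FileStream(pathToSave, FileMode.Create, FileAccess.Write,
                                             FileShare.None, 81920, useAsync: true))
            {
                await body.CopyToAsync(file);
            }
        }
    }
}
```

The HttpClient is passed in rather than created per call so that one instance, and its connections, can be shared across all downloads.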

To parallelize the I/O calls, start them all, store the returned tasks in a collection, and at the end use await Task.WhenAll(tasks); to await them all.
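The collect-then-await step might look like the sketch below. DownloadBatch, DownloadAllAsync and the downloadOne parameter are hypothetical names; downloadOne stands in for whatever truly asynchronous download method you have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DownloadBatch
{
    // Starts one download per URL, then awaits them all together.
    // Returns the number of downloads that were started.
    public static async Task<int> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task> downloadOne)
    {
        // ToList() forces every call to start now. Each call returns at its
        // first incomplete await, so the downloads overlap.
        List<Task> tasks = urls.Select(downloadOne).ToList();

        // Completes when every task has finished; if any failed, awaiting
        // rethrows the first exception.
        await Task.WhenAll(tasks);
        return tasks.Count;
    }
}
```

Note that nothing here limits how many downloads are in flight at once; every URL is started immediately.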

When making concurrent I/O-bound async calls, also take care not to starve the connection pool; you may want to limit the number of concurrent calls you make.
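One common way to cap concurrency without blocking threads is a SemaphoreSlim, sketched below. This is an illustration under stated assumptions, not the ParallelProcessor package's API: ThrottledDownloads, RunAsync and downloadOne are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDownloads
{
    // Runs `downloadOne` for every URL, with at most `maxConcurrent`
    // downloads in flight at any moment.
    public static async Task RunAsync(
        IEnumerable<string> urls, int maxConcurrent, Func<string, Task> downloadOne)
    {
        using (var slots = new SemaphoreSlim(maxConcurrent))
        {
            List<Task> tasks = urls.Select(async url =>
            {
                await slots.WaitAsync();   // wait asynchronously for a free slot
                try
                {
                    await downloadOne(url);
                }
                finally
                {
                    slots.Release();       // free the slot even if the download failed
                }
            }).ToList();

            // All tasks have completed by the time the semaphore is disposed.
            await Task.WhenAll(tasks);
        }
    }
}
```

With maxConcurrent set to 5, this gives the same five-at-a-time behaviour the question's timer and queue aim for, but waiting downloads hold no thread.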

I have implemented a parallel processing API that lets you make concurrent, threadless async calls with options such as I/O throttling.

Feel free to have a look and use: https://www.nuget.org/packages/ParallelProcessor/
