Downloading files from the net: use threading or not?

Developer https://www.devze.com 2023-01-08 18:08 Source: Internet
I need to download a huge number of files from the net based on a keyword. The steps I am following are:

  1. Use scraping to find the links to the files.
  2. Use WebClient.DownloadData() to download each file as a byte[].
  3. Save the array to a file.

Is it a good idea to create one thread per file for better performance? Any suggestions are welcome. Thanks.

foreach (string each in arr)
        {

            Thread t = new Thread(
                                new ThreadStart(
                                    delegate
                                    {

                                        string[] arr2 = each.Split(new string[] { "http://" }, StringSplitOptions.None);

                                        string[] firstElem = arr2[1].Split(new string[] { " " }, StringSplitOptions.None);

                                        string urlToDownload = @firstElem[0].Replace("\"", string.Empty);
                                        string filName = Path.GetFileName(urlToDownload);
                                        string dirName = DirInAppConfig();
                                        DataRow row;
                                        bool dataExistsInDtKwWithSameDownloadLinkAndFileName;
                                        getRowForKwDownLinkFileName(urlToDownload, filName, out row, out dataExistsInDtKwWithSameDownloadLinkAndFileName);
                                        downloadFile(Client, urlToDownload, dirName, filName, search, row);
                                    }));
                                t.IsBackground = true;
                                t.Start();
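                                // Note: Join() blocks until this thread finishes, so the
                                // downloads still run one at a time despite the extra threads.
                                t.Join();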
        }


Servers often limit downloads from a single IP address to two connections. So if all the files come from the same server, multiple threads might not help much.
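If a connection limit like that applies, one option is to cap how many downloads run at once instead of starting one thread per file. The following is only a minimal sketch of that idea, assuming .NET 4.5 or later (for async/await and SemaphoreSlim.WaitAsync). The names ThrottledDownloader, DownloadAllAsync and SaveToDirectoryAsync are hypothetical and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class ThrottledDownloader
{
    // Runs 'download' once per URL, with at most 'maxConcurrent' calls in flight.
    public static async Task DownloadAllAsync(
        IEnumerable<string> urls, int maxConcurrent, Func<string, Task> download)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            List<Task> tasks = urls.Select(async url =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try { await download(url); }
                finally { gate.Release(); }      // free the slot even on failure
            }).ToList();

            // Completes only after every download has finished or failed.
            await Task.WhenAll(tasks);
        }
    }

    // Assumed download step: saves the file under 'directory' using the
    // last segment of the URL path as the file name.
    public static async Task SaveToDirectoryAsync(string url, string directory)
    {
        var uri = new Uri(url);
        string fileName = Path.GetFileName(uri.LocalPath);
        using (var client = new WebClient())
        {
            await client.DownloadFileTaskAsync(uri, Path.Combine(directory, fileName));
        }
    }
}
```

It could be called as DownloadAllAsync(urls, 2, url => ThrottledDownloader.SaveToDirectoryAsync(url, dirName)), where urls and dirName come from the scraping step. On .NET Framework the client itself also caps connections per host (ServicePointManager.DefaultConnectionLimit, which defaults to 2 for non-ASP.NET applications), so raising maxConcurrent alone may not increase parallelism against a single host.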


Have you done a performance analysis that indicates you need to consider threading? No? Then this is premature optimization, and you should stop that right now.

Do you have experience with multithreading, such that you're not likely to make some stupid mistake about locking, or, if you do make such a mistake, you will be able to quickly find and fix it? No? Then you should stop right now.

You may have no clear idea how much more time it can take to debug a multithreaded program. That time could totally overwhelm the time you could save by using multiple threads.
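The performance analysis mentioned above can start very simply: time the plain sequential version on a small sample before changing anything. This is a rough sketch only; Timing and Measure are hypothetical names, and a single Stopwatch reading is a crude measurement rather than a proper benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper class, for illustration only.
public static class Timing
{
    // Runs 'work' once and returns the wall-clock time it took.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

For example, wrap the existing foreach loop (without the extra threads) in Timing.Measure(() => { ... }) for a few dozen files. If most of the time goes to waiting on the network for a single server, adding threads is unlikely to change the result much.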
