Best Practices for working with files via C#

The application I work on generates several hundred CSV files in a 15-minute period, and the back end of the application takes these files and processes them (updates the database with those values). One problem is database locks.

What are the best practices for working with several thousand files, so as to avoid locking and process them efficiently?

Would it be more efficient to create a single file and process it, or to process one file at a time?

What are some common best practices?

Edit: the database is not a relational DBMS. It's a NoSQL, object-oriented DBMS that works in memory.


So, assuming that you have N machines creating files, and each file is similar in the sense that it generally gets consumed into the same tables in the database...

I'd set up a queue, have all of the machines write their files to the queue, and then have something on the other side picking items off the queue and processing them into the database. So, one file at a time. You could probably even optimize away the file operations by writing to the queue directly.
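
A minimal sketch of that idea in C#, assuming a single consumer thread on the database side; the names (Enqueue, ProcessIntoDatabase) are placeholders, not part of any particular library:

using System.Collections.Concurrent;
using System.Threading.Tasks;

class FileQueueProcessor
{
  static readonly BlockingCollection<string> fileQueue = new BlockingCollection<string>();

  // Producers (the machines generating CSVs) hand their file paths to the queue.
  public static void Enqueue(string csvPath) => fileQueue.Add(csvPath);

  // A single consumer serializes database access, so only one file's worth
  // of updates hits the database at a time.
  public static Task StartConsumer() => Task.Run(() =>
  {
    foreach (var path in fileQueue.GetConsumingEnumerable())
    {
      ProcessIntoDatabase(path); // placeholder for your existing import logic
    }
  });

  static void ProcessIntoDatabase(string path)
  {
    // parse the CSV and apply the updates here
  }
}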


If you are experiencing problems with locks, it's likely that the database tables being updated do not have proper indexes on them. Get the SQL code that does the updating and find out what its execution plan is; if you are using MSSQL, you can do this in SSMS. If the UPDATE is causing a table scan, you need to add an index that helps isolate the records being updated (unless you are updating every single record in the table, which could be a problem in itself).


With limited knowledge of your exact scenario...

Performance-wise, closing the file is possibly the most expensive operation you would be performing in terms of time, so my advice would be: if you can go the single-file route, that would be the most performant approach.
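
As a rough illustration of the single-file route (assuming all writers live in one process), rows could be appended to one shared CSV through a single synchronized writer, so the expensive open/close happens once rather than hundreds of times; the path and method names here are made up for the example:

using System.IO;

class SingleFileWriter
{
  static readonly object _sync = new object();
  static readonly StreamWriter _writer =
    new StreamWriter(@"C:\data\batch.csv", append: true);

  public static void WriteRow(string csvLine)
  {
    lock (_sync)                 // serialize writers from multiple threads
    {
      _writer.WriteLine(csvLine);
    }
  }

  public static void Flush()
  {
    lock (_sync) { _writer.Flush(); }  // flush once per batch, not per row
  }
}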


A lock will prevent other threads from processing files until the first one is finished.

class ThreadSafe
{
  static readonly object _locker = new object();
  static int _val1, _val2;

  static void Go()
  {
    // Only one thread can hold _locker at a time, so the check on _val2
    // and the division below always see a consistent pair of values.
    lock (_locker)
    {
      if (_val2 != 0) Console.WriteLine (_val1 / _val2);
      _val2 = 0;
    }
  }
}


Sounds like you'll either want a single-file mechanism, or have all of the files consumed from a shared directory by a process that continuously checks for the oldest CSV file and runs it through your code. That might be the "cheapest" solution, anyway. If you are actually generating more files than you can process, then I'd probably rethink the overall system architecture instead of taking the band-aid approach.
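
A minimal sketch of that shared-directory approach, assuming a simple polling loop; the folder path and ProcessFile are placeholders for your own import logic:

using System.IO;
using System.Linq;
using System.Threading;

class DropFolderConsumer
{
  const string DropFolder = @"C:\data\incoming";

  public static void Run()
  {
    while (true)
    {
      // Always take the oldest CSV first.
      var oldest = new DirectoryInfo(DropFolder)
        .GetFiles("*.csv")
        .OrderBy(f => f.CreationTimeUtc)
        .FirstOrDefault();

      if (oldest == null)
      {
        Thread.Sleep(1000);          // nothing to do, wait a bit
        continue;
      }

      ProcessFile(oldest.FullName);  // your existing import logic
      oldest.Delete();               // or move it to an archive folder
    }
  }

  static void ProcessFile(string path) { /* parse CSV and update the DB */ }
}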


You may try to handle concurrency issues at the level of your application code and force the DBMS not to lock objects during updates.

(In an RDBMS you would set the lowest transaction isolation level possible: read uncommitted.)

Provided you can do that, another option is to truncate all old objects and bulk-insert the new values.
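
A rough sketch of the truncate-and-reload idea; IObjectStore and its methods are purely hypothetical stand-ins, since the real calls depend entirely on which object DBMS you use:

using System.Collections.Generic;

// Hypothetical client API: only illustrates the truncate-and-reload pattern.
interface IObjectStore
{
  void Truncate(string collection);
  void BulkInsert<T>(string collection, IEnumerable<T> items);
}

class Record { /* fields parsed from the CSV */ }

class Reloader
{
  public static void Reload(IObjectStore store, IEnumerable<Record> latest)
  {
    // Replace the old objects wholesale instead of updating them one by one,
    // which sidesteps per-object lock contention during the import.
    store.Truncate("records");
    store.BulkInsert("records", latest);
  }
}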
