开发者

nHibernate performance issue when loading large collections

开发者 https://www.devze.com 2023-02-24 11:41 出处:网络
(just to make clear: my app isn\'t really about employees and departments. I just use these terms for example\'s sake).

(just to make clear: my app isn't really about employees and departments. I just use these terms for example's sake).

Each department has an employees collection, which is lazily-loaded. Whenever I add a new employee I want to make sure It doesn't already exist in the collection, so I load the collection to memory and perform the check on it.

Problem is- in production environment, I have some departments with 10,000+ employees.

I found that fetching the collection and then saving the new employee takes A LOT of time.

I've done a little experment, in which I copied the exact same select statement generated by nH to ADO.Net SQLDataAdapter. Here开发者_运维技巧 are the results:

***16:04:50:437*** DEBUG NHibernate.SQL - SELECT ... FROM dbo.[Employee] emp0_ left outer join dbo.[Department] department1_ on emp0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE emp0_.SomeField_id=@p0;@p0 = 2
***16:05:00:250*** DEBUG NHibernate.SQL - SELECT ... FROM dbo.TableD codeshared0_ left outer join dbo.[Department] department1_ on codeshared0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE codeshared0_.Employee_id in (select emp0_.Id FROM dbo.[Employee] emp0_ left outer join dbo.[Department] department1_ on emp0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE emp0_.SomeField_id=@p0);@p0 = 2
16:05:04:984 DEBUG NHibernate.SQL - Reading high value:select next_hi from dbo._uniqueKey with (updlock, rowlock)
16:05:05:078 DEBUG NHibernate.SQL - Updating high value:update dbo._uniqueKey set next_hi = @p0 where next_hi = @p1;@p0 = 10686, @p1 = 10685
***16:05:05:328*** DEBUG MyApp.Managers - commiting
16:05:12:000 DEBUG NHibernate.SQL - INSERT INTO dbo.[Employee] (...) VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9);@p0 = 23/04/2011 04:04:49, @p1 = 23/04/2011 03:34:49, @p2 = 23/04/2011 04:04:49, @p3 = 23/04/2011 03:34:49, @p4 = '', @p5 = False, @p6 = 433, @p7 = NULL, @p8 = 2, @p9 = 10685
16:05:12:140 DEBUG NHibernate.SQL - UPDATE dbo.[Employee] SET Department_id = @p0 WHERE Id = @p1;@p0 = 2, @p1 = 10685
16:05:12:343 DEBUG MyApp.Managers - success
16:05:12:359 DEBUG MyApp.Tests - ------------------------------------------------------------
16:05:12:359 DEBUG MyApp.Tests - Finished nHib stuff- now switching to ADO 
16:05:12:359 DEBUG MyApp.Tests - starting SQL: SELECT ... FROM dbo.[Employee] emp0_ left outer join dbo.[Department] department1_ on emp0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE emp0_.SomeField_id=2
16:05:14:750 DEBUG MyApp.Tests - total rows received: 10036
16:05:14:750 DEBUG MyApp.Tests - SQL: SELECT ... FROM dbo.TableD codeshared0_ left outer join dbo.[Department] department1_ on codeshared0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE codeshared0_.Employee_id in (select emp0_.Id FROM dbo.[Employee] emp0_ left outer join dbo.[Department] department1_ on emp0_.Department_id=department1_.Id left outer join dbo.[TableC] TableC2_ on department1_.TableC_id=TableC2_.Id WHERE emp0_.SomeField_id=2)
16:05:15:250 DEBUG MyApp.Tests - total rows received: 2421

as you can see- fetching takes ~15 secs with nH, compared to ~2 secs with ADO.Net.

From researching around a bit I know that nH probably isn't meant to be used to store that many items in session. Can you think of any other possible reason for this problem, or of another suggestion other than filtering the Employees at the DB level?

thanks

--EDIT--

Following the below suggestions i've tried using Reflection Optimizer (made no difference), and IStatelessSession for loading my collection (throws an exception- collections cannot be fetched by a stateless session.).

I think my code in the Department class will have to change from the clean:

if (this.Employees.Contains(emp))
{
  ...
}  

to this 'dirtier' version:

var employeesRepository = IOCContainer.Get<IEmployeesRepository>();  
if (employeesRepository.EmployeeExists(this,emp))
{
  ...
}  

anyone has a better suggestion?


There is no reason for you to load all the empoyees to memory. you should write a query using HQL/Critiria API/Linq to NHibernate to check if the employee already existing in the DB. for example:

var existingEmpoyee = session.Query<Employee>()
                             .Where(e => e.Equals(newEmployee))
                             .FirstOrDefault();
if(existingEmployee != null)
   // Insert new employee to DB


I would use a StatelessSession and batch optimization.

The session will keep track of all the loaded objects, and if we load a lot of data, it will eventually blow out with an out of memory exception. Luckily, NHibernate has a ready made solution for this, the stateless session. The code now looks like this:

using (IStatelessSession s = sessionFactory.OpenStatelessSession())
{
    var books = new ActionableList<Book>(book => Console.WriteLine(book.Name));
    s.CreateQuery("from Book")
        .List(books);

}

The stateless session, unlike the normal NHibernate session, doesn’t keep track of loaded objects, so the code here and the data reader code are essentially the same thing.

For batch optimization and more: NHibernate performance tricks.


hmmmm. may be its too much, and actually - i'd expect an "lazy=extra" ISet to behave like this, but you can write your own "extra lazy" ISet. if you didn't encounter an extra lazy collection - its a collection that, for example, when you ask its count, it doesn't fetch everything but issues a count query. your extra lazy ISet could issue an exists query whenever you try to add something.

If you implement this, your code would be clean and you could submit the code to nhibernate core.

you should however think about add range, and be careful not to issue N queries

good luck.

0

精彩评论

暂无评论...
验证码 换一张
取 消