Data Streaming with Entity Framework

Developer https://www.devze.com 2023-02-18 03:39 Source: Internet

I'm designing an ELT system for a data warehouse and am wondering what the most efficient, yet still (in some sense) safe, way of extracting the data from the source database is.

I need to read a couple of tables from the source database and organize them into POCOs that I can work with effectively. These roughly correspond to the dimensions of my cube. To get the facts into my cube, I need to bulk load huge amounts of data from other tables, apply some (non-trivial) transformations to them, and write the results into a table in the target database.

Although in principle I would only benefit from a small subset of O/RM features, I'm still wondering whether Entity Framework could be an option. My question is therefore whether EF (in its newest version) can handle streaming data. By that I mean that I keep some kind of DataReader open, load a couple of POCOs, transform them, write the results into the second database, dispose of them as soon as I can (I cannot keep them all in memory because it would blow up), and continue reading until I'm done.

I obviously don't need any change management for these objects, and I want to keep them (at least the second category, the facts) alive only for a short period of time and dispose of them while still in the same transaction. For me, disposing means not only that I get rid of the POCOs, but also that EF keeps no infrastructure for them and does not waste even a single byte of memory on any of those objects anymore.

The advantage I see in using an O/RM is that it could simplify querying and transformation to some extent, but I'm not willing to sacrifice too much performance, and I'm limited in the overall amount of memory I can consume. Does it make sense to go for EF, or should I stick with a plain old ADO.NET DataReader?


Use BLToolkit. We do that, and it works very nicely. It has ONLY the small subset of features that is good for ETL; for example, it does not remember which objects it loaded in a transaction.

If you use EF, you are dead. ORMs are NOT for data loads, they are for business objects. A lot of the higher-level features (uniquing, etc.) come with a HUGE price the moment you move 10 million objects ;)
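
For comparison, the plain ADO.NET route mentioned in the question might look roughly like the sketch below. It only illustrates the read, batch, and hand-off loop and is not a tuned loader. The `SaleFact` class, its `ProductId` and `Amount` columns, and the `FactStream` helper are hypothetical names made up for this example. An in-memory `DataTable` stands in for the source table so the code runs without a database; in a real load the reader would come from a command against the source connection.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical fact row; the column names are illustrative, not from the question.
public sealed class SaleFact
{
    public int ProductId { get; set; }
    public decimal Amount { get; set; }
}

public static class FactStream
{
    // Lazily maps each row of an open reader to a POCO.
    // Only the current row is materialized; nothing is cached or tracked.
    public static IEnumerable<SaleFact> Read(IDataReader reader)
    {
        int productId = reader.GetOrdinal("ProductId");
        int amount = reader.GetOrdinal("Amount");
        while (reader.Read())
        {
            yield return new SaleFact
            {
                ProductId = reader.GetInt32(productId),
                Amount = reader.GetDecimal(amount)
            };
        }
    }

    // Groups a lazy sequence into lists of at most batchSize items,
    // so at most one batch needs to be reachable at a time.
    public static IEnumerable<List<T>> Batch<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for the source table so the sketch runs without a database.
        var table = new DataTable();
        table.Columns.Add("ProductId", typeof(int));
        table.Columns.Add("Amount", typeof(decimal));
        for (int i = 1; i <= 5; i++) table.Rows.Add(i, i * 10m);

        using (IDataReader reader = table.CreateDataReader())
        {
            foreach (List<SaleFact> batch in FactStream.Batch(FactStream.Read(reader), 2))
            {
                // Transform the batch and write it to the target database here,
                // for example with a bulk-copy API.
                Console.WriteLine("batch of {0}", batch.Count);
            }
        }
    }
}
```

With five rows and a batch size of 2, the loop sees three batches. Once a batch has been written and the loop moves on, nothing references its objects any more, so the garbage collector is free to reclaim them. That matches the "load a few, transform, write, forget" behaviour the question asks for.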
