
MongoDB ETL (php/java...)

Developer https://www.devze.com 2023-02-20 04:36 Source: web
Is there an ETL for MongoDB?...


Pentaho Data Integration supports MongoDB (See the documentation http://wiki.pentaho.com/display/EAI/Pentaho+Data+Integration+Steps).

Similarly, Talend supports MongoDB: https://github.com/adrien-mogenet/tMongoDBConnection


For simple inserts of CSV documents I would suggest looking at the Mongo wiki page Import Export Tools.

For anything more complicated I'd suggest writing an ad-hoc script in the language you are most comfortable with.
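As a rough illustration of such an ad-hoc script, here is a minimal Python sketch that turns CSV text into documents and inserts them. The function names, the connection URI, and the `test`/`users` database and collection names are assumptions made for the example, and the load step needs the `pymongo` package and a running server.

```python
import csv
import io


def csv_to_documents(csv_text):
    """Parse CSV text into a list of dicts, one per row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def load_documents(documents, uri="mongodb://localhost:27017",
                   db_name="test", collection_name="users"):
    """Insert the documents into MongoDB.

    The URI, database, and collection names are placeholders (assumptions).
    """
    # Imported lazily so the parsing step works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        if documents:  # insert_many rejects an empty list
            client[db_name][collection_name].insert_many(documents)
    finally:
        client.close()


if __name__ == "__main__":
    sample = "user_id,name\n1,Alice\n2,Bob\n"
    print(csv_to_documents(sample))
```

Note that `csv.DictReader` yields every value as a string, so any type conversion (integers, dates) would have to be added before loading.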


It seems that Pentaho Data Integration and Talend only support reading from MongoDB, not writing.

Another tool that has just announced support for MongoDB is DataCleaner, which supports both read and write operations. It positions itself less as an ETL tool than as a data quality analysis tool, but it does have ETL-like capabilities as well.

http://datacleaner.eobjects.org


I created my own ETL solution with Python scripts to transfer data from MySQL to MongoDB. It works very well, in my opinion.

Basically, I used the following two Python modules for accessing MySQL and MongoDB:

  1. pymongo
  2. python-mysql.connector

Both of them are installable from the official Ubuntu repository.
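A script along these lines might look like the sketch below. It is not the script described in this answer: the function names, batch size, and connection parameters are all assumptions. The transform step exists because BSON has no encoding for `decimal.Decimal` or plain `datetime.date` values, which MySQL drivers commonly return.

```python
import datetime
import decimal


def row_to_document(row):
    """Convert one MySQL row (a dict) into a dict that pymongo can encode."""
    doc = {}
    for column, value in row.items():
        if isinstance(value, decimal.Decimal):
            value = float(value)  # may lose precision; acceptable for a sketch
        elif isinstance(value, datetime.date) and not isinstance(value, datetime.datetime):
            # Plain dates become midnight datetimes.
            value = datetime.datetime(value.year, value.month, value.day)
        doc[column] = value
    return doc


def batches(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def transfer(mysql_config, mongo_uri, db_name, table, batch_size=1000):
    """Copy every row of `table` into a MongoDB collection of the same name.

    All arguments are hypothetical; `table` must come from a trusted source
    because it is interpolated into the SQL text.
    """
    # Imported lazily so the pure helpers above work without the drivers.
    import mysql.connector
    from pymongo import MongoClient

    source = mysql.connector.connect(**mysql_config)
    target = MongoClient(mongo_uri)
    try:
        cursor = source.cursor(dictionary=True)  # rows come back as dicts
        cursor.execute("SELECT * FROM `{}`".format(table))
        collection = target[db_name][table]
        documents = (row_to_document(row) for row in cursor)
        for batch in batches(documents, batch_size):
            collection.insert_many(batch)
    finally:
        source.close()
        target.close()
```

A call could look like `transfer({"host": "localhost", "user": "etl", "password": "...", "database": "shop"}, "mongodb://localhost:27017", "shop", "users")`, where every value is a placeholder. Inserting in batches keeps memory use bounded on large tables.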


I've created a MongoDB driver for the Scriptella ETL tool. It is available at https://github.com/scriptella/scriptella-mongodb.

Example of migrating data from a relational table:

<connection id="out" url="mongodb://localhost/test"  classpath="../lib/scriptella-mongodb-driver.jar:../lib/mongo-java-driver-2.10.1.jar" />

<query connection-id="in">
    SELECT * FROM USERS
    <script connection-id="out">
        {
            operation: 'db.collection.save',
            collection: 'users',
            data: {
                user_id: '?user_id',
                name: '?name'
            }
        }
    </script>
</query>


This PHP program automatically transfers a MongoDB database to MySQL. It introspects the Mongo collections, creates the MySQL schema, and transfers the data. It only goes one level deep (levels 0 and 1); deeper nesting is not migrated:

http://my.sociopal.com/sociopaltech/post?id=simple_utility_for_copying_data_from_mongodb_to_mysql_this_is_a_simple_php_program_im_using_in_o_61755
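To make the "one level deep" restriction concrete, here is a small Python sketch of that kind of flattening. It is an illustration only, not the linked PHP program's code: the function name, the `parent_child` column naming, and the choice to skip lists are all assumptions.

```python
def flatten_one_level(document, separator="_"):
    """Flatten a document into a flat row, descending one level only.

    Level 0 scalars are kept as-is. Scalars inside a level 1 sub-document
    become `parent<separator>child` columns. Anything nested deeper, and
    any list, is skipped (an assumption made for this sketch).
    """
    row = {}
    for key, value in document.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                if not isinstance(sub_value, (dict, list)):
                    row[key + separator + sub_key] = sub_value
        elif not isinstance(value, list):
            row[key] = value
    return row


if __name__ == "__main__":
    user = {
        "_id": 1,
        "name": "Alice",
        "address": {"city": "Paris", "geo": {"lat": 48.85}},
        "tags": ["a", "b"],
    }
    print(flatten_one_level(user))
```

In the sample document, `address.city` survives as the column `address_city`, while `address.geo` and `tags` are dropped, which is the kind of data loss to expect from a one-level migration.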


Pentaho DI supports MongoDB reads; I am not sure about writes. Given how MongoDB's underlying structure differs from a conventional RDBMS, you might be better off with a custom ETL process or script, perhaps in Python or Java, rather than off-the-shelf tools that might not do what they claim. Eventually one of the players in the BI/ETL market will probably build this into their tool, once the process is mature and has been tested repeatedly against MongoDB.


MongoSyphon is a lightweight open source ETL tool that transforms data into documents in JSON or XML format.

It can also do the reverse, sending documents directly into MongoDB, which sets it apart from other ETL tools that try to create a relational structure. Besides MongoSyphon, several other tools cover the same ground, for example:

  • Transporter
  • Hevo Data
  • Krawler
  • Panoply
  • SYNC
  • Pentaho