I am looking for a faster alternative to mwdumper for importing MediaWiki XML dumps. I used wget to download the XML files one by one for large articles, and there are a few hundred that I need to import. Importing them one at a time is taking too long.
Also, the command java -jar mwdumper.jar --format=sql:1.5 page1.xml | mysql -u username -d databasename does not seem to work on the Windows command line.
I have compared several available options. maintenance/importDump.php has been a winner for me:
- It's part of MediaWiki itself, so more likely to remain supported and less chance of anomalies (which you will certainly get if you start messing with SQL queries yourself).
- It's at least twice as fast as some code I had based on code from maintenance/edit.php.
- It can run on an existing MediaWiki setup.
- It works well with GNU Parallel.
- Nice feedback in the form of 20.23 revs/sec
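
If GNU Parallel is not available (for example on a plain Windows command line), the same idea can be roughed out with a small Python script. This is only a sketch, not something taken from MediaWiki: the function names, the dump folder, and the worker count are all made up for illustration, and it assumes that php is on the PATH, that the script is started from the MediaWiki root directory, and that importDump.php is given the dump file as its argument.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(dump_path, script="maintenance/importDump.php"):
    """Build the command line for importing one dump file.

    Assumption: php is on the PATH and the working directory is the
    MediaWiki root, so the relative script path resolves.
    """
    return ["php", script, str(dump_path)]


def run_import(command):
    """Run one import and return its exit code (0 means success)."""
    return subprocess.run(command).returncode


def import_all(dump_dir, workers=4, runner=run_import):
    """Import every *.xml file in dump_dir, running `workers` imports at once.

    Returns a dict mapping each file name to the exit code of its import.
    The runner parameter is hypothetical plumbing so the function can be
    tried out without actually calling php.
    """
    dumps = sorted(Path(dump_dir).glob("*.xml"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads are enough here: each one just waits on a php process.
        codes = list(pool.map(lambda p: runner(build_command(p)), dumps))
    return {dump.name: code for dump, code in zip(dumps, codes)}
```

Calling import_all("dumps", workers=4) would then import everything in a folder named dumps, four files at a time, and any file whose exit code is not 0 can be retried on its own. How many workers actually help depends on the database server, so it is worth starting low.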