
Is there a faster alternative to mwdumper for importing XML dumps?

Developer https://www.devze.com 2023-04-01 20:19 Source: Internet

I am looking for a faster alternative to mwdumper for importing MediaWiki XML dumps. I used wget to download the XML files one by one for large articles, and there are a few hundred that I need to import. Importing them one at a time is taking too long.

The command java -jar mwdumper.jar --format=sql:1.5 page1.xml | mysql -u username -d databasename does not seem to work from the Windows command line.


I have compared several available options. maintenance/importDump.php has been a winner for me:

  • It's part of MediaWiki itself, so more likely to remain supported and less chance of anomalies (which you will certainly get if you start messing with SQL queries yourself).
  • It's at least twice as fast as some code I had based on code from maintenance/edit.php.
  • It can run on an existing MediaWiki setup.
  • It works well with GNU Parallel.
  • It reports progress as it runs, for example 20.23 revs/sec.
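
Since the question involves a few hundred separate XML files, here is a minimal sketch of running maintenance/importDump.php on several of them at once. It uses a Python thread pool in place of GNU Parallel, which is a substitution on my part and not what I actually ran. The names build_import_command and import_dumps, the mediawiki_root parameter, and the default of 4 jobs are all hypothetical. It assumes php is on the PATH and that your wiki tolerates concurrent imports.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_import_command(dump_path, mediawiki_root="."):
    """Return the argv list that imports one XML dump with importDump.php.

    mediawiki_root is a hypothetical parameter: the directory that
    contains MediaWiki's maintenance/ folder.
    """
    script = Path(mediawiki_root) / "maintenance" / "importDump.php"
    return ["php", str(script), str(dump_path)]


def import_dumps(dump_paths, mediawiki_root=".", jobs=4, runner=subprocess.run):
    """Import several dumps concurrently and return {path: exit code}.

    Each worker thread only waits on an external php process, so a
    thread pool is enough here. The runner argument exists so the
    function can be exercised without a real MediaWiki install.
    """
    def run_one(path):
        result = runner(build_import_command(path, mediawiki_root))
        return str(path), result.returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return dict(pool.map(run_one, dump_paths))
```

A call such as import_dumps(sorted(Path("dumps").glob("*.xml")), mediawiki_root="/path/to/wiki") would then import every file in a (hypothetical) dumps folder. Any non-zero exit code in the returned dictionary marks a file that needs another look.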