Python (or C) library for creating XLSX documents that can handle millions of rows [closed]

Developer https://www.devze.com 2023-03-14 22:56 Source: Internet
Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 3 years ago.

I'm looking for a library to create XLSX files which can contain upwards of a million rows, and several dozen columns. So far all the libraries I have found in Python consume too much memory, and I haven't found a suitable library to wrap in C. I'd prefer open source so I can modify the code if need be.

EDIT: I have found a solution. openpyxl has an "Optimized Writer": http://packages.python.org/openpyxl/optimized.html


Have you tried ElementTree? If that uses too much memory, use SAX and just process one row at a time. See "XML parsing - ElementTree vs SAX and DOM".
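
To illustrate the row-at-a-time idea, here is a rough sketch that streams worksheet XML with the standard library's `xml.sax.saxutils.XMLGenerator`, so only the current row is held in memory. The function names `write_sheet_xml` and `_write_cell` are made up for this example, and the output is only the sheet part of an XLSX package, not a complete file. It assumes numbers go into plain `<v>` cells and everything else into inline strings, with no cell references or styles.

```python
from xml.sax.saxutils import XMLGenerator

SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def _write_cell(gen, value):
    """Emit one <c> element (hypothetical helper for this sketch)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Numeric cell: the value goes straight into <v>.
        gen.startElement("c", {})
        gen.startElement("v", {})
        gen.characters(str(value))
        gen.endElement("v")
        gen.endElement("c")
    else:
        # Anything else is stored as an inline string; characters() escapes it.
        gen.startElement("c", {"t": "inlineStr"})
        gen.startElement("is", {})
        gen.startElement("t", {})
        gen.characters(str(value))
        gen.endElement("t")
        gen.endElement("is")
        gen.endElement("c")


def write_sheet_xml(out, rows):
    """Stream `rows` (any iterable of sequences) to the file object `out`."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("worksheet", {"xmlns": SHEET_NS})
    gen.startElement("sheetData", {})
    for values in rows:  # a generator works here, so rows are never all in memory
        gen.startElement("row", {})
        for value in values:
            _write_cell(gen, value)
        gen.endElement("row")
    gen.endElement("sheetData")
    gen.endElement("worksheet")
    gen.endDocument()
```

Passing a generator as `rows` and a file opened for writing as `out` keeps memory use roughly constant regardless of the row count.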


The XLSX format consists of a number of XML files that have been zipped. If the format of the output will not be changing, it would be trivial to use an existing file as a template and simply add rows to it as necessary. Unfortunately ZipFile.writestr doesn't allow you to write the file in pieces, so you'll have to write the entire XML to a temporary file then place that into the zip with ZipFile.write.
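
A minimal sketch of that template approach, using only the standard library: the sheet XML is written to a temporary file row by row, every other member of the template is copied over unchanged, and the sheet is then added with `ZipFile.write`. The function name `fill_template` and the member path `xl/worksheets/sheet1.xml` are assumptions for illustration; check the actual path inside your own template. All values are written as inline strings here, so any formatting in the template's original sheet is not preserved.

```python
import os
import tempfile
import zipfile
from xml.sax.saxutils import escape

# Assumed location of the worksheet inside the template archive.
SHEET_MEMBER = "xl/worksheets/sheet1.xml"
SHEET_HEAD = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
)
SHEET_TAIL = "</sheetData></worksheet>"


def fill_template(template_path, output_path, rows):
    """Copy the template to output_path, replacing the sheet with `rows`."""
    fd, tmp_path = tempfile.mkstemp(suffix=".xml")
    try:
        # Step 1: stream the sheet XML to a temporary file, one row at a time.
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(SHEET_HEAD)
            for values in rows:
                cells = "".join(
                    '<c t="inlineStr"><is><t>%s</t></is></c>' % escape(str(v))
                    for v in values
                )
                tmp.write("<row>%s</row>" % cells)
            tmp.write(SHEET_TAIL)

        # Step 2: copy the other template members, then add the new sheet.
        with zipfile.ZipFile(template_path) as src, zipfile.ZipFile(
            output_path, "w", zipfile.ZIP_DEFLATED
        ) as dst:
            for item in src.infolist():
                if item.filename != SHEET_MEMBER:
                    dst.writestr(item, src.read(item.filename))
            dst.write(tmp_path, SHEET_MEMBER)
    finally:
        os.remove(tmp_path)
```

On Python 3.6 and later, `ZipFile.open(name, mode="w")` returns a writable file object, so the temporary file could probably be skipped by writing the rows straight into the archive.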
