I'm trying to find a proper way to transform HTML tables into plain-text tables. Does anyone know of a good tool, either payware or freeware? Preferably in .NET (C#). I've looked into doing it via HTML Agility Pack, but maybe there are better ways? Using an HTML parser would still leave a lot of the complexity, e.g. calculating column widths and table widths with different colspans and so on...
Thanks!
Here is an example: http://www.w3.org/TR/html401/struct/tables.html#h-11.5
I'm actually working with financial tables that have more varied colspans, but the example shows what I want to achieve. It must also be possible to limit the width of the table.
Take a look at the source code to Links. It is a text-based web browser, so it knows how to render tables as text. It's written in C, not C#, but it should be enough for you to figure out a mapping algorithm.
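To make the mapping idea more concrete, here is a minimal sketch of one possible colspan-aware width calculation. It is not the algorithm Links uses. It assumes the cells have already been extracted from the HTML (for example with HTML Agility Pack) into rows of (text, colspan) pairs. The names SEP, column_widths and render are hypothetical, rowspan is ignored, and it is written in Python only to keep it short; the logic should port to C# directly.

```python
# Hypothetical column separator; any string works.
SEP = " | "

def column_widths(rows):
    """rows: list of rows, each a list of (text, colspan) tuples."""
    ncols = max(sum(span for _, span in row) for row in rows)
    widths = [0] * ncols
    # Pass 1: cells that occupy a single column set the minimum widths.
    for row in rows:
        col = 0
        for text, span in row:
            if span == 1:
                widths[col] = max(widths[col], len(text))
            col += span
    # Pass 2: a spanning cell may use the separators between the columns
    # it covers; if it still does not fit, spread the deficit evenly.
    for row in rows:
        col = 0
        for text, span in row:
            if span > 1:
                avail = sum(widths[col:col + span]) + len(SEP) * (span - 1)
                deficit = len(text) - avail
                if deficit > 0:
                    base, extra = divmod(deficit, span)
                    for i in range(span):
                        widths[col + i] += base + (1 if i < extra else 0)
            col += span
    return widths

def render(rows):
    widths = column_widths(rows)
    lines = []
    for row in rows:
        col = 0
        parts = []
        for text, span in row:
            width = sum(widths[col:col + span]) + len(SEP) * (span - 1)
            parts.append(text.ljust(width))
            col += span
        # Pad rows that have fewer columns than the widest row.
        while col < len(widths):
            parts.append(" " * widths[col])
            col += 1
        lines.append(SEP.join(parts).rstrip())
    return "\n".join(lines)

# Made-up sample data in the spirit of a financial table.
table = [
    [("", 1), ("Q1", 2), ("Q2", 2)],
    [("Item", 1), ("Rev", 1), ("Cost", 1), ("Rev", 1), ("Cost", 1)],
    [("Widgets", 1), ("10", 1), ("4", 1), ("12", 1), ("5", 1)],
]
print(render(table))
```

Limiting the overall table width is not covered here. One way to add it would be to shrink the widest columns until the total fits and then wrap the cell text onto multiple lines, which makes each table row several output lines tall.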