HTML table to CSV, formatting problem in CSV

Developer https://www.devze.com 2023-03-06 15:44 Source: web

I am stuck on how to create a proper CSV from an HTML table. I am using HTMLAgilityPack to read the HTML from a string and create an HtmlDocument. Then I am using XPath to loop through the rows and columns.

The problem is that I am unable to determine the correct grid position (x, y) for a particular cell.

Example HTML:

<html>
<body>
    <table border="1">
        <tr>
            <td rowspan="2">
                100
            </td>
            <td>
                200
            </td>
            <td colspan="2">
                300
            </td>
        </tr>
        <tr>
            <td colspan="2">
                400
            </td>
            <td>
                600
            </td>
        </tr>
        <tr>
            <td>
                400
            </td>
            <td>
                500
            </td>
            <td>
                600
            </td>
        </tr>
    </table>
</body>
</html>

When I open it in Excel and save as CSV, I do get the desired output, which is:

100,200,300,
,400,,600
400,500,600,

Can someone help me create the same output in .NET, respecting the rowspan and colspan?

Thanks! Dex

You don't need to know which row and column you are on. All you need to do is add a "," for each new column you find and a line break every time you reach the end of a row.

If you navigate through the document as if it were an XML document, all you have to do is go through all the TR nodes, adding a line break when you reach the end of each one's child node list, and iterate through all the TD nodes in each TR node, adding a "," where necessary.
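A rough sketch of that traversal, with one addition. For the sample table, emitting only commas and line breaks would give `400,600` for the second row, because the slot covered by `rowspan="2"` and the extra column from `colspan="2"` never appear as nodes. The sketch therefore also records which grid slots earlier spans have claimed and pads them with empty fields. It is written in Python with only the standard library so it runs as-is; `TableCells` and `table_to_csv` are made-up names, not part of HTMLAgilityPack, and it assumes a single non-nested table with numeric span attributes. The same loop should port to C# over the `tr`/`td` nodes.

```python
import csv
import io
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Hypothetical helper: collect [text, rowspan, colspan] per cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cells per <tr>
        self._cell = None  # cell currently being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            # Assumes span attributes, when present, are valid integers.
            self._cell = ["", int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]
            self.rows[-1].append(self._cell)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0] += data


def table_to_csv(html):
    """Return CSV text for one table, padding slots covered by rowspan/colspan."""
    parser = TableCells()
    parser.feed(html)
    parser.close()

    grid = {}  # (row, col) -> text; slots covered by a span hold ""
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in cells:
            while (r, c) in grid:  # skip slots claimed by an earlier rowspan
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = ""
            grid[(r, c)] = text.strip()
            c += colspan

    width = max((col for _, col in grid), default=-1) + 1
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for r in range(len(parser.rows)):
        # Every row is padded to the full grid width, as in the Excel export.
        writer.writerow([grid.get((r, c), "") for c in range(width)])
    return out.getvalue()


sample = (
    '<table><tr><td rowspan="2">100</td><td>200</td><td colspan="2">300</td></tr>'
    '<tr><td colspan="2">400</td><td>600</td></tr>'
    '<tr><td>400</td><td>500</td><td>600</td></tr></table>'
)
print(table_to_csv(sample), end="")
```

For the table in the question, this prints the same three lines that Excel produced, including the trailing commas on the first and third rows.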