Is there a way to determine whether an instance of org.apache.poi.hwpf.model.ListData belongs to a numbered list or a bulleted list?
I am using Apache POI's org.apache.poi.hwpf.HWPFDocument class to read the contents of a Word document in order to generate HTML. I can identify the list items in the document by checking whether the paragraph I am working with is an instance of org.apache.poi.hwpf.model.ListData. I cannot find a way to determine whether the ListData belongs to a bulleted list or a numbered list.
I think I have found the answer to my own question.
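// aParagraph is a list paragraph; listTables is the document's ListTables
ListEntry aListEntry = (ListEntry) aParagraph;
ListData listData = listTables.getListData(aListEntry.getIlfo());
// Read the number format from the list's level definition
int numberFormat = listData.getLevel(listData.numLevels()).getNumberFormat();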
getNumberFormat() returns 23 for bullet points and 0 for numbered lists. There are probably other format numbers that also stand for bullets or for numbering, but at least I can now tell the two apart!
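Since the goal is HTML output, the check above can be wrapped in a small helper. This is only a sketch: the class name ListKind and the method htmlListTag are made up for illustration, and it assumes that every value other than 23 should be rendered as a numbered list, which may not hold for every document.

```java
class ListKind {
    // 23 is the number format reported for bullets in this thread.
    static final int NFC_BULLET = 23;

    // Assumption: anything that is not a bullet is treated as numbered.
    static String htmlListTag(int numberFormat) {
        return numberFormat == NFC_BULLET ? "ul" : "ol";
    }

    public static void main(String[] args) {
        System.out.println(htmlListTag(23)); // prints "ul"
        System.out.println(htmlListTag(0));  // prints "ol"
    }
}
```

The value passed in would be the numberFormat read from the ListData level as shown above.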
I recently posted another way to determine the list type. Unfortunately, it only worked in a few of my tests.
I can now confirm leighgory's way of determining the list type.
import java.io.FileInputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.model.ListData;
import org.apache.poi.hwpf.model.ListLevel;
import org.apache.poi.hwpf.model.ListTables;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class ListTest {
    public static void main(String[] args) {
        String filename = "/some/path/to/ListTest.doc";
        try {
            POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(filename));
            HWPFDocument doc = new HWPFDocument(fs);
            // Get a table of all the lists in this document
            ListTables listtables = doc.getListTables();
            Paragraph para;
            Range range = doc.getRange();
            for (int x = 0; x < range.numParagraphs(); x++) {
                para = range.getParagraph(x);
                // When non-zero, (1-based) index into the pllfo
                // identifying the list to which the paragraph belongs
                if (para.getIlfo() != 0) {
                    // Get the list this paragraph belongs to
                    ListData listdata = listtables.getListData(para.getIlfo());
                    // Now get all the levels for this list
                    ListLevel[] listlevel = listdata.getLevels();
                    // Find the list level info for our paragraph
                    ListLevel level = listlevel[para.getIlvl()];
                    System.out.print("Text: \"" + para.text() + "\"");
                    // List level for this paragraph
                    System.out.print("\tListLevel: " + para.getIlvl());
                    // Additional text associated with list symbols
                    System.out.print("\tgetNumberText: \"" + level.getNumberText() + "\"");
                    // Format value for the style of list symbols
                    System.out.println("\tgetNumberFormat: " + level.getNumberFormat());
                } else {
                    // Not a list paragraph: print an empty line
                    System.out.println();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
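To get from those per-paragraph values to HTML, consecutive list paragraphs can be grouped under one list tag. The sketch below does not touch POI at all: Item is a hypothetical holder for the ilfo, number format and text you would read in the loop above, and toHtml is a made-up helper. It assumes 23 means bullet and everything else means numbered, ignores nesting (ilvl), does not separate two adjacent lists of the same kind, and does not escape HTML in the text.

```java
import java.util.Arrays;
import java.util.List;

class HtmlListSketch {
    /** Hypothetical holder for values read from a Paragraph and its ListLevel. */
    static final class Item {
        final int ilfo;         // 0 means "not a list paragraph"
        final int numberFormat; // value of level.getNumberFormat()
        final String text;

        Item(int ilfo, int numberFormat, String text) {
            this.ilfo = ilfo;
            this.numberFormat = numberFormat;
            this.text = text;
        }
    }

    static final int NFC_BULLET = 23;

    static String toHtml(List<Item> items) {
        StringBuilder html = new StringBuilder();
        String openTag = null; // list tag that is currently open, if any
        for (Item item : items) {
            // Pick the tag this paragraph needs; null for ordinary paragraphs
            String tag = item.ilfo == 0 ? null
                    : (item.numberFormat == NFC_BULLET ? "ul" : "ol");
            // Close the open list when the kind changes or the list ends
            if (openTag != null && !openTag.equals(tag)) {
                html.append("</").append(openTag).append(">");
                openTag = null;
            }
            if (tag == null) {
                html.append("<p>").append(item.text).append("</p>");
            } else {
                if (openTag == null) {
                    html.append("<").append(tag).append(">");
                    openTag = tag;
                }
                html.append("<li>").append(item.text).append("</li>");
            }
        }
        if (openTag != null) {
            html.append("</").append(openTag).append(">");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(0, 0, "Intro"),
                new Item(1, 23, "a"),
                new Item(1, 23, "b"),
                new Item(2, 0, "one"));
        // prints <p>Intro</p><ul><li>a</li><li>b</li></ul><ol><li>one</li></ol>
        System.out.println(toHtml(items));
    }
}
```

Handling nested lists would need a stack of open tags keyed by the paragraph's level, which is left out here to keep the sketch short.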
nfc value   Numbering scheme
15          Single-byte character
16          Kanji numbering 3 (dbnum3)
17          Kanji numbering 4 (dbnum4)
18          Circle numbering (circlenum)
19          Double-byte Arabic numbering
20          46 phonetic double-byte Katakana characters (aiueodbchar)
21          46 phonetic double-byte Katakana characters (irohadbchar)
22          Arabic with leading zero (01, 02, 03, ..., 10, 11)
23          Bullet (no number at all)
24          Korean numbering 2 (ganada)
25          Korean numbering 1 (chosung)
26          Chinese numbering 1 (gb1)
27          Chinese numbering 2 (gb2)
28          Chinese numbering 3 (gb3)
29          Chinese numbering 4 (gb4)
30          Chinese Zodiac numbering 1
31          Chinese Zodiac numbering 2
32          Chinese Zodiac numbering 3
33          Taiwanese double-byte numbering 1
34          Taiwanese double-byte numbering 2
35          Taiwanese double-byte numbering 3
36          Taiwanese double-byte numbering 4
37          Chinese double-byte numbering 1
38          Chinese double-byte numbering 2
39          Chinese double-byte numbering 3
40          Chinese double-byte numbering 4
41          Korean double-byte numbering 1
42          Korean double-byte numbering 2
43          Korean double-byte numbering 3
44          Korean double-byte numbering 4
45          Hebrew non-standard decimal
46          Arabic Alif Ba Tah
47          Hebrew Biblical standard
48          Arabic Abjad style
49          Hindi vowels
50          Hindi consonants
51          Hindi numbers
52          Hindi descriptive (cardinals)
53          Thai letters
54          Thai numbers
55          Thai descriptive (cardinals)
56          Vietnamese descriptive (cardinals)
57          Page number format - # -
58          Lower-case Russian alphabet