I want to create a text file from the PDF below:
http://examples.itextpdf.com/results/part4/chapter16/with_font.pdf
The output should be similar to:
<BaseFont:'WaltDisneyScriptv4.1'; Type:'None'; Size:'60'>iText in Action<End>
By searching, I found out how to list the fonts used in a PDF, but not how to get their size or type (i.e. bold, italic, ...), nor how to relate each font to the text that uses it.
Where different fonts are used, the output should look like this, for example:
<BaseFont:'Courier'; Type:'None'; Size:'45'>iText <End><BaseFont:'WaltDisneyScriptv4.1'; Type:'None'; Size:'60'>in Action<End>
Any assistance is appreciated. Thanks in advance!
Here is some code that I used to find the set of fonts in a PDF.
// Requires iText 5: com.itextpdf.text.pdf.PdfDictionary, com.itextpdf.text.pdf.PdfName, java.util.Map
public static void processResource(Map<String, String> set, PdfDictionary resource)
{
    if (resource == null)
        return;
    // Recurse into XObjects, which may use fonts of their own.
    PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
    if (xobjects != null)
    {
        for (PdfName key : xobjects.getKeys())
        {
            processResource(set, xobjects.getAsDict(key));
        }
    }
    PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
    if (fonts == null)
        return;
    PdfDictionary font;
    for (PdfName key : fonts.getKeys())
    {
        font = fonts.getAsDict(key);
        // Type 3 fonts carry no /BaseFont entry, so skip them instead of failing.
        if (font == null || font.getAsName(PdfName.BASEFONT) == null)
            continue;
        String name = font.getAsName(PdfName.BASEFONT).toString();
        // Subset fonts are named "/ABCDEF+RealName": a six-letter tag, then '+'.
        if (name.length() > 8 && name.charAt(7) == '+')
        {
            name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7));
        }
        else
        {
            // Strip the leading '/' of the PDF name, then describe how the font is embedded.
            name = name.substring(1);
            PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
            if (desc == null)
                name += " nofontdescriptor";
            else if (desc.get(PdfName.FONTFILE) != null)
                name += " (Type 1) embedded";
            else if (desc.get(PdfName.FONTFILE2) != null)
                name += " (TrueType) embedded";
            else if (desc.get(PdfName.FONTFILE3) != null)
                name += " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded";
        }
        // /Name is optional in a font dictionary, so fall back to the resource key.
        PdfName fontName = font.getAsName(PdfName.NAME);
        set.put((fontName != null ? fontName : key).toString(), name);
        // System.err.println(font.getAsName(PdfName.NAME) + " " + name);
    }
}
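You should be able to extend this approach to extract font size information, although the size is generally not stored in the font dictionary itself. If the information is not in the dictionary, you can look at the raw content stream (the PostScript-like page operators) and get the font information from there: the Tf operator names the font resource and gives its size.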
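To make the content-stream idea concrete, here is a minimal sketch in plain Java (no iText calls) that scans an already decoded content stream for `Tf` and `Tj` operators and emits the tagged format from the question. Everything in it is an assumption for illustration: the class `FontRunSketch`, the methods `tagRuns` and `styleOf`, and the map from resource keys such as `F1` to base font names (which you could fill from the code above) are hypothetical names. It only handles simple literal strings shown with `Tj`, ignores `TJ` arrays, nested parentheses, hex strings and font encodings, and reports the raw `Tf` operand, which can differ from the rendered size when the text matrix scales the text. The bold/italic guess is a heuristic on the font name; the font descriptor's /Flags and /ItalicAngle entries are likely a more reliable source.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; these names are hypothetical and not part of iText. */
class FontRunSketch {

    // Matches a font selection such as "/F1 60 Tf" (groups 1 and 2)
    // or a simple literal-string show such as "(text) Tj" (group 3).
    private static final Pattern OPS = Pattern.compile(
        "/(\\S+)\\s+([0-9]*\\.?[0-9]+)\\s+Tf\\b|\\(((?:[^()\\\\]|\\\\.)*)\\)\\s*Tj\\b");

    /** Tags every shown string with the font and size that were current when it was shown. */
    static String tagRuns(String content, Map<String, String> baseFonts) {
        StringBuilder out = new StringBuilder();
        String fontKey = null; // resource key set by the most recent Tf
        String size = null;    // size operand of the most recent Tf, kept as written
        Matcher m = OPS.matcher(content);
        while (m.find()) {
            if (m.group(1) != null) {
                fontKey = m.group(1);
                size = m.group(2);
            } else if (fontKey != null) {
                // Fall back to the resource key when no base font name is known.
                String baseFont = baseFonts.getOrDefault(fontKey, fontKey);
                out.append(String.format("<BaseFont:'%s'; Type:'%s'; Size:'%s'>%s<End>",
                    baseFont, styleOf(baseFont), size, m.group(3)));
            }
        }
        return out.toString();
    }

    /** Guesses the style from the font name alone (a heuristic, not a PDF rule). */
    static String styleOf(String baseFont) {
        String n = baseFont.toLowerCase();
        boolean bold = n.contains("bold");
        boolean italic = n.contains("italic") || n.contains("oblique");
        if (bold && italic) return "BoldItalic";
        if (bold) return "Bold";
        if (italic) return "Italic";
        return "None";
    }

    public static void main(String[] args) {
        Map<String, String> fonts = new HashMap<>();
        fonts.put("F1", "Courier");
        fonts.put("F2", "WaltDisneyScriptv4.1");
        // A hand-written content stream fragment, not taken from the real PDF.
        String content = "BT /F1 45 Tf (iText ) Tj /F2 60 Tf (in Action) Tj ET";
        System.out.println(tagRuns(content, fonts));
    }
}
```

For real documents, iText 5's `com.itextpdf.text.pdf.parser` package (a `RenderListener` receiving `TextRenderInfo` objects) is probably the more robust route, since it decodes the streams and tracks the graphics state for you; the regex above is only meant to show where the font and size information lives.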