I know of several tools/libraries that can do this but I want to know if this is possible with just opening up the file as a text file and looking for a keyword.
Have a look at this: http://www.freevbcode.com/ShowCode.asp?ID=8153
Edit: that code did not work for me; it may be too old.
Found this:
using System.IO;
using System.Text.RegularExpressions;

public static int GetNoOfPagesPDF(string FileName)
{
    // Read the whole file as text; the page markers are plain ASCII even in a mostly binary PDF.
    string pdfText;
    using (StreamReader r = new StreamReader(new FileStream(FileName, FileMode.Open, FileAccess.Read)))
    {
        pdfText = r.ReadToEnd();
    }
    // Count "/Type /Page" entries; [^s] skips the "/Type /Pages" tree nodes.
    Regex regx = new Regex(@"/Type\s*/Page[^s]");
    return regx.Matches(pdfText).Count;
}
PS: Tested, and it works. See the original source.
[Edit: based on the edited question]
It is possible by reading the PDF as a text file and doing some minimal parsing.
If you read the PDF yourself, you will need to do the parsing yourself as well. Each page in a PDF is represented by a page object.
The following link gives a short explanation of how the PDF specification describes pages, along with a link to the PDF spec itself:
- http://help.4xpdf.com/questions/8/how-to-programmatically-count-the-number-of-pages-in-a-pdf
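To make the "one page object per page" idea concrete, here is a minimal sketch of the keyword approach. The class and method names (PdfPageCounter, CountPageObjects, CountPageObjectsInFile) are made up for illustration. It assumes the page dictionaries are stored uncompressed in the file, which is not guaranteed.

```csharp
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class PdfPageCounter
{
    // "/Type /Page" not followed by "s", so "/Type /Pages" tree nodes are skipped.
    private static readonly Regex PageObject = new Regex(@"/Type\s*/Page(?!s)");

    // Counts page objects in PDF content that has already been decoded to a string.
    public static int CountPageObjects(string pdfText)
    {
        return PageObject.Matches(pdfText).Count;
    }

    // Reads the file byte-for-byte (ISO-8859-1 maps each byte to exactly one char),
    // so binary stream data cannot trip up the text decoding.
    public static int CountPageObjectsInFile(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        string pdfText = Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
        return CountPageObjects(pdfText);
    }
}
```

Treat the result as a heuristic. PDFs that keep their objects in compressed object streams hide the page dictionaries from a plain text search, so the count can come out too low or even zero for such files.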
The xpdf utilities package (called xpdf-utils in Debian) includes an application called pdfinfo. It prints the number of pages in the file, among other data.
http://www.linuxquestions.org/questions/programming-9/how-to-find-pdf-page-count-699113/
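If you want to call pdfinfo from C#, something along these lines should work. This is only a sketch: it assumes pdfinfo is installed and on the PATH, and that its output contains a line starting with "Pages:". The class and method names (PdfInfoPageCount, ParsePageCount, RunPdfInfo) are hypothetical.

```csharp
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class PdfInfoPageCount
{
    // Extracts the number from the "Pages:" line of pdfinfo's output; returns -1 if absent.
    public static int ParsePageCount(string pdfinfoOutput)
    {
        Match m = Regex.Match(pdfinfoOutput, @"^Pages:\s+(\d+)", RegexOptions.Multiline);
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }

    // Runs "pdfinfo <path>" and parses its standard output.
    // Assumption: the pdfinfo executable can be found on the PATH.
    public static int RunPdfInfo(string path)
    {
        ProcessStartInfo psi = new ProcessStartInfo
        {
            FileName = "pdfinfo",
            Arguments = "\"" + path + "\"",
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (Process p = Process.Start(psi))
        {
            string output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
            return ParsePageCount(output);
        }
    }
}
```

Keeping the parsing separate from the process call means you can check the parsing against captured output without having pdfinfo installed.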