I can see there are a lot of questions about getting the number of pages in a PDF with C, PHP and others, but I am wondering whether there is a simple way of getting the number of pages with a batch file or cmd?
Using pdftk:
pdftk my.pdf dump_data | grep NumberOfPages
does the trick.
Alternatively, you can use the following command, which returns only the number:
pdfinfo "${PDFFILE}" | grep Pages | sed 's/[^0-9]*//'
You will need the poppler package:
https://poppler.freedesktop.org/
Which can be installed using either homebrew / linuxbrew:
brew install poppler
or with apt:
sudo apt install poppler-utils
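Note that grep Pages matches any line containing that word, so a document title such as "Yellow Pages" would produce an extra line of output. A minimal sketch of a stricter filter, assuming pdfinfo prints the count on a line whose first field is exactly "Pages:"; the function name pdfinfo_pages is made up for illustration:

```shell
# Hypothetical helper: reads pdfinfo output on stdin and prints the page count.
pdfinfo_pages() {
    # Match only the line whose first field is exactly "Pages:".
    awk '$1 == "Pages:" { print $2; exit }'
}

# Usage (needs pdfinfo from poppler):
#   pdfinfo "${PDFFILE}" | pdfinfo_pages
```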
QPDF is a lightweight alternative to PDFtk (requiring Java runtime) and pdfinfo (quite a dumb tool).
qpdf --show-npages file.pdf
It prints just the number of pages, needing no post-processing.
Packages for most Linux distributions exist, usually named just qpdf. Pages like Softpedia host binaries for Windows. The source code can be downloaded from SourceForge, or from the official GitHub repository.
The --show-npages option was added in a version after 4.1.0, in commit 91367239fd55f7c4996ed6158405ea10573ae3cb. To be compatible with version 4.1.0 and earlier, you can dump basic information about each page and count the pages. In Linux and OS X:
qpdf --show-pages file.pdf | grep -c ^page
On Windows, you should use findstr and find instead:
qpdf --show-pages file.pdf | findstr /b page | find /c /v ""
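The direct option and the counting fallback can be combined so that one script works with either qpdf version. A rough sketch for a POSIX shell, assuming an older qpdf exits with a non-zero status when it does not recognise --show-npages; the function name qpdf_pages is invented here:

```shell
# Hypothetical helper: print the page count of the PDF given as $1.
qpdf_pages() {
    # Try the direct option first; hide the error text if it is unsupported.
    if n=$(qpdf --show-npages "$1" 2>/dev/null); then
        printf '%s\n' "$n"
    else
        # Fallback: count the "page N: ..." lines of the per-page listing.
        qpdf --show-pages "$1" | grep -c '^page'
    fi
}

# Usage:
#   qpdf_pages file.pdf
```

Trying the newer option first keeps the common case to a single qpdf call.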
Without any external tools (save the script below as a .bat file):
@if (@X)==(@Y) @end /* JScript comment
@echo off
cscript //E:JScript //nologo "%~f0" %*
exit /b 0
@if (@X)==(@Y) @end JScript comment */

var args = WScript.Arguments;
var filename = args.Item(0);
var fSize = 0;
var inTag = false;
var tempString = "";
var pages = "";

// Read the whole file as Latin-1 text and return it as an array of characters.
function getChars(fPath) {
    var ado = WScript.CreateObject("ADODB.Stream");
    ado.Type = 2; // adTypeText = 2
    ado.CharSet = "iso-8859-1";
    ado.Open();
    ado.LoadFromFile(fPath);
    var fs = new ActiveXObject("Scripting.FileSystemObject");
    fSize = (fs.getFile(fPath)).size;
    var fBytes = ado.ReadText(fSize);
    var fChars = fBytes.split('');
    ado.Close();
    return fChars;
}

// Given the text of one << ... >> dictionary, store its /Count value in the
// global "pages" if it has /Count, /Type and /Pages but no /Parent
// (that is, it looks like the root node of the page tree).
function checkTag(tempString) {
    if (tempString.length == 0) {
        return;
    }
    if (tempString.toLowerCase().indexOf("/count") == -1) {
        return;
    }
    if (tempString.toLowerCase().indexOf("/type") == -1) {
        return;
    }
    if (tempString.toLowerCase().indexOf("/pages") == -1) {
        return;
    }
    if (tempString.toLowerCase().indexOf("/parent") > -1) {
        return;
    }
    var elements = tempString.split("/");
    // "var i" keeps this counter local so it cannot clobber the loop in getPages.
    for (var i = 0; i < elements.length; i++) {
        if (elements[i].toLowerCase().indexOf("count") > -1) {
            pages = elements[i].split(" ")[1];
        }
    }
}

// Scan the file character by character, collecting the text between << and >>,
// and stop at the first dictionary that checkTag accepts.
function getPages(fPath) {
    var fChars = getChars(fPath);
    for (var i = 0; i < fSize - 1; i++) {
        if (fChars[i] == "<" && fChars[i + 1] == "<") {
            inTag = true;
            continue;
        }
        if (inTag && fChars[i] == "<") {
            continue;
        }
        if (inTag &&
            fChars[i] == ">" &&
            fChars[i + 1] == ">") {
            inTag = false;
            checkTag(tempString);
            if (pages != "") {
                return;
            }
            tempString = "";
        }
        if (inTag) {
            if (fChars[i] != '\n' && fChars[i] != '\r') {
                tempString += fChars[i];
            }
        }
    }
}

getPages(filename);
// Fall back to 1 if no matching dictionary was found.
if (pages == "") {
    WScript.Echo("1");
} else {
    WScript.Echo(pages);
}
It takes the path to the .pdf file and simply prints the number of pages. It is not very fast, as it reads the PDF character by character, but it could be optimized.
Because you asked for a "batch file", I have to assume you only want a Windows-based solution. But, just in case Mac OS X is an option, here is something that could be useful. If you have the PDFs on a Mac, on a drive that has been indexed by Spotlight (the default), the following command will return the number of pages using no external dependencies:
mdls -name kMDItemNumberOfPages POSIX_PATH_OF_PDF_FILE
Source: MacScripter.net - http://macscripter.net/viewtopic.php?id=32381
This might be helpful for new users. In newer versions of the PDFtk tool (above 2.0), use the command below to get the number of pages of a PDF file:
pdftk file.pdf dump_data_annots output outputfile.txt
A new file will be created at the destination, with content similar to the following:
NumberOfPages: 6
Now read the file and manipulate the content as you want.
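For example, on a system with a POSIX shell the number could be pulled out of that file roughly like this. It is only a sketch, assuming the file contains a line of the form "NumberOfPages: 6" as shown above; the function name pages_from_dump is hypothetical:

```shell
# Hypothetical helper: print the value of the first NumberOfPages line in file $1.
pages_from_dump() {
    # Strip the label and print only the lines where the substitution matched.
    sed -n 's/^NumberOfPages: *//p' "$1" | head -n 1
}

# Usage:
#   pages_from_dump outputfile.txt
```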
I know this is an old post, yet it is still very much relevant, so I believe there should be an answer that shows how to get the page count using the "poppler-0.68.0" utility on Windows.
Navigate to the bin folder and run pdfinfo.exe like this:
C:\Temp\temp_folder\poppler-0.68.0\bin>pdfinfo.exe "C:\Temp\temp_folder\TT.pdf"
A simple way if ImageMagick or GraphicsMagick is installed:
identify *.pdf | wc -l
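Note that identify prints one line per page, so with a glob such as *.pdf the count is the total across all matching files. A small sketch for a single file, assuming identify is on the PATH (for GraphicsMagick the command is gm identify); the function name identify_pages is invented for illustration:

```shell
# Hypothetical helper: print the page count of the single PDF given as $1.
identify_pages() {
    # One output line per page; tr removes the padding some wc versions add.
    identify "$1" | wc -l | tr -d '[:space:]'
}

# Usage:
#   identify_pages file.pdf
```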