Is there any available solution for 开发者_如何学运维(re-)generating PHP code from the Parser Tokens returned by token_get_all
? Other solutions for generating PHP code are welcome as well, preferably with the associated lexer/parser (if any).
From my comment:
Does anyone see a potential problem, if I simply write a large switch statement to convert tokens back to their string representations (i.e. T_DO to 'do'), map that over the tokens, join with spaces, and look for some sort of PHP code pretty-printing solution?
After some looking, I found a PHP homemade solution in this question, that actually uses the PHP Tokenizer interface, as well as some PHP code formatting tools which are more configurable (but would require the solution as described above).
These could be used to quickly realize a solution. I'll post back here when I find some time to cook this up.
Solution with PHP_Beautifier
This is the quick solution I cooked up, I'll leave it here as part of the question. Note that it requires you to break open the PHP_Beautifier class, by changing everything (probably not everything, but this is easier) that is private to protected, to allow you to actually use the internal workings of PHP_Beautifier (otherwise it was impossible to reuse the functionality of PHP_Beautifier without reimplementing half their code).
An example usage of the class would be:
file: main.php
<?php
// read some PHP code (the file itself will do)
$phpCode = file_get_contents(__FILE__);
// create a new instance of PHP2PHP
$php2php = new PHP2PHP();
// tokenize the code (forwards to token_get_all)
$phpCode = $php2php->php2token($phpCode);
// print the tokens, in some way
echo join(' ', array_map(function($token) {
return (is_array($token))
? ($token[0] === T_WHITESPACE)
? ($token[1] === "\n")
? "\n"
: ''
: token_name($token[0])
: $token;
}, $phpCode));
// transform the tokens back into legible PHP code
$phpCode = $php2php->token2php($phpCode);
?>
As PHP2PHP extends PHP_Beautifier, it allows for the same fine-tuning under the same API that PHP_Beautifier uses. The class itself is:
file: PHP2PHP.php
class PHP2PHP extends PHP_Beautifier {
function php2token($phpCode) {
return token_get_all($phpCode);
}
function token2php(array $phpToken) {
// prepare properties
$this->resetProperties();
$this->aTokens = $phpToken;
$iTotal = count($this->aTokens);
$iPrevAssoc = false;
// send a signal to the filter, announcing the init of the processing of a file
foreach($this->aFilters as $oFilter)
$oFilter->preProcess();
for ($this->iCount = 0;
$this->iCount < $iTotal;
$this->iCount++) {
$aCurrentToken = $this->aTokens[$this->iCount];
if (is_string($aCurrentToken))
$aCurrentToken = array(
0 => $aCurrentToken,
1 => $aCurrentToken
);
// ArrayNested->off();
$sTextLog = PHP_Beautifier_Common::wsToString($aCurrentToken[1]);
// ArrayNested->on();
$sTokenName = (is_numeric($aCurrentToken[0])) ? token_name($aCurrentToken[0]) : '';
$this->oLog->log("Token:" . $sTokenName . "[" . $sTextLog . "]", PEAR_LOG_DEBUG);
$this->controlToken($aCurrentToken);
$iFirstOut = count($this->aOut); //5
$bError = false;
$this->aCurrentToken = $aCurrentToken;
if ($this->bBeautify) {
foreach($this->aFilters as $oFilter) {
$bError = true;
if ($oFilter->handleToken($this->aCurrentToken) !== FALSE) {
$this->oLog->log('Filter:' . $oFilter->getName() , PEAR_LOG_DEBUG);
$bError = false;
break;
}
}
} else {
$this->add($aCurrentToken[1]);
}
$this->controlTokenPost($aCurrentToken);
$iLastOut = count($this->aOut);
// set the assoc
if (($iLastOut-$iFirstOut) > 0) {
$this->aAssocs[$this->iCount] = array(
'offset' => $iFirstOut
);
if ($iPrevAssoc !== FALSE)
$this->aAssocs[$iPrevAssoc]['length'] = $iFirstOut-$this->aAssocs[$iPrevAssoc]['offset'];
$iPrevAssoc = $this->iCount;
}
if ($bError)
throw new Exception("Can'process token: " . var_dump($aCurrentToken));
} // ~for
// generate the last assoc
if (count($this->aOut) == 0)
throw new Exception("Nothing on output!");
$this->aAssocs[$iPrevAssoc]['length'] = (count($this->aOut) -1) - $this->aAssocs[$iPrevAssoc]['offset'];
// post-processing
foreach($this->aFilters as $oFilter)
$oFilter->postProcess();
return $this->get();
}
}
?>
In the category of "other solutions", you could try PHP Parser.
The parser turns PHP source code into an abstract syntax tree....Additionally, you can convert a syntax tree back to PHP code.
If I'm not mistaken http://pear.php.net/package/PHP_Beautifier uses token_get_all() and then rewrites the stream. It uses heaps of methods like t_else
and t_close_brace
to output each token. Maybe you can hijack this for simplicity.
See our PHP Front End. It is a full PHP parser, automatically building ASTs, and a matching prettyprinter that regenerates compilable PHP code complete with the original commments. (EDIT 12/2011: See this SO answer for more details on what it takes to prettyprint from ASTs, which are just an organized version of the tokens: https://stackoverflow.com/a/5834775/120163)
The front end is built on top of our DMS Software Reengineering Toolkit, enabling the analysis and transformation of PHP ASTs (and then via the prettyprinter code).
精彩评论