I have a single PHP file within a legacy project that is at least a few thousand lines long. It is predominantly divided into a number of conditional blocks by a switch statement with about 10 cases. Within each case there is what appears to be a very similar, if not exactly duplicated, block of code. What methods are available for identifying these blocks as being the same, or close to the same, so I can abstract that code out and begin to refactor the entire file? I know this is possible in very manual terms (separate each case statement into its own file and diff them), but I'm interested in what tools I could be using to speed this process up.
Thanks.
You can use phpcpd.
phpcpd is a Copy/Paste Detector (CPD) for PHP code. It scans a PHP project for duplicated code.
Further resources:
- http://qualityassuranceinphpprojects.com/pages/tools.html
You can use phpunit PMD (Project Mess Detector) to detect duplicated blocks of code.
It can also compute the cyclomatic complexity of your code.
Here is a screenshot of the PMD tab in phpuc (phpUnderControl): [screenshot not available]
See our PHP Clone Detector tool.
This finds both exact copies and near misses, in spite of reformatting, insertion/deletion of comments, replacement of variable names, addition/replacement of sub-blocks, etc.
PHPCPD, as far as I can tell, finds only (token) sequences which are exactly the same. That misses a lot of clones, since the most common operation after copy-paste is edit-to-customize. So it would miss the very clones the OP is trying to find.
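To illustrate the point about edit-to-customize, here is a minimal Python sketch of why an exact comparison fails on a renamed copy, and how a crude normalization step recovers the match. This is not how PHPCPD or any real clone detector is implemented; `normalize_php` is a made-up helper, and the regexes are an assumption that ignores string literals (a `//` or `#` inside a string would be treated as a comment).

```python
import re


def normalize_php(snippet: str) -> str:
    """Crude normalization: drop comments, rename variables, collapse whitespace."""
    # Remove /* ... */ block comments, then // and # line comments.
    # Assumption: comment markers never appear inside string literals.
    snippet = re.sub(r"/\*.*?\*/", "", snippet, flags=re.DOTALL)
    snippet = re.sub(r"(//|#).*", "", snippet)

    # Replace each distinct $variable with a positional placeholder
    # ($v0, $v1, ...) in order of first appearance.
    names = {}

    def rename(match):
        return names.setdefault(match.group(0), f"$v{len(names)}")

    snippet = re.sub(r"\$[A-Za-z_]\w*", rename, snippet)

    # Collapse all runs of whitespace so reformatting does not matter.
    return " ".join(snippet.split())


original = "$total = 0; // running sum\nforeach ($items as $item) { $total += $item; }"
edited = "$sum = 0;\nforeach ($rows   as $row) {\n  $sum += $row; /* add */ }"

print(original == edited)                                # exact comparison
print(normalize_php(original) == normalize_php(edited))  # after normalization
```

The two snippets differ as raw text but normalize to the same string. It only handles renames, comments, and layout; inserted or replaced sub-blocks need a tree-based or fuzzy comparison.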
You could put the blocks in separate files and just run diff on them?
However, I think in the end you will need to go through everything manually anyway, since it sounds like this code requires a lot of refactoring, and even if there are differences you will probably need to evaluate whether this is intentional or a bug.
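If you do go the split-and-diff route, the splitting and the pairwise comparison can at least be scripted rather than done by hand. A rough Python sketch using the standard `difflib` module, assuming each `case` label sits on its own line and the switch contains no nested switch statements; `split_cases` and `similarity_report` are hypothetical helper names, and the 0.8 threshold is an arbitrary starting point.

```python
import difflib
import itertools
import re


def split_cases(php_source: str) -> dict:
    """Return {case label: body text} using a naive split on `case ...:` lines."""
    # Assumption: every case label is alone on its line; nested switches
    # would be split too, so this is only suitable for a flat switch.
    parts = re.split(r"^[ \t]*case\s+(.+?):[ \t]*$", php_source, flags=re.MULTILINE)
    # parts = [text before first case, label1, body1, label2, body2, ...]
    return dict(zip(parts[1::2], parts[2::2]))


def body_lines(body: str) -> list:
    """Strip indentation and blank lines so layout differences are ignored."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def similarity_report(cases: dict, threshold: float = 0.8) -> list:
    """List (label_a, label_b, ratio) for every pair at or above the threshold."""
    report = []
    for (label_a, a), (label_b, b) in itertools.combinations(cases.items(), 2):
        matcher = difflib.SequenceMatcher(None, body_lines(a), body_lines(b))
        ratio = matcher.ratio()
        if ratio >= threshold:
            report.append((label_a, label_b, round(ratio, 2)))
    return report


if __name__ == "__main__":
    import sys

    # Usage: python compare_cases.py legacy_file.php
    with open(sys.argv[1], encoding="utf-8") as handle:
        for pair in similarity_report(split_cases(handle.read())):
            print(pair)
```

For any pair it flags, `difflib.unified_diff` on the two bodies shows exactly which lines differ, which is the part you still have to judge by hand as intentional or a bug.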