Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I am sorry - C++ source code can be seen as the implementation of a design, and by reverse-engineering I mean getting the design back. It seems most of you have read it as getting C++ source from binaries. I have posted a more precise question at Understanding a C++ codebase by generating UML - tools&methology
I think there are many tools that can reverse-engineer C++ (source code), but usually it is not so easy to make sense of what you get out.
Has anybody found a good methodology?
One of the things I might want to see, for example, is the GUI layer and how it is separated (or not) from the rest. I think the tools should somehow detect packages and then let me organize them manually.
To my knowledge, there are no reliable tools that can reverse-engineer compiled C++.
Moreover, I think it should be near impossible to construct such a tool. A compiled C++ program becomes nothing more than machine language instructions. In order to know how that's mapped to C++ constructs, you need to know the compiler, compiler settings, libraries included, etc., ad infinitum.
Why do you want such a thing? Depending on what you want it for, there may be other ways to accomplish what you're really after.
While it isn't a complete solution, you should look into IDA Pro and Hex-Rays.
It is more for "reverse engineering" in the traditional sense of the phrase. As in, it will give you a good enough idea of what the code would look like in a C-like language, but will not (cannot) provide fully functioning source code.
What it is good for is getting a good understanding of how a particular segment (usually a function) works. It is "user assisted", meaning that it will often show a lot of dereferences of offsets where there is really a struct or class. At that point, you can supply the decompiler with a struct definition (classes are really just structs with extra things like v-tables and such) and it will reanalyze the code with the new type information.
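To make the "dereferences of offsets" point concrete, here is a rough sketch of the same function before and after the type is known. The `Widget` struct and both function names are invented for illustration, and the untyped version only imitates the general shape of decompiler output, not what IDA Pro or Hex-Rays would literally print.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical type, made up for this example.
struct Widget {
    std::int32_t id;      // expected at byte offset 0
    std::int32_t width;   // expected at byte offset 4
    std::int32_t height;  // expected at byte offset 8
};

// The layout assumptions the untyped version relies on.
static_assert(offsetof(Widget, width) == 4, "unexpected layout");
static_assert(offsetof(Widget, height) == 8, "unexpected layout");

// Roughly the shape of the code before the type is known:
// an opaque pointer and hard-coded byte offsets.
std::int32_t area_untyped(const void* a1) {
    assert(a1 != nullptr);
    const unsigned char* p = static_cast<const unsigned char*>(a1);
    std::int32_t v1, v2;
    std::memcpy(&v1, p + 4, sizeof v1);  // "field at +4"
    std::memcpy(&v2, p + 8, sizeof v2);  // "field at +8"
    return v1 * v2;
}

// The same logic once a struct definition has been supplied.
std::int32_t area_typed(const Widget* w) {
    return w->width * w->height;
}
```

Both functions read the same bytes. Only the second one tells a human reader what those bytes mean, and that is the information the struct definition adds.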
Like I said, it isn't perfect, but if you want to do "reverse engineering" it is the best solution I am aware of. If you want full "decompilation" then you are pretty much out of luck.
You can pull control flow out of the disassembly, but you will never get data types back...
There are only integers (and maybe some shorts) in assembly. Think about objects, arrays, structs, strings, and pointer arithmetic all being the same type!
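A small sketch of why the types disappear: array indexing, struct member access, and raw pointer arithmetic can all reduce to "base address plus a byte offset". The `Pair` struct and helper names are invented for illustration, and the integer comparison assumes a flat address space, as on mainstream desktop platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type, made up for this example.
struct Pair {
    std::int32_t first;
    std::int32_t second;
};

static_assert(offsetof(Pair, second) == sizeof(std::int32_t),
              "unexpected layout");

// Address of the second array element, as a plain integer.
std::uintptr_t addr_of_element(const std::int32_t* a) {
    return reinterpret_cast<std::uintptr_t>(&a[1]);
}

// Address of the second struct member, as a plain integer.
std::uintptr_t addr_of_member(const Pair* p) {
    return reinterpret_cast<std::uintptr_t>(&p->second);
}

// What both of the above amount to at machine level: base + offset.
// Assumes pointer-to-integer conversion preserves byte distances.
std::uintptr_t addr_by_offset(const void* base) {
    assert(base != nullptr);
    return reinterpret_cast<std::uintptr_t>(base) + sizeof(std::int32_t);
}
```

Given only the last form, there is no way to tell whether the original source indexed an array or accessed a struct member.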
The OovAide project at http://sourceforge.net/projects/oovaide/ or on GitHub has a few features that may help. It uses the Clang compiler for retrieving accurate information from the source code. It scans the directories looking for source code, and collects the information into a smaller dataset that contains the information needed for analysis.
One concept is called Zone Diagrams. A zone diagram shows relationships between classes at a very high level, since each class is shown as a dot on the diagram, with relationship lines connecting the dots. This allows the diagrams to show hundreds or thousands of classes. The OovAide zone diagram display has an option called "Show Child Zones", which groups the classes that are within the same directories closer to each other. There are also directory filters, which allow reducing the number of classes shown on a diagram for very large projects. An example of zone diagrams and how they work is shown here: http://oovaide.sourceforge.net/articles/ZoneDiagrams.html
If the directories are assigned component types in the build settings, then the component diagram will show the dependencies between components. This even shows which components are dependent on external components such as GTK, or other external libraries.
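The idea of lifting class relationships up to component dependencies can be sketched in a few lines. This is not how OovAide is implemented. It is a minimal illustration, and the type aliases, function names, and the "gui"/"core" sample data are all made up.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using ClassName = std::string;
using Component = std::string;
using Edge = std::pair<Component, Component>;  // (from, to)

// Derive component-level dependencies from class-level "uses" relations.
// `owner` maps each class to the component (e.g. directory) it lives in.
std::set<Edge> component_deps(
    const std::map<ClassName, Component>& owner,
    const std::vector<std::pair<ClassName, ClassName>>& class_uses) {
    std::set<Edge> deps;
    for (const auto& use : class_uses) {
        const auto from = owner.find(use.first);
        const auto to = owner.find(use.second);
        if (from == owner.end() || to == owner.end()) {
            continue;  // class not assigned to any component: skip it
        }
        if (from->second != to->second) {
            deps.insert(Edge(from->second, to->second));
        }
    }
    return deps;
}

// Example with invented classes: a GUI layer and a core layer.
void example_usage() {
    std::map<ClassName, Component> owner;
    owner["MainWindow"] = "gui";
    owner["Button"] = "gui";
    owner["Document"] = "core";
    owner["Parser"] = "core";

    std::vector<std::pair<ClassName, ClassName>> uses;
    uses.push_back(std::make_pair("MainWindow", "Button"));    // gui -> gui
    uses.push_back(std::make_pair("MainWindow", "Document"));  // gui -> core
    uses.push_back(std::make_pair("Document", "Parser"));      // core -> core
    uses.push_back(std::make_pair("Parser", "MainWindow"));    // core -> gui

    const std::set<Edge> deps = component_deps(owner, uses);
    assert(deps.size() == 2);
    assert(deps.count(Edge("gui", "core")) == 1);
    assert(deps.count(Edge("core", "gui")) == 1);  // core reaches into gui
}
```

An edge such as core to gui is the kind of thing to look for when checking whether the GUI layer is really separated from the rest.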
The next level down shows something like UML class diagrams, but shows all relations instead of just aggregation and inheritance. It can show classes that are used within methods, or classes that are passed as parameters to methods. Any class can be chosen as a starting point; then, before a class is added to the diagram, a list is displayed that shows which classes will be displayed for each relationship type.
The lowest level shows sequence diagrams. This allows navigating up or down the call tree while showing the classes that contain the methods.