This is a dangerous question, so let me try to phrase it correctly. Premature optimization is the root of all evil, but if you know you need it, there is a basic set of rules that should be considered. This set is what I'm wondering about.
For instance, imagine you got a list of a few thousand items. How do you look up an item with a specific, unique ID? Of course, you simply use a Dictionary to map the ID to the item.
And if you know that there is a setting stored in a database that is required all the time, you simply cache it instead of issuing a database request hundred times a second.
Or even something as simple as using a release instead of a debug build in prod.
I guess there are a few even more basic ideas.
I am specifically not looking for "don't do it, for experts: don't do it yet" or "use a profiler" answers, but for really simple, general hints. If you feel this is an argumentative question, you probably misunderstood my intention.
I am also not looking for concrete advice in any of my projects nor any sophisticated low level tricks. Think of it as an overview of how to avoid the most important performance mistakes you made as a very beginner.
Edit: This might be a good description of what I am looking for: Create a presentation (not a practical example) of common optimization rules for people who have a basic technical understanding (let's sa开发者_如何学编程y they got a CS degree) but for some reason never wrote a single line of code. Point out the most important aspects. Pseudocode is fine. Do not assume specific languages or even architectures.
Two rules:
Use the right data structures.
Use the right algorithms.
I think that covers it.
- Minimize the number of network roundtrips
- Minimize the number of harddisk seeks
These are several orders of magnitude slower than anything else your program is likely to do, so avoiding them can be very important indeed. Typical methods to achieve this are:
- Caching
- Increasing the granularity of network and HD accesses
For example, B-Trees are absolutely ubiquitous in DB systems because the reduce the granularity of HD access for on-disk index lookups.
I think something extremely important is to be very carefully on all code that is frequently executed. This is normally the code in critical inner loops.
Rule 1: Know this code
For this code avoid all overhead. Small differences in runtime can make a big impact on the overall performance. E.g. if you implement an image filter a difference of 0.001ms per pixel will make a difference in 1s in the filter runtime on a image with size 1000x1000 (which is not big).
Things to avoid/do in inner loops are:
- don't go through interfaces (e.g DB queries, RPC calls etc)
- don't jump around in the RAM, try to access it linearly
- if you have to read from disk then read large chunks outside the inner loop (paging)
- avoid virtual function calls
- avoid function calls / use inline functions
- use float instead of double if possible
- avoid numerical casts if possible
- use ++a instead of a++ in C++
- iterate directly on pointers if possible
The second general advice: Each layer/interface costs, try to avoid large stacks of different technologies, the system will spend more time in data transformation then in doing the actual job, keep things simple.
And as the others said, use the right algorithm, try to optimize the algorithm complexity first before you optimize the algorithm implementation.
I know you're looking for specific coding hints, but those are easy to find: cacheing, loop unrolling, code hoisting, data & code locality, blah, blah...
The biggest hint of all is don't use them.
Would it help to make this point if I said "This is the secret that the almighty Powers That Be don't want you to know!!"? Pick your Powers: Microsoft, Google, Sun, etc. etc.
Don't Use Them
Until you know, with dead certainty, what the problems are, and then the coding hints are obvious.
Here's an example where many coding tricks were used, but the heart and soul of the exercise is not the coding techniques, but the diagnostic technique.
Are your algorithms correct for the situation or are there better ones available?
精彩评论