开发者

What algorithm is behind STL's find?

开发者 https://www.devze.com 2023-02-18 23:44 出处:网络
I just created a custom finding function for strings in a map. I developed some kind of the linear search algorithm (which I found out later) and was not satisfied with the speed of the function. So I

I just created a custom finding function for strings in a map. I developed some kind of the linear search algorithm (which I found out later) and was not satisfied with the speed of the function. So I searched for a faster function and found map's own function: map::find.

This was incredibly faster than the linear algorithm I was using.

In another example STL's function find was also much faster than another linear function I am using.

But how is this possible? If you use the binary search algorithm you need to sort the map first which would take (hypothetically) more time the bigger your map is.

Also how to find out the algorithms behind those core functions? Is there a list or some kind of 开发者_StackOverflow社区database to find this out?

Thanks for all your answers! I upvoted the best answers and accepted Max Lybbert's answer because it was the most detailed one.

Paul :)


std::map stores its elements in sorted order (almost always in a self-balancing binary search tree).

std::map::find takes advantage of this and uses a dichotomic search.


I developed some kind of the linear search algorithm (which I found out later) and was not satisfied with the speed of the function. So I searched for a faster function and found map's own function: map::find.

This was incredibly faster than the linear algorithm I was using.

std::map is designed to keep data sorted as it is inserted into the container. That's one of its main jobs. It's also the reason you must define some sort of partial ordering for the data you put into a std::map.

This means each insertion takes a little longer than inserting into other containers (inserting into a std::list -- once you have the insertion point -- for instance is O(1), as is appending to a std::vector or appending/prepending to a std::deque). But look up is guaranteed to use binary search (or, rather, to navigate the red-black tree behind the std::map (under "Premature or Prudent Optimization")).

In another example STL's function find was also much faster than another linear function I am using.

But how is this possible? If you use the binary search algorithm you need to sort the map first which would take (hypothetically) more time the bigger your map is.

There's nothing hypothetical about it. Sorting your data takes time, and always takes more time the more items of data.

std::find is able to handle unsorted data, so it must be implemented as a linear search (compare std::binary_search/std::lower_bound). But std::find is allowed to be sneaky and unroll loops, compare more than one item at a time (if the items are small, and especially if they are primitive types that lend themselves to low-level bit fiddling), etc.

Also how to find out the algorithms behind those core functions? Is there a list or some kind of database to find this out?

Personally, I learned a lot of algorithms by reading what was available in the STL and a few other languages. I found it easier to study the containers first.


Technically, there are no such algorithms. The standard defines how well should every algorithm perform, not how it should do it. Each compiler ships an implementation of the standard library.

That being say, there are free implementations of the STL. You could look at their code. For example, STL Port.

Also how to find out the algorithms behind those core functions? Is there a list or some kind of database to find this out?

Well, there is the Dictionary of Algorithms and Data Structures but it's a bit messy.


The implementation is well, implementation dependent.

But as far as the generic complexity classes go, you can check for example this page with an overview for the common STL methods:

http://www.cplusplus.com/reference/stl/


The STL algorithms are almost always faster than anything you could write yourself because it makes a lot of optimizations under the covers. It's also faster to use iterators than operator[] when iterating through a vector or other random access container because there's less overhead.

You should checkout Scott Meyers' books Effective C++ Third Edition and Effective STL. (The material in More Effective C++ is contained within the 3rd Edition of Effective C++.)

0

精彩评论

暂无评论...
验证码 换一张
取 消