C++ String-type independent algorithms

开发者 https://www.devze.com 2023-01-27 16:46 Source: web
I'm trying to derive a technique for writing string-algorithms that is truly independent of the underlying type of string.

Background: the prototypes for GetIndexOf and FindOneOf are either overloaded or templated variations on:

int GetIndexOf(const char * pszInner, const char * pszString);
const char * FindOneOf(const char * pszString, const char * pszSetOfChars);

This issue comes up in the following template function:

// return the index of the first occurrence in str of any char from pszSearchChars, or -1 if there is none
template <typename T>
inline int FindIndexOfOneOf(const T * str, const T * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
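
For reference, here is a minimal sketch of how the two helpers could be implemented for plain char strings. The post only gives their prototypes, so both bodies are assumptions: FindOneOf is assumed to return a null pointer when nothing matches, and GetIndexOf is assumed to turn a pointer into the string back into an index (mapping null to -1).

```cpp
#include <cstring>   // std::strpbrk

// Hypothetical implementations; the original post only shows the prototypes.

// Returns a pointer to the first character of pszString that also appears in
// pszSetOfChars, or a null pointer if there is no such character.
inline const char * FindOneOf(const char * pszString, const char * pszSetOfChars)
{
    return std::strpbrk(pszString, pszSetOfChars);
}

// Converts a pointer into pszString back to an index; a null pszInner maps to -1.
inline int GetIndexOf(const char * pszInner, const char * pszString)
{
    return pszInner ? static_cast<int>(pszInner - pszString) : -1;
}

// The template from the question, unchanged: both arguments decay to const T *.
template <typename T>
inline int FindIndexOfOneOf(const T * str, const T * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
```

With these definitions a call such as FindIndexOfOneOf("hello world", " w") finds the space first. Only the char overloads are sketched here; wchar_t versions would need their own FindOneOf and GetIndexOf overloads.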

Objectives:

1. I would like this code to work for CStringT<>, const char *, const wchar_t * (and should be trivial to extend to std::string)

2. I don't want to pass anything by copy (only by const & or const *)

In an attempt to solve these two objectives, I thought I might be able to use a type-selector of sorts to derive the correct interfaces on the fly:

namespace details {

    template <typename T>
    struct char_type_of
    {
        // typedef T type; error for invalid types (i.e. anything for which there is not a specialization)
    };

    template <>
    struct char_type_of<const char *>
    {
        typedef char type;
    };

    template <>
    struct char_type_of<const wchar_t *>
    {
        typedef wchar_t type;
    };

    template <>
    struct char_type_of<CStringA>
    {
        typedef CStringA::XCHAR type;
    };

    template <>
    struct char_type_of<CStringW>
    {
        typedef CStringW::XCHAR type;
    };

}

#define CHARTYPEOF(T) typename details::char_type_of<T>::type

Which allows:

template <typename T>
inline int FindIndexOfOneOf(T str, const CHARTYPEOF(T) * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}

This should guarantee that the second argument is passed as const *, and should not determine T (rather only the first argument should determine T).

But the problem with this approach is that when str is a CStringT<>, T is deduced as CStringT<> and the argument is passed by value rather than by reference: hence we have an unnecessary copy.

Trying to rewrite the above as:

template <typename T>
inline int FindIndexOfOneOf(T & str, const CHARTYPEOF(T) * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}

Makes it impossible for the compiler (VS2008) to generate a correct instance of FindIndexOfOneOf<> for:

FindIndexOfOneOf(_T("abc"), _T("def"));
    error C2893: Failed to specialize function template 'int FindIndexOfOneOf(T &,const details::char_type_of<T>::type *)'
    With the following template arguments: 'const char [4]'

This is a generic problem I've had with templates since they were introduced (yes, I'm that old): it has been essentially impossible to construct a way to handle both old C-style arrays and newer class-based entities (perhaps best highlighted by const char [4] vs. CStringT<> &).

The STL/std library "solved" this issue (if one can really call it solving) by using pairs of iterators everywhere instead of a reference to the thing itself. I could go this route, except it sucks IMO, and I don't want to litter my code with two arguments everywhere a single, properly handled argument should have sufficed.

Basically, I'm interested in an approach, such as some sort of stringy_traits, that would allow me to write FindIndexOfOneOf<> (and other similar template functions) where the argument is the string itself (not a pair of [begin, end) arguments), and where the generated template is correct for that string-argument type (either const * or const CStringT<> &).

So the Question: How might I write FindIndexOfOneOf<> such that its arguments can be any of the following without ever creating a copy of the underlying arguments:

1. FindIndexOfOneOf(_T("abc"), _T("def"));

2. CString str; FindIndexOfOneOf(str, _T("def"));

3. CString str; FindIndexOfOneOf(_T("abc"), str);

4. CString str; FindIndexOfOneOf(str, str);

Related threads to this one that have led me to this point:

A better way to declare a char-type appropriate CString<>

Templated string literals


Try this.

#include <type_traits>

template <typename T>
inline int FindIndexOfOneOf(T& str, const typename details::char_type_of<typename std::decay<T>::type>::type* pszSearchChars)

The problem is that when you make the first argument a reference type and pass a string literal, T is deduced as an array type such as:

const char [4]

but you want

const char*

You can use the following to make this conversion.

std::decay<T>::type

The documentation says:

If is_array<U>::value is true, the modified-type type is remove_extent<U>::type *.
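
To make the effect concrete, here is a minimal sketch of the whole function built on std::decay. It is narrow-char only, std::string stands in for CStringA so that it compiles without ATL, and buffer_of is a hypothetical helper that is not part of the question. Treat it as an illustration of the idea rather than a drop-in replacement.

```cpp
#include <cstring>       // std::strpbrk
#include <string>
#include <type_traits>   // std::decay, std::is_same

namespace details {
    template <typename T> struct char_type_of {};   // no "type" for unsupported T
    template <> struct char_type_of<const char *> { typedef char type; };
    // Assumption: std::string stands in for CStringA here.
    template <> struct char_type_of<std::string>  { typedef char type; };

    // Hypothetical helpers that expose the character buffer without copying.
    inline const char * buffer_of(const char * s)        { return s; }
    inline const char * buffer_of(const std::string & s) { return s.c_str(); }
}

// A string literal bound to "T &" deduces T as an array type;
// std::decay turns that array type into the matching pointer type.
static_assert(std::is_same<std::decay<const char[4]>::type, const char *>::value,
              "const char[4] decays to const char *");
// std::decay also strips top-level const, so const objects reuse the same specialization.
static_assert(std::is_same<std::decay<const std::string>::type, std::string>::value,
              "const std::string decays to std::string");

template <typename T>
inline int FindIndexOfOneOf(
    T & str,
    const typename details::char_type_of<typename std::decay<T>::type>::type * pszSearchChars)
{
    const char * begin = details::buffer_of(str);              // no copy of str
    const char * hit   = std::strpbrk(begin, pszSearchChars);  // first char from the set
    return hit ? static_cast<int>(hit - begin) : -1;
}
```

Literals, const char * variables, and both const and non-const std::string objects all go through the same template here, and the first argument is always bound by reference.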


You can use Boost's enable_if and type_traits for this:

#include <boost/type_traits.hpp>
#include <boost/utility/enable_if.hpp>

// Just for convenience
using boost::enable_if;
using boost::disable_if;
using boost::is_same;

// Version for C strings takes param #1 by value
template <typename T>
inline typename enable_if<is_same<T, const char*>, int>::type
FindIndexOfOneOf(T str, const CHARTYPEOF(T) * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}

// Version for other types takes param #1 by ref
template <typename T>
inline typename disable_if<is_same<T, const char*>, int>::type
FindIndexOfOneOf(T& str, const CHARTYPEOF(T) * pszSearchChars)
{
    return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}

You should probably expand the first case to handle both char and wchar_t strings, which you can do using or_ from Boost's MPL library.

I would also recommend making the version that takes a reference take a const reference instead. This just avoids instantiation of 2 separate versions of the code (as it stands, T will be inferred as a const type for const objects, and a non-const type for non-const objects; changing the parameter type to T const& str means T will always be inferred as a non-const type).
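
Putting both suggestions together might look roughly like the sketch below. To keep it free of external dependencies it uses std::enable_if from C++11 instead of Boost, and a small is_c_string trait plays the role of or_. The names is_c_string and IndexOrMinusOne are made up for this example, and std::string / std::wstring stand in for CStringA / CStringW.

```cpp
#include <cstring>       // std::strpbrk
#include <cwchar>        // std::wcspbrk
#include <string>
#include <type_traits>   // std::enable_if, std::is_same, std::integral_constant

// True for const char * and const wchar_t * (hypothetical trait, stands in for mpl::or_).
template <typename T>
struct is_c_string : std::integral_constant<bool,
    std::is_same<T, const char *>::value || std::is_same<T, const wchar_t *>::value> {};

namespace details {
    template <typename T> struct char_type_of {};   // no "type": unsupported T drops out via SFINAE
    template <> struct char_type_of<const char *>    { typedef char type; };
    template <> struct char_type_of<const wchar_t *> { typedef wchar_t type; };
    template <> struct char_type_of<std::string>     { typedef char type; };     // stand-in for CStringA
    template <> struct char_type_of<std::wstring>    { typedef wchar_t type; };  // stand-in for CStringW

    inline const char *    FindOneOf(const char * s, const char * set)       { return std::strpbrk(s, set); }
    inline const wchar_t * FindOneOf(const wchar_t * s, const wchar_t * set) { return std::wcspbrk(s, set); }

    // Pointer-to-index conversion; a null hit maps to -1.
    template <typename C>
    inline int IndexOrMinusOne(const C * hit, const C * begin)
    {
        return hit ? static_cast<int>(hit - begin) : -1;
    }
}

// C strings (narrow or wide): the pointer itself is passed by value.
template <typename T>
inline typename std::enable_if<is_c_string<T>::value, int>::type
FindIndexOfOneOf(T str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    return details::IndexOrMinusOne(details::FindOneOf(str, pszSearchChars), str);
}

// Everything else: passed by const reference, so T is always deduced as non-const.
template <typename T>
inline typename std::enable_if<!is_c_string<T>::value, int>::type
FindIndexOfOneOf(const T & str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    const typename details::char_type_of<T>::type * begin = str.c_str();
    return details::IndexOrMinusOne(details::FindOneOf(begin, pszSearchChars), begin);
}
```

For a string literal the by-reference overload deduces an array type, which has no char_type_of specialization, so it silently drops out and the by-value overload is used. Non-const char buffers are not covered by this sketch.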


Based on your comments about iterators, it seems you haven't fully considered the options you have. I can't do anything about personal preference, but IMHO it shouldn't be an obstacle to accepting a reasonable solution, which should be weighed on its technical merits. A single-argument interface can simply forward to an iterator-based implementation:

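#include <cstring>   // strlen

template < typename Iter >
void my_iter_fun(Iter start, Iter end)
{
 ...
}
// class-type strings: taken by const reference so no copy is made
template < typename T >
void my_string_interface(const T & str)
{
  my_iter_fun(str.begin(), str.end());
}
// C strings (char only, because of strlen)
template < typename T >
void my_string_interface(T* chars)
{
  my_iter_fun(chars, chars + strlen(chars));
}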
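
Applied to the function from the question, the same pattern could look something like this. It is only a sketch: it handles narrow chars only, the name find_index_of_one_of is made up for the iterator-based core, and std::string stands in for CString.

```cpp
#include <algorithm>   // std::find_first_of
#include <cstring>     // std::strlen
#include <iterator>    // std::distance
#include <string>

// Iterator-based core: index of the first element of [first, last) that also
// occurs in [set_first, set_last), or -1 if there is none.
template <typename Iter, typename SetIter>
int find_index_of_one_of(Iter first, Iter last, SetIter set_first, SetIter set_last)
{
    Iter hit = std::find_first_of(first, last, set_first, set_last);
    return hit == last ? -1 : static_cast<int>(std::distance(first, hit));
}

// Front end for class-type strings with begin()/end(); taken by const reference.
template <typename T>
int FindIndexOfOneOf(const T & str, const char * chars)
{
    return find_index_of_one_of(str.begin(), str.end(),
                                chars, chars + std::strlen(chars));
}

// Front end for C strings; preferred for pointers and literals because T * is
// the more specialized of the two templates.
template <typename T>
int FindIndexOfOneOf(T * str, const char * chars)
{
    return find_index_of_one_of(str, str + std::strlen(str),
                                chars, chars + std::strlen(chars));
}
```

Callers still pass a single string argument; the iterator pair only exists inside the two thin front ends. The C-string front end does pay for a std::strlen pass, which the pointer-based version in the question avoids.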


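An alternative to my previous answer, if you don't want to install TR1.

Add the following partial specializations (inside namespace details, next to the existing ones) to cover the array type that T is deduced as when the first argument is taken by reference and a string literal is passed.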

template<unsigned int N>
struct char_type_of<const wchar_t[N]>
{ 
    typedef wchar_t type;
};

template<unsigned int N>
struct char_type_of<const char[N]>
{ 
    typedef char type;
};
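
A quick way to convince yourself that these specializations are picked up for literals is a small compile-time probe like the one below. The probe function is purely hypothetical, the checks use C++11 (static_assert, decltype) only for verification, and std::size_t is used for N here although unsigned int works as well.

```cpp
#include <cstddef>       // std::size_t
#include <type_traits>   // std::is_same

namespace details {
    template <typename T> struct char_type_of {};   // no "type" for unsupported T

    template <std::size_t N>
    struct char_type_of<const char[N]>    { typedef char type; };

    template <std::size_t N>
    struct char_type_of<const wchar_t[N]> { typedef wchar_t type; };
}

// Hypothetical probe: takes its argument by reference, exactly like the
// reference-taking FindIndexOfOneOf, and returns a value-initialized character
// of the type that char_type_of selects for the deduced T.
template <typename T>
typename details::char_type_of<T>::type probe(T &)
{
    return typename details::char_type_of<T>::type();
}

// For a literal, T is deduced as const char[4] / const wchar_t[4].
static_assert(std::is_same<decltype(probe("abc")), char>::value,
              "narrow literal selects char");
static_assert(std::is_same<decltype(probe(L"abc")), wchar_t>::value,
              "wide literal selects wchar_t");
```

Note that non-const arrays (char buf[16], for example) would need their own specializations, since T is then deduced without the const.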