开发者

How to remove repetitive characters from std::string

开发者 https://www.devze.com 2023-02-14 06:08 出处:网络
I have got a std::string like this: std::string fileName; where fileName is like /tmp/fs////js//config.js

I have got a std::string like this:

std::string fileName;

where fileName is like /tmp/fs////js//config.js It is coming from somewhere and I need to store it. But when I store it, I need to remove extra '/' chars from the path, basically need only one separator between directory names and file names.

I can remove these by iterating over the string one char at a time and开发者_Python百科 comparing with the next char, but its not very efficient.

Can anyone suggest some efficient way to do it?


Removing duplicate adjacent elements is a job for std::unique. You need to provide your own predicate in this case but it's O(n) and dead simple.

struct both_slashes {
    bool operator()(char a, char b) const {
        return a == '/' && b == '/';
    }
};

std::string path("/tmp/fs////js//config.js");

path.erase(std::unique(path.begin(), path.end(), both_slashes()), path.end());


You're not going to find anything more efficient than that - think about it - you need to remove consecutive duplicated characters - the imnplication is that, even in the best case, you're going to have to look at every character at least once.


I think std::unique will work even though your string is not sorted because all it removes is consecutive duplicates.

Of course it won't know that / is a special character here and you may find file-names that contain double-letters also getting modified unexpectedly to single-leter, posibly anoyingly.

It is also O(N) but you can't avoid that.

One algorithm that will work well is std::remove_if because you can put in your own "functor" which can keep state so it will know what the last character was.

struct slash_pred
{
  char last_char;

  slash_pred()
   : last_char( '\0' ) // or whatever as long as it's not '/'
  {
  }

  bool operator()(char ch)
  {
      bool remove = (ch == '/') && (last_char == '/');
      last_char = ch;
  }
};

path.erase( std::remove_if( path.begin(), path.end(), 
      slash_pred() ), path.end() );

O(N) but should work.

For the dissenters who think remove_if might be O(N^2) it might be implemented like this:

template< typename ForwardIterator, typename Pred >
ForwardIterator remove_if( ForwardIterator read, ForwardIterator end, Pred pred )
{
   ForwardIterator write = read; // outside the loop as we return it
   for( ; read!=end; ++read )
   {
      if( !pred( *read ) )
      {
         if( write != read ) // avoid self-assign
         {
            *write = *read;
         }
         ++write;
      }
   }
   return write;
}


O(n) in time + O(n) in mem

void clean_path(std::string& path) {
    std::string new_path;
    char sep = '/';
    for (auto i = 0; i < path.size(); ++i) {
        if (path[i] == sep && !new_path.empty() && new_path.back() == sep)
            continue;
        new_path.push_back(path[i]);
    }
    path = new_path;
}
0

精彩评论

暂无评论...
验证码 换一张
取 消