Trying to read from a file and skip punctuation in C++, tips?

Developer https://www.devze.com 2023-02-14 12:45 Source: the web

I'm trying to read from a file, and make a vector of all the words from the file. What I tried to do below is have the user input the filename, and then have the code open the file, and skip characters if they aren't alphanumeric, then input that to a file.

Right now it just closes immediately when I input the filename. Any idea what I could be doing wrong?

#include <vector>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;

int main() 
{

string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;

ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt"); 

bookread.open(whichbook.c_str());
//assert(!bookread.fail());

if(bookread.is_open()){
    while(bookread.good()){
        getline(bookread, line);
        cout << line;
        while(isalnum(bookread)){
            words.push_back(bookread);
        }
    }
}
cout << words[];
}


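I think I'd do the job a bit differently. Since you want to ignore all but alphanumeric characters, I'd start by defining a ctype facet (to be installed in a locale) that treats all other characters as white space. Despite its name, the digits_only facet below accepts letters as well as digits: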

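#include <algorithm>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

struct digits_only: std::ctype<char> {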
    digits_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::space);

        std::fill(&rc['0'], &rc['9']+1, std::ctype_base::digit);
        std::fill(&rc['a'], &rc['z']+1, std::ctype_base::lower);
        std::fill(&rc['A'], &rc['Z']+1, std::ctype_base::upper);
        return &rc[0];
    }
};

That makes reading words/numbers from the stream quite trivial. For example:

int main() {
    char const test[] = "This is a bunch=of-words and 2@numbers#4(with)stuff to\tseparate,them, I think.";
    std::istringstream infile(test);
    infile.imbue(std::locale(std::locale(), new digits_only));

    std::copy(std::istream_iterator<std::string>(infile),
              std::istream_iterator<std::string>(),
              std::ostream_iterator<std::string>(std::cout, "\n"));

    return 0;
}

For the moment, I've copied the words/numbers to standard output, but copying to a vector just means giving a different iterator to std::copy. For real use, we'd undoubtedly want to get the data from an std::ifstream as well, but (again) it's just a matter of supplying the correct iterator. Just open the file, imbue it with the locale, and read your words/numbers. All the punctuation, etc., will be ignored automatically.
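As a rough sketch of that last step, here is one way the same kind of facet could feed a vector. The names alnum_only and read_words are made up for this illustration, and the function takes any std::istream so that an std::ifstream opened on the book file can be passed in directly. It assumes an ASCII-compatible character set, as the facet above does.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet (same idea as digits_only above): every character
// that is not an ASCII letter or digit is classified as white space.
struct alnum_only : std::ctype<char> {
    alnum_only() : std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);

        std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit);
        std::fill(&rc['a'], &rc['z'] + 1, std::ctype_base::lower);
        std::fill(&rc['A'], &rc['Z'] + 1, std::ctype_base::upper);
        return &rc[0];
    }
};

// Hypothetical helper: imbue the stream, then let operator>> split the
// input on the "white space" defined by the facet.
std::vector<std::string> read_words(std::istream& in) {
    // The locale takes ownership of the facet and deletes it later.
    in.imbue(std::locale(std::locale(), new alnum_only));

    std::vector<std::string> words;
    std::copy(std::istream_iterator<std::string>(in),
              std::istream_iterator<std::string>(),
              std::back_inserter(words));
    return words;
}
```

With a file, this would be called along the lines of std::ifstream book(whichbook.c_str()); followed by read_words(book), after checking that the stream actually opened.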


The following reads every line, skips non-alphanumeric characters (spaces are kept), and adds each cleaned line as an item in the output vector. You can adapt it so that it outputs words instead of lines. I did not want to provide the entire solution, as this looks a bit like a homework problem.

#include <cctype>
#include <vector>
#include <sstream>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;


int main()
{   
    string line; //for storing words
    vector<string> words; //unspecified size vector
    string whichbook;
    cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
    cin >> whichbook;
    cout << endl;

    ifstream bookread;
    //could be issue
    //ofstream bookoutput("results.txt"); 

    bookread.open(whichbook.c_str());
    //assert(!bookread.fail());

    if(bookread.is_open()){
        // getline returns the stream, which tests false at end of file or on
        // error, so the loop body never runs on a failed read
        while(getline(bookread, line)){
            string lineToAdd;

            for(size_t i = 0 ; i < line.size(); ++i)
            {
                // keep letters, digits and spaces; drop everything else
                // (the cast avoids undefined behavior for negative char values)
                if(isalnum(static_cast<unsigned char>(line[i])) || line[i] == ' ')
                    lineToAdd += line[i];
            }

            words.push_back(lineToAdd);
        }
    }
    for(size_t i = 0 ; i < words.size(); ++i)
        cout << words[i] << " ";


    return 0;
}
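
To adapt this so that the vector holds words rather than lines, one option is to end the current word whenever a non-alphanumeric character is seen. A minimal sketch follows; the helper name append_words is invented for this example. Note that it splits at punctuation (so "it's" becomes "it" and "s"), whereas the code above deletes punctuation and keeps the surrounding letters together.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: appends every run of letters/digits found in
// `line` to `words`; any other character ends the current word.
void append_words(const std::string& line, std::vector<std::string>& words) {
    std::string current;
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        // Cast first: passing a negative char value to isalnum is undefined.
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isalnum(c)) {
            current += line[i];
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())   // the line may end in the middle of a word
        words.push_back(current);
}
```

Inside the read loop, the call append_words(line, words) would take the place of building lineToAdd and pushing it onto the vector.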