开发者

Recursive Descent Parsing

开发者 https://www.devze.com 2023-02-19 02:58 出处:网络
I was looking for some advice / help on my assignment. Since it is for a class I am not asking for anyone to literally write out the answers for me, but I do need help with my actual coding.

I was looking for some advice / help on my assignment. Since it is for a class I am not asking for anyone to literally write out the answers for me, but I do need help with my actual coding.

The Question:

Consider the following BNF grammar:

A -> I = E  
E -> T + E | T - E | T  
T -> F * T | F / T | F  
F -> P ^ F | P  
P -> I | L | (E)  
I -> a | b | ... | y | z  
L -> 0 | 1 | ... | 8 | 9  

Using the technique described in class implement a recursive descent parser that recognizes strings in this language. Input should be from a file called input.dat and output should be to the console. An example session might look like this:

String read from file: a=a+b-c*d The string "a=a+b-c*d" is in the language. String read from file: a=a**b++c The string "a=a**b++c" is not in the language.

You must implement the project in BOTH Java and C++! Implementations that do not include a solution in both languages will, at best, receive half credit. To simplify things you will not have to handle whitespace when parsing the string, i.e. "" and similiar are illegal characters in this language.

A correct related example with a different grammar:

// * <assign> => <id> = <expr>
// *     <id> => a | b | c
// *   <expr> => <lit> + <lit> | <lit> - <lit>
// *    <lit> => 0 | ... | 9 | (<expr>)

#include <iostream>

using std::cout;
using std::endl;

bool assign(void);
bool id(void);
bool expr(void);
bool lit(void);

char *c;

int main(int argc, char *argv[]) {

    c = argc == 2 ? argv[1] : (char *)"";

    if (assign() && *c == '\0') {
        cout << "The string \"" << argv[1] << "\" is in the language." << endl;
    }
    else {
        cout << "The string \"" << argv[1] << "\" is not in the language." << endl;
    }

    return 0;
}

bool assign(void) {

    if (id()) {
        if (*c == '=') {
            ++c;
            if (expr()) {
                return true;
            }
        }
    }

    return false;
}

bool id(void) {

    if (*c >= 'a' && *c <= 'c') {
        ++c;
        return true;
    }

    return false;
}

bool expr(void) {

    if (lit()) {
        if (*c == '+' || *c == '-') {
            ++c;
            if (lit()) {
                return true;
            }
        }
    }

    return false;
}

bool lit(void) {

    if (*c >= '0' && *c <= '9') {
        ++c;
        return true;
    }
    else {
        if (*c == '(') {
            ++c;
            if (expr()) {
                if (*c == ')') {
                    ++c;
                    return true;
                }
            }
        }
    }

    return false;
}

My actual work so far: (I am leaving out the reading from the txt file for the moment, want to get the grammar working first) edit1: The problem is that none of the strings are showing up as valid strings that are in the language. Unfortunately even a simple string like a=b has to run through almost all the production rules so I can't pintpoint where I am going wrong

//A -> I = E 
//E -> T + E | T - E | T 
//T -> F * T | F / T | F 
//F -> P ^ F | P 
//P -> I | L | (E) 
//I -> a | b | ... | y | z 
//L -> 0 | 1 | ... | 8 | 9 

#include <iostream>

using namespace std;

bool A(void);
bool E(void);
bool T(void);
bool F(void);
bool P(void);
bool I(void);
bool L(void);

char *c;

int main(int开发者_Python百科 argc, char *argv[]){

     c = argc == 2 ? argv[1] : (char *)"";

    if (A() && *c == '\0') {
        cout << "The string \"" << argv[1] << "\" is in the language." << endl;
    }
    else {
        cout << "The string \"" << argv[1] << "\" is not in the language." << endl;
    }

    return 0;
}

bool A(void){

    if( I() )
    {
        if ( *c == '=' ){
            ++c;
            if ( E() )
            return true;
        }
    }

    return false;
}

bool E(void){

    if( T() ){
        if ( *c == '+' || *c == '-' ){
                ++c;
                if ( E() )
                return true;
        }
    }
    else
    if ( T() ){
        ++c;
        return true;
    }

    return false;
}

bool F(void){

    if( P() ){
        if( *c == '^')
        ++c;
        if( F() )
        return true;
    }
    else
    if ( P() ){
        ++c;
        return true;
    }

    return false;

}

bool I(void){

    if ( *c >= 'a' && *c <= 'z'){
        ++c;
        return true;
    }

    return false;

}

bool L(void){

    if ( *c >= '0' && *c <= '9' ){
    ++c;
    return true;
    }

    return false;
}

bool P(void){

    if ( I() ){
        ++c;
        return true;
    }
    else
    if ( L() ){
        ++c;
        return true;
    }
    else
    if ( *c == '(' ){
            ++c;
            if ( E() ){
                    if ( *c == ')' ){
                        ++c;
                        return true;
                    }
            }
    }

    return false;
}

bool T(void){

    if( F() ){
        if ( *c == '*' || *c == '/' ){
                ++c;
                if ( T() )
                return true;
        }
    }
    else
    if ( F() ){
        ++c;
        return true;
    }

    return false;
}


I think you have a generic problem with detecting things like

E = T + E | T - E | T

When you look for T() and it fails you try to look for T() again. You have made this mistake in most of the other functions as well.

The correct implementation for E() would be (updated after Chris comment):

if (T())
{
   if (c == '+' || c == '-')
   {  
      ++c;
      return E();
   }
   return true;
}
return false;

Lets exercise an example: "a=(3)"

A() -> I() returns true -> c == '=' -> E() -> T() returns true upon parsing 3 after calling into P() which ends up calling into E() again -> now c == ')' so we need to return true in E() otherwise if we return false here, parsing will stop in P()

I hope it's not too confusing but it's hard to express here. The best is if you write down the parse tree on a sheet of paper to visualize it.

Update again: You have a similar situation in T() and F()

0

精彩评论

暂无评论...
验证码 换一张
取 消