C lexical analyzer in Python

Developer https://www.devze.com 2023-01-21 16:49 Source: Internet
I'm creating a C lexical analyzer in Python as part of developing a parser. In my code I have written some functions for identifying keywords, numbers, operators, etc. No error is shown after compiling. When executing, I can input a .c file. My output should list all the keywords, identifiers, etc. in the input file, but it is not showing anything. Can anyone help me with this? The code is attached.

import sys
import string
delim=['\t','\n',',',';','(',')','{','}','[',']','#','<','>']
oper=['+','-','*','/','%','=','!']
key=["int","float","char","double","bool","void","extern","unsigned","goto","static","class","struct","for","if","else","return","register","long","while","do"]
predirect=["include","define"]
header=["stdio.h","conio.h","malloc.h","process.h","string.h","ctype.h"]
word_list1=""
i=0
j=0
f=0
numflag=0
token=[0]*50


def isdelim(c):
    for k in range(0,14):
        if c==delim[k]:
            return 1
        return 0

def isop(c):
    for k in range(0,7):
        if c==oper[k]:
            ch=word_list1[i+1]
            i+=1
            for j in range(0,6):
                if ch==oper[j]:
                    fop=1
                    sop=ch
                    return 1
                #ungetc(ch,fp);
                return 1
                j+=1
        return 0;
        k+=1

def check(t):
    print t
    if numflag==1:
        print "\n number "+str(t)
        return
    for k in range(0,2):#(i=0;i<2;i++)
        if strcmp(t,predirect[k])==0:
            print "\n preprocessor directive "+str(t)
            return
    for k in range(0,6): #=0;i<6;i++)
        if strcmp(t,header[k])==0:
            print "\n header file "+str(t)
            return
    for k in range(0,21): #=0;i<21;i++)
        if strcmp(key[k],t)==0:
            print "\n keyword "+str(key[k])
            return
        print "\n identifier \t%s"+str(t)

def skipcomment():
    ch=word_list[i+1]
    i+=1
    if ch=='/':
        while word_list1[i]!='\0':
            i+=1#ch=getc(fp))!='\0':
    elif ch=='*':
        while f==0:
            ch=word_list1[i]
            i+=1
        if c=='/':
            f=1
    f=0




a=raw_input("Enter the file name:")
s=open(a,"r")
str1=s.read()
word_list1=str1.split()




i=0
#print word_list1[i]
for word in word_list1 :
    print word_list1[i]
    if word_list1[i]=="/":
        print word_list1[i]
    elif word_list1[i]==" ":
        print word_list1[i]
    elif word_list1[i].isalpha():
        if numflag!=1:
            token[j]=word_list1[i]
            j+=1
        if numflag==1:
            token[j]='\0'
            check(token)
            numflag=0
            j=0
            f=0
        if f==0:
            f=1
    elif word_list1[i].isalnum():
        if numflag==0:
            numflag=1
            token[j]=word_list1[i]
            j+=1
        else:
            if isdelim(word_list1[i]):
                if numflag==1:
                    token[j]='\0'
                    check(token)
                    numflag=0
                if f==1:
                    token[j]='\0'
                    numflag=0
                    check(token)
                j=0
                f=0
                print "\n delimiters : "+word_list1[i]
    elif isop(word_list1[i]):
        if numflag==1:
            token[j]='\0'
            check(token)
            numflag=0
            j=0
            f=0
        if f==1:
            token[j]='\0'
            j=0 
            f=0
            numflag=0
            check(token)    
        if fop==1:
            fop=0
            print "\n operator \t"+str(word_list1[i])+str(sop)
        else:
            print "\n operator \t"+str(c)
    elif word_list1[i]=='.':
        token[j]=word_list1[i]
        j+=1
    i+=1


def isdelim(c):
    if c in delim:
        return 1
    return 0

You should learn more about Python basics. At the moment, your code contains too many ifs and fors.

Try learning it the hard way.


Your code is bad. Try splitting it up into smaller functions that you can test individually. Have you tried debugging the program? Once you find the place that causes the problem, you can come back here and ask a more specific question.
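As a rough illustration of that advice, here is a sketch of one small function that can be tested on its own. It is written for Python 3 (the question uses Python 2 syntax), and the name classify and the shortened word sets are assumptions made for this example, not part of the original program.

```python
# Hypothetical helper: classifies one word that has already been separated out.
# The sets are deliberately shortened versions of the lists in the question.
KEYWORDS = {"int", "float", "char", "if", "else", "return", "while", "for"}
DIRECTIVES = {"include", "define"}


def classify(word):
    """Return a category label for a single word."""
    if word.isdigit():
        return "number"
    if word in DIRECTIVES:
        return "preprocessor directive"
    if word in KEYWORDS:
        return "keyword"
    return "identifier"


# Each case can be checked in isolation, without reading a file first.
for sample in ["42", "include", "while", "total"]:
    print(sample, "->", classify(sample))
```

Because the function takes its input as an argument and returns a value instead of printing, it does not depend on globals such as numflag.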

Some more hints. You can implement isdelim much more simply, like this:

def isdelim(c):
    return c in delim

To compare strings for equality, use string1 == string2; strcmp does not exist in Python. I do not know if you are aware that Python is usually interpreted rather than compiled. This means that you will get no compiler error if you call a function that does not exist. The program will only complain at run time, when it reaches the call.
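A small sketch of that behaviour, written for Python 3. The function names check and is_include are only illustrative; check mimics the strcmp call from the question.

```python
def check(t):
    # strcmp is not defined anywhere, but Python only notices
    # when this line is actually executed.
    return strcmp(t, "include") == 0


error_message = None
try:
    check("include")          # the failure happens here, at call time
except NameError as exc:
    error_message = str(exc)

print(error_message)


def is_include(t):
    # Plain string comparison with ==, no strcmp needed.
    return t == "include"


print(is_include("include"))
```

Defining check raises nothing; the NameError only appears once the function is called.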

In your function isop you have unreachable code. The lines j += 1 and k += 1 can never be reached as they are right after a return statement.
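One way to avoid that kind of dead code is to let the function receive the text and position as arguments and return as soon as it knows the answer. The sketch below is written for Python 3; read_operator and the set of two-character operators are assumptions for illustration, not a reconstruction of what isop was meant to do.

```python
SINGLE_OPERATORS = set("+-*/%=!")
# Assumed list of two-character operators, chosen for this example only.
DOUBLE_OPERATORS = {"==", "!=", "+=", "-=", "*=", "/=", "%=", "++", "--"}


def read_operator(text, pos):
    """Return the operator that starts at text[pos], or None if there is none."""
    if text[pos] not in SINGLE_OPERATORS:
        return None
    # Look one character ahead without modifying any global counter.
    pair = text[pos:pos + 2]
    if pair in DOUBLE_OPERATORS:
        return pair
    return text[pos]


print(read_operator("a+=1", 1))
print(read_operator("a+b", 1))
print(read_operator("ab", 0))
```

There is no manual j += 1 or k += 1 here, so nothing sits after a return statement.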

In Python iterating over a collection is done like this:

for item in collection:
    # do stuff with item
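Applied to isdelim from the question, the loop could look like the Python 3 sketch below. Two things are worth noting about the original: delim has 13 entries while range(0,14) assumes 14, and return 0 is indented inside the loop, so only the first entry is ever compared.

```python
delim = ['\t', '\n', ',', ';', '(', ')', '{', '}', '[', ']', '#', '<', '>']


def isdelim(c):
    # Iterate over the items directly: no index and no hard-coded length.
    for d in delim:
        if c == d:
            return True
    # Reached only after every item has been compared.
    return False


print(isdelim(';'))
print(isdelim('a'))
```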

These are just some hints. You should really read the Python Tutorial.


It seems to print out quite a bit of output for me, but the code is pretty hard to follow. I ran it against itself and it errored out like so:

Traceback (most recent call last):
  File "C:\dev\snippets\lexical.py", line 92, in <module>
    token[j]=word_list1[i]
IndexError: list assignment index out of range

Honestly, this is pretty bad code. You should give the functions better names and avoid magic numbers like this:

for k in range(0,14)

I mean, you have already made a list you can use for the range.

for k in range(len(delim))

Makes slightly more sense.

But you're just trying to determine if c is in the list delim, so just say:

if c in delim

Why are you returning 1 and 0? What do they mean? Why not use True and False?

There are probably several other blatantly obvious problems, like the whole "main" section of the code.
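One concrete issue in that main section, as far as I can tell: str1.split() returns whitespace-separated words, not characters, so comparisons against single characters such as "/" rarely match. The Python 3 sketch below contrasts split() with a simple regular-expression scan. The regex approach is a swapped-in technique, not what the question's code does, and the pattern is a simplified assumption that ignores comments, strings and most multi-character operators.

```python
import re

# Simplified, assumed pattern: identifiers, integers, an operator with an
# optional trailing "=", or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/%=!]=?|\S")


def tokenize(source):
    """Split C-like source text into a flat list of token strings."""
    return TOKEN_RE.findall(source)


line = "int x = a+42;"
print(line.split())     # ['int', 'x', '=', 'a+42;']
print(tokenize(line))   # ['int', 'x', '=', 'a', '+', '42', ';']
```

With split(), "a+42;" stays glued together, which is why none of the character-level branches in the main loop fire for it.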

This is not very pythonic:

token=[0]*50

Do you really just mean to say?

token = []

Now it's just an empty list.

Instead of trying to use a counter like this:

token[j]=word_list1[i]

You want to append, like this:

token.append(word_list1[i])
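A short Python 3 sketch of the difference. The three-element list stands in for token=[0]*50, and the names chars and word are made up for the example.

```python
# A fixed-size list fails as soon as the counter runs past the end,
# which is the IndexError shown in the traceback above.
token = [0] * 3
overflow_error = None
try:
    token[3] = "x"                  # index 3 does not exist in a 3-item list
except IndexError as exc:
    overflow_error = str(exc)
print(overflow_error)               # list assignment index out of range

# An empty list grows on demand, so no counter and no size limit are needed.
chars = []
for ch in "count":
    chars.append(ch)
word = "".join(chars)               # no '\0' terminator needed in Python
print(word)                         # count
```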

I honestly think you've started with too hard a problem.
