I am tasked with creating a small program that can read in the definition of a FSM from input, read some strings from input and determine if those strings are accepted by the FSM based on the definition. I need to write this in either C, C++ or Java. I've scoured the net for ideas on how to get started, but the best I could find was a Wikipedia article on Automata-based programming. The C example p开发者_如何学Gorovided seems to be using an enumerated list to define the states, that's fine if the states are hard coded in advance. Again, I need to be able to actually read the number of states and the definition of what each state is supposed to do. Any suggestions are appreciated.
UPDATE: I can make the alphabet small (e.g. { a b }) and adopt other conventions such as the start state is always state 0. I'm allowed to impose reasonable restrictions on the number of states, e.g. no more than 10.
Question summary:
- How do I implement an FSA?
First, get a list of all the states (N of them), and a list of all the symbols (M of them). Then there are 2 ways to go, interpretation or code-generation:
Interpretation. Make an NxM matrix, where each element of the matrix is filled in with the corresponding destination state number, or -1 if there is none. Then just have an initial state variable and start processing input. If you get to state -1, you fail. If you run out of input symbols without getting to the success state, you fail. Otherwise you succeed.
Code generation. Print out a program in C or your favorite compiler language. It should have an integer state variable initialized to the start state. It should have a for loop over the input characters, containing a switch on the state variable. You should have one case per state, and at each case, have a switch statement on the current character that changes the state variable.
If you want something even faster than 2, and that is sure to get you flunked (!), get rid of the state variable and instead use goto :-) If you flunk, you can comfort yourself in the knowledge that that's what compilers do.
P.S. You could get your F changed to an A if you recognize loops etc. in the state diagram and print out corresponding while and if statements, rather than using goto.
One non-hardcoded way to represent an automaton is as a transition matrix, which allows to represent for each current state, and each input character, what the next state is.
You haven't actually asked a question. You'll get more and better help if you have a specific question for a specific task (but still give the overall goal). The question should be narrow in scope (e.g. not "How can I implement an FSA?").
As for how to represent an FSA (which seems to be what you're having difficulties with), read on.
Start by considering the definition of an FSM: it's an alphabet ∑, a set of states S, a start state s0, a set of accept states A and a transition function δ from a state and a symbol to a state. You have to be able to determine these properties from the input. Any states not reachable by the transition function can be dropped to produce an equivalent FSM. The minimal set of states and alphabet are thus implicit in the transition function; you could make your FSM easier to use (and harder to implement, but not much harder) by not requiring either ∑ or S in the input.
You don't need to use the same representation for states that the input uses. You could use unsigned integers for your internal representation, as long as you have a map from integers to strings and strings to integers so you can convert between the internal representation and external representation. This way, your transition function can be stored as an array, so the transition step can be performed in constant time.
A simpler approach would be to use the external representation as your internal representation. With this option, the transition function would be stored as a map from strings and symbols to strings. The transition step would probably be O(log(|S|+|∑|)), given the performance of most map data structures. If symbols are represented as integers (e.g. char
s), the transition function could be represented as a map from strings to an array of strings, giving O(log(|S|)) performance.
Yet another optionmodeled after the graph view of an FSM, is to create a class for states. A state has a name (the external representation). States are responsible for transitions; send a symbol to a state and get back another state.
class State {
property name;
State& transition(Symbol s);
void setTransition(Symbol s, State& to);
}
Store the set of states as a map from names to states.
There you go, three different places to start, each with a different way to represent states.
Stop thinking about everything at once. Do one thing at a time
- come with language of state machine
- come with language for stimulus
- create sample file of one state machine in language
- create sample file of stimulus
- come with class for state
- come with class for transition
- come with class for state machine as set of states and transitions
- add method to handle violation to state class
- code a little parser for language
- code another parser for language
- initial state
- some output thing like WriteLn here and there
- main method
- compile
- run
- debug
- done
The way the OpenFst toolkit does it is: A FSM has a vector of states, each of which has a vector of arcs. Each arc has an input (and output) label, a target state ID and a weight. You could take a look at the code. Maybe it will inspire you.
If you're using an object-oriented language like Java or C++, I'd recommend that you start with objects. Before you worry about file formats and the like, get a good object model for a finite state automata and how it behaves. How will you represent states, transitions, events, etc.? Will your FSA be a Composite? Once you have that sort of thing working you can get the file formats right. Anything will do: XML, text, etc.
精彩评论