I want to parse an XML file like the one below. The output needs to be C or C++ code that converts each letter of a word into a phoneme: I enter a word as input, and the generated code must transform it according to the rules in the XML. Any suggestions on how to get started, please?
<?xml version="1.0" encoding="UTF-8"?>
<rules>
<define_groups>
<cons>B,C,D,F,G,H,J,K,L,M,N,P,R,S,s,Q,W,T,t,V,X,Z</cons>
<vowel>A,a,E,I,i,O,U,Y</vowel>
<fric>S,s,t</fric>
<gr_h_O>B,F,G,L,X</gr_h_O>
<gr_EI>E,I</gr_EI>
<gr_CG>C,G</gr_CG>
<gr_AE>A,E</gr_AE>
<gr_AOU>A,O,U</gr_AOU>
</define_groups>
<phonemes>
<p>a</p>
<p>@</p>
<p>b</p>
<p>k</p>
<p>k_O</p>
<p>tS</p>
<p>d</p>
<p>e</p>
<p>e_X</p>
<p>f</p>
<p>g</p>
<p>dZ</p>
<p>g_O</p>
<p>h</p>
<p>i</p>
<p>1</p>
<p>j</p>
<p>i_O</p>
<p>Z</p>
<p>l</p>
<p>m</p>
<p>n</p>
<p>o</p>
<p>o_X</p>
<p>p</p>
<p>r</p>
<p>s</p>
<p>S</p>
<p>t</p>
<p>ts</p>
<p>u</p>
<p>w</p>
<p>v</p>
</phonemes>
<letters>
<l>A</l>
<l>a</l>
<l>B</l>
<l>C</l>
<l>D</l>
<l>E</l>
<l>F</l>
<l>G</l>
<l>H</l>
<l>I</l>
<l>i</l>
<l>J</l>
<l>K</l>
<l>L</l>
<l>M</l>
<l>N</l>
<l>O</l>
<l>P</l>
<l>Q</l>
<l>R</l>
<l>S</l>
<l>s</l>
<l>T</l>
<l>t</l>
<l>U</l>
<l>V</l>
<l>W</l>
<l>X</l>
<l>Y</l>
<l>Z</l>
</letters>
<for_A>
<r><t>a</t></r>
</for_A>
<for_a>
<r><t>@</t></r>
</for_a>
<for_D>
<r><t>d</t></r>
</for_D>
<for_F>
<r><t>f</t></r>
</for_F>
<for_i>
<r><t>1</t></r>
</for_i>
<for_J>
<r><t>Z</t></r>
</for_J>
<for_K>
<r>
<right>gr_EI</right>
<t>k_O</t>
</r>
<r>
<t>k</t>
</r>
</for_K>
<for_E>
<r>
<left>=</left>
<right>L</right>
<right>=</right>
<t>j e</t>
</r>
<r>
<left>E</left>
<t>j e</t>
</r>
<r>
<t>e</t>
</r>
</for_E>
<for_U>
<r>
<left>cons</left>
<right>A</right>
<t>u w</t>
</r>
<r>
<left>vowel</left>
<right>A</right>
<t>u w</t>
</r>
<r>
<left>cons</left>
<right>E</right>
<t>u</t>
</r>
<r>
<left>cons</left>
<right>cons</right>
<t>u</t>
</r>
<r>
<left>C</left>
<left>I</left>
<right>=</right>
<t>w</t>
</r>
<r>
<left>vowel</left>
<right>=</right>
<t>w</t>
</r>
<r>
<left>I</left>
<t>u</t>
</r>
<r>
<left>O</left>
<right>L</right>
<t>u</t>
</r>
<r>
<left>U</left>
<t>u</t>
</r>
<r>
<right>U</right>
<t>u</t>
</r>
<r>
<right>A</right>
<t>w</t>
</r>
<r>
<left>Z</left>
<right>A</right>
<t>u</t>
</r>
<r>
<right>I</right>
<t>u</t>
</r>
<r>
<right>a</right>
<t>w</t>
</r>
<r>
<right>E</right>
<right>A</right>
<t>w</t>
</r>
<r>
<right>E</right>
<t>u</t>
</r>
<r>
<right>A</right>
<right>=</right>
<t>w</t>
</r>
<r>
<t>u</t>
</r>
</for_U>
</rules>
For a fast XML parser that can simply be dropped into your project without extra dependencies, I highly recommend TinyXML.
http://www.grinninglizard.com/tinyxml/
First, use an XML parser (for example Xerces, Expat or TinyXML) to parse your XML file and build a data structure describing the input transformation. This will probably be a list of objects, each representing a pattern to match and the desired output. Then iterate over the letters in your word, matching them against the patterns. If the number of patterns is large, you may want a map from each letter to the list of patterns that apply to it.
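The matching step described above could look roughly like the sketch below. It is not tied to any XML library: the rules are filled in by hand, where a real program would fill them from the parsed file. The names `Rule`, `Groups`, `RuleTable`, `toPhonemes`, `makeDemoGroups` and `makeDemoTable` are made up for illustration. The interpretation of the XML is also an assumption: `=` is taken to mean a word boundary, several `<left>` or `<right>` entries are taken to be a sequence in reading order, the first matching `<r>` wins, and letters without a `for_X` block are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// One <r> element: optional context on each side plus the phoneme output <t>.
struct Rule {
    std::vector<std::string> left;   // <left> entries, in document order
    std::vector<std::string> right;  // <right> entries, in document order
    std::string target;              // text of <t>
};

using Groups = std::map<std::string, std::set<char>>;  // from <define_groups>
using RuleTable = std::map<char, std::vector<Rule>>;   // letter -> rules of <for_X>

// Assumed semantics: "=" matches a word boundary, a group name matches any
// member of that group, anything else is a literal letter.
bool tokenMatches(const std::string& tok, const std::string& word, long pos,
                  const Groups& groups) {
    bool outside = pos < 0 || pos >= static_cast<long>(word.size());
    if (tok == "=") return outside;
    if (outside) return false;
    auto g = groups.find(tok);
    if (g != groups.end()) return g->second.count(word[pos]) > 0;
    return tok.size() == 1 && tok[0] == word[pos];
}

// Checks the context of the letter at index i. The last <left> entry is
// assumed to sit immediately before the letter, the first <right> entry
// immediately after it.
bool ruleMatches(const Rule& r, const std::string& word, std::size_t i,
                 const Groups& groups) {
    assert(i < word.size());
    long n = static_cast<long>(r.left.size());
    for (long k = 0; k < n; ++k)
        if (!tokenMatches(r.left[k], word, static_cast<long>(i) - n + k, groups))
            return false;
    for (std::size_t k = 0; k < r.right.size(); ++k)
        if (!tokenMatches(r.right[k], word, static_cast<long>(i + 1 + k), groups))
            return false;
    return true;
}

// Applies the first matching rule to each letter; phonemes are separated by spaces.
std::string toPhonemes(const std::string& word, const RuleTable& table,
                       const Groups& groups) {
    std::string out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        auto it = table.find(word[i]);
        if (it == table.end()) continue;  // no rules for this letter (assumption: skip)
        for (const Rule& r : it->second) {
            if (ruleMatches(r, word, i, groups)) {
                if (!out.empty()) out += ' ';
                out += r.target;
                break;
            }
        }
    }
    return out;
}

// Hand-written subset of the XML above, standing in for the parsed file.
Groups makeDemoGroups() {
    Groups g;
    g["gr_EI"] = {'E', 'I'};
    return g;
}

RuleTable makeDemoTable() {
    RuleTable t;
    t['A'].push_back(Rule{{}, {}, "a"});
    t['D'].push_back(Rule{{}, {}, "d"});
    t['K'].push_back(Rule{{}, {"gr_EI"}, "k_O"});
    t['K'].push_back(Rule{{}, {}, "k"});
    t['E'].push_back(Rule{{"="}, {"L", "="}, "j e"});
    t['E'].push_back(Rule{{"E"}, {}, "j e"});
    t['E'].push_back(Rule{{}, {}, "e"});
    return t;
}
```

With these demo rules, `toPhonemes("KE", makeDemoTable(), makeDemoGroups())` yields `k_O e`, because `E` belongs to `gr_EI`, while `"KA"` falls through to the default rule and yields `k a`. Keeping the rules in a `std::map` keyed by letter is the "map from letter to the list of patterns" idea; the order of rules inside each list matters, since the context-free rule at the end acts as the fallback.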
I usually use lex/yacc or ANTLR for parsing-related stuff. ANTLR is very easy to learn and code with.