Currently I am playing with DCG in Prolog to parse an XML file. I was able to get the following code snippet which is able to parse a simple XML such as:
<author> <name> <f> arthur </f>
<m> conan </m>
<l> doyle </l>
</name>
<bday> <d> 22 </d>
<m> 5 </m>
<y> 1859 </y>
</bday>
</author>
<author> <name> <f> william </f>
<l> shakespeare </l>
</name>
<bday> &开发者_JAVA百科lt;d> 23 </d>
<m> 4 </m>
<y> 1564 </y>
</bday>
</author>
$
and the DCG is defined as:
xml([E]) --> element(E).
xml([E|L]) --> element(E), xml(L).
element(E) --> begintag(N), elements(L), endtag(N), {E =.. [N|L]}.
elements(L) --> xml(L).
elements([E]) --> [E].
begintag(N) --> ['<', N, '>'].
endtag(N) --> ['<', '/', N, '>'].
Would someone please illustrate how the DCG works in this case? I am having a really hard time trying to understand the arguments in the DCG (e.g. [E] in xml([E]); [E|L] in xml([E|L]). ) Thank you!
Think declaratively: A DCG always describes a list. In a DCG body, comma (",") is read as "and then". So for example, xml//1 describes (by its first rule) either a single element, or (by its second rule) an element and then something that is again described by xml//1. element//1 is a begintag, then elements, then an endtag etc. The arguments used in the DCG heads let you relate the sequence that is described by the DCG bodies to other information, like a subsequence you are particularly interested in. Take for example begintag//1 - you can ask:
?- phrase(begintag(T), [<, test, >]).
T = test.
and also in the other direction:
?- phrase(begintag(test), Ls).
Ls = [<, test, >].
In the case of begintag//1, the tag name seems to be important to you, so you have introduced an argument for it that, on the one hand, lets you generate an opening tag with a given name, and, on the other hand, lets you parse a tag and extract its name, and, on the third hand, ask even the most general query:
?- phrase(begintag(T), Ls).
Ls = [<, T, >].
which abstractly relates an opening XML tag to its name.
精彩评论