开发者

Postgres xmlelement encoding

开发者 https://www.devze.com 2023-04-06 11:12 出处:网络
My postgres database in UTF-8 and client in UTF-8 too. When i try to: select xmlelement(name pampam, xmlattributes(\'русский\' as \"top\"));

My postgres database in UTF-8 and client in UTF-8 too.

When i try to:

select xmlelement(name pampam, xmlattributes('русский' as "top"));

I get back:

开发者_开发百科
<pampam top="&#x440;&#x443;&#x441;&#x441;&#x43A;&#x438;&#x439;"/>`  

But i want get back attribute as is( i.e. in Russian UTF-8), not as &#x440....

How i can do this?


This is not solve problem, I use xmleliment to construct xml from data get back by queries. And i can't find another way to do this...


This doesn't appear to be possible. The values to print are passed to libxml, and that's how it chooses to print it.


Probably it is not the best solution but this works for me:

SELECT xmlparse(CONTENT '<element attribute=''áéíóú´ñÇ`''>value</element>')

I get back:

<element attribute='áéíóú´ñÇ`'>value</element>


I have just created a character entity to character conversion function, entity2char:

select
    xmlparse(content
        entity2char((
            xmlelement(name pampam, xmlattributes('русский' as "top"))
        )::text)
    );
        xmlparse         
-------------------------
 <pampam top="русский"/>


I used plpythonu to write function using lxml

CREATE OR REPLACE FUNCTION xmlelementpy (
              nodename_in varchar,
              nodetext_in text,
              attrname_in varchar [],
              attrvalue_in varchar []
)
RETURNS xml AS
$body$
from lxml import etree
noattr = 0
isxml = 1
if (attrname_in is None) or (attrname_in is None):
    noattr = 1
elif len(attrname_in) <> len(attrvalue_in):
    return None;

newnode = etree.Element(nodename_in)
if nodetext_in is not None:
    try: 
        schema = etree.fromstring(nodetext_in)
    except: 
        newnode.text = nodetext_in
        isxml = 0
    if isxml <> 0:
        newnode.append(etree.XML(nodetext_in))

if noattr <> 1:
    for i in range(len(attrname_in)):
        if (attrvalue_in[i] is not None):
            attrvalue = attrvalue_in[i].decode('UTF-8')
            newnode.set(attrname_in[i], attrvalue)

return etree.tounicode(newnode, pretty_print=True)
$body$
LANGUAGE 'plpythonu'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;
0

精彩评论

暂无评论...
验证码 换一张
取 消