Avoiding Python UnicodeDecodeError in Jinja's nl2br filter

Developer https://www.devze.com 2023-02-13 06:04 Source: web

I'm using Jinja2's nl2br filter, which looks like:

import re
from jinja2 import evalcontextfilter, Markup, escape

_paragraph_re = re.compile(r'(?:\r\n|\r|\n){2,}')

@evalcontextfilter
def nl2br(eval_ctx, value):
    result = u'\n\n'.join(u'<p>%s</p>' % p.replace('\n', '<br>\n')
                      for p in _paragraph_re.split(escape(value)))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result

The problem is that this fails whenever "value" contains non-ASCII characters (for example, "/mɒnˈtænə/"). I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 889, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 879, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 876, in wsgi_app
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 695, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/mcrittenden/Dropbox/Code/dropdo/dropdo.py", line 105, in view
    return render_template(template, src = url, data = content)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/templating.py", line 85, in render_template
    context, ctx.app)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/templating.py", line 69, in _render
    rv = template.render(context)
  File "/usr/local/lib/python2.6/dist-packages/Jinja2-2.5.5-py2.6.egg/jinja2/environment.py", line 891, in render
    return self.environment.handle_exception(exc_info, True)
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/text.html", line 1, in top-level template code
    {% extends "layout.html" %}
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/layout.html", line 25, in top-level template code
    {% block content %}{% endblock %}
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/text.html", line 8, in block "content"
    {{ data|nl2br }}
  File "/home/mcrittenden/Dropbox/Code/dropdo/dropdo.py", line 26, in nl2br
    for p in _paragraph_re.split(escape(value)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 12: ordinal not in range(128)

What's the best way to prevent the error without removing the problem characters altogether?

Use Unicode literals everywhere.

"Unicode in Python, Completely Demystified"

If "value" contains anything but ASCII characters, you want it to be Unicode, and nothing but Unicode, throughout your entire app, except for the few places where you explicitly encode or decode it. Pass Unicode to your templates, too.

If you acquire the string "/mɒnˈtænə/" somehow, you probably know its encoding; use it: value = "/mɒnˈtænə/".decode(the_encoding).
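A minimal sketch of decoding once at the boundary, assuming the incoming bytes are UTF-8. The helper name to_text and its default encoding are hypothetical and not part of Flask or Jinja2:

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: decode byte strings once, at the edge of the app,
# so that templates and filters only ever see Unicode text.


def to_text(value, encoding='utf-8'):
    """Return value as Unicode text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The UTF-8 bytes of "/mɒnˈtænə/", as they might arrive from a file or socket.
raw = b'/m\xc9\x92n\xcb\x88t\xc3\xa6n\xc9\x99/'
content = to_text(raw)  # Unicode text, safe to hand to the template
```

In the view from the traceback, this would amount to passing data = to_text(content, the_encoding) to render_template, where the_encoding is whatever the source of the data really uses.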

How do you learn the encoding? An HTTP request states its encoding, typically in the charset parameter of its Content-Type header. An XML file states its encoding in its XML declaration. A plain text file usually does not; you must learn its encoding by some other means.
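For the HTTP case, one way to read the declared charset is to parse the Content-Type header value with the standard library's email.message.Message. This is only a sketch: the function name charset_from_content_type and the sample header values are made up for illustration.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset declared in a Content-Type header value, or default."""
    msg = Message()
    msg['Content-Type'] = header_value
    # get_content_charset() lower-cases the charset name and returns the
    # fallback when the header carries no charset parameter.
    return msg.get_content_charset(default)


declared = charset_from_content_type('text/html; charset=UTF-8')
fallback = charset_from_content_type('text/plain', default='utf-8')
```

When no charset is declared, the fallback is a guess on your part, so choose it from what you know about the source rather than treating it as authoritative.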

Note that UTF-8 is not Unicode though it is an encoding that can fully represent Unicode. It's still an encoding, and to get a Python Unicode string from it, you need to .decode("utf-8") it.
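To make the distinction concrete, the sketch below encodes the sample string to UTF-8 and decodes it again. The variable names are illustrative only. Note that the first non-ASCII byte is 0xc9, the same byte value the traceback complains about.

```python
# Unicode text for "/mɒnˈtænə/", written with escapes so the source stays ASCII.
text = u'/m\u0252n\u02c8t\xe6n\u0259/'

# Encoding produces bytes; each non-ASCII character here takes two bytes.
encoded = text.encode('utf-8')
char_count = len(text)
byte_count = len(encoded)

# Decoding with the right codec restores the original text.
round_trip = encoded.decode('utf-8')

# Decoding the same bytes as ASCII fails at the first byte above 0x7f.
try:
    encoded.decode('ascii')
    bad_byte = None
except UnicodeDecodeError as exc:
    bad_byte = exc.object[exc.start:exc.start + 1]
```

Here char_count is 10 while byte_count is 14, and bad_byte holds the single byte 0xc9, the lead byte of "ɒ" in UTF-8.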

Try unidecode from http://pypi.python.org/pypi/Unidecode

>>> from unidecode  import unidecode
>>> m=u'My fianc\xe9 David'; print m; print unidecode(m)
My fiancé David
My fiance David
>>>
