
Character encoding for US Census Cartographic Boundary Files

Developer https://www.devze.com 2022-12-23 05:30 Source: web

I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html ) into a GeoDjango application. However, Python raises UnicodeDecodeError on some records (for example, on the non-ASCII characters in Puerto Rico place names).

The shapefile description file (*.dbf) doesn't specify what character encoding it uses; this is not defined by the spec for shapefiles. What is the correct character encoding to use?


I had the same issue with the CBSA and Place data from the 2010 Census full-geometry shapefiles. These are not the clipped cartographic boundary files.

IBM850 did not work correctly for me. On a whim, I tried latin1, and it worked perfectly.
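One way to tell which encoding a given file uses is to decode the same raw bytes under each candidate and look at the result. The following is a minimal sketch in Python 3, not part of the original answer: the helper name `decode_with_each` and the sample bytes are made up for illustration, and the sample assumes a file that stores "Peñuelas" as Latin-1.

```python
def decode_with_each(raw, encodings=("cp850", "latin1")):
    """Decode the same bytes under every candidate encoding for side-by-side inspection."""
    return {enc: raw.decode(enc) for enc in encodings}


# Hypothetical sample: the bytes a Latin-1 encoded file would hold for "Peñuelas".
sample = b"Pe\xf1uelas"

for enc, text in decode_with_each(sample).items():
    # The wrong encoding still decodes without an error, but yields the wrong character.
    print(enc, text)
```

Both codecs map every byte to some character, so neither raises an exception; the wrong one just produces mojibake ("±" instead of "ñ" here). Checking a few known place names by eye is therefore more reliable than waiting for an error.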


The US Census cartographic boundary files use the IBM850 character encoding. Python code to properly decode these strings would be as follows:

# Python 2: .decode() already returns a unicode object, so no unicode() wrapper is needed
featurestring.decode("IBM850")
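The line above targets Python 2, where byte strings decode to `unicode` objects. In Python 3 the equivalent is to call `decode` on a `bytes` value, which returns a `str`. A minimal sketch, assuming the raw attribute value is available as bytes; the variable name `featurestring` comes from the answer, while the sample value is invented for illustration:

```python
# Hypothetical sample: "Mayagüez" as it would be stored under IBM850 (code page 850).
featurestring = b"Mayag\x81ez"

# bytes.decode returns a str in Python 3; "IBM850" is an alias of the "cp850" codec.
name = featurestring.decode("IBM850")
print(name)
```

If a particular file turns out to be Latin-1 instead, as the other answer reports for the 2010 full-geometry files, only the codec name passed to `decode` needs to change.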