I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html ) into a GeoDjango application. However, Python raises UnicodeDecodeError exceptions (for example, on the non-ASCII characters in Puerto Rico place names).
The shapefile description file (*.dbf) doesn't specify what character encoding it uses; this is not defined by the spec for shapefiles. What is the correct character encoding to use?
I was having the same issue with CBSA and Place data from 2010 Census full geometry shapes. These are not the clipped carto files.
IBM850 did not work correctly for me. On a whim, I tried latin1 and it worked perfectly.
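One way to check which encoding a given file uses is to decode a value that contains a non-ASCII character under each candidate and look at the results. This is only a sketch: the sample bytes and the decode_candidates helper are made up for illustration and are not taken from the Census files. Note that latin1 accepts every byte value, so the absence of an exception does not prove the choice is right; the decoded text has to be read by eye.

```python
# Hypothetical sample: "Peñuelas" as it would be stored if the file were Latin-1.
SAMPLE = b"Pe\xf1uelas"


def decode_candidates(raw, encodings=("cp850", "latin1")):
    """Return {encoding name: decoded text or None} for each candidate."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes at all.
            results[name] = None
    return results


if __name__ == "__main__":
    for name, text in decode_candidates(SAMPLE).items():
        print(name, repr(text))
```

With this sample, the latin1 result reads as a real place name while the cp850 result contains a stray symbol, which is the kind of difference to look for in your own data.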
The US Census cartographic boundary files use the IBM850 character encoding. Python code to properly decode these strings would be as follows:
featurestring.decode("IBM850")
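For Python 3, where the unicode() built-in no longer exists, the same idea might look like the sketch below. The decode_dbf_value helper and the sample bytes are hypothetical; "IBM850" is an alias that Python resolves to its cp850 codec. The encoding is left as a parameter because other datasets may need a different one, such as latin1.

```python
import codecs


def decode_dbf_value(raw, encoding="cp850"):
    """Decode a raw attribute value; pass through values that are already text."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


if __name__ == "__main__":
    # "IBM850" and "cp850" name the same codec in Python.
    print(codecs.lookup("IBM850").name)
    # Hypothetical sample: "Peñuelas" as it would be stored in code page 850.
    print(decode_dbf_value(b"Pe\xa4uelas"))
```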