
Character encoding for US Census Cartographic Boundary Files

Developer https://www.devze.com 2022-12-23 05:30 Source: web


Related topics: census character encoding shapefile unicode

I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html ) into a GeoDjango application. However, Python raises UnicodeDecodeError on some records (for example, on the non-ASCII characters in Puerto Rico).

The shapefile description file (*.dbf) doesn't specify what character encoding it uses; this is not defined by the spec for shapefiles. What is the correct character encoding to use?

I was having the same issue with CBSA and Place data from 2010 Census full geometry shapes. These are not the clipped carto files.

IBM850 did not work correctly for me. On a whim, I tried latin1, and it worked perfectly.
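The difference between the two guesses is easy to see by decoding the same bytes both ways. The sketch below is illustrative only: the helper name `decode_with_candidates` and the sample bytes are assumptions, not values read from the Census files. Python exposes IBM850 under the codec name `cp850`.

```python
def decode_with_candidates(raw, candidates=("cp850", "latin1")):
    """Decode the same bytes with each candidate encoding for comparison.

    Hypothetical helper for illustration. Returns a dict that maps each
    encoding name to the decoded text, or to None if decoding failed.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Assumed sample: "Peñuelas" stored as Latin-1, where ñ is the single byte 0xF1.
sample = b"Pe\xf1uelas"
results = decode_with_candidates(sample)
```

If the `latin1` result reads correctly while the `cp850` result shows a stray symbol in place of the accented letter, the file was most likely written as Latin-1.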

The US Census cartographic boundary files use the IBM850 character encoding. Python code to properly decode these strings would be as follows:
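A minimal sketch of such decoding, assuming the raw attribute values are available as bytes. The function name `decode_dbf_value` and the sample bytes are hypothetical; `cp850` is Python's codec name for IBM850.

```python
def decode_dbf_value(raw, encoding="cp850"):
    """Turn a raw attribute value into text (hypothetical helper).

    Bytes are decoded with the given encoding; values that are
    already text are returned unchanged.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# Assumed sample: in IBM850 (cp850), ñ is the single byte 0xA4.
name = decode_dbf_value(b"Pe\xa4uelas")
```

In GeoDjango, `LayerMapping` also accepts an `encoding` keyword argument, so passing the encoding name there may avoid decoding by hand. Whether `cp850` or `latin1` is the right value depends on the particular files, as the other answer shows.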