I have a Mercurial repository that exported successfully to the "fast-import" format. However, when I try to import, it fails with a 'utf8' error:
...
16:02:09 800/2100 commits processed at 199/minute (800)
16:03:52 900/2100 commits processed at 157/minute (900)
ABORT: exception occurred processing commit :901
bzr: ERROR: exceptions.UnicodeDecodeError: 'utf8' codec can't decode byte 0xb9 in position 14: unexpected code byte
Traceback (most recent call last):
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 946, in exception_to_return_code
return the_callable(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 1150, in run_bzr
ret = run(*run_argv)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 699, in run_argv_aliases
return self.run(**all_cmd_args)
File "/usr/lib/pymodules/python2.6/bzrlib/commands.py", line 721, in run
return self._operation.run_simple(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/cleanup.py", line 135, in run_simple
self.cleanups, self.func, *args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/cleanup.py", line 165, in _do_with_cleanups
result = func(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/cmds.py", line 314, in run
user_map=user_map)
File "/usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/cmds.py", line 40, in _run
return proc.process(p.iter_commands)
File "/usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/processors/generic_processor.py", line 311, in process
super(GenericProcessor, self)._process(command_iter)
File "/usr/lib/pymodules/python2.6/fastimport/processor.py", line 76, in _process
handler(self, cmd)
File "/usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/processors/generic_processor.py", line 536, in commit_handler
handler.process()
File "/usr/lib/pymodules/python2.6/fastimport/processor.py", line 158, in process
handler(self, fc)
File "/usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/bzr_commit_handler.py", line 890, in modify_handler
self._modify_item(filecmd.path.decode('utf8'), kind,
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb9 in position 14: unexpected code byte
I'm running this on Ubuntu. bzr version "2.4.0-1~bazaar1~lucid1" and bzr-fastimport version "0.11.0-1~lucid1".
Any ideas for converting this repository successfully?
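This error occurs because the input stream contains data that is not valid UTF-8. The traceback shows the failure in modify_handler, on filecmd.path.decode('utf8'), so the offending bytes are in a file path: byte 0xb9 at position 14. Mercurial usually stores only UTF-8 data, but older commits might still contain data that is not valid UTF-8 (see https://www.mercurial-scm.org/wiki/EncodingStrategy).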
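To see what the decoder is complaining about, here is a minimal sketch that locates the first invalid byte in a path. The path bytes and the helper name find_invalid_utf8 are made up for illustration; only the byte value 0xb9 and its position 14 come from the traceback.

```python
def find_invalid_utf8(raw):
    """Return (offset, offending_byte) for the first byte that is not
    valid UTF-8, or None if the whole byte string decodes cleanly."""
    try:
        raw.decode("utf8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec rejected.
        return exc.start, raw[exc.start:exc.start + 1]
    return None


# Hypothetical path: 0xb9 is placed at index 14 to mirror the error message.
raw_path = b"docs/notes/rev\xb9.txt"
print(find_invalid_utf8(raw_path))
print(find_invalid_utf8(b"plain/ascii.txt"))
```

A lone 0xb9 is a UTF-8 continuation byte with no lead byte before it, which is why the codec rejects it. A path like this was probably written in a single-byte legacy encoding, though which one cannot be determined from the error alone.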
Please file a bug about this against bzr-fastimport (https://launchpad.net/bzr-fastimport). It should handle this situation more gracefully, presumably by warning you that the data is not valid UTF-8 and then replacing the invalid characters.
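As a stopgap fix, you could change filecmd.path.decode('utf8') to filecmd.path.decode('utf8', 'replace') at line 890 of /usr/lib/pymodules/python2.6/bzrlib/plugins/fastimport/bzr_commit_handler.py. Note that this changes the affected file names in the imported repository: each invalid byte becomes the Unicode replacement character U+FFFD.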
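The warn-then-replace behaviour can be sketched as a small standalone helper. This is not bzr-fastimport code: decode_path_lenient and the sample path are hypothetical, and the only thing taken from the plugin is the idea of decoding the path as UTF-8.

```python
import warnings


def decode_path_lenient(raw):
    """Decode a path as UTF-8; on failure, warn and fall back to
    substituting U+FFFD for each undecodable byte."""
    try:
        return raw.decode("utf8")
    except UnicodeDecodeError as exc:
        warnings.warn(
            "path %r is not valid UTF-8 (%s at byte %d); "
            "replacing invalid bytes" % (raw, exc.reason, exc.start)
        )
        return raw.decode("utf8", "replace")


# Hypothetical path with a stray 0xb9 byte, as in the error message.
decoded = decode_path_lenient(b"docs/notes/rev\xb9.txt")
print(repr(decoded))
```

The warning keeps a record of which paths were altered, so they can be renamed by hand after the import. If you happen to know the encoding the old commits used, decoding with that codec instead of 'replace' would preserve the original names.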