Sunday, June 30, 2013

A question about an encoding issue when reading a file with Python

Hi all,

When I try to read a file whose actual encoding is GBK, I get the exception below:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/", line 675, in readline
    return self.reader.readline(size)
UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 0-1: illegal multibyte sequence

Here is my code:
>>> import codecs
>>> f = codecs.open('namelist.txt', encoding='gb2312')
>>> f.readline()

Data example:
曹星 iamcaoyuxin
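
One thing that may be worth checking: GB2312 is a subset of GBK, so a file that really is GBK can contain byte sequences the 'gb2312' codec rejects. The sketch below is only an illustration under assumptions, not a confirmed diagnosis of namelist.txt. It writes a hypothetical one-line sample (not the real data) containing U+9555, a character that GBK can encode but GB2312 cannot, and then reads it back with both codecs. The helper read_first_line and the temporary file are made up for the example, and io.open is used in place of codecs.open so the same code runs on Python 2.7 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical sample line (not the real namelist.txt content).
# U+9555 can be encoded in GBK but not in GB2312.
sample = u"\u9555 sample_user\n"

# Write the sample as GBK bytes to a temporary stand-in file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("gbk"))


def read_first_line(file_path, encoding):
    """Return the first line of file_path decoded with the given encoding."""
    with io.open(file_path, encoding=encoding) as f:
        return f.readline()


# Decoding GBK-only bytes as GB2312 is expected to fail.
try:
    read_first_line(path, "gb2312")
    gb2312_ok = True
except UnicodeDecodeError:
    gb2312_ok = False

# Decoding with the codec that matches the file round-trips the text.
gbk_line = read_first_line(path, "gbk")

os.remove(path)
```

If the real file behaves the same way, passing encoding='gbk' when opening it would be the first thing to try. If 'gbk' fails as well, the file is probably in some other encoding and it would help to look at its first few raw bytes.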

