1. When saving a file in Notepad, you can choose one of several encodings:
ANSI: the default. The file is saved in the local operating system's default code page, which on Simplified Chinese Windows is generally GB2312 (in practice its superset GBK, code page 936).
Unicode: UTF-16 in little-endian byte order, with a BOM signature; the file starts with the bytes FF FE.
Unicode big endian: UTF-16 in big-endian byte order, with a BOM signature; the file starts with the bytes FE FF.
UTF-8: the file is encoded as UTF-8 and starts with the BOM EF BB BF (UTF-8 has no byte-order variants, so this BOM serves only to mark the file as UTF-8).
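As a rough illustration of the signatures listed above, the leading bytes of a file can be compared with the BOM constants in Python's codecs module. The function name sniff_bom and the labels it returns are hypothetical, chosen only for this sketch. The code should run on both Python 2 and Python 3.

```python
import codecs

# Each entry: (BOM bytes, Notepad label, Python codec that strips the BOM).
# The labels and the choice of codecs are assumptions made for this sketch.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8", "utf-8-sig"),                  # EF BB BF
    (codecs.BOM_UTF16_LE, "Unicode", "utf-16"),               # FF FE
    (codecs.BOM_UTF16_BE, "Unicode big endian", "utf-16"),    # FE FF
]


def sniff_bom(data):
    """Return (label, codec) if data starts with a known BOM, else None."""
    for bom, label, codec in BOMS:
        if data.startswith(bom):
            return label, codec
    return None  # no BOM: probably ANSI, or UTF-8 saved by another editor
```

A file without any of these signatures cannot be identified this way. An ANSI (GBK) file has no BOM, so the encoding still has to be guessed or known in advance.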
When Python (Python 2 here) reads the txt file, it should decode the contents into unicode:
import codecs

def read_out(self):
    with codecs.open(self.filename, 'r+') as get:
        return get.read().decode('gbk')
When writing, encode the text back into the desired encoding. This keeps the encoding of the source file unchanged and prevents the Chinese text from becoming garbled.
Use unicode everywhere inside the program, and use try...except to determine which encoding to apply.
f.write(self.filename.encode('gbk'))
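The try...except approach described above might look like the following minimal sketch. It assumes the file is either UTF-8 (with or without a BOM) or GBK; the names CANDIDATES, decode_guess, and rewrite are hypothetical, and UTF-16 files are deliberately left out. UTF-8 is tried first because it is strict, whereas GBK accepts most byte sequences and would otherwise mask UTF-8 input.

```python
import codecs
import io

# Encodings to try, in order. This list is an assumption for illustration;
# adjust it to the files you actually expect.
CANDIDATES = ("utf-8", "gbk")


def decode_guess(data, candidates=CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes data."""
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the BOM on decode and restores it on encode.
        candidates = ("utf-8-sig",)
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


def rewrite(path, transform):
    """Read path, apply transform to its text, write it back in the same encoding."""
    with io.open(path, "rb") as f:
        text, encoding = decode_guess(f.read())
    with io.open(path, "wb") as f:
        f.write(transform(text).encode(encoding))
```

For example, rewrite(path, lambda t: t.replace(u"old", u"new")) would edit a GBK file and leave it as GBK. This is a guess based on which decode succeeds, not a reliable detector, so a short file may still be misidentified.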
Secondly, text entered from the keyboard through raw_input should be decoded using stdin.encoding from the sys module:
content = raw_input().decode(sys.stdin.encoding)
type(content) is then unicode. That is all for now.