
Python deals with the character encoding of CSV files

Editor: Python

Office can save two kinds of CSV file. In the figure below, I call the format in the orange box type 1 and the format in the red box type 2.

So how do you tell which type a given CSV file is?

- Open the CSV file in Notepad and look at the lower right corner: if it shows UTF-8, the file is type 1; if it shows ANSI, the file is type 2.
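The same check can be approximated in code. The sketch below is only a heuristic, not what Notepad itself does: the function name `guess_csv_encoding` is hypothetical, and it assumes that a file which fails to decode as UTF-8 is GBK (true on a Simplified Chinese system, not in general).

```python
import codecs


def guess_csv_encoding(raw: bytes) -> str:
    """Guess the encoding of raw CSV bytes: 'utf-8-sig', 'utf-8', or 'gbk'."""
    # A UTF-8 byte order mark is an unambiguous sign of UTF-8.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # Bytes that decode cleanly as UTF-8 are treated as UTF-8.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is an "ANSI" file saved as GBK.
        return "gbk"
```

A typical use would be to read the file once in binary mode, for example `guess_csv_encoding(open(path, 'rb').read())`, and pass the result as the `encoding` argument of `open`. A file containing only ASCII characters is reported as UTF-8, which is harmless because ASCII decodes identically under both encodings.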

Type 1

A type 1 file can only be decoded as UTF-8. The code is as follows:

import csv

# file is the path to the CSV file
with open(file, encoding='utf-8', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

The figure below helps illustrate this:

Figure 1
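One detail worth checking with UTF-8 files: some tools write a byte order mark (BOM) at the start of the file, and Excel's UTF-8 CSV export is believed to be one of them. Python's `utf-8` codec keeps the BOM as a `\ufeff` character in the first field, while `utf-8-sig` strips it. The sketch below uses in-memory sample bytes (hypothetical data) and a hypothetical helper `read_rows` to show the difference.

```python
import csv
import io

# Simulated contents of a CSV file that starts with a UTF-8 BOM (sample data)
raw = b"\xef\xbb\xbfname,city\r\nLi,Beijing\r\n"


def read_rows(data: bytes, encoding: str) -> list:
    """Decode the bytes with the given encoding and parse them as CSV."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return list(csv.reader(text))


rows_plain = read_rows(raw, "utf-8")      # BOM stays in the first field
rows_sig = read_rows(raw, "utf-8-sig")    # BOM is removed

print(repr(rows_plain[0][0]))
print(repr(rows_sig[0][0]))
```

`utf-8-sig` also reads files without a BOM correctly, so it is a reasonable default when you do not know which variant you have.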

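Type 2

A type 2 file can be decoded with either the gbk or the ansi codec. In Python, ansi is a Windows-only alias for the system's ANSI code page, which is GBK on a Simplified Chinese system.

In addition, when no encoding is given, open falls back to the system locale's encoding, which is also GBK on such a system. So all three of the following forms work: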

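import csv

# Option 1: name the GBK codec explicitly
with open(file, encoding='gbk', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

# Option 2: rely on the default encoding (GBK on Simplified Chinese Windows)
with open(file, newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

# Option 3: use the Windows-only 'ansi' alias for the system ANSI code page
with open(file, encoding='ansi', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)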
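Because the default encoding and the ansi alias both depend on the machine, it can help to check what the default actually is and to pass the encoding explicitly. The following is a minimal sketch using a temporary file with made-up sample data; it assumes the file really is GBK-encoded.

```python
import csv
import locale
import os
import tempfile

# The encoding open() falls back to when none is given (depends on the system)
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# Write a small GBK-encoded CSV to a temporary file (sample data for illustration)
with tempfile.NamedTemporaryFile(
    "w", encoding="gbk", newline="", suffix=".csv", delete=False
) as tmp:
    csv.writer(tmp).writerow(["姓名", "城市"])
    path = tmp.name

# Naming the codec explicitly gives the same result on any system
with open(path, encoding="gbk", newline="") as f:
    rows = list(csv.reader(f))

os.remove(path)
print(rows)
```

Of the three forms, only the explicit `encoding='gbk'` behaves the same on Linux, macOS, and Windows systems with a different locale, which is why it is usually the safest choice for scripts that will run elsewhere.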

Why does ansi work when utf-8 does not? Notepad's "ANSI" encoding is not the same as ASCII. It refers to the system's legacy code page, which is GBK on a Simplified Chinese system, and the bytes it uses for non-ASCII characters are generally not valid UTF-8. See:

What is the difference between the ANSI, Unicode, and UTF-8 encodings in Windows Notepad? - Zhihu
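The incompatibility is easy to reproduce. This short sketch encodes the same two characters with both codecs and shows that the GBK bytes cannot be decoded as UTF-8, while plain ASCII text is identical under both.

```python
text = "中文"

gbk_bytes = text.encode("gbk")      # 2 bytes per character
utf8_bytes = text.encode("utf-8")   # 3 bytes per character

# The GBK byte sequence is not valid UTF-8, so decoding raises an error
try:
    gbk_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ASCII-only text produces the same bytes under both codecs
ascii_same = "name,age".encode("gbk") == "name,age".encode("utf-8")

print(gbk_bytes, utf8_bytes, decoded_ok, ascii_same)
```

This is why a CSV that contains only ASCII characters opens fine with either encoding, and problems only appear once the file contains Chinese text.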

