
Reading and writing files in Python (incomplete)

Editor: Python

Python can open many types of files, including: txt, xlsx, csv, zip, json, xml, html, images, hdf, pdf, docx, mp3, mp4

1.txt file

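Open a file: f = open(filepath, 'r'), with open(filepath, 'r') as f:, codecs.open(), or io.open() (in Python 3, io.open is the same function as the built-in open)

Read its contents: lines = f.read() (the whole file as one string), for line in f (iterate line by line), f.readline() (one line), or f.readlines() (a list of all lines)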

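Common parameters

1）filepath: the path of the file to open

2）mode: for example 'r', 'w', 'a', 'wb', 'rb'

3）encoding: the text encoding, commonly utf-8

Writing

f.write(): writes a string

f.writelines(): takes a list of strings and writes them in order; it does not add line breaks, so each string must end with '\n' to land on its own line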

f.seek(offset, whence): moves the file pointer

（1）whence=0: move to offset bytes from the start of the file (the default)
（2）whence=1: move offset bytes relative to the current position
（3）whence=2: move offset bytes relative to the end of the file; offset is usually zero or negative

For files opened in text mode, only whence=0 is generally allowed; the exception is a zero offset, as in f.seek(0, 2) to jump to the end.

f.flush(): flushes buffered writes to the file without closing it

f.tell(): returns the current position of the file pointer
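The calls in this section can be combined in one short sketch. The file name demo.txt and the temporary directory are assumptions made only so the example runs anywhere; they are not part of the original notes.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' creates or truncates the file; the with block closes it automatically.
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")                           # write a single string
    f.writelines(["second line\n", "third line\n"])   # no line breaks are added for you
    f.flush()                                         # push buffered data out without closing

with open(path, "r", encoding="utf-8") as f:
    first = f.readline()                       # one line, including the trailing newline
    position = f.tell()                        # position marker after the first line
    rest = f.readlines()                       # the remaining lines as a list
    f.seek(0)                                  # back to the start of the file (whence=0)
    lines = [line.rstrip("\n") for line in f]  # iterate line by line

# Binary mode allows seeking relative to the end of the file.
with open(path, "rb") as f:
    f.seek(-5, 2)                              # 5 bytes before the end (whence=2)
    tail = f.read()

print(first, rest, lines, tail)
```

Using with for every open means the file is closed even if an exception is raised, which is why flush() is rarely needed in practice.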

2.xlsx file

import pandas as pd
pf = pd.read_excel('train.xlsx', sheet_name='xx')  # older pandas spelled this parameter sheetname

read_excel Common parameters

1）io: the file path

2）sheet_name (sheetname in older pandas): defaults to 0, the first sheet; None returns all sheets as a dict of DataFrames

3）header: the row to use as column names; defaults to the first row

4）skiprows: the number of rows to skip at the start

5）skipfooter (skip_footer in older pandas): the number of rows to skip at the end

6）index_col: the column to use as the row index

7）names: the list of column names to use

to_excel Common parameters

3.csv file

import pandas as pd
pf = pd.read_csv('train.csv')

read_csv Common parameters

1）filepath_or_buffer: a file path string, a URL, or a file-like object such as an open file handle or StringIO

2）sep: the separator; defaults to ','

3）header: the row number to use for column names; header=0 uses the first row, and header=None means the file has no header row, so integer column labels are assigned automatically

4）names: a list of column names to use; combine with header=0 to replace an existing header row

5）dtype: the data type of the columns

6）nrows: the number of rows to read

7）chunksize: the number of rows per chunk when reading the file in chunks

8）index_col: the column to use as the row index; several columns can be given to form a hierarchical index. By default no column is used and a numeric index starting from 0 is added.

9）parse_dates=True: tries to parse the index as dates; pass a list of column names to parse those columns instead

import pandas as pd
pf.to_csv('train.csv')  # to_csv is a DataFrame method; pf is the DataFrame read above

to_csv Common parameters

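1）path_or_buf: the output file path or file object

2）sep: the field separator; defaults to ','

3）columns: the subset of columns to write

4）encoding: the encoding of the output file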
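As a rough illustration of the read_csv and to_csv parameters listed in this section, the sketch below reads from an in-memory string instead of train.csv. The column names (id, date, price) and the values are made up for the example.

```python
import io
import pandas as pd

# Made-up CSV content standing in for 'train.csv'.
csv_text = "id;date;price\n1;2021-01-01;10.5\n2;2021-01-02;11.0\n3;2021-01-03;12.25\n"

df = pd.read_csv(
    io.StringIO(csv_text),   # filepath_or_buffer also accepts a path or a URL
    sep=";",                 # non-default separator
    header=0,                # the first row holds the column names
    index_col="id",          # use the 'id' column as the row index
    parse_dates=["date"],    # parse this column as datetimes
    dtype={"price": float},  # force the type of one column
    nrows=2,                 # read only the first two data rows
)

# chunksize returns an iterator of DataFrames instead of a single DataFrame.
chunk_sizes = [
    len(chunk)
    for chunk in pd.read_csv(io.StringIO(csv_text), sep=";", chunksize=2)
]

# to_csv is called on the DataFrame; with no path it returns the CSV text.
out = df.to_csv(sep=",", columns=["price"])

print(df)
print(chunk_sizes)
print(out)
```

Reading in chunks keeps memory use bounded for large files, at the cost of having to process or concatenate the pieces yourself.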

4.zip file

import zipfile
archive = zipfile.ZipFile('T.zip', 'r')
data = archive.read('train.csv')  # raw bytes of the member, not a DataFrame
archive.close()
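Since archive.read() returns bytes, a little more work is needed to get rows out of a zipped CSV. The sketch below builds a small archive in memory so it runs without T.zip on disk; the member content is made up for the example.

```python
import csv
import io
import zipfile

# Build a small archive in memory (stand-in for 'T.zip').
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("train.csv", "id,label\n1,cat\n2,dog\n")

with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()        # members stored in the archive
    raw = archive.read("train.csv")   # the whole member as bytes
    # archive.open() returns a binary file object; wrap it to read text.
    with archive.open("train.csv") as member:
        text_stream = io.TextIOWrapper(member, encoding="utf-8")
        rows = list(csv.reader(text_stream))

print(names, raw, rows)
```

The file object returned by archive.open() can also be handed to other readers that accept file-like objects, so the archive does not have to be extracted to disk first.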

5.json

import pandas as pd
df = pd.read_json('train.json')

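Common parameters

1）path_or_buf: a file path, URL, or file-like object

2）orient: the expected layout of the JSON string, for example 'records', 'index', 'columns', 'split', or 'values'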

https://blog.csdn.net/qq_24499417/article/details/81428594
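To make the orient parameter more concrete, the sketch below reads the same two rows written in two different JSON layouts. The data and the labels r1 and r2 are made up for the example, and in-memory strings stand in for train.json.

```python
import io
import pandas as pd

# The same two rows in two different layouts.
records_json = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}]'              # a list of row objects
index_json = '{"r1": {"a": 1, "b": 2}, "r2": {"a": 3, "b": 4}}'    # row label -> row object

df_records = pd.read_json(io.StringIO(records_json), orient="records")
df_index = pd.read_json(io.StringIO(index_json), orient="index")

print(df_records)
print(df_index)
```

With orient="records" the rows get a default numeric index, while orient="index" takes the row labels from the outer keys.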

6.xml

import xml.etree.ElementTree as ET
tree = ET.parse('/home/sunilray/Desktop/2 sigma/train.xml')
root = tree.getroot()  # root element of the parsed document
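Once the document is parsed, the elements still have to be walked to get at the data. The sketch below shows one way to do that; the element names (rows, row, name, score) are made up for the example, and an inline string stands in for train.xml.

```python
import xml.etree.ElementTree as ET

# Made-up XML content standing in for 'train.xml'.
xml_text = """<rows>
  <row id="1"><name>Alice</name><score>90</score></row>
  <row id="2"><name>Bob</name><score>85</score></row>
</rows>"""

# ET.fromstring parses a string; ET.parse(path).getroot() gives the root for a file.
root = ET.fromstring(xml_text)

records = []
for row in root.findall("row"):                   # direct children named 'row'
    records.append({
        "id": row.get("id"),                      # attribute value (a string)
        "name": row.findtext("name"),             # text of the child element
        "score": int(row.findtext("score")),      # convert text to a number
    })

print(root.tag, records)
```

A list of dicts like records can be passed to pd.DataFrame() if a table is needed afterwards.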

7.html

Use the BeautifulSoup library to read an HTML file

from urllib.request import urlopen  # Python 3; on Python 2 use urllib2.urlopen

wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"

page = urlopen(wiki)

from bs4 import BeautifulSoup

# Parse the HTML in the 'page' variable and store it as a BeautifulSoup object
soup = BeautifulSoup(page, 'html.parser')  # name the parser explicitly to avoid a warning

https://www.analyticsvidhya.com/blog/2015/10/beginner-guide-web-scraping-beautiful-soup-python/

8.images

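from scipy import datasets  # scipy.misc.face() moved here in SciPy 1.10; older versions use scipy.misc
import matplotlib.pyplot as plt
f = datasets.face()  # sample RGB image as a NumPy array
plt.imsave('face.png', f)  # scipy.misc.imsave was removed in SciPy 1.2
plt.imshow(f)
plt.show()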

https://www.analyticsvidhya.com/blog/2014/12/image-processing-python-basics/

9.hdf

import pandas as pd
df = pd.read_hdf('train.h5')

10.pdf

Install the pdfminer library from its source directory:

python setup.py install

pdf2txt.py train.pdf # test reading a PDF

11.docx

Install the docx2txt library:

pip install docx2txt

Read a docx file:

import docx2txt
text = docx2txt.process("file.docx")

12.mp3

http://pymedia.org/tut/index.html

13.mp4

http://zulko.github.io/moviepy/

from moviepy.editor import VideoFileClip
clip = VideoFileClip('<video_file>.mp4')

Citations

https://blog.csdn.net/hellocsz/article/details/79623142

This blog post is very good; it has many examples and is easy to understand.

https://blog.csdn.net/sinat_35562946/article/details/81058221

This one is also very good.

https://www.cnblogs.com/hackpig/p/8215786.html

This one covers txt files.

https://blog.csdn.net/u010801439/article/details/80033341

https://www.jianshu.com/p/03e3cfd5519e

https://www.analyticsvidhya.com/blog/2017/03/read-commonly-used-formats-using-python/
