
Reading and writing files in Python (incomplete)


Python can read many file types, including txt, xlsx, csv, zip, json, xml, html, images, hdf, pdf, docx, mp3, and mp4.

1.txt file

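Open a file with f = open(filepath, 'r') or, preferably, with open(filepath, 'r') as f, which closes the file automatically. codecs.open() and io.open() also work; in Python 3, io.open() is the same function as the built-in open().

Read the contents with f.read() (the whole file as one string), f.readline() (one line), f.readlines() (a list of all lines), or by looping with for line in f.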

Common parameters

1)filepath

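2) mode: 'r' (read), 'w' (write, truncating the file), 'a' (append), 'rb' and 'wb' (binary read and write)

3) encoding: the text encoding, commonly utf-8

To write, use f.write(), which writes a single string, or

f.writelines(), which takes a list of strings and writes them in order. It does not add line breaks, so each string needs its own '\n'.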

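f.seek(offset, whence) moves the file pointer. The second argument selects the reference point:

(1) whence=0 (the default): move to offset bytes from the start of the file
(2) whence=1: move offset bytes relative to the current position
(3) whence=2: move offset bytes relative to the end of the file, so offset is normally negative or zero

Files opened in text mode only support seeking from the start of the file; open the file in binary mode ('rb') to use whence=1 or 2 with a nonzero offset.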

f.flush() writes buffered changes to the file without closing it.

f.tell() returns the current position of the file pointer.
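
The calls above can be tried together in a short sketch. The file name demo.txt and the temporary directory are made up for illustration, and newline="\n" is passed only so that the byte offsets are the same on every platform.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' creates or truncates the file; writelines() adds no line breaks itself.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("first line\n")
    f.writelines(["second line\n", "third line\n"])
    f.flush()  # push buffered data out without closing the file

# Iterating over the file object yields one line at a time.
with open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

# Binary mode allows seeking relative to the current position or the end.
with open(path, "rb") as f:
    f.seek(6, 0)         # 6 bytes from the start of the file
    word = f.read(4)     # the 4 bytes after "first "
    position = f.tell()  # pointer position after the read
    f.seek(-11, 2)       # 11 bytes before the end of the file
    tail = f.read()      # everything from there to the end

print(lines, word, position, tail)
```

The with statement closes each file on exit, so no explicit f.close() is needed.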

2.xlsx file

import pandas as pd
pf = pd.read_excel('train.xlsx', sheet_name='xx')  # the parameter was called sheetname in older pandas versions

read_excel Common parameters

1) io: file path

2) sheet_name (sheetname in older pandas versions): defaults to 0, the first sheet; None returns every sheet as a dict of DataFrames

3) header: row to use for the column names; defaults to the first row

4) skiprows: number of rows to skip at the top

5) skipfooter (skip_footer in older pandas versions): number of rows to skip at the bottom

6) index_col: column to use as the row index

7) names: list of column names to use

to_excel Common parameters

 

3.csv file

import pandas as pd
pf = pd.read_csv('train.csv')

read_csv Common parameters

1) filepath_or_buffer: a file path string, a URL, a file handle, or a StringIO object

2) sep: the delimiter; defaults to ','

3) header: number of the row to use for the column names. header=0 uses the first row; with header=None no row is treated as a header and the columns are numbered automatically

4) names: list of column names to use; with header=0 they replace the names in the file's header row

5) dtype: data type of the columns

6) nrows: number of rows to read

7) chunksize: number of rows per chunk when reading the file in blocks

8) index_col: column to use as the row index; several columns can be given to form a hierarchical index. By default no column is used and the rows are numbered from 0.

9) parse_dates=True: try to parse the index as dates; a list of column names parses those columns instead.

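import pandas as pd
pf.to_csv('train.csv')  # to_csv is a method of the DataFrame, not of pd

to_csv Common parameters

1) path_or_buf: file path or file object to write to

2) sep: the delimiter

3) columns: which columns to write (optional)

4) encoding: encoding of the output file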
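
As a rough illustration of the read_csv and to_csv parameters listed above, the sketch below reads a small in-memory CSV instead of train.csv. The sample data and the column names (id, name, score, key, who, points) are invented for the example.

```python
import io
import pandas as pd

# Made-up sample data standing in for a real CSV file.
csv_text = "id,name,score\n1,Ann,90\n2,Bob,85\n3,Cy,70\n"

# header=0 takes the column names from the first row;
# index_col picks the column used as the row index.
df = pd.read_csv(io.StringIO(csv_text), sep=",", header=0, index_col="id")

# header=None treats every row as data, names supplies the column names,
# and skiprows=1 drops the original header row.
renamed = pd.read_csv(
    io.StringIO(csv_text), header=None, skiprows=1, names=["key", "who", "points"]
)

# nrows limits how many data rows are read.
first_two = pd.read_csv(io.StringIO(csv_text), nrows=2)

# chunksize returns an iterator of DataFrames instead of a single DataFrame.
chunk_lengths = [len(chunk) for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2)]

# to_csv is called on the DataFrame; columns selects what gets written.
out = io.StringIO()
df.to_csv(out, sep=",", columns=["name"])
written_lines = out.getvalue().splitlines()

print(df)
print(written_lines)
```

Passing a StringIO object works because filepath_or_buffer and path_or_buf accept file-like objects as well as paths.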

4.zip file

import zipfile
archive = zipfile.ZipFile('T.zip', 'r')
data = archive.read('train.csv')  # raw bytes of the archive member, not a DataFrame
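
A minimal sketch of the same idea, using an archive built in memory so that it runs without T.zip on disk. The member contents are invented; only standard zipfile calls are used.

```python
import io
import zipfile

# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("train.csv", "id,label\n1,a\n2,b\n")

buffer.seek(0)
with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()                 # members in the archive
    raw = archive.read("train.csv")            # whole member as bytes
    with archive.open("train.csv") as member:  # member as a file-like object
        first_line = member.readline().decode("utf-8").strip()

text = raw.decode("utf-8")  # decode the bytes before treating them as text
print(names, first_line)
```

The file-like object returned by archive.open() can also be handed to readers that accept file handles, which avoids extracting the archive first.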

5.json

import pandas as pd
df = pd.read_json('train.json')

Common parameters

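1) path_or_buf: file path, URL, or file-like object

2) orient: the layout of the JSON string, such as 'records', 'index', 'columns', 'split', or 'values'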

https://blog.csdn.net/qq_24499417/article/details/81428594
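
To make the orient parameter more concrete, here is a small sketch with two hand-written JSON strings read through StringIO. The data, the row labels r1 and r2, and the use of to_json for writing are assumptions added for illustration.

```python
import io
import json
import pandas as pd

# 'records' layout: a list with one object per row.
records_json = '[{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]'
df_records = pd.read_json(io.StringIO(records_json), orient="records")

# 'index' layout: an object keyed by row label.
index_json = '{"r1": {"a": 1, "b": "x"}, "r2": {"a": 2, "b": "y"}}'
df_index = pd.read_json(io.StringIO(index_json), orient="index")

# to_json writes a DataFrame back out in the requested layout.
round_trip = json.loads(df_records.to_json(orient="records"))

print(df_records)
print(df_index)
```

Both calls produce the same two columns; only the row index differs, which is what orient controls.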

6. xml

import xml.etree.ElementTree as ET
tree = ET.parse('/home/sunilray/Desktop/2 sigma/train.xml')
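
Once the tree is parsed, the data is reached through the root element. The sketch below assumes a made-up document structure (records, record, name, score) and parses it from a string rather than from train.xml.

```python
import io
import xml.etree.ElementTree as ET

# Invented sample document; a real train.xml will have its own structure.
xml_text = """<records>
  <record id="1"><name>Ann</name><score>90</score></record>
  <record id="2"><name>Bob</name><score>85</score></record>
</records>"""

tree = ET.parse(io.StringIO(xml_text))  # parse() accepts a path or a file object
root = tree.getroot()

# Collect attributes and child text from every <record> element.
rows = [
    {
        "id": rec.get("id"),
        "name": rec.findtext("name"),
        "score": int(rec.findtext("score")),
    }
    for rec in root.findall("record")
]

# Writing: add a new element, then serialize the tree back to a string.
new_record = ET.SubElement(root, "record", id="3")
ET.SubElement(new_record, "name").text = "Cy"
serialized = ET.tostring(root, encoding="unicode")

print(rows)
```

To save the modified tree to disk, tree.write() takes a file path in the same way ET.parse() does.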

7.html

Use the BeautifulSoup library to read an HTML file:

import urllib.request  # Python 3; on Python 2 the module was urllib2
from bs4 import BeautifulSoup
wiki = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"
page = urllib.request.urlopen(wiki)
# Parse the HTML in the 'page' variable and store it as a BeautifulSoup object
soup = BeautifulSoup(page, 'html.parser')

https://www.analyticsvidhya.com/blog/2015/10/beginner-guide-web-scraping-beautiful-soup-python/

8.images

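from scipy import datasets  # face() lived in scipy.misc in older SciPy versions
import matplotlib.pyplot as plt
f = datasets.face()  # sample image as a NumPy array
plt.imsave('face.png', f)  # scipy.misc.imsave has been removed from SciPy
plt.imshow(f)
plt.show()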

 https://www.analyticsvidhya.com/blog/2014/12/image-processing-python-basics/

9.hdf

import pandas as pd
df = pd.read_hdf('train.h5')

10.pdf

Install the pdfminer library from its source directory:

python setup.py install
pdf2txt.py train.pdf  # check that the PDF can be read

 

11.docx

Install the docx2txt library:

pip install docx2txt

Read a docx file:

import docx2txt
text = docx2txt.process("file.docx")

 

12.mp3

http://pymedia.org/tut/index.html

13.mp4

http://zulko.github.io/moviepy/

from moviepy.editor import VideoFileClip  # in moviepy 2.x: from moviepy import VideoFileClip
clip = VideoFileClip('<video_file>.mp4')

Citations

https://blog.csdn.net/hellocsz/article/details/79623142

This blog post is very good, with many examples that are easy to follow.

https://blog.csdn.net/sinat_35562946/article/details/81058221

This one is also very good.

https://www.cnblogs.com/hackpig/p/8215786.html

Covers txt files.

https://blog.csdn.net/u010801439/article/details/80033341

https://www.jianshu.com/p/03e3cfd5519e

https://www.analyticsvidhya.com/blog/2017/03/read-commonly-used-formats-using-python/

