
Pandas in Python


Python and Pandas

Contents

  • Python and Pandas
  • I. What is pandas?
  • II. How does it differ from NumPy?
  • III. Where it is used
  • IV. Basic usage
    • 1. Importing the libraries
    • 2. Reading and saving data
    • 3. Indexing (assignment and modification)
    • 4. Creating tables and reading their attributes
    • 5. Common methods
    • 6. Simple operations
    • 7. Tips


I. What is pandas?

The name pandas is commonly explained as short for "Python data analysis" (it also derives from "panel data"). As the name suggests, it is a library for data analysis, built as an extension on top of NumPy. pandas is widely used in academia, finance, statistics, and other data analysis fields.

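II. How does it differ from NumPy?

The main difference is that the elements of a pandas table can be heterogeneous: each column of a DataFrame can hold a different data type, whereas a NumPy array has a single data type for all of its elements.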
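The difference can be seen in a minimal sketch. The column names and values below are made up for illustration and are not from the original article.

```python
import numpy as np
import pandas as pd

# A NumPy array has one dtype, so mixed values are coerced to a common type.
arr = np.array([[1, "x"], [2, "y"]])
print(arr.dtype)  # a Unicode string dtype: the integers were converted to strings

# A DataFrame keeps a separate dtype for each column.
df = pd.DataFrame({"count": [1, 2], "label": ["x", "y"], "ratio": [0.5, 1.5]})
print(df.dtypes)
```

Here the array silently turns the integers into strings, while the DataFrame keeps the integer, text, and floating-point columns apart.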

III. Where it is used

Reading and saving files (Excel, CSV)
Building tables

IV. Basic usage

1. Importing the libraries

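import numpy as np  # used by the later examples (np.arange, np.float64)
import pandas as pd
import ssl
# Disable HTTPS certificate verification so that read_csv can fetch URLs on machines with certificate problems.
# This is insecure; use it only with sources you trust.
ssl._create_default_https_context = ssl._create_unverified_context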

2. Reading and saving data

Parameter   Description
header      Row number to use as the column names (comment lines are skipped). If no column names are passed via names, the default is header=0; if column names are passed explicitly, the behavior is the same as header=None.
index_col   Defaults to None. Column(s) to use as the row labels of the DataFrame; if a sequence is given, a MultiIndex is used. If the file has a delimiter at the end of each line, consider index_col=False so that pandas does not use the first column as the row labels.
dtype       Data type for each column. Example: {'a': np.float64, 'b': np.int32}, where a and b are column names.
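As a rough illustration of these parameters, the sketch below reads a small in-memory text buffer instead of a real file. The buffer contents and the column names a and b are assumptions made for the example.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical space-separated data without a header row (stands in for a file on disk).
raw = io.StringIO("1 2\n3 4\n")

df = pd.read_csv(
    raw,
    sep=" ",
    header=None,                             # the data has no column-name row
    names=["a", "b"],                        # so the names are supplied explicitly
    dtype={"a": np.float64, "b": np.int32},  # one data type per column
)
print(df)
print(df.dtypes)
```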
Reading a file:
From a URL:
data = pd.read_csv(
    'https://labfile.oss.aliyuncs.com/courses/1283/adult.data.csv')
print(data.head())
From a local file:
data = pd.read_csv("./test1.txt", sep=' ')  # read a space-separated file into a pandas table; all other options keep their defaults
data = pd.read_csv("./test1.txt", sep=' ', header=None, index_col=False, dtype=np.float64)  # no header row, no index column, every column read as float64
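Saving works in the opposite direction with to_csv. The following is a minimal sketch of a save-and-reload round trip; the file name saved.csv and the table contents are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"A": [12, 16, 20], "B": [13, 17, 21]}, index=["a", "b", "c"])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "saved.csv")  # hypothetical file name
    df.to_csv(path)                            # the row labels are written as the first column
    restored = pd.read_csv(path, index_col=0)  # read that first column back as the row labels

print(restored)
```

Passing index_col=0 on the way back in is what restores the row labels; without it they would come back as an ordinary unnamed column.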

3. Indexing (assignment and modification)

Row and column indexing

----------------------------------
data = pd.DataFrame(np.arange(12, 24).reshape((3, 4)), index=["a", "b", "c"], columns=["A", "B", "C", "D"])
out:
    A   B   C   D
a  12  13  14  15
b  16  17  18  19
c  20  21  22  23
----------------------------------
# 1. Using data[]:
data[...]  # rows (with data[], rows can only be selected by a slice)
----------------------
...
----------------------
data[{row position 1}:{row position 2}]  # multiple rows by position; the end position is excluded
data['{row label 1}':'{row label 2}']  # multiple rows by label; the end label is included
----------------------
in: data[0:3]
out:
    A   B   C   D
a  12  13  14  15
b  16  17  18  19
c  20  21  22  23
----------------------
in: data['a':'c']
out:
    A   B   C   D
a  12  13  14  15
b  16  17  18  19
c  20  21  22  23
----------------------
data['{column name}']  # one column
----------------------
in: data['A']
out:
a    12
b    16
c    20
----------------------
data[['{column name 1}', '{column name 2}']]  # multiple columns
----------------------
in: data[['A','B']]
out:
    A   B
a  12  13
b  16  17
c  20  21
----------------------
# 2. By index label, using loc:
data.loc['{row label}']  # one row
----------------------
in: data.loc['a']
out:
A    12
B    13
C    14
D    15
----------------------
data.loc[['{row label 1}', '{row label 2}']]  # multiple rows
----------------------
in: data.loc[['a','b']]
out:
    A   B   C   D
a  12  13  14  15
b  16  17  18  19
----------------------
# 3. By integer position, using iloc:
data.iloc[{row position}]  # one row (positions are integers, not strings)
----------------------
in: data.iloc[1]
out:
A    16
B    17
C    18
D    19
----------------------
data.iloc[[{row position 1}, {row position 2}]]  # multiple rows
----------------------
in: data.iloc[[1,2]]
out:
    A   B   C   D
b  16  17  18  19
c  20  21  22  23
----------------------
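The same indexers can also be used on the left-hand side of an assignment to modify the table. The sketch below reuses the 3x4 example table; the values written into it are arbitrary and chosen only for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(12, 24).reshape((3, 4)),
                    index=["a", "b", "c"], columns=["A", "B", "C", "D"])

data.loc["a", "A"] = 100      # one cell, addressed by label
data.iloc[1, 1] = 200         # one cell, addressed by position (row "b", column "B")
data["D"] = 0                 # a whole column
data.loc["c"] = [1, 2, 3, 4]  # a whole row (this also overwrites the 0 in column "D")
print(data)
```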

Slicing and selecting rows and columns

data = pd.DataFrame(np.arange(12, 24).reshape((3, 4)), index=["a", "b", "c"], columns=["A", "B", "C", "D"])
# All four statements below print the same result
print(data.loc['a':'b','A':'B'])  # slice by label (both ends included)
print(data.loc[['a','b'],['A','B']])  # select by label
print(data.iloc[0:2,0:2])  # slice by position (end excluded)
print(data.iloc[[0,1],[0,1]])  # select by position
out:
    A   B
a  12  13
b  16  17

4. Creating tables and reading their attributes

data = pd.DataFrame(data={input data}, index=[{list of row names}], columns=[{list of column names}])  # the keyword is data, not values
print(data.index)  # row labels
print(data.columns)  # column labels
print(data.values)  # the values as a NumPy array
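A concrete instance of the template above might look like the following sketch. The table name scores, the row and column names, and the numbers are all hypothetical.

```python
import pandas as pd

# Hypothetical scores table; names and numbers are illustrative only.
scores = pd.DataFrame(
    data=[[90, 85], [70, 95]],
    index=["alice", "bob"],
    columns=["math", "physics"],
)
print(scores.index)    # row labels
print(scores.columns)  # column labels
print(scores.values)   # 2x2 NumPy array of the values
```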

5. Common methods

print(data.describe())  # summary statistics per column: count, mean, standard deviation, minimum, quartiles (including the median), maximum
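Individual statistics can be read out of the result of describe() like any other table. A small sketch, assuming a made-up two-column table:

```python
import pandas as pd

df = pd.DataFrame({"A": [12, 16, 20], "B": [13, 17, 21]})

stats = df.describe()          # one row per statistic, one column per numeric column
print(stats)
print(stats.loc["mean", "A"])  # average of column A
print(stats.loc["50%", "B"])   # median of column B
```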

6. Simple operations

data = pd.DataFrame(data={input data}, index=[{list of row names}], columns=[{list of column names}])
data.loc['total'] = data.apply(lambda x: x.sum())  # add a 'total' row holding the sum of each column
# Alternative: add a 'mean' row holding the average of each column. Run it instead of the line above; otherwise the 'total' row is counted in the average.
data.loc['mean'] = data.apply(lambda x: x.mean())
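If both a total row and an average row are wanted in the same table, one possible approach is to compute both from the original rows first and append them afterwards. This is a sketch of that idea, not the article's own method; it uses the built-in sum() and mean() rather than apply, and the row label 'mean' is an assumption.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(12, 24).reshape((3, 4)),
                    index=["a", "b", "c"], columns=["A", "B", "C", "D"])

# Compute both summaries before appending either one,
# so that neither summary row is counted in the other.
totals = data.sum()
means = data.mean()
data.loc["total"] = totals
data.loc["mean"] = means
print(data)
```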

7. Tips

# Return the rows whose value in a given column is one of the listed values
data[data['{column name}'].isin([{values to match}])]
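Applied to the 3x4 example table used earlier, the filter could look like this sketch. The values 12 and 20 are chosen arbitrarily.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(12, 24).reshape((3, 4)),
                    index=["a", "b", "c"], columns=["A", "B", "C", "D"])

mask = data["A"].isin([12, 20])  # boolean Series: True where column A is 12 or 20
matches = data[mask]             # keep only the rows where the mask is True
print(matches)
```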
