When working with Pandas, we deal not only with the data itself but, even more often, with tables. Displaying tabular data well depends on getting the settings right up front.
This article introduces common Pandas configuration tips, built mainly around options and settings. The official documentation is here: https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html
<!--MORE-->
Import Pandas in the conventional way:
import pandas as pd
Because Pandas keeps being updated, some of its usages may be removed soon, so warnings (not errors) show up often. The following code suppresses them:
# Ignore the warning
import warnings
warnings.filterwarnings('ignore')
By default, 6 decimal places are kept. Print the current precision as follows:
pd.get_option('display.precision')
6
Set the precision to 2 decimal places:
pd.set_option('display.precision', 2)  # Alternative: pd.options.display.precision = 2
Printing again shows that the current precision is now 2:
pd.get_option('display.precision')
2
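To see what the precision setting does to an actual table, here is a minimal sketch. The frame and the column name `ratio` are made up for illustration, and `pd.option_context` is used so that the change applies only inside the block.

```python
import pandas as pd

# Illustrative frame; the column name "ratio" is invented for this sketch.
df = pd.DataFrame({"ratio": [1 / 3, 2 / 3]})

# Limit the displayed precision to 2 decimal places inside this block only.
with pd.option_context("display.precision", 2):
    rendered = repr(df)

print(rendered)
# The stored values keep their full precision; only the display changes.
print(df["ratio"].iloc[0])
```

The option affects how the numbers are printed, not the values stored in the frame.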
The default maximum number of rows displayed is 60:
pd.get_option("display.max_rows")  # The default is 60
60
The default minimum number of rows is 10; this is the number of rows shown once a frame exceeds max_rows and is truncated:
pd.get_option("display.min_rows")  # Minimum number of rows shown
10
Change the maximum number of displayed rows to 999, then check it:
pd.set_option("display.max_rows", 999)  # Maximum number of rows displayed
pd.get_option("display.max_rows")
999
Change the minimum number of displayed rows:
pd.set_option("display.min_rows", 20)
pd.get_option("display.min_rows")
20
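How the two row settings interact can be sketched as follows. The 100-row frame and the values 10 and 4 are assumptions chosen only to make the truncation visible.

```python
import pandas as pd

# Illustrative frame with 100 rows.
df = pd.DataFrame({"a": range(100)})

# 100 rows exceed max_rows=10, so the output is truncated to min_rows=4 rows.
with pd.option_context("display.max_rows", 10, "display.min_rows", 4):
    truncated = repr(df)

# With max_rows=999 the frame fits and every row is printed.
with pd.option_context("display.max_rows", 999):
    full = repr(df)

print(truncated)
print(len(truncated.splitlines()), len(full.splitlines()))
```

When the frame has no more rows than max_rows, min_rows plays no role.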
After calling reset_option, the setting returns to its default value:
pd.reset_option("display.max_rows")
pd.get_option("display.max_rows")  # Back to 60
60
pd.reset_option("display.min_rows")
pd.get_option("display.min_rows")  # Back to 10
10
If several options have been modified and we want to restore them together, a regular expression can reset multiple options at once.
Here, every setting whose name starts with display is reset:
# ^ anchors the match at the start of the name, so every option beginning with display is reset
pd.reset_option("^display")
Passing all resets every setting:
pd.reset_option('all')
Since the number of displayed rows can be controlled, the number of displayed columns can be controlled too.
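Before moving on to columns, the pattern-based reset described above can be sketched with a narrower pattern. The pattern `^display.max_` is an assumption picked for this illustration; it shows that only matching options are restored while the others keep their modified values.

```python
import pandas as pd

# Modify three options.
pd.set_option("display.max_rows", 999)
pd.set_option("display.max_colwidth", 100)
pd.set_option("display.precision", 2)

# Reset only the options whose names start with "display.max_".
pd.reset_option("^display.max_")

max_rows_after = pd.get_option("display.max_rows")
max_colwidth_after = pd.get_option("display.max_colwidth")
# display.precision does not match the pattern, so it keeps the modified value.
precision_after = pd.get_option("display.precision")
print(max_rows_after, max_colwidth_after, precision_after)

# Clean up the remaining modified option.
pd.reset_option("display.precision")
```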
The default number of columns displayed is 20:
pd.get_option('display.max_columns')
# Alternative: access it as an attribute
pd.options.display.max_columns
20
Change the number of displayed columns to 100:
# Change to 100
pd.set_option('display.max_columns', 100)
Check the new number of columns:
# View the modified value
pd.get_option('display.max_columns')
100
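The effect of the column limit on a wide table can be sketched like this. The 30-column frame and the names `c0` to `c29` are invented for the illustration.

```python
import pandas as pd

# One row and 30 columns named c0 ... c29 (names invented for this sketch).
wide = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

# With a limit of 6 columns, the middle columns are replaced by "...".
with pd.option_context("display.max_columns", 6):
    capped = repr(wide)

# With a limit of 100, all 30 columns are printed.
with pd.option_context("display.max_columns", 100):
    uncapped = repr(wide)

print(capped)
```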
Setting it to None displays all columns:
pd.set_option('display.max_columns', None)
pd.reset_option('display.max_columns')
The settings above control how many columns are shown; the following ones control the width of each column. The width of a single column is measured in characters, and content beyond it is replaced by an ellipsis.
The default column width is 50 characters:
pd.get_option('display.max_colwidth')
50
Change the displayed column width to 100:
# Change to 100
pd.set_option('display.max_colwidth', 100)
Check the displayed column width:
pd.get_option('display.max_colwidth')
100
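A minimal sketch of the column-width limit in action, assuming an illustrative 40-character string and the widths 10 and 100:

```python
import pandas as pd

# A single cell holding a 40-character string (illustrative data).
long_text = "a" * 40
df = pd.DataFrame({"text": [long_text]})

# A width of 10 characters cuts the cell and ends it with an ellipsis.
with pd.option_context("display.max_colwidth", 10):
    narrow = repr(df)

# A width of 100 characters is enough to show the whole string.
with pd.option_context("display.max_colwidth", 100):
    wide = repr(df)

print(narrow)
print(wide)
```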
Show the full content of every column:
pd.set_option('display.max_colwidth', None)
When the output is wider than the configured display width, expand_frame_repr decides whether it is wrapped. Use True to wrap the output across multiple lines and False to keep it unwrapped.
pd.set_option("expand_frame_repr", True)  # Wrap
pd.set_option("expand_frame_repr", False)  # Do not wrap
All of the settings described above apply to the whole environment once changed. We can also apply temporary settings to a single code block.
Once the code block finishes running, the temporary settings expire and the original values are restored.
Suppose this is the first code block :
print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_columns"))
60
20
Here is the second code block:
# Settings that apply only inside this code block
with pd.option_context("display.max_rows", 20, "display.max_columns", 10):
    print(pd.get_option("display.max_rows"))
    print(pd.get_option("display.max_columns"))
20
10
Here is the third code block:
print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_columns"))
60
20
The example above shows that the settings have no effect outside the designated code block.
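The same restoration also happens when the block exits through an exception. The following is a small sketch under that assumption, with a simulated failure raised on purpose:

```python
import pandas as pd

before = pd.get_option("display.max_rows")

try:
    with pd.option_context("display.max_rows", 5):
        inside = pd.get_option("display.max_rows")
        # Simulated failure to show that the option is still restored.
        raise ValueError("simulated failure")
except ValueError:
    pass

after = pd.get_option("display.max_rows")
print(before, inside, after)
```

This makes option_context a safer choice than a set_option/reset_option pair when the code in between may fail.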
Pandas has a display.float_format option that formats the output of floating-point numbers, for example with a thousands separator, as a percentage, or with a fixed number of decimal places.
Other data types that can be converted to floating-point numbers can use it as well.
The option takes a callable that accepts a floating-point number and returns a string in the desired format.
When the numbers are large, a thousands separator makes them readable at a glance:
df = pd.DataFrame({
    "percent": [12.98, 6.13, 7.4],
    "number": [1000000.3183, 2000000.4578, 3000000.2991]})
df
Besides the % sign, other special symbols can be used as well:
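The three formats mentioned here can be sketched on the frame above. The format strings are the ones listed in the summary at the end of this article; using `pd.option_context` instead of `pd.set_option` is a choice made for this sketch so that each format stays local to its block.

```python
import pandas as pd

df = pd.DataFrame({
    "percent": [12.98, 6.13, 7.4],
    "number": [1000000.3183, 2000000.4578, 3000000.2991],
})

# Thousands separator with two decimal places.
with pd.option_context("display.float_format", "{:,.2f}".format):
    thousands = repr(df)

# Percentage form: a % sign is appended to every float.
with pd.option_context("display.float_format", "{:.2f}%".format):
    percent = repr(df)

# Another special symbol appended to every float.
with pd.option_context("display.float_format", "{:.2f}¥".format):
    yen = repr(df)

print(thousands)
print(percent)
print(yen)
```

Note that the formatter applies to every float column in the frame, not only to the column the symbol was meant for.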
What does threshold switching mean? This feature is implemented with the display.chop_threshold option.
It sets a threshold for displaying the values in a Series or DataFrame: values greater than the threshold are displayed as they are, and values smaller than it are displayed as 0.
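A minimal sketch of the threshold, assuming an illustrative Series and a threshold of 0.1:

```python
import pandas as pd

# Illustrative data: one value below the threshold, two above it.
s = pd.Series([0.001, 0.5, 2.0])

plain = repr(s)

# Values below 0.1 are displayed as zero inside this block.
with pd.option_context("display.chop_threshold", 0.1):
    chopped = repr(s)

print(plain)
print(chopped)
# The underlying data is unchanged; only the display differs.
print(s.iloc[0])
```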
By default, pandas uses matplotlib as its plotting backend. This setting can be changed:
import matplotlib.pyplot as plt
%matplotlib inline
# Default backend
df1 = pd.DataFrame(dict(a=[5,3,2], b=[3,4,1]))
df1.plot(kind="bar")
plt.show()
Now switch the plotting backend to the powerful plotly:
# Option 1
pd.options.plotting.backend = "plotly"
df = pd.DataFrame(dict(a=[5,3,2], b=[3,4,1]))
fig = df.plot()
fig.show()
# Option 2
df = pd.DataFrame(dict(a=[5,3,2], b=[3,4,1]))
fig = df.plot(backend='plotly')  # Specify the backend here
fig.show()
By default, the column headers are right-aligned, and this can be changed. Here is an example from the official documentation:
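As a minimal sketch of the alignment setting, the frame below is illustrative and not necessarily the one used on the official site. It renders the same frame with the default right-aligned headers and with left-aligned headers.

```python
import pandas as pd

# Illustrative frame; the values are wider than the one-letter headers.
df = pd.DataFrame({"A": [1.0, 0.0], "B": [0.25, 1.0]})

# Default: headers are right-aligned above their columns.
right = repr(df)

# Left-align the headers inside this block only.
with pd.option_context("display.colheader_justify", "left"):
    left = repr(df)

print(right)
print(left)
```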
pd.describe_option() prints a description of every available option, including its default and current value. Here are some of the options:
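Printing every option produces a long listing, so it is often handier to pass an option name. The sketch below describes a single option; capturing the printed text with `contextlib.redirect_stdout` is only an assumption made so that the result can be inspected as a string.

```python
import contextlib
import io

import pandas as pd

# describe_option prints to stdout, so capture the output in a buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    pd.describe_option("display.max_rows")

description = buffer.getvalue()
print(description)
```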
The common configurations are summarized below, ready to copy and use:
import pandas as pd  # Conventional import
import warnings
warnings.filterwarnings('ignore')  # Ignore warnings
pd.set_option('display.precision', 2)
pd.set_option("display.max_rows", 999)  # Maximum number of rows displayed
pd.set_option("display.min_rows", 20)  # Minimum number of rows displayed
pd.set_option('display.max_columns', None)  # Show all columns
pd.set_option('display.max_colwidth', 100)  # Change column width
pd.set_option("expand_frame_repr", True)  # Wrap wide output
# The three float_format lines are alternatives; the last one set takes effect
pd.set_option('display.float_format', '{:,.2f}'.format)  # Thousands separator
pd.set_option('display.float_format', '{:.2f}%'.format)  # Percentage form
pd.set_option('display.float_format', '{:.2f}¥'.format)  # Special symbol
pd.options.plotting.backend = "plotly"  # Change the plotting backend
pd.set_option("colheader_justify", "left")  # Column header alignment
pd.reset_option('all')  # Reset everything to the defaults
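One possible way to keep such a configuration in a single place is a dictionary of option names and values. This is only a sketch: the names `DISPLAY_SETTINGS`, `apply_settings`, and `reset_settings` are hypothetical helpers, not part of Pandas.

```python
import pandas as pd

# Hypothetical settings dictionary; adjust the values to suit your data.
DISPLAY_SETTINGS = {
    "display.precision": 2,
    "display.max_rows": 999,
    "display.min_rows": 20,
    "display.max_columns": None,
    "display.max_colwidth": 100,
}


def apply_settings(settings):
    """Apply every option in the dictionary (hypothetical helper)."""
    for name, value in settings.items():
        pd.set_option(name, value)


def reset_settings(settings):
    """Reset only the options named in the dictionary (hypothetical helper)."""
    for name in settings:
        pd.reset_option(name)


apply_settings(DISPLAY_SETTINGS)
applied = {name: pd.get_option(name) for name in DISPLAY_SETTINGS}
print(applied)

reset_settings(DISPLAY_SETTINGS)
```

Resetting only the options that were changed leaves every other setting in the environment untouched.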