
Python | memory management

Editor: Python

Contents

Python's reference mechanism
Reference counting in Python
 How the reference counter works
 Getting the reference count: getrefcount()
 Increasing the reference count
 Decreasing the reference count
Memory leaks and out-of-memory errors
 Mark-and-sweep (mainly used to resolve circular references)
 Advantages of reference counting
 Disadvantages of reference counting
Garbage collection
 Collection principle
 The gc mechanism
 The efficiency problem
 Three conditions that trigger garbage collection
 Generational collection (deciding which objects to scan when a collection starts)
Python's buffer pool (memory pool)
Why memory pools were introduced
How the memory pool works
 Integer object buffer pool
 String cache
 Note
 String interning


Memory management in Python is based mainly on reference counting, with mark-and-sweep and generational collection as supplementary garbage collection strategies, plus a memory pool mechanism that caches small integers and interns simple strings.


Python's reference mechanism

Dynamic typing in Python
• An object is an entity stored in memory. Every object (a PyObject in CPython) carries a reference count, a type, and a value.
• A name we write in a program is only a reference to an object.
• Keeping references separate from objects is the core of dynamic typing.
• A reference can be rebound to a new object at any time (the memory address it points to will then be different).
    answer   =    42
    identifier    value
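The separation between names and objects can be observed with the built-in id() function and the is operator. The following is a minimal sketch; the function name rebind_demo is made up for illustration.

```python
def rebind_demo():
    """Show that rebinding a name changes what it points to, not the object."""
    answer = [42]              # the name `answer` refers to a list object
    alias = answer             # a second name for the same object
    same_before = alias is answer

    first_id = id(answer)
    answer = ["forty-two"]     # rebind `answer` to a brand-new object
    # `alias` keeps the first object alive, so the two ids must differ.
    return same_before, id(alias) == first_id, id(answer) != first_id


print(rebind_demo())  # (True, True, True)
```

Both objects are kept alive on purpose: id() values are only guaranteed to be distinct for objects whose lifetimes overlap.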

Reference counting in Python

In Python, every object keeps track of the total number of references that point to it. This number is the object's reference count.

Python uses reference counting to track objects in memory: it records how many times each object is referenced by names and by other objects.

Internally, each object carries a counter, called the reference counter, that records how many references to it exist. When an object's reference count drops to 0, the object is garbage and its memory can be reclaimed (CPython reclaims it immediately).


>>> a=[1,2]
>>> import sys
>>> sys.getrefcount(a) ## reference count of the object a refers to
2
>>> b=a
>>> sys.getrefcount(a)
3
>>> del b ## delete the reference b
>>> sys.getrefcount(a)
2
>>> c=list()
>>> c.append(a) ## add the object to a container
>>> sys.getrefcount(a)
3
>>> del c ## delete the container: the count decreases by 1
>>> sys.getrefcount(a)
2
>>> b=a
>>> sys.getrefcount(a)
3
>>> a=[3,4] ## rebind a to a new object
>>> sys.getrefcount(a)
2

Note: passing a to getrefcount() as an argument creates a temporary reference, so the result is 1 higher than the actual count.

How the reference counter works

Every object in CPython maintains a field, ob_refcnt, that records how many references currently point to the object.
Whenever a new reference points to the object, ob_refcnt is increased by 1 (an object with no references has a count of 0).
Whenever a reference to the object goes away, ob_refcnt is decreased by 1. Once the count reaches 0, the object can be reclaimed and the memory it occupies is freed.
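The increments and decrements described above can be measured as differences from a baseline, which sidesteps the extra temporary reference that getrefcount() itself adds. This is a small sketch for CPython; the function name refcount_deltas is an assumption made for illustration.

```python
import sys


def refcount_deltas():
    """Return how the count changes relative to a baseline reading."""
    obj = object()
    base = sys.getrefcount(obj)          # baseline, includes the temporary

    alias = obj                          # a new name: +1
    after_alias = sys.getrefcount(obj) - base

    box = [obj]                          # stored in a container: +1 more
    after_container = sys.getrefcount(obj) - base

    del alias                            # remove the extra name: -1
    box.clear()                          # remove it from the container: -1
    after_release = sys.getrefcount(obj) - base

    return after_alias, after_container, after_release


print(refcount_deltas())  # (1, 2, 0) on CPython
```

Comparing differences rather than absolute values keeps the sketch independent of how many references the interpreter itself holds at the time of the call.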

Getting the reference count: getrefcount()

When a reference is passed to getrefcount() as an argument, the parameter itself creates a temporary reference. The result returned by getrefcount() is therefore 1 higher than expected.

>>> from sys import getrefcount  # import the function itself, so it can be called as getrefcount()
>>> # with a plain "import sys", it would have to be called as sys.getrefcount()
>>> a=[1,2,3]
>>> print(getrefcount(a))
2
>>> b=a
>>> print(getrefcount(b))
3

One disadvantage of reference counting is the extra space needed to store the counts, but this is a secondary problem.
The main problem is that it cannot free the memory of objects that are part of a circular reference.

Increasing the reference count

When an object A is referenced by another object B, A's reference count increases.

Decreasing the reference count

When a reference is removed with del or rebound to another object, the reference count decreases (del removes only the reference, not the object itself).

>>> x=[1]
>>> y=[2]
>>> x.append(y)
>>> y.append(x)
>>> getrefcount(x)
3
>>> getrefcount(y)
3
>>> del x
>>> del y

Memory leaks and out-of-memory errors

Under the rules of reference counting, objects in a circular reference cannot be freed by reference counting alone.
This can cause memory leaks.
Memory leak: some memory stays occupied and cannot be released, and the process can no longer access it.
Memory leaks can lead to out-of-memory (OOM) errors: the program needs more memory than the system has free.
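Whether a cycle really survives the deletion of its names can be checked with a weak reference, which observes an object without keeping it alive. The sketch below assumes CPython; the class Node and the function cycle_survives_del are hypothetical names, and the automatic collector is switched off temporarily so that only reference counting is at work.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def cycle_survives_del():
    was_enabled = gc.isenabled()
    gc.disable()                      # rely on reference counting only
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a       # build the circular reference
        probe = weakref.ref(a)        # does not add to the reference count
        del a, b                      # the names are gone, the cycle is not
        alive_after_del = probe() is not None
        gc.collect()                  # the cycle detector frees the pair
        alive_after_collect = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect


print(cycle_survives_del())  # (True, False)
```

The first value shows the leak that reference counting alone would leave behind; the second shows that an explicit collection removes it.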

Mark-and-sweep (mainly used to resolve circular references)

As the name suggests, mark-and-sweep first marks objects (garbage detection) and then removes the garbage (garbage collection).
1. Mark: classify each object as active (still referenced) or inactive (can be deleted).
2. Sweep: clear all inactive objects.

>>> a=[1,2]
>>> b=[3,4]
>>> sys.getrefcount(a)
2
>>> sys.getrefcount(b)
2
>>> a.append(b)
>>> sys.getrefcount(b)
3
>>> b.append(a)
>>> sys.getrefcount(a)
3
>>> del a
>>> del b

a references b and b references a, so at this point each object is referenced twice (not counting the temporary reference created by getrefcount()).

After the del statements, the reference counts of both objects drop by 1. Each counter is now 1, and the two objects are stuck in a circular reference.

Mark: start from one end, a. Because a holds a reference to b, decrement (a working copy of) b's reference count by 1.

Mark: follow that reference to b. b holds a reference to a, so decrement a's reference count by 1. The counts of both a and b are now 0, so both are marked as unreachable.

Sweep: objects marked as unreachable are the ones to be freed.
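The marking steps above can be simulated on a toy object graph. This is a simplified model written for illustration, not CPython's actual implementation: the function find_unreachable and the dictionaries it takes are assumptions, with objects represented by plain names.

```python
def find_unreachable(refcounts, edges):
    """Toy mark phase.

    refcounts: name -> total reference count of that object
    edges:     name -> names of the objects it holds references to
    """
    working = dict(refcounts)              # never touch the real counts

    # Subtract every reference that comes from inside the tracked set.
    for targets in edges.values():
        for target in targets:
            working[target] -= 1

    # Anything still above 0 is referenced from outside: it is a root.
    reachable = set()
    stack = [name for name, count in working.items() if count > 0]
    while stack:
        name = stack.pop()
        if name not in reachable:
            reachable.add(name)
            stack.extend(edges.get(name, ()))  # whatever a root holds lives

    return set(refcounts) - reachable


# a and b only reference each other; c is held by a variable and holds d.
counts = {"a": 1, "b": 1, "c": 1, "d": 1}
graph = {"a": ["b"], "b": ["a"], "c": ["d"], "d": []}
print(sorted(find_unreachable(counts, graph)))  # ['a', 'b']
```

Note that d also ends up with a working count of 0, yet it is kept because it can be reached from c. A count of 0 after subtraction is only a candidate for removal until the traversal from the roots has finished.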

The collection phase described above pauses the entire application and resumes it only after mark-and-sweep has finished. To shorten these pauses, Python uses generational collection, trading space for time to make garbage collection more efficient.

Advantages of reference counting

• Simple
• Real-time: an object is freed as soon as its count reaches 0

Disadvantages of reference counting

• Maintaining the counts consumes resources
• Objects in a circular reference cannot be reclaimed

Garbage collection

Python's garbage collection is based mainly on reference counting, supplemented by mark-and-sweep and generational collection. Mark-and-sweep handles memory that reference counting cannot free because of circular references; generational collection makes garbage collection more efficient.

Collection principle

When the reference count of a Python object drops to 0, the object can be garbage collected.

The gc mechanism

• As the automatic memory management mechanism of modern programming languages, GC focuses on two things:
• finding garbage in memory that is no longer used
• cleaning that garbage up and making the memory available to other objects.
GC frees programmers from the burden of resource management and gives them more time for business logic. That does not mean programmers can afford not to understand GC; knowing more about it still helps us write more robust code.

The efficiency problem

While garbage collection is running, Python cannot do anything else, so frequent collections greatly reduce Python's efficiency.
At run time, Python keeps count of object allocations and object deallocations. When the difference between the two exceeds a certain threshold, garbage collection starts.

import gc
print(gc.get_threshold())
# (700, 10, 10)   700 is the generation 0 threshold; each older generation is collected once for every 10 collections of the generation below it
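The thresholds can be read and changed at run time through the gc module. The sketch below uses only gc.get_threshold(), gc.get_count(), and gc.set_threshold(); the function name inspect_gc_settings and the value 1000 are arbitrary choices for illustration, and the defaults reported may differ between Python versions.

```python
import gc


def inspect_gc_settings():
    """Read the thresholds, change the first one, then restore it."""
    original = gc.get_threshold()     # three per-generation thresholds
    counts = gc.get_count()           # current per-generation counters
    try:
        gc.set_threshold(1000)        # collect generation 0 less often
        changed = gc.get_threshold()
    finally:
        gc.set_threshold(*original)   # leave the interpreter as we found it
    return original, counts, changed


original, counts, changed = inspect_gc_settings()
print("original thresholds:", original)
print("current counters:   ", counts)
print("after set_threshold:", changed)
```

Raising the first threshold trades memory for fewer pauses, which can help programs that allocate many short-lived container objects.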

Three conditions that trigger garbage collection

• An explicit call to gc.collect()

>>> gc.collect()
2

• The GC threshold being reached
• The program exiting

Generational collection: deciding which objects to scan when a collection starts

The basic assumption of this strategy is that the longer an object has lived, the less likely it is to become garbage later in the program.
• Python divides all objects into three generations: 0, 1, and 2.
• Every new object starts in generation 0.
• An object that survives a garbage collection of its generation is moved to the next generation.
• When a garbage collection starts, it always scans all generation 0 objects.
• After generation 0 has been collected a certain number of times, a collection of generations 0 and 1 is started.
• After generation 1 has also been collected a certain number of times, a collection of generations 0, 1, and 2 is started, that is, a scan of all objects.
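The move out of generation 0 can be observed with gc.get_objects(generation=...), available since Python 3.8. This is a rough sketch that assumes a standard CPython build; the function name leaves_generation_zero is hypothetical, and which older generation the survivor lands in is left unchecked because it can depend on the interpreter version.

```python
import gc


def leaves_generation_zero():
    """Check that a surviving object is no longer young after a collection."""
    was_enabled = gc.isenabled()
    gc.disable()                       # avoid an automatic collection mid-test
    try:
        survivor = []                  # lists are tracked by the collector

        def in_generation_zero():
            young = gc.get_objects(generation=0)
            return any(obj is survivor for obj in young)

        before = in_generation_zero()  # freshly created: still young
        gc.collect(0)                  # collect only the youngest generation
        after = in_generation_zero()   # it survived, so it was moved on
    finally:
        if was_enabled:
            gc.enable()
    return before, after


print(leaves_generation_zero())
```

On the CPython versions this was written against, the expected result is that the object is found in generation 0 before the collection and not afterwards.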

Python's buffer pool (memory pool)

Why memory pools were introduced

When a program creates many objects that each need only a small amount of memory, frequent calls to new/malloc cause heavy memory fragmentation and reduce efficiency. A memory pool requests a certain amount of memory in advance and keeps it as spare blocks of equal size. When new memory is needed, it is first served from the pool, and new memory is requested only if the pool is exhausted. The most significant advantage is less memory fragmentation and better efficiency.
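The idea of a pool of equally sized spare blocks can be illustrated with a toy class. This is only a sketch of the general technique in pure Python, not a model of CPython's allocator; the class BlockPool and its attributes are invented for the example.

```python
class BlockPool:
    """Toy fixed-size block pool (illustrative only)."""

    def __init__(self, block_size, count):
        self.block_size = block_size
        # Request all blocks up front, as a memory pool would.
        self._free = [bytearray(block_size) for _ in range(count)]
        self.fresh_allocations = count

    def acquire(self):
        if self._free:
            return self._free.pop()        # reuse a spare block
        self.fresh_allocations += 1        # pool exhausted: request new memory
        return bytearray(self.block_size)

    def release(self, block):
        self._free.append(block)           # back to the pool, not the system


pool = BlockPool(block_size=64, count=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()
print(second is first)  # True: the released block was handed out again
```

Because released blocks go back onto the free list, a workload that repeatedly creates and drops small objects keeps reusing the same memory instead of asking the system for more.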

How the memory pool works

Python's object management sits mainly in layers +1 to +3.

Layer +3: built-in types (such as int and dict) each have their own private memory pool. Pools are not shared between types, so memory freed by an int is not handed to a float.

Layer +2: when the requested size is below the small-object threshold (256 bytes in older CPython versions, 512 bytes since CPython 3.3), the allocation is handled mainly by Python's object allocator.

Layer +1: when the requested size is above that threshold, Python's raw memory allocator handles the allocation, which essentially calls malloc/realloc and related functions from the C standard library.

On freeing memory: when an object's reference count drops to 0, Python calls its destructor. Calling the destructor does not necessarily mean that free is eventually called to release the memory. If it did, frequent requests and releases would greatly reduce Python's execution efficiency. So the memory pool mechanism is also used during destruction: memory obtained from the pool is returned to the pool, which avoids frequent request and release operations.

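Integer object buffer pool

Small integers in the range [-5, 256] are created when the interpreter initializes and are used directly. For larger integers, the system has reserved a block of memory in advance and creates large integer objects in it when needed.
>>> a=3
>>> print(getrefcount(a))
49
The high count shows that the cached object for 3 is shared by many references inside the interpreter. The exact number varies with the interpreter version and the session.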
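The small-integer cache can be observed with the is operator. The sketch below assumes CPython; the function name int_cache_demo is made up, and the values are built with int() at run time so that the compiler cannot merge equal literals into one shared constant.

```python
def int_cache_demo():
    """Compare object identity for a cached and an uncached integer."""
    small_a, small_b = int("100"), int("100")          # inside [-5, 256]
    large_a, large_b = int("1000000"), int("1000000")  # far outside it
    return small_a is small_b, large_a is large_b


print(int_cache_demo())  # (True, False) on CPython
```

This is an implementation detail of CPython, so program logic should compare integers with == and never with is.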

String cache

To verify that two references point to the same object, we can use the is keyword, which tests whether two references refer to the same object.
When the caching mechanism is triggered, only a new reference is created, not a new object.
A single-character string is stored in the interned-string area after it is created.
A multi-character string is stored in the interned-string area if it contains no special characters.

Note:

This is also an optimization for frequently used numbers and strings.

String interning

Python automatically caches only short strings made up of letters and digits. Other strings are not cached.
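Strings that are not cached automatically can still be interned explicitly with sys.intern(). The following sketch assumes CPython; the function name intern_demo is hypothetical, and the strings are assembled at run time and contain a space and an exclamation mark so that automatic interning does not apply.

```python
import sys


def intern_demo():
    """Show two equal strings becoming one object after sys.intern()."""
    parts = ["memory", " ", "pool!"]
    a = "".join(parts)                 # built at run time: a new object
    b = "".join(parts)                 # equal value, separate object
    equal_but_distinct = (a == b) and (a is not b)
    interned_same = sys.intern(a) is sys.intern(b)
    return equal_but_distinct, interned_same


print(intern_demo())  # (True, True)
```

Explicit interning mainly pays off when the same long strings are compared or used as dictionary keys many times.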

 

