Python 3: understanding and using iterable objects, iterators, and generators


Contents

    • 1. Iterable object (iterable)
      • 1). Iterability: how the for loop works
      • 2). Characteristics of iterable objects
      • 3). The source code of iterable objects
    • 2. Iterator (iterator)
      • 1). Iterator source code
      • 2). The difference between iterable objects and iterators
      • 3). Custom iterators: Fibonacci sequence
      • 4). Application scenarios of iterators
    • 3. Generator (generator)
      • 1). Characteristics of generators
      • 2). Creating a generator
      • 3). yield workflow
      • 4). yield and return in a generator
      • 5). The difference between send() and next() in a generator
      • 6). Application of generators: a simple producer/consumer model
      • 7). Application of generators: task switching with yield (simulating coroutines)
      • 8). Application of generators: reading large files

1. Iterable object (iterable)

1). Iterability: how the for loop works

1. Strings, lists, tuples, dictionaries, sets, files, and so on are all iterable objects, and all of them can be iterated over.
2. Being able to be iterated over does not necessarily mean that an object is recognized as an Iterable.

1. Defines only the __getitem__ magic method: can be iterated over

from collections.abc import Iterable  # Iterable lives in collections.abc in Python 3

# 1. Implement only __getitem__
class A:
    def __init__(self):
        self.data = [1, 2, 3]

    def __getitem__(self, index):
        return self.data[index]

a = A()
print(isinstance(a, Iterable))  # check whether it is recognized as an Iterable
for i in a:
    print(i)
# Result:
False
1
2
3

2. Defines both the __getitem__ and __iter__ magic methods: an iterable object

from collections.abc import Iterable  # Iterable lives in collections.abc in Python 3

class A:
    def __init__(self):
        self.data = [1, 2, 3]
        self.data1 = [4, 5, 6]

    def __iter__(self):
        return iter(self.data1)

    def __getitem__(self, index):
        return self.data[index]

a = A()
print(isinstance(a, Iterable))  # check whether it is recognized as an Iterable
for i in a:
    print(i)
# Result:
True
4
5
6
  • As the code above shows, if only __getitem__ is implemented, the for ... in loop calls __getitem__ automatically with an index that starts at 0 and increases by one each time, assigning the value at that index to i, until IndexError is raised.
  • If __iter__ is implemented, __getitem__ is ignored and only __iter__ is called. The loop then traverses the iterator that __iter__ returns, assigning each member to i, until StopIteration is raised.
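
The two rules above can be sketched by spelling out roughly what a for loop does. This is a simplified model, not CPython's actual implementation; the class names and the helper manual_for are made up for illustration.

```python
class OnlyGetItem:
    """Supports iteration only through the legacy __getitem__ protocol."""
    def __init__(self):
        self.data = [1, 2, 3]

    def __getitem__(self, index):
        return self.data[index]  # raises IndexError past the end


class WithIter(OnlyGetItem):
    """Defines __iter__ as well, so iteration ignores __getitem__."""
    def __iter__(self):
        return iter([4, 5, 6])


def manual_for(obj):
    """Hypothetical helper: collect items the way `for item in obj` would."""
    iterator = iter(obj)  # uses __iter__ if present, else falls back to __getitem__
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:  # signals that iteration is finished
            break
    return items


print(manual_for(OnlyGetItem()))  # [1, 2, 3]
print(manual_for(WithIter()))     # [4, 5, 6]
```

In the __getitem__-only case, iter() builds a wrapper iterator that turns the IndexError into StopIteration, which is why the loop ends cleanly in both cases.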

2). Characteristics of iterable objects:

  1. Strings, lists, tuples, dictionaries, sets, files, and so on are all iterable objects.
  2. An object that implements the __iter__ method is called an iterable object. The __iter__ method returns an iterator object, from which elements are then obtained one by one with next().
  3. Intuitively, any object that a for loop can iterate over is an iterable object.
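
Because isinstance(obj, Iterable) only looks for __iter__, it misses objects that iterate through __getitem__. A minimal sketch of a more permissive check is to try iter() and see whether it fails; can_iterate and Legacy are hypothetical names used only here.

```python
from collections.abc import Iterable


def can_iterate(obj):
    """Hypothetical helper: True if iter(obj) succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class Legacy:
    """Iterates via __getitem__ only, so it has no __iter__."""
    def __getitem__(self, index):
        return [10, 20][index]


print(isinstance([1, 2], Iterable), can_iterate([1, 2]))      # True True
print(isinstance(Legacy(), Iterable), can_iterate(Legacy()))  # False True
print(isinstance(42, Iterable), can_iterate(42))              # False False
```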

[Figure: the relationship between an iterable object and its iterator. The original stresses that this figure is very important.]

Understanding:

  1. For example, a list object X can produce an iterator through iter(), and the iterator's next() method returns the elements one at a time. This shows that the list X is an iterable object.
  2. This introduces a new concept, the iterator, which is explained later.

my_list = ["hello", "alien", "world"]
# The following two calls have the same effect: both return an iterator
list_type_iterator = my_list.__iter__()  # jump into __iter__ here to view the iterable's source
# list_type_iterator = iter(my_list)
print(list_type_iterator)
item = next(list_type_iterator)  # in Python 3, next(it) calls it.__next__(); jump in here to view the iterator's source
item = next(list_type_iterator)
item = next(list_type_iterator)
print(item)
# Result:
<list_iterator object at 0x10a8fa3d0>
world

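3). The source code of iterable objects:

# Starting from the code above, jump from __iter__() into its source (Ctrl + B in the IDE) to see what is there
list_type_iterator = my_list.__iter__()

This opens the IDE's stub file for the built-in module, which declares Python's commonly used data types. The listing below is from the Python 2 stub, __builtin__.py, which still contains the file class; in Python 3 the module is named builtins and file objects come from the io module, but list and dict declare __iter__ in the same way.
Searching this file for the __iter__ method finds the following code: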
class list(object):
    """
    list() -> new empty list
    list(iterable) -> new list initialized from iterable's items
    """
    def append(self, p_object):  # real signature unknown; restored from __doc__
        """ L.append(object) -- append object to end """

    def __iter__(self):  # real signature unknown; restored from __doc__
        """ x.__iter__() <==> iter(x) """


class dict(object):
    """
    dict() -> new empty dictionary
    dict(mapping) -> new dictionary initialized from a mapping object's
        (key, value) pairs
    dict(iterable) -> new dictionary initialized as if via:
        d = {}
        for k, v in iterable:
            d[k] = v
    dict(**kwargs) -> new dictionary initialized with the name=value pairs
        in the keyword argument list.  For example:  dict(one=1, two=2)
    """
    def clear(self):  # real signature unknown; restored from __doc__
        """ D.clear() -> None.  Remove all items from D. """

    def __iter__(self):  # real signature unknown; restored from __doc__
        """ x.__iter__() <==> iter(x) """


class file(object):
    """
    file(name[, mode[, buffering]]) -> file object

    Open a file.  The mode can be 'r', 'w' or 'a' for reading (default),
    writing or appending.  The file will be created if it doesn't exist
    when opened for writing or appending; it will be truncated when
    opened for writing.  Add a 'b' to the mode for binary files.
    Add a '+' to the mode to allow simultaneous reading and writing.
    If the buffering argument is given, 0 means unbuffered, 1 means line
    buffered, and larger numbers specify the buffer size.  The preferred way
    to open a file is with the builtin open() function.
    Add a 'U' to mode to open the file for input with universal newline
    support.  Any line ending in the input file will be seen as a '\n'
    in Python.  Also, a file so opened gains the attribute 'newlines';
    the value for this attribute is one of None (no newline read yet),
    '\r', '\n', '\r\n' or a tuple containing all the newline types seen.

    'U' cannot be combined with 'w' or '+' mode.
    """
    def readline(self, size=None):  # real signature unknown; restored from __doc__
        """
        readline([size]) -> next line from the file, as a string.

        Retain newline.  A non-negative size argument limits the maximum
        number of bytes to return (an incomplete line may be returned then).
        Return an empty string at EOF.
        """

    def close(self):  # real signature unknown; restored from __doc__
        """
        close() -> None or (perhaps) an integer.  Close the file.

        Sets data attribute .closed to True.  A closed file cannot be used for
        further I/O operations.  close() may be called more than once without
        error.  Some kinds of file objects (for example, opened by popen())
        may return an exit status upon closing.
        """

    def __iter__(self):  # real signature unknown; restored from __doc__
        """ x.__iter__() <==> iter(x) """
  • The commonly used iterable objects all define the __iter__ method. This is the method a for loop relies on when it obtains an iterator automatically.

2. Iterator (iterator)

1). Iterator source code:

# Starting from the next() call at the beginning of the article, jump into the iterator's source to see what happens
item = next(list_type_iterator)

Here item is one element of the iterable object.

class Iterable(Protocol[_T_co]):
    def __iter__(self) -> Iterator[_T_co]: ...
    # Note: an iterable object returns an iterator from __iter__()

class Iterator(Iterable[_T_co], Protocol[_T_co]):
    def __next__(self) -> _T_co: ...
    # Note: an iterator returns one element from __next__(), which the built-in next() calls
    def __iter__(self) -> Iterator[_T_co]: ...

  • From the above, we find that an iterator has one more method of its own, __next__() (invoked through the built-in next()), which returns one element at a time.

2). The difference between iterable objects and iterators

From the code and source above, we conclude:

  • Every iterable object has an __iter__ method.
  • An iterable object returns an iterator from __iter__, and the iterator's next() method then returns the elements.
  • An iterator also has __iter__, so every iterator is itself iterable, but an iterable object such as a list is not an iterator.
  • After each next() call the iterator records its current position, and the following next() call continues from there, so successive calls return consecutive elements.
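
The distinction can be seen in a short experiment with a plain list. This is only an illustrative sketch; the variable names are arbitrary.

```python
numbers = [1, 2, 3]              # an iterable: it can hand out fresh iterators

first = iter(numbers)
second = iter(numbers)
print(first is second)           # False: each iter() call gives a new iterator

print(next(first), next(first))  # 1 2  (first has advanced twice)
print(next(second))              # 1    (second keeps its own position)

print(iter(first) is first)      # True: an iterator returns itself from __iter__

print(list(first))               # [3]  the one remaining element
print(list(first))               # []   an exhausted iterator stays empty
print(list(numbers))             # [1, 2, 3]  the list can be traversed again
```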

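3). Custom iterators: Fibonacci sequence

class Fib:
    def __init__(self, max):
        self.n = 0        # how many values have been returned so far
        self.prev = 0
        self.curr = 1
        self.max = max    # how many values to produce in total

    def __iter__(self):
        return self       # an iterator returns itself

    def __next__(self):
        if self.n < self.max:
            value = self.curr
            self.curr += self.prev
            self.prev = value
            self.n += 1
            return value
        raise StopIteration

fb = Fib(5)
for _ in range(6):        # five values, then one call too many
    print(next(fb))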

Traceback (most recent call last):
  File "/Volumes/Develop/iterator_generator.py", line 43, in <module>
  File "/Volumes/Develop/iterator_generator.py", line 34, in __next__
    raise StopIteration
StopIteration

Note:
1. When the iterator has no more data, calling next() again raises StopIteration.
4). Application scenarios of iterators

1. Why use iterators?

  1. Iterators use little memory and avoid computing values that are never needed.
  2. This is because iterators are evaluated lazily: each call to next() computes exactly one value, and nothing is computed or stored ahead of time.
  3. Iterators also record their progress, so the next call to next() continues from where the previous one stopped.

For example, suppose you want a Fibonacci series with 100000000 terms. Generating all the terms before using any of them would consume a great deal of memory. With an iterator, only the current state is kept in memory, and the cost of computing each term is paid only when that term is requested.
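
As a rough illustration of this laziness, the sketch below defines an endless Fibonacci iterator and takes only the first ten values with itertools.islice. LazyFib is a hypothetical name, and the size comparison at the end is only indicative, since sys.getsizeof reports different byte counts on different Python versions and platforms.

```python
import sys
from itertools import islice


class LazyFib:
    """Endless Fibonacci iterator; stores only the two most recent values."""
    def __init__(self):
        self.prev, self.curr = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        value = self.curr
        self.prev, self.curr = self.curr, self.prev + self.curr
        return value


first_ten = list(islice(LazyFib(), 10))  # only ten terms are ever computed
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# A materialized list versus an iterator over the same range
eager = list(range(1_000_000))
lazy = iter(range(1_000_000))
print(sys.getsizeof(eager) > sys.getsizeof(lazy))  # True
```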

2. Reading a file with the iterator principle

[Common way of reading]

# readlines() reads the entire file and builds a list in which each line is one element
for line in open("test.txt").readlines():
    print(line)
# 1. The whole file is read into memory at once and then printed line by line.
# 2. For a large file the memory cost is high; if the file is larger than the available memory, the program may crash.

[Reading with the file iterator]

for line in open("test.txt"):  # use the file iterator
    print(line)
# This is the simplest form. The file is never read in full explicitly; the file iterator fetches the next line on each pass through the loop.
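
A self-contained sketch of line-by-line reading follows. It writes a small temporary file first so that it can run anywhere; the file contents and variable names are invented for the example, and a with block is used so the file is closed when the loop ends.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

line_count = 0
longest = ""
with open(path) as f:      # the with block closes the file afterwards
    for line in f:         # the file object is its own iterator
        line_count += 1
        stripped = line.rstrip("\n")
        if len(stripped) > len(longest):
            longest = stripped

os.remove(path)
print(line_count, longest)  # 3 second
```

Only one line is held in memory at a time, so the same loop works for files far larger than this one.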

3. Generator (generator)

1). Characteristics of generators

  1. A generator is a special kind of iterator: it has all the properties of an iterator but is more concise to write.
  2. There is no need to write __iter__() and __next__() methods as in the class above; the yield keyword is enough.
  3. A generator is always an iterator (the reverse does not hold), so every generator also produces its values lazily.
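
These properties can be checked directly with the abstract base classes in collections.abc. The generator function countdown is a made-up example.

```python
from collections.abc import Generator, Iterable, Iterator


def countdown(start):
    """Hypothetical example generator: yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
print(isinstance(gen, Iterable), isinstance(gen, Iterator), isinstance(gen, Generator))
# True True True

print(iter(gen) is gen)  # True: like any iterator, it returns itself
print(next(gen))         # 3  (values are produced only on request)
print(list(gen))         # [2, 1]
```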

2). Creating a generator

1. Change the [ ] of a list comprehension to ( )

L = [x * 2 for x in range(5)]
G = (x * 2 for x in range(5))
print(type(L))
print(G)
# The result is as follows:
<class 'list'>
<generator object <genexpr> at 0x10fa48730>
# Get the elements of the generator
G = (x * 2 for x in range(5))
for item in G:
    print(item)
# The result is as follows:
0
2
4
6
8
2. Create a generator with a function

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1

a = fib(10)
print(list(a))
# Result:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

3). yield workflow

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b          # each next() call runs up to this line, returns one element, and pauses here
        a, b = b, a + b  # the code after yield runs on the following next() call
        n = n + 1

fb = fib(5)
print(next(fb))
print(next(fb))
print(next(fb))
# Result:
1
1
2
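
The pause-and-resume behaviour can be made visible by recording when each part of a generator body runs. This is a small sketch with invented names (steps, log), not part of the fib example.

```python
log = []


def steps():
    log.append("start")        # runs on the first next()
    yield 1
    log.append("after first")  # runs on the second next()
    yield 2
    log.append("end")          # runs on the third next(), then the generator finishes


gen = steps()
log.append("created")          # creating the generator runs none of its body
log.append(next(gen))
log.append(next(gen))
log.append(next(gen, "done"))  # the default is returned instead of StopIteration
print(log)
# ['created', 'start', 1, 'after first', 2, 'end', 'done']
```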

4). yield and return in a generator

  1. A generator behaves like an iterator: once every element has been produced, another call to next() raises StopIteration. A return statement in the generator makes it possible to report why iteration ended.
  2. The value given to return is attached to the StopIteration exception that is raised when the generator finishes. Catching StopIteration with try/except and reading its value attribute retrieves this message.

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'iter num finish'

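1. Mode one:

fb = fib(5)

def iter_list(iterator):
    try:
        x = next(iterator)
        print("----->", x)
    except StopIteration as ret:
        stop_reason = ret.value
        print(stop_reason)

# five calls return the elements; the sixth triggers StopIteration
for _ in range(6):
    iter_list(fb)
# Result:
-----> 1
-----> 1
-----> 2
-----> 3
-----> 5
iter num finish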

2. Mode two:

fb = fib(5)

def iter_list(iterator):
    while True:
        try:
            x = next(iterator)
            print("----->", x)
        except StopIteration as ret:
            stop_reason = ret.value
            print(stop_reason)
            break

iter_list(fb)
# Result:
-----> 1
-----> 1
-----> 2
-----> 3
-----> 5
iter num finish

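5). The difference between send() and next() in a generator

1. next() wakes the generator and lets it continue.
2. send() wakes the generator and lets it continue, and at the same time passes a value into the generator; a variable on the left of the yield expression is needed to receive it.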

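def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        temp = yield b  # the value passed to send() becomes the result of this yield expression
        print("\n temp------>", temp)
        a, b = b, a + b
        n = n + 1

a = fib(10)
print(next(a))  # start the generator and run to the first yield
abc = a.send("hello")
print(abc)
abc = a.send("alien")
print(abc)
# Result:
1

 temp------> hello
1

 temp------> alien
2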

  1. As the use of send() above shows, one send() is equivalent to one next(), except that it also passes a value that temp receives; it does two things at once.
  2. The return value of a.send("hello") is the same as what next(a) would return: the next yielded value.
  3. When send() runs, the passed value is first assigned to temp, and then execution continues exactly as with next().
  4. The first call on a newly created generator must be next() (or send(None)), because no yield expression is waiting yet to receive a value.
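
A common illustration of send() is a generator that keeps a running average of the numbers sent to it. The sketch below is one possible way to write it; running_average and its variables are hypothetical names.

```python
def running_average():
    """Hypothetical generator: receives numbers via send(), yields their mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # pauses here; send(x) makes this expression evaluate to x
        total += value
        count += 1
        average = total / count


avg = running_average()
first = next(avg)      # prime the generator: run to the first yield
print(first)           # None
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0

# Sending a non-None value to a generator that has not started is an error
unprimed_error = False
try:
    running_average().send(1)
except TypeError:
    unprimed_error = True
print(unprimed_error)  # True
```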

6). Application of generators: a simple producer/consumer model

def producter(num):
    print("produce %s product" % num)
    while num > 0:
        consume_num = yield num  # receives the amount to consume, or None when resumed by next()
        if consume_num:
            print("consume %s product" % consume_num)
            num -= consume_num
        else:
            print("consume 1 time")
            num -= 1
    return "consume finish"

p = producter(20)
print("start----->", next(p), "\n")
abc = p.send(2)
print("the rest num---->", abc, "\n")
print("the rest num---->", next(p), "\n")
# Result:
produce 20 product
start-----> 20

consume 2 product
the rest num----> 18

consume 1 time
the rest num----> 17

7). Application of generators: task switching with yield (simulating coroutines)

The main features of coroutines:

  • 1. Coroutines are non-preemptive: switching between coroutines does happen, but the program itself decides when to switch. Coroutines are mainly used for IO-bound work.
  • 2. Coroutines are efficient: switching between subroutines is not a thread switch and is controlled by the program itself, so there is no thread-switching overhead. Compared with multithreading, the more threads there would be, the larger the coroutine's performance advantage.
  • 3. Coroutines need no multithreaded locking: because there is only one thread, variables are never written simultaneously, so shared resources can be controlled by checking state rather than by taking locks. This is why coroutines can be more efficient than multithreading for such workloads.

def task1(times):
    for i in range(times):
        print('task_1 done the :{} time'.format(i + 1))
        yield  # pause and hand control back to the caller

def task2(times):
    for i in range(times):
        print('task_2 done the :{} time'.format(i + 1))
        yield  # pause and hand control back to the caller

gene1 = task1(5)
gene2 = task2(5)
for i in range(10):
    # alternate between the two tasks
    if i % 2 == 0:
        next(gene1)
    else:
        next(gene2)
# Result:
task_1 done the :1 time
task_2 done the :1 time
task_1 done the :2 time
task_2 done the :2 time
task_1 done the :3 time
task_2 done the :3 time
task_1 done the :4 time
task_2 done the :4 time
task_1 done the :5 time
task_2 done the :5 time
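
The alternating switch between two tasks can be generalized to any number of tasks with a queue. The following is only a toy round-robin scheduler written for illustration, not how asyncio or any real library works; task, run_round_robin, and log are hypothetical names.

```python
from collections import deque


def task(name, times, log):
    for i in range(times):
        log.append("{} step {}".format(name, i + 1))
        yield                  # hand control back to the scheduler


def run_round_robin(tasks):
    """Hypothetical scheduler: advance each task in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue           # a finished task is simply dropped
        queue.append(current)  # an unfinished task goes to the back of the line


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)
# ['A step 1', 'B step 1', 'A step 2', 'B step 2', 'B step 3']
```

Tasks of different lengths are handled naturally, because a task leaves the queue as soon as it raises StopIteration.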

8). Application of generators: reading large files

Using the generator's ability to pause and resume, we can read a file on demand, one block of a specified size at a time. This avoids memory overflow and similar problems caused by reading too much content at once.

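BLOCK_SIZE = 1024  # bytes per read; an assumed value, since the original does not define it

def read_file(fpath):
    with open(fpath, 'rb') as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if block:
                yield block
            else:
                return  # an empty read means end of file, so stop the generator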
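
A self-contained way to try this pattern is sketched below. It uses a variant named read_in_blocks that takes the block size as a parameter (an assumption made here so that a tiny test file produces several blocks) and a temporary file with invented contents.

```python
import os
import tempfile


def read_in_blocks(fpath, block_size=4):
    """Yield successive chunks of at most block_size bytes (block_size is an assumed parameter)."""
    with open(fpath, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:      # empty bytes means end of file
                return
            yield block


# Write a ten-byte throwaway file so the example can run anywhere
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

blocks = list(read_in_blocks(path))
total_bytes = sum(len(b) for b in blocks)
os.remove(path)

print(blocks)       # [b'0123', b'4567', b'89']
print(total_bytes)  # 10
```

In real use the blocks would be processed one at a time inside a for loop rather than collected into a list, which is what keeps memory use bounded by the block size.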

