Using yield in Python
To understand what yield does, you must understand what generators are. And before you can understand generators, you must understand iterables. It is a nesting-doll kind of understanding: yield sits inside generators, and generators sit inside iterables.
To avoid getting dizzy later on, first get clear on these four terms:
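- Iterable
Any object in Python is an iterable as long as it defines an __iter__ method that returns an iterator, or a __getitem__ method that supports index lookup (these double-underscore methods are explained in full in other chapters).
- Iterator
Any object that defines a next method (Python 2) or a __next__ method (Python 3) is an iterator.
- Iteration
Put simply, it is the process of taking items one at a time from somewhere (a list, for example). When we use a loop to walk over something, that process itself is called iteration.
- Generators
Generators are also iterators, but you can only iterate over them once. That is because they do not store all their values in memory; they produce the values at run time. You use them by iterating over them, either with a "for" loop or by passing them to any function or construct that iterates. Most of the time generators are implemented as functions. However, they do not return a value; they yield one.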
According to Wikipedia, an iterator is an object that lets a programmer traverse a container, particularly lists. Remember: an iterator is an object.
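The definitions above can be checked directly in code. The following is a minimal sketch in Python 3; the class names Countdown and Letters are made up for illustration and do not come from the original answer.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical iterator: defines __next__ (and __iter__ returning itself)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that there are no more values
        self.current -= 1
        return self.current + 1


class Letters:
    """Hypothetical iterable that only defines __getitem__ (index lookup)."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, index):
        return self.text[index]  # the IndexError past the end stops iteration


countdown_values = list(Countdown(3))
letters_values = list(Letters("abc"))

numbers = [1, 2, 3]
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
list_iter_is_iterator = isinstance(iter(numbers), Iterator)

print(countdown_values)
print(letters_values)
print(list_is_iterable, list_is_iterator, list_iter_is_iterator)
```

A list is an iterable but not an iterator; calling iter() on it is what produces the iterator that a loop actually consumes.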
Understand in code:
When you create a list, you can read its items one by one. Reading its items one by one is called iteration:
>>> mylist = [1, 2, 3]
>>> for i in mylist:
...     print(i)
1
2
3
mylist is an iterable. When you use a list comprehension, you create a list, and so an iterable:
>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
...     print(i)
0
1
4
Everything you can use “for… in…” on is an iterable; lists, strings, files…
These iterables are handy because you can read them as much as you wish, but you store all the values in memory and this is not always what you want when you have a lot of values.
Generators
Generators are iterators, a kind of iterable you can iterate over at most once. Generators do not store all the values in memory; they generate the values on the fly:
>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
...     print(i)
0
1
4
It is just the same as the previous example, except you used () instead of []. BUT, you cannot perform for i in mygenerator a second time, since generators can only be used once: they calculate 0, then forget about it and calculate 1, and finish by calculating 4, one by one.
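The one-shot behaviour is easy to verify. Here is a small sketch in Python 3 (the variable names are illustrative): a second pass over the same generator produces nothing, while a list can be read again. The size comparison at the end is only indicative, since exact numbers depend on the Python build.

```python
import sys

squares_list = [x * x for x in range(3)]
squares_gen = (x * x for x in range(3))

first_pass = list(squares_gen)   # consumes the generator
second_pass = list(squares_gen)  # the generator is already exhausted

print(first_pass)
print(second_pass)
print(list(squares_list), list(squares_list))  # a list can be re-read

# The generator object stays small however many values it will produce,
# because the values are computed one at a time instead of being stored.
big_list = [x * x for x in range(100_000)]
big_gen = (x * x for x in range(100_000))
print(sys.getsizeof(big_list) > sys.getsizeof(big_gen))
```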
Yield
yield is a keyword that is used like return, except the function will return a generator.
>>> def create_generator():
...     mylist = range(3)
...     for i in mylist:
...         yield i*i
>>> mygenerator = create_generator()  # create a generator
>>> print(mygenerator)  # mygenerator is an object!
<generator object create_generator at 0xb7555c34>
>>> for i in mygenerator:
...     print(i)
0
1
4
Here it’s a useless example, but it’s handy when you know your function will return a huge set of values that you will only need to read once.
To master yield, you must understand that when you call the function, the code you have written in the function body does not run. The function only returns the generator object, which is a bit tricky.
Then, your code will continue from where it left off each time the for loop uses the generator.
Now the hard part:
The first time the for loop calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it will return the first value of the loop. Then, each subsequent call will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty, which happens when the function runs without hitting yield. That can be because the loop has come to an end, or because you no longer satisfy an "if/else".
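Put another way: a function that contains yield is a generator function. Unlike an ordinary function, calling it looks like a function call but runs none of the function's code; execution only starts once next() is called on the resulting generator (a for loop calls next() automatically). Execution still follows the function's normal flow, but every time it reaches a yield statement it pauses and hands back one value, and the next call resumes from the statement right after that yield. It looks as if a normally running function were interrupted by yield several times, with each interruption returning the current value.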
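To watch this pause-and-resume behaviour, one option is to record what happens around each yield. This is a minimal sketch in Python 3; the names traced_squares and events are made up for illustration.

```python
events = []  # records the order in which things happen


def traced_squares():
    events.append("body started")
    for i in range(3):
        events.append(f"about to yield {i * i}")
        yield i * i
        events.append(f"resumed after {i * i}")
    events.append("body finished")


gen = traced_squares()  # nothing in the body has run yet
events.append("generator created")

first = next(gen)       # runs the body up to the first yield
events.append(f"caller got {first}")

rest = list(gen)        # resumes after each yield until the body ends

for line in events:
    print(line)
```

"generator created" is recorded before "body started", which shows that calling the function only builds the generator object and runs none of the body.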
Your code explained
Generator:
# Here you create the method of the node object that will return the generator
def _get_child_candidates(self, distance, min_dist, max_dist):
    # Here is the code that will be called each time you use the generator object:
    # If there is still a child of the node object on its left
    # AND if the distance is ok, return the next child
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild
    # If there is still a child of the node object on its right
    # AND if the distance is ok, return the next child
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild
    # If the function arrives here, the generator will be considered empty
    # there are no more than two values: the left and the right children
Caller:
# Create an empty list and a list with the current object reference
result, candidates = list(), [self]
# Loop on candidates (they contain only one element at the beginning)
while candidates:
    # Get the last candidate and remove it from the list
    node = candidates.pop()
    # Get the distance between obj and the candidate
    distance = node._get_dist(obj)
    # If distance is ok, then you can fill the result
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)
    # Add the children of the candidate in the candidate's list
    # so the loop will keep running until it will have looked
    # at all the children of the children of the children, etc. of the candidate
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
return result
This code contains several smart parts:
The loop iterates on a list, but the list expands while the loop is being iterated. It’s a concise way to go through all this nested data, even if it’s a bit dangerous since you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but the while loop keeps creating new generator objects, which produce different values from the previous ones since each is applied to a different node.
The extend() method is a list object method that expects an iterable and adds its values to the list.
Usually we pass a list to it:
>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]
But in your code, it gets a generator, which is good because:
- You don’t need to read the values twice.
- You may have a lot of children and you don’t want them all stored in memory.
And it works because Python does not care if the argument of a method is a list or not. Python expects iterables, so it will work with strings, lists, tuples, and generators! This is called duck typing and is one of the reasons why Python is so cool. But this is another story, for another question…
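To make the growing work list and the generator passed to extend() concrete, here is a self-contained sketch in Python 3. It is not the code from the question: Node, children, and collect_values are hypothetical names, and the distance checks are left out so that only the traversal remains.

```python
class Node:
    """Hypothetical binary tree node (not the class from the question)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def children(self):
        # Generator: yields at most two values, then is exhausted.
        if self.left is not None:
            yield self.left
        if self.right is not None:
            yield self.right


def collect_values(root):
    result, candidates = [], [root]
    while candidates:            # the list grows while we loop over it
        node = candidates.pop()  # take the last candidate
        result.append(node.value)
        # extend() accepts any iterable, so a generator works directly
        candidates.extend(node.children())
    return result


tree = Node(1, Node(2, Node(4)), Node(3))
values = collect_values(tree)
print(values)

# extend() is equally happy with other iterables (duck typing)
items = []
items.extend("ab")
items.extend((1, 2))
items.extend(x * x for x in range(3))
print(items)
```

Each call to children() creates a fresh generator for a different node, so the loop ends once every node has been popped and no generator adds anything new.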
You can stop here, or read a little bit to see an advanced use of a generator:
Controlling a generator exhaustion
>>> class Bank():  # Let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self):
...         while not self.crisis:
...             yield "$100"
>>> hsbc = Bank()  # When everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(next(corner_street_atm))
$100
>>> print(next(corner_street_atm))
$100
>>> print([next(corner_street_atm) for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True  # Crisis is coming, no more money!
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> wall_street_atm = hsbc.create_atm()  # It's even true for new ATMs
>>> print(next(wall_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> hsbc.crisis = False  # The trouble is, even post-crisis the ATM remains empty
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> brand_new_atm = hsbc.create_atm()  # Build a new one to get back in business
>>> for cash in brand_new_atm:
...     print(cash)
$100
$100
$100
…
Note: the built-in next(corner_street_atm) works in both Python 2.6+ and Python 3. The method form is corner_street_atm.next() in Python 2 and corner_street_atm.__next__() in Python 3.
It can be useful for various things like controlling access to a resource.
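If letting StopIteration escape is inconvenient, the built-in next() also accepts a default that is returned once the generator is exhausted. The sketch below assumes Python 3 and reuses the Bank class from the example; the None default and the variable names are choices made here for illustration.

```python
class Bank:
    crisis = False

    def create_atm(self):
        while not self.crisis:
            yield "$100"


bank = Bank()
atm = bank.create_atm()

withdrawals = [next(atm) for _ in range(3)]  # works while there is no crisis

bank.crisis = True
during_crisis = next(atm, None)  # the generator finishes, so the default is returned

bank.crisis = False
after_crisis = next(atm, None)   # a finished generator stays finished

new_atm = bank.create_atm()      # a fresh generator works again
from_new_atm = next(new_atm)

print(withdrawals, during_crisis, after_crisis, from_new_atm)
```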
The original answer is an even better read: https://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do