One of the advantages of the built-in functions all() and any() is the simplicity and understandability – at least if the check is small.
内置函数 all()和 any()的一个优点是简单易懂——至少在检查内容较少时是如此。
When developing software, being nice and understandable is not the only criteria. Often, a big question is “is it fast enough?”. Even if that question can be answered with “yes”, usually there are some conditions when, where and how to use it.
在开发软件时,友好和易于理解并不是唯一的标准。通常,一个大问题是“它够快吗?”。即使这个问题的答案是“是”,通常也会有一些条件,说明何时、何地以及如何使用它。
For all() and any() the same applies. They have their own rights to exist – otherwise, Python wouldn’t provide them as built-in functions.
对于 all()和 any()函数也是如此。它们有存在的理由——否则,Python 不会将它们作为内置函数提供。
In this small blog post, I will try to give an overview of the performance in different scenarios, and some pointers to help you make your own decisions when you want to use this functions and when it makes sense to write your own functions (or for-loops).
在这篇小博客文章中,我将尝试概述在不同场景下的性能,并提供一些指引,帮助你在决定何时使用这些函数以及何时编写自己的函数(或 for 循环)时做出决策。
Implementation 实现
First things first: how are the functions implemented in Python?
首先:这些函数在 Python 中是如何实现的?
They both expect an iterable as input, loop over it and check each element. For all(), it breaks as soon as an element evaluates as False (or to be precise, the breaking condition is ’not True’), for any() as soon as an element is evaluated as True:
它们都期望一个可迭代对象作为输入,遍历它并检查每个元素。对于 all(),一旦有元素评估为 False(或者更准确地说,断开条件是“not True”),它就会中断;对于 any(),一旦有元素评估为 True,它就会中断:
|
|
In Python, an interable is any object which can return its members one at a time. To be precise, the methods __iter__() or __getitem__() need to be implemented. This includes the Python classes list, tuple, str, dict and generators.
在 Python 中,可迭代对象是任何可以一次返回一个成员的对象。准确地说,需要实现__iter__()或__getitem__()方法。这包括 Python 类 list、tuple、str、dict 和生成器。
We only need to test one built-in function to check the performance: Both are looping over all elements in the iterable and breaking as soon as either an element satisfies (any()) or breaches (all()) the Boolean condition.
我们只需要测试一个内置函数来检查性能:两者都遍历可迭代对象中的所有元素,并在任一元素满足(any())或违反(all())布尔条件时中断。
⚐ Small hint: it is possible to call the functions with empty lists: any([]) will evaluate to False, and all([]) to True. Totally logically if you look at the implementations, but my brain struggles with the different behaviors of the functions: Why is “everything” ok in an empty list, but not “anything”?
⚐ 小提示:可以用空列表调用这些函数:any([])将评估为 False,而 all([])将评估为 True。如果你查看实现,这完全是合乎逻辑的,但我的大脑在处理这些函数的不同行为时有些挣扎:为什么在空列表中“所有”都可以,但“任何”都不行?
I also wanted to run the built-in function against a self-implemented any-function which has the condition hard coded and evaluates the given list against it. I used the condition “is not zero”, which implies the question “are any elements non-zero values?”. My implementation looks like this:
我还想将内置函数与一个自实现的 any 函数进行比较,该函数具有硬编码的条件并根据该条件评估给定的列表。我使用的条件是“不是零”,这意味着问题是“是否有任何元素是非零值?”。我的实现如下:
Performance 性能
I wanted to test the built-in functions with different types of iterables, and used list, set and generators. The calls for the any() function are:
我想用不同类型的可迭代对象测试内置函数,并使用了列表、集合和生成器。对 any()函数的调用如下:
My own functions was tested against the original list elements (no generators, no comprehension) , without any changes.
我的自定义函数是针对原始列表元素(没有生成器,没有理解式)进行测试的,没有任何更改。
The setup of the provided iterable is simple: I provided iterables were all elements are either 0 or 1. I tested three scenarios: In the best case (for performance), the first element will break the loop, in the worst case the last element will, and the “normal” case will break in the middle – so in all three cases the return value will be False.
提供的可迭代对象的设置很简单:我提供的可迭代对象中所有元素都是 0 或 1。我测试了三种情况:在最佳情况下(对于性能),第一个元素将中断循环;在最坏情况下,最后一个元素将中断;在“正常”情况下,中间的元素将中断——因此在所有三种情况下,返回值都将是 False。
Setup clear? Good. Now let us check the performance!
设置清楚了吗?好。现在让我们检查性能!
Short, but important recap: the any() function can stop as soon as “any element fits the condition”. In our case, as soon as an element is one and not zero. In the best case, this means only a single element has to be evaluated, worst case, all elements have to be iterated over and checked.
简短但重要的回顾:any()函数可以在“任何元素符合条件”时立即停止。在我们的例子中,只要有一个元素是 1 而不是 0。在最佳情况下,这意味着只需要评估一个元素;在最坏情况下,所有元素都需要被迭代和检查。
Let us start with the worst case:
让我们从最坏的情况开始:
We can see, that the generator needs the most time (0.06ms with a list of size 1’000), and the self-implemented function is the fastest (0.03ms with a list of size 1’000) – twice as fast as the generator.
我们可以看到,生成器需要最多的时间(列表大小为 1,000 时为 0.06 毫秒),而自实现的函数是最快的(列表大小为 1,000 时为 0.03 毫秒)——比生成器快一倍。
For me, it is not a surprise to see that the self-implemented method is faster: It doesn’t have to go through two lists (first to transform it into True/False values, then inside the built-in function). What did surprise me is that the generator takes double time and is slowest. I did not expected that.
对我来说,自实现的方法更快并不意外:它不需要遍历两次列表(第一次将其转换为 True/False 值,然后在内置函数内部)。令我惊讶的是,生成器花费的时间是两倍,并且是最慢的。我没有预料到这一点。
That the set is faster than list for bigger sizes makes sense to me, because the second loop (the loop inside any()) will be smaller: If reduced to True and/or False values only, it consists maximal of two elements in a set-comprehension, but the same length as the original list when using list-comprehension.
对于较大的列表,集合比列表更快,这对我来说是有道理的,因为第二个循环(any()内部的循环)会更小:如果仅限于 True 和/或 False 值,集合理解式中最多包含两个元素,但使用列表理解式时长度与原始列表相同。
Still, the set implementation is 1/3 slower than the self-implemented function. For time-sensitive applications, and big lists, this has an impact.
尽管如此,集合实现仍比自实现的函数慢 1/3。对于时间敏感的应用程序和大列表,这会产生影响。
Now, this was the worst case, let us see if there is a difference if we can break early, using the “normal” case:
现在,这是最坏的情况,让我们看看在“正常”情况下是否有区别:
The self-implemented function only needs half the time it did for the worst case – makes sense, because it can break after half the elements. This implies, my test setup is working ;)
自实现的函数只需要最坏情况下的一半时间——这很合理,因为它可以在一半的元素后中断。这意味着我的测试设置是有效的 ;)
Now, we look at the different iterables provided to the built-in function and we see something interesting: the generator is becoming faster! Before, the generator took 50% more time compared to the set iterable. Now, in the “normal case”, the generator takes 25% less time than the set version.
现在,我们来看一下提供给内置函数的不同可迭代对象,我们发现了一些有趣的事情:生成器变得更快了!之前,生成器花费的时间比集合可迭代对象多 50%。现在,在“正常情况下”,生成器花费的时间比集合版本少 25%。
If we think about it, it again makes sense. As mentioned in the worst case, the list and set version need to iterate twice over the list. The generator only once, but seems to be slower during the loop or evaluation inside the built-in function.
如果我们仔细想想,这也是有道理的。如在最坏情况下所述,列表和集合版本需要遍历列表两次。生成器只需要一次,但在内置函数内部的循环或评估过程中似乎较慢。
Nice to point out: the runtime of the set is the same – again, this is logical because it still has to run through the original list, and provides a list of maximal two elements (True and False) to the built-in function. The runtime of the list is almost the same, just a little bit faster because the second loop can be stopped after half the elements.
值得指出的是:集合的运行时间是相同的——这也是合乎逻辑的,因为它仍然需要遍历原始列表,并向内置函数提供最多两个元素(True 和 False)的列表。列表的运行时间几乎相同,只是稍微快了一点,因为第二个循环可以在一半的元素后停止。
After this realization, the results of the best case scenario shouldn’t be a surprise:
在这种认识之后,最佳情况的结果不应该令人惊讶:
The run time of the list and set version remain the same, but now, the generator version is almost as fast as the self-implemented function. I did a sneak a peek and zoomed in: any_none_zero_values() takes around 500ns and the generator version 900ns.
列表和集合版本的运行时间保持不变,但现在,生成器版本几乎和自实现的函数一样快。我偷偷看了一下并放大了:any_none_zero_values()大约需要 500 纳秒,而生成器版本需要 900 纳秒。
Finally, I also looked at the worst case scenario for small lists:
最后,我还查看了小列表的最坏情况:
I didn’t expect any changes from the worst case with many elements, but there is something I would like to point out: With a small number of elements (smaller than 6 I think), the list version is faster than the set. Which implies that checking “is this element already in the set” is slower than the few extra elements to check.
我没有预料到与许多元素的最坏情况有任何变化,但有一点我想指出:对于少量元素(我认为少于 6 个),列表版本比集合更快。这意味着检查“这个元素是否已经在集合中”比检查几个额外的元素更慢。
Performance-wise, the gap is slightly bigger for small numbers of elements: With only 10 elements, the set version takes double the time of the self-implemented function – for 1’000 elements it is only 1/3 slower.
在性能方面,对于少量元素,差距稍大一些:只有 10 个元素时,集合版本花费的时间是自实现函数的两倍——对于 1,000 个元素,它只慢了 1/3。
Conclusion 结论
Generally speaking, if you develop a time-sensitive application, you should think about implementing your own functions with “hard-coded” conditions.
一般来说,如果你开发一个时间敏感的应用程序,你应该考虑实现自己的具有“硬编码”条件的函数。
If you want to use the built-in functions – for example because you like the readability – you can still improve the performance:
如果你想使用内置函数——例如因为你喜欢它的可读性——你仍然可以提高性能:
I would highly suggest to use list (small size) or set (otherwise) comprehension before the call, if you use it as a check. For example, if you check that your list doesn’t have any None values, and if it has, you raise an exception. In this case your expected behavior is that the call needs to run through all elements and normally will not “break early”.
我强烈建议在调用之前使用列表(小尺寸)或集合(否则)理解式,如果你将其用作检查。例如,如果你检查你的列表中没有任何 None 值,如果有,你会引发异常。在这种情况下,你的预期行为是调用需要遍历所有元素,通常不会“提前中断”。
Other way to identify the right choice of iterable:
识别正确可迭代对象的其他方法:
If you expect that all() returns True and/or any() returns False, use set-comprehension. If you expect the opposite outcome, use generators to reduce time.
如果你期望 all()返回 True 和/或 any()返回 False,请使用集合推导。如果你期望相反的结果,请使用生成器以减少时间。
Appendix 附录
I used the Python package perfplot for the visualization with the following settings:
我使用 Python 包 perfplot 进行可视化,设置如下: