API Reference
More routines for operating on iterables, beyond itertools
Grouping
These tools yield groups of items from a source iterable.
New itertools
- more_itertools.chunked(iterable, n, strict=False)[source]
Break iterable into lists of length n:
>>> list(chunked([1, 2, 3, 4, 5, 6], 3)) [[1, 2, 3], [4, 5, 6]]
By default, the last yielded list will have fewer than n elements if the length of iterable is not divisible by n:
>>> list(chunked([1, 2, 3, 4, 5, 6, 7, 8], 3)) [[1, 2, 3], [4, 5, 6], [7, 8]]
To use a fill-in value instead, see the grouper() recipe.
If the length of iterable is not divisible by n and strict is True, then ValueError will be raised before the last list is yielded.
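The chunking behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not necessarily how more_itertools implements it; the name `chunked_sketch` and the error message are hypothetical.

```python
from itertools import islice


def chunked_sketch(iterable, n, strict=False):
    """Yield lists of up to n items (hypothetical stand-in for chunked())."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))  # take the next n items, or fewer at the end
        if not chunk:
            return
        if strict and len(chunk) != n:
            # Raised only when the short final chunk is reached
            raise ValueError("iterable is not divisible by n")
        yield chunk


print(list(chunked_sketch([1, 2, 3, 4, 5, 6, 7, 8], 3)))
```

Because the check happens lazily, all full chunks are yielded before the error for the short final chunk appears.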
- more_itertools.ichunked(iterable, n)[source]
Break iterable into sub-iterables with n elements each.
ichunked() is like chunked(), but it yields iterables instead of lists.
If the sub-iterables are read in order, the elements of iterable won’t be stored in memory. If they are read out of order, itertools.tee() is used to cache elements as necessary.
>>> from itertools import count >>> all_chunks = ichunked(count(), 4) >>> c_1, c_2, c_3 = next(all_chunks), next(all_chunks), next(all_chunks) >>> list(c_2) # c_1's elements have been cached; c_3's haven't been [4, 5, 6, 7] >>> list(c_1) [0, 1, 2, 3] >>> list(c_3) [8, 9, 10, 11]
- more_itertools.chunked_even(iterable, n)[source]
Break iterable into lists of approximately length n. Items are distributed such that the lengths of the lists differ by at most 1 item.
>>> iterable = [1, 2, 3, 4, 5, 6, 7] >>> n = 3 >>> list(chunked_even(iterable, n)) # List lengths: 3, 2, 2 [[1, 2, 3], [4, 5], [6, 7]] >>> list(chunked(iterable, n)) # List lengths: 3, 3, 1 [[1, 2, 3], [4, 5, 6], [7]]
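One way to arrive at such evenly sized lists is to first fix the number of lists and then spread the remainder over the leading ones. This is a minimal sketch under that assumption; `chunked_even_sketch` is a hypothetical name, and unlike the library function it reads the whole input into memory first.

```python
def chunked_even_sketch(iterable, n):
    """Split items into ceil(len/n) lists whose lengths differ by at most 1."""
    items = list(iterable)
    if not items:
        return []
    num_lists = -(-len(items) // n)  # ceiling division
    base, extra = divmod(len(items), num_lists)
    result, start = [], 0
    for index in range(num_lists):
        size = base + (1 if index < extra else 0)  # first `extra` lists get one more
        result.append(items[start:start + size])
        start += size
    return result


print(chunked_even_sketch([1, 2, 3, 4, 5, 6, 7], 3))
```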
- more_itertools.sliced(seq, n, strict=False)[source]
Yield slices of length n from the sequence seq.
>>> list(sliced((1, 2, 3, 4, 5, 6), 3)) [(1, 2, 3), (4, 5, 6)]
By default, the last yielded slice will have fewer than n elements if the length of seq is not divisible by n:
>>> list(sliced((1, 2, 3, 4, 5, 6, 7, 8), 3)) [(1, 2, 3), (4, 5, 6), (7, 8)]
If the length of seq is not divisible by n and strict is True, then ValueError will be raised before the last slice is yielded.
This function will only work for iterables that support slicing. For non-sliceable iterables, see chunked().
- more_itertools.constrained_batches(iterable, max_size, max_count=None, get_len=len, strict=True)[source]
Yield batches of items from iterable with a combined size limited by max_size.
>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1'] >>> list(constrained_batches(iterable, 10)) [(b'12345', b'123'), (b'12345678', b'1', b'1'), (b'12', b'1')]
If a max_count is supplied, the number of items per batch is also limited:
>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1'] >>> list(constrained_batches(iterable, 10, max_count = 2)) [(b'12345', b'123'), (b'12345678', b'1'), (b'1', b'12'), (b'1',)]
If a get_len function is supplied, use that instead of len() to determine item size.
If strict is True, raise ValueError if any single item is bigger than max_size. Otherwise, allow single items to exceed max_size.
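The batching rules above (a size cap, an optional count cap, and the strict check) can be sketched with a simple accumulator. This is an illustrative approximation, not the library's code; `constrained_batches_sketch` and the error message are hypothetical.

```python
def constrained_batches_sketch(iterable, max_size, max_count=None,
                               get_len=len, strict=True):
    """Yield tuples whose combined size stays within max_size."""
    batch, batch_size = [], 0
    for item in iterable:
        item_len = get_len(item)
        if strict and item_len > max_size:
            raise ValueError("item size exceeds maximum size")
        count_full = max_count is not None and len(batch) == max_count
        size_full = batch_size + item_len > max_size
        if batch and (count_full or size_full):
            yield tuple(batch)  # close the current batch before adding the item
            batch, batch_size = [], 0
        batch.append(item)
        batch_size += item_len
    if batch:
        yield tuple(batch)


data = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1']
print(list(constrained_batches_sketch(data, 10)))
```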
- more_itertools.distribute(n, iterable)[source]
Distribute the items from iterable among n smaller iterables.
>>> group_1, group_2 = distribute(2, [1, 2, 3, 4, 5, 6]) >>> list(group_1) [1, 3, 5] >>> list(group_2) [2, 4, 6]
If the length of iterable is not evenly divisible by n, then the length of the returned iterables will not be identical:
>>> children = distribute(3, [1, 2, 3, 4, 5, 6, 7]) >>> [list(c) for c in children] [[1, 4, 7], [2, 5], [3, 6]]
If the length of iterable is smaller than n, then the last returned iterables will be empty:
>>> children = distribute(5, [1, 2, 3]) >>> [list(c) for c in children] [[1], [2], [3], [], []]
This function uses itertools.tee() and may require significant storage.
If you need the order of items in the smaller iterables to match the original iterable, see divide().
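The round-robin distribution and its reliance on itertools.tee() can be illustrated as follows. This is a minimal sketch with a hypothetical name, `distribute_sketch`: each child is a strided slice over its own tee of the source.

```python
from itertools import islice, tee


def distribute_sketch(n, iterable):
    """Return n iterators; child k yields items k, k + n, k + 2n, ..."""
    children = tee(iterable, n)  # tee buffers items not yet seen by every child
    return [islice(child, index, None, n) for index, child in enumerate(children)]


print([list(c) for c in distribute_sketch(3, [1, 2, 3, 4, 5, 6, 7])])
```

The storage caveat follows from tee(): if one child is consumed fully before the others, every item has to be buffered for the children that lag behind.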
- more_itertools.divide(n, iterable)[source]
Divide the elements from iterable into n parts, maintaining order.
>>> group_1, group_2 = divide(2, [1, 2, 3, 4, 5, 6]) >>> list(group_1) [1, 2, 3] >>> list(group_2) [4, 5, 6]
If the length of iterable is not evenly divisible by n, then the length of the returned iterables will not be identical:
>>> children = divide(3, [1, 2, 3, 4, 5, 6, 7]) >>> [list(c) for c in children] [[1, 2, 3], [4, 5], [6, 7]]
If the length of the iterable is smaller than n, then the last returned iterables will be empty:
>>> children = divide(5, [1, 2, 3]) >>> [list(c) for c in children] [[1], [2], [3], [], []]
This function will exhaust the iterable before returning. If order is not important, see distribute(), which does not first pull the iterable into memory.
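The order-preserving split can be sketched by materializing the input and slicing it into n contiguous parts, with the first parts receiving one extra item when the length is not divisible by n. `divide_sketch` is a hypothetical name and this is only an approximation of the documented behavior.

```python
def divide_sketch(n, iterable):
    """Return n iterators over contiguous, nearly equal parts of iterable."""
    seq = tuple(iterable)  # the whole input is read up front
    quotient, remainder = divmod(len(seq), n)
    parts, stop = [], 0
    for index in range(n):
        start = stop
        stop += (quotient + 1) if index < remainder else quotient
        parts.append(iter(seq[start:stop]))
    return parts


print([list(p) for p in divide_sketch(3, [1, 2, 3, 4, 5, 6, 7])])
```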
- more_itertools.split_at(iterable, pred, maxsplit=-1, keep_separator=False)[source]
Yield lists of items from iterable, where each list is delimited by an item where callable pred returns True.
>>> list(split_at('abcdcba', lambda x: x == 'b')) [['a'], ['c', 'd', 'c'], ['a']]
>>> list(split_at(range(10), lambda n: n % 2 == 1)) [[0], [2], [4], [6], [8], []]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_at(range(10), lambda n: n % 2 == 1, maxsplit=2)) [[0], [2], [4, 5, 6, 7, 8, 9]]
By default, the delimiting items are not included in the output. To include them, set keep_separator to True.
>>> list(split_at('abcdcba', lambda x: x == 'b', keep_separator=True)) [['a'], ['b'], ['c', 'd', 'c'], ['b'], ['a']]
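The interaction between maxsplit and keep_separator is easier to follow in code. Below is a minimal sketch that reproduces the documented outputs; `split_at_sketch` is a hypothetical name and the library's implementation may differ.

```python
def split_at_sketch(iterable, pred, maxsplit=-1, keep_separator=False):
    """Yield lists of items, splitting at each item for which pred is true."""
    if maxsplit == 0:
        yield list(iterable)
        return
    it = iter(iterable)
    buffer = []
    for item in it:
        if pred(item):
            yield buffer
            if keep_separator:
                yield [item]  # the separator gets its own list
            if maxsplit == 1:
                yield list(it)  # split budget used up: emit the rest as one list
                return
            buffer = []
            maxsplit -= 1
        else:
            buffer.append(item)
    yield buffer


print(list(split_at_sketch(range(10), lambda n: n % 2 == 1, maxsplit=2)))
```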
- more_itertools.split_before(iterable, pred, maxsplit=-1)[source]
Yield lists of items from iterable, where each list ends just before an item for which callable pred returns True:
>>> list(split_before('OneTwo', lambda s: s.isupper())) [['O', 'n', 'e'], ['T', 'w', 'o']]
>>> list(split_before(range(10), lambda n: n % 3 == 0)) [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_before(range(10), lambda n: n % 3 == 0, maxsplit=2)) [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
- more_itertools.split_after(iterable, pred, maxsplit=-1)[source]
Yield lists of items from iterable, where each list ends with an item where callable pred returns True:
>>> list(split_after('one1two2', lambda s: s.isdigit())) [['o', 'n', 'e', '1'], ['t', 'w', 'o', '2']]
>>> list(split_after(range(10), lambda n: n % 3 == 0)) [[0], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_after(range(10), lambda n: n % 3 == 0, maxsplit=2)) [[0], [1, 2, 3], [4, 5, 6, 7, 8, 9]]
- more_itertools.split_into(iterable, sizes)[source]
Yield a list of sequential items from iterable of length ‘n’ for each integer ‘n’ in sizes.
>>> list(split_into([1,2,3,4,5,6], [1,2,3])) [[1], [2, 3], [4, 5, 6]]
If the sum of sizes is smaller than the length of iterable, then the remaining items of iterable will not be returned.
>>> list(split_into([1,2,3,4,5,6], [2,3])) [[1, 2], [3, 4, 5]]
If the sum of sizes is larger than the length of iterable, fewer items will be returned in the iteration that overruns iterable and further lists will be empty:
>>> list(split_into([1,2,3,4], [1,2,3,4])) [[1], [2, 3], [4], []]
When a None object is encountered in sizes, the returned list will contain items up to the end of iterable, the same way that itertools.islice does:
>>> list(split_into([1,2,3,4,5,6,7,8,9,0], [2,3,None])) [[1, 2], [3, 4, 5], [6, 7, 8, 9, 0]]
split_into() can be useful for grouping a series of items where the sizes of the groups are not uniform. An example would be where, in a row from a table, multiple columns represent elements of the same feature (e.g. a point represented by x,y,z) but the format is not the same for all columns.
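The size-driven splitting, including the None case and the overrun case, can be sketched with itertools.islice. `split_into_sketch` is a hypothetical name, and the table row below is made-up illustration data.

```python
from itertools import islice


def split_into_sketch(iterable, sizes):
    """Yield one list per entry in sizes; None means 'everything that is left'."""
    it = iter(iterable)
    for size in sizes:
        if size is None:
            yield list(it)
            return
        yield list(islice(it, size))  # may be short or empty if `it` runs out


# Hypothetical row: an identifier, an (x, y, z) point, then a status field
row = ['sensor-7', 1.0, 2.0, 3.0, 'ok']
print(list(split_into_sketch(row, [1, 3, None])))
```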
- more_itertools.split_when(iterable, pred, maxsplit=-1)[source]
Split iterable into pieces based on the output of pred. pred should be a function that takes successive pairs of items and returns True if the iterable should be split in between them.
For example, to find runs of increasing numbers, split the iterable when element i is larger than element i + 1:
>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2], lambda x, y: x > y)) [[1, 2, 3, 3], [2, 5], [2, 4], [2]]
At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:
>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2], ... lambda x, y: x > y, maxsplit=2)) [[1, 2, 3, 3], [2, 5], [2, 4, 2]]
- more_itertools.bucket(iterable, key, validator=None)[source]
Wrap iterable and return an object that buckets the iterable into child iterables based on a key function.
>>> iterable = ['a1', 'b1', 'c1', 'a2', 'b2', 'c2', 'b3'] >>> s = bucket(iterable, key=lambda x: x[0]) # Bucket by 1st character >>> sorted(list(s)) # Get the keys ['a', 'b', 'c'] >>> a_iterable = s['a'] >>> next(a_iterable) 'a1' >>> next(a_iterable) 'a2' >>> list(s['b']) ['b1', 'b2', 'b3']
The original iterable will be advanced and its items will be cached until they are used by the child iterables. This may require significant storage.
By default, attempting to select a bucket to which no items belong will exhaust the iterable and cache all values. If you specify a validator function, selected buckets will instead be checked against it.
>>> from itertools import count >>> it = count(1, 2) # Infinite sequence of odd numbers >>> key = lambda x: x % 10 # Bucket by last digit >>> validator = lambda x: x in {1, 3, 5, 7, 9} # Odd digits only >>> s = bucket(it, key=key, validator=validator) >>> 2 in s False >>> list(s[2]) []
- more_itertools.unzip(iterable)[source]
The inverse of zip(), this function disaggregates the elements of the zipped iterable.
The i-th iterable contains the i-th element from each element of the zipped iterable. The first element is used to determine the length of the remaining elements.
>>> iterable = [('a', 1), ('b', 2), ('c', 3), ('d', 4)] >>> letters, numbers = unzip(iterable) >>> list(letters) ['a', 'b', 'c', 'd'] >>> list(numbers) [1, 2, 3, 4]
This is similar to using zip(*iterable), but it avoids reading iterable into memory. Note, however, that this function uses itertools.tee() and thus may require significant storage.
Itertools recipes
- more_itertools.batched(iterable, n, *, strict=False)[source]
Batch data into tuples of length n. If the number of items in iterable is not divisible by n:
* The last batch will be shorter if strict is False.
* ValueError will be raised if strict is True.
>>> list(batched('ABCDEFG', 3)) [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
On Python 3.13 and above, this is an alias for itertools.batched().
- more_itertools.grouper(iterable, n, incomplete='fill', fillvalue=None)[source]
Group elements from iterable into fixed-length groups of length n.
>>> list(grouper('ABCDEF', 3)) [('A', 'B', 'C'), ('D', 'E', 'F')]
The keyword arguments incomplete and fillvalue control what happens for iterables whose length is not a multiple of n.
When incomplete is ‘fill’, the last group will contain instances of fillvalue.
>>> list(grouper('ABCDEFG', 3, incomplete='fill', fillvalue='x')) [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
When incomplete is ‘ignore’, the last group will not be emitted.
>>> list(grouper('ABCDEFG', 3, incomplete='ignore', fillvalue='x')) [('A', 'B', 'C'), ('D', 'E', 'F')]
When incomplete is ‘strict’, a subclass of ValueError will be raised.
>>> it = grouper('ABCDEFG', 3, incomplete='strict') >>> list(it) Traceback (most recent call last): ... UnequalIterablesError
- more_itertools.partition(pred, iterable)[source]
Returns a 2-tuple of iterables derived from the input iterable. The first yields the items that have pred(item) == False. The second yields the items that have pred(item) == True.
>>> is_odd = lambda x: x % 2 != 0 >>> iterable = range(10) >>> even_items, odd_items = partition(is_odd, iterable) >>> list(even_items), list(odd_items) ([0, 2, 4, 6, 8], [1, 3, 5, 7, 9])
If pred is None, bool() is used.
>>> iterable = [0, 1, False, True, '', ' '] >>> false_items, true_items = partition(None, iterable) >>> list(false_items), list(true_items) ([0, False, ''], [1, True, ' '])
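A well-known way to build such a pair of iterables is to tee() the input and filter each copy. The sketch below follows that approach under a hypothetical name, `partition_sketch`; note that in this simple form pred is evaluated twice per item, which the library may avoid.

```python
from itertools import filterfalse, tee


def partition_sketch(pred, iterable):
    """Return (items where pred is false, items where pred is true)."""
    if pred is None:
        pred = bool
    false_side, true_side = tee(iterable)  # two independent views of the input
    return filterfalse(pred, false_side), filter(pred, true_side)


evens, odds = partition_sketch(lambda x: x % 2 != 0, range(10))
print(list(evens), list(odds))
```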
Lookahead and lookback
These tools peek at an iterable’s values without advancing it.
New itertools
- more_itertools.spy(iterable, n=1)[source]
Return a 2-tuple with a list containing the first n elements of iterable, and an iterator with the same items as iterable. This allows you to “look ahead” at the items in the iterable without advancing it.
There is one item in the list by default:
>>> iterable = 'abcdefg' >>> head, iterable = spy(iterable) >>> head ['a'] >>> list(iterable) ['a', 'b', 'c', 'd', 'e', 'f', 'g']
You may use unpacking to retrieve items instead of lists:
>>> (head,), iterable = spy('abcdefg') >>> head 'a' >>> (first, second), iterable = spy('abcdefg', 2) >>> first 'a' >>> second 'b'
The number of items requested can be larger than the number of items in the iterable:
>>> iterable = [1, 2, 3, 4, 5] >>> head, iterable = spy(iterable, 10) >>> head [1, 2, 3, 4, 5] >>> list(iterable) [1, 2, 3, 4, 5]
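The "look ahead without advancing" effect can be achieved by reading the head eagerly and chaining it back in front of the remaining items. A minimal sketch under that assumption, with the hypothetical name `spy_sketch`:

```python
from itertools import chain, islice


def spy_sketch(iterable, n=1):
    """Return (first n items as a list, iterator over all items)."""
    it = iter(iterable)
    head = list(islice(it, n))  # consumes up to n items from the source
    # Hand back a copy so callers can mutate it without affecting the iterator
    return head.copy(), chain(head, it)


head, items = spy_sketch('abcdefg', 2)
print(head, list(items))
```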
- class more_itertools.peekable(iterable)[source]
Wrap an iterator to allow lookahead and prepending elements.
Call peek() on the result to get the value that will be returned by next(). This won’t advance the iterator:
>>> p = peekable(['a', 'b']) >>> p.peek() 'a' >>> next(p) 'a'
Pass peek() a default value to return that instead of raising StopIteration when the iterator is exhausted.
>>> p = peekable([]) >>> p.peek('hi') 'hi'
peekables also offer a prepend() method, which “inserts” items at the head of the iterable:
>>> p = peekable([1, 2, 3]) >>> p.prepend(10, 11, 12) >>> next(p) 10 >>> p.peek() 11 >>> list(p) [11, 12, 1, 2, 3]
peekables can be indexed. Index 0 is the item that will be returned by next(), index 1 is the item after that, and so on. The values up to the given index will be cached.
>>> p = peekable(['a', 'b', 'c', 'd']) >>> p[0] 'a' >>> p[1] 'b' >>> next(p) 'a'
Negative indexes are supported, but be aware that they will cache the remaining items in the source iterator, which may require significant storage.
To check whether a peekable is exhausted, check its truth value:
>>> p = peekable(['a', 'b']) >>> if p: # peekable has items ... list(p) ['a', 'b'] >>> if not p: # peekable is exhausted ... list(p) []
- class more_itertools.seekable(iterable, maxlen=None)[source]
Wrap an iterator to allow for seeking backward and forward. This progressively caches the items in the source iterable so they can be re-visited.
Call seek() with an index to seek to that position in the source iterable.
To “reset” an iterator, seek to 0:
>>> from itertools import count >>> it = seekable((str(n) for n in count())) >>> next(it), next(it), next(it) ('0', '1', '2') >>> it.seek(0) >>> next(it), next(it), next(it) ('0', '1', '2')
You can also seek forward:
>>> it = seekable((str(n) for n in range(20))) >>> it.seek(10) >>> next(it) '10' >>> it.seek(20) # Seeking past the end of the source isn't a problem >>> list(it) [] >>> it.seek(0) # Resetting works even after hitting the end >>> next(it) '0'
Call relative_seek() to seek relative to the source iterator’s current position.
>>> it = seekable((str(n) for n in range(20))) >>> next(it), next(it), next(it) ('0', '1', '2') >>> it.relative_seek(2) >>> next(it) '5' >>> it.relative_seek(-3) # Source is at '6', we move back to '3' >>> next(it) '3' >>> it.relative_seek(-3) # Source is at '4', we move back to '1' >>> next(it) '1'
Call peek() to look ahead one item without advancing the iterator:
>>> it = seekable('1234') >>> it.peek() '1' >>> list(it) ['1', '2', '3', '4'] >>> it.peek(default='empty') 'empty'
Before the iterator is at its end, calling bool() on it will return True. Afterwards, it will return False:
>>> it = seekable('5678') >>> bool(it) True >>> list(it) ['5', '6', '7', '8'] >>> bool(it) False
You may view the contents of the cache with the elements() method. That returns a SequenceView, a view that updates automatically:
>>> it = seekable((str(n) for n in range(10))) >>> next(it), next(it), next(it) ('0', '1', '2') >>> elements = it.elements() >>> elements SequenceView(['0', '1', '2']) >>> next(it) '3' >>> elements SequenceView(['0', '1', '2', '3'])
By default, the cache grows as the source iterable progresses, so beware of wrapping very large or infinite iterables. Supply maxlen to limit the size of the cache (this of course limits how far back you can seek).
>>> from itertools import count >>> it = seekable((str(n) for n in count()), maxlen=2) >>> next(it), next(it), next(it), next(it) ('0', '1', '2', '3') >>> list(it.elements()) ['2', '3'] >>> it.seek(0) >>> next(it), next(it), next(it), next(it) ('2', '3', '4', '5') >>> next(it) '6'
Windowing
These tools yield windows of items from an iterable.
New itertools
- more_itertools.windowed(seq, n, fillvalue=None, step=1)[source]
Return a sliding window of width n over the given iterable.
>>> all_windows = windowed([1, 2, 3, 4, 5], 3) >>> list(all_windows) [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
When the window is larger than the iterable, fillvalue is used in place of missing values:
>>> list(windowed([1, 2, 3], 4)) [(1, 2, 3, None)]
Each window will advance in increments of step:
>>> list(windowed([1, 2, 3, 4, 5, 6], 3, fillvalue='!', step=2)) [(1, 2, 3), (3, 4, 5), (5, 6, '!')]
To slide into the iterable’s items, use chain() to add filler items to the left:
>>> from itertools import chain >>> iterable = [1, 2, 3, 4] >>> n = 3 >>> padding = [None] * (n - 1) >>> list(windowed(chain(padding, iterable), 3)) [(None, None, 1), (None, 1, 2), (1, 2, 3), (2, 3, 4)]
- more_itertools.substrings(iterable)[source]
Yield all of the substrings of iterable.
>>> [''.join(s) for s in substrings('more')] ['m', 'o', 'r', 'e', 'mo', 'or', 're', 'mor', 'ore', 'more']
Note that non-string iterables can also be subdivided.
>>> list(substrings([0, 1, 2])) [(0,), (1,), (2,), (0, 1), (1, 2), (0, 1, 2)]
- more_itertools.substrings_indexes(seq, reverse=False)[source]
Yield all substrings and their positions in seq.
The items yielded will be a tuple of the form (substr, i, j), where substr == seq[i:j].
This function only works for iterables that support slicing, such as str objects.
>>> for item in substrings_indexes('more'): ... print(item) ('m', 0, 1) ('o', 1, 2) ('r', 2, 3) ('e', 3, 4) ('mo', 0, 2) ('or', 1, 3) ('re', 2, 4) ('mor', 0, 3) ('ore', 1, 4) ('more', 0, 4)
Set reverse to True to yield the same items in the opposite order.
- more_itertools.stagger(iterable, offsets=(-1, 0, 1), longest=False, fillvalue=None)[source]
Yield tuples whose elements are offset from iterable. The amount by which the i-th item in each tuple is offset is given by the i-th item in offsets.
>>> list(stagger([0, 1, 2, 3])) [(None, 0, 1), (0, 1, 2), (1, 2, 3)] >>> list(stagger(range(8), offsets=(0, 2, 4))) [(0, 2, 4), (1, 3, 5), (2, 4, 6), (3, 5, 7)]
By default, the sequence will end when the final element of a tuple is the last item in the iterable. To continue until the first element of a tuple is the last item in the iterable, set longest to True:
>>> list(stagger([0, 1, 2, 3], longest=True)) [(None, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, None), (3, None, None)]
By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.
- more_itertools.windowed_complete(iterable, n)[source]
Yield (beginning, middle, end) tuples, where:
* Each middle has n items from iterable
* Each beginning has the items before the ones in middle
* Each end has the items after the ones in middle
>>> iterable = range(7) >>> n = 3 >>> for beginning, middle, end in windowed_complete(iterable, n): ... print(beginning, middle, end) () (0, 1, 2) (3, 4, 5, 6) (0,) (1, 2, 3) (4, 5, 6) (0, 1) (2, 3, 4) (5, 6) (0, 1, 2) (3, 4, 5) (6,) (0, 1, 2, 3) (4, 5, 6) ()
Note that n must be at least 0 and at most equal to the length of iterable.
This function will exhaust the iterable and may require significant storage.
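Since the input is read in full anyway, the three-part windows can be sketched with plain tuple slicing. `windowed_complete_sketch` is a hypothetical name; this illustrates the documented output rather than the library's implementation, and it omits validation of n.

```python
def windowed_complete_sketch(iterable, n):
    """Yield (beginning, middle, end) tuples for each window of n items."""
    seq = tuple(iterable)  # exhausts the iterable up front
    for start in range(len(seq) - n + 1):
        stop = start + n
        yield seq[:start], seq[start:stop], seq[stop:]


for beginning, middle, end in windowed_complete_sketch(range(5), 3):
    print(beginning, middle, end)
```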
Itertools recipes
- more_itertools.pairwise(iterable)[source]
Return an iterator of overlapping pairs of items from iterable.
>>> from itertools import count >>> take(4, pairwise(count())) [(0, 1), (1, 2), (2, 3), (3, 4)]
On Python 3.10 and above, this is an alias for itertools.pairwise().
- more_itertools.triplewise(iterable)[source]
Return overlapping triplets from iterable.
>>> list(triplewise('ABCDE')) [('A', 'B', 'C'), ('B', 'C', 'D'), ('C', 'D', 'E')]
- more_itertools.sliding_window(iterable, n)[source]
Return a sliding window of width n over iterable.
>>> list(sliding_window(range(6), 4)) [(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]
If iterable has fewer than n items, then nothing is yielded:
>>> list(sliding_window(range(3), 4)) []
For a variant with more features, see windowed().
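A common way to implement a fixed-width sliding window is a bounded deque that drops the oldest item as each new one arrives. The following minimal sketch takes that approach; `sliding_window_sketch` is a hypothetical name.

```python
from collections import deque
from itertools import islice


def sliding_window_sketch(iterable, n):
    """Yield tuples of n consecutive items; nothing if fewer than n items."""
    it = iter(iterable)
    window = deque(islice(it, n - 1), maxlen=n)  # pre-fill with n - 1 items
    for item in it:
        window.append(item)  # maxlen evicts the oldest item automatically
        yield tuple(window)


print(list(sliding_window_sketch(range(6), 4)))
```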
- more_itertools.subslices(iterable)[source]
Return all contiguous non-empty subslices of iterable.
>>> list(subslices('ABC')) [['A'], ['A', 'B'], ['A', 'B', 'C'], ['B'], ['B', 'C'], ['C']]
This is similar to substrings(), but emits items in a different order.
Augmenting
These tools yield items from an iterable, plus additional data.
New itertools
- more_itertools.count_cycle(iterable, n=None)[source]
Cycle through the items from iterable up to n times, yielding the number of completed cycles along with each item. If n is omitted the process repeats indefinitely.
>>> list(count_cycle('AB', 3)) [(0, 'A'), (0, 'B'), (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]
- more_itertools.intersperse(e, iterable, n=1)[source]
Intersperse filler element e among the items in iterable, leaving n items between each filler element.
>>> list(intersperse('!', [1, 2, 3, 4, 5])) [1, '!', 2, '!', 3, '!', 4, '!', 5]
>>> list(intersperse(None, [1, 2, 3, 4, 5], n=2)) [1, 2, None, 3, 4, None, 5]
- more_itertools.padded(iterable, fillvalue=None, n=None, next_multiple=False)[source]
Yield the elements from iterable, followed by fillvalue, such that at least n items are emitted.
>>> list(padded([1, 2, 3], '?', 5)) [1, 2, 3, '?', '?']
If next_multiple is True, fillvalue will be emitted until the number of items emitted is a multiple of n:
>>> list(padded([1, 2, 3, 4], n=3, next_multiple=True)) [1, 2, 3, 4, None, None]
If n is None, fillvalue will be emitted indefinitely.
To create an iterable of exactly size n, you can truncate with islice().
>>> from itertools import islice >>> list(islice(padded([1, 2, 3], '?'), 5)) [1, 2, 3, '?', '?'] >>> list(islice(padded([1, 2, 3, 4, 5, 6, 7, 8], '?'), 5)) [1, 2, 3, 4, 5]
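The three padding modes (pad to at least n, pad to the next multiple of n, pad forever) can be sketched by counting the emitted items. `padded_sketch` is a hypothetical name and only approximates the documented behavior.

```python
from itertools import chain, islice, repeat


def padded_sketch(iterable, fillvalue=None, n=None, next_multiple=False):
    """Yield items, then fillvalue according to n and next_multiple."""
    it = iter(iterable)
    if n is None:
        yield from chain(it, repeat(fillvalue))  # infinite padding
        return
    count = 0
    for item in it:
        yield item
        count += 1
    if next_multiple:
        remaining = (n - count % n) % n  # 0 when count is already a multiple
    else:
        remaining = max(n - count, 0)
    yield from repeat(fillvalue, remaining)


print(list(padded_sketch([1, 2, 3, 4], n=3, next_multiple=True)))
print(list(islice(padded_sketch([1, 2, 3], '?'), 5)))
```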
- more_itertools.mark_ends(iterable)[source]
Yield 3-tuples of the form (is_first, is_last, item).
>>> list(mark_ends('ABC')) [(True, False, 'A'), (False, False, 'B'), (False, True, 'C')]
Use this when looping over an iterable to take special action on its first and/or last items:
>>> iterable = ['Header', 100, 200, 'Footer'] >>> total = 0 >>> for is_first, is_last, item in mark_ends(iterable): ... if is_first: ... continue # Skip the header ... if is_last: ... continue # Skip the footer ... total += item >>> print(total) 300
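Knowing whether an item is the last one requires looking one item ahead. The sketch below holds back the current item until the next one (or the end) is seen; `mark_ends_sketch` is a hypothetical name and not the library's code.

```python
def mark_ends_sketch(iterable):
    """Yield (is_first, is_last, item) for each item in iterable."""
    it = iter(iterable)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input yields nothing
    is_first = True
    for upcoming in it:
        yield is_first, False, current  # a later item exists, so not last
        is_first = False
        current = upcoming
    yield is_first, True, current


print(list(mark_ends_sketch('ABC')))
```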
- more_itertools.repeat_each(iterable, n=2)[source]
Repeat each element in iterable n times.
>>> list(repeat_each('ABC', 3)) ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']
- more_itertools.repeat_last(iterable, default=None)[source]
After the iterable is exhausted, keep yielding its last element.
>>> from itertools import islice >>> list(islice(repeat_last(range(3)), 5)) [0, 1, 2, 2, 2]
If the iterable is empty, yield default forever:
>>> list(islice(repeat_last(range(0), 42), 5)) [42, 42, 42, 42, 42]
- more_itertools.adjacent(predicate, iterable, distance=1)[source]
Return an iterable over (bool, item) tuples where the item is drawn from iterable and the bool indicates whether that item satisfies the predicate or is adjacent to an item that does.
For example, to find whether items are adjacent to a 3:
>>> list(adjacent(lambda x: x == 3, range(6))) [(False, 0), (False, 1), (True, 2), (True, 3), (True, 4), (False, 5)]
Set distance to change what counts as adjacent. For example, to find whether items are two places away from a 3:
>>> list(adjacent(lambda x: x == 3, range(6), distance=2)) [(False, 0), (True, 1), (True, 2), (True, 3), (True, 4), (True, 5)]
This is useful for contextualizing the results of a search function. For example, a code comparison tool might want to identify lines that have changed, but also surrounding lines to give the viewer of the diff context.
The predicate function will only be called once for each item in the iterable.
See also groupby_transform(), which can be used with this function to group ranges of items with the same bool value.
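The adjacency flag can be illustrated by evaluating the predicate once per item and then checking a window of those results around each position. This is a simplified sketch with the hypothetical name `adjacent_sketch`; it reads the whole input into a list, which the library function may not do.

```python
def adjacent_sketch(predicate, iterable, distance=1):
    """Yield (flag, item), where flag marks items within distance of a match."""
    items = list(iterable)
    hits = [bool(predicate(item)) for item in items]  # one call per item
    for index, item in enumerate(items):
        low = max(index - distance, 0)
        high = index + distance + 1
        yield any(hits[low:high]), item


print(list(adjacent_sketch(lambda x: x == 3, range(6))))
```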
- more_itertools.groupby_transform(iterable, keyfunc=None, valuefunc=None, reducefunc=None)[source]
An extension of itertools.groupby() that can apply transformations to the grouped data.
* keyfunc is a function computing a key value for each item in iterable
* valuefunc is a function that transforms the individual items from iterable after grouping
* reducefunc is a function that transforms each group of items
>>> iterable = 'aAAbBBcCC' >>> keyfunc = lambda k: k.upper() >>> valuefunc = lambda v: v.lower() >>> reducefunc = lambda g: ''.join(g) >>> list(groupby_transform(iterable, keyfunc, valuefunc, reducefunc)) [('A', 'aaa'), ('B', 'bbb'), ('C', 'ccc')]
Each optional argument defaults to an identity function if not specified.
groupby_transform() is useful when grouping elements of an iterable using a separate iterable as the key. To do this, zip() the iterables and pass a keyfunc that extracts the first element and a valuefunc that extracts the second element:
>>> from operator import itemgetter >>> keys = [0, 0, 1, 1, 1, 2, 2, 2, 3] >>> values = 'abcdefghi' >>> iterable = zip(keys, values) >>> grouper = groupby_transform(iterable, itemgetter(0), itemgetter(1)) >>> [(k, ''.join(g)) for k, g in grouper] [(0, 'ab'), (1, 'cde'), (2, 'fgh'), (3, 'i')]
Note that the order of items in the iterable is significant. Only adjacent items are grouped together, so if you don’t want any duplicate groups, you should sort the iterable by the key function.
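The relationship to itertools.groupby() can be sketched by layering the two optional transformations on top of each group. `groupby_transform_sketch` is a hypothetical name; the second call shows why unsorted input produces duplicate groups.

```python
from itertools import groupby


def groupby_transform_sketch(iterable, keyfunc=None, valuefunc=None, reducefunc=None):
    """Yield (key, group), mapping valuefunc over items and reducefunc over groups."""
    for key, group in groupby(iterable, keyfunc):
        if valuefunc is not None:
            group = map(valuefunc, group)  # transform individual items
        if reducefunc is not None:
            group = reducefunc(group)  # transform the group as a whole
        yield key, group


print(list(groupby_transform_sketch('aAAbBBcCC', str.upper, str.lower, ''.join)))
# Only adjacent items are grouped, so 'a' appears as two separate groups here
print(list(groupby_transform_sketch('aba', reducefunc=''.join)))
```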
Itertools recipes
- more_itertools.padnone()
Combining
These tools combine multiple iterables.
New itertools
- more_itertools.collapse(iterable, base_type=None, levels=None)[source]
Flatten an iterable with multiple levels of nesting (e.g., a list of lists of tuples) into non-iterable types.
>>> iterable = [(1, 2), ([3, 4], [[5], [6]])] >>> list(collapse(iterable)) [1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable and will not be collapsed.
To avoid collapsing other types, specify base_type:
>>> iterable = ['ab', ('cd', 'ef'), ['gh', 'ij']] >>> list(collapse(iterable, base_type=tuple)) ['ab', ('cd', 'ef'), 'gh', 'ij']
Specify levels to stop flattening after a certain level:
>>> iterable = [('a', ['b']), ('c', ['d'])] >>> list(collapse(iterable)) # Fully flattened ['a', 'b', 'c', 'd'] >>> list(collapse(iterable, levels=1)) # Only one level flattened ['a', ['b'], 'c', ['d']]
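The flattening rules (strings are atomic, base_type is atomic, and levels caps the depth) can be expressed as a small recursive generator. This is a sketch under those assumptions with the hypothetical name `collapse_sketch`; the library may use a non-recursive approach.

```python
def collapse_sketch(iterable, base_type=None, levels=None):
    """Flatten nested iterables, treating str, bytes and base_type as atoms."""
    def walk(node, level):
        if (
            (levels is not None and level > levels)
            or isinstance(node, (str, bytes))
            or (base_type is not None and isinstance(node, base_type))
        ):
            yield node
            return
        try:
            children = iter(node)
        except TypeError:
            yield node  # not iterable: emit as-is
            return
        for child in children:
            yield from walk(child, level + 1)

    yield from walk(iterable, 0)


print(list(collapse_sketch([(1, 2), ([3, 4], [[5], [6]])])))
```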
- more_itertools.interleave(*iterables)[source]
Return a new iterable yielding from each iterable in turn, until the shortest is exhausted.
>>> list(interleave([1, 2, 3], [4, 5], [6, 7, 8])) [1, 4, 6, 2, 5, 7]
For a version that doesn’t terminate after the shortest iterable is exhausted, see interleave_longest().
- more_itertools.interleave_longest(*iterables)[source]
Return a new iterable yielding from each iterable in turn, skipping any that are exhausted.
>>> list(interleave_longest([1, 2, 3], [4, 5], [6, 7, 8])) [1, 4, 6, 2, 5, 7, 3, 8]
This function produces the same output as roundrobin(), but may perform better for some inputs (in particular when the number of iterables is large).
- more_itertools.interleave_evenly(iterables, lengths=None)[source]
Interleave multiple iterables so that their elements are evenly distributed throughout the output sequence.
>>> iterables = [1, 2, 3, 4, 5], ['a', 'b'] >>> list(interleave_evenly(iterables)) [1, 2, 'a', 3, 4, 'b', 5]
>>> iterables = [[1, 2, 3], [4, 5], [6, 7, 8]] >>> list(interleave_evenly(iterables)) [1, 6, 4, 2, 7, 3, 8, 5]
This function requires iterables of known length. Iterables without __len__() can be used by specifying their lengths manually with lengths:
>>> from itertools import combinations, repeat >>> iterables = [combinations(range(4), 2), ['a', 'b', 'c']] >>> lengths = [4 * (4 - 1) // 2, 3] >>> list(interleave_evenly(iterables, lengths=lengths)) [(0, 1), (0, 2), 'a', (0, 3), (1, 2), 'b', (1, 3), (2, 3), 'c']
Based on Bresenham’s algorithm.
- more_itertools.partial_product(*iterables)[source]
Yields tuples containing one item from each iterator, with subsequent tuples changing a single item at a time by advancing each iterator until it is exhausted. This sequence guarantees every value in each iterable is output at least once without generating all possible combinations.
This may be useful, for example, when testing an expensive function.
>>> list(partial_product('AB', 'C', 'DEF')) [('A', 'C', 'D'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'C', 'F')]
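The "change a single item at a time" traversal can be sketched by keeping one current tuple and advancing each iterator in turn until it runs out. `partial_product_sketch` is a hypothetical name used for illustration only.

```python
def partial_product_sketch(*iterables):
    """Yield tuples that cover every value of every iterable at least once."""
    iterators = [iter(it) for it in iterables]
    try:
        current = [next(it) for it in iterators]  # first item from each
    except StopIteration:
        return  # any empty input means there is nothing to yield
    yield tuple(current)
    for position, it in enumerate(iterators):
        for value in it:
            current[position] = value  # advance only this position
            yield tuple(current)


print(list(partial_product_sketch('AB', 'C', 'DEF')))
```

For inputs of lengths 2, 1 and 3 this yields 4 tuples, compared with 6 for the full Cartesian product.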
- more_itertools.sort_together(iterables, key_list=(0,), key=None, reverse=False, strict=False)[source]
Return the input iterables sorted together, with key_list as the priority for sorting. All iterables are trimmed to the length of the shortest one.
This can be used like the sorting function in a spreadsheet. If each iterable represents a column of data, the key list determines which columns are used for sorting.
By default, all iterables are sorted using the 0-th iterable:
>>> iterables = [(4, 3, 2, 1), ('a', 'b', 'c', 'd')] >>> sort_together(iterables) [(1, 2, 3, 4), ('d', 'c', 'b', 'a')]
Set a different key list to sort according to another iterable. Specifying multiple keys dictates how ties are broken:
>>> iterables = [(3, 1, 2), (0, 1, 0), ('c', 'b', 'a')] >>> sort_together(iterables, key_list=(1, 2)) [(2, 3, 1), (0, 0, 1), ('a', 'c', 'b')]
To sort by a function of the elements of the iterable, pass a key function. Its arguments are the elements of the iterables corresponding to the key list:
>>> names = ('a', 'b', 'c') >>> lengths = (1, 2, 3) >>> widths = (5, 2, 1) >>> def area(length, width): ... return length * width >>> sort_together([names, lengths, widths], key_list=(1, 2), key=area) [('c', 'b', 'a'), (3, 2, 1), (1, 2, 5)]
Set reverse to True to sort in descending order.
>>> sort_together([(1, 2, 3), ('c', 'b', 'a')], reverse=True) [(3, 2, 1), ('a', 'b', 'c')]
If the strict keyword argument is True, then UnequalIterablesError will be raised if any of the iterables have different lengths.
- more_itertools.value_chain(*args)[source]
Yield all arguments passed to the function in the same order in which they were passed. If an argument itself is iterable then iterate over its values.
>>> list(value_chain(1, 2, 3, [4, 5, 6])) [1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable and are emitted as-is:
>>> list(value_chain('12', '34', ['56', '78'])) ['12', '34', '56', '78']
Pre- or postpend a single element to an iterable:
>>> list(value_chain(1, [2, 3, 4, 5, 6])) [1, 2, 3, 4, 5, 6] >>> list(value_chain([1, 2, 3, 4, 5], 6)) [1, 2, 3, 4, 5, 6]
Multiple levels of nesting are not flattened.
- more_itertools.zip_offset(*iterables, offsets, longest=False, fillvalue=None)[source]
zip the input iterables together, but offset the i-th iterable by the i-th item in offsets.
>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1))) [('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e')]
This can be used as a lightweight alternative to SciPy or pandas to analyze data sets in which some series have a lead or lag relationship.
By default, the sequence will end when the shortest iterable is exhausted. To continue until the longest iterable is exhausted, set longest to True.
>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1), longest=True)) [('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e'), (None, 'f')]
By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.
- more_itertools.zip_equal(*iterables)[source]
zip the input iterables together, but raise UnequalIterablesError if they aren’t all the same length.
>>> it_1 = range(3)
>>> it_2 = iter('abc')
>>> list(zip_equal(it_1, it_2))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> it_1 = range(3)
>>> it_2 = iter('abcd')
>>> list(zip_equal(it_1, it_2))
Traceback (most recent call last):
...
more_itertools.more.UnequalIterablesError: Iterables have different lengths
- more_itertools.zip_broadcast(*objects, scalar_types=(str, bytes), strict=False)[source]
A version of zip() that “broadcasts” any scalar (i.e., non-iterable) items into output tuples.
>>> iterable_1 = [1, 2, 3]
>>> iterable_2 = ['a', 'b', 'c']
>>> scalar = '_'
>>> list(zip_broadcast(iterable_1, iterable_2, scalar))
[(1, 'a', '_'), (2, 'b', '_'), (3, 'c', '_')]
The scalar_types keyword argument determines what types are considered scalar. It is set to (str, bytes) by default. Set it to None to treat strings and byte strings as iterable:
>>> list(zip_broadcast('abc', 0, 'xyz', scalar_types=None))
[('a', 0, 'x'), ('b', 0, 'y'), ('c', 0, 'z')]
If the strict keyword argument is True, then UnequalIterablesError will be raised if any of the iterables have different lengths.
Itertools recipes
- more_itertools.flatten(listOfLists)[source]
Return an iterator flattening one level of nesting in a list of lists.
>>> list(flatten([[0, 1], [2, 3]])) [0, 1, 2, 3]
See also collapse(), which can flatten multiple levels of nesting.
- more_itertools.roundrobin(*iterables)[source]
Yields an item from each iterable, alternating between them.
>>> list(roundrobin('ABC', 'D', 'EF')) ['A', 'D', 'E', 'B', 'F', 'C']
This function produces the same output as interleave_longest(), but may perform better for some inputs (in particular when the number of iterables is small).
- more_itertools.prepend(value, iterator)[source]
Yield value, followed by the elements in iterator.
>>> value = '0' >>> iterator = ['1', '2', '3'] >>> list(prepend(value, iterator)) ['0', '1', '2', '3']
To prepend multiple values, see itertools.chain() or value_chain().
Summarizing
These tools return summarized or aggregated data from an iterable.
New itertools
- more_itertools.ilen(iterable)[source]
Return the number of items in iterable.
>>> ilen(x for x in range(1000000) if x % 3 == 0) 333334
This consumes the iterable, so handle with care.
- more_itertools.unique_to_each(*iterables)[source]
Return the elements from each of the input iterables that aren’t in the other input iterables.
For example, suppose you have a set of packages, each with a set of dependencies:
{'pkg_1': {'A', 'B'}, 'pkg_2': {'B', 'C'}, 'pkg_3': {'B', 'D'}}
If you remove one package, which dependencies can also be removed?
If pkg_1 is removed, then A is no longer necessary - it is not associated with pkg_2 or pkg_3. Similarly, C is only needed for pkg_2, and D is only needed for pkg_3:
>>> unique_to_each({'A', 'B'}, {'B', 'C'}, {'B', 'D'})
[['A'], ['C'], ['D']]
If there are duplicates in one input iterable that aren’t in the others they will be duplicated in the output. Input order is preserved:
>>> unique_to_each("mississippi", "missouri") [['p', 'p'], ['o', 'u', 'r']]
It is assumed that the elements of each iterable are hashable.
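The idea can be sketched with `collections.Counter`: count how many input iterables each element appears in, then keep the elements that appear in exactly one. `unique_to_each_sketch` is a hypothetical name; this is an illustration under the hashability assumption stated above, not the library's code.

```python
from collections import Counter


def unique_to_each_sketch(*iterables):
    """Return, per input, the elements found in no other input."""
    pools = [list(iterable) for iterable in iterables]
    # Count the number of pools containing each element (not occurrences).
    membership = Counter(element for pool in pools for element in set(pool))
    exclusive = {element for element, n in membership.items() if n == 1}
    # Walk each pool in order so duplicates and input order are kept.
    return [[element for element in pool if element in exclusive]
            for pool in pools]


print(unique_to_each_sketch("mississippi", "missouri"))
```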
- more_itertools.sample(iterable, k=1, weights=None)[source]
Return a k-length list of elements chosen (without replacement) from the iterable. Similar to random.sample(), but works on iterables of unknown length.
>>> iterable = range(100)
>>> sample(iterable, 5)
[81, 60, 96, 16, 4]
For iterables with repeated elements, you may supply counts to indicate the repeats.
>>> iterable = ['a', 'b'] >>> counts = [3, 4] # Equivalent to 'a', 'a', 'a', 'b', 'b', 'b', 'b' >>> sample(iterable, k=3, counts=counts) ['a', 'a', 'b']
An iterable with weights may be given:
>>> iterable = range(100)
>>> weights = (i * i + 1 for i in range(100))
>>> sample(iterable, 5, weights=weights)
[79, 67, 74, 66, 78]
Weighted selections are made without replacement. After an element is selected, it is removed from the pool and the relative weights of the other elements increase (this does not match the behavior of random.sample()’s counts parameter). Note that weights may not be used with counts.
If the length of iterable is less than k, ValueError is raised if strict is True and all elements are returned (in shuffled order) if strict is False.
By default, the Algorithm L reservoir sampling technique is used. When weights are provided, Algorithm A-ExpJ is used.
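To show why reservoir sampling works on iterables of unknown length, here is a sketch of the simpler Algorithm R. This is deliberately not Algorithm L (which skips ahead instead of drawing a random number per item) and not the library's code; `reservoir_sample_sketch` and its `rng` parameter are hypothetical.

```python
import random


def reservoir_sample_sketch(iterable, k, rng=random):
    """Pick up to k items uniformly at random in a single pass."""
    reservoir = []
    for index, item in enumerate(iterable):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a slot with probability k / (index + 1).
            slot = rng.randrange(index + 1)
            if slot < k:
                reservoir[slot] = item
    return reservoir


picked = reservoir_sample_sketch(range(100), 5)
print(len(picked))
```

Only k items are held in memory at any time, which is why the length of the input never needs to be known.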
- more_itertools.consecutive_groups(iterable, ordering=lambda x: ...)[source]
Yield groups of consecutive items using itertools.groupby(). The ordering function determines whether two items are adjacent by returning their position.
By default, the ordering function is the identity function. This is suitable for finding runs of numbers:
>>> iterable = [1, 10, 11, 12, 20, 30, 31, 32, 33, 40] >>> for group in consecutive_groups(iterable): ... print(list(group)) [1] [10, 11, 12] [20] [30, 31, 32, 33] [40]
For finding runs of adjacent letters, try using the index() method of a string of letters:
>>> from string import ascii_lowercase
>>> iterable = 'abcdfgilmnop'
>>> ordering = ascii_lowercase.index
>>> for group in consecutive_groups(iterable, ordering):
...     print(list(group))
['a', 'b', 'c', 'd']
['f', 'g']
['i']
['l', 'm', 'n', 'o', 'p']
Each group of consecutive items is an iterator that shares its source with iterable. When an output group is advanced, the previous group is no longer available unless its elements are copied (e.g., into a list).
>>> iterable = [1, 2, 11, 12, 21, 22]
>>> saved_groups = []
>>> for group in consecutive_groups(iterable):
...     saved_groups.append(list(group))  # Copy group elements
>>> saved_groups
[[1, 2], [11, 12], [21, 22]]
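The groupby() approach mentioned above can be sketched as follows: items belong to the same run when the difference between their index and their ordering position stays constant. `consecutive_groups_sketch` is a hypothetical name, and unlike the function documented here it yields lists rather than iterators, which sidesteps the shared-source caveat.

```python
from itertools import groupby


def consecutive_groups_sketch(iterable, ordering=lambda x: x):
    """Yield lists of items whose ordering positions are consecutive."""
    indexed = enumerate(iterable)
    # Within a run, index minus position is constant, so it works as a key.
    for _, group in groupby(indexed, key=lambda pair: pair[0] - ordering(pair[1])):
        yield [item for _, item in group]


for run in consecutive_groups_sketch([1, 10, 11, 12, 20, 30, 31, 32, 33, 40]):
    print(run)
```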
- class more_itertools.run_length[source]
run_length.encode() compresses an iterable with run-length encoding. It yields groups of repeated items with the count of how many times they were repeated:
>>> uncompressed = 'abbcccdddd'
>>> list(run_length.encode(uncompressed))
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
run_length.decode() decompresses an iterable that was previously compressed with run-length encoding. It yields the items of the decompressed iterable:
>>> compressed = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> list(run_length.decode(compressed))
['a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
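Run-length encoding maps naturally onto itertools. The sketch below uses hypothetical function names (`rle_encode_sketch`, `rle_decode_sketch`) and is meant only to illustrate the round trip, not to reproduce the class above.

```python
from itertools import chain, groupby, repeat


def rle_encode_sketch(iterable):
    """Yield (item, run_length) pairs for each run of equal items."""
    return ((key, sum(1 for _ in run)) for key, run in groupby(iterable))


def rle_decode_sketch(pairs):
    """Expand (item, run_length) pairs back into the original items."""
    return chain.from_iterable(repeat(item, count) for item, count in pairs)


encoded = list(rle_encode_sketch('abbcccdddd'))
print(encoded)
print(''.join(rle_decode_sketch(encoded)))
```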
- more_itertools.map_reduce(iterable, keyfunc, valuefunc=None, reducefunc=None)[source]
Return a dictionary that maps the items in iterable to categories defined by keyfunc, transforms them with valuefunc, and then summarizes them by category with reducefunc.
valuefunc defaults to the identity function if it is unspecified. If reducefunc is unspecified, no summarization takes place:
>>> keyfunc = lambda x: x.upper() >>> result = map_reduce('abbccc', keyfunc) >>> sorted(result.items()) [('A', ['a']), ('B', ['b', 'b']), ('C', ['c', 'c', 'c'])]
Specifying valuefunc transforms the categorized items:
>>> keyfunc = lambda x: x.upper() >>> valuefunc = lambda x: 1 >>> result = map_reduce('abbccc', keyfunc, valuefunc) >>> sorted(result.items()) [('A', [1]), ('B', [1, 1]), ('C', [1, 1, 1])]
Specifying reducefunc summarizes the categorized items:
>>> keyfunc = lambda x: x.upper() >>> valuefunc = lambda x: 1 >>> reducefunc = sum >>> result = map_reduce('abbccc', keyfunc, valuefunc, reducefunc) >>> sorted(result.items()) [('A', 1), ('B', 2), ('C', 3)]
You may want to filter the input iterable before applying the map/reduce procedure:
>>> all_items = range(30) >>> items = [x for x in all_items if 10 <= x <= 20] # Filter >>> keyfunc = lambda x: x % 2 # Evens map to 0; odds to 1 >>> categories = map_reduce(items, keyfunc=keyfunc) >>> sorted(categories.items()) [(0, [10, 12, 14, 16, 18, 20]), (1, [11, 13, 15, 17, 19])] >>> summaries = map_reduce(items, keyfunc=keyfunc, reducefunc=sum) >>> sorted(summaries.items()) [(0, 90), (1, 75)]
Note that all items in the iterable are gathered into a list before the summarization step, which may require significant storage.
The returned object is a collections.defaultdict with the default_factory set to None, such that it behaves like a normal dictionary.
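The three stages (categorize, transform, summarize) and the default_factory detail can be seen in a small sketch. `map_reduce_sketch` is a hypothetical name and this is an approximation for illustration, not the library's implementation.

```python
from collections import defaultdict


def map_reduce_sketch(iterable, keyfunc, valuefunc=None, reducefunc=None):
    """Group items by keyfunc, transform them, then optionally reduce."""
    if valuefunc is None:
        valuefunc = lambda x: x  # identity by default
    groups = defaultdict(list)
    for item in iterable:
        # Every item is stored before any reduction happens.
        groups[keyfunc(item)].append(valuefunc(item))
    if reducefunc is not None:
        for key, values in groups.items():
            groups[key] = reducefunc(values)
    # With no default_factory, missing keys raise KeyError like a dict.
    groups.default_factory = None
    return groups


result = map_reduce_sketch('abbccc', str.upper, lambda x: 1, sum)
print(sorted(result.items()))
```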
- more_itertools.join_mappings(**field_to_map)[source]
Joins multiple mappings together using their common keys.
>>> user_scores = {'elliot': 50, 'claris': 60} >>> user_times = {'elliot': 30, 'claris': 40} >>> join_mappings(score=user_scores, time=user_times) {'elliot': {'score': 50, 'time': 30}, 'claris': {'score': 60, 'time': 40}}
- more_itertools.exactly_n(iterable, n, predicate=bool)[source]
Return True if exactly n items in the iterable are True according to the predicate function.
>>> exactly_n([True, True, False], 2)
True
>>> exactly_n([True, True, False], 1)
False
>>> exactly_n([0, 1, 2, 3, 4, 5], 3, lambda x: x < 3)
True
The iterable will be advanced until n + 1 truthy items are encountered, so avoid calling it on infinite iterables.
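The "advance until n + 1 truthy items" behaviour follows from an approach like the one sketched here: take at most n + 1 matching items and check how many were found. `exactly_n_sketch` is a hypothetical name used for illustration.

```python
from itertools import islice


def exactly_n_sketch(iterable, n, predicate=bool):
    """Return True if exactly n items satisfy predicate."""
    matches = filter(predicate, iterable)
    # Looking for one match beyond n is enough to rule out "exactly n".
    return len(list(islice(matches, n + 1))) == n


print(exactly_n_sketch([0, 1, 2, 3, 4, 5], 3, lambda x: x < 3))
```

On an infinite iterable this only terminates if at least n + 1 matching items exist, which is the reason for the warning above.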
- more_itertools.is_sorted(iterable, key=None, reverse=False, strict=False)[source]
Returns True if the items of iterable are in sorted order, and False otherwise. key and reverse have the same meaning that they do in the built-in sorted() function.
>>> is_sorted(['1', '2', '3', '4', '5'], key=int)
True
>>> is_sorted([5, 4, 3, 1, 2], reverse=True)
False
If strict, tests for strict sorting, that is, returns False if equal elements are found:
>>> is_sorted([1, 2, 2])
True
>>> is_sorted([1, 2, 2], strict=True)
False
The function returns False after encountering the first out-of-order item, which means it may produce results that differ from the built-in sorted() function for objects with unusual comparison dynamics. If there are no out-of-order items, the iterable is exhausted.
- more_itertools.all_unique(iterable, key=None)[source]
Returns True if all the elements of iterable are unique (no two elements are equal).
>>> all_unique('ABCB')
False
If a key function is specified, it will be used to make comparisons.
>>> all_unique('ABCb') True >>> all_unique('ABCb', str.lower) False
The function returns as soon as the first non-unique element is encountered. Iterables with a mix of hashable and unhashable items can be used, but the function will be slower for unhashable items.
- more_itertools.minmax(arg1, arg2, *args[, key])[source]
Returns both the smallest and largest items in an iterable, or the smallest and largest of two or more arguments.
>>> minmax([3, 1, 5]) (1, 5)
>>> minmax(4, 2, 6) (2, 6)
If a key function is provided, it will be used to transform the input items for comparison.
>>> minmax([5, 30], key=str) # '30' sorts before '5' (30, 5)
If a default value is provided, it will be returned if there are no input items.
>>> minmax([], default=(0, 0)) (0, 0)
Otherwise ValueError is raised.
This function is based on the recipe by Raymond Hettinger and takes care to minimize the number of comparisons performed.
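One common way to reduce comparisons, shown here as an assumption about the general technique rather than a description of this function's code, is to process items in pairs: order the pair first, then compare the smaller against the running minimum and the larger against the running maximum (about three comparisons per two items instead of four). `minmax_sketch` is a hypothetical name.

```python
from itertools import zip_longest


def minmax_sketch(iterable):
    """Return (smallest, largest) using pairwise comparisons."""
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError('minmax_sketch() arg is an empty iterable') from None
    # Pair up the remaining items; pad an odd leftover with the first item.
    for x, y in zip_longest(it, it, fillvalue=lo):
        if y < x:
            x, y = y, x  # one comparison orders the pair
        if x < lo:
            lo = x
        if hi < y:
            hi = y
    return lo, hi


print(minmax_sketch([3, 1, 5]))
```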
- more_itertools.iequals(*iterables)[source]
Return True if all given iterables are equal to each other, which means that they contain the same elements in the same order.
The function is useful for comparing iterables of different data types or iterables that do not support equality checks.
>>> iequals("abc", ['a', 'b', 'c'], ('a', 'b', 'c'), iter("abc")) True
>>> iequals("abc", "acb") False
Not to be confused with all_equal(), which checks whether all elements of iterable are equal to each other.
Itertools recipes
- more_itertools.all_equal(iterable, key=None)[source]
Returns True if all the elements are equal to each other.
>>> all_equal('aaaa')
True
>>> all_equal('aaab')
False
A function that accepts a single argument and returns a transformed version of each input item can be specified with key:
>>> all_equal('AaaA', key=str.casefold) True >>> all_equal([1, 2, 3], key=lambda x: x < 10) True
- more_itertools.first_true(iterable, default=None, pred=None)[source]
Returns the first true value in the iterable.
If no true value is found, returns default.
If pred is not None, returns the first item for which pred(item) == True.
>>> first_true(range(10))
1
>>> first_true(range(10), pred=lambda x: x > 5)
6
>>> first_true(range(10), default='missing', pred=lambda x: x > 9)
'missing'
Selecting
These tools yield certain items from an iterable.
New itertools
- class more_itertools.islice_extended(iterable, start, stop[, step])[source]
An extension of itertools.islice() that supports negative values for stop, start, and step.
>>> iterable = iter('abcdefgh')
>>> list(islice_extended(iterable, -4, -1))
['e', 'f', 'g']
Slices with negative values require some caching of iterable, but this function takes care to minimize the amount of memory required.
For example, you can use a negative step with an infinite iterator:
>>> from itertools import count >>> list(islice_extended(count(), 110, 99, -2)) [110, 108, 106, 104, 102, 100]
You can also use slice notation directly:
>>> iterable = map(str, count()) >>> it = islice_extended(iterable)[10:20:2] >>> list(it) ['10', '12', '14', '16', '18']
- more_itertools.first(iterable[, default])[source]
Return the first item of iterable, or default if iterable is empty.
>>> first([0, 1, 2, 3]) 0 >>> first([], 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
first() is useful when you have a generator of expensive-to-retrieve values and want any arbitrary one. It is marginally shorter than next(iter(iterable), default).
- more_itertools.last(iterable[, default])[source]
Return the last item of iterable, or default if iterable is empty.
>>> last([0, 1, 2, 3]) 3 >>> last([], 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
- more_itertools.one(iterable, too_short=ValueError, too_long=ValueError)[source]
Return the first item from iterable, which is expected to contain only that item. Raise an exception if iterable is empty or has more than one item.
one() is useful for ensuring that an iterable contains only one item. For example, it can be used to retrieve the result of a database query that is expected to return a single row.
If iterable is empty, ValueError will be raised. You may specify a different exception with the too_short keyword:
>>> it = []
>>> one(it)
Traceback (most recent call last):
...
ValueError: too few items in iterable (expected 1)
>>> too_short = IndexError('too few items')
>>> one(it, too_short=too_short)
Traceback (most recent call last):
...
IndexError: too few items
Similarly, if iterable contains more than one item, ValueError will be raised. You may specify a different exception with the too_long keyword:
>>> it = ['too', 'many']
>>> one(it)
Traceback (most recent call last):
...
ValueError: Expected exactly one item in iterable, but got 'too', 'many', and perhaps more.
>>> too_long = RuntimeError
>>> one(it, too_long=too_long)
Traceback (most recent call last):
...
RuntimeError
Note that one() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.
- more_itertools.only(iterable, default=None, too_long=ValueError)[source]
If iterable has only one item, return it. If it has zero items, return default. If it has more than one item, raise the exception given by too_long, which is ValueError by default.
>>> only([], default='missing')
'missing'
>>> only([1])
1
>>> only([1, 2])
Traceback (most recent call last):
...
ValueError: Expected exactly one item in iterable, but got 1, 2, and perhaps more.
>>> only([1, 2], too_long=TypeError)
Traceback (most recent call last):
...
TypeError
Note that only() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.
- more_itertools.strictly_n(iterable, n, too_short=None, too_long=None)[source]
Validate that iterable has exactly n items and return them if it does. If it has fewer than n items, call function too_short with the number of items found. If it has more than n items, call function too_long with n + 1.
>>> iterable = ['a', 'b', 'c', 'd']
>>> n = 4
>>> list(strictly_n(iterable, n))
['a', 'b', 'c', 'd']
Note that the returned iterable must be consumed in order for the check to be made.
By default, too_short and too_long are functions that raise ValueError.
>>> list(strictly_n('ab', 3))
Traceback (most recent call last):
...
ValueError: too few items in iterable (got 2)
>>> list(strictly_n('abc', 2)) Traceback (most recent call last): ... ValueError: too many items in iterable (got at least 3)
You can instead supply functions that do something else. too_short will be called with the number of items in iterable. too_long will be called with n + 1.
>>> def too_short(item_count): ... raise RuntimeError >>> it = strictly_n('abcd', 6, too_short=too_short) >>> list(it) Traceback (most recent call last): ... RuntimeError
>>> def too_long(item_count): ... print('The boss is going to hear about this') >>> it = strictly_n('abcdef', 4, too_long=too_long) >>> list(it) The boss is going to hear about this ['a', 'b', 'c', 'd']
- more_itertools.strip(iterable, pred)[source]
Yield the items from iterable, but strip any from the beginning and end for which pred returns True.
For example, to remove a set of items from both ends of an iterable:
>>> iterable = (None, False, None, 1, 2, None, 3, False, None) >>> pred = lambda x: x in {None, False, ''} >>> list(strip(iterable, pred)) [1, 2, None, 3]
This function is analogous to str.strip().
- more_itertools.lstrip(iterable, pred)[source]
Yield the items from iterable, but strip any from the beginning for which pred returns True.
For example, to remove a set of items from the start of an iterable:
>>> iterable = (None, False, None, 1, 2, None, 3, False, None) >>> pred = lambda x: x in {None, False, ''} >>> list(lstrip(iterable, pred)) [1, 2, None, 3, False, None]
This function is analogous to str.lstrip(), and is essentially a wrapper for itertools.dropwhile().
- more_itertools.rstrip(iterable, pred)[source]
Yield the items from iterable, but strip any from the end for which pred returns True.
For example, to remove a set of items from the end of an iterable:
>>> iterable = (None, False, None, 1, 2, None, 3, False, None) >>> pred = lambda x: x in {None, False, ''} >>> list(rstrip(iterable, pred)) [None, False, None, 1, 2, None, 3]
This function is analogous to str.rstrip().
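The three stripping functions can be sketched in a few lines. Stripping from the start is a thin wrapper over itertools.dropwhile(), while stripping from the end needs a small buffer, because a matching item can only be discarded once it is known that nothing non-matching follows. The `*_sketch` names are hypothetical and the code is illustrative only.

```python
from itertools import dropwhile


def lstrip_sketch(iterable, pred):
    """Drop leading items for which pred is true."""
    return dropwhile(pred, iterable)


def rstrip_sketch(iterable, pred):
    """Drop trailing items for which pred is true."""
    held = []  # matching items that may turn out to be trailing
    for item in iterable:
        if pred(item):
            held.append(item)
        else:
            # A non-matching item proves the held items were interior.
            yield from held
            held.clear()
            yield item


def strip_sketch(iterable, pred):
    """Drop both leading and trailing items for which pred is true."""
    return rstrip_sketch(lstrip_sketch(iterable, pred), pred)


data = (None, False, None, 1, 2, None, 3, False, None)
print(list(strip_sketch(data, lambda x: x in {None, False, ''})))
```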
- more_itertools.filter_except(validator, iterable, *exceptions)[source]
Yield the items from iterable for which the validator function does not raise one of the specified exceptions.
validator is called for each item in iterable. It should be a function that accepts one argument and raises an exception if that item is not valid.
>>> iterable = ['1', '2', 'three', '4', None] >>> list(filter_except(int, iterable, ValueError, TypeError)) ['1', '2', '4']
If an exception other than one given by exceptions is raised by validator, it is raised like normal.
- more_itertools.map_except(function, iterable, *exceptions)[source]
Transform each item from iterable with function and yield the result, unless function raises one of the specified exceptions.
function is called to transform each item in iterable. It should accept one argument.
>>> iterable = ['1', '2', 'three', '4', None] >>> list(map_except(int, iterable, ValueError, TypeError)) [1, 2, 4]
If an exception other than one given by exceptions is raised by function, it is raised like normal.
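The difference between filter_except() and map_except() is whether the original item or the transformed result is yielded. A minimal sketch, using hypothetical `*_sketch` names and not taken from the library source:

```python
def filter_except_sketch(validator, iterable, *exceptions):
    """Yield items for which validator raises none of the exceptions."""
    for item in iterable:
        try:
            validator(item)
        except exceptions:
            continue  # invalid item: skip it
        yield item  # the original item, not the validator's result


def map_except_sketch(function, iterable, *exceptions):
    """Yield function(item), skipping items that raise the exceptions."""
    for item in iterable:
        try:
            yield function(item)
        except exceptions:
            pass  # any other exception type propagates normally


values = ['1', '2', 'three', '4', None]
print(list(filter_except_sketch(int, values, ValueError, TypeError)))
print(list(map_except_sketch(int, values, ValueError, TypeError)))
```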
- more_itertools.filter_map(func, iterable)[source]
Apply func to every element of iterable, yielding only those results which are not None.
>>> elems = ['1', 'a', '2', 'b', '3']
>>> list(filter_map(lambda s: int(s) if s.isnumeric() else None, elems))
[1, 2, 3]
- more_itertools.iter_suppress(iterable, *exceptions)[source]
Yield each of the items from iterable. If the iteration raises one of the specified exceptions, that exception will be suppressed and iteration will stop.
>>> from itertools import chain >>> def breaks_at_five(x): ... while True: ... if x >= 5: ... raise RuntimeError ... yield x ... x += 1 >>> it_1 = iter_suppress(breaks_at_five(1), RuntimeError) >>> it_2 = iter_suppress(breaks_at_five(2), RuntimeError) >>> list(chain(it_1, it_2)) [1, 2, 3, 4, 2, 3, 4]
- more_itertools.nth_or_last(iterable, n[, default])[source]
Return the nth or the last item of iterable, or default if iterable is empty.
>>> nth_or_last([0, 1, 2, 3], 2) 2 >>> nth_or_last([0, 1], 2) 1 >>> nth_or_last([], 0, 'some default') 'some default'
If default is not provided and there are no items in the iterable, raise ValueError.
- more_itertools.unique_in_window(iterable, n, key=None)[source]
Yield the items from iterable that haven’t been seen recently. n is the size of the lookback window.
>>> iterable = [0, 1, 0, 2, 3, 0] >>> n = 3 >>> list(unique_in_window(iterable, n)) [0, 1, 2, 3, 0]
The key function, if provided, will be used to determine uniqueness:
>>> list(unique_in_window('abAcda', 3, key=lambda x: x.lower())) ['a', 'b', 'c', 'd', 'a']
The items in iterable must be hashable.
- more_itertools.duplicates_everseen(iterable, key=None)[source]
Yield duplicate elements after their first appearance.
>>> list(duplicates_everseen('mississippi')) ['s', 'i', 's', 's', 'i', 'p', 'i'] >>> list(duplicates_everseen('AaaBbbCccAaa', str.lower)) ['a', 'a', 'b', 'b', 'c', 'c', 'A', 'a', 'a']
This function is analogous to unique_everseen() and is subject to the same performance considerations.
- more_itertools.duplicates_justseen(iterable, key=None)[source]
Yields serially-duplicate elements after their first appearance.
>>> list(duplicates_justseen('mississippi')) ['s', 's', 'p'] >>> list(duplicates_justseen('AaaBbbCccAaa', str.lower)) ['a', 'a', 'b', 'b', 'c', 'c', 'a', 'a']
This function is analogous to unique_justseen().
- more_itertools.classify_unique(iterable, key=None)[source]
Classify each element in terms of its uniqueness.
For each element in the input iterable, return a 3-tuple consisting of:
- The element itself
- False if the element is equal to the one preceding it in the input, True otherwise (i.e. the equivalent of unique_justseen())
- False if this element has been seen anywhere in the input before, True otherwise (i.e. the equivalent of unique_everseen())
>>> list(classify_unique('otto')) [('o', True, True), ('t', True, True), ('t', False, False), ('o', True, False)]
This function is analogous to unique_everseen() and is subject to the same performance considerations.
- more_itertools.longest_common_prefix(iterables)[source]
Yield elements of the longest common prefix amongst given iterables.
>>> ''.join(longest_common_prefix(['abcd', 'abc', 'abf'])) 'ab'
- more_itertools.takewhile_inclusive(predicate, iterable)[source]
A variant of takewhile() that yields one additional element.
>>> list(takewhile_inclusive(lambda x: x < 5, [1, 4, 6, 4, 1]))
[1, 4, 6]
takewhile() would return [1, 4].
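The contrast with itertools.takewhile() is easy to see side by side. `takewhile_inclusive_sketch` is a hypothetical name for a minimal illustration: it yields each item before testing it, so the first failing item is included.

```python
from itertools import takewhile


def takewhile_inclusive_sketch(predicate, iterable):
    """Yield items up to and including the first that fails predicate."""
    for item in iterable:
        yield item
        if not predicate(item):
            break


data = [1, 4, 6, 4, 1]
print(list(takewhile(lambda x: x < 5, data)))
print(list(takewhile_inclusive_sketch(lambda x: x < 5, data)))
```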
Itertools recipes
- more_itertools.nth(iterable, n, default=None)[source]
Returns the nth item or a default value.
>>> l = range(10) >>> nth(l, 3) 3 >>> nth(l, 20, "zebra") 'zebra'
- more_itertools.before_and_after(predicate, it)[source]
A variant of takewhile() that allows complete access to the remainder of the iterator.
>>> it = iter('ABCdEfGhI')
>>> all_upper, remainder = before_and_after(str.isupper, it)
>>> ''.join(all_upper)
'ABC'
>>> ''.join(remainder)  # takewhile() would lose the 'd'
'dEfGhI'
Note that the first iterator must be fully consumed before the second iterator can generate valid results.
- more_itertools.take(n, iterable)[source]
Return first n items of the iterable as a list.
>>> take(3, range(10)) [0, 1, 2]
If there are fewer than n items in the iterable, all of them are returned.
>>> take(10, range(3)) [0, 1, 2]
- more_itertools.tail(n, iterable)[source]
Return an iterator over the last n items of iterable.
>>> t = tail(3, 'ABCDEFG') >>> list(t) ['E', 'F', 'G']
- more_itertools.unique_everseen(iterable, key=None)[source]
Yield unique elements, preserving order.
>>> list(unique_everseen('AAAABBBCCDAABBB')) ['A', 'B', 'C', 'D'] >>> list(unique_everseen('ABBCcAD', str.lower)) ['A', 'B', 'C', 'D']
Sequences with a mix of hashable and unhashable items can be used. The function will be slower (i.e., O(n^2)) for unhashable items.
Remember that list objects are unhashable - you can use the key parameter to transform the list to a tuple (which is hashable) to avoid a slowdown.
>>> iterable = ([1, 2], [2, 3], [1, 2])
>>> list(unique_everseen(iterable))  # Slow
[[1, 2], [2, 3]]
>>> list(unique_everseen(iterable, key=tuple))  # Faster
[[1, 2], [2, 3]]
Similarly, you may want to convert unhashable set objects with key=frozenset. For dict objects, key=lambda x: frozenset(x.items()) can be used.
- more_itertools.unique_justseen(iterable, key=None)[source]
Yields elements in order, ignoring serial duplicates.
>>> list(unique_justseen('AAAABBBCCDAABBB')) ['A', 'B', 'C', 'D', 'A', 'B'] >>> list(unique_justseen('ABBCcAD', str.lower)) ['A', 'B', 'C', 'A', 'D']
- more_itertools.unique(iterable, key=None, reverse=False)[source]
Yields unique elements in sorted order.
>>> list(unique([[1, 2], [3, 4], [1, 2]])) [[1, 2], [3, 4]]
key and reverse are passed to sorted().
>>> list(unique('ABBcCAD', str.casefold))
['A', 'B', 'c', 'D']
>>> list(unique('ABBcCAD', str.casefold, reverse=True))
['D', 'c', 'B', 'A']
The elements in iterable need not be hashable, but they must be comparable for sorting to work.
Combinatorics
These tools yield combinatorial arrangements of items from iterables.
New itertools
- more_itertools.distinct_permutations(iterable, r=None)[source]
Yield successive distinct permutations of the elements in iterable.
>>> sorted(distinct_permutations([1, 0, 1])) [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
Equivalent to yielding from set(permutations(iterable)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.
Duplicate permutations arise when there are duplicated elements in the input iterable. The number of items returned is n! / (x_1! * x_2! * … * x_k!), where n is the total number of items input, k is the number of distinct items, and each x_i is the count of a distinct item in the input sequence.
If r is given, only the r-length permutations are yielded.
>>> sorted(distinct_permutations([1, 0, 1], r=2)) [(0, 1), (1, 0), (1, 1)] >>> sorted(distinct_permutations(range(3), r=2)) [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
iterable need not be sortable, but note that using equal (x == y) but non-identical (id(x) != id(y)) elements may produce surprising behavior. For example, 1 and True are equal but non-identical:
>>> list(distinct_permutations([1, True, '3']))
[(1, True, '3'), (1, '3', True), ('3', 1, True)]
>>> list(distinct_permutations([1, 2, '3']))
[(1, 2, '3'), (1, '3', 2), (2, 1, '3'), (2, '3', 1), ('3', 1, 2), ('3', 2, 1)]
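The count formula for full-length distinct permutations, n! divided by the factorial of each distinct item's count, can be checked numerically against the brute-force set(permutations(...)) approach. `count_distinct_permutations` is a hypothetical helper and assumes hashable items.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_distinct_permutations(items):
    """Return n! / (x_1! * x_2! * ...) for the multiset of items."""
    items = list(items)
    total = factorial(len(items))
    for multiplicity in Counter(items).values():
        # Each division is exact, so integer division is safe here.
        total //= factorial(multiplicity)
    return total


# Brute force agrees for a small input: 4! / (2! * 2!) = 6.
print(count_distinct_permutations('aabb'), len(set(permutations('aabb'))))
```

For 'mississippi' the formula gives 11! / (4! * 4! * 2! * 1!) = 34650, whereas the brute-force approach would have to generate 11! = 39916800 tuples first.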
- more_itertools.distinct_combinations(iterable, r)[source]
Yield the distinct combinations of r items taken from iterable.
>>> list(distinct_combinations([0, 0, 1], 2)) [(0, 0), (0, 1)]
Equivalent to set(combinations(iterable, r)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.
- more_itertools.nth_combination_with_replacement(iterable, r, index)[source]
Equivalent to list(combinations_with_replacement(iterable, r))[index].
The subsequences with repetition of iterable that are of length r can be ordered lexicographically. nth_combination_with_replacement() computes the subsequence at sort position index directly, without computing the previous subsequences with replacement.
>>> nth_combination_with_replacement(range(5), 3, 5)
(0, 1, 1)
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
- more_itertools.circular_shifts(iterable, steps=1)[source]
Yield the circular shifts of iterable.
>>> list(circular_shifts(range(4))) [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]
Set steps to the number of places to rotate to the left (or to the right if negative). Defaults to 1.
>>> list(circular_shifts(range(4), 2)) [(0, 1, 2, 3), (2, 3, 0, 1)]
>>> list(circular_shifts(range(4), -1)) [(0, 1, 2, 3), (3, 0, 1, 2), (2, 3, 0, 1), (1, 2, 3, 0)]
- more_itertools.partitions(iterable)[source]
Yield all possible order-preserving partitions of iterable.
>>> iterable = 'abc' >>> for part in partitions(iterable): ... print([''.join(p) for p in part]) ['abc'] ['a', 'bc'] ['ab', 'c'] ['a', 'b', 'c']
This is unrelated to partition().
- more_itertools.set_partitions(iterable, k=None, min_size=None, max_size=None)[source]
Yield the set partitions of iterable into k parts. Set partitions are not order-preserving.
>>> iterable = 'abc' >>> for part in set_partitions(iterable, 2): ... print([''.join(p) for p in part]) ['a', 'bc'] ['ab', 'c'] ['b', 'ac']
If k is not given, every set partition is generated.
>>> iterable = 'abc' >>> for part in set_partitions(iterable): ... print([''.join(p) for p in part]) ['abc'] ['a', 'bc'] ['ab', 'c'] ['b', 'ac'] ['a', 'b', 'c']
If min_size and/or max_size are given, they set the minimum and/or maximum size of each block in the partition.
>>> iterable = 'abc' >>> for part in set_partitions(iterable, min_size=2): ... print([''.join(p) for p in part]) ['abc'] >>> for part in set_partitions(iterable, max_size=2): ... print([''.join(p) for p in part]) ['a', 'bc'] ['ab', 'c'] ['b', 'ac'] ['a', 'b', 'c']
- more_itertools.product_index(element, *args)[source]
Equivalent to list(product(*args)).index(element).
The products of args can be ordered lexicographically. product_index() computes the first index of element without computing the previous products.
>>> product_index([8, 2], range(10), range(5))
42
ValueError will be raised if the given element isn’t in the product of args.
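Computing the index directly amounts to reading element as a mixed-radix number, where each position's base is the size of the corresponding input. The sketch below illustrates that idea with a hypothetical `product_index_sketch`; it is not claimed to match the library's code.

```python
from itertools import product


def product_index_sketch(element, *args):
    """Return the position of element in product(*args) without listing it."""
    element = tuple(element)
    pools = [tuple(pool) for pool in args]
    if len(element) != len(pools):
        raise ValueError('element is not a product of args')
    index = 0
    for item, pool in zip(element, pools):
        # tuple.index raises ValueError if item is missing from the pool.
        index = index * len(pool) + pool.index(item)
    return index


# 8 * 5 + 2 = 42, matching the brute-force lookup.
print(product_index_sketch([8, 2], range(10), range(5)))
print(list(product(range(10), range(5))).index((8, 2)))
```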
- more_itertools.combination_index(element, iterable)[source]
Equivalent to list(combinations(iterable, r)).index(element), where r is the length of element.
The subsequences of iterable that are of length r can be ordered lexicographically. combination_index() computes the first index of element, without computing the previous combinations.
>>> combination_index('adf', 'abcdefg')
10
ValueError will be raised if the given element isn’t one of the combinations of iterable.
- more_itertools.permutation_index(element, iterable)[source]
Equivalent to list(permutations(iterable, r)).index(element), where r is the length of element.
The subsequences of iterable that are of length r where order is important can be ordered lexicographically. permutation_index() computes the first index of element directly, without computing the previous permutations.
>>> permutation_index([1, 3, 2], range(5))
19
ValueError will be raised if the given element isn’t one of the permutations of iterable.
- more_itertools.combination_with_replacement_index(element, iterable)[source]
Equivalent to list(combinations_with_replacement(iterable, r)).index(element), where r is the length of element.
The subsequences with repetition of iterable that are of length r can be ordered lexicographically. combination_with_replacement_index() computes the first index of element, without computing the previous combinations with replacement.
>>> combination_with_replacement_index('adf', 'abcdefg')
20
ValueError will be raised if the given element isn’t one of the combinations with replacement of iterable.
- more_itertools.gray_product(*iterables)[source]
Like itertools.product(), but return tuples in an order such that only one element in the generated tuple changes from one iteration to the next.
>>> list(gray_product('AB','CD'))
[('A', 'C'), ('B', 'C'), ('B', 'D'), ('A', 'D')]
This function consumes all of the input iterables before producing output. If any of the input iterables have fewer than two items, ValueError is raised.
For information on the algorithm, see this section of Donald Knuth’s The Art of Computer Programming.
- more_itertools.outer_product(func, xs, ys, *args, **kwargs)[source]
A generalized outer product that applies a binary function to all pairs of items. Returns a 2D matrix with len(xs) rows and len(ys) columns. Also accepts *args and **kwargs that are passed to func.
Multiplication table:
>>> from operator import mul
>>> list(outer_product(mul, range(1, 4), range(1, 6)))
[(1, 2, 3, 4, 5), (2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]
Cross tabulation:
>>> xs = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'] >>> ys = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z'] >>> rows = list(zip(xs, ys)) >>> count_rows = lambda x, y: rows.count((x, y)) >>> list(outer_product(count_rows, sorted(set(xs)), sorted(set(ys)))) [(2, 3, 0), (1, 0, 4)]
Usage with *args and **kwargs:
>>> animals = ['cat', 'wolf', 'mouse']
>>> list(outer_product(min, animals, animals, key=len))
[('cat', 'cat', 'cat'), ('cat', 'wolf', 'wolf'), ('cat', 'wolf', 'mouse')]
- more_itertools.powerset_of_sets(iterable)[source]
Yields all possible subsets of the iterable.
>>> list(powerset_of_sets([1, 2, 3])) [set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}] >>> list(powerset_of_sets([1, 1, 0])) [set(), {1}, {0}, {0, 1}]
powerset_of_sets()
takes care to minimize the number of hash operations performed.
Itertools recipes
- more_itertools.powerset(iterable)[source]
Yields all possible subsets of the iterable.
>>> list(powerset([1, 2, 3])) [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
powerset() will operate on iterables that aren’t set instances, so repeated elements in the input will produce repeated elements in the output.
>>> seq = [1, 1, 0]
>>> list(powerset(seq))
[(), (1,), (1,), (0,), (1, 1), (1, 0), (1, 0), (1, 1, 0)]
For a variant that efficiently yields actual set instances, see powerset_of_sets().
- more_itertools.random_product(*args, repeat=1)[source]
Draw an item at random from each of the input iterables.
>>> random_product('abc', range(4), 'XYZ') ('c', 3, 'Z')
If repeat is provided as a keyword argument, that many items will be drawn from each iterable.
>>> random_product('abcd', range(4), repeat=2) ('a', 2, 'd', 3)
This is equivalent to taking a random selection from itertools.product(*args, repeat=repeat).
- more_itertools.random_permutation(iterable, r=None)[source]
Return a random r length permutation of the elements in iterable.
If r is not specified or is None, then r defaults to the length of iterable.
>>> random_permutation(range(5))
(3, 4, 0, 1, 2)
This is equivalent to taking a random selection from itertools.permutations(iterable, r).
- more_itertools.random_combination(iterable, r)[source]
Return a random r length subsequence of the elements in iterable.
>>> random_combination(range(5), 3) (2, 3, 4)
This is equivalent to taking a random selection from itertools.combinations(iterable, r).
- more_itertools.random_combination_with_replacement(iterable, r)[source]
Return a random r length subsequence of elements in iterable, allowing individual elements to be repeated.
>>> random_combination_with_replacement(range(3), 5) (0, 0, 1, 2, 2)
This is equivalent to taking a random selection from itertools.combinations_with_replacement(iterable, r).
- more_itertools.nth_product(index, *args)[source]
Equivalent to list(product(*args))[index].
The products of args can be ordered lexicographically. nth_product() computes the product at sort position index without computing the previous products.
>>> nth_product(8, range(2), range(2), range(2), range(2))
(1, 0, 0, 0)
IndexError
will be raised if the given index is invalid.
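The idea of jumping straight to a sort position can be sketched as a mixed-radix decode: the last pool varies fastest, so repeated `divmod` by the pool lengths, from right to left, picks one item per pool. `nth_product_sketch` is a hypothetical stand-in written with the standard library only. It handles non-negative indexes and makes no claim about how the library treats negative ones.

```python
from itertools import product
from math import prod  # available in Python 3.8+


def nth_product_sketch(index, *args):
    """Hypothetical helper: the tuple at position `index` of product(*args)."""
    pools = [tuple(pool) for pool in args]
    total = prod(len(pool) for pool in pools)
    if not 0 <= index < total:
        raise IndexError('index out of range')
    picks = []
    # Peel off one "digit" per pool, starting with the fastest-varying one.
    for pool in reversed(pools):
        index, i = divmod(index, len(pool))
        picks.append(pool[i])
    return tuple(reversed(picks))


print(nth_product_sketch(8, range(2), range(2), range(2), range(2)))
# (1, 0, 0, 0)
```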
- more_itertools.nth_permutation(iterable, r, index)[source]
Equivalent to list(permutations(iterable, r))[index].
The subsequences of iterable that are of length r where order is important can be ordered lexicographically. nth_permutation() computes the subsequence at sort position index directly, without computing the previous subsequences.
>>> nth_permutation('ghijk', 2, 5)
('h', 'i')
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
- more_itertools.nth_combination(iterable, r, index)[source]
Equivalent to list(combinations(iterable, r))[index].
The subsequences of iterable that are of length r can be ordered lexicographically. nth_combination() computes the subsequence at sort position index directly, without computing the previous subsequences.
>>> nth_combination(range(5), 3, 5)
(0, 3, 4)
ValueError will be raised if r is negative or greater than the length of iterable. IndexError will be raised if the given index is invalid.
Wrapping
These tools provide wrappers to smooth working with objects that produce or consume iterables.
New itertools
- more_itertools.always_iterable(obj, base_type=(<class 'str'>, <class 'bytes'>))[source]
If obj is iterable, return an iterator over its items:
>>> obj = (1, 2, 3) >>> list(always_iterable(obj)) [1, 2, 3]
If obj is not iterable, return a one-item iterable containing obj:
>>> obj = 1 >>> list(always_iterable(obj)) [1]
If obj is None, return an empty iterable:
>>> obj = None
>>> list(always_iterable(None))
[]
By default, binary and text strings are not considered iterable:
>>> obj = 'foo' >>> list(always_iterable(obj)) ['foo']
If base_type is set, objects for which isinstance(obj, base_type) returns True won’t be considered iterable.
>>> obj = {'a': 1}
>>> list(always_iterable(obj))  # Iterate over the dict's keys
['a']
>>> list(always_iterable(obj, base_type=dict))  # Treat dicts as a unit
[{'a': 1}]
Set base_type to None to avoid any special handling and treat objects Python considers iterable as iterable:
>>> obj = 'foo'
>>> list(always_iterable(obj, base_type=None))
['f', 'o', 'o']
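The rules above can be condensed into a few branches. The following is a minimal sketch of the documented behaviour, not the library's source; `always_iterable_sketch` is a hypothetical name.

```python
def always_iterable_sketch(obj, base_type=(str, bytes)):
    """Hypothetical helper mirroring the rules described above."""
    if obj is None:
        return iter(())  # None becomes an empty iterable
    if base_type is not None and isinstance(obj, base_type):
        return iter((obj,))  # treat as a single unit
    try:
        return iter(obj)  # already iterable
    except TypeError:
        return iter((obj,))  # wrap non-iterables


print(list(always_iterable_sketch('foo')))                  # ['foo']
print(list(always_iterable_sketch('foo', base_type=None)))  # ['f', 'o', 'o']
```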
- more_itertools.always_reversible(iterable)[source]
An extension of reversed() that supports all iterables, not just those which implement the Reversible or Sequence protocols.
>>> print(*always_reversible(x for x in range(3)))
2 1 0
If the iterable is already reversible, this function returns the result of
reversed()
. If the iterable is not reversible, this function will cache the remaining items in the iterable and yield them in reverse order, which may require significant storage.
- more_itertools.countable(iterable)[source]
Wrap iterable and keep a count of how many items have been consumed.
The items_seen attribute starts at 0 and increments as the iterable is consumed:
>>> iterable = map(str, range(10))
>>> it = countable(iterable)
>>> it.items_seen
0
>>> next(it), next(it)
('0', '1')
>>> list(it)
['2', '3', '4', '5', '6', '7', '8', '9']
>>> it.items_seen
10
- more_itertools.consumer(func)[source]
Decorator that automatically advances a PEP-342-style “reverse iterator” to its first yield point so you don’t have to call next() on it manually.
>>> @consumer
... def tally():
...     i = 0
...     while True:
...         print('Thing number %s is %s.' % (i, (yield)))
...         i += 1
...
>>> t = tally()
>>> t.send('red')
Thing number 0 is red.
>>> t.send('fish')
Thing number 1 is fish.
Without the decorator, you would have to call next(t) before t.send() could be used.
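A decorator with this effect can be sketched in a few lines: create the generator, advance it once, and hand it back. `consumer_sketch` and `collector` are hypothetical names used for illustration only.

```python
from functools import wraps


def consumer_sketch(func):
    """Hypothetical decorator: prime a generator so send() works at once."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # run to the first yield point
        return gen
    return wrapper


@consumer_sketch
def collector(received):
    while True:
        received.append((yield))


seen = []
c = collector(seen)
c.send('red')   # no priming next(c) call needed
c.send('fish')
print(seen)  # ['red', 'fish']
```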
- more_itertools.with_iter(context_manager)[source]
Wrap an iterable in a with statement, so it closes once exhausted.
For example, this will close the file when the iterator is exhausted:
upper_lines = (line.upper() for line in with_iter(open('foo')))
Any context manager which returns an iterable is a candidate for
with_iter
.
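The behaviour can be sketched as a generator that holds the `with` block open while items are being yielded. This is an illustration under the assumption that the context manager's `__enter__` returns the iterable. `with_iter_sketch` is a hypothetical name, and `StringIO` stands in for a real file.

```python
from io import StringIO


def with_iter_sketch(context_manager):
    """Hypothetical helper: yield items, then leave the with block."""
    with context_manager as iterable:
        yield from iterable


source = StringIO('one\ntwo\n')
upper_lines = [line.upper() for line in with_iter_sketch(source)]
print(upper_lines, source.closed)  # ['ONE\n', 'TWO\n'] True
```

In this sketch, if the generator is abandoned before exhaustion, the resource is released only when the generator is closed or garbage-collected.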
- class more_itertools.callback_iter(func, callback_kwd='callback', wait_seconds=0.1)[source]
Convert a function that uses callbacks to an iterator.
Let func be a function that takes a callback keyword argument. For example:
>>> def func(callback=None): ... for i, c in [(1, 'a'), (2, 'b'), (3, 'c')]: ... if callback: ... callback(i, c) ... return 4
Use with callback_iter(func) to get an iterator over the parameters that are delivered to the callback.
>>> with callback_iter(func) as it:
...     for args, kwargs in it:
...         print(args)
(1, 'a')
(2, 'b')
(3, 'c')
The function will be called in a background thread. The done property indicates whether it has completed execution.
>>> it.done
True
If it completes successfully, its return value will be available in the result property.
>>> it.result
4
Notes:
If the function uses some keyword argument besides callback, supply callback_kwd.
If it finished executing, but raised an exception, accessing the result property will raise the same exception.
If it hasn’t finished executing, accessing the result property from within the with block will raise RuntimeError.
If it hasn’t finished executing, accessing the result property from outside the with block will raise a more_itertools.AbortThread exception.
Provide wait_seconds to adjust how frequently it is polled for output.
Itertools recipes
- more_itertools.iter_except(func, exception, first=None)[source]
Yields results from a function repeatedly until an exception is raised.
Converts a call-until-exception interface to an iterator interface. Like iter(func, sentinel), but uses an exception instead of a sentinel to end the loop.
>>> l = [0, 1, 2]
>>> list(iter_except(l.pop, IndexError))
[2, 1, 0]
Multiple exceptions can be specified as a stopping condition:
>>> l = [1, 2, 3, '...', 4, 5, 6] >>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) [7, 6, 5] >>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) [4, 3, 2] >>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError))) []
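The call-until-exception loop can be sketched as a generator wrapped in a `try` block. The handling of first below follows the recipe of the same name in the itertools documentation, where it is an optional callable invoked once before the loop. Treat that as an assumption, since this section does not describe the parameter. `iter_except_sketch` is a hypothetical name.

```python
def iter_except_sketch(func, exception, first=None):
    """Hypothetical helper: yield func() results until `exception` is raised."""
    try:
        if first is not None:
            yield first()  # assumed: one-off call before the main loop
        while True:
            yield func()
    except exception:
        pass  # the exception ends iteration instead of propagating


stack = [0, 1, 2]
print(list(iter_except_sketch(stack.pop, IndexError)))  # [2, 1, 0]
```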
Math
New itertools
- more_itertools.dft(xarr)[source]
Discrete Fourier Transform. xarr is a sequence of complex numbers. Yields the components of the corresponding transformed output vector.
>>> import cmath >>> xarr = [1, 2-1j, -1j, -1+2j] >>> Xarr = [2, -2-2j, -2j, 4+4j] >>> all(map(cmath.isclose, dft(xarr), Xarr)) True
See
idft()
for the inverse Discrete Fourier Transform.
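For reference, the transform can be written directly from its definition, X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N). The sketch below is a naive O(N^2) version using only `cmath`. `dft_sketch` is a hypothetical name, and the sign convention is the one that reproduces the example values above.

```python
import cmath


def dft_sketch(xarr):
    """Hypothetical helper: naive discrete Fourier transform of a sequence."""
    n = len(xarr)
    for k in range(n):
        yield sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(xarr)
        )


xarr = [1, 2-1j, -1j, -1+2j]
Xarr = [2, -2-2j, -2j, 4+4j]
print(all(map(cmath.isclose, dft_sketch(xarr), Xarr)))  # True
```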
- more_itertools.idft(Xarr)[source]
Inverse Discrete Fourier Transform. Xarr is a sequence of complex numbers. Yields the components of the corresponding inverse-transformed output vector.
>>> import cmath >>> xarr = [1, 2-1j, -1j, -1+2j] >>> Xarr = [2, -2-2j, -2j, 4+4j] >>> all(map(cmath.isclose, idft(Xarr), xarr)) True
See
dft()
for the Discrete Fourier Transform.
Itertools recipes
- more_itertools.convolve(signal, kernel)[source]
Convolve the iterable signal with the iterable kernel.
>>> signal = (1, 2, 3, 4, 5) >>> kernel = [3, 2, 1] >>> list(convolve(signal, kernel)) [3, 8, 14, 20, 26, 14, 5]
Note: the input arguments are not interchangeable, as the kernel is immediately consumed and stored.
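The arithmetic behind the example can be checked with a direct double loop: every signal item is multiplied by every kernel item and added at the combined offset. Unlike the function documented here, this sketch stores both inputs, so it suits only finite inputs. `convolve_sketch` is a hypothetical name.

```python
def convolve_sketch(signal, kernel):
    """Hypothetical helper: full discrete convolution of two finite inputs."""
    signal = list(signal)
    kernel = list(kernel)
    out = [0] * max(len(signal) + len(kernel) - 1, 0)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k  # product lands at the summed offset
    return out


print(convolve_sketch((1, 2, 3, 4, 5), [3, 2, 1]))
# [3, 8, 14, 20, 26, 14, 5]
```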
- more_itertools.dotproduct(vec1, vec2)[source]
Returns the dot product of the two iterables.
>>> dotproduct([10, 10], [20, 20]) 400
- more_itertools.factor(n)[source]
Yield the prime factors of n.
>>> list(factor(360)) [2, 2, 2, 3, 3, 5]
- more_itertools.matmul(m1, m2)[source]
Multiply two matrices.
>>> list(matmul([(7, 5), (3, 5)], [(2, 5), (7, 9)])) [(49, 80), (41, 60)]
The caller should ensure that the dimensions of the input matrices are compatible with each other.
- more_itertools.polynomial_from_roots(roots)[source]
Compute a polynomial’s coefficients from its roots.
>>> roots = [5, -4, 3] # (x - 5) * (x + 4) * (x - 3) >>> polynomial_from_roots(roots) # x^3 - 4 * x^2 - 17 * x + 60 [1, -4, -17, 60]
- more_itertools.polynomial_derivative(coefficients)[source]
Compute the first derivative of a polynomial.
Example: evaluating the derivative of x^3 - 4 * x^2 - 17 * x + 60
>>> coefficients = [1, -4, -17, 60] >>> derivative_coefficients = polynomial_derivative(coefficients) >>> derivative_coefficients [3, -8, -17]
- more_itertools.polynomial_eval(coefficients, x)[source]
Evaluate a polynomial at a specific value.
Example: evaluating x^3 - 4 * x^2 - 17 * x + 60 at x = 2.5:
>>> coefficients = [1, -4, -17, 60] >>> x = 2.5 >>> polynomial_eval(coefficients, x) 8.125
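A common way to evaluate a polynomial whose coefficients are listed from the highest power down, as in the example above, is Horner's method. The sketch below shows that technique and makes no claim about how the library evaluates polynomials. `horner` is a hypothetical name.

```python
def horner(coefficients, x):
    """Hypothetical helper: evaluate a polynomial by Horner's method.
    Coefficients run from the highest power to the constant term."""
    result = 0
    for c in coefficients:
        result = result * x + c  # fold in one coefficient per step
    return result


# ((1 * 2.5 - 4) * 2.5 - 17) * 2.5 + 60
print(horner([1, -4, -17, 60], 2.5))  # 8.125
```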
- more_itertools.sieve(n)[source]
Yield the primes less than n.
>>> list(sieve(30)) [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Others
New itertools
- more_itertools.locate(iterable, pred=bool, window_size=None)[source]
Yield the index of each item in iterable for which pred returns True.
pred defaults to bool(), which will select truthy items:
>>> list(locate([0, 1, 1, 0, 1, 0, 0]))
[1, 2, 4]
Set pred to a custom function to, e.g., find the indexes for a particular item.
>>> list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b')) [1, 3]
If window_size is given, then the pred function will be called with that many items. This enables searching for sub-sequences:
>>> iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] >>> pred = lambda *args: args == (1, 2, 3) >>> list(locate(iterable, pred=pred, window_size=3)) [1, 5, 9]
Use with seekable() to find indexes and then retrieve the associated items:
>>> from itertools import count
>>> from more_itertools import seekable
>>> source = (3 * n + 1 if (n % 2) else n // 2 for n in count())
>>> it = seekable(source)
>>> pred = lambda x: x > 100
>>> indexes = locate(it, pred=pred)
>>> i = next(indexes)
>>> it.seek(i)
>>> next(it)
106
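The two modes described above (single items versus a sliding window) can be sketched with `enumerate` and a bounded `deque`. This is an illustration only. `locate_sketch` is a hypothetical name, and it tests full windows only, which may differ from how the library treats the tail of the iterable.

```python
from collections import deque


def locate_sketch(iterable, pred=bool, window_size=None):
    """Hypothetical helper: indexes where pred holds, optionally windowed."""
    if window_size is None:
        for i, item in enumerate(iterable):
            if pred(item):
                yield i
        return
    window = deque(maxlen=window_size)
    for i, item in enumerate(iterable):
        window.append(item)
        # Report the index at which the current full window starts.
        if len(window) == window_size and pred(*window):
            yield i - window_size + 1


iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
pred = lambda *args: args == (1, 2, 3)
print(list(locate_sketch(iterable, pred, window_size=3)))  # [1, 5, 9]
```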
- more_itertools.rlocate(iterable, pred=bool, window_size=None)[source]
Yield the index of each item in iterable for which pred returns True, starting from the right and moving left.
pred defaults to bool(), which will select truthy items:
>>> list(rlocate([0, 1, 1, 0, 1, 0, 0]))  # Truthy at 1, 2, and 4
[4, 2, 1]
Set pred to a custom function to, e.g., find the indexes for a particular item:
>>> iterable = iter('abcb') >>> pred = lambda x: x == 'b' >>> list(rlocate(iterable, pred)) [3, 1]
If window_size is given, then the pred function will be called with that many items. This enables searching for sub-sequences:
>>> iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] >>> pred = lambda *args: args == (1, 2, 3) >>> list(rlocate(iterable, pred=pred, window_size=3)) [9, 5, 1]
Beware, this function won’t return anything for infinite iterables. If iterable is reversible, rlocate will reverse it and search from the right. Otherwise, it will search from the left and return the results in reverse order.
See locate() for other example applications.
- more_itertools.replace(iterable, pred, substitutes, count=None, window_size=1)[source]
Yield the items from iterable, replacing the items for which pred returns True with the items from the iterable substitutes.
>>> iterable = [1, 1, 0, 1, 1, 0, 1, 1]
>>> pred = lambda x: x == 0
>>> substitutes = (2, 3)
>>> list(replace(iterable, pred, substitutes))
[1, 1, 2, 3, 1, 1, 2, 3, 1, 1]
If count is given, the number of replacements will be limited:
>>> iterable = [1, 1, 0, 1, 1, 0, 1, 1, 0] >>> pred = lambda x: x == 0 >>> substitutes = [None] >>> list(replace(iterable, pred, substitutes, count=2)) [1, 1, None, 1, 1, None, 1, 1, 0]
Use window_size to control the number of items passed as arguments to pred. This allows for locating and replacing subsequences.
>>> iterable = [0, 1, 2, 5, 0, 1, 2, 5] >>> window_size = 3 >>> pred = lambda *args: args == (0, 1, 2) # 3 items passed to pred >>> substitutes = [3, 4] # Splice in these items >>> list(replace(iterable, pred, substitutes, window_size=window_size)) [3, 4, 5, 3, 4, 5]
- more_itertools.numeric_range(start, stop[, step])[source]
An extension of the built-in range() function whose arguments can be any orderable numeric type.
With only stop specified, start defaults to 0 and step defaults to 1. The output items will match the type of stop:
>>> list(numeric_range(3.5))
[0.0, 1.0, 2.0, 3.0]
With only start and stop specified, step defaults to 1. The output items will match the type of start:
>>> from decimal import Decimal
>>> start = Decimal('2.1')
>>> stop = Decimal('5.1')
>>> list(numeric_range(start, stop))
[Decimal('2.1'), Decimal('3.1'), Decimal('4.1')]
With start, stop, and step specified, the output items will match the type of start + step:
>>> from fractions import Fraction
>>> start = Fraction(1, 2)  # Start at 1/2
>>> stop = Fraction(5, 2)  # End at 5/2
>>> step = Fraction(1, 2)  # Count by 1/2
>>> list(numeric_range(start, stop, step))
[Fraction(1, 2), Fraction(1, 1), Fraction(3, 2), Fraction(2, 1)]
If step is zero, ValueError is raised. Negative steps are supported:
>>> list(numeric_range(3, -1, -1.0))
[3.0, 2.0, 1.0, 0.0]
Be aware of the limitations of floating point numbers; the representation of the yielded numbers may be surprising.
datetime.datetime objects can be used for start and stop, if step is a datetime.timedelta object:
>>> import datetime
>>> start = datetime.datetime(2019, 1, 1)
>>> stop = datetime.datetime(2019, 1, 3)
>>> step = datetime.timedelta(days=1)
>>> items = iter(numeric_range(start, stop, step))
>>> next(items)
datetime.datetime(2019, 1, 1, 0, 0)
>>> next(items)
datetime.datetime(2019, 1, 2, 0, 0)
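The three-argument form can be approximated with a short generator that computes each value as start + i * step. This is a sketch under stated assumptions, not the library's implementation. `numeric_range_sketch` is a hypothetical name, it supports only the three-argument call, and it assumes `type(step)()` produces that type's zero.

```python
from fractions import Fraction
from itertools import count


def numeric_range_sketch(start, stop, step):
    """Hypothetical helper: range-like generator for orderable numeric types."""
    zero = type(step)()  # e.g. 0, 0.0, Fraction(0), timedelta(0)
    if step == zero:
        raise ValueError('step must not be zero')
    growing = step > zero
    for i in count():
        value = start + i * step  # result has the type of start + step
        if (value >= stop) if growing else (value <= stop):
            return
        yield value


print(list(numeric_range_sketch(Fraction(1, 2), Fraction(5, 2), Fraction(1, 2))))
# [Fraction(1, 2), Fraction(1, 1), Fraction(3, 2), Fraction(2, 1)]
```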
- more_itertools.side_effect(func, iterable, chunk_size=None, before=None, after=None)[source]
Invoke func on each item in iterable (or on each chunk_size group of items) before yielding the item.
func must be a function that takes a single argument. Its return value will be discarded.
before and after are optional functions that take no arguments. They will be executed before iteration starts and after it ends, respectively.
side_effect can be used for logging, updating progress bars, or anything that is not functionally “pure.”
Emitting a status message:
>>> from more_itertools import consume >>> func = lambda item: print('Received {}'.format(item)) >>> consume(side_effect(func, range(2))) Received 0 Received 1
Operating on chunks of items:
>>> pair_sums = [] >>> func = lambda chunk: pair_sums.append(sum(chunk)) >>> list(side_effect(func, [0, 1, 2, 3, 4, 5], 2)) [0, 1, 2, 3, 4, 5] >>> list(pair_sums) [1, 5, 9]
Writing to a file-like object:
>>> from io import StringIO >>> from more_itertools import consume >>> f = StringIO() >>> func = lambda x: print(x, file=f) >>> before = lambda: print(u'HEADER', file=f) >>> after = f.close >>> it = [u'a', u'b', u'c'] >>> consume(side_effect(func, it, before=before, after=after)) >>> f.closed True
- more_itertools.iterate(func, start)[source]
Return start, func(start), func(func(start)), …
>>> from itertools import islice
>>> list(islice(iterate(lambda x: 2*x, 1), 10))
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
- more_itertools.difference(iterable, func=operator.sub, *, initial=None)[source]
This function is the inverse of itertools.accumulate(). By default it will compute the first difference of iterable using operator.sub():
>>> from itertools import accumulate
>>> iterable = accumulate([0, 1, 2, 3, 4])  # produces 0, 1, 3, 6, 10
>>> list(difference(iterable))
[0, 1, 2, 3, 4]
func defaults to operator.sub(), but other functions can be specified. They will be applied as follows:
A, B, C, D, ... --> A, func(B, A), func(C, B), func(D, C), ...
For example, to do progressive division:
>>> iterable = [1, 2, 6, 24, 120] >>> func = lambda x, y: x // y >>> list(difference(iterable, func)) [1, 2, 3, 4, 5]
If the initial keyword is set, the first element will be skipped when computing successive differences.
>>> it = [10, 11, 13, 16] # from accumulate([1, 2, 3], initial=10) >>> list(difference(it, initial=10)) [1, 2, 3]
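The A, func(B, A), func(C, B), ... pattern can be sketched by pairing the iterable with a copy of itself shifted by one item. The version below is a minimal illustration with `itertools.tee`. `difference_sketch` is a hypothetical name and should not be read as the library's source.

```python
from itertools import accumulate, chain, tee
from operator import sub


def difference_sketch(iterable, func=sub, *, initial=None):
    """Hypothetical helper: successive differences, undoing accumulate()."""
    a, b = tee(iterable)
    try:
        first = [next(b)]  # b now runs one item ahead of a
    except StopIteration:
        return iter([])
    if initial is not None:
        first = []  # the leading item is the initial value, so drop it
    return chain(first, map(func, b, a))


print(list(difference_sketch(accumulate([0, 1, 2, 3, 4]))))    # [0, 1, 2, 3, 4]
print(list(difference_sketch([10, 11, 13, 16], initial=10)))   # [1, 2, 3]
```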
- more_itertools.make_decorator(wrapping_func, result_index=0)[source]
Return a decorator version of wrapping_func, which is a function that modifies an iterable. result_index is the position in that function’s signature where the iterable goes.
This lets you use itertools on the “production end,” i.e. at function definition. This can augment what the function returns without changing the function’s code.
For example, to produce a decorator version of
chunked()
:>>> from more_itertools import chunked >>> chunker = make_decorator(chunked, result_index=0) >>> @chunker(3) ... def iter_range(n): ... return iter(range(n)) ... >>> list(iter_range(9)) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
To only allow truthy items to be returned:
>>> truth_serum = make_decorator(filter, result_index=1) >>> @truth_serum(bool) ... def boolean_test(): ... return [0, 1, '', ' ', False, True] ... >>> list(boolean_test()) [1, ' ', True]
The peekable() and seekable() wrappers make for practical decorators:
>>> from more_itertools import peekable
>>> peekable_function = make_decorator(peekable)
>>> @peekable_function()
... def str_range(*args):
...     return (str(x) for x in range(*args))
...
>>> it = str_range(1, 20, 2)
>>> next(it), next(it), next(it)
('1', '3', '5')
>>> it.peek()
'7'
>>> next(it)
'7'
- class more_itertools.SequenceView(target)[source]
Return a read-only view of the sequence object target.
SequenceView objects are analogous to Python’s built-in “dictionary view” types. They provide a dynamic view of a sequence’s items, meaning that when the sequence updates, so does the view.
>>> seq = ['0', '1', '2']
>>> view = SequenceView(seq)
>>> view
SequenceView(['0', '1', '2'])
>>> seq.append('3')
>>> view
SequenceView(['0', '1', '2', '3'])
Sequence views support indexing, slicing, and length queries. They act like the underlying sequence, except they don’t allow assignment:
>>> view[1] '1' >>> view[1:-1] ['1', '2'] >>> len(view) 4
Sequence views are useful as an alternative to copying, as they don’t require (much) extra storage.
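A read-only view of this kind can be sketched by subclassing `collections.abc.Sequence` and delegating indexing and length to the target. Because the abstract base class defines no `__setitem__`, assignment fails. `SequenceViewSketch` is a hypothetical name and the class is an illustration only.

```python
from collections.abc import Sequence


class SequenceViewSketch(Sequence):
    """Hypothetical read-only view: keeps a reference, never a copy."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, index):
        return self._target[index]

    def __len__(self):
        return len(self._target)


seq = ['0', '1', '2']
view = SequenceViewSketch(seq)
seq.append('3')  # the view reflects later changes to seq
print(len(view), view[1:-1])  # 4 ['1', '2']
```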
- more_itertools.time_limited(limit_seconds, iterable)[source]
Yield items from iterable until limit_seconds have passed. If the time limit expires before all items have been yielded, the timed_out attribute will be set to True.
>>> from time import sleep
>>> def generator():
...     yield 1
...     yield 2
...     sleep(0.2)
...     yield 3
>>> iterable = time_limited(0.1, generator())
>>> list(iterable)
[1, 2]
>>> iterable.timed_out
True
Note that the time is checked before each item is yielded, and iteration stops if the time elapsed is greater than limit_seconds. If your time limit is 1 second, but it takes 2 seconds to generate the first item from the iterable, the function will run for 2 seconds and not yield anything. As a special case, when limit_seconds is zero, the iterator never returns anything.
- more_itertools.map_if(iterable, pred, func, func_else=lambda x: ...)[source]
Evaluate each item from iterable using pred. If the result is equivalent to True, transform the item with func and yield it. Otherwise, transform the item with func_else and yield it.
pred, func, and func_else should each be functions that accept one argument. By default, func_else is the identity function.
>>> from math import sqrt >>> iterable = list(range(-5, 5)) >>> iterable [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4] >>> list(map_if(iterable, lambda x: x > 3, lambda x: 'toobig')) [-5, -4, -3, -2, -1, 0, 1, 2, 3, 'toobig'] >>> list(map_if(iterable, lambda x: x >= 0, ... lambda x: f'{sqrt(x):.2f}', lambda x: None)) [None, None, None, None, None, '0.00', '1.00', '1.41', '1.73', '2.00']
- more_itertools.doublestarmap(func, iterable)[source]
Apply func to every item of iterable by dictionary unpacking the item into func.
The difference between itertools.starmap() and doublestarmap() parallels the distinction between func(*a) and func(**a).
>>> iterable = [{'a': 1, 'b': 2}, {'a': 40, 'b': 60}]
>>> list(doublestarmap(lambda a, b: a + b, iterable))
[3, 100]
TypeError
will be raised if func’s signature doesn’t match the mapping contained in iterable or if iterable does not contain mappings.
Itertools recipes
- more_itertools.iter_index(iterable, value, start=0, stop=None)[source]
Yield the index of each place in iterable that value occurs, beginning with index start and ending before index stop.
>>> list(iter_index('AABCADEAF', 'A')) [0, 1, 4, 7] >>> list(iter_index('AABCADEAF', 'A', 1)) # start index is inclusive [1, 4, 7] >>> list(iter_index('AABCADEAF', 'A', 1, 7)) # stop index is not inclusive [1, 4]
The behavior for non-scalar values matches the built-in Python types.
>>> list(iter_index('ABCDABCD', 'AB')) [0, 4] >>> list(iter_index([0, 1, 2, 3, 0, 1, 2, 3], [0, 1])) [] >>> list(iter_index([[0, 1], [2, 3], [0, 1], [2, 3]], [0, 1])) [0, 2]
See
locate()
for a more general means of finding the indexes associated with particular values.
- more_itertools.consume(iterator, n=None)[source]
Advance iterator by n steps. If n is None, consume it entirely.
Efficiently exhausts an iterator without returning values. Defaults to consuming the whole iterator, but an optional second argument may be provided to limit consumption.
>>> i = (x for x in range(10)) >>> next(i) 0 >>> consume(i, 3) >>> next(i) 4 >>> consume(i) >>> next(i) Traceback (most recent call last): File "<stdin>", line 1, in <module> StopIteration
If the iterator has fewer items remaining than the provided limit, the whole iterator will be consumed.
>>> i = (x for x in range(3)) >>> consume(i, 5) >>> next(i) Traceback (most recent call last): File "<stdin>", line 1, in <module> StopIteration
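Both modes can be sketched with two standard-library idioms: a zero-length `deque` to drain everything, and an empty `islice` to skip ahead by n items. This mirrors the recipe of the same name in the itertools documentation. `consume_sketch` is a hypothetical name.

```python
from collections import deque
from itertools import islice


def consume_sketch(iterator, n=None):
    """Hypothetical helper: advance `iterator` by n steps, or fully if n is None."""
    if n is None:
        deque(iterator, maxlen=0)  # drain without storing anything
    else:
        next(islice(iterator, n, n), None)  # skip n items, ignore exhaustion


i = iter(range(10))
next(i)               # 0
consume_sketch(i, 3)  # skips 1, 2, 3
print(next(i))        # 4
```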
- more_itertools.tabulate(function, start=0)[source]
Return an iterator over the results of function(start), function(start + 1), function(start + 2), …
function should be a function that accepts one integer argument.
If start is not specified it defaults to 0. It will be incremented each time the iterator is advanced.
>>> from more_itertools import take
>>> square = lambda x: x ** 2
>>> iterator = tabulate(square, -3)
>>> take(4, iterator)
[9, 4, 1, 0]
- more_itertools.repeatfunc(func, times=None, *args)[source]
Call func with args repeatedly, returning an iterable over the results.
If times is specified, the iterable will terminate after that many repetitions:
>>> from operator import add >>> times = 4 >>> args = 3, 5 >>> list(repeatfunc(add, times, *args)) [8, 8, 8, 8]
If times is None, the iterable will not terminate:
>>> from random import randrange
>>> from more_itertools import take
>>> times = None
>>> args = 1, 11
>>> take(6, repeatfunc(randrange, times, *args))
[2, 4, 8, 1, 8, 4]