API Reference

More routines for operating on iterables, beyond itertools

Grouping

These tools yield groups of items from a source iterable.

New itertools

more_itertools.chunked(iterable, n, strict=False)

Break iterable into lists of length n:

>>> list(chunked([1, 2, 3, 4, 5, 6], 3))
[[1, 2, 3], [4, 5, 6]]

By default, the last yielded list will have fewer than n elements if the length of iterable is not divisible by n:

>>> list(chunked([1, 2, 3, 4, 5, 6, 7, 8], 3))
[[1, 2, 3], [4, 5, 6], [7, 8]]

To use a fill-in value instead, see the grouper() recipe.

If the length of iterable is not divisible by n and strict is True, then ValueError will be raised before the last list is yielded.
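
The strict behaviour described above can be sketched with the standard library alone. The name chunked_sketch is hypothetical, and this is an illustration of the documented behaviour under that assumption, not the library's implementation:

```python
from itertools import islice


def chunked_sketch(iterable, n, strict=False):
    """Yield lists of up to n items (hypothetical stand-in for chunked())."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        if strict and len(chunk) != n:
            # A short final chunk is detected before it would be yielded.
            raise ValueError("iterable is not divisible by n")
        yield chunk


print(list(chunked_sketch([1, 2, 3, 4, 5, 6, 7, 8], 3)))

strict_chunks = chunked_sketch([1, 2, 3, 4], 3, strict=True)
print(next(strict_chunks))  # the full chunk is still produced first
try:
    next(strict_chunks)
except ValueError as exc:
    print("ValueError:", exc)
```

Because the error surfaces only when the short chunk is reached, a consumer may already have processed the earlier full chunks.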

more_itertools.ichunked(iterable, n)

Break iterable into sub-iterables with n elements each. ichunked() is like chunked(), but it yields iterables instead of lists.

If the sub-iterables are read in order, the elements of iterable won’t be stored in memory. If they are read out of order, itertools.tee() is used to cache elements as necessary.

>>> from itertools import count
>>> all_chunks = ichunked(count(), 4)
>>> c_1, c_2, c_3 = next(all_chunks), next(all_chunks), next(all_chunks)
>>> list(c_2)  # c_1's elements have been cached; c_3's haven't been
[4, 5, 6, 7]
>>> list(c_1)
[0, 1, 2, 3]
>>> list(c_3)
[8, 9, 10, 11]

more_itertools.chunked_even(iterable, n)

Break iterable into lists of approximately length n. Items are distributed such that the lengths of the lists differ by at most 1 item.

>>> iterable = [1, 2, 3, 4, 5, 6, 7]
>>> n = 3
>>> list(chunked_even(iterable, n))  # List lengths: 3, 2, 2
[[1, 2, 3], [4, 5], [6, 7]]
>>> list(chunked(iterable, n))  # List lengths: 3, 3, 1
[[1, 2, 3], [4, 5, 6], [7]]

more_itertools.sliced(seq, n, strict=False)

Yield slices of length n from the sequence seq.

>>> list(sliced((1, 2, 3, 4, 5, 6), 3))
[(1, 2, 3), (4, 5, 6)]

By default, the last yielded slice will have fewer than n elements if the length of seq is not divisible by n:

>>> list(sliced((1, 2, 3, 4, 5, 6, 7, 8), 3))
[(1, 2, 3), (4, 5, 6), (7, 8)]

If the length of seq is not divisible by n and strict is True, then ValueError will be raised before the last slice is yielded.

This function will only work for iterables that support slicing. For non-sliceable iterables, see chunked().

more_itertools.constrained_batches(iterable, max_size, max_count=None, get_len=len, strict=True)

Yield batches of items from iterable with a combined size limited by max_size.

>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1']
>>> list(constrained_batches(iterable, 10))
[(b'12345', b'123'), (b'12345678', b'1', b'1'), (b'12', b'1')]

If a max_count is supplied, the number of items per batch is also limited:

>>> iterable = [b'12345', b'123', b'12345678', b'1', b'1', b'12', b'1']
>>> list(constrained_batches(iterable, 10, max_count = 2))
[(b'12345', b'123'), (b'12345678', b'1'), (b'1', b'12'), (b'1',)]

If a get_len function is supplied, use that instead of len() to determine item size.

If strict is True, raise ValueError if any single item is bigger than max_size. Otherwise, allow single items to exceed max_size.
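
The interplay of max_size, max_count, get_len and strict can be sketched as a greedy loop. The name constrained_batches_sketch and the word-cost function below are hypothetical; this is a minimal illustration assuming the behaviour documented above, not the library's code:

```python
def constrained_batches_sketch(iterable, max_size, max_count=None,
                               get_len=len, strict=True):
    """Greedily pack items into tuples whose total size stays within max_size."""
    batch = []
    batch_size = 0
    for item in iterable:
        item_len = get_len(item)
        if strict and item_len > max_size:
            raise ValueError("item size exceeds maximum size")
        too_big = batch_size + item_len > max_size
        too_many = max_count is not None and len(batch) == max_count
        if batch and (too_big or too_many):
            # Close the current batch before starting a new one.
            yield tuple(batch)
            batch, batch_size = [], 0
        batch.append(item)
        batch_size += item_len
    if batch:
        yield tuple(batch)


words = ["alpha", "be", "gamma", "pi"]
# Hypothetical cost: each word plus one separator character.
print(list(constrained_batches_sketch(words, 10, get_len=lambda w: len(w) + 1)))
```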

more_itertools.distribute(n, iterable)

Distribute the items from iterable among n smaller iterables.

>>> group_1, group_2 = distribute(2, [1, 2, 3, 4, 5, 6])
>>> list(group_1)
[1, 3, 5]
>>> list(group_2)
[2, 4, 6]

If the length of iterable is not evenly divisible by n, then the lengths of the returned iterables will not be identical:

>>> children = distribute(3, [1, 2, 3, 4, 5, 6, 7])
>>> [list(c) for c in children]
[[1, 4, 7], [2, 5], [3, 6]]

If the length of iterable is smaller than n, then the last returned iterables will be empty:

>>> children = distribute(5, [1, 2, 3])
>>> [list(c) for c in children]
[[1], [2], [3], [], []]

This function uses itertools.tee() and may require significant storage.

If you need the order of items in the smaller iterables to match the original iterable, see divide().

more_itertools.divide(n, iterable)

Divide the elements from iterable into n parts, maintaining order.

>>> group_1, group_2 = divide(2, [1, 2, 3, 4, 5, 6])
>>> list(group_1)
[1, 2, 3]
>>> list(group_2)
[4, 5, 6]

If the length of iterable is not evenly divisible by n, then the lengths of the returned iterables will not be identical:

>>> children = divide(3, [1, 2, 3, 4, 5, 6, 7])
>>> [list(c) for c in children]
[[1, 2, 3], [4, 5], [6, 7]]

If the length of the iterable is smaller than n, then the last returned iterables will be empty:

>>> children = divide(5, [1, 2, 3])
>>> [list(c) for c in children]
[[1], [2], [3], [], []]

This function will exhaust the iterable before returning. If order is not important, see distribute(), which does not first pull the iterable into memory.
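
The trade-off between distribute() and divide() can be illustrated with two stdlib-only sketches. The names distribute_sketch and divide_sketch are hypothetical, and the code only assumes the behaviour documented above (lazy round-robin dealing versus ordered slicing of a fully materialized input):

```python
from itertools import islice, tee


def distribute_sketch(n, iterable):
    """Deal items round-robin into n lazy iterators (order is interleaved)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    children = tee(iterable, n)
    return [islice(child, index, None, n) for index, child in enumerate(children)]


def divide_sketch(n, iterable):
    """Cut the input into n contiguous parts, keeping the original order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    seq = tuple(iterable)  # the whole input is read before anything is returned
    q, r = divmod(len(seq), n)
    parts = []
    stop = 0
    for i in range(n):
        start = stop
        stop += q + 1 if i < r else q  # the first r parts get one extra item
        parts.append(iter(seq[start:stop]))
    return parts


print([list(c) for c in distribute_sketch(3, [1, 2, 3, 4, 5, 6, 7])])
print([list(c) for c in divide_sketch(3, [1, 2, 3, 4, 5, 6, 7])])
```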

more_itertools.split_at(iterable, pred, maxsplit=-1, keep_separator=False)

Yield lists of items from iterable, where each list is delimited by an item where callable pred returns True.

>>> list(split_at('abcdcba', lambda x: x == 'b'))
[['a'], ['c', 'd', 'c'], ['a']]

>>> list(split_at(range(10), lambda n: n % 2 == 1))
[[0], [2], [4], [6], [8], []]

At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:

>>> list(split_at(range(10), lambda n: n % 2 == 1, maxsplit=2))
[[0], [2], [4, 5, 6, 7, 8, 9]]

By default, the delimiting items are not included in the output. To include them, set keep_separator to True.

>>> list(split_at('abcdcba', lambda x: x == 'b', keep_separator=True))
[['a'], ['b'], ['c', 'd', 'c'], ['b'], ['a']]

more_itertools.split_before(iterable, pred, maxsplit=-1)

Yield lists of items from iterable, where each list ends just before an item for which callable pred returns True:

>>> list(split_before('OneTwo', lambda s: s.isupper()))
[['O', 'n', 'e'], ['T', 'w', 'o']]

>>> list(split_before(range(10), lambda n: n % 3 == 0))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:

>>> list(split_before(range(10), lambda n: n % 3 == 0, maxsplit=2))
[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]

more_itertools.split_after(iterable, pred, maxsplit=-1)

Yield lists of items from iterable, where each list ends with an item where callable pred returns True:

>>> list(split_after('one1two2', lambda s: s.isdigit()))
[['o', 'n', 'e', '1'], ['t', 'w', 'o', '2']]

>>> list(split_after(range(10), lambda n: n % 3 == 0))
[[0], [1, 2, 3], [4, 5, 6], [7, 8, 9]]

At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:

>>> list(split_after(range(10), lambda n: n % 3 == 0, maxsplit=2))
[[0], [1, 2, 3], [4, 5, 6, 7, 8, 9]]

more_itertools.split_into(iterable, sizes)

Yield a list of sequential items from iterable of length ‘n’ for each integer ‘n’ in sizes.

>>> list(split_into([1,2,3,4,5,6], [1,2,3]))
[[1], [2, 3], [4, 5, 6]]

If the sum of sizes is smaller than the length of iterable, then the remaining items of iterable will not be returned.

>>> list(split_into([1,2,3,4,5,6], [2,3]))
[[1, 2], [3, 4, 5]]

If the sum of sizes is larger than the length of iterable, fewer items will be returned in the iteration that overruns the iterable and further lists will be empty:

>>> list(split_into([1,2,3,4], [1,2,3,4]))
[[1], [2, 3], [4], []]

When a None object is encountered in sizes, the returned list will contain items up to the end of iterable, the same way that itertools.islice() does:

>>> list(split_into([1,2,3,4,5,6,7,8,9,0], [2,3,None]))
[[1, 2], [3, 4, 5], [6, 7, 8, 9, 0]]

split_into() can be useful for grouping a series of items where the sizes of the groups are not uniform. An example would be a row from a table in which multiple columns represent elements of the same feature (e.g. a point represented by x, y, z), but the format is not the same for all columns.
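
The table-row use case can be sketched as follows. The helper split_into_sketch and the sample row are hypothetical; the code assumes only the behaviour documented above, including the treatment of None in sizes:

```python
from itertools import islice


def split_into_sketch(iterable, sizes):
    """Yield one list per entry in sizes; None means 'everything that is left'."""
    it = iter(iterable)
    for size in sizes:
        if size is None:
            yield list(it)
            return
        yield list(islice(it, size))


# Hypothetical row: a label, an (x, y, z) point, then a variable number of tags.
row = ["p1", 1.0, 2.0, 3.0, "red", "blue"]
label, point, tags = split_into_sketch(row, [1, 3, None])
print(label, point, tags)
```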

more_itertools.split_when(iterable, pred, maxsplit=-1)

Split iterable into pieces based on the output of pred. pred should be a function that takes successive pairs of items and returns True if the iterable should be split in between them.

For example, to find runs of increasing numbers, split the iterable when element i is larger than element i + 1:

>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2], lambda x, y: x > y))
[[1, 2, 3, 3], [2, 5], [2, 4], [2]]

At most maxsplit splits are done. If maxsplit is not specified or -1, then there is no limit on the number of splits:

>>> list(split_when([1, 2, 3, 3, 2, 5, 2, 4, 2],
...                 lambda x, y: x > y, maxsplit=2))
[[1, 2, 3, 3], [2, 5], [2, 4, 2]]

class more_itertools.bucket(iterable, key, validator=None)

Wrap iterable and return an object that buckets the iterable into child iterables based on a key function.

>>> iterable = ['a1', 'b1', 'c1', 'a2', 'b2', 'c2', 'b3']
>>> s = bucket(iterable, key=lambda x: x[0])  # Bucket by 1st character
>>> sorted(list(s))  # Get the keys
['a', 'b', 'c']
>>> a_iterable = s['a']
>>> next(a_iterable)
'a1'
>>> next(a_iterable)
'a2'
>>> list(s['b'])
['b1', 'b2', 'b3']

The original iterable will be advanced and its items will be cached until they are used by the child iterables. This may require significant storage.

By default, attempting to select a bucket to which no items belong will exhaust the iterable and cache all values. If you specify a validator function, selected buckets will instead be checked against it.

>>> from itertools import count
>>> it = count(1, 2)  # Infinite sequence of odd numbers
>>> key = lambda x: x % 10  # Bucket by last digit
>>> validator = lambda x: x in {1, 3, 5, 7, 9}  # Odd digits only
>>> s = bucket(it, key=key, validator=validator)
>>> 2 in s
False
>>> list(s[2])
[]
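
The caching and validator behaviour can be sketched with a small class. BucketSketch is a hypothetical name, and this simplified version (it omits iteration over keys and membership tests) only assumes the semantics described above:

```python
from collections import defaultdict, deque
from itertools import count, islice


class BucketSketch:
    """Route items into per-key queues on demand (hypothetical, simplified)."""

    def __init__(self, iterable, key, validator=None):
        self._it = iter(iterable)
        self._key = key
        self._cache = defaultdict(deque)
        self._validator = validator or (lambda k: True)

    def _values(self, wanted):
        while True:
            if self._cache[wanted]:
                yield self._cache[wanted].popleft()
                continue
            # Advance the source, caching items that belong to other buckets.
            for item in self._it:
                k = self._key(item)
                if k == wanted:
                    yield item
                    break
                if self._validator(k):
                    self._cache[k].append(item)
            else:
                return  # source exhausted

    def __getitem__(self, wanted):
        if not self._validator(wanted):
            return iter(())  # rejected up front, so the source is never scanned
        return self._values(wanted)


s = BucketSketch(count(1, 2), key=lambda x: x % 10,
                 validator=lambda k: k in {1, 3, 5, 7, 9})
print(list(s[2]))             # returns immediately despite the infinite source
print(list(islice(s[3], 3)))
```

Without the validator, asking for bucket 2 would scan the infinite source forever, which is the situation the validator is meant to avoid.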

Lookahead and lookback

These tools peek at an iterable’s values without advancing it.

New itertools

more_itertools.spy(iterable, n=1)

Return a 2-tuple with a list containing the first n elements of iterable, and an iterator with the same items as iterable. This allows you to “look ahead” at the items in the iterable without advancing it.

There is one item in the list by default:

>>> iterable = 'abcdefg'
>>> head, iterable = spy(iterable)
>>> head
['a']
>>> list(iterable)
['a', 'b', 'c', 'd', 'e', 'f', 'g']

You may use unpacking to retrieve items instead of lists:

>>> (head,), iterable = spy('abcdefg')
>>> head
'a'
>>> (first, second), iterable = spy('abcdefg', 2)
>>> first
'a'
>>> second
'b'

The number of items requested can be larger than the number of items in the iterable:

>>> iterable = [1, 2, 3, 4, 5]
>>> head, iterable = spy(iterable, 10)
>>> head
[1, 2, 3, 4, 5]
>>> list(iterable)
[1, 2, 3, 4, 5]

class more_itertools.peekable(iterable)

Wrap an iterator to allow lookahead and prepending elements.

Call peek() on the result to get the value that will be returned by next(). This won’t advance the iterator:

>>> p = peekable(['a', 'b'])
>>> p.peek()
'a'
>>> next(p)
'a'

Pass peek() a default value to return that instead of raising StopIteration when the iterator is exhausted.

>>> p = peekable([])
>>> p.peek('hi')
'hi'

peekables also offer a prepend() method, which “inserts” items at the head of the iterable:

>>> p = peekable([1, 2, 3])
>>> p.prepend(10, 11, 12)
>>> next(p)
10
>>> p.peek()
11
>>> list(p)
[11, 12, 1, 2, 3]

peekables can be indexed. Index 0 is the item that will be returned by next(), index 1 is the item after that, and so on. The values up to the given index will be cached.

>>> p = peekable(['a', 'b', 'c', 'd'])
>>> p[0]
'a'
>>> p[1]
'b'
>>> next(p)
'a'

Negative indexes are supported, but be aware that they will cache the remaining items in the source iterator, which may require significant storage.

To check whether a peekable is exhausted, check its truth value:

>>> p = peekable(['a', 'b'])
>>> if p:  # peekable has items
...     list(p)
['a', 'b']
>>> if not p:  # peekable is exhausted
...     list(p)
[]
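
The lookahead, prepend and truth-value behaviour can be sketched with a deque used as a cache. PeekableSketch is a hypothetical, reduced version (no indexing or slicing) that assumes only the semantics described above:

```python
from collections import deque

_MISSING = object()  # sentinel meaning "no default was supplied"


class PeekableSketch:
    """Iterator wrapper with a small front cache (hypothetical, simplified)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = deque()

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache:
            return self._cache.popleft()
        return next(self._it)

    def peek(self, default=_MISSING):
        if not self._cache:
            try:
                self._cache.append(next(self._it))
            except StopIteration:
                if default is _MISSING:
                    raise
                return default
        return self._cache[0]

    def prepend(self, *items):
        # extendleft reverses its input, so reverse first to keep the order.
        self._cache.extendleft(reversed(items))

    def __bool__(self):
        try:
            self.peek()
        except StopIteration:
            return False
        return True


p = PeekableSketch([1, 2, 3])
p.prepend(10, 11, 12)
print(next(p), p.peek(), list(p), bool(p))
```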

class more_itertools.seekable(iterable, maxlen=None)

Wrap an iterator to allow for seeking backward and forward. This progressively caches the items in the source iterable so they can be re-visited.

Call seek() with an index to seek to that position in the source iterable.

To “reset” an iterator, seek to 0:

>>> from itertools import count
>>> it = seekable((str(n) for n in count()))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> it.seek(0)
>>> next(it), next(it), next(it)
('0', '1', '2')

You can also seek forward:

>>> it = seekable((str(n) for n in range(20)))
>>> it.seek(10)
>>> next(it)
'10'
>>> it.seek(20)  # Seeking past the end of the source isn't a problem
>>> list(it)
[]
>>> it.seek(0)  # Resetting works even after hitting the end
>>> next(it)
'0'

Call relative_seek() to seek relative to the source iterator’s current position.

>>> it = seekable((str(n) for n in range(20)))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> it.relative_seek(2)
>>> next(it)
'5'
>>> it.relative_seek(-3)  # Source is at '6', we move back to '3'
>>> next(it)
'3'
>>> it.relative_seek(-3)  # Source is at '4', we move back to '1'
>>> next(it)
'1'

Call peek() to look ahead one item without advancing the iterator:

>>> it = seekable('1234')
>>> it.peek()
'1'
>>> list(it)
['1', '2', '3', '4']
>>> it.peek(default='empty')
'empty'

Before the iterator is at its end, calling bool() on it will return True. Afterward, it will return False:

>>> it = seekable('5678')
>>> bool(it)
True
>>> list(it)
['5', '6', '7', '8']
>>> bool(it)
False

You may view the contents of the cache with the elements() method. That returns a SequenceView, a view that updates automatically:

>>> it = seekable((str(n) for n in range(10)))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> elements = it.elements()
>>> elements
SequenceView(['0', '1', '2'])
>>> next(it)
'3'
>>> elements
SequenceView(['0', '1', '2', '3'])

Indexing the seekable directly returns items from the cache:

>>> it = seekable((str(n) for n in range(10)))
>>> next(it), next(it), next(it)
('0', '1', '2')
>>> it[-1]
'2'
>>> it[0]
'0'

By default, the cache grows as the source iterable progresses, so beware of wrapping very large or infinite iterables. Supply maxlen to limit the size of the cache (this of course limits how far back you can seek).

>>> from itertools import count
>>> it = seekable((str(n) for n in count()), maxlen=2)
>>> next(it), next(it), next(it), next(it)
('0', '1', '2', '3')
>>> list(it.elements())
['2', '3']
>>> it.seek(0)
>>> next(it), next(it), next(it), next(it)
('2', '3', '4', '5')
>>> next(it)
'6'

Windowing

These tools yield windows of items from an iterable.

New itertools

more_itertools.windowed(seq, n, fillvalue=None, step=1)

Return a sliding window of width n over the given iterable.

>>> all_windows = windowed([1, 2, 3, 4, 5], 3)
>>> list(all_windows)
[(1, 2, 3), (2, 3, 4), (3, 4, 5)]

When the window is larger than the iterable, fillvalue is used in place of missing values:

>>> list(windowed([1, 2, 3], 4))
[(1, 2, 3, None)]

Each window will advance in increments of step:

>>> list(windowed([1, 2, 3, 4, 5, 6], 3, fillvalue='!', step=2))
[(1, 2, 3), (3, 4, 5), (5, 6, '!')]

To slide into the iterable’s items, use chain() to add filler items to the left:

>>> from itertools import chain
>>> iterable = [1, 2, 3, 4]
>>> n = 3
>>> padding = [None] * (n - 1)
>>> list(windowed(chain(padding, iterable), 3))
[(None, None, 1), (None, 1, 2), (1, 2, 3), (2, 3, 4)]

more_itertools.substrings(iterable)

Yield all of the substrings of iterable.

>>> [''.join(s) for s in substrings('more')]
['m', 'o', 'r', 'e', 'mo', 'or', 're', 'mor', 'ore', 'more']

Note that non-string iterables can also be subdivided.

>>> list(substrings([0, 1, 2]))
[(0,), (1,), (2,), (0, 1), (1, 2), (0, 1, 2)]

Like subslices() but returns tuples instead of lists and returns the shortest substrings first.

more_itertools.substrings_indexes(seq, reverse=False)

Yield all substrings and their positions in seq.

The items yielded will be a tuple of the form (substr, i, j), where substr == seq[i:j].

This function only works for iterables that support slicing, such as str objects.

>>> for item in substrings_indexes('more'):
...    print(item)
('m', 0, 1)
('o', 1, 2)
('r', 2, 3)
('e', 3, 4)
('mo', 0, 2)
('or', 1, 3)
('re', 2, 4)
('mor', 0, 3)
('ore', 1, 4)
('more', 0, 4)

Set reverse to True to yield the same items in the opposite order.

more_itertools.stagger(iterable, offsets=(-1, 0, 1), longest=False, fillvalue=None)

Yield tuples whose elements are offset from iterable. The amount by which the i-th item in each tuple is offset is given by the i-th item in offsets.

>>> list(stagger([0, 1, 2, 3]))
[(None, 0, 1), (0, 1, 2), (1, 2, 3)]
>>> list(stagger(range(8), offsets=(0, 2, 4)))
[(0, 2, 4), (1, 3, 5), (2, 4, 6), (3, 5, 7)]

By default, the sequence will end when the final element of a tuple is the last item in the iterable. To continue until the first element of a tuple is the last item in the iterable, set longest to True:

>>> list(stagger([0, 1, 2, 3], longest=True))
[(None, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, None), (3, None, None)]

By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.

more_itertools.windowed_complete(iterable, n)

Yield (beginning, middle, end) tuples, where:

Each middle has n items from iterable
Each beginning has the items before the ones in middle
Each end has the items after the ones in middle

>>> iterable = range(7)
>>> n = 3
>>> for beginning, middle, end in windowed_complete(iterable, n):
...     print(beginning, middle, end)
() (0, 1, 2) (3, 4, 5, 6)
(0,) (1, 2, 3) (4, 5, 6)
(0, 1) (2, 3, 4) (5, 6)
(0, 1, 2) (3, 4, 5) (6,)
(0, 1, 2, 3) (4, 5, 6) ()

Note that n must be at least 0 and at most equal to the length of iterable.

This function will exhaust the iterable and may require significant storage.
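
The storage note follows from how the three parts are produced: every tuple is a slice of the fully materialized input. A minimal sketch, with the hypothetical name windowed_complete_sketch and assuming only the behaviour documented above:

```python
def windowed_complete_sketch(iterable, n):
    """Yield (beginning, middle, end) tuples by slicing a materialized copy."""
    if n < 0:
        raise ValueError("n must be >= 0")
    seq = tuple(iterable)  # the whole input is held in memory
    size = len(seq)
    if n > size:
        raise ValueError("n must be <= len(seq)")
    for i in range(size - n + 1):
        yield seq[:i], seq[i:i + n], seq[i + n:]


for beginning, middle, end in windowed_complete_sketch(range(5), 2):
    print(beginning, middle, end)
```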

Itertools recipes

more_itertools.pairwise(iterable)

Wrapper for itertools.pairwise().

Deprecated since version 11.0.0: Will be removed in a future major release.

more_itertools.triplewise(iterable)

Return overlapping triplets from iterable.

>>> list(triplewise('ABCDE'))
[('A', 'B', 'C'), ('B', 'C', 'D'), ('C', 'D', 'E')]

more_itertools.sliding_window(iterable, n)

Return a sliding window of width n over iterable.

>>> list(sliding_window(range(6), 4))
[(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]

If iterable has fewer than n items, then nothing is yielded:

>>> list(sliding_window(range(3), 4))
[]

For a variant with more features, see windowed().

more_itertools.subslices(iterable)

Return all contiguous non-empty subslices of iterable.

>>> list(subslices('ABC'))
[['A'], ['A', 'B'], ['A', 'B', 'C'], ['B'], ['B', 'C'], ['C']]

This is similar to substrings(), but emits items in a different order.

Augmenting

These tools yield items from an iterable, plus additional data.

New itertools

more_itertools.count_cycle(iterable, n=None)

Cycle through the items from iterable up to n times, yielding the number of completed cycles along with each item. If n is omitted the process repeats indefinitely.

>>> list(count_cycle('AB', 3))
[(0, 'A'), (0, 'B'), (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]

more_itertools.intersperse(e, iterable, n=1)

Intersperse filler element e among the items in iterable, leaving n items between each filler element.

>>> list(intersperse('!', [1, 2, 3, 4, 5]))
[1, '!', 2, '!', 3, '!', 4, '!', 5]

>>> list(intersperse(None, [1, 2, 3, 4, 5], n=2))
[1, 2, None, 3, 4, None, 5]

more_itertools.padded(iterable, fillvalue=None, n=None, next_multiple=False)

Yield the elements from iterable, followed by fillvalue, such that at least n items are emitted.

>>> list(padded([1, 2, 3], '?', 5))
[1, 2, 3, '?', '?']

If next_multiple is True, fillvalue will be emitted until the number of items emitted is a multiple of n:

>>> list(padded([1, 2, 3, 4], n=3, next_multiple=True))
[1, 2, 3, 4, None, None]

If n is None, fillvalue will be emitted indefinitely.

To create an iterable of exactly size n, you can truncate with islice().

>>> from itertools import islice
>>> list(islice(padded([1, 2, 3], '?'), 5))
[1, 2, 3, '?', '?']
>>> list(islice(padded([1, 2, 3, 4, 5, 6, 7, 8], '?'), 5))
[1, 2, 3, 4, 5]
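
The three padding modes (minimum length, next multiple, endless) reduce to a small amount of arithmetic. The name padded_sketch is hypothetical, and the code is a sketch assuming only the behaviour documented above:

```python
from itertools import chain, islice, repeat


def padded_sketch(iterable, fillvalue=None, n=None, next_multiple=False):
    """Yield the items, then as many fill values as the chosen mode requires."""
    if n is None:
        # No target length: pad forever.
        yield from chain(iterable, repeat(fillvalue))
        return
    if n < 1:
        raise ValueError("n must be at least 1")
    emitted = 0
    for item in iterable:
        yield item
        emitted += 1
    if next_multiple:
        remaining = (n - emitted % n) % n  # distance to the next multiple of n
    else:
        remaining = max(n - emitted, 0)
    yield from repeat(fillvalue, remaining)


print(list(padded_sketch([1, 2, 3, 4], n=3, next_multiple=True)))
print(list(islice(padded_sketch([1, 2, 3], '?'), 5)))
```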

more_itertools.mark_ends(iterable)

Yield 3-tuples of the form (is_first, is_last, item).

>>> list(mark_ends('ABC'))
[(True, False, 'A'), (False, False, 'B'), (False, True, 'C')]

Use this when looping over an iterable to take special action on its first and/or last items:

>>> iterable = ['Header', 100, 200, 'Footer']
>>> total = 0
>>> for is_first, is_last, item in mark_ends(iterable):
...     if is_first:
...         continue  # Skip the header
...     if is_last:
...         continue  # Skip the footer
...     total += item
>>> print(total)
300

more_itertools.repeat_each(iterable, n=2)

Repeat each element in iterable n times.

>>> list(repeat_each('ABC', 3))
['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']

more_itertools.repeat_last(iterable, default=None)

After the iterable is exhausted, keep yielding its last element.

>>> from itertools import islice
>>> list(islice(repeat_last(range(3)), 5))
[0, 1, 2, 2, 2]

If the iterable is empty, yield default forever:

>>> list(islice(repeat_last(range(0), 42), 5))
[42, 42, 42, 42, 42]

more_itertools.adjacent(predicate, iterable, distance=1)

Return an iterable over (bool, item) tuples where the item is drawn from iterable and the bool indicates whether that item satisfies the predicate or is adjacent to an item that does.

For example, to find whether items are adjacent to a 3:

>>> list(adjacent(lambda x: x == 3, range(6)))
[(False, 0), (False, 1), (True, 2), (True, 3), (True, 4), (False, 5)]

Set distance to change what counts as adjacent. For example, to find whether items are two places away from a 3:

>>> list(adjacent(lambda x: x == 3, range(6), distance=2))
[(False, 0), (True, 1), (True, 2), (True, 3), (True, 4), (True, 5)]

This is useful for contextualizing the results of a search function. For example, a code comparison tool might want to identify lines that have changed, but also surrounding lines to give the viewer of the diff context.

The predicate function will only be called once for each item in the iterable.

See also groupby_transform(), which can be used with this function to group ranges of items with the same bool value.
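
The diff-context use case can be sketched as below. The name adjacent_sketch and the sample lines are hypothetical. Unlike the lazy behaviour one would expect from the library, this simplified version materializes the input first; it keeps the property that the predicate is called once per item:

```python
def adjacent_sketch(predicate, iterable, distance=1):
    """Yield (bool, item) pairs; True if the item or a neighbour matches."""
    items = list(iterable)
    hits = [predicate(item) for item in items]  # one predicate call per item
    for i, item in enumerate(items):
        lo = max(i - distance, 0)
        yield any(hits[lo:i + distance + 1]), item


# Hypothetical diff: upper-case lines stand for changed lines.
lines = ["a", "b", "CHANGED", "d", "e", "f"]
context = [line for keep, line in adjacent_sketch(str.isupper, lines) if keep]
print(context)
```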

more_itertools.groupby_transform(iterable, keyfunc=None, valuefunc=None, reducefunc=None)

An extension of itertools.groupby() that can apply transformations to the grouped data.

keyfunc is a function computing a key value for each item in iterable
valuefunc is a function that transforms the individual items from iterable after grouping
reducefunc is a function that transforms each group of items

>>> iterable = 'aAAbBBcCC'
>>> keyfunc = lambda k: k.upper()
>>> valuefunc = lambda v: v.lower()
>>> reducefunc = lambda g: ''.join(g)
>>> list(groupby_transform(iterable, keyfunc, valuefunc, reducefunc))
[('A', 'aaa'), ('B', 'bbb'), ('C', 'ccc')]

Each optional argument defaults to an identity function if not specified.

groupby_transform() is useful when grouping elements of an iterable using a separate iterable as the key. To do this, zip() the iterables and pass a keyfunc that extracts the first element and a valuefunc that extracts the second element:

>>> from operator import itemgetter
>>> keys = [0, 0, 1, 1, 1, 2, 2, 2, 3]
>>> values = 'abcdefghi'
>>> iterable = zip(keys, values)
>>> grouper = groupby_transform(iterable, itemgetter(0), itemgetter(1))
>>> [(k, ''.join(g)) for k, g in grouper]
[(0, 'ab'), (1, 'cde'), (2, 'fgh'), (3, 'i')]

Note that the order of items in the iterable is significant. Only adjacent items are grouped together, so if you don’t want any duplicate groups, you should sort the iterable by the key function or consider bucket() or map_reduce(). map_reduce() consumes the iterable immediately and returns a dictionary, while bucket() does not.
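
The effect of item order can be seen with a small sketch built on itertools.groupby(). The name groupby_transform_sketch and the sample data are hypothetical; the code assumes only the behaviour described above:

```python
from itertools import groupby


def groupby_transform_sketch(iterable, keyfunc=None, valuefunc=None, reducefunc=None):
    """groupby() plus optional per-item and per-group transformations."""
    for key, group in groupby(iterable, keyfunc):
        if valuefunc is not None:
            group = map(valuefunc, group)
        if reducefunc is not None:
            group = reducefunc(group)
        yield key, group


def first_letter(word):
    return word[0]


data = ["apple", "avocado", "banana", "apricot"]

# Unsorted input: the key 'a' appears twice because only neighbours are grouped.
unsorted_groups = list(groupby_transform_sketch(data, first_letter, reducefunc=list))

# Sorting by the same key first merges the duplicate groups.
sorted_groups = list(
    groupby_transform_sketch(sorted(data, key=first_letter), first_letter, reducefunc=list)
)
print(unsorted_groups)
print(sorted_groups)
```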

Combining

These tools combine multiple iterables.

New itertools

more_itertools.collapse(iterable, base_type=None, levels=None)

Flatten an iterable with multiple levels of nesting (e.g., a list of lists of tuples) into non-iterable types.

>>> iterable = [(1, 2), ([3, 4], [[5], [6]])]
>>> list(collapse(iterable))
[1, 2, 3, 4, 5, 6]

Binary and text strings are not considered iterable and will not be collapsed.

To avoid collapsing other types, specify base_type:

>>> iterable = ['ab', ('cd', 'ef'), ['gh', 'ij']]
>>> list(collapse(iterable, base_type=tuple))
['ab', ('cd', 'ef'), 'gh', 'ij']

Specify levels to stop flattening after a certain level:

>>> iterable = [('a', ['b']), ('c', ['d'])]
>>> list(collapse(iterable))  # Fully flattened
['a', 'b', 'c', 'd']
>>> list(collapse(iterable, levels=1))  # Only one level flattened
['a', ['b'], 'c', ['d']]
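
How base_type and levels interact can be sketched with a recursive walk. The name collapse_sketch is hypothetical; this recursive version is an assumption made for brevity and may differ from how the library traverses the input:

```python
def collapse_sketch(iterable, base_type=None, levels=None):
    """Flatten nested iterables, stopping at strings, base_type, or a depth limit."""

    def walk(node, level):
        stop_here = (
            (levels is not None and level > levels)
            or isinstance(node, (str, bytes))
            or (base_type is not None and isinstance(node, base_type))
        )
        if stop_here:
            yield node
            return
        try:
            children = iter(node)
        except TypeError:
            yield node  # not iterable, so it is a leaf
            return
        for child in children:
            yield from walk(child, level + 1)

    yield from walk(iterable, 0)


nested = [('a', ['b']), ('c', ['d'])]
print(list(collapse_sketch(nested)))
print(list(collapse_sketch(nested, levels=1)))
```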

more_itertools.interleave(*iterables)

Return a new iterable yielding from each iterable in turn, until the shortest is exhausted.

>>> list(interleave([1, 2, 3], [4, 5], [6, 7, 8]))
[1, 4, 6, 2, 5, 7]

For a version that doesn’t terminate after the shortest iterable is exhausted, see interleave_longest().

more_itertools.interleave_longest(*iterables)

Return a new iterable yielding from each iterable in turn, skipping any that are exhausted.

>>> list(interleave_longest([1, 2, 3], [4, 5], [6, 7, 8]))
[1, 4, 6, 2, 5, 7, 3, 8]

This function produces the same output as roundrobin(), but may perform better for some inputs (in particular when the number of iterables is large).
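
The difference between stopping at the shortest input and skipping exhausted inputs can be sketched with zip() and zip_longest(). The names below are hypothetical, and the code assumes only the outputs documented above:

```python
from itertools import chain, zip_longest

_MARKER = object()  # private fill value that cannot collide with real items


def interleave_sketch(*iterables):
    """Take one item from each input in turn; stop at the shortest input."""
    return chain.from_iterable(zip(*iterables))


def interleave_longest_sketch(*iterables):
    """Take one item from each input in turn; skip inputs that have run out."""
    for item in chain.from_iterable(zip_longest(*iterables, fillvalue=_MARKER)):
        if item is not _MARKER:
            yield item


print(list(interleave_sketch([1, 2, 3], [4, 5], [6, 7, 8])))
print(list(interleave_longest_sketch([1, 2, 3], [4, 5], [6, 7, 8])))
```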

more_itertools.interleave_evenly(iterables, lengths=None)

Interleave multiple iterables so that their elements are evenly distributed throughout the output sequence.

>>> iterables = [1, 2, 3, 4, 5], ['a', 'b']
>>> list(interleave_evenly(iterables))
[1, 2, 'a', 3, 4, 'b', 5]

>>> iterables = [[1, 2, 3], [4, 5], [6, 7, 8]]
>>> list(interleave_evenly(iterables))
[1, 6, 4, 2, 7, 3, 8, 5]

This function requires iterables of known length. Iterables without __len__() can be used by specifying their lengths manually with the lengths argument:

>>> from itertools import combinations, repeat
>>> iterables = [combinations(range(4), 2), ['a', 'b', 'c']]
>>> lengths = [4 * (4 - 1) // 2, 3]
>>> list(interleave_evenly(iterables, lengths=lengths))
[(0, 1), (0, 2), 'a', (0, 3), (1, 2), 'b', (1, 3), (2, 3), 'c']

Based on Bresenham’s algorithm.

more_itertools.interleave_randomly(*iterables)

Repeatedly select one of the input iterables at random and yield the next item from it.

>>> iterables = [1, 2, 3], 'abc', (True, False, None)
>>> list(interleave_randomly(*iterables))
['a', 'b', 1, 'c', True, False, None, 2, 3]

The relative order of the items in each input iterable will be preserved. Note that the sequences of items with this property are not equally likely to be generated.

more_itertools.partial_product(*iterables, repeat=1)

Yields tuples containing one item from each iterator, with subsequent tuples changing a single item at a time by advancing each iterator until it is exhausted. This sequence guarantees every value in each iterable is output at least once without generating all possible combinations.

This may be useful, for example, when testing an expensive function.

>>> list(partial_product('AB', 'C', 'DEF'))
[('A', 'C', 'D'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'C', 'F')]

The repeat keyword argument specifies the number of repetitions of the iterables. For example, partial_product('AB', repeat=3) is equivalent to partial_product('AB', 'AB', 'AB').
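
The "change one position at a time" idea can be sketched as follows. The name partial_product_sketch is hypothetical and the repeat argument is left out for brevity; the code assumes only the behaviour documented above:

```python
def partial_product_sketch(*iterables):
    """Yield tuples that advance one position at a time, left to right."""
    iterators = [iter(it) for it in iterables]
    try:
        current = [next(it) for it in iterators]
    except StopIteration:
        return  # at least one input is empty, so there is nothing to yield
    yield tuple(current)
    for i, it in enumerate(iterators):
        for value in it:
            current[i] = value
            yield tuple(current)


combos = list(partial_product_sketch('AB', 'C', 'DEF'))
print(combos)
print(len(combos), "tuples instead of", 2 * 1 * 3)
```

The number of tuples grows with the sum of the input lengths rather than their product, which is why this can help when each call under test is expensive.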

more_itertools.sort_together(iterables, key_list=(0,), key=None, reverse=False, strict=False)

Return the input iterables sorted together, with key_list as the priority for sorting. All iterables are trimmed to the length of the shortest one.

This can be used like the sorting function in a spreadsheet. If each iterable represents a column of data, the key list determines which columns are used for sorting.

By default, all iterables are sorted using the 0-th iterable:

>>> iterables = [(4, 3, 2, 1), ('a', 'b', 'c', 'd')]
>>> sort_together(iterables)
[(1, 2, 3, 4), ('d', 'c', 'b', 'a')]

Set a different key list to sort according to another iterable. Specifying multiple keys dictates how ties are broken:

>>> iterables = [(3, 1, 2), (0, 1, 0), ('c', 'b', 'a')]
>>> sort_together(iterables, key_list=(1, 2))
[(2, 3, 1), (0, 0, 1), ('a', 'c', 'b')]

To sort by a function of the elements of the iterable, pass a key function. Its arguments are the elements of the iterables corresponding to the key list:

>>> names = ('a', 'b', 'c')
>>> lengths = (1, 2, 3)
>>> widths = (5, 2, 1)
>>> def area(length, width):
...     return length * width
>>> sort_together([names, lengths, widths], key_list=(1, 2), key=area)
[('c', 'b', 'a'), (3, 2, 1), (1, 2, 5)]

Set reverse to True to sort in descending order.

>>> sort_together([(1, 2, 3), ('c', 'b', 'a')], reverse=True)
[(3, 2, 1), ('a', 'b', 'c')]

If the strict keyword argument is True, then ValueError will be raised if any of the iterables have different lengths.
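
The spreadsheet analogy can be sketched as "transpose into rows, sort the rows, transpose back". The name sort_together_sketch is hypothetical and the strict argument is omitted; the code assumes only the behaviour documented above:

```python
from operator import itemgetter


def sort_together_sketch(iterables, key_list=(0,), key=None, reverse=False):
    """Sort columns together by treating each zipped tuple as a row."""
    getter = itemgetter(*key_list)
    if key is None:
        sort_key = getter
    elif len(key_list) == 1:
        sort_key = lambda row: key(getter(row))   # getter returns a single value
    else:
        sort_key = lambda row: key(*getter(row))  # getter returns a tuple
    rows = sorted(zip(*iterables), key=sort_key, reverse=reverse)
    return list(zip(*rows))  # transpose back into columns


names = ('a', 'b', 'c')
lengths = (1, 2, 3)
widths = (5, 2, 1)
print(sort_together_sketch([names, lengths, widths], key_list=(1, 2),
                           key=lambda length, width: length * width))
```

zip() stops at the shortest input, which matches the trimming described at the start of this entry.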

more_itertools.value_chain(*args)

Yield all arguments passed to the function in the same order in which they were passed. If an argument itself is iterable then iterate over its values.

>>> list(value_chain(1, 2, 3, [4, 5, 6]))
[1, 2, 3, 4, 5, 6]

Binary and text strings are not considered iterable and are emitted as-is:

>>> list(value_chain('12', '34', ['56', '78']))
['12', '34', '56', '78']

Prepend or append a single element to an iterable:

>>> list(value_chain(1, [2, 3, 4, 5, 6]))
[1, 2, 3, 4, 5, 6]
>>> list(value_chain([1, 2, 3, 4, 5], 6))
[1, 2, 3, 4, 5, 6]

Multiple levels of nesting are not flattened.

more_itertools.zip_offset(*iterables, offsets, longest=False, fillvalue=None)

zip the input iterables together, but offset the i-th iterable by the i-th item in offsets.

>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1)))
[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e')]

This can be used as a lightweight alternative to SciPy or pandas to analyze data sets in which some series have a lead or lag relationship.

By default, the sequence will end when the shortest iterable is exhausted. To continue until the longest iterable is exhausted, set longest to True.

>>> list(zip_offset('0123', 'abcdef', offsets=(0, 1), longest=True))
[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e'), (None, 'f')]

By default, None will be used to replace offsets beyond the end of the sequence. Specify fillvalue to use some other value.
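
The lead/lag use case can be sketched by shifting each input before zipping. The name zip_offset_sketch and the temperature and sales figures are hypothetical; the code assumes only the behaviour documented above:

```python
from itertools import chain, islice, repeat, zip_longest


def zip_offset_sketch(*iterables, offsets, longest=False, fillvalue=None):
    """Shift each input by its offset, then zip the shifted inputs."""
    if len(iterables) != len(offsets):
        raise ValueError("Number of iterables and offsets didn't match")
    shifted = []
    for it, n in zip(iterables, offsets):
        if n < 0:
            shifted.append(chain(repeat(fillvalue, -n), it))  # delay this input
        elif n > 0:
            shifted.append(islice(it, n, None))               # skip ahead
        else:
            shifted.append(it)
    if longest:
        return zip_longest(*shifted, fillvalue=fillvalue)
    return zip(*shifted)


# Hypothetical data: does today's temperature relate to tomorrow's sales?
temperatures = [10, 12, 15, 11]
sales = [100, 110, 130, 160]
print(list(zip_offset_sketch(temperatures, sales, offsets=(0, 1))))
```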

more_itertools.zip_broadcast(*objects, scalar_types=(str, bytes), strict=False)

A version of zip() that “broadcasts” any scalar (i.e., non-iterable) items into output tuples.

>>> iterable_1 = [1, 2, 3]
>>> iterable_2 = ['a', 'b', 'c']
>>> scalar = '_'
>>> list(zip_broadcast(iterable_1, iterable_2, scalar))
[(1, 'a', '_'), (2, 'b', '_'), (3, 'c', '_')]

The scalar_types keyword argument determines what types are considered scalar. It is set to (str, bytes) by default. Set it to None to treat strings and byte strings as iterable:

>>> list(zip_broadcast('abc', 0, 'xyz', scalar_types=None))
[('a', 0, 'x'), ('b', 0, 'y'), ('c', 0, 'z')]

If the strict keyword argument is True, then ValueError will be raised if any of the iterables have different lengths.

Itertools recipes

more_itertools.flatten(list_of_lists)

Return an iterator flattening one level of nesting in a list of lists.

>>> list(flatten([[0, 1], [2, 3]]))
[0, 1, 2, 3]

See also collapse(), which can flatten multiple levels of nesting.

more_itertools.roundrobin(*iterables)

Visit input iterables in a cycle until each is exhausted.

>>> list(roundrobin('ABC', 'D', 'EF'))
['A', 'D', 'E', 'B', 'F', 'C']

This function produces the same output as interleave_longest(), but may perform better for some inputs (in particular when the number of iterables is small).

more_itertools.prepend(value, iterable)[source]¶

Yield value, followed by the elements in iterable.

>>> value = '0'
>>> iterable = ['1', '2', '3']
>>> list(prepend(value, iterable))
['0', '1', '2', '3']

To prepend multiple values, see itertools.chain() or value_chain().

Concurrency¶

These tools support thread-safe concurrency.

more_itertools.concurrent_tee(iterable, n=2)[source]¶

Variant of itertools.tee() but with guaranteed threading semantics.

Takes a non-threadsafe iterator as an input and creates concurrent tee objects for other threads to have reliable independent copies of the data stream.

The new iterators are only thread-safe if consumed within a single thread. To share just one of the new iterators across multiple threads, wrap it with serialize().

class more_itertools.serialize(iterable)[source]¶

Wrap a non-concurrent iterator with a lock to enforce sequential access.

Applies a non-reentrant lock around calls to __next__, allowing iterator and generator instances to be shared by multiple consumer threads.
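
The locking idea can be illustrated with a small standard-library sketch. The class name SerializedIterator and the squares() generator are hypothetical, and this is only an approximation of the behavior described above, not the code behind serialize().

```python
import threading


class SerializedIterator:
    """Wrap an iterator so that only one thread at a time can advance it."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._lock = threading.Lock()  # non-reentrant lock

    def __iter__(self):
        return self

    def __next__(self):
        # The lock is released even when StopIteration is raised.
        with self._lock:
            return next(self._iterator)


def squares(limit):
    # A plain generator; concurrent next() calls on it are not safe.
    for i in range(limit):
        yield i * i


shared = SerializedIterator(squares(2000))
collected = []  # list.append is thread-safe in CPython


def worker():
    for value in shared:
        collected.append(value)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each value is delivered to exactly one of the four consumer threads, so the combined results contain every square once, although the order in which they arrive is not deterministic.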

more_itertools.synchronized(func)[source]¶

Wrap an iterator-returning callable to make its iterators thread-safe.

Existing itertools and more-itertools can be wrapped so that their iterator instances are serialized.

For example, itertools.count does not make thread-safe instances, but that is easily fixed with:

atomic_counter = synchronized(itertools.count)

It can also be used as a decorator on generator function definitions so that the generator instances are serialized:

from time import time_ns

@synchronized
def enumerate_and_timestamp(iterable):
    for count, value in enumerate(iterable):
        yield count, time_ns(), value

Summarizing¶

These tools return summarized or aggregated data from an iterable.

New itertools

more_itertools.ilen(iterable)[source]¶

Return the number of items in iterable.

For example, there are 168 prime numbers below 1,000:

>>> ilen(sieve(1000))
168

Equivalent to, but faster than:

def ilen(iterable):
    count = 0
    for _ in iterable:
        count += 1
    return count

This fully consumes the iterable, so handle with care.
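
One way to count items without a Python-level loop is to let zip() and deque() drive the iteration. This is a sketch under that assumption (the name ilen_sketch is hypothetical), and the library's own implementation may differ. It also demonstrates the warning above: the iterable is empty afterwards.

```python
from collections import deque
from itertools import count


def ilen_sketch(iterable):
    """Count items by pairing each one with a running counter."""
    counter = count()
    # maxlen=0 discards every pair, so nothing is stored.
    deque(zip(iterable, counter), maxlen=0)
    # The counter was advanced once per item, so its next value is the total.
    return next(counter)


numbers = (x for x in range(168))
total = ilen_sketch(numbers)
leftover = list(numbers)  # the generator has been fully consumed
```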

more_itertools.unique_to_each(*iterables)[source]¶

Return the elements from each of the input iterables that aren’t in the other input iterables.

For example, suppose you have a set of packages, each with a set of dependencies:

{'pkg_1': {'A', 'B'}, 'pkg_2': {'B', 'C'}, 'pkg_3': {'B', 'D'}}

If you remove one package, which dependencies can also be removed?

If pkg_1 is removed, then A is no longer necessary - it is not associated with pkg_2 or pkg_3. Similarly, C is only needed for pkg_2, and D is only needed for pkg_3:

>>> unique_to_each({'A', 'B'}, {'B', 'C'}, {'B', 'D'})
[['A'], ['C'], ['D']]

If there are duplicates in one input iterable that aren’t in the others, they will be duplicated in the output. Input order is preserved:

>>> unique_to_each("mississippi", "missouri")
[['p', 'p'], ['o', 'u', 'r']]

It is assumed that the elements of each iterable are hashable.
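
The behavior described above can be sketched with collections.Counter: count how many inputs each element appears in, then keep the elements that appear in exactly one. The name unique_to_each_sketch is hypothetical and the code is illustrative rather than the library's source.

```python
from collections import Counter
from itertools import chain


def unique_to_each_sketch(*iterables):
    pool = [list(it) for it in iterables]
    # Count in how many inputs each element appears; set() ignores repeats
    # within a single input.
    appearances = Counter(chain.from_iterable(map(set, pool)))
    uniques = {element for element, n in appearances.items() if n == 1}
    # Filter the original lists so that order and duplicates are preserved.
    return [[element for element in items if element in uniques]
            for items in pool]


letters = unique_to_each_sketch("mississippi", "missouri")
packages = unique_to_each_sketch({'A', 'B'}, {'B', 'C'}, {'B', 'D'})
```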

more_itertools.sample(iterable, k=1, weights=None)[source]¶

Return a k-length list of elements chosen (without replacement) from the iterable.

Similar to random.sample(), but works on inputs that aren’t indexable (such as sets and dictionaries) and on inputs where the size isn’t known in advance (such as generators).

>>> iterable = range(100)
>>> sample(iterable, 5)
[81, 60, 96, 16, 4]

For iterables with repeated elements, you may supply counts to indicate the repeats.

>>> iterable = ['a', 'b']
>>> counts = [3, 4]  # Equivalent to 'a', 'a', 'a', 'b', 'b', 'b', 'b'
>>> sample(iterable, k=3, counts=counts)
['a', 'a', 'b']

An iterable with weights may be given:

>>> iterable = range(100)
>>> weights = (i * i + 1 for i in range(100))
>>> sample(iterable, 5, weights=weights)
[79, 67, 74, 66, 78]

Weighted selections are made without replacement. After an element is selected, it is removed from the pool and the relative weights of the other elements increase (this does not match the behavior of random.sample()’s counts parameter). Note that weights may not be used with counts.

If the length of iterable is less than k, ValueError is raised if strict is True and all elements are returned (in shuffled order) if strict is False.

By default, the Algorithm L reservoir sampling technique is used. When weights are provided, Algorithm A-ExpJ is used instead.

Notes on reproducibility:

The algorithms rely on inexact floating-point functions provided by the underlying math library (e.g. log, log1p, and pow). Those functions can produce slightly different results on different builds. Accordingly, selections can vary across builds even for the same seed.
The algorithms loop over the input and make selections based on ordinal position, so selections from unordered collections (such as sets) won’t reproduce across sessions on the same platform using the same seed. For example, this won’t reproduce:
```
>>> seed(8675309)
>>> sample(set('abcdefghijklmnopqrstuvwxyz'), 10)
['c', 'p', 'e', 'w', 's', 'a', 'j', 'd', 'n', 't']
```
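
To give a feel for reservoir sampling, here is a sketch of the simpler Algorithm R. It is not the Algorithm L or A-ExpJ technique named above, and the function name and rng parameter are hypothetical. Like the unweighted case, it makes a single pass and never needs to know the input's length in advance.

```python
import random


def reservoir_sample_sketch(iterable, k, rng=random):
    """Uniform sample of up to k items without replacement (Algorithm R)."""
    reservoir = []
    for index, item in enumerate(iterable):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randrange(index + 1)
            if slot < k:
                reservoir[slot] = item
    return reservoir


rng = random.Random(8675309)
picked = reservoir_sample_sketch((n for n in range(100)), 5, rng)
short = reservoir_sample_sketch('ab', 5, rng)
```

When the input has fewer than k items, this sketch simply returns all of them in their original order; it does not shuffle them or raise an error.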

more_itertools.consecutive_groups(iterable, ordering=lambda x: ...)[source]¶

Yield groups of consecutive items using itertools.groupby(). The ordering function determines whether two items are adjacent by returning their position.

By default, the ordering function is the identity function. This is suitable for finding runs of numbers:

>>> iterable = [1, 10, 11, 12, 20, 30, 31, 32, 33, 40]
>>> for group in consecutive_groups(iterable):
...     print(list(group))
[1]
[10, 11, 12]
[20]
[30, 31, 32, 33]
[40]

To find runs of adjacent letters, use the ord() function as the ordering to convert letters to ordinals:

>>> iterable = 'abcdfgilmnop'
>>> ordering = ord
>>> for group in consecutive_groups(iterable, ordering):
...     print(list(group))
['a', 'b', 'c', 'd']
['f', 'g']
['i']
['l', 'm', 'n', 'o', 'p']

Each group of consecutive items is an iterator that shares its source with iterable. When an output group is advanced, the previous group is no longer available unless its elements are copied (e.g., into a list).

>>> iterable = [1, 2, 11, 12, 21, 22]
>>> saved_groups = []
>>> for group in consecutive_groups(iterable):
...     saved_groups.append(list(group))  # Copy group elements
>>> saved_groups
[[1, 2], [11, 12], [21, 22]]
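
The grouping relies on a property of consecutive runs: within a run, the difference between an item's index and its position stays constant. A minimal sketch of that idea with itertools.groupby() follows; the name consecutive_groups_sketch is hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def consecutive_groups_sketch(iterable, ordering=lambda x: x):
    # Items in a run share the same (index - position) difference.
    keyed = groupby(
        enumerate(iterable),
        key=lambda pair: pair[0] - ordering(pair[1]),
    )
    for _, group in keyed:
        # Drop the index and yield only the original items.
        yield map(itemgetter(1), group)


numbers = [1, 10, 11, 12, 20, 30, 31, 32, 33, 40]
runs = [list(g) for g in consecutive_groups_sketch(numbers)]
letters = [''.join(g) for g in consecutive_groups_sketch('abcdfgilmnop', ord)]
```

As with the real function, each group must be consumed before the next one is requested, which is why the comprehensions copy each group immediately.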

class more_itertools.run_length[source]¶

run_length.encode() compresses an iterable with run-length encoding. It yields groups of repeated items with the count of how many times they were repeated:

>>> uncompressed = 'abbcccdddd'
>>> list(run_length.encode(uncompressed))
[('a', 1), ('b', 2), ('c', 3), ('d', 4)]

run_length.decode() decompresses an iterable that was previously compressed with run-length encoding. It yields the items of the decompressed iterable:

>>> compressed = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> list(run_length.decode(compressed))
['a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
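
Run-length encoding and decoding can both be expressed with a few itertools functions. The helpers rle_encode and rle_decode below are hypothetical names for an illustrative sketch, not the methods of the run_length class.

```python
from itertools import chain, groupby, repeat


def rle_encode(iterable):
    # groupby() clusters adjacent equal items; count the size of each cluster.
    return ((item, sum(1 for _ in group)) for item, group in groupby(iterable))


def rle_decode(pairs):
    # Expand each (item, count) pair back into `count` copies of item.
    return chain.from_iterable(repeat(item, count) for item, count in pairs)


encoded = list(rle_encode('abbcccdddd'))
decoded = ''.join(rle_decode(encoded))
```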

more_itertools.map_reduce(iterable, keyfunc, valuefunc=None, reducefunc=None)[source]¶

Return a dictionary that maps the items in iterable to categories defined by keyfunc, transforms them with valuefunc, and then summarizes them by category with reducefunc.

valuefunc defaults to the identity function if it is unspecified. If reducefunc is unspecified, no summarization takes place:

>>> keyfunc = lambda x: x.upper()
>>> result = map_reduce('abbccc', keyfunc)
>>> sorted(result.items())
[('A', ['a']), ('B', ['b', 'b']), ('C', ['c', 'c', 'c'])]

Specifying valuefunc transforms the categorized items:

>>> keyfunc = lambda x: x.upper()
>>> valuefunc = lambda x: 1
>>> result = map_reduce('abbccc', keyfunc, valuefunc)
>>> sorted(result.items())
[('A', [1]), ('B', [1, 1]), ('C', [1, 1, 1])]

Specifying reducefunc summarizes the categorized items:

>>> keyfunc = lambda x: x.upper()
>>> valuefunc = lambda x: 1
>>> reducefunc = sum
>>> result = map_reduce('abbccc', keyfunc, valuefunc, reducefunc)
>>> sorted(result.items())
[('A', 1), ('B', 2), ('C', 3)]

You may want to filter the input iterable before applying the map/reduce procedure:

>>> all_items = range(30)
>>> items = [x for x in all_items if 10 <= x <= 20]  # Filter
>>> keyfunc = lambda x: x % 2  # Evens map to 0; odds to 1
>>> categories = map_reduce(items, keyfunc=keyfunc)
>>> sorted(categories.items())
[(0, [10, 12, 14, 16, 18, 20]), (1, [11, 13, 15, 17, 19])]
>>> summaries = map_reduce(items, keyfunc=keyfunc, reducefunc=sum)
>>> sorted(summaries.items())
[(0, 90), (1, 75)]

Note that all items in the iterable are gathered into a list before the summarization step, which may require significant storage.

The returned object is a collections.defaultdict with the default_factory set to None, such that it behaves like a normal dictionary.
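
The three stages (categorize, transform, summarize) and the default_factory detail can be sketched as follows. The name map_reduce_sketch is hypothetical and the code is an approximation of the documented behavior, not the library's source.

```python
from collections import defaultdict


def map_reduce_sketch(iterable, keyfunc, valuefunc=None, reducefunc=None):
    groups = defaultdict(list)
    for item in iterable:
        value = item if valuefunc is None else valuefunc(item)
        groups[keyfunc(item)].append(value)

    if reducefunc is not None:
        # Replace each list of values with its summary.
        for key, values in groups.items():
            groups[key] = reducefunc(values)

    # Stop missing keys from being created on lookup, like a plain dict.
    groups.default_factory = None
    return groups


totals = map_reduce_sketch(range(10, 21), lambda x: x % 2, reducefunc=sum)

try:
    totals[2]
    missing_key_raises = False
except KeyError:
    missing_key_raises = True
```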

Selecting¶

These tools yield certain items from an iterable.

New itertools

class more_itertools.islice_extended(iterable, stop)[source]¶

class more_itertools.islice_extended(iterable, start, stop[, step])[source]

An extension of itertools.islice() that supports negative values for stop, start, and step.

>>> iterator = iter('abcdefgh')
>>> list(islice_extended(iterator, -4, -1))
['e', 'f', 'g']

Slices with negative values require some caching of iterable, but this function takes care to minimize the amount of memory required.

For example, you can use a negative step with an infinite iterator:

>>> from itertools import count
>>> list(islice_extended(count(), 110, 99, -2))
[110, 108, 106, 104, 102, 100]

You can also use slice notation directly:

>>> iterator = map(str, count())
>>> it = islice_extended(iterator)[10:20:2]
>>> list(it)
['10', '12', '14', '16', '18']

more_itertools.first(iterable[, default])[source]¶

Return the first item of iterable, or default if iterable is empty.

>>> first([0, 1, 2, 3])
0
>>> first([], 'some default')
'some default'

If default is not provided and there are no items in the iterable, raise ValueError.

first() is useful when you have a generator of expensive-to-retrieve values and want any arbitrary one. It is marginally shorter than next(iter(iterable), default).

more_itertools.last(iterable[, default])[source]¶

Return the last item of iterable, or default if iterable is empty.

>>> last([0, 1, 2, 3])
3
>>> last([], 'some default')
'some default'

If default is not provided and there are no items in the iterable, raise ValueError.

more_itertools.one(iterable, too_short=ValueError, too_long=ValueError)[source]¶

Return the first item from iterable, which is expected to contain only that item. Raise an exception if iterable is empty or has more than one item.

one() is useful for ensuring that an iterable contains only one item. For example, it can be used to retrieve the result of a database query that is expected to return a single row.

If iterable is empty, ValueError will be raised. You may specify a different exception with the too_short keyword:

>>> it = []
>>> one(it)
Traceback (most recent call last):
...
ValueError: too few items in iterable (expected 1)
>>> too_short = IndexError('too few items')
>>> one(it, too_short=too_short)
Traceback (most recent call last):
...
IndexError: too few items

Similarly, if iterable contains more than one item, ValueError will be raised. You may specify a different exception with the too_long keyword:

>>> it = ['too', 'many']
>>> one(it)
Traceback (most recent call last):
...
ValueError: Expected exactly one item in iterable, but got 'too',
'many', and perhaps more.
>>> too_long = RuntimeError
>>> one(it, too_long=too_long)
Traceback (most recent call last):
...
RuntimeError

Note that one() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.

more_itertools.only(iterable, default=None, too_long=ValueError)[source]¶

If iterable has only one item, return it. If it has zero items, return default. If it has more than one item, raise the exception given by too_long, which is ValueError by default.

>>> only([], default='missing')
'missing'
>>> only([1])
1
>>> only([1, 2])
Traceback (most recent call last):
...
ValueError: Expected exactly one item in iterable, but got 1, 2,
 and perhaps more.
>>> only([1, 2], too_long=TypeError)
Traceback (most recent call last):
...
TypeError

Note that only() attempts to advance iterable twice to ensure there is only one item. See spy() or peekable() to check iterable contents less destructively.

more_itertools.strictly_n(iterable, n, too_short=None, too_long=None)[source]¶

Validate that iterable has exactly n items and return them if it does. If it has fewer than n items, call function too_short with the actual number of items. If it has more than n items, call function too_long with the number n + 1.

>>> iterable = ['a', 'b', 'c', 'd']
>>> n = 4
>>> list(strictly_n(iterable, n))
['a', 'b', 'c', 'd']

Note that the returned iterable must be consumed in order for the check to be made.

By default, too_short and too_long are functions that raise ValueError.

>>> list(strictly_n('ab', 3))
Traceback (most recent call last):
...
ValueError: too few items in iterable (got 2)

>>> list(strictly_n('abc', 2))
Traceback (most recent call last):
...
ValueError: too many items in iterable (got at least 3)

You can instead supply functions that do something else. too_short will be called with the number of items in iterable. too_long will be called with n + 1.

>>> def too_short(item_count):
...     raise RuntimeError
>>> it = strictly_n('abcd', 6, too_short=too_short)
>>> list(it)
Traceback (most recent call last):
...
RuntimeError

>>> def too_long(item_count):
...     print('The boss is going to hear about this')
>>> it = strictly_n('abcdef', 4, too_long=too_long)
>>> list(it)
The boss is going to hear about this
['a', 'b', 'c', 'd']

more_itertools.strip(iterable, pred)[source]¶

Yield the items from iterable, but strip any from the beginning and end for which pred returns True.

For example, to remove a set of items from both ends of an iterable:

>>> iterable = (None, False, None, 1, 2, None, 3, False, None)
>>> pred = lambda x: x in {None, False, ''}
>>> list(strip(iterable, pred))
[1, 2, None, 3]

This function is analogous to str.strip().

more_itertools.lstrip(iterable, pred)[source]¶

Yield the items from iterable, but strip any from the beginning for which pred returns True.

For example, to remove a set of items from the start of an iterable:

>>> iterable = (None, False, None, 1, 2, None, 3, False, None)
>>> pred = lambda x: x in {None, False, ''}
>>> list(lstrip(iterable, pred))
[1, 2, None, 3, False, None]

This function is analogous to str.lstrip(), and is essentially a wrapper for itertools.dropwhile().

more_itertools.rstrip(iterable, pred)[source]¶

Yield the items from iterable, but strip any from the end for which pred returns True.

For example, to remove a set of items from the end of an iterable:

>>> iterable = (None, False, None, 1, 2, None, 3, False, None)
>>> pred = lambda x: x in {None, False, ''}
>>> list(rstrip(iterable, pred))
[None, False, None, 1, 2, None, 3]

This function is analogous to str.rstrip().
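
Stripping from the end of an iterable is harder than stripping from the start, because it is not known whether a matching item is trailing until the items after it have been seen. One way to handle this, sketched below with hypothetical names, is to buffer matching items and release them only when a non-matching item follows.

```python
from itertools import dropwhile


def rstrip_sketch(iterable, pred):
    pending = []  # items that would be stripped if the input ended here
    for item in iterable:
        if pred(item):
            pending.append(item)
        else:
            # A kept item follows, so the buffered items were interior.
            yield from pending
            pending.clear()
            yield item


def strip_sketch(iterable, pred):
    # Drop the leading matches lazily, then trim the trailing ones.
    return rstrip_sketch(dropwhile(pred, iterable), pred)


iterable = (None, False, None, 1, 2, None, 3, False, None)
pred = lambda x: x in {None, False, ''}
right_stripped = list(rstrip_sketch(iterable, pred))
both_stripped = list(strip_sketch(iterable, pred))
```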

more_itertools.filter_except(validator, iterable, *exceptions)[source]¶

Yield the items from iterable for which the validator function does not raise one of the specified exceptions.

validator is called for each item in iterable. It should be a function that accepts one argument and raises an exception if that item is not valid.

>>> iterable = ['1', '2', 'three', '4', None]
>>> list(filter_except(int, iterable, ValueError, TypeError))
['1', '2', '4']

If an exception other than one given by exceptions is raised by validator, it is raised like normal.

more_itertools.map_except(function, iterable, *exceptions)[source]¶

Transform each item from iterable with function and yield the result, unless function raises one of the specified exceptions.

function is called to transform each item in iterable. It should accept one argument.

>>> iterable = ['1', '2', 'three', '4', None]
>>> list(map_except(int, iterable, ValueError, TypeError))
[1, 2, 4]

If an exception other than one given by exceptions is raised by function, it is raised like normal.
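
The distinction between listed and unlisted exceptions can be seen in a short sketch. The name map_except_sketch is hypothetical and the code only approximates the behavior described above.

```python
def map_except_sketch(function, iterable, *exceptions):
    for item in iterable:
        try:
            result = function(item)
        except exceptions:
            # Listed exceptions skip the item; anything else propagates.
            continue
        yield result


iterable = ['1', '2', 'three', '4', None]
converted = list(map_except_sketch(int, iterable, ValueError, TypeError))

try:
    # TypeError is not listed here, so int(None) is not suppressed.
    list(map_except_sketch(int, iterable, ValueError))
    propagated = False
except TypeError:
    propagated = True
```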

more_itertools.filter_map(func, iterable)[source]¶

Apply func to every element of iterable, yielding only those results which are not None.

>>> elems = ['1', 'a', '2', 'b', '3']
>>> list(filter_map(lambda s: int(s) if s.isnumeric() else None, elems))
[1, 2, 3]

more_itertools.iter_suppress(iterable, *exceptions)[source]¶

Yield each of the items from iterable. If the iteration raises one of the specified exceptions, that exception will be suppressed and iteration will stop.

>>> from itertools import chain
>>> def breaks_at_five(x):
...     while True:
...         if x >= 5:
...             raise RuntimeError
...         yield x
...         x += 1
>>> it_1 = iter_suppress(breaks_at_five(1), RuntimeError)
>>> it_2 = iter_suppress(breaks_at_five(2), RuntimeError)
>>> list(chain(it_1, it_2))
[1, 2, 3, 4, 2, 3, 4]

more_itertools.nth_or_last(iterable, n[, default])[source]¶

Return the nth or the last item of iterable, or default if iterable is empty.

>>> nth_or_last([0, 1, 2, 3], 2)
2
>>> nth_or_last([0, 1], 2)
1
>>> nth_or_last([], 0, 'some default')
'some default'

If default is not provided and there are no items in the iterable, raise ValueError.

more_itertools.extract(iterable, indices, *, monotonic=False)[source]¶

Yield values at the specified indices.

Example:

>>> data = 'abcdefghijklmnopqrstuvwxyz'
>>> list(extract(data, [7, 4, 11, 11, 14]))
['h', 'e', 'l', 'l', 'o']

The iterable is consumed lazily and can be infinite.

When monotonic is false, the indices are consumed immediately and must be finite. When monotonic is true, indices are consumed lazily and can be infinite but must be non-decreasing.

Raises IndexError if an index lies beyond the iterable. Raises ValueError for a negative index or for a decreasing index when monotonic is true.

more_itertools.unique_in_window(iterable, n, key=None)[source]¶

Yield the items from iterable that haven’t been seen recently. n is the size of the sliding window.

>>> iterable = [0, 1, 0, 2, 3, 0]
>>> n = 3
>>> list(unique_in_window(iterable, n))
[0, 1, 2, 3, 0]

The key function, if provided, will be used to determine uniqueness:

>>> list(unique_in_window('abAcda', 3, key=lambda x: x.lower()))
['a', 'b', 'c', 'd', 'a']

Updates a sliding window no larger than n and yields a value if the item only occurs once in the updated window.

When n == 1, unique_in_window is memoryless:

>>> list(unique_in_window('aab', n=1))
['a', 'a', 'b']

The items in iterable must be hashable.

more_itertools.duplicates_everseen(iterable, key=None)[source]¶

Yield duplicate elements after their first appearance.

>>> list(duplicates_everseen('mississippi'))
['s', 'i', 's', 's', 'i', 'p', 'i']
>>> list(duplicates_everseen('AaaBbbCccAaa', str.lower))
['a', 'a', 'b', 'b', 'c', 'c', 'A', 'a', 'a']

This function is analogous to unique_everseen() and is subject to the same performance considerations.

more_itertools.duplicates_justseen(iterable, key=None)[source]¶

Yields serially-duplicate elements after their first appearance.

>>> list(duplicates_justseen('mississippi'))
['s', 's', 'p']
>>> list(duplicates_justseen('AaaBbbCccAaa', str.lower))
['a', 'a', 'b', 'b', 'c', 'c', 'a', 'a']

This function is analogous to unique_justseen().

more_itertools.classify_unique(iterable, key=None)[source]¶

Classify each element in terms of its uniqueness.

For each element in the input iterable, return a 3-tuple consisting of:

The element itself
False if the element is equal to the one preceding it in the input, True otherwise (i.e. the equivalent of unique_justseen())
False if this element has been seen anywhere in the input before, True otherwise (i.e. the equivalent of unique_everseen())

>>> list(classify_unique('otto'))
[('o', True,  True),
 ('t', True,  True),
 ('t', False, False),
 ('o', True,  False)]

This function is analogous to unique_everseen() and is subject to the same performance considerations.

more_itertools.longest_common_prefix(iterables)[source]¶

Yield elements of the longest common prefix among given iterables.

>>> ''.join(longest_common_prefix(['abcd', 'abc', 'abf']))
'ab'

more_itertools.takewhile_inclusive(predicate, iterable)[source]¶

A variant of takewhile() that yields one additional element.

>>> list(takewhile_inclusive(lambda x: x < 5, [1, 4, 6, 4, 1]))
[1, 4, 6]

takewhile() would return [1, 4].
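
The difference from takewhile() comes down to yielding each item before testing it. A minimal sketch, using the hypothetical name takewhile_inclusive_sketch:

```python
from itertools import takewhile


def takewhile_inclusive_sketch(predicate, iterable):
    for item in iterable:
        yield item
        if not predicate(item):
            # The first failing item has already been yielded; stop here.
            return


data = [1, 4, 6, 4, 1]
inclusive = list(takewhile_inclusive_sketch(lambda x: x < 5, data))
exclusive = list(takewhile(lambda x: x < 5, data))
```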

Itertools recipes

more_itertools.nth(iterable, n, default=None)[source]¶

Returns the nth item or a default value.

>>> l = range(10)
>>> nth(l, 3)
3
>>> nth(l, 20, "zebra")
'zebra'

more_itertools.before_and_after(predicate, it)[source]¶

A variant of takewhile() that allows complete access to the remainder of the iterator.

>>> it = iter('ABCdEfGhI')
>>> all_upper, remainder = before_and_after(str.isupper, it)
>>> ''.join(all_upper)
'ABC'
>>> ''.join(remainder) # takewhile() would lose the 'd'
'dEfGhI'

Note that the first iterator must be fully consumed before the second iterator can generate valid results.

more_itertools.take(n, iterable)[source]¶

Return first n items of the iterable as a list.

>>> take(3, range(10))
[0, 1, 2]

If there are fewer than n items in the iterable, all of them are returned.

>>> take(10, range(3))
[0, 1, 2]

more_itertools.tail(n, iterable)[source]¶

Return an iterator over the last n items of iterable.

>>> t = tail(3, 'ABCDEFG')
>>> list(t)
['E', 'F', 'G']

more_itertools.unique_everseen(iterable, key=None)[source]¶

Yield unique elements, preserving order. Remember all elements ever seen.

>>> list(unique_everseen('AAAABBBCCDAABBB'))
['A', 'B', 'C', 'D']
>>> list(unique_everseen('ABBCcAD', str.casefold))
['A', 'B', 'C', 'D']

Raises TypeError for unhashable items.

Some unhashable objects can be converted to hashable objects using the key parameter:

For list objects, try key=tuple.
For set objects, try key=frozenset.
For dict objects, try key=lambda x: frozenset(x.items()) or in Python 3.15 and later, set key=frozendict.

Alternatively, consider the unique() itertool recipe. It sorts the data and then uses equality to eliminate duplicates. Hashability is not required.
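
The effect of the key suggestions above can be tried out with a simplified stand-in for unique_everseen(). The function below is a hypothetical sketch built on a plain set, shown only to illustrate how key=tuple and key=lambda x: frozenset(x.items()) make unhashable items usable.

```python
def unique_everseen_sketch(iterable, key=None):
    seen = set()
    for item in iterable:
        # The key's result, not the item itself, must be hashable.
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item


rows = [[1, 2], [3, 4], [1, 2]]
records = [{'a': 1}, {'b': 2}, {'a': 1}]

unique_rows = list(unique_everseen_sketch(rows, key=tuple))
unique_records = list(
    unique_everseen_sketch(records, key=lambda d: frozenset(d.items()))
)
```

Without a key, the same inputs raise TypeError in this sketch because lists and dicts cannot be added to a set.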

more_itertools.unique_justseen(iterable, key=None)[source]¶

Yields elements in order, ignoring serial duplicates.

>>> list(unique_justseen('AAAABBBCCDAABBB'))
['A', 'B', 'C', 'D', 'A', 'B']
>>> list(unique_justseen('ABBCcAD', str.lower))
['A', 'B', 'C', 'A', 'D']

more_itertools.unique(iterable, key=None, reverse=False)[source]¶

Yields unique elements in sorted order.

>>> list(unique([[1, 2], [3, 4], [1, 2]]))
[[1, 2], [3, 4]]

key and reverse are passed to sorted().

>>> list(unique('ABBcCAD', str.casefold))
['A', 'B', 'c', 'D']
>>> list(unique('ABBcCAD', str.casefold, reverse=True))
['D', 'c', 'B', 'A']

The elements in iterable need not be hashable, but they must be comparable for sorting to work.

Combinatorics¶

These tools yield combinatorial arrangements of items from iterables.

New itertools

more_itertools.distinct_permutations(iterable, r=None)[source]¶

Yield successive distinct permutations of the elements in iterable.

>>> sorted(distinct_permutations([1, 0, 1]))
[(0, 1, 1), (1, 0, 1), (1, 1, 0)]

Equivalent to yielding from set(permutations(iterable)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.

Duplicate permutations arise when there are duplicated elements in the input iterable. The number of items returned is n! / (x_1! * x_2! * … * x_n!), where n is the total number of items input, and each x_i is the count of a distinct item in the input sequence. The function multinomial() computes this directly.
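
As a worked check of the counting formula, the sketch below computes the quotient of factorials with the standard library and compares it against a brute-force count. The helper name distinct_permutation_count is hypothetical, and it assumes the items are hashable so that Counter can tally them.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod


def distinct_permutation_count(iterable):
    items = list(iterable)
    # Product of the factorials of each distinct item's count.
    denominator = prod(factorial(c) for c in Counter(items).values())
    return factorial(len(items)) // denominator


small = distinct_permutation_count([1, 0, 1])       # 3! / (2! * 1!)
brute_force = len(set(permutations([1, 0, 1])))
large = distinct_permutation_count('mississippi')   # 11! / (1! * 4! * 4! * 2!)
```

For 'mississippi' the formula gives 34650, whereas the undeduplicated permutations would number 11!, which is 39916800.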

If r is given, only the r-length permutations are yielded.

>>> sorted(distinct_permutations([1, 0, 1], r=2))
[(0, 1), (1, 0), (1, 1)]
>>> sorted(distinct_permutations(range(3), r=2))
[(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]

iterable need not be sortable, but note that using equal (x == y) but non-identical (id(x) != id(y)) elements may produce surprising behavior. For example, 1 and True are equal but non-identical:

>>> list(distinct_permutations([1, True, '3']))
[
    (1, True, '3'),
    (1, '3', True),
    ('3', 1, True)
]
>>> list(distinct_permutations([1, 2, '3']))
[
    (1, 2, '3'),
    (1, '3', 2),
    (2, 1, '3'),
    (2, '3', 1),
    ('3', 1, 2),
    ('3', 2, 1)
]

more_itertools.distinct_combinations(iterable, r)[source]¶

Yield the distinct combinations of r items taken from iterable.

>>> list(distinct_combinations([0, 0, 1], 2))
[(0, 0), (0, 1)]

Equivalent to set(combinations(iterable, r)), except duplicates are not generated and thrown away. For larger input sequences this is much more efficient.

more_itertools.nth_combination_with_replacement(iterable, r, index)[source]¶

Equivalent to list(combinations_with_replacement(iterable, r))[index].

The subsequences with repetition of iterable that are of length r can be ordered lexicographically. nth_combination_with_replacement() computes the subsequence at sort position index directly, without computing the previous subsequences with replacement.

>>> nth_combination_with_replacement(range(5), 3, 5)
(0, 1, 1)

ValueError will be raised if r is negative. IndexError will be raised if the given index is invalid.

more_itertools.circular_shifts(iterable, steps=1)[source]¶

Yield the circular shifts of iterable.

>>> list(circular_shifts(range(4)))
[(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]

Set steps to the number of places to rotate to the left (or to the right if negative). Defaults to 1.

>>> list(circular_shifts(range(4), 2))
[(0, 1, 2, 3), (2, 3, 0, 1)]

>>> list(circular_shifts(range(4), -1))
[(0, 1, 2, 3), (3, 0, 1, 2), (2, 3, 0, 1), (1, 2, 3, 0)]

more_itertools.partitions(iterable)[source]¶

Yield all possible order-preserving partitions of iterable.

>>> iterable = 'abc'
>>> for part in partitions(iterable):
...     print([''.join(p) for p in part])
['abc']
['a', 'bc']
['ab', 'c']
['a', 'b', 'c']

This is unrelated to partition().
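
An order-preserving partition is determined by where the sequence is cut, so every subset of the possible cut points gives one partition. The following sketch, with the hypothetical name partitions_sketch, enumerates those subsets using itertools.combinations().

```python
from itertools import chain, combinations


def partitions_sketch(iterable):
    sequence = list(iterable)
    n = len(sequence)
    cut_points = range(1, n)
    # Every subset of the cut points gives one partition.
    all_cuts = chain.from_iterable(
        combinations(cut_points, r) for r in range(len(cut_points) + 1)
    )
    for cuts in all_cuts:
        bounds = (0,) + cuts + (n,)
        yield [sequence[i:j] for i, j in zip(bounds, bounds[1:])]


parts = [[''.join(p) for p in part] for part in partitions_sketch('abc')]
count_for_four = sum(1 for _ in partitions_sketch('abcd'))
```

A sequence of n items has n - 1 cut points, so this sketch yields 2 ** (n - 1) partitions for a non-empty input.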

more_itertools.set_partitions(iterable, k=None, min_size=None, max_size=None)[source]¶

Yield the set partitions of iterable into k parts. Set partitions are not order-preserving.

>>> iterable = 'abc'
>>> for part in set_partitions(iterable, 2):
...     print([''.join(p) for p in part])
['a', 'bc']
['ab', 'c']
['b', 'ac']

If k is not given, every set partition is generated.

>>> iterable = 'abc'
>>> for part in set_partitions(iterable):
...     print([''.join(p) for p in part])
['abc']
['a', 'bc']
['ab', 'c']
['b', 'ac']
['a', 'b', 'c']

If min_size and/or max_size are given, they set the minimum and/or maximum size of each block in a partition.

>>> iterable = 'abc'
>>> for part in set_partitions(iterable, min_size=2):
...     print([''.join(p) for p in part])
['abc']
>>> for part in set_partitions(iterable, max_size=2):
...     print([''.join(p) for p in part])
['a', 'bc']
['ab', 'c']
['b', 'ac']
['a', 'b', 'c']

more_itertools.product_index(element, *iterables, repeat=1)[source]¶

Equivalent to list(product(*iterables, repeat=repeat)).index(tuple(element))

The products of iterables can be ordered lexicographically. product_index() computes the first index of element without computing the previous products.

>>> product_index([8, 2], range(10), range(5))
42

The repeat keyword argument specifies the number of repetitions of the iterables:

>>> product_index([8, 0, 7], range(10), repeat=3)
807

ValueError will be raised if the given element isn’t in the product of args.
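
Computing the index without enumerating the product amounts to reading element as a number in a mixed-radix system, where each pool's length is the base of its digit. The sketch below uses the hypothetical name product_index_sketch and is an illustration of that idea rather than the library's code.

```python
from itertools import product


def product_index_sketch(element, *iterables, repeat=1):
    pools = [tuple(pool) for pool in iterables] * repeat
    element = tuple(element)
    if len(element) != len(pools):
        raise ValueError("element is not in the product of the iterables")

    index = 0
    for item, pool in zip(element, pools):
        # tuple.index() raises ValueError if item is missing from the pool.
        index = index * len(pool) + pool.index(item)
    return index


direct = product_index_sketch([8, 2], range(10), range(5))   # 8 * 5 + 2
repeated = product_index_sketch([8, 0, 7], range(10), repeat=3)
brute_force = list(product(range(10), range(5))).index((8, 2))
```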

more_itertools.combination_index(element, iterable)[source]¶

Equivalent to list(combinations(iterable, r)).index(element)

The subsequences of iterable that are of length r can be ordered lexicographically. combination_index() computes the index of the first element, without computing the previous combinations.

>>> combination_index('adf', 'abcdefg')
10

ValueError will be raised if the given element isn’t one of the combinations of iterable.

more_itertools.permutation_index(element, iterable)[source]¶

Equivalent to list(permutations(iterable, r)).index(element)

The subsequences of iterable that are of length r where order is important can be ordered lexicographically. permutation_index() computes the index of the first element directly, without computing the previous permutations.

>>> permutation_index([1, 3, 2], range(5))
19

ValueError will be raised if the given element isn’t one of the permutations of iterable.

more_itertools.combination_with_replacement_index(element, iterable)[source]¶

Equivalent to list(combinations_with_replacement(iterable, r)).index(element)

The subsequences with repetition of iterable that are of length r can be ordered lexicographically. combination_with_replacement_index() computes the index of the first element, without computing the previous combinations with replacement.

>>> combination_with_replacement_index('adf', 'abcdefg')
20

ValueError will be raised if the given element isn’t one of the combinations with replacement of iterable.

more_itertools.derangements(iterable, r=None)[source]¶

Yield successive derangements of the elements in iterable.

A derangement is a permutation in which no element appears at its original index. In other words, a derangement is a permutation that has no fixed points.

Suppose Alice, Bob, Carol, and Dave are playing Secret Santa. The code below outputs all of the different ways to assign gift recipients such that nobody is assigned to himself or herself:

>>> for d in derangements(['Alice', 'Bob', 'Carol', 'Dave']):
...    print(', '.join(d))
Bob, Alice, Dave, Carol
Bob, Carol, Dave, Alice
Bob, Dave, Alice, Carol
Carol, Alice, Dave, Bob
Carol, Dave, Alice, Bob
Carol, Dave, Bob, Alice
Dave, Alice, Bob, Carol
Dave, Carol, Alice, Bob
Dave, Carol, Bob, Alice

If r is given, only the r-length derangements are yielded.

>>> sorted(derangements(range(3), 2))
[(1, 0), (1, 2), (2, 0)]
>>> sorted(derangements([0, 2, 3], 2))
[(2, 0), (2, 3), (3, 0)]

Elements are treated as unique based on their position, not on their value.

Consider the Secret Santa example with two different people who have the same name. Then there are two valid gift assignments even though it might appear that a person is assigned to themselves:

>>> names = ['Alice', 'Bob', 'Bob']
>>> list(derangements(names))
[('Bob', 'Bob', 'Alice'), ('Bob', 'Alice', 'Bob')]

To avoid confusion, make the inputs distinct:

>>> deduped = [f'{name}{index}' for index, name in enumerate(names)]
>>> list(derangements(deduped))
[('Bob1', 'Bob2', 'Alice0'), ('Bob2', 'Alice0', 'Bob1')]

The number of derangements of a set of size n is known as the “subfactorial of n”. For n > 0, the subfactorial is: round(math.factorial(n) / math.e). The more-itertools function subfactorial() computes this directly.
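
The rounding formula can be checked against a brute-force count for small n. The helper names below are hypothetical, and the comparison uses only the standard library, so it is independent of subfactorial().

```python
from itertools import permutations
from math import e, factorial


def count_derangements(n):
    """Brute-force count of permutations of range(n) with no fixed point."""
    return sum(
        all(value != index for index, value in enumerate(perm))
        for perm in permutations(range(n))
    )


def subfactorial_estimate(n):
    # The formula quoted above, valid for n > 0.
    return round(factorial(n) / e)


counts = [count_derangements(n) for n in range(1, 8)]
estimates = [subfactorial_estimate(n) for n in range(1, 8)]
```

For n = 4 both approaches give 9, which matches the nine Secret Santa assignments listed earlier.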

more_itertools.gray_product(*iterables, repeat=1)[source]¶

Like itertools.product(), but return tuples in an order such that only one element in the generated tuple changes from one iteration to the next.

>>> list(gray_product('AB','CD'))
[('A', 'C'), ('B', 'C'), ('B', 'D'), ('A', 'D')]

The repeat keyword argument specifies the number of repetitions of the iterables. For example, gray_product('AB', repeat=3) is equivalent to gray_product('AB', 'AB', 'AB').

This function consumes all of the input iterables before producing output. If any of the input iterables have fewer than two items, ValueError is raised.

For information on the algorithm, see this section of Donald Knuth’s The Art of Computer Programming.

more_itertools.outer_product(func, xs, ys, *args, **kwargs)[source]¶

A generalized outer product that applies a binary function to all pairs of items. Returns a 2D matrix with len(xs) rows and len(ys) columns. Also accepts *args and **kwargs that are passed to func.

Multiplication table:

>>> from operator import mul
>>> list(outer_product(mul, range(1, 4), range(1, 6)))
[(1, 2, 3, 4, 5), (2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]

Cross tabulation:

>>> from collections import Counter
>>> xs = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
>>> ys = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z']
>>> pair_counts = Counter(zip(xs, ys))
>>> count_rows = lambda x, y: pair_counts[x, y]
>>> list(outer_product(count_rows, sorted(set(xs)), sorted(set(ys))))
[(2, 3, 0), (1, 0, 4)]

Usage with *args and **kwargs:

>>> animals = ['cat', 'wolf', 'mouse']
>>> list(outer_product(min, animals, animals, key=len))
[('cat', 'cat', 'cat'), ('cat', 'wolf', 'wolf'), ('cat', 'wolf', 'mouse')]

more_itertools.powerset_of_sets(iterable, *, baseset=<class 'set'>)[source]¶

Yields all possible subsets of the iterable.

>>> list(powerset_of_sets([1, 2, 3]))
[set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
>>> list(powerset_of_sets([1, 1, 0]))
[set(), {1}, {0}, {0, 1}]

powerset_of_sets() takes care to minimize the number of hash operations performed.

The baseset parameter determines what kind of sets are constructed, either set or frozenset.

Itertools recipes

more_itertools.powerset(iterable)[source]¶

Yields all possible subsets of the iterable.

>>> list(powerset([1, 2, 3]))
[(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]

powerset() will operate on iterables that aren’t set instances, so repeated elements in the input will produce repeated elements in the output.

>>> seq = [1, 1, 0]
>>> list(powerset(seq))
[(), (1,), (1,), (0,), (1, 1), (1, 0), (1, 0), (1, 1, 0)]

For a variant that efficiently yields actual set instances, see powerset_of_sets().
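
The behavior described above can be sketched with the standard library alone by chaining combinations of every length. The name powerset_sketch is hypothetical and used only for illustration.

```python
from itertools import chain, combinations


def powerset_sketch(iterable):
    """Yield every subset of the input as a tuple, smallest subsets first."""
    items = list(iterable)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


subsets = list(powerset_sketch([1, 2, 3]))
repeated = list(powerset_sketch([1, 1, 0]))  # Duplicates are kept: 8 tuples.
```

Since combinations() works by position rather than by value, repeated input elements lead to repeated output tuples, as noted above.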

more_itertools.random_product(*iterables, repeat=1)[source]¶

Draw an item at random from each of the input iterables.

>>> random_product('abc', range(4), 'XYZ')
('c', 3, 'Z')

If repeat is provided as a keyword argument, that many items will be drawn from each iterable.

>>> random_product('abcd', range(4), repeat=2)
('a', 2, 'd', 3)

This is equivalent to taking a random selection from itertools.product(*iterables, repeat=repeat).

more_itertools.random_permutation(iterable, r=None)[source]¶

Return a random r length permutation of the elements in iterable.

If r is not specified or is None, then r defaults to the length of iterable.

>>> random_permutation(range(5))
(3, 4, 0, 1, 2)

This is equivalent to taking a random selection from itertools.permutations(iterable, r).

more_itertools.random_combination(iterable, r)[source]¶

Return a random r length subsequence of the elements in iterable.

>>> random_combination(range(5), 3)
(2, 3, 4)

This is equivalent to taking a random selection from itertools.combinations(iterable, r).

more_itertools.random_combination_with_replacement(iterable, r)[source]¶

Return a random r length subsequence of elements in iterable, allowing individual elements to be repeated.

>>> random_combination_with_replacement(range(3), 5)
(0, 0, 1, 2, 2)

This is equivalent to taking a random selection from itertools.combinations_with_replacement(iterable, r).

more_itertools.random_derangement(iterable)[source]¶

Return a random derangement of elements in the iterable.

Equivalent to but much faster than choice(list(derangements(iterable))).

more_itertools.nth_product(index, *iterables, repeat=1)[source]¶

Equivalent to list(product(*iterables, repeat=repeat))[index].

The products of iterables can be ordered lexicographically. nth_product() computes the product at sort position index without computing the previous products.

>>> nth_product(8, range(2), range(2), range(2), range(2))
(1, 0, 0, 0)

The repeat keyword argument specifies the number of repetitions of the iterables. The above example is equivalent to:

>>> nth_product(8, range(2), repeat=4)
(1, 0, 0, 0)

IndexError will be raised if the given index is invalid.
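
One way to reach a product directly is to treat index as a mixed-radix number whose digits select one item from each pool. The sketch below shows that idea under assumptions: nth_product_sketch is a hypothetical name, the repeat argument is left out, and this is not presented as the library's implementation.

```python
import math
from itertools import product


def nth_product_sketch(index, *iterables):
    """Return the product at position index without generating earlier ones."""
    pools = [tuple(iterable) for iterable in iterables]
    total = math.prod(len(pool) for pool in pools)
    if not -total <= index < total:
        raise IndexError('product index out of range')
    if index < 0:
        index += total
    picked = []
    # Peel off one mixed-radix "digit" per pool, least significant first.
    for pool in reversed(pools):
        index, digit = divmod(index, len(pool))
        picked.append(pool[digit])
    return tuple(reversed(picked))


pools = [range(2)] * 4
eighth = nth_product_sketch(8, *pools)
```

With four pools of size two, index 8 is binary 1000, so the sketch picks item 1 from the first pool and item 0 from the others.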

more_itertools.nth_permutation(iterable, r, index)[source]¶

Equivalent to list(permutations(iterable, r))[index].

The subsequences of iterable that are of length r where order is important can be ordered lexicographically. nth_permutation() computes the subsequence at sort position index directly, without computing the previous subsequences.

>>> nth_permutation('ghijk', 2, 5)
('h', 'i')

ValueError will be raised if r is negative. IndexError will be raised if the given index is invalid.

more_itertools.nth_combination(iterable, r, index)[source]¶

Equivalent to list(combinations(iterable, r))[index].

The subsequences of iterable that are of length r can be ordered lexicographically. nth_combination() computes the subsequence at sort position index directly, without computing the previous subsequences.

>>> nth_combination(range(5), 3, 5)
(0, 3, 4)

ValueError will be raised if r is negative. IndexError will be raised if the given index is invalid.
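
The direct computation can be sketched by counting, for each slot, how many combinations begin with each candidate element and skipping whole blocks until the index falls inside one. This is a minimal illustration: nth_combination_sketch is a hypothetical name, and the library may use a different but equivalent procedure.

```python
import math
from itertools import combinations


def nth_combination_sketch(iterable, r, index):
    """Return combinations(iterable, r)[index] without enumerating them."""
    pool = tuple(iterable)
    n = len(pool)
    if r < 0:
        raise ValueError('r must be non-negative')
    total = math.comb(n, r)
    if not -total <= index < total:
        raise IndexError('combination index out of range')
    if index < 0:
        index += total
    result = []
    start = 0
    for remaining in range(r, 0, -1):
        for i in range(start, n):
            # Number of combinations that put pool[i] in the current slot.
            block = math.comb(n - i - 1, remaining - 1)
            if index < block:
                result.append(pool[i])
                start = i + 1
                break
            index -= block  # Skip the whole block and try the next element.
    return tuple(result)


fifth = nth_combination_sketch(range(5), 3, 5)
```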

Wrapping¶

These tools provide wrappers to smooth working with objects that produce or consume iterables.

New itertools

more_itertools.always_iterable(obj, base_type=(<class 'str'>, <class 'bytes'>))[source]¶

If obj is iterable, return an iterator over its items:

>>> obj = (1, 2, 3)
>>> list(always_iterable(obj))
[1, 2, 3]

If obj is not iterable, return a one-item iterable containing obj:

>>> obj = 1
>>> list(always_iterable(obj))
[1]

If obj is None, return an empty iterable:

>>> obj = None
>>> list(always_iterable(obj))
[]

By default, binary and text strings are not considered iterable:

>>> obj = 'foo'
>>> list(always_iterable(obj))
['foo']

If base_type is set, objects for which isinstance(obj, base_type) returns True won’t be considered iterable.

>>> obj = {'a': 1}
>>> list(always_iterable(obj))  # Iterate over the dict's keys
['a']
>>> list(always_iterable(obj, base_type=dict))  # Treat dicts as a unit
[{'a': 1}]

Set base_type to None to avoid any special handling and treat objects Python considers iterable as iterable:

>>> obj = 'foo'
>>> list(always_iterable(obj, base_type=None))
['f', 'o', 'o']
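
The rules above can be summarized in a few lines of standard-library Python. This is a sketch that mirrors the documented behavior; always_iterable_sketch is a hypothetical name, not the library function.

```python
def always_iterable_sketch(obj, base_type=(str, bytes)):
    """Return an iterator over obj, wrapping non-iterables in a 1-tuple."""
    if obj is None:
        return iter(())
    # Instances of base_type are treated as single items, not as containers.
    if base_type is not None and isinstance(obj, base_type):
        return iter((obj,))
    try:
        return iter(obj)
    except TypeError:
        # obj does not support iteration, so wrap it.
        return iter((obj,))


as_unit = list(always_iterable_sketch({'a': 1}, base_type=dict))
as_chars = list(always_iterable_sketch('foo', base_type=None))
```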

more_itertools.always_reversible(iterable)[source]¶

An extension of reversed() that supports all iterables, not just those which implement the Reversible or Sequence protocols.

>>> print(*always_reversible(x for x in range(3)))
2 1 0

If the iterable is already reversible, this function returns the result of reversed(). If the iterable is not reversible, this function will cache the remaining items in the iterable and yield them in reverse order, which may require significant storage.

class more_itertools.callback_iter(func, callback_kwd='callback', wait_seconds=0.1)[source]¶

Convert a function that uses callbacks to an iterator.

Deprecated since version 11.0.0: Will be removed in a future major release.

Let func be a function that takes a callback keyword argument. For example:

>>> def func(callback=None):
...     for i, c in [(1, 'a'), (2, 'b'), (3, 'c')]:
...         if callback:
...             callback(i, c)
...     return 4

Use with callback_iter(func) to get an iterator over the parameters that are delivered to the callback.

>>> with callback_iter(func) as it:
...     for args, kwargs in it:
...         print(args)
(1, 'a')
(2, 'b')
(3, 'c')

The function will be called in a background thread. The done property indicates whether it has completed execution.

>>> it.done
True

If it completes successfully, its return value will be available in the result property.

>>> it.result
4

Notes:

If the function uses some keyword argument besides callback, supply callback_kwd.
If it finished executing, but raised an exception, accessing the result property will raise the same exception.
If it hasn’t finished executing, accessing the result property from within the with block will raise RuntimeError.
If it hasn’t finished executing, accessing the result property from outside the with block will raise a more_itertools.AbortThread exception.
Provide wait_seconds to adjust how frequently it is polled for output.

class more_itertools.countable(iterable)[source]¶

Wrap iterable and keep a count of how many items have been consumed.

The items_seen attribute starts at 0 and increments as the iterable is consumed:

>>> iterable = map(str, range(10))
>>> it = countable(iterable)
>>> it.items_seen
0
>>> next(it), next(it)
('0', '1')
>>> list(it)
['2', '3', '4', '5', '6', '7', '8', '9']
>>> it.items_seen
10

more_itertools.consumer(func)[source]¶

Decorator that automatically advances a PEP-342-style “reverse iterator” to its first yield point so you don’t have to call next() on it manually.

>>> @consumer
... def tally():
...     i = 0
...     while True:
...         print('Thing number %s is %s.' % (i, (yield)))
...         i += 1
...
>>> t = tally()
>>> t.send('red')
Thing number 0 is red.
>>> t.send('fish')
Thing number 1 is fish.

Without the decorator, you would have to call next(t) before t.send() could be used.
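
A decorator of this kind only needs to create the generator and advance it once. The following sketch shows the idea; consumer_sketch and collector are hypothetical names used for illustration.

```python
from functools import wraps


def consumer_sketch(func):
    """Decorator that primes a generator so send() works immediately."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # Advance to the first yield point.
        return gen
    return wrapper


@consumer_sketch
def collector(received):
    """Append every value sent in to the received list."""
    while True:
        received.append((yield))


received = []
sink = collector(received)
sink.send('red')   # No priming next(sink) call is needed.
sink.send('fish')
```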

class more_itertools.sized_iterator(iterable, length)[source]¶

Wrapper for iterable that implements __len__.

>>> it = map(str, range(5))
>>> sized_it = sized_iterator(it, 5)
>>> len(sized_it)
5
>>> list(sized_it)
['0', '1', '2', '3', '4']

This is useful for tools that use len(), like tqdm.

The wrapper doesn’t validate the provided length, so be sure to choose a value that reflects reality.

more_itertools.with_iter(context_manager)[source]¶

Wrap an iterable in a with statement, so it closes once exhausted.

For example, this will close the file when the iterator is exhausted:

upper_lines = (line.upper() for line in with_iter(open('foo')))

Note that you have to actually exhaust the iterator for opened files to be closed.

Any context manager which returns an iterable is a candidate for with_iter.
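
The mechanism can be sketched as a generator that enters the context, yields every item, and leaves the context when iteration finishes. In this illustration with_iter_sketch is a hypothetical name, and an in-memory StringIO stands in for an opened file.

```python
from io import StringIO


def with_iter_sketch(context_manager):
    """Yield items from a context manager's iterable, then exit the context."""
    with context_manager as iterable:
        yield from iterable


buffer = StringIO('alpha\nbeta\n')  # Stands in for an opened file.
upper_lines = [line.upper() for line in with_iter_sketch(buffer)]
was_closed = buffer.closed  # True once the generator has been exhausted.
```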

Itertools recipes

more_itertools.iter_except(function, exception, first=None)[source]¶

Yields results from a function repeatedly until an exception is raised.

Converts a call-until-exception interface to an iterator interface. Like iter(function, sentinel), but uses an exception instead of a sentinel to end the loop.

>>> l = [0, 1, 2]
>>> list(iter_except(l.pop, IndexError))
[2, 1, 0]

Multiple exceptions can be specified as a stopping condition:

>>> l = [1, 2, 3, '...', 4, 5, 6]
>>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError)))
[7, 6, 5]
>>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError)))
[4, 3, 2]
>>> list(iter_except(lambda: 1 + l.pop(), (IndexError, TypeError)))
[]

Math¶

These tools work with most numeric data types.

New itertools

more_itertools.dft(xarr)[source]¶

Discrete Fourier Transform. xarr is a sequence of complex numbers. Yields the components of the corresponding transformed output vector.

>>> import cmath
>>> xarr = [1, 2-1j, -1j, -1+2j]  # time domain
>>> Xarr = [2, -2-2j, -2j, 4+4j]  # frequency domain
>>> magnitudes, phases = zip(*map(cmath.polar, Xarr))
>>> all(map(cmath.isclose, dft(xarr), Xarr))
True

Inputs are restricted to numeric types that can add and multiply with a complex number. This includes int, float, complex, and Fraction, but excludes Decimal.

See idft() for the inverse Discrete Fourier Transform.
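
For reference, the transform can be evaluated straight from its definition, X[k] = Σ x[i]·exp(-2πj·k·i/n). The sketch below does so in O(n²) time with the standard library; dft_sketch is a hypothetical name and no claim is made that the library computes the result this way.

```python
import cmath


def dft_sketch(xarr):
    """Direct O(n**2) evaluation of the DFT definition."""
    n = len(xarr)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(xarr))
        for k in range(n)
    ]


xarr = [1, 2-1j, -1j, -1+2j]   # time domain
Xarr = [2, -2-2j, -2j, 4+4j]   # frequency domain
matches = all(map(cmath.isclose, dft_sketch(xarr), Xarr))
```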

more_itertools.idft(Xarr)[source]¶

Inverse Discrete Fourier Transform. Xarr is a sequence of complex numbers. Yields the components of the corresponding inverse-transformed output vector.

>>> import cmath
>>> xarr = [1, 2-1j, -1j, -1+2j]  # time domain
>>> Xarr = [2, -2-2j, -2j, 4+4j]  # frequency domain
>>> all(map(cmath.isclose, idft(Xarr), xarr))
True

Inputs are restricted to numeric types that can add and multiply with a complex number. This includes int, float, complex, and Fraction, but excludes Decimal.

See dft() for the Discrete Fourier Transform.

Itertools recipes

more_itertools.convolve(signal, kernel)[source]¶

Discrete linear convolution of two iterables. Equivalent to polynomial multiplication.

For example, multiplying (x² - x - 20) by (x - 3) gives (x³ - 4x² - 17x + 60).

>>> list(convolve([1, -1, -20], [1, -3]))
[1, -4, -17, 60]

Examples of popular kinds of kernels:

The kernel [0.25, 0.25, 0.25, 0.25] computes a moving average. For image data, this blurs the image and reduces noise.
The kernel [1/2, 0, -1/2] estimates the first derivative of a function evaluated at evenly spaced inputs.
The kernel [1, -2, 1] estimates the second derivative of a function evaluated at evenly spaced inputs.

Convolutions are mathematically commutative; however, the inputs are evaluated differently. The signal is consumed lazily and can be infinite. The kernel is fully consumed before the calculations begin.

Supports all numeric types: int, float, complex, Decimal, Fraction.

References:

Article: https://betterexplained.com/articles/intuitive-convolution/
Video by 3Blue1Brown: https://www.youtube.com/watch?v=KuXjwB4LzSA
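
The asymmetry between the two inputs can be seen in a small sketch: the kernel is materialized up front, while the signal is streamed through a fixed-size window. The name convolve_sketch is hypothetical, and this is one possible arrangement rather than the library's code.

```python
from collections import deque
from itertools import chain, repeat


def convolve_sketch(signal, kernel):
    """Lazily yield the linear convolution of signal with kernel."""
    kernel = tuple(kernel)[::-1]      # The kernel is fully consumed here.
    n = len(kernel)
    window = deque([0] * n, maxlen=n)  # Starts zero-padded on the left.
    # Pad the right side with n - 1 zeros so the tail of the signal is covered.
    for x in chain(signal, repeat(0, n - 1)):
        window.append(x)
        yield sum(k * w for k, w in zip(kernel, window))


poly_product = list(convolve_sketch([1, -1, -20], [1, -3]))
```

Because the signal is only read one item at a time, it may be an infinite iterator as long as the caller stops consuming at some point.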

more_itertools.dotproduct(vec1, vec2)[source]¶

Returns the dot product of the two iterables.

>>> dotproduct([10, 15, 12], [0.65, 0.80, 1.25])
33.5
>>> 10 * 0.65 + 15 * 0.80 + 12 * 1.25
33.5

In Python 3.12 and later, use math.sumprod() instead.

more_itertools.matmul(m1, m2)[source]¶

Multiply two matrices.

>>> list(matmul([(7, 5), (3, 5)], [(2, 5), (7, 9)]))
[(49, 80), (41, 60)]

The caller should ensure that the dimensions of the input matrices are compatible with each other.

Supports all numeric types: int, float, complex, Decimal, Fraction.

more_itertools.polynomial_from_roots(roots)[source]¶

Compute a polynomial’s coefficients from its roots.

>>> roots = [5, -4, 3]            # (x - 5) * (x + 4) * (x - 3)
>>> polynomial_from_roots(roots)  # x³ - 4 x² - 17 x + 60
[1, -4, -17, 60]

Note that polynomial coefficients are specified in descending power order.

Supports all numeric types: int, float, complex, Decimal, Fraction.
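
The coefficients can be built up by multiplying in one (x - r) factor at a time. A minimal sketch, with the hypothetical name polynomial_from_roots_sketch:

```python
def polynomial_from_roots_sketch(roots):
    """Expand (x - r1)(x - r2)... into descending-power coefficients."""
    coefficients = [1]
    for root in roots:
        # Multiplying by (x - root): shift left for the x term, then subtract
        # root times the unshifted coefficients.
        shifted = coefficients + [0]
        scaled = [0] + coefficients
        coefficients = [a - root * b for a, b in zip(shifted, scaled)]
    return coefficients


coefficients = polynomial_from_roots_sketch([5, -4, 3])
```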

more_itertools.polynomial_derivative(coefficients)[source]¶

Compute the first derivative of a polynomial.

Evaluate the derivative of x³ - 4 x² - 17 x + 60:

>>> coefficients = [1, -4, -17, 60]
>>> derivative_coefficients = polynomial_derivative(coefficients)
>>> derivative_coefficients
[3, -8, -17]

Note that polynomial coefficients are specified in descending power order.

Supports all numeric types: int, float, complex, Decimal, Fraction.

more_itertools.polynomial_eval(coefficients, x)[source]¶

Evaluate a polynomial at a specific value.

Computes with better numeric stability than Horner’s method.

Evaluate x³ - 4 x² - 17 x + 60 at x = 2.5:

>>> coefficients = [1, -4, -17, 60]
>>> x = 2.5
>>> polynomial_eval(coefficients, x)
8.125

Note that polynomial coefficients are specified in descending power order.

Supports all numeric types: int, float, complex, Decimal, Fraction.

more_itertools.sum_of_squares(iterable)[source]¶

Return the sum of the squares of the input values.

>>> sum_of_squares([10, 20, 30])
1400

Supports all numeric types: int, float, complex, Decimal, Fraction.

Integer Math¶

New itertools

These tools focus on math with integers.

more_itertools.nth_prime(n, *, approximate=False)[source]¶

Return the nth prime (counting from 0).

>>> nth_prime(0)
2
>>> nth_prime(100)
547

If approximate is set to True, a prime close to the nth prime is returned instead. The estimation is much faster than computing an exact result.

>>> nth_prime(200_000_000, approximate=True)  # Exact result is 4222234763
4217820427

more_itertools.subfactorial(n)[source]¶

Number of permutations of n elements with no fixed points.

The subfactorial() function computes the length of derangements(). For example, there are 1,854 ways to rearrange the letters in the word “epsilon” without leaving any letter in its original position:

>>> from more_itertools import derangements, ilen
>>> ilen(derangements('epsilon'))
1854
>>> subfactorial(len('epsilon'))
1854

Reference: https://oeis.org/A000166

Itertools recipes

more_itertools.factor(n)[source]¶

Yield the prime factors of n.

>>> list(factor(360))
[2, 2, 2, 3, 3, 5]

Finds small factors with trial division. Larger factors are either verified as prime with is_prime or split into smaller factors with Pollard’s rho algorithm.

more_itertools.is_prime(n)[source]¶

Return True if n is prime and False otherwise.

Basic examples:

>>> is_prime(37)
True
>>> is_prime(3 * 13)
False
>>> is_prime(18_446_744_073_709_551_557)
True

Find the next prime over one billion:

>>> from itertools import count
>>> next(filter(is_prime, count(10**9)))
1000000007

Generate random primes up to 200 bits and up to 60 decimal digits:

>>> from itertools import repeat
>>> from random import seed, randrange, getrandbits
>>> seed(18675309)

>>> next(filter(is_prime, map(getrandbits, repeat(200))))
893303929355758292373272075469392561129886005037663238028407

>>> next(filter(is_prime, map(randrange, repeat(10**60))))
269638077304026462407872868003560484232362454342414618963649

This function is exact for values of n below 10**24. For larger inputs, the probabilistic Miller-Rabin primality test has a less than 1 in 2**128 chance of a false positive.

more_itertools.multinomial(*counts)[source]¶

Number of distinct arrangements of a multiset.

The expression multinomial(3, 4, 2) has several equivalent interpretations:

In the expansion of (a + b + c)⁹, the coefficient of the a³b⁴c² term is 1260.
There are 1260 distinct ways to arrange 9 balls consisting of 3 reds, 4 greens, and 2 blues.
There are 1260 unique ways to place 9 distinct objects into three bins with sizes 3, 4, and 2.

The multinomial() function computes the length of distinct_permutations(). For example, there are 83,160 distinct anagrams of the word “abracadabra”:

>>> from more_itertools import distinct_permutations, ilen
>>> ilen(distinct_permutations('abracadabra'))
83160

This can be computed directly from the letter counts, 5a 2b 2r 1c 1d:

>>> from collections import Counter
>>> list(Counter('abracadabra').values())
[5, 2, 2, 1, 1]
>>> multinomial(5, 2, 2, 1, 1)
83160

A binomial coefficient is a special case of multinomial where there are only two categories. For example, the number of ways to arrange 12 balls with 5 reds and 7 blues is multinomial(5, 7) or math.comb(12, 5).

Likewise, factorial is a special case of multinomial where the multiplicities are all just 1 so that multinomial(1, 1, 1, 1, 1, 1, 1) == math.factorial(7).

Reference: https://en.wikipedia.org/wiki/Multinomial_theorem
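
The link to binomial coefficients suggests one way to compute the value: choose positions for one category at a time and multiply the counts. A short sketch, using the hypothetical name multinomial_sketch:

```python
import math


def multinomial_sketch(*counts):
    """Number of distinct arrangements of a multiset with the given counts."""
    total = 0
    result = 1
    for count in counts:
        total += count
        # Ways to choose which of the positions so far hold this category.
        result *= math.comb(total, count)
    return result


arrangements = multinomial_sketch(3, 4, 2)
anagrams = multinomial_sketch(5, 2, 2, 1, 1)
```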

more_itertools.sieve(n)[source]¶

Yield the primes less than n.

>>> list(sieve(30))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
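
For readers unfamiliar with the technique, a plain Sieve of Eratosthenes can be sketched as follows. The name sieve_sketch is hypothetical, the sketch returns a list rather than yielding lazily, and the library's implementation may be organized differently.

```python
import math


def sieve_sketch(n):
    """Return a list of the primes less than n (Sieve of Eratosthenes)."""
    if n < 3:
        return []
    is_candidate = bytearray([1]) * n
    is_candidate[0] = is_candidate[1] = 0
    for p in range(2, math.isqrt(n - 1) + 1):
        if is_candidate[p]:
            # Cross out multiples of p, starting at p * p.
            multiples = range(p * p, n, p)
            is_candidate[p * p::p] = bytes(len(multiples))
    return [i for i, flag in enumerate(is_candidate) if flag]


small_primes = sieve_sketch(30)
```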

more_itertools.totient(n)[source]¶

Return the count of natural numbers up to n that are coprime with n.

Euler’s totient function φ(n) gives the number of totatives. Totatives are integers k in the range 1 ≤ k ≤ n such that gcd(n, k) = 1.

>>> n = 9
>>> totient(n)
6

>>> from math import gcd
>>> totatives = [x for x in range(1, n + 1) if gcd(n, x) == 1]
>>> totatives
[1, 2, 4, 5, 7, 8]
>>> len(totatives)
6

Reference: https://en.wikipedia.org/wiki/Euler%27s_totient_function
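
Counting totatives one by one is slow for large n. A common alternative uses the product formula φ(n) = n · Π(1 - 1/p) over the distinct prime factors p of n. The sketch below applies it with simple trial division; totient_sketch is a hypothetical name, and the factoring strategy is an assumption chosen for brevity.

```python
def totient_sketch(n):
    """Euler's totient via the product formula over distinct prime factors."""
    result = n
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            # Remove every copy of p, then apply the factor (1 - 1/p).
            while remaining % p == 0:
                remaining //= p
            result -= result // p
        p += 1
    if remaining > 1:
        # Whatever is left is a single prime factor larger than sqrt(n).
        result -= result // remaining
    return result


phi_9 = totient_sketch(9)
```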

Statistics¶

Itertools recipes

These tools focus on simple statistics.

more_itertools.running_min(iterable, *, maxlen=None)[source]¶

Smallest of values seen so far or values in a sliding window.

Set maxlen to a positive integer to specify the maximum size of the sliding window. The default of None is equivalent to an unbounded window.

For example:

>>> list(running_min([4, 3, 7, 0, 8, 1, 6, 2, 9, 5]))
[4, 3, 3, 0, 0, 0, 0, 0, 0, 0]

>>> list(running_min([4, 3, 7, 0, 8, 1, 6, 2, 9, 5], maxlen=3))
[4, 3, 3, 0, 0, 0, 1, 1, 2, 2]

Supports numeric types such as int, float, Decimal, and Fraction, but not complex numbers which are unorderable.

more_itertools.running_max(iterable, *, maxlen=None)[source]¶

Largest of values seen so far or values in a sliding window.

Set maxlen to a positive integer to specify the maximum size of the sliding window. The default of None is equivalent to an unbounded window.

For example:

>>> list(running_max([4, 3, 7, 0, 8, 1, 6, 2, 9, 5]))
[4, 4, 7, 7, 8, 8, 8, 8, 9, 9]

>>> list(running_max([4, 3, 7, 0, 8, 1, 6, 2, 9, 5], maxlen=3))
[4, 4, 7, 7, 8, 8, 8, 6, 9, 9]

Supports numeric types such as int, float, Decimal, and Fraction, but not complex numbers which are unorderable.

more_itertools.running_mean(iterable, *, maxlen=None)[source]¶

Cumulative mean of values seen so far or values in a sliding window.

Set maxlen to a positive integer to specify the maximum size of the sliding window. The default of None is equivalent to an unbounded window.

For example:

>>> list(running_mean([40, 30, 50, 46, 39, 44]))
[40.0, 35.0, 40.0, 41.5, 41.0, 41.5]

>>> list(running_mean([40, 30, 50, 46, 39, 44], maxlen=3))
[40.0, 35.0, 40.0, 42.0, 45.0, 43.0]

Supports numeric types such as int, float, complex, Decimal, and Fraction.

No extra effort is made to reduce round-off errors for float inputs. So the results may be slightly different from statistics.mean.
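
The two modes, unbounded and sliding window, can be sketched with a running total and a deque that remembers which value to drop. This is an illustration only: running_mean_sketch is a hypothetical name, and like the description above it makes no attempt to reduce round-off error.

```python
from collections import deque


def running_mean_sketch(iterable, maxlen=None):
    """Yield the mean of all values so far, or of the last maxlen values."""
    window = deque()  # Only used when a window size is given.
    total = 0
    count = 0
    for value in iterable:
        total += value
        count += 1
        if maxlen is not None:
            window.append(value)
            if count > maxlen:
                # Drop the oldest value so the window stays at maxlen items.
                total -= window.popleft()
                count -= 1
        yield total / count


data = [40, 30, 50, 46, 39, 44]
cumulative = list(running_mean_sketch(data))
windowed = list(running_mean_sketch(data, maxlen=3))
```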

more_itertools.running_median(iterable, *, maxlen=None)[source]¶

Cumulative median of values seen so far or values in a sliding window.

Set maxlen to a positive integer to specify the maximum size of the sliding window. The default of None is equivalent to an unbounded window.

For example:

>>> list(running_median([5.0, 9.0, 4.0, 12.0, 8.0, 9.0]))
[5.0, 7.0, 5.0, 7.0, 8.0, 8.5]
>>> list(running_median([5.0, 9.0, 4.0, 12.0, 8.0, 9.0], maxlen=3))
[5.0, 7.0, 5.0, 9.0, 8.0, 9.0]

Supports numeric types such as int, float, Decimal, and Fraction, but not complex numbers which are unorderable.

On Python 3.13 and earlier, max-heaps are simulated with negative values. The negation causes Decimal inputs to apply context rounding, making the results slightly different from those obtained by statistics.median().

more_itertools.running_statistics(iterable, *, maxlen=None)[source]¶

Statistics for values seen so far or values in a sliding window.

Set maxlen to a positive integer to specify the maximum size of the sliding window. The default of None is equivalent to an unbounded window.

Yields instances of a Stats dataclass with fields for the dataset size, minimum value, median value, maximum value, and the arithmetic mean.

Supports numeric types such as int, float, Decimal, and Fraction, but not complex numbers which are unorderable.

Others¶

New itertools

more_itertools.locate(iterable, pred=bool, window_size=None)[source]¶

Yield the index of each item in iterable for which pred returns True.

pred defaults to bool(), which will select truthy items:

>>> list(locate([0, 1, 1, 0, 1, 0, 0]))
[1, 2, 4]

Set pred to a custom function to, e.g., find the indexes for a particular item.

>>> list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b'))
[1, 3]

If window_size is given, then the pred function will be called with the values in each window. This enables searching for sub-sequences. Note that pred may receive fewer than window_size arguments at the end of the iterable.

>>> iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
>>> pred = lambda *args: args == (1, 2, 3)
>>> list(locate(iterable, pred=pred, window_size=3))
[1, 5, 9]

Use with seekable() to find indexes and then retrieve the associated items:

>>> from itertools import count
>>> from more_itertools import seekable
>>> source = (3 * n + 1 if (n % 2) else n // 2 for n in count())
>>> it = seekable(source)
>>> pred = lambda x: x > 100
>>> indexes = locate(it, pred=pred)
>>> i = next(indexes)
>>> it.seek(i)
>>> next(it)
106

more_itertools.rlocate(iterable, pred=bool, window_size=None)[source]¶

Yield the index of each item in iterable for which pred returns True, starting from the right and moving left.

pred defaults to bool(), which will select truthy items:

>>> list(rlocate([0, 1, 1, 0, 1, 0, 0]))  # Truthy at 1, 2, and 4
[4, 2, 1]

Set pred to a custom function to, e.g., find the indexes for a particular item:

>>> iterator = iter('abcb')
>>> pred = lambda x: x == 'b'
>>> list(rlocate(iterator, pred))
[3, 1]

If window_size is given, then the pred function will be called with that many items. This enables searching for sub-sequences:

>>> iterable = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
>>> pred = lambda *args: args == (1, 2, 3)
>>> list(rlocate(iterable, pred=pred, window_size=3))
[9, 5, 1]

Beware, this function won’t return anything for infinite iterables. If iterable is reversible, rlocate will reverse it and search from the right. Otherwise, it will search from the left and return the results in reverse order.

See locate() for other example applications.

more_itertools.replace(iterable, pred, substitutes, count=None, window_size=1)[source]¶

Yield the items from iterable, replacing the items for which pred returns True with the items from the iterable substitutes.

>>> iterable = [1, 1, 0, 1, 1, 0, 1, 1]
>>> pred = lambda x: x == 0
>>> substitutes = (2, 3)
>>> list(replace(iterable, pred, substitutes))
[1, 1, 2, 3, 1, 1, 2, 3, 1, 1]

If count is given, the number of replacements will be limited:

>>> iterable = [1, 1, 0, 1, 1, 0, 1, 1, 0]
>>> pred = lambda x: x == 0
>>> substitutes = [None]
>>> list(replace(iterable, pred, substitutes, count=2))
[1, 1, None, 1, 1, None, 1, 1, 0]

Use window_size to control the number of items passed as arguments to pred. This allows for locating and replacing subsequences.

>>> iterable = [0, 1, 2, 5, 0, 1, 2, 5]
>>> window_size = 3
>>> pred = lambda *args: args == (0, 1, 2)  # 3 items passed to pred
>>> substitutes = [3, 4] # Splice in these items
>>> list(replace(iterable, pred, substitutes, window_size=window_size))
[3, 4, 5, 3, 4, 5]

pred may receive fewer than window_size arguments at the end of the iterable and should be able to handle this.

class more_itertools.numeric_range(stop)[source]¶

class more_itertools.numeric_range(start, stop[, step])[source]

An extension of the built-in range() function whose arguments can be any orderable numeric type.

With only stop specified, start defaults to 0 and step defaults to 1. The output items will match the type of stop:

>>> list(numeric_range(3.5))
[0.0, 1.0, 2.0, 3.0]

With only start and stop specified, step defaults to 1. The output items will match the type of start:

>>> from decimal import Decimal
>>> start = Decimal('2.1')
>>> stop = Decimal('5.1')
>>> list(numeric_range(start, stop))
[Decimal('2.1'), Decimal('3.1'), Decimal('4.1')]

With start, stop, and step specified, the output items will match the type of start + step:

>>> from fractions import Fraction
>>> start = Fraction(1, 2)  # Start at 1/2
>>> stop = Fraction(5, 2)  # End at 5/2
>>> step = Fraction(1, 2)  # Count by 1/2
>>> list(numeric_range(start, stop, step))
[Fraction(1, 2), Fraction(1, 1), Fraction(3, 2), Fraction(2, 1)]

If step is zero, ValueError is raised. Negative steps are supported:

>>> list(numeric_range(3, -1, -1.0))
[3.0, 2.0, 1.0, 0.0]

Be aware of the limitations of floating-point numbers; the representation of the yielded numbers may be surprising.

datetime.datetime objects can be used for start and stop, if step is a datetime.timedelta object:

>>> import datetime
>>> start = datetime.datetime(2019, 1, 1)
>>> stop = datetime.datetime(2019, 1, 3)
>>> step = datetime.timedelta(days=1)
>>> items = iter(numeric_range(start, stop, step))
>>> next(items)
datetime.datetime(2019, 1, 1, 0, 0)
>>> next(items)
datetime.datetime(2019, 1, 2, 0, 0)

more_itertools.side_effect(func, iterable, chunk_size=None, before=None, after=None)[source]¶

Invoke func on each item in iterable (or on each chunk_size group of items) before yielding the item.

func must be a function that takes a single argument. Its return value will be discarded.

before and after are optional functions that take no arguments. They will be executed before iteration starts and after it ends, respectively.

side_effect can be used for logging, updating progress bars, or anything that is not functionally “pure.”

Emitting a status message:

>>> from more_itertools import consume
>>> func = lambda item: print('Received {}'.format(item))
>>> consume(side_effect(func, range(2)))
Received 0
Received 1

Operating on chunks of items:

>>> pair_sums = []
>>> func = lambda chunk: pair_sums.append(sum(chunk))
>>> list(side_effect(func, [0, 1, 2, 3, 4, 5], 2))
[0, 1, 2, 3, 4, 5]
>>> list(pair_sums)
[1, 5, 9]

Writing to a file-like object:

>>> from io import StringIO
>>> from more_itertools import consume
>>> f = StringIO()
>>> func = lambda x: print(x, file=f)
>>> before = lambda: print('HEADER', file=f)
>>> after = f.close
>>> it = ['a', 'b', 'c']
>>> consume(side_effect(func, it, before=before, after=after))
>>> f.closed
True
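
The ordering of before, func, and after can be made explicit with a small generator. This sketch leaves out chunk_size and assumes that after should run even if iteration stops early; side_effect_sketch is a hypothetical name.

```python
def side_effect_sketch(func, iterable, before=None, after=None):
    """Call func on each item before yielding it (chunk_size omitted)."""
    try:
        if before is not None:
            before()
        for item in iterable:
            func(item)  # The return value is deliberately discarded.
            yield item
    finally:
        # Assumption: run the cleanup hook even on early termination.
        if after is not None:
            after()


log = []
items = list(side_effect_sketch(
    log.append, 'abc',
    before=lambda: log.append('HEADER'),
    after=lambda: log.append('FOOTER'),
))
```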

more_itertools.iterate(func, start)[source]¶

Return start, func(start), func(func(start)), …

Produces an infinite iterator. To add a stopping condition, use take(), takewhile, or takewhile_inclusive().

>>> take(10, iterate(lambda x: 2*x, 1))
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

>>> collatz = lambda x: 3*x + 1 if x%2==1 else x // 2
>>> list(takewhile_inclusive(lambda x: x!=1, iterate(collatz, 10)))
[10, 5, 16, 8, 4, 2, 1]

more_itertools.difference(iterable, func=operator.sub, *, initial=None)[source]¶

This function is the inverse of itertools.accumulate(). By default it will compute the first difference of iterable using operator.sub():

>>> from itertools import accumulate
>>> iterable = accumulate([0, 1, 2, 3, 4])  # produces 0, 1, 3, 6, 10
>>> list(difference(iterable))
[0, 1, 2, 3, 4]

func defaults to operator.sub(), but other functions can be specified. They will be applied as follows:

A, B, C, D, ... --> A, func(B, A), func(C, B), func(D, C), ...

For example, to do progressive division:

>>> iterable = [1, 2, 6, 24, 120]
>>> func = lambda x, y: x // y
>>> list(difference(iterable, func))
[1, 2, 3, 4, 5]

If the initial keyword is set, the first element will be skipped when computing successive differences.

>>> it = [10, 11, 13, 16]  # from accumulate([1, 2, 3], initial=10)
>>> list(difference(it, initial=10))
[1, 2, 3]
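
The pairing A, func(B, A), func(C, B), ... can be sketched by splitting the input with tee() and advancing one copy by a single item. The name difference_sketch is hypothetical; the sketch is meant to mirror the documented behavior, not to reproduce the library's source.

```python
import operator
from itertools import accumulate, chain, tee


def difference_sketch(iterable, func=operator.sub, *, initial=None):
    """Undo accumulate(): yield A, func(B, A), func(C, B), ..."""
    lagging, leading = tee(iterable)
    try:
        first = [next(leading)]  # leading now runs one item ahead.
    except StopIteration:
        return iter([])
    if initial is not None:
        # The first element is the seed value, so it is not reported.
        first = []
    return chain(first, map(func, leading, lagging))


restored = list(difference_sketch(accumulate([0, 1, 2, 3, 4])))
without_seed = list(difference_sketch([10, 11, 13, 16], initial=10))
```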

more_itertools.make_decorator(wrapping_func, result_index=0)[source]¶

Return a decorator version of wrapping_func, which is a function that modifies an iterable. result_index is the position in that function’s signature where the iterable goes.

This lets you use itertools on the “production end,” i.e. at function definition. This can augment what the function returns without changing the function’s code.

For example, to produce a decorator version of chunked():

>>> from more_itertools import chunked
>>> chunker = make_decorator(chunked, result_index=0)
>>> @chunker(3)
... def iter_range(n):
...     return iter(range(n))
...
>>> list(iter_range(9))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]

To only allow truthy items to be returned:

>>> truth_serum = make_decorator(filter, result_index=1)
>>> @truth_serum(bool)
... def boolean_test():
...     return [0, 1, '', ' ', False, True]
...
>>> list(boolean_test())
[1, ' ', True]

The peekable() and seekable() wrappers make for practical decorators:

>>> from more_itertools import peekable
>>> peekable_function = make_decorator(peekable)
>>> @peekable_function()
... def str_range(*args):
...     return (str(x) for x in range(*args))
...
>>> it = str_range(1, 20, 2)
>>> next(it), next(it), next(it)
('1', '3', '5')
>>> it.peek()
'7'
>>> next(it)
'7'
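
How result_index is used becomes clearer when the decorator factory is written out. The following is a sketch under assumptions; make_decorator_sketch and the inner function names are hypothetical.

```python
from functools import wraps


def make_decorator_sketch(wrapping_func, result_index=0):
    """Build a decorator factory that post-processes a function's result."""
    def decorator(*wrapping_args, **wrapping_kwargs):
        def outer_wrapper(func):
            @wraps(func)
            def inner_wrapper(*args, **kwargs):
                result = func(*args, **kwargs)
                # Splice the function's return value into the argument list.
                call_args = list(wrapping_args)
                call_args.insert(result_index, result)
                return wrapping_func(*call_args, **wrapping_kwargs)
            return inner_wrapper
        return outer_wrapper
    return decorator


truth_serum = make_decorator_sketch(filter, result_index=1)


@truth_serum(bool)
def boolean_test():
    return [0, 1, '', ' ', False, True]


truthy_items = list(boolean_test())
```

Here the decorated function's return value lands at position 1, so the call becomes filter(bool, result).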

class more_itertools.SequenceView(target)[source]¶

Return a read-only view of the sequence object target.

SequenceView objects are analogous to Python’s built-in “dictionary view” types. They provide a dynamic view of a sequence’s items, meaning that when the sequence updates, so does the view.

>>> seq = ['0', '1', '2']
>>> view = SequenceView(seq)
>>> view
SequenceView(['0', '1', '2'])
>>> seq.append('3')
>>> view
SequenceView(['0', '1', '2', '3'])

Sequence views support indexing, slicing, and length queries. They act like the underlying sequence, except they don’t allow assignment:

>>> view[1]
'1'
>>> view[1:-1]
['1', '2']
>>> len(view)
4

Sequence views are useful as an alternative to copying, as they don’t require (much) extra storage.
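
A read-only view needs little more than delegation to the target. The sketch below builds one on collections.abc.Sequence, which supplies the remaining read-only methods; SequenceViewSketch is a hypothetical name for illustration.

```python
from collections.abc import Sequence


class SequenceViewSketch(Sequence):
    """Read-only, live view of an underlying sequence."""

    def __init__(self, target):
        if not isinstance(target, Sequence):
            raise TypeError('target must be a sequence')
        self._target = target  # Keep a reference, not a copy.

    def __getitem__(self, index):
        return self._target[index]

    def __len__(self):
        return len(self._target)

    def __repr__(self):
        return f'{type(self).__name__}({self._target!r})'


seq = ['0', '1', '2']
view = SequenceViewSketch(seq)
seq.append('3')  # The view reflects this change immediately.
```

Since the class defines no __setitem__, attempting view[0] = 'x' raises TypeError.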

class more_itertools.time_limited(limit_seconds, iterable)[source]¶

Yield items from iterable until limit_seconds have passed. If the time limit expires before all items have been yielded, the timed_out attribute will be set to True.

>>> from time import sleep
>>> def generator():
...     yield 1
...     yield 2
...     sleep(0.2)
...     yield 3
>>> iterable = time_limited(0.1, generator())
>>> list(iterable)
[1, 2]
>>> iterable.timed_out
True

Note that the time is checked before each item is yielded, and iteration stops if the time elapsed is greater than limit_seconds. If your time limit is 1 second, but it takes 2 seconds to generate the first item from the iterable, the function will run for 2 seconds and not yield anything. As a special case, when limit_seconds is zero, the iterator never returns anything.

more_itertools.map_if(iterable, pred, func, func_else=lambda x: ...)[source]¶

Evaluate each item from iterable using pred. If the result is truthy, transform the item with func and yield it. Otherwise, transform the item with func_else and yield it.

pred, func, and func_else should each be functions that accept one argument. By default, func_else is the identity function.

>>> from math import sqrt
>>> iterable = list(range(-5, 5))
>>> iterable
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> list(map_if(iterable, lambda x: x > 3, lambda x: 'toobig'))
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 'toobig']
>>> list(map_if(iterable, lambda x: x >= 0,
... lambda x: f'{sqrt(x):.2f}', lambda x: None))
[None, None, None, None, None, '0.00', '1.00', '1.41', '1.73', '2.00']
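
The branching described above is roughly equivalent to the generator below. map_if_sketch is a hypothetical name used for illustration; the library's actual implementation may differ.

```python
def map_if_sketch(iterable, pred, func, func_else=lambda x: x):
    """Hypothetical rough equivalent of map_if (illustration only)."""
    for item in iterable:
        # Apply func when pred(item) is truthy, func_else otherwise.
        yield func(item) if pred(item) else func_else(item)


result = list(map_if_sketch(range(-2, 3), lambda x: x > 0, lambda x: x * 10))
# result == [-2, -1, 0, 10, 20]
```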

more_itertools.doublestarmap(func, iterable)[source]¶

Apply func to every item of iterable by dictionary unpacking the item into func.

The difference between itertools.starmap() and doublestarmap() parallels the distinction between func(*a) and func(**a).

>>> iterable = [{'a': 1, 'b': 2}, {'a': 40, 'b': 60}]
>>> list(doublestarmap(lambda a, b: a + b, iterable))
[3, 100]

TypeError will be raised if func’s signature doesn’t match the mapping contained in iterable or if iterable does not contain mappings.
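
A rough equivalent makes the source of the TypeError clear: it comes from the func(**item) call itself. doublestarmap_sketch is a hypothetical name used for illustration, not the library's code.

```python
def doublestarmap_sketch(func, iterable):
    """Hypothetical rough equivalent of doublestarmap (illustration only)."""
    for item in iterable:
        yield func(**item)  # each item must be a mapping of keyword arguments


def add(a, b):
    return a + b


totals = list(doublestarmap_sketch(add, [{'a': 1, 'b': 2}, {'a': 40, 'b': 60}]))
# totals == [3, 100]

errors = []
for bad_input in ([{'a': 1, 'c': 2}], [5]):
    try:
        list(doublestarmap_sketch(add, bad_input))
    except TypeError:
        # Raised for a mismatched key ('c') and for a non-mapping item (5).
        errors.append(bad_input)
```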

Itertools recipes

more_itertools.iter_index(iterable, value, start=0, stop=None)[source]¶

Yield the index of each place in iterable where value occurs, beginning with index start and ending before index stop.

>>> list(iter_index('AABCADEAF', 'A'))
[0, 1, 4, 7]
>>> list(iter_index('AABCADEAF', 'A', 1))  # start index is inclusive
[1, 4, 7]
>>> list(iter_index('AABCADEAF', 'A', 1, 7))  # stop index is not inclusive
[1, 4]

The behavior for non-scalar value arguments matches the built-in Python types.

>>> list(iter_index('ABCDABCD', 'AB'))
[0, 4]
>>> list(iter_index([0, 1, 2, 3, 0, 1, 2, 3], [0, 1]))
[]
>>> list(iter_index([[0, 1], [2, 3], [0, 1], [2, 3]], [0, 1]))
[0, 2]
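
The three results above mirror what the built-in index methods do, which can be checked with the standard library alone: str.index searches for substrings, while list.index compares whole elements.

```python
text = 'ABCDABCD'
first = text.index('AB')       # 0: str.index matches substrings
second = text.index('AB', 1)   # 4: next match at or after index 1

flat = [0, 1, 2, 3, 0, 1, 2, 3]
try:
    flat.index([0, 1])         # no single element equals [0, 1]
except ValueError:
    found_in_flat = False
else:
    found_in_flat = True

nested = [[0, 1], [2, 3], [0, 1], [2, 3]]
nested_first = nested.index([0, 1])      # 0
nested_second = nested.index([0, 1], 1)  # 2
```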

For range objects (and other objects whose index method’s behavior doesn’t match that of list), wrap iterable with iter:

>>> list(iter_index(iter(range(5)), 2))
[2]

See locate() for a more general means of finding the indexes associated with particular values.

more_itertools.consume(iterator, n=None)[source]¶

Advance iterator by n steps. If n is None, consume it entirely.

Efficiently exhausts an iterator without returning values. Defaults to consuming the whole iterator, but an optional second argument may be provided to limit consumption.

>>> i = (x for x in range(10))
>>> next(i)
0
>>> consume(i, 3)
>>> next(i)
4
>>> consume(i)
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

If the iterator has fewer items remaining than the provided limit, the whole iterator will be consumed.

>>> i = (x for x in range(3))
>>> consume(i, 5)
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
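
One well-known way to get this behavior uses only the standard library, following the consume recipe in the itertools documentation. consume_sketch is a hypothetical name; the library's own implementation may differ in detail.

```python
from collections import deque
from itertools import islice


def consume_sketch(iterator, n=None):
    """Hypothetical version based on the itertools docs recipe."""
    if n is None:
        # A zero-length deque discards every item as it is fed in.
        deque(iterator, maxlen=0)
    else:
        # An empty slice starting at position n advances past n items.
        next(islice(iterator, n, n), None)


i = iter(range(10))
first = next(i)            # 0
consume_sketch(i, 3)       # skips 1, 2, 3
after_skip = next(i)       # 4

short = iter(range(3))
consume_sketch(short, 5)   # fewer than 5 items remain; all are consumed
leftover = next(short, 'exhausted')
```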

more_itertools.tabulate(function, start=0)[source]¶

Return an iterator over the results of function(start), function(start + 1), function(start + 2)…

function should be a function that accepts one integer argument.

If start is not specified it defaults to 0. It will be incremented each time the iterator is advanced.

>>> square = lambda x: x ** 2
>>> iterator = tabulate(square, -3)
>>> take(4, iterator)
[9, 4, 1, 0]
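
The sequence of calls described above is roughly what the following standard-library expression produces. tabulate_sketch is a hypothetical name based on the tabulate recipe in the itertools documentation, not necessarily the library's code.

```python
from itertools import count, islice


def tabulate_sketch(function, start=0):
    """Hypothetical version based on the itertools docs recipe."""
    # count(start) supplies start, start + 1, start + 2, ... lazily.
    return map(function, count(start))


squares = list(islice(tabulate_sketch(lambda x: x ** 2, -3), 4))
# squares == [9, 4, 1, 0]
```

The resulting iterator is infinite, so it should be bounded with something like islice() or take() before being turned into a list.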

more_itertools.repeatfunc(function, times=None, *args)[source]¶

Call function with args repeatedly, returning an iterable over the results.

If times is specified, the iterable will terminate after that many repetitions:

>>> from operator import add
>>> times = 4
>>> args = 3, 5
>>> list(repeatfunc(add, times, *args))
[8, 8, 8, 8]

If times is None the iterable will not terminate:

>>> from random import randrange
>>> times = None
>>> args = 1, 11
>>> take(6, repeatfunc(randrange, times, *args))
[2, 4, 8, 1, 8, 4]
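
Both cases can be sketched with itertools.starmap() and itertools.repeat(), as in the repeatfunc recipe from the itertools documentation. repeatfunc_sketch is a hypothetical name; the library's implementation may differ.

```python
from itertools import islice, repeat, starmap
from operator import add


def repeatfunc_sketch(function, times=None, *args):
    """Hypothetical version based on the itertools docs recipe."""
    if times is None:
        # repeat(args) never ends, so neither does the result.
        return starmap(function, repeat(args))
    return starmap(function, repeat(args, times))


bounded = list(repeatfunc_sketch(add, 4, 3, 5))
# bounded == [8, 8, 8, 8]
unbounded_head = list(islice(repeatfunc_sketch(add, None, 3, 5), 2))
# unbounded_head == [8, 8]
```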

more_itertools.reshape(matrix, shape)[source]¶

Change the shape of a matrix.

If shape is an integer, the matrix must be two dimensional and the shape is interpreted as the desired number of columns:

>>> matrix = [(0, 1), (2, 3), (4, 5)]
>>> cols = 3
>>> list(reshape(matrix, cols))
[(0, 1, 2), (3, 4, 5)]
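
For the integer case, the idea of flattening one level and regrouping into rows can be sketched with the standard library. reshape_columns_sketch is a hypothetical helper for illustration only. In this sketch a final row with fewer than cols items is yielded as-is, which is an assumption rather than documented behavior.

```python
from itertools import chain, islice


def reshape_columns_sketch(matrix, cols):
    """Hypothetical illustration of the integer-shape case."""
    flat = chain.from_iterable(matrix)  # flatten the rows one level
    # Pull cols items at a time until the flattened stream is empty.
    while row := tuple(islice(flat, cols)):
        yield row


rows = list(reshape_columns_sketch([(0, 1), (2, 3), (4, 5)], 3))
# rows == [(0, 1, 2), (3, 4, 5)]
```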

If shape is a tuple (or other iterable), the input matrix can have any number of dimensions. It will first be flattened and then rebuilt to the desired shape which can also be multidimensional:

>>> matrix = [(0, 1), (2, 3), (4, 5)]    # Start with a 3 x 2 matrix

>>> list(reshape(matrix, (2, 3)))        # Make a 2 x 3 matrix
[(0, 1, 2), (3, 4, 5)]

>>> list(reshape(matrix, (6,)))          # Make a vector of length six
[0, 1, 2, 3, 4, 5]

>>> list(reshape(matrix, (2, 1, 3, 1)))  # Make 2 x 1 x 3 x 1 tensor
[(((0,), (1,), (2,)),), (((3,), (4,), (5,)),)]

Each dimension is assumed to be uniform, either all arrays or all scalars. Flattening stops when the first value in a dimension is a scalar. Scalars are bytes, strings, and non-iterables. The reshape iterator stops when the requested shape is complete or when the input is exhausted, whichever comes first.

more_itertools.loops(n)[source]¶

Return an iterable with n elements for efficient looping. It works like range(n), but doesn’t create integer objects.

>>> i = 0
>>> for _ in loops(5):
...     i += 1
>>> i
5
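
The same effect is available from the standard library through itertools.repeat(), which yields the same object n times without building new integers. loops_sketch is a hypothetical name; whether the library is implemented this way is an assumption.

```python
from itertools import repeat


def loops_sketch(n):
    """Hypothetical equivalent built on itertools.repeat."""
    return repeat(None, n)  # yields None exactly n times


counter = 0
for _ in loops_sketch(5):
    counter += 1
# counter == 5
```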