开发者

Python - How to check list monotonicity

开发者 https://www.devze.com 2023-02-10 00:25 出处:网络
What would be an efficient and pythonic way to check list monotonicity? i.e. that it has monotonically increasing or decreasing values?

What would be an efficient and pythonic way to check list monotonicity?

i.e. that it has monotonically increasing or decreasing values?

Examples:

[0, 1, 2, 3, 3, 4]   # This is a monotonically increasing list
[4.3, 4.2, 4.2, -2]  # This is a monotonically decreasing list
[2, 开发者_运维百科3, 1]            # This is neither


It's better to avoid ambiguous terms like "increasing" or "decreasing" as it's not clear if equality is acceptable or not. You should always use either for example "non-increasing" (clearly equality is accepted) or "strictly decreasing" (clearly equality is NOT accepted).

def strictly_increasing(L):
    return all(x<y for x, y in zip(L, L[1:]))

def strictly_decreasing(L):
    return all(x>y for x, y in zip(L, L[1:]))

def non_increasing(L):
    return all(x>=y for x, y in zip(L, L[1:]))

def non_decreasing(L):
    return all(x<=y for x, y in zip(L, L[1:]))

def monotonic(L):
    return non_increasing(L) or non_decreasing(L)


If you have large lists of numbers it might be best to use numpy, and if you are:

import numpy as np

def monotonic(x):
    dx = np.diff(x)
    return np.all(dx <= 0) or np.all(dx >= 0)

should do the trick.


import itertools
import operator

def monotone_increasing(lst):
    pairs = zip(lst, lst[1:])
    return all(itertools.starmap(operator.le, pairs))

def monotone_decreasing(lst):
    pairs = zip(lst, lst[1:])
    return all(itertools.starmap(operator.ge, pairs))

def monotone(lst):
    return monotone_increasing(lst) or monotone_decreasing(lst)

This approach is O(N) in the length of the list.


@6502 has an elegant code for sequences (iterables with __getitem__ and __len__ methods) and @chqrlie has an even better code which does not create temporary copies of sequences with slicing. I just want to add a general version that works for all iterables (objects with an __iter__ method):

def pairwise(iterable):
    items = iter(iterable)
    last = next(items)
    for item in items:
        yield last, item
        last = item

def strictly_increasing(iterable):
    return all(x<y for x, y in pairwise(iterable))

def strictly_decreasing(iterable):
    return all(x>y for x, y in pairwise(iterable))

def non_increasing(iterable):
    return all(x>=y for x, y in pairwise(iterable))

def non_decreasing(iterable):
    return all(x<=y for x, y in pairwise(iterable))

def monotonic(iterable):
    return non_increasing(iterable) or non_decreasing(iterable)


The pandas package makes this convenient.

import pandas as pd

The following commands work with a list of integers or floats.

Monotonically increasing (≥):

pd.Series(mylist).is_monotonic_increasing

Strictly monotonically increasing (>):

myseries = pd.Series(mylist)
myseries.is_unique and myseries.is_monotonic_increasing

Alternative using an undocumented private method:

pd.Index(mylist)._is_strictly_monotonic_increasing

Monotonically decreasing (≤):

pd.Series(mylist).is_monotonic_decreasing

Strictly monotonically decreasing (<):

myseries = pd.Series(mylist)
myseries.is_unique and myseries.is_monotonic_decreasing

Alternative using an undocumented private method:

pd.Index(mylist)._is_strictly_monotonic_decreasing


@6502 has elegant python code for this. Here is an alternative solution with simpler iterators and no potentially expensive temporary slices:

def strictly_increasing(L):
    return all(L[i] < L[i+1] for i in range(len(L)-1))

def strictly_decreasing(L):
    return all(L[i] > L[i+1] for i in range(len(L)-1))

def non_increasing(L):
    return all(L[i] >= L[i+1] for i in range(len(L)-1))

def non_decreasing(L):
    return all(L[i] <= L[i+1] for i in range(len(L)-1))

def monotonic(L):
    return non_increasing(L) or non_decreasing(L)


import operator, itertools

def is_monotone(lst):
    op = operator.le            # pick 'op' based upon trend between
    if not op(lst[0], lst[-1]): # first and last element in the 'lst'
        op = operator.ge
    return all(op(x,y) for x, y in itertools.izip(lst, lst[1:]))


Here is a functional solution using reduce of complexity O(n):

is_increasing = lambda L: reduce(lambda a,b: b if a < b else 9999 , L)!=9999

is_decreasing = lambda L: reduce(lambda a,b: b if a > b else -9999 , L)!=-9999

Replace 9999 with the top limit of your values, and -9999 with the bottom limit. For example, if you are testing a list of digits, you can use 10 and -1.


I tested its performance against @6502's answer and its faster.

Case True: [1,2,3,4,5,6,7,8,9]

# my solution .. 
$ python -m timeit "inc = lambda L: reduce(lambda a,b: b if a < b else 9999 , L)!=9999; inc([1,2,3,4,5,6,7,8,9])"
1000000 loops, best of 3: 1.9 usec per loop

# while the other solution:
$ python -m timeit "inc = lambda L: all(x<y for x, y in zip(L, L[1:]));inc([1,2,3,4,5,6,7,8,9])"
100000 loops, best of 3: 2.77 usec per loop

Case False from the 2nd element: [4,2,3,4,5,6,7,8,7]:

# my solution .. 
$ python -m timeit "inc = lambda L: reduce(lambda a,b: b if a < b else 9999 , L)!=9999; inc([4,2,3,4,5,6,7,8,7])"
1000000 loops, best of 3: 1.87 usec per loop

# while the other solution:
$ python -m timeit "inc = lambda L: all(x<y for x, y in zip(L, L[1:]));inc([4,2,3,4,5,6,7,8,7])"
100000 loops, best of 3: 2.15 usec per loop


I timed all of the answers in this question under different conditions, and found that:

  • Sorting was the fastest by a long shot IF the list was already monotonically increasing
  • Sorting was the slowest by a long shot IF the list was shuffled/random or if the number of elements out of order was greater than ~1. The more out of order the list of course corresponds to a slower result.
  • Michael J. Barbers method was the fastest IF the list was mostly monotonically increasing, or completely random.

Here is the code to try it out:

import timeit

setup = '''
import random
from itertools import izip, starmap, islice
import operator

def is_increasing_normal(lst):
    for i in range(0, len(lst) - 1):
        if lst[i] >= lst[i + 1]:
            return False
    return True

def is_increasing_zip(lst):
    return all(x < y for x, y in izip(lst, islice(lst, 1, None)))

def is_increasing_sorted(lst):
    return lst == sorted(lst)

def is_increasing_starmap(lst):
    pairs = izip(lst, islice(lst, 1, None))
    return all(starmap(operator.le, pairs))

if {list_method} in (1, 2):
    lst = list(range({n}))
if {list_method} == 2:
    for _ in range(int({n} * 0.0001)):
        lst.insert(random.randrange(0, len(lst)), -random.randrange(1,100))
if {list_method} == 3:
    lst = [int(1000*random.random()) for i in xrange({n})]
'''

n = 100000
iterations = 10000
list_method = 1

timeit.timeit('is_increasing_normal(lst)', setup=setup.format(n=n, list_method=list_method), number=iterations)

timeit.timeit('is_increasing_zip(lst)', setup=setup.format(n=n, list_method=list_method), number=iterations)

timeit.timeit('is_increasing_sorted(lst)', setup=setup.format(n=n, list_method=list_method), number=iterations)

timeit.timeit('is_increasing_starmap(lst)', setup=setup.format(n=n, list_method=list_method), number=iterations)

If the list was already monotonically increasing (list_method == 1), the fastest to slowest was:

  1. sorted
  2. starmap
  3. normal
  4. zip

If the list was mostly monotonically increasing (list_method == 2), the fastest to slowest was:

  1. starmap
  2. zip
  3. normal
  4. sorted

(Whether or not the starmap or zip was fastest depended on the execution and I couldn't identify a pattern. Starmap appeared to be usually faster)

If the list was completely random (list_method == 3), the fastest to slowest was:

  1. starmap
  2. zip
  3. normal
  4. sorted (extremely bad)


Here's a variant that accepts both materialized and non-materialized sequences. It automatically determines whether or not it's monotonic, and if so, its direction (i.e. increasing or decreasing) and strictness. Inline comments are provided to help the reader. Similarly for test-cases provided at the end.

    def isMonotonic(seq):
    """
    seq.............: - A Python sequence, materialized or not.
    Returns.........:
       (True,0,True):   - Mono Const, Strict: Seq empty or 1-item.
       (True,0,False):  - Mono Const, Not-Strict: All 2+ Seq items same.
       (True,+1,True):  - Mono Incr, Strict.
       (True,+1,False): - Mono Incr, Not-Strict.
       (True,-1,True):  - Mono Decr, Strict.
       (True,-1,False): - Mono Decr, Not-Strict.
       (False,None,None) - Not Monotonic.
    """    
    items = iter(seq) # Ensure iterator (i.e. that next(...) works).
    prev_value = next(items, None) # Fetch 1st item, or None if empty.
    if prev_value == None: return (True,0,True) # seq was empty.

    # ============================================================
    # The next for/loop scans until it finds first value-change.
    # ============================================================
    # Ex: [3,3,3,78,...] --or- [-5,-5,-5,-102,...]
    # ============================================================
    # -- If that 'change-value' represents an Increase or Decrease,
    #    then we know to look for Monotonically Increasing or
    #    Decreasing, respectively.
    # -- If no value-change is found end-to-end (e.g. [3,3,3,...3]),
    #    then it's Monotonically Constant, Non-Strict.
    # -- Finally, if the sequence was exhausted above, which means
    #    it had exactly one-element, then it Monotonically Constant,
    #    Strict.
    # ============================================================
    isSequenceExhausted = True
    curr_value = prev_value
    for item in items:
        isSequenceExhausted = False # Tiny inefficiency.
        if item == prev_value: continue
        curr_value = item
        break
    else:
        return (True,0,True) if isSequenceExhausted else (True,0,False)
    # ============================================================

    # ============================================================
    # If we tricked down to here, then none of the above
    # checked-cases applied (i.e. didn't short-circuit and
    # 'return'); so we continue with the final step of
    # iterating through the remaining sequence items to
    # determine Monotonicity, direction and strictness.
    # ============================================================
    strict = True
    if curr_value > prev_value: # Scan for Increasing Monotonicity.
        for item in items:
            if item < curr_value: return (False,None,None)
            if item == curr_value: strict = False # Tiny inefficiency.
            curr_value = item
        return (True,+1,strict)
    else:                       # Scan for Decreasing Monotonicity.
        for item in items: 
            if item > curr_value: return (False,None,None)
            if item == curr_value: strict = False # Tiny inefficiency.
            curr_value = item
        return (True,-1,strict)
    # ============================================================


# Test cases ...
assert isMonotonic([1,2,3,4])     == (True,+1,True)
assert isMonotonic([4,3,2,1])     == (True,-1,True)
assert isMonotonic([-1,-2,-3,-4]) == (True,-1,True)
assert isMonotonic([])            == (True,0,True)
assert isMonotonic([20])          == (True,0,True)
assert isMonotonic([-20])         == (True,0,True)
assert isMonotonic([1,1])         == (True,0,False)
assert isMonotonic([1,-1])        == (True,-1,True)
assert isMonotonic([1,-1,-1])     == (True,-1,False)
assert isMonotonic([1,3,3])       == (True,+1,False)
assert isMonotonic([1,2,1])       == (False,None,None)
assert isMonotonic([0,0,0,0])     == (True,0,False)

I suppose this could be more Pythonic, but it's tricky because it avoids creating intermediate collections (e.g. list, genexps, etc); as well as employs a fall/trickle-through and short-circuit approach to filter through the various cases: E.g. Edge-sequences (like empty or one-item sequences; or sequences with all identical items); Identifying increasing or decreasing monotonicity, strictness, and so on. I hope it helps.


L = [1,2,3]
L == sorted(L)

L == sorted(L, reverse=True)


Here is an implementation that is both efficient (the space required is constant, no slicing performing a temporary shallow copy of the input) and general (any iterables are supported as input, not just sequences):

def is_weakly_increasing(iterable):
    iterator = iter(iterable)
    next(iterator)
    return all(x <= y for x, y in zip(iterable, iterator))


def is_weakly_decreasing(iterable):
    iterator = iter(iterable)
    next(iterator)
    return all(x >= y for x, y in zip(iterable, iterator))


def is_weakly_monotonic(iterable):
    return is_weakly_increasing(iterable) or is_weakly_decreasing(iterable)


def is_strictly_increasing(iterable):
    iterator = iter(iterable)
    next(iterator)
    return all(x < y for x, y in zip(iterable, iterator))


def is_stricly_decreasing(iterable):
    iterator = iter(iterable)
    next(iterator)
    return all(x > y for x, y in zip(iterable, iterator))


def is_strictly_monotonic(iterable):
    return is_strictly_increasing(iterable) or is_strictly_decreasing(iterable)


def IsMonotonic(data):
    ''' Returns true if data is monotonic.'''
    data = np.array(data)
    # Greater-Equal
    if (data[-1] > data[0]):
        return np.all(data[1:] >= data[:-1])
    # Less-Equal
    else:
        return np.all(data[1:] <= data[:-1])

My proposition (with numpy) as a summary of few ideas here. Uses

  • casting to np.array for creation of bool values for each lists comparision,
  • np.all for checking if all results are True
  • checking diffrence between first and last element for choosing comparison operator,
  • using direct comparison >=, <= instead of calculatin np.diff,


Here are two ways of determining if a list if monotonically increasing or decreasing using just range or list comprehensions. Using range is slightly more efficient because it can short-circuit, whereas the list comprehension must iterate over the entire list. Enjoy.

a = [1,2,3,4,5]
b = [0,1,6,1,0]
c = [9,8,7,6,5]

def monotonic_increase(x):
    if len(x) <= 1: return False

    for i in range(1, len(x)):
        if x[i-1] >= x[i]:
            return False
    return True

def monotonic_decrease(x):
    if len(x) <= 1: return False

    for i in range(1, len(x)):
        if x[i-1] <= x[i]:
            return False

    return True

monotonic_increase = lambda x: len(x) > 1 and all(x[i-1] < x[i] for i in range(1, len(x)))
monotonic_decrease = lambda x: len(x) > 1 and all(x[i-1] > x[i] for i in range(1, len(x)))

print(monotonic_increase(a))
print(monotonic_decrease(c))
print(monotonic_decrease([]))
print(monotonic_increase(c))
print(monotonic_decrease(a))
print(monotonic_increase(b))
print(monotonic_decrease(b))


def solution1(a):
    up, down = True, True
    for i in range(1, len(a)):
        if a[i] < a[i-1]: up = False
        if a[i] > a[i-1]: down = False
    return up or down
    
def solution2(a):
    return a == sorted(a[::-1])

Solution1 is O(n) and the shorter solution2 is O(n log n)


summary: Yes, setting a single variable from another variable is thread safe. This provides a high-speed way to transfer a single value bewteen tasks.

this link has a clear explanation:Grok the GIL: How to write fast and thread-safe Python

My simplified explanation:

All computers, way down at the machine code, provide some (not all) instructions that are non-interruptable. These are reffered to as "atomic".

For example, storing a 16 bit-constant into a 16-bit memory location. way-back the 8-bit computers were a mix of features and implimentations. For example, the 8-bit processors 6502, 6800 both had non-interruptable (atomic) instructions to move 16-bits from one memory location to another. The Z80 did not. The PDP-11 machine code had an atomic increment/decrement instructions (var++; v--) which is how K&R came to add that to the early C compilers. At least from what I remember.

Why does this matter?

Today a lot of computer programming languages are written to a custom-designed "virtual machine". These virtual machines, like all computer processors, support a mix of features and implimentations.

Python's virtual machine is written with it's custom GIL ("global interpreter lock")

what this means is that some (but not all) python's operations are "thread-safe" in that some (but not all) execute in a single "GIL" cycle.

There are some surprises. For example "sort" is atomic and therefor thread-safe. Since this is a relatively long process, that's a surprise (read the link for a much better explanation)

Python has a function that will display the "virtual machine codes" a function is "compiled" into:

## ref: https://opensource.com/article/17/4/grok-gil
import dis

vara= 33
varb= 55

def foo():
    global vara
    vara= 33
    varb= 55
    varb= vara
    return

dis.dis(foo)

which gives the output:

415           0 LOAD_CONST               1 (33)
              2 STORE_GLOBAL             0 (vara)

416           4 LOAD_CONST               2 (55)
              6 STORE_FAST               0 (varb)

417           8 LOAD_GLOBAL              0 (vara)
             10 STORE_FAST               0 (varb)

418          12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

Care must be taken when using this (well, all multithreaded programs should be carefully designed and coded) since ONLY A SINGLE value is safe! The good news is that this SINGLE VALUE can point to an entire data structure. Of course the entired data structure isn't atomic so any changes to it should be wrapped in mutlithread "stalls" (another name for semaphores, locks, etc- since they basically "stall" the code, until the semaphores/locks are released by other threads. I read this somewhere from one of the co-developers of multithreading...in hindsight he thought "stalls" were a better, more discription term. Perhaps dijkstra??)


>>> seq = [0, 1, 2, 3, 3, 4]
>>> seq == sorted(seq) or seq == sorted(seq, reverse=True)
0

精彩评论

暂无评论...
验证码 换一张
取 消