Algorithms - Count all pairs of equal numbers in a sorted array in O(n)?

2 min read 06-10-2024

Counting Equal Pairs in a Sorted Array: An Efficient O(n) Approach

Finding pairs of equal numbers in an array is a common task in computer science. While a brute-force approach might seem intuitive, it leads to a time complexity of O(n^2), which can be inefficient for large arrays. This article explores a much faster approach to count all pairs of equal numbers in a sorted array, achieving a time complexity of O(n).

Understanding the Challenge

The problem asks for the number of index pairs (i, j) with i < j whose elements have the same value. For example, in the sorted array [1, 2, 2, 3, 4, 4, 4, 5] there are four such pairs: one formed by the two 2s, and three formed by the three 4s (first and second, first and third, second and third).
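To make the counting convention concrete, here is a minimal sketch that breaks the total down per value. It counts every index pair (i, j) with i < j and equal elements, so a value that appears k times contributes k(k - 1)/2 pairs. The helper name pairs_per_value is hypothetical, and the sketch uses a hash-based tally purely to illustrate the definition; it does not rely on the array being sorted.

```python
from collections import Counter
from math import comb

def pairs_per_value(arr):
  """Map each repeated value to the number of index pairs (i < j) it forms."""
  # comb(k, 2) == k * (k - 1) // 2, the number of ways to pick 2 of k equal items.
  return {value: comb(k, 2) for value, k in Counter(arr).items() if k > 1}

example = [1, 2, 2, 3, 4, 4, 4, 5]
breakdown = pairs_per_value(example)
print(breakdown)                # {2: 1, 4: 3}
print(sum(breakdown.values()))  # 4
```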

The Naive Approach (O(n^2))

The simplest way to solve this problem is to compare each element with every other element in the array. This brute-force method would look like this:

def count_pairs_naive(arr):
  count = 0
  for i in range(len(arr)):
    for j in range(i + 1, len(arr)):
      if arr[i] == arr[j]:
        count += 1
  return count

This approach works but has a time complexity of O(n^2), because the nested loops compare all n(n - 1)/2 pairs of positions. It also ignores the fact that the array is sorted. For large arrays, this becomes computationally expensive.
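As a rough illustration of how quickly that cost grows, the sketch below counts the comparisons made by a pair of nested loops of the same shape and checks the result against the closed form n(n - 1)/2. The function name comparisons_in_naive_scan is hypothetical and exists only for this illustration.

```python
def comparisons_in_naive_scan(n):
  """Count the element comparisons a nested-loop scan performs for length n."""
  comparisons = 0
  for i in range(n):
    for j in range(i + 1, n):
      comparisons += 1  # one arr[i] == arr[j] test per (i, j) pair
  return comparisons

for n in (8, 100, 1000):
  # The loop count matches the closed form n * (n - 1) / 2.
  assert comparisons_in_naive_scan(n) == n * (n - 1) // 2
  print(n, comparisons_in_naive_scan(n))
```

For the three sizes shown, this prints 28, 4950, and 499500 comparisons respectively.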

The Efficient O(n) Solution: Leveraging Sortedness

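The key to an efficient solution lies in recognizing that the array is sorted, so equal values always sit next to each other in a contiguous run. A run of k equal values contains k(k - 1)/2 pairs, and we can accumulate that total in a single pass: each element forms one new pair with every equal element that precedes it in its run.

def count_pairs_efficient(arr):
  count = 0
  run = 0  # length of the current run of equal values
  for i in range(len(arr)):
    if i > 0 and arr[i] == arr[i - 1]:
      run += 1  # arr[i] extends the current run
    else:
      run = 1   # arr[i] starts a new run
    # arr[i] pairs with each of the run - 1 equal elements before it
    count += run - 1
  return count

This code walks the array once while tracking the length of the current run of equal values. Each element adds run - 1 new pairs, which sums to k(k - 1)/2 over a run of length k. For [1, 2, 2, 3, 4, 4, 4, 5] it returns 4, matching the naive version. The time complexity is O(n) because each element is visited exactly once.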
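One way to gain confidence in a linear-time counter is to cross-check it against a brute-force reference on random sorted inputs. The sketch below is an illustrative variant, not the only way to write it: it groups equal neighbours with itertools.groupby, adds k(k - 1)/2 for each run of k equal values, and compares the result with a direct check of every index pair. The function names count_pairs_by_runs and count_pairs_brute_force are hypothetical.

```python
import random
from itertools import combinations, groupby

def count_pairs_by_runs(arr):
  """Sum k * (k - 1) // 2 over each run of k equal values in a sorted list."""
  total = 0
  for _, run in groupby(arr):
    k = sum(1 for _ in run)  # length of this run of equal values
    total += k * (k - 1) // 2
  return total

def count_pairs_brute_force(arr):
  """Reference answer: test every index pair i < j directly."""
  return sum(1 for a, b in combinations(arr, 2) if a == b)

random.seed(0)  # fixed seed so the check is repeatable
for _ in range(200):
  size = random.randint(0, 30)
  data = sorted(random.randint(0, 9) for _ in range(size))
  assert count_pairs_by_runs(data) == count_pairs_brute_force(data)

print(count_pairs_by_runs([1, 2, 2, 3, 4, 4, 4, 5]))  # 4
```

Note that groupby only merges adjacent equal items, so this variant gives the right total only when the input is sorted (or otherwise has equal values grouped together).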

Advantages of the Efficient Approach

  • Faster execution: The O(n) approach drastically reduces the execution time compared to the O(n^2) solution, especially for large datasets. For an array of one million elements, it replaces roughly 5 × 10^11 comparisons with about 10^6.
  • Constant extra memory: The single pass needs only two integer counters, so it uses O(1) additional space, the same as the nested-loop approach.
  • Simplicity: The code is straightforward and easy to understand.

Real-world Applications

This algorithm finds applications in various scenarios, such as:

  • Data analysis: Identifying patterns in sorted data, like repeated values in sales records.
  • Database optimization: Counting occurrences of duplicate entries in a database table.
  • Image processing: Detecting contiguous regions of pixels with the same color.

Conclusion

By leveraging the sorted nature of the input array, we can efficiently count pairs of equal numbers in O(n) time complexity. This optimized approach offers significant performance improvements over the brute-force method, making it a valuable tool for various data processing tasks.