Lists

Posted on August 1st, 2022

Updated on August 2nd, 2022

Part of (Re)Learning CS

Talks aboutComputer Science

In computing, a list is a data collection which storesa linear, ordered collection of values of the same data type.

Array

An array stores its values in adjacent memory locations (contiguous) and has a fixed length like int[]. For cases where the program doesn't know the data size in advance, languages also use dynamic arrays, like vector<T>, which resize on the fly.

Indexing

Because array elements are stored together, they can use pointer arithmetic for indexing. For any array of type TT, where BB is the size of TT in bytes and XX is its starting memory address, an item at index II is at address X+BIX+BI.

SourceExample of pointer arithmetic.

This means they have random access — all elements can be accessed in constant time (O(1)\mathcal{O}(1)).

Adding & Removing

For fixed-length arrays, pushing and popping new elements is O(1)\mathcal{O}(1) because it's just indexing. Removing an element from the middle requires all subsequent elements to be shifted a position, making it O(n)\mathcal{O}(n). Removing the first value is O(1)\mathcal{O}(1) however, as the pointer to the first value can be incremented to the next element.

For dynamic arrays, appending a new value beyond the current array length is O(n)\mathcal{O}(n):

  1. The array size is doubled
  2. New memory is allocated
  3. Its values are copied across
  4. The new element is stored at the end

Searching

For an unsorted array, finding an element is O(n)\mathcal{O}(n) as it must be iterated to compare against each value.

Sorted arrays can utilise random access and the binary search algorithm to achieve O(logn)\mathcal{O}(\log n). The algorithm compares the middle item in the array to determine whether the search value is in the lower or upper half; this is performed iteratively/recursively, reducing the array's length for each run and thus the time complexity.

def binary_search(arr: List[int], el: int):
  left = 0
  right = len(arr) - 1

  while True:
    # Indexes have passed each other
    # Call off the search
    if left > right:
      return False

    middle = (left + right) // 2

    if el == arr[middle]:
      return True
    elif el < arr[middle]:
      # Reduce search to lower half
      right = middle - 1
    else:
      # Reduce search to upper half
      left = middle + 1
Binary Search implementation in Python.

Linked List

A linked list is composed of nodes which store a pointer to the next node in a singly-linked list, and an additional pointer to the previous node in a doubly-linked list.

A global head pointer is kept to directly access the start of the chain, and a tail pointer is also typically kept to access the end.

Indexing

Accessing nodes in a linked list is O(n)\mathcal{O}(n), as the nodes must be traversed until the index is reached. To lessen the blow, we can store the length of the list LL to determine if an index II is closer to the head (00) or tail (L1L-1).

Adding & Removing

Again, it's O(n)\mathcal{O}(n).

To insert a new element EE at index II, the current node at index II is updated to point to EE and EE is updated to point at node I+1I+1. For a double-linked list, the previous pointer is also updated.

Removal is the same idea but the current value is forgotten about.

Searching

Nothing clever can be done in a linked list, so it's O(n)\mathcal{O}(n) once more.

Circular Array

Arrays are good for random access; linked lists are good for start/end insertions. Circular arrays bridge the two.

A large backing array stores the array values, and its start & end memory addresses are managed with a head & tail pointer.

SourceCircular Array with Head and Tail pointers.

Indexing

We can still use pointer arithmetic to achieve O(1)\mathcal{O}(1), as long as we respect the head pointer.

For any array of type TT with length LL, where BB is the size of TT in bytes, and HH is the head pointer address, an item at index II is at address (H+BI)modL(H+BI)\mod{L}.

Adding & Removing

To add values from the start/end of a circular array, the appropriate pointer is incremented and the value is stored in memory. To remove, the logic is reversed resulting in O(1)\mathcal{O}(1) as long as the backing array isn't full.

Searching

Same as arraysO(logn)\mathcal{O}(\log n).