# Reasoning About Algorithms

# COMP 382

## BSTs §

Assumption: keys are linearly ordered.

Let $x$ and $y$ be nodes in the tree.

- If node $y$ is in the left subtree of $x$, then $y_{key} \leq x_{key}$
- If node $y$ is in the right subtree of $x$, then $y_{key} \geq x_{key}$

Every subtree of a BST is a BST.

### Traversal §

```
in_order(x):
if x != null:
in_order(x.left)
print(x.key)
in_order(x.right)
```

- preorder: root, left, right
- inorder: left, root, right
- postorder: left, right, root

Running time: $\Theta(n)$ where $n$ is the tree size

### Search §

```
tree_search(x, k):
if x == null or k == x.key
return x
if k < x.key:
return tree_search(x.left, k)
return tree_serach(x.right, k)
```

$O(h)$, where $h$ is the height of the tree. $h$ is

```
tree_search(x, k):
while x != null and k != x.key:
if k < x.key:
x = x.left
else:
x = x.right
return x
```

### Successor §

Pre-condition: `x != null`

```
tree_successor(x):
if x.right != null:
return tree_min(x)
y = x.p
while y != null and x == y.right:
x = y
y = y.p
return y
```

## Red-Black Trees §

- every node is either red or black
- the root is black
- every leaf (NIL) is black
- the children of a red node must be black
- for every node $u$, all simple paths from $u$ to descendent leaves have the same number of black nodes.

properties 4 and 5 together ensure that the tree is “roughly balanced”.

## AVL Trees §

Binary search trees that are balanced.

For every node $x$ with a left subtree $l$ and a right subtree $r$, $|\textrm{height}(r) - \textrm{height}(l)| \leq 1$. The heights of subtrees $l$ and $r$ differ by at most 1.

Balance factor $\textrm{balance}(x) = \textrm{height}(x.r) - \textrm{height}(x.l)$

## B-Trees §

Assumption: each node is stored on a separate disk page.

The branching factor of B trees can be large.

A node with $n$ keys has $n + 1$ children.

A rooted tree:

- every node
`x`

has attributes- number of keys stored
`x.n`

- the keys
`x.key[1] <= x.key[2] <= ... <= x.key[x.n]`

(ordered) - boolean value
`x.leaf`

- number of keys stored
- every internal node
`x`

has`x.n + 1`

pointers`x.c[1]`

,`x.c[2]`

, …,`x.c[x.n + 1]`

to its children - the keys separate the ranges of keys for subtrees
- all leaves have the same depth, which is equal to the height
`h`

of the tree - a b-tree has a parameter $t \geq 2$ called the minimum degree
- each node (except root) has at least $t - 1$ keys. each internal node (except root) has at least $t$ children.
- each node has at msot $2t - 1$ keys

Consider a non-empty B-tree $T$ with minimum degree $t$. The height of $T$ is $h$.

```
# assumption: x.node has already been read from disk k.key
b_tree_search(x, k):
i = 1
while i <= x.n and k > x.key[i]:
i++
if i <= x.n and k == x.key[i]:
return (x, i)
else if x.leaf:
return null
else:
disk_read(x, c[i])
return b_tree_search(x.c[i], k)
```

$O(h)$ reads from disk and $O(t \cdot h)$ running time.

## Hash Tables §

Data structure that stores keys with additional satellite data

### The universe is small §

Direct address table: array `T[0..m-1]`

.

- If key
`k`

is not present,`T[k] = NIL`

- If key
`k`

is present and has value`val`

,`T[k] = val`

### The universe is large; the set $k$ of present keys is small §

Memory: $\Theta(|k|)$

Search: $O(1)$ time in the average case

Components:

- table
`T[0..m-1]`

where typically m \llt |U| - hash function $h : U \rightarrow \{0, 1, \cdots, m - 1\}$
- key $k$ hashes to the slot $h(k)$

values stored in `T[h(k)]`

. collisions unavoidable since $m < |U|$.

chaining: elements that hash to the same value are placed into a linked list.

`insert(T, x)`

: insert`x`

at the head of list`T[h(x.key)]`

`search(T, k)`

: search for`k`

in the list`T[h(k)]`

`delete(T, x)`

: delete`x`

from the list`T[h(x.key)]`

worst case for search: all $n$ keys hash to the same slot. searching will be $\Theta(n)$.

[Assumption: simple, uniform hashing]

We insert keys $k_1, k_2, \cdots, k_n$ into a hash table with $m$ slots

$P(h(i) = j) = \frac{1}{m}$

$n_j$ length of the list `T[j]`

load factor