Computational Thinking
COMP 140
Module 1 §
 Identify the objective.
 Determine the inputs.
 Break the problem into subproblems.
 Solve the subproblems.
 Use the solution to the subproblems to solve the original problem.
 Write your solutions in Python.
Intersection of two lines §
For lines $y = s_1(x - x_1) + y_1$ and $y = s_2(x - x_2) + y_2$:
$x_i = \frac{(s_1 x_1 - s_2 x_2)+(y_2 - y_1)}{s_1 - s_2}$
$y_i = s_1 (x_i - x_1) + y_1$
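The formulas above translate directly into code (a sketch; the function name `intersect` is not from the notes):

```python
def intersect(s1, x1, y1, s2, x2, y2):
    """Intersection of y = s1*(x - x1) + y1 and y = s2*(x - x2) + y2.

    Assumes the slopes differ (s1 != s2), so a unique intersection exists.
    """
    x_i = ((s1 * x1 - s2 * x2) + (y2 - y1)) / (s1 - s2)
    y_i = s1 * (x_i - x1) + y1
    return x_i, y_i
```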
Module 2 §
Modular arithmetic §
$a^{-1}$ in a prime finite field $= a^{p - 2} \mod p$ when $p$ is prime
Projective Geometry §
Properties:
 A point: $(x,y,z)$ except $(0,0,0)$
 A line: $[a,b,c]$ except $[0,0,0]$
 If $k \neq 0$: $(x,y,z)$ and $(kx,ky,kz)$ are the same point; $[a,b,c]$ and $[ka,kb,kc]$ are the same line.
 $(x,y,z)$ is on $[a,b,c]$ iff $ax + by + cz = 0$
Finite field §
$\mathbb{Z}_m$ where $m$ is a prime number. All operations are performed $\mod m$.
Division in a finite field §
$c = a \times b^{-1} \mod m$
If the modulo $m$ is prime, $b^{-1} = b^{m - 2} \mod m$
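Python's built-in three-argument `pow` makes this a one-liner (a sketch; the helper names are not from the notes):

```python
def mod_inverse(b, p):
    """b^(-1) in Z_p via Fermat's little theorem; p must be prime."""
    return pow(b, p - 2, p)

def field_div(a, b, p):
    """a / b in the finite field Z_p."""
    return (a * mod_inverse(b, p)) % p
```

For example, `field_div(4, 3, 7)` returns 6, and indeed $6 \times 3 \equiv 4 \mod 7$.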
Cross product §
cross product of two points: the line through the two points (dually, the cross product of two lines is their intersection point)
$p \times q = \left[p_y q_z - p_z q_y, p_z q_x - p_x q_z, p_x q_y - p_y q_x\right]$
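A direct transcription of the formula (a sketch; the function name `cross` is not from the notes):

```python
def cross(p, q):
    """Cross product of two homogeneous triples, e.g. the line through two points."""
    return [p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0]]
```

For example, the points $(0,0,1)$ and $(1,1,1)$ give the line $[-1, 1, 0]$, i.e. $-x + y = 0$, and every point on that line satisfies the incidence test $ax + by + cz = 0$.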
Module 3 §
random §
random.random(): uniform float in $[0, 1)$
random.randrange([start=0], end): random integer; end is exclusive
random.choice(sequence): random element of a sequence
random.shuffle(list): shuffles the list in place
Dictionaries §
not ordered.
for key in dictionary:
    pass
for key in dictionary.keys():
    pass
for key, value in dictionary.items():
    pass
for value in dictionary.values():
    pass
from collections import defaultdict

def fn():
    """
    Returns a value that should be used as the
    default value of keys of the defaultdict.
    Inputs: none.
    """
    return 0

counts = defaultdict(fn)
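Since `int()` returns 0, `defaultdict(int)` gives the same default without a helper function; a hypothetical word-count example:

```python
from collections import defaultdict

counts = defaultdict(int)  # int() returns 0, so missing keys start at 0
for word in "the cat and the hat".split():
    counts[word] += 1
```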
Markov chains §
states. transitions with probabilities. discrete time.
Markov property: it doesn't matter how you got to a state; the probabilities of the next transition are the same regardless.
Empirically estimate the transition probabilities from observed data.
Use the resulting model to predict.
Markov chains with memory §
$n$th order Markov models: $n$ states of history.
$n=0$: no memory
$n=1$: one state of memory
$\dots$
encoding, in a single state, the history
first_order = {
    ('A',): {'A': 0.2, 'B': 0.5, 'C': 0.3},
    ('B',): {'A': 0.1, 'B': 0, 'C': 0.9},
    ('C',): {'A': 0.4, 'B': 0.5, 'C': 0.1}
}
second_order = {
    ('A', 'A'): {'A': 0.4, 'B': 0.6},
    ('A', 'B'): {'A': 0.2, 'B': 0.8},
    ('B', 'A'): {'A': 1, 'B': 0},
    ('B', 'B'): {'A': 0.3, 'B': 0.7}
}
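Tables like these can be estimated empirically from an observed state sequence (a sketch; `estimate_transitions` is not a course function):

```python
from collections import defaultdict

def estimate_transitions(sequence, order=1):
    """Estimate an order-n transition table from an observed state sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - order):
        history = tuple(sequence[i:i + order])
        counts[history][sequence[i + order]] += 1
    model = {}
    for history, nexts in counts.items():
        total = sum(nexts.values())
        model[history] = {state: cnt / total for state, cnt in nexts.items()}
    return model
```

For example, `estimate_transitions("ABAB")` yields `{('A',): {'B': 1.0}, ('B',): {'A': 1.0}}`.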
Experimental method §
Data:
 train dataset
 test dataset
randomness.
mean squared error: $\frac{1}{n} \sum^n_{i=1}\left(P_i - A_i\right)^2$ where $P$ is the vector of predictions, $A$ is the vector of actual values
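The MSE formula in code (a sketch; the function name is not from the notes):

```python
def mean_squared_error(predictions, actuals):
    """Mean squared error between predictions and actual values."""
    n = len(predictions)
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / n
```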
simpleplot §
Data format:
# doesn't allow repeated x coordinates
wages = {2000: 200, 2001: 200, 2002: 250, 2003: 300}
# allows repeated x coordinates
wages = [(2000, 200), (2001, 200), (2002, 250), (2003, 300)]
plot_lines §
plot_lines(title, width, height, x_label, y_label, datasets, show_points, [dataset_labels])
datasets must be a list
plot_bars §
plot_bars(title, width, height, x_label, y_label, datasets, [dataset_labels])
List comprehensions §
[x ** 2 for x in range(n)]
zip and enumerate §
zip takes iterables and returns an iterator over tuples of their $i$th values.
for a_i, b_i in zip(a, b):
zip([1, 2, 3], [1, 2, 3])
# [(1, 1), (2, 2), (3, 3)]
zip([1, 2, 3], ['a', 'b', 'c'], ['i', 'ii', 'iii'])
# [(1, 'a', 'i'), (2, 'b', 'ii'), (3, 'c', 'iii')]
zip([1, 2, 3, 4], [1, 2, 3])
# [(1, 1), (2, 2), (3, 3)]
enumerate([5, 4, 3])
# [(0, 5), (1, 4), (2, 3)]
enumerate([5, 4, 3], 2)
# [(2, 5), (3, 4), (4, 3)]
Module 4 §
OOP §
class Foo:
    """
    """
    def __init__(self, a, b):
        self.x = a
        self.y = b
        # return nothing

    def fn(self):
        pass

    def bar(self):
        pass

f = Foo(1, 2)
f.fn()
f.x
f.y
Graphs §
$G = (V, E)$
$V$: nonempty set of vertices
$E$: set of pairs of vertices (aka. edges)
degree of a vertex is the number of edges connected to it.
deleting some vertices and/or edges leaves a subgraph
$G' = (V',E')$ is a subgraph of $G$ iff:
 $V' \subseteq V$ and $V'$ is nonempty
 $E' \subseteq E$ and $E'$ only contains edges connecting vertices in $V'$ (not necessarily all of those edges)
Types §
directed graphs: edges have direction: edges are ordered pairs of vertices.
 in-degree of a vertex: # of edges pointing to the vertex
 out-degree of a vertex: # of edges pointing from the vertex
weighted graphs: edges have weights (numbers) associated. weight usually represents “cost”, but may also represent importance etc.
nodes and edges may have attributes. node/edge metadata.
Properties §

connectivity: are all nodes connected?
how many isolated components are there? how large are they?
the definition is trickier for directed graphs.

path: sequences of edges connecting a set of vertices between a starting vertex and an ending vertex.
length: number of edges.
must follow the direction (if applicable).
may have 0 edges (path from $x$ to $x$)
may repeat edges.

distance: length of the shortest path(s)
distance between unconnected nodes is $\infty$

diameter: longest shortest path between any two nodes
Representation §
$V = \{1, 2, 3, 4\}$
$E = \{(1, 2), (2, 4), (4, 1)\}$
For undirected graphs you can represent each edge as two directed edges: $(x, y)$ and $(y, x)$
Adjacency matrix §
square matrix. $i$th row and $i$th column corresponding to $i$th vertex.
$a_{ij} = \begin{cases} 1 & (i,j) \in E\\ 0 & (i,j) \not\in E \end{cases}$
elements could also be weights. if undirected, the matrix is symmetric, so storing only the upper or lower triangle suffices.
fine if the number of vertices is small compared to the number of edges (dense graphs); otherwise, space inefficient.
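Building the matrix for the example $V$, $E$ above might look like this (a sketch; `adjacency_matrix` is not a course function, and the vertex list fixes the row/column order):

```python
def adjacency_matrix(vertices, edges):
    """Build a 0/1 adjacency matrix from an ordered vertex list and edge set."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for (u, v) in edges:
        matrix[index[u]][index[v]] = 1  # directed: row = from, column = to
    return matrix
```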
Adjacency list §
map.
 keys: nodes
 values: lists of nodes the key is connected to
graph = {
    'A': set(['B']),       # *from* A *to* B
    'B': set(['A', 'C']),  # *from* B *to* A and C
    'C': set([])           # *from* C *to* nothing
}
directed by default.
Sets §
 unordered
 no dupes
s = set([3, 4, 5, 1, 9])

s.add(element)

s.remove(element)
raises KeyError when element not in s

s.discard(element)
doesn't error when element not in s

s.pop()
removes and returns an arbitrary element

s.union(b)
doesn’t mutate. returns a new set.

s.intersection(b)

s.difference(b)

s.symmetric_difference(b)

s.intersection_update(b)
mutates.
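A quick demonstration of which operations return new sets and which mutate (the example values are arbitrary):

```python
a = set([1, 2, 3])
b = set([2, 3, 4])

union = a.union(b)               # {1, 2, 3, 4}; a is unchanged
inter = a.intersection(b)        # {2, 3}
diff = a.difference(b)           # {1}
sym = a.symmetric_difference(b)  # {1, 4}

a.intersection_update(b)         # mutates: a becomes {2, 3}
```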
Stacks and Queues §
Stacks §
Last-in first-out
 PUSH
 POP
Queues §
First-in first-out
 PUSH
 POP
BFS §
“breadth-first”
def bfs(graph, start_node):
    node_queue = Queue()
    distance = {}
    parent = {}
    for node in graph.get_nodes():
        distance[node] = None  # None: the node cannot be reached
        parent[node] = None
    node_queue.push(start_node)
    distance[start_node] = 0
    while not node_queue.empty():
        current_node = node_queue.pop()
        for neighbor_node in graph.get_neighbors(current_node):
            if distance[neighbor_node] is None:
                distance[neighbor_node] = distance[current_node] + 1
                parent[neighbor_node] = current_node
                node_queue.push(neighbor_node)
    return distance, parent
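The parent map can be walked backwards from any target to recover a shortest path (a sketch; `shortest_path` is not from the notes):

```python
def shortest_path(parent, start_node, end_node):
    """Recover the start->end path from a BFS parent map, or None if unreachable."""
    path = [end_node]
    while path[-1] != start_node:
        if parent.get(path[-1]) is None:
            return None  # end_node was never reached from start_node
        path.append(parent[path[-1]])
    path.reverse()
    return path
```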
Module 5 §
QR Codes §
in: message
 encode message as binary (ASCII-like)
 Reed-Solomon ECC
 Order codewords (interleave)
 Place data on a QR grid
 Mask data
 Add format and version info
out: qr code
Error detection and correction §
Parity bit: additional bit.
$P = \begin{cases}1 & \textrm{odd number of ones} \\ 0 & \textrm{even number of ones} \end{cases}$
can’t detect where or what the errors are.
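The parity function $P$ in code:

```python
def parity(bits):
    """1 if the number of ones is odd, else 0."""
    return sum(bits) % 2
```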
Hamming codes
$H_0 = P(D_0, D_1, D_3)$
$H_1 = P(D_0, D_2, D_3)$
$H_2 = P(D_1, D_2, D_3)$
Hamming distance: “how many bits are different?”
Hamming codes can:
 detect and correct 1-bit errors
 detect 2-bit errors
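The three parity equations above can be applied directly to encode 4 data bits (a sketch; `hamming_encode` and the output bit order are assumptions, not from the notes):

```python
def parity(bits):
    """1 if the number of ones is odd, else 0."""
    return sum(bits) % 2

def hamming_encode(data):
    """Append the Hamming parity bits H0, H1, H2 to 4 data bits D0..D3."""
    d0, d1, d2, d3 = data
    h0 = parity([d0, d1, d3])
    h1 = parity([d0, d2, d3])
    h2 = parity([d1, d2, d3])
    return [d0, d1, d2, d3, h0, h1, h2]
```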
Copying §
Be careful when creating copies of objects with nested objects. You usually want a deep copy (as opposed to a shallow copy).
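The difference in a few lines, using the standard `copy` module:

```python
import copy

nested = [[1, 2], [3]]
shallow = copy.copy(nested)   # new outer list, but the SAME inner lists
deep = copy.deepcopy(nested)  # new outer list AND new inner lists

nested[0].append(99)
```

After the mutation, `shallow[0]` is `[1, 2, 99]` (it shares the inner list), while `deep[0]` is still `[1, 2]`.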
$Z_{256}$ §
Reed Solomon §
$n$ is the number of message bytes
$k$ is the number of EC bytes required
$m = \sum_{i = 0}^{n - 1} M_i x^{k+n-1-i}$ in $Z_{256}$
where $M = M_0 M_1 \cdots M_{n - 1}$
$g = \prod_{i = 0}^{k - 1}(x - 2^i)$ in $Z_{256}$
Programming Principles §
 think
 don’t program by coincidence
 DRY
 make it easy to reuse
 avoid unrelated side-effects
 refactor {early,often}
 don’t live with broken windows
 don’t panic :p