My personal notes on the book Computer Science Distilled: Learn the Art of Solving Computational Problems.

## 1. Basics

• Dump all important stuff on paper.
• Probability
• The formula:
$$P(event) = \frac{\#\ of\ ways\ event\ can\ happen}{\#\ of\ possible\ outcomes}$$

• Independent events: $$P(A \cap B) = P(A) \cdot P(B)$$
• Mutually exclusive events: $$P(A \cup B) = P(A) + P(B)$$
• Complementary events: $$P(A) + P(\bar{A}) = 1$$
• The Gambler’s Fallacy: past events never affect the outcome of an independent event.
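A quick Python sketch of these rules, using two fair dice as a made-up example (the dice are my illustration, not the book's):

```python
from fractions import Fraction

# Sample space: all outcomes of rolling two dice.
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]

# P(event) = (# of ways event can happen) / (# of possible outcomes)
p_double_six = Fraction(sum(1 for a, b in outcomes if a == b == 6),
                        len(outcomes))

# Independent events: P(A ∩ B) = P(A) · P(B)
p_six = Fraction(1, 6)
assert p_double_six == p_six * p_six

# Complementary events: P(A) + P(not A) = 1
assert p_double_six + (1 - p_double_six) == 1
```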

## 2. Complexity

• Time complexity, denoted by T(n), gives the number of operations the algorithm performs when processing an input of size n.
• When an algorithm can have different values of T(n) for the same value n, we resort to cases:
• Best case
• Worst case
• Average case
• In general, the worst case is the most important one, since it gives a guaranteed baseline.
• We can find the time complexity of an algorithm by counting the number of basic operations it requires for a hypothetical input of size n.
• We can approximate T(n) by its fastest-growing term, called the dominant term: $$T(n) = n^2 + n - 2 \approx O(n^2)$$
• The Big-O notation
• This notation is used to express the dominant term of an algorithm’s cost function in the worst case. A function with a fastest-growing term of $$2^n$$ or “weaker” is $$O(2^n)$$
• Space complexity measures the working storage for an algorithm.
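A small sketch of the counting idea, using a pairwise comparison loop as a hypothetical example of my own:

```python
def count_pair_ops(n):
    """Count the basic operations a nested loop performs
    when comparing every pair among n items."""
    ops = 0
    for i in range(n):
        for j in range(i + 1, n):
            ops += 1  # one comparison per pair
    return ops

# T(n) = n(n-1)/2 = 0.5*n^2 - 0.5*n.
# The dominant term is n^2, so the algorithm is O(n^2).
```

For example, `count_pair_ops(10)` returns 45, matching 10·9/2.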

## 3. Strategy

• The iterative strategy consists of using loops to repeat a process until a condition is met.
• The recursive strategy consists of using functions that delegate work to “clones of themselves”.
• The brute force strategy solves problems by inspecting all of the problem’s possible solution candidates.
• The backtrack strategy involves rolling back the previous step and continuing the search. It works best in problems where the solution is a sequence of choices and making a choice restrains subsequent choices.
• A heuristic (method) leads to a solution without guaranteeing it is the best or optimal one.
• Greedy approach: the opposite of backtracking; it consists of never revisiting previous choices.
• The divide and conquer strategy divides a problem over and over into subproblems, until they become easy enough to solve. The subproblem solutions are combined to obtain the original problem solution.
• Dynamic programming consists of identifying repeated subproblems in order to compute them only once.
• Memoization allows the reuse of partial calculations.
• Branch and bound strategies build a solution as a sequence of choices, detecting and discarding bad choices early.
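The classic illustration of dynamic programming with memoization is the Fibonacci sequence (my example, not the book’s), where naive recursion recomputes the same subproblems exponentially often:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each subproblem fib(k) is computed once, cached,
    # and reused: memoization turns O(2^n) into O(n).
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Without the `lru_cache` decorator, `fib(40)` already takes noticeable time; with it, the call is immediate.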

## 4. Data

• Abstractions are interfaces that let us omit details.
• An Abstract Data Type (ADT) is the specification of a group of operations that make sense for a given data type.
• Common abstractions
• Primitive data types
• Stack
• Queue
• Priority queue
• List
• Sorted list
• Map
• Set
• Structures
• Data structures describe how data is to be organized and accessed in the computer’s memory.
• Array: the simplest way to store stuff in computer memory. It consists of allocating a sequential block of memory and writing your data sequentially into that space.
• Linked list: data is stored in a chain of cells that doesn’t need to be at sequential memory addresses. Memory for each cell is allocated as needed.
• Tree: they are similar to lists, but are more suitable to hierarchical data.
• Binary search tree: it’s a special type of tree that can be efficiently searched.
• Red-black tree
• AVL tree
• B-tree
• Binary heap: it’s a special type of binary tree, in which we can find the smallest (or biggest) item instantly ($$O(1)$$), because it’s always the root node of the tree.
• Graph: data is freely arranged as nodes and edges; it can represent almost any kind of data.
• Hash table: it’s a data structure that allows finding items in $$O(1)$$ time.
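A minimal Python sketch of a few of these abstractions, using the standard library’s building blocks (my mapping of ADT to implementation, assuming CPython’s usual structures):

```python
from collections import deque
import heapq

# Stack (LIFO): a plain list.
stack = []
stack.append('a')
stack.append('b')
assert stack.pop() == 'b'   # last in, first out

# Queue (FIFO): a deque gives O(1) operations at both ends.
queue = deque()
queue.append('a')
queue.append('b')
assert queue.popleft() == 'a'  # first in, first out

# Binary heap: the smallest item is always at index 0.
heap = []
for x in [5, 1, 4]:
    heapq.heappush(heap, x)
assert heap[0] == 1         # O(1) access to the minimum

# Hash table: a dict finds items in O(1) average time.
table = {'key': 'value'}
assert table['key'] == 'value'
```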

## 5. Algorithms

• Sorting
• Insertion sort: it’s very efficient at sorting nearly sorted datasets, regardless of size.
• Quicksort and mergesort, both $$O(n \log n)$$, are efficient for large datasets.
• Searching
• Binary search: $$O(\log n)$$ search in arrays.
• Hash table: $$O(1)$$ search.
• Graphs
• They can be searched in two ways: DFS and BFS.
• Depth-first search (DFS): we keep following edges, going deeper and deeper in the graph; when we reach a node that has no edges to any new nodes, we go back to the previous node and continue the process. We use a stack to keep track of the trail, pushing a node when we explore it, and popping a node when we need to go back.
• Breadth-first search (BFS): it explores the graph per level, first the neighbors of your start node, then its neighbors’ neighbors, and so on. To keep track of nodes we visit, we use a queue: once we explore a node, we enqueue its children, then dequeue the next node to explore.
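Both traversals can be sketched in a few lines; the only difference is the stack vs. the queue (the four-node graph is a made-up example):

```python
from collections import deque

# Adjacency list of a small example graph.
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}

def dfs(start):
    visited, stack = [], [start]        # stack keeps the trail
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.append(node)
            # reversed() so neighbors are visited in listed order
            stack.extend(reversed(graph[node]))
    return visited

def bfs(start):
    visited, queue = [], deque([start])  # queue explores level by level
    while queue:
        node = queue.popleft()
        if node not in visited:
            visited.append(node)
            queue.extend(graph[node])
    return visited

print(dfs('A'))  # ['A', 'B', 'D', 'C'] — goes deep before backtracking
print(bfs('A'))  # ['A', 'B', 'C', 'D'] — finishes a level before descending
```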
• Operations research
• It involves defining an objective to “maximize” or “minimize”.
• Linear optimization problems are problems where the objective can be modeled using linear equations.
• The simplex method solves an optimization problem given the objective function and the linear inequalities that model its constraints. Example:
• Objective: z = 8x + 12y
• Budget constraints: 10x + 20y <= 140
• Space constraint: 6x + 8y <= 72
• We can’t have negative items: x >=0, y >= 0
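The optimum of a linear program always lies at a vertex of the feasible region, so for a tiny two-variable problem like this one we can skip the simplex machinery and just enumerate the vertices (this brute-force check is my sketch, not the simplex method itself):

```python
from itertools import combinations

# Constraints in the form a*x + b*y <= c.
constraints = [(10, 20, 140),   # budget: 10x + 20y <= 140
               (6, 8, 72),      # space:  6x + 8y <= 72
               (-1, 0, 0),      # x >= 0
               (0, -1, 0)]      # y >= 0

def feasible(x, y):
    return all(a * x + b * y <= c + 1e-9 for a, b, c in constraints)

# Each vertex is the intersection of two constraint boundaries,
# solved here with Cramer's rule.
best = None
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue  # parallel boundaries, no intersection
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    if feasible(x, y):
        z = 8 * x + 12 * y
        if best is None or z > best[0]:
            best = (z, x, y)

print(best)  # (100.0, 8.0, 3.0): z is maximized at x = 8, y = 3
```

Intersecting the budget and space boundaries gives x = 8, y = 3, for z = 8·8 + 12·3 = 100; no other feasible vertex does better.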