Add trees and hash table stuff
parent
f8b2dca3a3
commit
a63f6add94
7 changed files with 257 additions and 3 deletions
BIN
labs/excel-thing
Binary file not shown.
126
notes/fund-prog-3/chapter-5/notes.md
Normal file
|
@ -0,0 +1,126 @@
|
||||||
|
# Chapter 5
|
||||||
|
|
||||||
|
## Linear Probing
|
||||||
|
|
||||||
|
When a collision occurs, we step through the hash table one slot at a time from the collision point, wrapping around at the end, until we find the next available slot.
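
A minimal sketch of that idea, assuming the table is a Python list with `None` marking an empty slot and `key % size` as the hash function. The name `linear_probe_insert` is made up for illustration:

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; return the index used, or None if the table is full."""
    size = len(table)
    home = key % size  # assumed hash function: key mod table size
    for i in range(size):
        index = (home + i) % size  # move forward one slot at a time, wrapping around
        if table[index] is None:  # None marks an empty slot (an assumption of this sketch)
            table[index] = key
            return index
    return None  # every slot is occupied


table = [20, 41, None, None, None]
print(linear_probe_insert(table, 40))  # 40 % 5 = 0; slots 0 and 1 are taken, so this prints 2
```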
|
||||||
|
|
||||||
|
## Quadratic Probing
|
||||||
|
|
||||||
|
When a collision occurs, we find the next available slot by adding a quadratic function of the probe count to the hash value:
|
||||||
|
|
||||||
|
$(H + c1*i + c2*i^2) \bmod (\text{table size})$
|
||||||
|
|
||||||
|
Where $H$ is the hash value, $c1$ and $c2$ are constants, and $i$ is the number of times we've probed.
|
||||||
|
|
||||||
|
For instance:
|
||||||
|
|
||||||
|
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | null |
| 3 | null |
| 4 | null |
|
||||||
|
|
||||||
|
- $c1 = 2$
|
||||||
|
- $c2 = 4$
|
||||||
|
|
||||||
|
So running `insert(40)` would result in:
|
||||||
|
|
||||||
|
- $H = 40 \bmod 5 = 0$
|
||||||
|
- $i = 1$
|
||||||
|
- $(0 + (2 * 1) + (4 * 1^2)) \bmod 5$ -> $(0 + 2 + 4) \bmod 5 = 1$
|
||||||
|
- $i = 2$ - retrying because there's already a value at 1
|
||||||
|
- $(0 + (2 * 2) + (4 * 2^2)) \bmod 5$ -> $(0 + 4 + 16) \bmod 5 = 0$
|
||||||
|
- $i = 3$ - retrying because there's already a value at 0
|
||||||
|
- $(0 + (2 * 3) + (4 * 3^2)) \bmod 5$ -> $(0 + 6 + 36) \bmod 5 = 2$
|
||||||
|
- There is no value at 2, so we can insert 40 there.
|
||||||
|
|
||||||
|
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | 40 |
| 3 | null |
| 4 | null |
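
The walkthrough above can be sketched as a loop. This is a rough sketch, not a full hash table: it assumes a Python list with `None` for empty slots, `key % size` as the hash, and a made-up function name `quadratic_probe_insert`. Quadratic probing is not guaranteed to reach every slot, so the probe cap (`max_probes`) is an arbitrary choice here:

```python
def quadratic_probe_insert(table, key, c1, c2, max_probes=None):
    """Insert key using quadratic probing; return the index used, or None if no slot was found."""
    size = len(table)
    h = key % size  # hash value, as in the example above
    if max_probes is None:
        max_probes = size  # arbitrary cap for this sketch
    for i in range(max_probes):
        index = (h + c1 * i + c2 * i * i) % size
        if table[index] is None:  # None marks an empty slot (an assumption of this sketch)
            table[index] = key
            return index
    return None


table = [20, 41, None, None, None]
print(quadratic_probe_insert(table, 40, c1=2, c2=4))  # probes 0, 1, 0, then 2; prints 2
```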
|
||||||
|
|
||||||
|
And a script that computes the probed index for a given $i$:
|
||||||
|
|
||||||
|
```python
import sys

size = int(sys.argv[-5])  # size of the table
num = int(sys.argv[-4])  # number to hash
c1 = int(sys.argv[-3])  # c1
c2 = int(sys.argv[-2])  # c2
i = int(sys.argv[-1])  # number of times we've probed

h = num % size  # hash value

print((h + (c1 * i) + (c2 * i * i)) % size)
```
|
||||||
|
|
||||||
|
## Double Hashing
|
||||||
|
|
||||||
|
Another collision resolution technique, where a second hash function determines how far to jump on each probe. The formula is:
|
||||||
|
|
||||||
|
$(h1(\text{key}) + i * h2(\text{key})) \bmod (\text{table size})$
|
||||||
|
|
||||||
|
Where $h1$ is the first hash function, $h2$ is the second hash function, and $i$ is the number of times we've probed.
|
||||||
|
|
||||||
|
For instance:
|
||||||
|
|
||||||
|
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | null |
| 3 | null |
| 4 | null |
|
||||||
|
|
||||||
|
Assuming:
|
||||||
|
|
||||||
|
- $h1(key) = key \bmod 5$
- $h2(key) = (i * key) \bmod 3$ - an arbitrary choice, just for the example (normally $h2$ depends only on the key and must never be 0)
|
||||||
|
|
||||||
|
So running `insert(40)` would result in:
|
||||||
|
|
||||||
|
- $h1(40) = 40 \bmod 5 = 0$ - there's already a value at 0, so we use the second function
- $i = 1$: $h2(40) = (1 * 40) \bmod 3 = 1$; index $= (0 + (1 * 1)) \bmod 5 = 1$ - there's already a value at 1
- $i = 2$: $h2(40) = (2 * 40) \bmod 3 = 2$; index $= (0 + (2 * 2)) \bmod 5 = 4$ - there's no value at 4, so we can insert 40 there.
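
A small sketch of the steps above, using the same example functions from these notes (`h1` is the key mod the table size, `h2` is `(i * key) % 3`). The function name `double_hash_insert` and the use of `None` for empty slots are assumptions made for illustration:

```python
def double_hash_insert(table, key):
    """Insert key using the example double-hashing scheme; return the index used, or None."""
    size = len(table)
    h1 = key % size  # first hash: key mod table size (mod 5 for the 5-slot example)
    for i in range(size):  # probe cap is an arbitrary choice for this sketch
        h2 = (i * key) % 3  # second hash from the example above
        index = (h1 + i * h2) % size
        if table[index] is None:  # None marks an empty slot
            table[index] = key
            return index
    return None


table = [20, 41, None, None, None]
print(double_hash_insert(table, 40))  # probes 0, 1, then 4; prints 4
```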
|
||||||
|
|
||||||
|
## Common Hash Functions
|
||||||
|
|
||||||
|
### Mid-Square hashing
|
||||||
|
|
||||||
|
1. Square the key
|
||||||
|
2. Take the middle $R$ digits - $R$ must be greater than or equal to $\log_{10}(\text{table size})$
|
||||||
|
3. Use that as the hash value - if it's above the number of slots, we can use the modulo of the table size. e.g., if the table size is 100, and the hash value is 123, we can use 23.
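
The three steps might look like this in Python. This is only a sketch: the name `mid_square_hash` is made up, and how to pick the "middle" when the digits can't be split evenly (here, leaning left) is an assumption:

```python
def mid_square_hash(key, r, table_size):
    """Decimal mid-square hash: square the key, take the middle r digits, reduce mod table_size."""
    squared = str(key * key).zfill(r)  # pad with leading zeros if the square has fewer than r digits
    start = (len(squared) - r) // 2  # leans left when the split is uneven (an assumption)
    middle = int(squared[start:start + r])
    return middle % table_size


print(mid_square_hash(453, 2, 100))  # 453^2 = 205209, middle two digits are 52
```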
|
||||||
|
|
||||||
|
#### Binary
|
||||||
|
|
||||||
|
Mid-square hashing is usually done in binary, since that's what computers work with natively and extracting bits is faster than extracting decimal digits. It works the same way, but in base 2:
|
||||||
|
|
||||||
|
1. Square the key
|
||||||
|
2. Take the middle $R$ bits - $R$ must be greater than or equal to $\log_2(\text{table size})$
|
||||||
|
3. Use that as the hash value - if it's above the number of slots, we can use the modulo of the table size.
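
A sketch of the binary version using shifts and masks. The name `mid_square_hash_bits` is made up, and measuring the "middle" against the actual bit length of the square (rather than a fixed word size) is an assumption of this sketch:

```python
def mid_square_hash_bits(key, r, table_size):
    """Binary mid-square hash: square the key, take the middle r bits, reduce mod table_size."""
    squared = key * key
    total_bits = max(squared.bit_length(), r)  # width of the square in bits
    low_bits_to_drop = (total_bits - r) // 2  # bits to discard below the middle window
    middle = (squared >> low_bits_to_drop) & ((1 << r) - 1)  # shift down, then mask r bits
    return middle % table_size


print(mid_square_hash_bits(13, 4, 16))  # 13^2 = 169 = 0b10101001, middle four bits 1010 = 10
```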
|
||||||
|
|
||||||
|
### Multiplicative string hashing
|
||||||
|
|
||||||
|
Multiplicative string hashing in Python:

```py
def multiplicative_string_hash(key, table_size, initial_value=5381, hash_multiplier=33):
    # Bernstein's hash uses initial_value = 5381 and hash_multiplier = 33
    string_hash = initial_value

    for character in key:
        string_hash *= hash_multiplier
        string_hash += ord(character)  # ord() returns the character's code (its ASCII number for ASCII text)

    return string_hash % table_size
```
|
||||||
|
|
||||||
|
Bernstein's hash uses an initial value of 5381 and a hash multiplier of 33, and works well for hashing short English strings.
|
20
notes/fund-prog-3/chapter-5/programs/mult_str_hashing.py
Normal file
|
@ -0,0 +1,20 @@
|
||||||
|
import sys

initial = int(sys.argv[-4])
multiplier = int(sys.argv[-3])
size = int(sys.argv[-2])
string = sys.argv[-1]


def multiplicative_string_hash(string):
    string_hash = initial
    hash_multiplier = multiplier

    for character in string:
        string_hash *= hash_multiplier
        string_hash += ord(character)

    return string_hash % size


print(multiplicative_string_hash(string))
|
24
notes/fund-prog-3/chapter-5/programs/num_hashing.py
Normal file
|
@ -0,0 +1,24 @@
|
||||||
|
import sys

size = int(sys.argv[-5])
num = int(sys.argv[-4])
var_1 = int(sys.argv[-3])
var_2 = int(sys.argv[-2])
i = int(sys.argv[-1])

def quadratic():
    h = num % size
    return (h + (var_1 * i) + (var_2 * i * i)) % size

def mod_size():
    return num % size

def h2():
    return var_2 - (num % var_2)

def combined():
    return (mod_size() + (i * h2())) % size

func = combined

print(func())
|
|
@ -24,11 +24,13 @@ ShellSort(array, gapList)
|
||||||
|
|
||||||
## Hibbard
|
## Hibbard
|
||||||
|
|
||||||
2<sup>k</sup> - 1 where k is 1 to p where └ k<sup>p</sup> ┐ = N
|
$2^k - 1$ where $k$ is 1 to $p$ where $└ k^p ┐$ = N
|
||||||
|
|
||||||
|
editor's (now-future me) note: idk why those weird bracket things are there lol, but they were written on the board for some reason and i just don't remember it
|
||||||
|
|
||||||
## Pratt
|
## Pratt
|
||||||
|
|
||||||
For a Z-tuples for (0, 0) -> (k, k) create all the cartesian pairs
|
For a Z-tuples for $(0, 0) -> (k, k)$ create all the cartesian pairs
|
||||||
|
|
||||||
```txt
|
```txt
|
||||||
(0, 0), (0, 1), (0, 2), ..., (0, k)
|
(0, 0), (0, 1), (0, 2), ..., (0, k)
|
||||||
|
@ -40,7 +42,7 @@ For a Z-tuples for (0, 0) -> (k, k) create all the cartesian pairs
|
||||||
|
|
||||||
## Naive gap values
|
## Naive gap values
|
||||||
|
|
||||||
N/2<sup>k</sup>, k = 1 to p where N/2<sup>p</sup> => 1
|
$N/2^k$, $k = 1$ to $p$ where $N/2^p=> 1$
|
||||||
|
|
||||||
```py
|
```py
|
||||||
for i, j in S
|
for i, j in S
|
||||||
|
|
82
notes/fund-prog-3/trees.md
Normal file
|
@ -0,0 +1,82 @@
|
||||||
|
# Trees
|
||||||
|
|
||||||
|
A general tree is a data structure where each node can have zero or more children. Every node has exactly one parent, except the root, which has none.
|
||||||
|
|
||||||
|
```cpp
|
||||||
|
std::vector<TreeNode*> stack;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Definition
|
||||||
|
|
||||||
|
A tree T is defined by a root r and a (possibly empty) list of subtrees $T^1, T^2, ..., T^n$ where $n \geq 0$. Each subtree is itself a general tree according to the same definition.
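
To make the recursive definition concrete, here is a rough Python sketch. The class layout, the `children` list, and the helper `count_nodes` are all assumptions for illustration, not the lab's actual `TreeNode`:

```python
class TreeNode:
    """A general tree node: some data plus a (possibly empty) list of subtrees."""

    def __init__(self, data):
        self.data = data
        self.children = []  # zero or more subtrees, each itself a general tree

    def add_child(self, child):
        self.children.append(child)
        return child


def count_nodes(node):
    """Count the nodes in a tree by following the recursive definition."""
    return 1 + sum(count_nodes(child) for child in node.children)


root = TreeNode("r")
left = root.add_child(TreeNode("a"))
root.add_child(TreeNode("b"))
left.add_child(TreeNode("c"))
print(count_nodes(root))  # r, a, b, c: prints 4
```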
|
||||||
|
|
||||||
|
## Diagram
|
||||||
|
|
||||||
|
`TreeNode`
|
||||||
|
|
||||||
|
```text
                     |------------------|
                     |       Data       |
                     |     l1    l2     |
                     |------------------|
                    /                    \
                  /                        \
        |--------------|       |----------------------------|
        |     Data     |       |            Data            |
        |   l1    l2   |       |       l1    l2    l3       |
        |--------------|       |----------------------------|
         /           \            /          /           \
       /        |--------|    |------|   |------|   |----------|
  |------|      |  Data  |    | Data |   | Data |   |   Data   |
  | Data |      |   l1   |    |------|   |------|   |  l1  l2  |
  |------|      |--------|                          |----------|
                 /                                   /        \
          |------|                                 /            \
          | Data |                            |------|      |------|
          |------|                            | Data |      | Data |
                                              |------|      |  l1  |
                                                            |------|
                                                             /
                                                            /
                                                      |------|
                                                      | Data |
                                                      |------|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Trees are traversed starting at the root. Traversals are often recursive because trees are themselves defined recursively. It is **bad** practice to expose a reference to any node externally. However, in this lab the following are publicly exposed:
|
||||||
|
|
||||||
|
- `TreeNode* root`
|
||||||
|
- `void addChild()`
|
||||||
|
- `TreeNode* getRoot()`
|
||||||
|
- `TreeNode* findNode()`
|
||||||
|
- `collection(TreeNode*) getChildren()`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Level-Order Traversal in Python (`visit` is a function supplied by the caller; `getChildren()` is the accessor listed above):

```py
from collections import deque

def LOT(root, visit):
    # q is a queue of TreeNode references, starting with the root
    q = deque([root] if root is not None else [])
    while q:  # while q is not empty
        node = q.popleft()  # dequeue
        visit(node)
        for child in node.getChildren():
            q.append(child)  # enqueue
```
|
||||||
|
|
||||||
|
```py
LOT_ReportLevels(root):
    # result is a list of lists of TreeNode references
    result;
    # q is a queue of TreeNode refs
    q;
    while q is not empty:
        # level is a list of TreeNode references
        level;
        # levelSize is q.size(?)
        # incomplete
```
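
The block above is unfinished. One way it might be completed is sketched below, under some assumptions: the `TreeNode` class here is a stand-in with a plain `children` list (not the lab's class), and the idea is that everything in the queue at the start of each pass belongs to the same level, so the queue's size at that moment gives `levelSize`:

```python
from collections import deque


class TreeNode:
    """Stand-in node type for this sketch (not the lab's TreeNode)."""

    def __init__(self, data):
        self.data = data
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


def lot_report_levels(root):
    """Level-order traversal that returns a list of levels, each a list of nodes."""
    result = []
    q = deque()
    if root is not None:
        q.append(root)
    while q:
        level = []
        level_size = len(q)  # nodes queued right now all sit on the same level
        for _ in range(level_size):
            node = q.popleft()
            level.append(node)
            q.extend(node.children)  # children wait in the queue for the next pass
        result.append(level)
    return result


root = TreeNode("a")
b = root.add_child(TreeNode("b"))
root.add_child(TreeNode("c"))
b.add_child(TreeNode("d"))
print([[n.data for n in level] for level in lot_report_levels(root)])
# prints [['a'], ['b', 'c'], ['d']]
```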
|