Add trees and hash table stuff

askiiart 2024-03-11 17:44:38 -05:00
parent f8b2dca3a3
commit a63f6add94
Signed by untrusted user who does not match committer: askiiart
GPG key ID: BC3800E55FB54D67
7 changed files with 257 additions and 3 deletions


@ -0,0 +1,126 @@
# Chapter 5
## Linear Probing
When a collision occurs, we step through the hash table one slot at a time, starting from the hashed index and wrapping around at the end, until we find the next available slot.
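A minimal sketch of linear probing, assuming integer keys, a hash function of `key % table size`, and a Python list where `None` marks an empty slot. The function name `linear_probe_insert` is made up for illustration.

```python
def linear_probe_insert(table, key):
    """Insert key into table and return the index it ended up at."""
    size = len(table)
    start = key % size  # assumed hash function: key mod table size
    for i in range(size):
        index = (start + i) % size  # one slot at a time, wrapping around
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("table is full")


table = [20, 41, None, None, None]
# 40 % 5 = 0, slots 0 and 1 are taken, so this prints 2
print(linear_probe_insert(table, 40))
```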
## Quadratic Probing
Instead of stepping one slot at a time, we find the next available slot by adding an offset that grows quadratically with the number of probes:
$(H + c1*i + c2*i^2) \bmod \text{(table size)}$
Where $H$ is the hash value, $c1$ and $c2$ are constants, and $i$ is the number of times we've probed.
For instance:
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | null |
| 3 | null |
| 4 | null |
- $c1 = 2$
- $c2 = 4$
So running `insert(40)` would result in:
- $H = 40 \bmod 5 = 0$ - there's already a value at 0 (this is the $i = 0$ probe), so we start probing
- $i = 1$
- $(0 + (2 * 1) + (4 * 1^2)) \bmod 5$ -> $(0 + 2 + 4) \bmod 5 = 1$
- $i = 2$ - retrying because there's already a value at 1
- $(0 + (2 * 2) + (4 * 2^2)) \bmod 5$ -> $(0 + 4 + 16) \bmod 5 = 0$
- $i = 3$ - retrying because there's already a value at 0
- $(0 + (2 * 3) + (4 * 3^2)) \bmod 5$ -> $(0 + 6 + 36) \bmod 5 = 2$
- There is no value at 2, so we can insert 40 there.
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | 40 |
| 3 | null |
| 4 | null |
And a script that computes the index for a single probe, given the table size, the number, $c1$, $c2$, and $i$:
```python
import sys
size = int(sys.argv[-5]) # size of the table
num = int(sys.argv[-4]) # number to hash
c1 = int(sys.argv[-3]) # c1
c2 = int(sys.argv[-2]) # c2
i = int(sys.argv[-1]) # number of times we've probed
h = num % size # hash value
print((h + (c1 * i) + (c2 * i * i)) % size)
```
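The script above computes one probe index at a time. As a rough sketch, the whole insert can be looped until a free slot turns up. The function name `quadratic_probe_insert`, the use of `None` for empty slots, and the choice to give up after as many probes as there are slots are all assumptions. Quadratic probing is not guaranteed to reach every slot, so the loop can fail even when the table is not full.

```python
def quadratic_probe_insert(table, key, c1, c2):
    """Insert key using quadratic probing; return (index, probes used)."""
    size = len(table)
    h = key % size  # hash value H
    for i in range(size):  # assumed limit: at most `size` probes
        index = (h + c1 * i + c2 * i * i) % size
        if table[index] is None:
            table[index] = key
            return index, i
    raise RuntimeError("no free slot found")


table = [20, 41, None, None, None]
# Same example as above: probes hit 0, 1, 0 and then 2, so this prints (2, 3)
print(quadratic_probe_insert(table, 40, c1=2, c2=4))
```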
## Double Hashing
Another collision resolution technique, where a second hash function decides how far to jump on each probe. The formula is:
$(h1(key) + i * h2(key)) \bmod \text{(table size)}$
Where $h1$ is the first hash function, $h2$ is the second hash function, and $i$ is the number of times we've probed.
For instance:
| Index | Value |
|-------|-------|
| 0 | 20 |
| 1 | 41 |
| 2 | null |
| 3 | null |
| 4 | null |
Assuming:
- $h1(key) = key \bmod 5$
- $h2(key) = key \bmod 3$ - just an example; a real $h2$ must never return 0, otherwise the probe never moves
So running `insert(40)` would result in:
- $h1(40) = 40 \bmod 5 = 0$ and $h2(40) = 40 \bmod 3 = 1$
- $i = 0$: $(0 + 0 * 1) \bmod 5 = 0$ - there's already a value at 0
- $i = 1$: $(0 + 1 * 1) \bmod 5 = 1$ - there's already a value at 1
- $i = 2$: $(0 + 2 * 1) \bmod 5 = 2$ - there's no value at 2, so we can insert 40 there.
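A sketch of the general formula $(h1(key) + i * h2(key)) \bmod \text{(table size)}$ with both hash functions passed in as parameters. The name `double_hash_insert` and the `None`-for-empty convention are assumptions. The step function used in the usage lines, `3 - (key % 3)`, is also just an assumed example (it can never be 0) and is not the $h2$ from the walkthrough above.

```python
def double_hash_insert(table, key, h1, h2):
    """Insert key using double hashing; return the index it ended up at."""
    size = len(table)
    step = h2(key)
    if step % size == 0:
        # A step of 0 (mod size) would probe the same slot forever
        raise ValueError("h2(key) must not be a multiple of the table size")
    for i in range(size):
        index = (h1(key) + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("no free slot found")


table = [20, 41, None, None, None]
index = double_hash_insert(
    table, 40,
    h1=lambda k: k % 5,
    h2=lambda k: 3 - (k % 3),  # assumed step function, always 1..3
)
# h1(40) = 0 is taken, step is 2, so the first probe lands on 2
print(index)
```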
## Common Hash Functions
### Mid-Square hashing
1. Square the key
2. Take the middle $R$ digits - $R$ must be greater than or equal to $\log_{10}(\text{table size})$
3. Use that as the hash value - if it's above the number of slots, we can use the modulo of the table size. e.g., if the table size is 100, and the hash value is 123, we can use 23.
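The three steps above can be sketched in Python as follows. The function name `mid_square_hash`, the zero-padding of short squares, and the rule for picking the middle when the leftover digits are uneven (the extra digit is dropped from the right) are assumptions, since the notes do not pin those details down.

```python
def mid_square_hash(key, r, table_size):
    """Decimal mid-square hash: middle r digits of key squared, mod table size."""
    squared = str(key * key).zfill(r)  # pad so there are at least r digits
    start = (len(squared) - r) // 2    # assumed tie-break for uneven leftovers
    middle = squared[start:start + r]
    return int(middle) % table_size


# 453 * 453 = 205209, and the middle 2 digits are "52"
print(mid_square_hash(453, r=2, table_size=100))  # prints 52
```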
#### Binary
Mid-square hashing is usually done in binary, since extracting bits with shifts and masks is faster for a computer than extracting decimal digits. It works the same, but using base 2:
1. Square the key
2. Take the middle $R$ bits - $R$ must be greater than or equal to $\log_2(\text{table size})$
3. Use that as the hash value - if it's above the number of slots, we can use the modulo of the table size.
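A sketch of the binary version using a shift and a mask. It assumes keys fit in a fixed width (`key_bits`, 16 here, chosen arbitrarily), so the square occupies up to `2 * key_bits` bits and "the middle" is measured against that width. The function name `mid_square_hash_binary` is hypothetical.

```python
def mid_square_hash_binary(key, r, table_size, key_bits=16):
    """Binary mid-square hash: middle r bits of key squared, mod table size."""
    squared = key * key                    # up to 2 * key_bits bits wide
    shift = (2 * key_bits - r) // 2        # bits to drop from the low end
    middle = (squared >> shift) & ((1 << r) - 1)  # keep only r bits
    return middle % table_size


# 453 * 453 = 205209; dropping the low 12 bits leaves 50
print(mid_square_hash_binary(453, r=8, table_size=256))  # prints 50
```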
### Multiplicative string hashing
Multiplicative string hashing in Python:
```py
def multiplicative_string_hash(key, table_size, initial_value=5381, hash_multiplier=33):
    # The defaults are Bernstein's hash: initial_value 5381, hash_multiplier 33
    string_hash = initial_value
    for character in key:
        string_hash *= hash_multiplier
        string_hash += ord(character)  # ord() returns the ASCII number for an ASCII character
    return string_hash % table_size
```
Bernstein's hash uses an initial value of 5381 and a hash multiplier of 33, and works well for hashing short English strings.


@ -0,0 +1,20 @@
import sys
initial = int(sys.argv[-4])
multiplier = int(sys.argv[-3])
size = int(sys.argv[-2])
string = sys.argv[-1]

def multiplicative_string_hash(string):
    string_hash = initial
    hash_multiplier = multiplier
    for character in string:
        string_hash *= hash_multiplier
        string_hash += ord(character)
    return string_hash % size

print(multiplicative_string_hash(string))


@ -0,0 +1,24 @@
import sys
size = int(sys.argv[-5])
num = int(sys.argv[-4])
var_1 = int(sys.argv[-3])
var_2 = int(sys.argv[-2])
i = int(sys.argv[-1])

def quadratic():
    h = num % size
    return (h + (var_1 * i) + (var_2 * i * i)) % size

def mod_size():
    return num % size

def h2():
    return var_2 - (num % var_2)

def combined():
    return (mod_size() + (i * h2())) % size

func = combined
print(func())