it's still overkill
major layout changes in the network
using the AdamW optimizer again; AdamW is the go-to for small toy models and transformers. refactored the NN classes into their own module, pairwise_comp_nn.py
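
for reference, a minimal sketch of what one AdamW update does, to show how it differs from plain Adam with L2 regularization: the weight decay is applied to the parameter directly instead of being folded into the gradient. this is pure Python on a single scalar, not the code in this repo; `adamw_step` and its default hyperparameters are hypothetical names and values chosen for illustration.

```python
import math


def adamw_step(param, grad, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a single scalar parameter (illustrative only).

    m and v are the running first and second moment estimates,
    t is the 1-based step count.
    """
    # Decoupled weight decay: shrink the parameter directly,
    # independent of the gradient.
    param = param - lr * weight_decay * param

    # Exponential moving averages of the gradient and squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias correction for the zero-initialised moments.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # Adam step using the corrected moments.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

note that with a zero gradient the Adam part of the step is zero but the parameter still decays toward zero, which is the "decoupled" behaviour that separates AdamW from Adam with an L2 penalty.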