notes.txt

Hyperparameters are adjustable parameters that let you control the model optimization process (below).
These differ from regular parameters in that regular parameters are what's being fitted in the model.

Number of Epochs - the number times to iterate over the dataset

Batch Size - the number of data samples propagated through the network before the parameters are updated

Learning Rate - how much to update models parameters at each batch/epoch. Smaller values yield slow learning speed, while large values may result in unpredictable behavior during training.

Loss function is calculated on every pass of prediction vs actual: EG mean squared error or cross entropy loss

Optimizer is how the net will converge: EG gradient descent and stochastic gradient descent
Optimization is the process of adjusting model parameters to reduce model error in each training step.