Why are the weights only usable in training?

Question 1

After calling fit, I can see the model converging during training, but when I then call evaluate, it behaves as if the model had not been fitted at all. The clearest example is below, where I use the training generator for both training and validation and still get different results.

import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

from ImageGenerator import ImageGenerator

if __name__ == "__main__":

    batch_size = 64

    train_gen = ImageGenerator('synthetic3/train/open/*.png', 'synthetic3/train/closed/*.png', batch_size=batch_size)

    # MobileNetV2 trained from scratch (no pretrained weights) on two classes
    model = tf.keras.applications.mobilenet_v2.MobileNetV2(weights=None, classes=2, input_shape=(256, 256, 3))

    model.compile(optimizer='adam',
                  loss=tf.keras.losses.CategoricalCrossentropy(),
                  metrics=['accuracy'])

    # The same generator is used for training and validation on purpose
    history = model.fit(
        train_gen,
        validation_data=train_gen,
        epochs=5,
        verbose=1
    )

    # Evaluate on the training data again
    model.evaluate(train_gen)

Results

Epoch 1/5
19/19 [==============================] - 11s 600ms/step - loss: 0.7707 - accuracy: 0.5016 - val_loss: 0.6932 - val_accuracy: 0.5016
Epoch 2/5
19/19 [==============================] - 10s 533ms/step - loss: 0.6991 - accuracy: 0.5855 - val_loss: 0.6935 - val_accuracy: 0.4975
Epoch 3/5
19/19 [==============================] - 10s 509ms/step - loss: 0.6213 - accuracy: 0.6637 - val_loss: 0.6932 - val_accuracy: 0.4992
Epoch 4/5
19/19 [==============================] - 10s 514ms/step - loss: 0.4407 - accuracy: 0.8158 - val_loss: 0.6934 - val_accuracy: 0.5008
Epoch 5/5
19/19 [==============================] - 10s 504ms/step - loss: 0.3200 - accuracy: 0.8643 - val_loss: 0.6949 - val_accuracy: 0.5000
19/19 [==============================] - 3s 159ms/step - loss: 0.6953 - accuracy: 0.4967

This is problematic because even when I save the weights, the saved model behaves as if it had not been fitted.

Uğur Kahveci · Answer 1 · 2021-11-24T11:43:27

The evaluate() function takes a validation dataset as input and evaluates an already trained model.

From the looks of it, you are using the training dataset (train_gen) for validation_data and also passing the same dataset to model.evaluate().

Yeah, I've done that on purpose, to show that even though the training accuracy is improving, the validation accuracy isn't, even on the same dataset.
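One possible way for identical data to score differently in fit and evaluate is a layer that behaves differently in training and inference mode, such as batch normalization. The sketch below is plain Python, not Keras code: batch_norm and the batch values are made up for illustration, and the initial moving statistics (mean 0, variance 1) and epsilon of 1e-3 are assumed to match the Keras defaults. It shows the same batch being normalized very differently depending on the mode.

```python
import math

def batch_norm(values, moving_mean, moving_var, training, eps=1e-3):
    """Normalize a list of floats like a batch-norm layer would (no scale/shift)."""
    if training:
        # Training mode: use the statistics of the current batch
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        # Inference mode: use the stored moving statistics
        mean, var = moving_mean, moving_var
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# Illustrative activations whose mean and variance are far from 0 and 1
batch = [10.0, 12.0, 14.0, 16.0]

# Moving statistics still at their assumed initial values
train_out = batch_norm(batch, 0.0, 1.0, training=True)
infer_out = batch_norm(batch, 0.0, 1.0, training=False)

print("training mode: ", [round(v, 2) for v in train_out])
print("inference mode:", [round(v, 2) for v in infer_out])
```

In training mode the outputs are centered around zero, while in inference mode they stay close to the raw values, so every later layer sees inputs it was never trained on.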

ac4824 · Answer 2 · 2021-12-10T04:30:31

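Hi everyone, after many days of pain I finally discovered the solution to this problem. It is caused by the batch normalization layers in the model. During training these layers normalize with the statistics of the current batch, but during evaluation they use moving averages, and the momentum parameter controls how quickly those averages update. If you plan on training on a custom dataset, the momentum needs to be changed to suit your batch size.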

for layer in model.layers:
    if isinstance(layer, tf.keras.layers.BatchNormalization):
        # new_momentum is a placeholder: choose a value that suits your batch size.
        # Alternatively, renorm=True enables renormalization for smaller batch sizes.
        layer.momentum = new_momentum

Sources: https://github.com/tensorflow/tensorflow/issues/36065
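To see why momentum matters here, the following is a rough sketch in plain Python of the moving-average update that batch normalization applies at each training step (moving = moving * momentum + batch_stat * (1 - momentum)). The function name, the batch mean of 10.0, and the lower momentum of 0.9 are illustrative assumptions. The step count of 95 comes from the log in the question (19 batches per epoch for 5 epochs), and 0.99 is assumed to be the default momentum.

```python
def moving_average_after(steps, momentum, batch_stat, initial):
    """Apply the batch-norm moving-average update `steps` times."""
    moving = initial
    for _ in range(steps):
        moving = moving * momentum + batch_stat * (1.0 - momentum)
    return moving

steps = 19 * 5         # 19 batches per epoch for 5 epochs, as in the question's log
batch_mean = 10.0      # hypothetical batch mean, held constant for simplicity

results = {}
for momentum in (0.99, 0.9):
    # The moving mean starts at 0.0 (assumed initial value)
    results[momentum] = moving_average_after(steps, momentum, batch_mean, initial=0.0)
    print(f"momentum={momentum}: moving mean after {steps} steps = {results[momentum]:.3f}")
```

With momentum 0.99, roughly 38% of the initial value is still left in the moving average after 95 steps (0.99^95 is about 0.385), so evaluation runs on statistics that are well off the ones seen in training. With the lower momentum the moving mean has essentially caught up, at the cost of noisier estimates.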
