How to train your ProbNMN?
Training a ProbNMN is done in three phases (plus an extra preprocessing-ish phase). This codebase supports training with our proposed objective for ProbNMN, as well as with the baseline objective of Johnson et al. (CVPR 2017).
Training is governed by a YAML config file, which has a PHASE field holding the name of the training phase. Phase names and their corresponding trainers are:
program_prior: ProgramPriorTrainer
question_coding: QuestionCodingTrainer
module_training: ModuleTrainingTrainer
joint_training: JointTrainingTrainer
Note
Execute all the commands from $PROJECT_ROOT to use the config files that reproduce the results in the paper. Configuration is managed through YAML files, with a central package-wide configuration management system. Read more at Config.
By default, all training phases serialize checkpoints every few iterations and write tensorboard logs every iteration to the directory provided through --serialization-dir. Use tensorboard --logdir $SERIALIZATION_DIR to view training curves, validation metrics, etc. directly on tensorboard.
The subset of question-program paired training data is selected randomly, so the quality of the supervision dataset depends on the random seed. Some seeds select a poor subset, and training might be slow. We recommend running question_coding and joint_training with at least 5-7 different random seeds, using --config-override RANDOM_SEED $NUM. Using the same random seed ensures that the same paired data is selected across different runs and different machines.
If time or resources are limited, we recommend random seed 700 for decent results.
PHASE: program_prior
Train a ProgramPrior using all the programs from the CLEVR v1.0 training split. Alternatively, it can be trained using programs simulated from the program syntax.
python scripts/train.py \
--config-yml configs/program_prior.yml \
--phase program_prior \
--gpu-ids 0 \
--serialization-dir checkpoints/program_prior
This step does not apply to the baseline objective.
PHASE: question_coding
Learn a latent “code” for questions, given some program-question pairs and a large number of questions without paired programs. Choose the appropriate config file according to the training objective (baseline or ours).
python scripts/train.py \
--config-yml configs/question_coding_ours.yml \
--phase question_coding \
--config-override CHECKPOINTS.PROGRAM_PRIOR checkpoints/program_prior/program_prior_best.pth \
--gpu-ids 0 \
--serialization-dir checkpoints/question_coding_ours
The parameters of ProgramPrior are frozen.
PHASE: module_training
Train a neural module network with (image, question, answer) tuples, where the ProgramGenerator trained in the question_coding phase (kept frozen) infers programs from questions.
python scripts/train.py \
--config-yml configs/module_training.yml \
--phase module_training \
--config-override CHECKPOINTS.QUESTION_CODING checkpoints/question_coding_ours/question_coding_best.pth \
--gpu-ids 0 1 2 3 \
--serialization-dir checkpoints/module_training
Multi-GPU execution is supported here. This phase is the same for both training objectives.
PHASE: joint_training
python scripts/train.py \
--config-yml configs/joint_training_ours.yml \
--phase joint_training \
--config-override CHECKPOINTS.PROGRAM_PRIOR checkpoints/program_prior/program_prior_best.pth \
CHECKPOINTS.QUESTION_CODING checkpoints/question_coding_ours/question_coding_best.pth \
CHECKPOINTS.MODULE_TRAINING checkpoints/module_training/module_training_best.pth \
--gpu-ids 0 1 2 3 \
--serialization-dir checkpoints/joint_training_ours
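The joint_training command above loads the best checkpoints of all three earlier phases, so a missing file may only surface once the run has started. A minimal pre-flight sketch, assuming the checkpoint paths shown above; check_checkpoints is a hypothetical helper and not part of the codebase:

```shell
#!/bin/sh
# Pre-flight sketch: check that checkpoint files exist before joint_training.
# check_checkpoints is a hypothetical helper, not part of the codebase.

# Return 0 if every given path is a regular file, 1 otherwise.
check_checkpoints() {
    missing=0
    for ckpt in "$@"; do
        if [ ! -f "$ckpt" ]; then
            echo "missing checkpoint: $ckpt" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_checkpoints \
    checkpoints/program_prior/program_prior_best.pth \
    checkpoints/question_coding_ours/question_coding_best.pth \
    checkpoints/module_training/module_training_best.pth
then
    echo "all checkpoints found, joint_training can start"
else
    echo "train the earlier phases first"
fi
```

Run it from $PROJECT_ROOT so that the relative checkpoint paths resolve the same way they do for scripts/train.py.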