probnmn.modules.nmn_modules
Collection of PyTorch modules used by our Neural Module Network.
Adapted from: davidmascharka/tbd-nets.
- class probnmn.modules.nmn_modules.AndModule
Bases: torch.nn.modules.module.Module
A neural module that (basically) performs a logical and.
An AndModule is a neural module that takes two input attention masks and (basically) performs a set intersection. This would be used in a question like “What color is the cube to the left of the sphere and right of the yellow cylinder?” After localizing the regions left of the sphere and right of the yellow cylinder, an AndModule would be used to find the intersection of the two. Its output would then go into an AttentionModule that finds cubes.
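As a rough illustration of the intersection described above, the sketch below combines two attention masks with an elementwise minimum. The choice of minimum is an assumption (it is one common soft intersection), and `soft_and` is a hypothetical helper, not part of probnmn.

```python
import torch


def soft_and(attn1: torch.Tensor, attn2: torch.Tensor) -> torch.Tensor:
    # Assumption: an elementwise minimum approximates set intersection,
    # since a location stays bright only if it is bright in both masks.
    return torch.min(attn1, attn2)


# Toy 2x2 attention masks with values in [0, 1].
left_of_sphere = torch.tensor([[1.0, 0.8], [0.1, 0.0]])
right_of_cylinder = torch.tensor([[0.9, 0.2], [0.7, 0.0]])
both = soft_and(left_of_sphere, right_of_cylinder)
```

Here `both` keeps a high value only at the top-left location, the one place where both input masks are strongly active.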
- class probnmn.modules.nmn_modules.OrModule
Bases: torch.nn.modules.module.Module
A neural module that (basically) performs a logical or.
An OrModule is a neural module that takes two input attention masks and (basically) performs a set union. This would be used in a question like “How many cubes are left of the brown sphere or right of the cylinder?” After localizing the regions left of the brown sphere and right of the cylinder, an OrModule would be used to find the union of the two. Its output would then go into an AttentionModule that finds cubes.
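The union described above can be sketched with an elementwise maximum over the two masks. The use of a maximum is an assumption (one common soft union), and `soft_or` is a hypothetical helper, not part of probnmn.

```python
import torch


def soft_or(attn1: torch.Tensor, attn2: torch.Tensor) -> torch.Tensor:
    # Assumption: an elementwise maximum approximates set union,
    # since a location stays bright if it is bright in either mask.
    return torch.max(attn1, attn2)


# Toy 2x2 attention masks with values in [0, 1].
left_of_brown_sphere = torch.tensor([[1.0, 0.8], [0.1, 0.0]])
right_of_cylinder = torch.tensor([[0.9, 0.2], [0.7, 0.0]])
either = soft_or(left_of_brown_sphere, right_of_cylinder)
```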
- class probnmn.modules.nmn_modules.AttentionModule(dim: int)
Bases: torch.nn.modules.module.Module
A neural module that takes a feature map and attention, attends to the features, and produces an attention.
An AttentionModule takes input features and an attention and produces an attention. It multiplicatively combines its input feature map and attention to attend to the relevant region of the feature map. It then processes the attended features via a series of convolutions and produces an attention mask highlighting the objects that possess the attribute the module is looking for.
For example, an AttentionModule may be tasked with finding cubes. Given an input attention of all ones, it will highlight all the cubes in the provided input features. Given an attention mask highlighting all the red objects, it will produce an attention mask highlighting all the red cubes.
- Parameters
- dim: int
The number of channels of each convolutional filter.
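The attend-then-convolve flow described for this module can be sketched as follows. `AttentionSketch` is a hypothetical stand-in, and the layer count, kernel sizes, and sigmoid output are assumptions chosen for illustration rather than the probnmn implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionSketch(nn.Module):
    """Hypothetical stand-in; not the probnmn implementation."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumed layout: two 3x3 convolutions, then a 1x1 projection
        # down to a single-channel attention mask.
        self.conv1 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        # Broadcast the (N, 1, H, W) attention over all feature channels.
        attended = feats * attn
        out = F.relu(self.conv1(attended))
        out = F.relu(self.conv2(out))
        # Sigmoid keeps the output mask in [0, 1].
        return torch.sigmoid(self.conv3(out))


feats = torch.randn(2, 8, 14, 14)       # (batch, dim, height, width)
all_ones = torch.ones(2, 1, 14, 14)     # "attend everywhere" input
mask = AttentionSketch(dim=8)(feats, all_ones)
```

Passing an all-ones attention leaves the features unchanged, which matches the "find all cubes" case above; passing another module's output restricts the search to that region.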
- class probnmn.modules.nmn_modules.QueryModule(dim: int)
Bases: torch.nn.modules.module.Module
A neural module that takes as input a feature map and an attention and produces a feature map as output.
A QueryModule takes a feature map and an attention mask as input. It attends to the feature map via an elementwise multiplication with the attention mask, then processes this attended feature map via a series of convolutions to extract relevant information.
For example, a QueryModule tasked with determining the color of objects would output a feature map encoding what color the attended object is. A module intended to count would output a feature map encoding the number of attended objects in the scene.
- Parameters
- dim: int
The number of channels of each convolutional filter.
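Unlike the attention-producing modules, this one returns a feature map with the same number of channels as its input. A minimal sketch, assuming two 3x3 convolutions (`QuerySketch` and its layer layout are hypothetical, not the probnmn implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuerySketch(nn.Module):
    """Hypothetical stand-in; not the probnmn implementation."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumed layout: two 3x3 convolutions that preserve channel count.
        self.conv1 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(dim, dim, kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        # Elementwise multiplication; attn broadcasts over channels.
        attended = feats * attn
        out = F.relu(self.conv1(attended))
        return F.relu(self.conv2(out))


feats = torch.randn(2, 8, 14, 14)
attn = torch.rand(2, 1, 14, 14)         # soft mask in [0, 1)
queried = QuerySketch(dim=8)(feats, attn)
```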
- class probnmn.modules.nmn_modules.RelateModule(dim: int)
Bases: torch.nn.modules.module.Module
A neural module that takes as input a feature map and an attention and produces an attention as output.
A RelateModule takes input features and an attention and produces an attention. It multiplicatively combines the attention and the features to attend to a relevant region, then uses a series of dilated convolutional filters to indicate a spatial relationship to the input attended region.
- Parameters
- dim: int
The number of channels of each convolutional filter.
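Dilated convolutions widen the receptive field so that a small attended region can influence distant locations, which is what a relation such as "left of" needs. The sketch below assumes dilations of 1, 2, and 4; `RelateSketch`, the dilation schedule, and the final 1x1 projection are hypothetical choices, not the probnmn implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelateSketch(nn.Module):
    """Hypothetical stand-in; not the probnmn implementation."""

    def __init__(self, dim: int, dilations=(1, 2, 4)):
        super().__init__()
        # For a 3x3 kernel, padding equal to the dilation keeps H and W fixed.
        self.convs = nn.ModuleList(
            [nn.Conv2d(dim, dim, kernel_size=3, padding=d, dilation=d)
             for d in dilations]
        )
        self.to_mask = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        out = feats * attn              # attend to the reference region
        for conv in self.convs:
            out = F.relu(conv(out))     # spread information outward
        return torch.sigmoid(self.to_mask(out))


feats = torch.randn(2, 8, 14, 14)
attn = torch.rand(2, 1, 14, 14)
related = RelateSketch(dim=8)(feats, attn)
```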
- class probnmn.modules.nmn_modules.SameModule(dim: int)
Bases: torch.nn.modules.module.Module
A neural module that takes as input a feature map and an attention and produces an attention as output.
A SameModule takes input features and an attention and produces an attention. It determines the index of the maximally-attended object, extracts the feature vector at that spatial location, then performs a cross-correlation at each spatial location to determine which other regions have this same property. This correlated feature map then goes through a convolutional block whose output is an attention mask.
As an example, this module can be used with the CLEVR dataset to perform the same_shape operation, which will highlight every region of an image that shares the same shape as an object of interest (excluding the original object).
- Parameters
- dim: int
The number of channels in the input feature map.
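The steps above (find the peak of the attention, pull out the feature vector there, correlate it with every location) can be sketched as below. `SameSketch` is hypothetical: the per-channel product used as the correlation, the concatenation of the input attention, and the single 1x1 convolution are all assumptions for illustration, not the probnmn implementation.

```python
import torch
import torch.nn as nn


class SameSketch(nn.Module):
    """Hypothetical stand-in; not the probnmn implementation."""

    def __init__(self, dim: int):
        super().__init__()
        # dim correlated channels plus one channel for the input attention.
        self.conv = nn.Conv2d(dim + 1, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        n, c, h, w = feats.shape
        # Flat index of the maximally attended location, per sample.
        idx = attn.reshape(n, -1).argmax(dim=1)               # (N,)
        # Gather the feature vector at that location.
        flat = feats.reshape(n, c, h * w)
        vec = flat.gather(2, idx.view(n, 1, 1).expand(n, c, 1))  # (N, C, 1)
        # Correlate that vector with the features at every location.
        correlated = feats * vec.view(n, c, 1, 1)
        # Assumption: appending the input attention lets the convolution
        # learn to suppress the original object.
        x = torch.cat([correlated, attn], dim=1)
        return torch.sigmoid(self.conv(x))


feats = torch.randn(2, 8, 14, 14)
attn = torch.rand(2, 1, 14, 14)
same = SameSketch(dim=8)(feats, attn)
```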
- class probnmn.modules.nmn_modules.ComparisonModule(dim: int)
Bases: torch.nn.modules.module.Module
A neural module that takes as input two feature maps and produces a feature map as output.
A ComparisonModule takes two feature maps as input and concatenates them. It then processes the concatenated features and produces a feature map encoding whether the two input feature maps encode the same property.
This block is useful in making integer comparisons, for example to answer the question, “Are there more red things than small spheres?” It can also be used to determine whether some relationship holds between two objects (e.g. they are the same shape, size, color, or material).
- Parameters
- dim: int
The number of channels of each convolutional filter.
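The concatenate-then-process pattern can be sketched as follows. `ComparisonSketch` is hypothetical: the 1x1 projection from `2 * dim` back to `dim` channels and the single 3x3 convolution after it are assumptions, not the probnmn implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ComparisonSketch(nn.Module):
    """Hypothetical stand-in; not the probnmn implementation."""

    def __init__(self, dim: int):
        super().__init__()
        # Concatenation doubles the channel count; project it back to dim.
        self.projection = nn.Conv2d(2 * dim, dim, kernel_size=1)
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)

    def forward(self, feats1: torch.Tensor, feats2: torch.Tensor) -> torch.Tensor:
        out = torch.cat([feats1, feats2], dim=1)   # (N, 2 * dim, H, W)
        out = F.relu(self.projection(out))
        return F.relu(self.conv(out))


# Two feature maps, e.g. the outputs of two query-style modules.
count_red = torch.randn(2, 8, 14, 14)
count_small_spheres = torch.randn(2, 8, 14, 14)
compared = ComparisonSketch(dim=8)(count_red, count_small_spheres)
```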