simnets.ops package

Submodules

simnets.ops.mex module

simnets.ops.mex._expand_dim_specification(image_shape, dim_spec)

Expand mex dimension specification.

The dimension specification can have length 2 or 3. It is processed in two steps:

  1. If it has length 2, a -1 is prepended to it.
  2. Each dimension equal to -1 is replaced with the whole corresponding image dimension.

Parameters:
  • image_shape – list(int). The shape of the input image, of length 3 (without batch) or 4 (with batch).
  • dim_spec – list(int). The specification to be expanded.
Returns:

The expanded dimension specification
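The two steps can be illustrated with a minimal sketch. The function name `expand_dim_spec` is hypothetical and this is not the library's implementation; it assumes that a length-4 shape carries the batch as its leading dimension and that the remaining three dimensions line up with the specification.

```python
def expand_dim_spec(image_shape, dim_spec):
    """Hypothetical sketch of the two expansion steps described above."""
    image_shape = list(image_shape)
    # Assumption: a length-4 shape is [batch, d0, d1, d2]; drop the batch.
    if len(image_shape) == 4:
        image_shape = image_shape[1:]
    if len(image_shape) != 3:
        raise ValueError("image_shape must have length 3 or 4")

    dim_spec = list(dim_spec)
    # Step 1: a length-2 specification gets a -1 prepended.
    if len(dim_spec) == 2:
        dim_spec = [-1] + dim_spec
    if len(dim_spec) != 3:
        raise ValueError("dim_spec must have length 2 or 3")

    # Step 2: every -1 is replaced by the whole corresponding image dimension.
    return [size if d == -1 else d for d, size in zip(dim_spec, image_shape)]


print(expand_dim_spec([8, 3, 32, 32], [2, -1]))  # [3, 2, 32]
```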

simnets.ops.mex.mex(input, offsets, num_instances, softmax_mode=None, padding=None, strides=None, blocks=None, epsilon=None, blocks_out_of_bounds_value=None, blocks_round_down=None, use_unshared_regions=None, shared_offset_region=None, unshared_offset_region=None, name=None)

Computes the MEX layer given 4-D input and 5-D offsets tensors.

As defined in https://arxiv.org/abs/1506.03059

Given an input tensor of shape [batch, in_channels, in_height, in_width] and an offsets tensor of shape [num_regions, num_instances, filter_channels, filter_height, filter_width], where num_regions is calculated from the output dimensions and the shared/unshared offsets parameters, this op performs the following:

Extract virtual patches of size blocks from the input tensor, according to the padding, strides and blocks parameters. This results in a 3D grid of patches indexed by c, i, j. For each output element, select the corresponding patch and offsets region, then calculate:

\[\frac{1}{\epsilon} \log\left(\frac{1}{n} \sum\exp(\epsilon (patch + region))\right)\]
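For a single output element, the expression can be evaluated with a short pure-Python sketch. The function `mex_value` is hypothetical and works on flattened lists rather than tensors. Treating epsilon = +inf and -inf as max and min is the mathematical limit of the expression, not a statement about how the op is implemented. A finite, non-zero epsilon is assumed otherwise.

```python
import math


def mex_value(patch, region, epsilon=1.0, softmax_mode=False):
    """Hypothetical sketch: MEX of one flattened patch and offsets region."""
    terms = [p + r for p, r in zip(patch, region)]
    # Limits of the expression as epsilon goes to +inf / -inf.
    if epsilon == math.inf:
        return max(terms)
    if epsilon == -math.inf:
        return min(terms)

    # Shift by the largest exponent before exponentiating to avoid overflow.
    scaled = [epsilon * t for t in terms]
    shift = max(scaled)
    total = sum(math.exp(s - shift) for s in scaled)
    if not softmax_mode:
        total /= len(terms)  # the 1/n factor inside the log
    return (shift + math.log(total)) / epsilon


print(mex_value([1.0, 1.0], [0.0, 0.0]))               # 1.0
print(mex_value([0.0, 5.0], [0.0, 0.0], epsilon=math.inf))  # 5.0
```

With `softmax_mode=True` the sum is not divided by n, so the first call would return 1 + log(2) instead.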

The different parameters change the behaviour as described below.

Parameters:
  • input – A Tensor. Must be one of the following types: float32, float64. A 4-D tensor with dimensions [batch, in_channels, in_height, in_width].
  • offsets – A Tensor. Must have the same type as input. A 5-D tensor of shape [num_regions, num_instances, filter_channels, filter_height, filter_width]. Must be non-negative.
  • num_instances – An int. The number of instances of the layer.
  • softmax_mode – An optional bool. Defaults to False. In softmax mode, the sum inside the log is not divided by the patch size.
  • padding – An optional list of ints of length 3. Defaults to [0, 0, 0]. The padding to use for the dimensions of input.
  • strides – An optional list of ints of length 3. Defaults to [1, 1, 1]. The stride of the sliding window for the dimensions of input.
  • blocks – An optional list of ints of length 3. Defaults to [1, 1, 1]. The 3D dimensions of the blocks.
  • epsilon – An optional float. Defaults to 1. The epsilon parameter. Can be +inf or -inf.
  • blocks_out_of_bounds_value – An optional float. Defaults to 0. The value to use for out-of-bounds elements.
  • blocks_round_down

    An optional bool. Defaults to True. Controls the calculation of the output size. With round_down it is:

    (image_size + 2 * pad_size - patch_size) / stride + 1
    

    Without it, it is:

    static_cast<int>(
       std::ceil(static_cast<float>(
           image_size + 2 * pad_size - patch_size) / stride)) + 1
    
  • use_unshared_regions – An optional bool. Defaults to True. Selects between defining a shared region and defining an unshared region.
  • shared_offset_region – An optional list of ints. Defaults to [-1]. The region in which offsets are shared. A value of -1 is replaced by the entire respective dimension. Can be a list of length 3 or 1. If it has length 1, [d], it is expanded to [-1, d, d].
  • unshared_offset_region – An optional list of ints. Defaults to [-1]. The region in which offsets are unshared. A value of -1 is replaced by the entire respective dimension. Can be a list of length 3 or 1. If it has length 1, [d], it is expanded to [-1, d, d].
  • name – A name for the operation (optional).
Returns:

A Tensor. Has the same type as input. A 4-D tensor of shape [batch, out_channels, out_height, out_width]
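The two output-size rules given for blocks_round_down can be compared with a small sketch. The helper `output_size` is hypothetical. It assumes a non-negative value of image_size + 2 * pad_size - patch_size, so that Python's floor division matches the integer division of the round-down rule.

```python
import math


def output_size(image_size, pad_size, patch_size, stride, round_down=True):
    """Hypothetical sketch of the output size along one dimension."""
    span = image_size + 2 * pad_size - patch_size
    if round_down:
        # Integer division: a trailing partial window is dropped.
        return span // stride + 1
    # Ceiling: a trailing partial window still yields an output element.
    return math.ceil(span / stride) + 1


print(output_size(7, 0, 2, 2, round_down=True))   # 3
print(output_size(7, 0, 2, 2, round_down=False))  # 4
```

The two rules agree whenever the stride divides the span exactly and differ by one otherwise.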

simnets.ops.similarity module

simnets.ops.similarity.similarity(input, templates, weights, similarity_function=None, blocks=None, strides=None, padding=None, normalization_term=None, normalization_term_fudge=None, ignore_nan_input=None, out_of_bounds_value=None, name=None)

Computes a similarity measure given 4-D input, templates and weights tensors.

As defined in https://arxiv.org/abs/1506.03059

Given an input tensor of shape [batch, in_channels, in_height, in_width] and templates and weights tensors, each of shape [out_channels, in_channels, filter_height, filter_width], this op performs the following:

  1. Extract virtual patches of size blocks from the input tensor, according to the padding, strides and blocks parameters. The block size in the channels dimension is always the number of input channels. This results in a 2D grid of patches indexed by i, j.
  2. For the simplest version, for output element e = [b, c, i, j], compute output[b, c, i, j] = sum(weights[c] * phi(templates[c], patches[i, j])), where \(\phi\) is either -|a - b|_1 (l1) or -|a - b|_2 (l2).

Let \(I\) be the input image, \(T\) the templates, \(W\) the weights, \(O\) the output, \(p\) the padding and \(s\) the strides. Then the output element at [b, c, i, j] is:

\[O[b, c, i, j] = \sum_{dc, di, dj} W[c, dc, di, dj] \cdot \phi(I[b, dc, s[0] \cdot i + di - p[0], s[1] \cdot j + dj - p[1]], T[c, dc, di, dj])\]
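The sum can be sketched in plain Python for one batch element. It follows step 2 above in multiplying each term by the corresponding weight. The names `phi` and `similarity_at` are hypothetical, and nested lists stand in for tensors. Two assumptions are made here: the element-wise L2 case is taken as the negative squared difference, and positions that fall outside the image read `out_of_bounds_value`.

```python
def phi(a, b, similarity_function="L2"):
    """Element-wise similarity; the squared form for L2 is an assumption."""
    diff = a - b
    if similarity_function == "L1":
        return -abs(diff)
    return -(diff * diff)


def similarity_at(image, templates, weights, c, i, j,
                  strides=(2, 2), padding=(0, 0),
                  similarity_function="L2", out_of_bounds_value=0.0):
    """Hypothetical sketch of one output element.

    image is [in_channels][height][width]; templates and weights are
    [out_channels][in_channels][filter_height][filter_width].
    """
    height, width = len(image[0]), len(image[0][0])
    total = 0.0
    for dc, channel in enumerate(templates[c]):
        for di, row in enumerate(channel):
            for dj, template_value in enumerate(row):
                y = strides[0] * i + di - padding[0]
                x = strides[1] * j + dj - padding[1]
                inside = 0 <= y < height and 0 <= x < width
                value = image[dc][y][x] if inside else out_of_bounds_value
                total += weights[c][dc][di][dj] * phi(
                    value, template_value, similarity_function)
    return total


image = [[[1.0, 2.0], [3.0, 4.0]]]        # 1 channel, 2x2
templates = [[[[1.0, 1.0], [1.0, 1.0]]]]  # 1 output channel, 2x2 filter
weights = [[[[1.0, 1.0], [1.0, 1.0]]]]
print(similarity_at(image, templates, weights, 0, 0, 0))  # -14.0
print(similarity_at(image, templates, weights, 0, 0, 0,
                    similarity_function="L1"))            # -6.0
```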

The different parameters change the behaviour as described below.

Parameters:
  • input – A Tensor. Must be one of the following types: float32, float64. A 4-D tensor with dimensions [batch, in_channels, in_height, in_width].
  • templates – A Tensor. Must have the same type as input. A 4-D tensor of shape [out_channels, in_channels, filter_height, filter_width].
  • weights – A Tensor. Must have the same type as input. A 4-D tensor of shape [out_channels, in_channels, filter_height, filter_width]. Must be non-negative.
  • similarity_function – An optional string from: “L1”, “L2”. Defaults to “L2”.
  • blocks – An optional list of ints of length 2. Defaults to [3, 3]. The height and width of the blocks.
  • strides – An optional list of ints of length 2. Defaults to [2, 2]. The stride of the sliding window for the height and width dimensions of input.
  • padding – An optional list of ints of length 2. Defaults to [0, 0]. The padding to use for the height and width dimensions of input.
  • normalization_term – An optional bool. Defaults to False. If true, add a normalization term to the output, which makes the L2 version of this operator a proper (log) probability measure. The normalization term is -0.5 * K * log(2*pi), where K is the total block size, or the number of non-NaN elements in the block if ignore_nan_input is on.
  • normalization_term_fudge – An optional float. Defaults to 0.001. TODO
  • ignore_nan_input – An optional bool. Defaults to False. If true, and when using L2 with the normalization term, compute the probability while marginalizing over elements that are NaN.
  • out_of_bounds_value – An optional float. Defaults to 0. The value to use for elements outside the bounds.
  • name – A name for the operation (optional).
Returns:

A Tensor. Has the same type as input. A 4-D tensor of shape [batch, out_channels, out_height, out_width]
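The normalization term described for normalization_term depends only on the count K, which a short sketch makes concrete. The helper below is hypothetical, takes one flattened block as a list, and does not model normalization_term_fudge, whose role is not documented here.

```python
import math


def normalization_term(block, ignore_nan_input=False):
    """Hypothetical sketch of -0.5 * K * log(2*pi) for one flattened block."""
    if ignore_nan_input:
        # K counts only the elements that are not NaN.
        k = sum(1 for value in block if not math.isnan(value))
    else:
        # K is the total block size.
        k = len(block)
    return -0.5 * k * math.log(2 * math.pi)


block = [1.0, float("nan"), 2.0, 0.5]
print(normalization_term(block))                         # K = 4
print(normalization_term(block, ignore_nan_input=True))  # K = 3
```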