Added comments

This commit is contained in:
Zhengyi Chen 2024-01-30 17:06:29 +00:00
parent 38a16e75fe
commit bcff06f9c2


@@ -26,6 +26,21 @@ class PerspectiveEstimator(nn.Module):
Input: Pre-processed, uniformly-sized image data
Output: Perspective factor
**Note**
--------
The loss input must be computed from the output of the **entire**
reverse-perspective network. It therefore needs to compute:
- The effective pixel extent of each row after the transformation.
- The feature density (count) along each row, summed over the columns.
The loss is the variance over the row feature densities; see paper Sec. 3.2.
After all, it is reasonable to say that you see more when you look at
faraway places.
This implies that **we need a reasonably good feature extractor trained on
general images before training this submodule**. Hence, for now, we should
probably work on the transformer first.
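The variance-over-row-densities loss described above can be sketched as follows. This is a minimal illustration, not the repo's implementation: the function name, the use of absolute activations as "density", and the per-image normalization are all assumptions; only the shape convention (N, C, H, W) and the variance-over-rows idea come from the note.

```python
import torch

def row_density_variance_loss(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Sketch of the loss from Sec. 3.2: variance of per-row feature density.

    features: (N, C, H, W) feature maps taken AFTER the reverse-perspective
    transform, so that an unwarped image should have roughly uniform density
    across rows.
    """
    # Feature density per row: sum activation magnitude over channels and columns.
    density = features.abs().sum(dim=(1, 3))  # shape (N, H)
    # Normalize per image so the loss does not depend on overall activation scale.
    density = density / (density.sum(dim=1, keepdim=True) + eps)
    # Variance over rows, averaged over the batch; zero iff rows are uniform.
    return density.var(dim=1).mean()
```

A perfectly unwarped input yields identical row densities and hence zero loss, which is what makes this usable as a training signal for the perspective factor.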
:param input_shape: (N, C, H, W)
:param conv_kernel_shape: Oriented as (H, W)
:param conv_dilation: equidistant dilation factor along H, W
@@ -96,3 +111,5 @@ class PerspectiveEstimator(nn.Module):
out = torch.exp(-out) + self.epsilon
return out