Added comments
parent 38a16e75fe
commit bcff06f9c2
1 changed file with 17 additions and 0 deletions
@@ -26,6 +26,21 @@ class PerspectiveEstimator(nn.Module):
Input: Pre-processed, uniformly-sized image data
Output: Perspective factor

**Note**
--------
The loss input needs to be computed from the output of the **entire**
rev-perspective network. It therefore needs to compute:

- the effective pixel count of each row after the transformation;
- the feature density (count) along each row, summed over the columns.

The loss is computed as the variance over the row feature densities
(Ref. Sec. 3.2 of the paper; see the sketch after this hunk). After all,
it is reasonable to say that you see more when you look at faraway places.

This does imply that **we need to obtain a reasonably good feature
extractor from general images before training this submodule**. Hence,
for now, we should probably work on the transformer first.

:param input_shape: (N, C, H, W)
:param conv_kernel_shape: Oriented as (H, W)
:param conv_dilation: equidistant dilation factor along H and W
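Below is a minimal sketch of that variance-over-row-densities loss, assuming a PyTorch implementation. The function name `perspective_variance_loss`, the `row_pixel_counts` input, and the normalization by effective pixel count are assumptions drawn from the note above, not the repo's actual code:

```python
import torch

def perspective_variance_loss(feature_map: torch.Tensor,
                              row_pixel_counts: torch.Tensor,
                              eps: float = 1e-8) -> torch.Tensor:
    """Variance of per-row feature densities (cf. Sec. 3.2 of the paper).

    feature_map:      (N, C, H, W) activations taken after the full
                      rev-perspective network (hypothetical input).
    row_pixel_counts: (N, H) effective pixel count of each row after
                      the transformation (hypothetical input).
    """
    # Feature count per row: sum over channels and columns -> (N, H).
    row_counts = feature_map.sum(dim=(1, 3))
    # Density: features per effective pixel in each row.
    density = row_counts / (row_pixel_counts + eps)
    # Variance across rows, averaged over the batch.
    return density.var(dim=1).mean()
```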
@@ -96,3 +111,5 @@ class PerspectiveEstimator(nn.Module):
# Map to a strictly positive perspective factor; epsilon bounds it
# away from zero.
out = torch.exp(-out) + self.epsilon

return out
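For context, a hypothetical instantiation and forward pass, assuming the constructor takes the three documented parameters and that `forward` returns the perspective factor; the shapes and values are illustrative only:

```python
import torch

# Hypothetical usage; argument names follow the :param docs above.
model = PerspectiveEstimator(
    input_shape=(8, 3, 256, 256),  # (N, C, H, W)
    conv_kernel_shape=(3, 3),      # oriented as (H, W)
    conv_dilation=2,               # same dilation along H and W
)
factor = model(torch.rand(8, 3, 256, 256))
# exp(-out) + epsilon guarantees factor > epsilon, i.e. strictly positive.
```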