Added comments
This commit is contained in:
parent
38a16e75fe
commit
bcff06f9c2
1 changed file with 17 additions and 0 deletions
@@ -26,6 +26,21 @@ class PerspectiveEstimator(nn.Module):
     Input: Pre-processed, uniformly-sized image data
     Output: Perspective factor
 
+    **Note**
+    --------
+    Loss input needs to be computed from beyond the **entire** rev-perspective
+    network. It therefore needs to compute:
+    - the effective pixel of each row after the transformation;
+    - the feature density (count) along each row, summed over columns.
+
+    The loss is computed as the variance over the row feature densities; see
+    Section 3.2 of the paper. After all, it is reasonable to say that you see
+    more when you look at faraway places.
+
+    This does imply that **we need to obtain a reasonably good feature
+    extractor from general images before training this submodule**. Hence,
+    for now, we should probably work on the transformer first.
+
     :param input_shape: (N, C, H, W)
     :param conv_kernel_shape: Oriented as (H, W)
     :param conv_dilation: equidistant dilation factor along H and W
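The loss described in the note above can be sketched as follows. This is a hypothetical illustration, not the module's actual implementation: feature density is counted per row (summed over columns), and the loss is the variance of those row densities. Plain Python stands in for the tensor operations the real network would use.

```python
# Hypothetical sketch of the row-density variance loss from the note.
# In the real module this would operate on torch tensors end-to-end.

def row_density_variance(feature_map):
    """feature_map: H x W nested list of per-pixel feature counts."""
    # Feature density per row: sum counts over the column axis.
    densities = [sum(row) for row in feature_map]
    # Loss is the (population) variance over the row densities.
    mean = sum(densities) / len(densities)
    return sum((d - mean) ** 2 for d in densities) / len(densities)

# A map whose rows are equally dense incurs zero loss:
flat = [[1, 1, 1], [1, 1, 1]]
print(row_density_variance(flat))  # → 0.0
```

Intuitively, a correct reverse-perspective transform spreads features evenly across rows, so minimizing this variance pushes the estimator toward the right perspective factor.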
@@ -96,3 +111,5 @@ class PerspectiveEstimator(nn.Module):
     out = torch.exp(-out) + self.epsilon
 
     return out
+
+
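The context line `out = torch.exp(-out) + self.epsilon` maps the network's unbounded real-valued score to a strictly positive perspective factor. A minimal scalar sketch of that mapping, with a placeholder epsilon value (the module's actual `self.epsilon` is not shown in this diff):

```python
import math

EPSILON = 1e-6  # placeholder; the module's actual self.epsilon may differ

def to_perspective_factor(score):
    # exp(-score) is always > 0; adding epsilon keeps the factor
    # bounded away from zero even for very large scores.
    return math.exp(-score) + EPSILON

print(to_perspective_factor(0.0))  # close to 1.0
```

Keeping the factor strictly positive avoids degenerate transforms and division-by-zero downstream.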