Stereo vision is a fundamental task from which other information and multimodal knowledge can be derived. It therefore has broad practical use in real-world applications such as robotics, self-driving cars, and most tasks that rely on depth perception as prior knowledge. Unsurprisingly, the topic has attracted researchers for several decades. With modern supervised deep learning, the high complexity of the problem is now better matched by high-capacity networks. The bottleneck has shifted to data: these supervised deep nets need large, labeled datasets to fit their capacity, and both acquiring that data and reducing the need for it remain open challenges. In summary, we are at, or close to, the point where depth perception can be deployed confidently in practice. However, a deployed model still needs a large amount of training data to learn the mapping function that transforms left/right image pairs into their disparity maps, as shown in the figure above.

