Rui Huang, School of Computer Science and Technology, Tianjin Polytechnic University, Tianjin 300387, China


CRN network, Mask R-CNN, Mask Scoring R-CNN, Photographic images


Correctly detecting blurred objects has long been a difficult problem in image detection and recognition. Most existing methods identify blurred objects by designing new algorithms and training them extensively. In this paper, we propose a different approach: we divide the image of a blurred object along the object's contour to produce a semantic layout, assign distinct labels to occluded objects, and then generate a photographic image from the layout using a cascaded refinement network (CRN). A classification network is then used to detect the most likely objects. Unlike traditional image synthesis methods, ours does not rely on adversarial training. Instead, it uses a single feedforward network, trained end-to-end with a direct regression objective, as a rendering engine that takes the 2D semantic specification of the scene and generates the corresponding photographic image. Experimental results on the Cityscapes dataset show that our method generates images of a quality comparable to state-of-the-art methods while also addressing the problem of object awareness.
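As a rough illustration of the "2D semantic specification" mentioned above, the sketch below converts an integer label map into a one-hot semantic layout, the kind of input a layout-to-image network typically consumes. It is not the authors' code: the function name `labels_to_layout`, the three toy classes, and the channel-first layout are assumptions made for this example only.

```python
import numpy as np

def labels_to_layout(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert an (H, W) integer label map into a (num_classes, H, W) one-hot layout."""
    if label_map.min() < 0 or label_map.max() >= num_classes:
        raise ValueError("label out of range")
    # Compare every class index against the label map via broadcasting.
    classes = np.arange(num_classes).reshape(-1, 1, 1)
    return (label_map[None, :, :] == classes).astype(np.float32)

# Toy 2x3 label map with hypothetical labels:
# 0 = background, 1 = blurred object, 2 = occluded object.
toy = np.array([[0, 1, 1],
                [0, 2, 1]])
layout = labels_to_layout(toy, num_classes=3)
```

Each pixel activates exactly one channel, so giving an occluded object its own label produces a separate channel that a synthesis network could treat differently from the rest of the scene.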


