abstract: | Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this paper, we present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers that can meaningfully regularize the model's decision boundary during training. Specifically, VOS samples virtual outliers from the low-likelihood region of the class-conditional distribution estimated in the feature space. Alongside, we introduce a novel unknown-aware training objective, which contrastively shapes the uncertainty space between the ID data and synthesized outlier data. VOS achieves competitive performance on both object detection and image classification models, reducing the FPR95 by up to 9.36% compared to the previous best method on object detectors. Code is available at https://github.com/deeplearning-wisc/vos.
author:
- |
Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li
Department of Computer Sciences
University of Wisconsin - Madison
{xfdu,mucai,sharonli}@cs.wisc.edu
bibliography:
- citations.bib
title: |
VOS: Learning What You Don't Know by
Virtual Outlier Synthesis
Introduction
Modern deep neural networks have achieved unprecedented success in known contexts for which they are trained, yet they often struggle to handle the unknowns. In particular, neural networks have been shown to produce high posterior probability for out-of-distribution (OOD) test inputs [@nguyen2015deep], which arise from unknown categories and should not be predicted by the model. Taking a self-driving car as an example, an object detection model trained to recognize in-distribution objects (e.g., cars, stop signs) can produce a high-confidence prediction for an unseen object such as a moose; see Figure 1{reference-type="ref" reference="fig::GMM_toy"}(a). Such a failure case raises concerns about model reliability, and worse, may lead to catastrophe when deployed in safety-critical applications.
The vulnerability to OOD inputs arises from the lack of explicit knowledge of unknowns during training. In particular, neural networks are typically optimized only on the in-distribution (ID) data. The resulting decision boundary, despite being useful for ID tasks such as classification, can be ill-suited for OOD detection. We illustrate this in Figure 1{reference-type="ref" reference="fig::GMM_toy"}. The ID data (gray) consists of three class-conditional Gaussians, on which a three-way softmax classifier is trained. The resulting classifier is overconfident in regions far away from the ID data (see the red shade in Figure 1{reference-type="ref" reference="fig::GMM_toy"}(b)), causing trouble for OOD detection. Ideally, a model should learn a more compact decision boundary that produces low uncertainty for the ID data, and high OOD uncertainty elsewhere (e.g., Figure 1{reference-type="ref" reference="fig::GMM_toy"}(c)). However, achieving this goal is non-trivial due to the lack of supervision signals from unknowns. This motivates the question: Can we synthesize virtual outliers for effective model regularization?
In this paper, we propose a novel unknown-aware learning framework dubbed VOS (Virtual Outlier Synthesis), which optimizes the dual objectives of both ID task and OOD detection performance. In a nutshell, VOS consists of three components tackling challenges of outlier synthesis and effective model regularization with synthesized outliers. To synthesize the outliers, we estimate the class-conditional distribution in the feature space, and sample outliers from the low-likelihood region of ID classes (Section 3.1{reference-type="ref" reference="sec:gda"}). Key to our method, we show that sampling in the feature space is more tractable than synthesizing images in the high-dimensional pixel space [@lee2018training]. Alongside, we propose a novel unknown-aware training objective, which contrastively shapes the uncertainty surface between the ID data and synthesized outliers (Section 3.2{reference-type="ref" reference="sec:training"}). During training, VOS simultaneously performs the ID task (e.g., classification or object detection) as well as the OOD uncertainty regularization. During inference time, the uncertainty estimation branch produces a larger probabilistic score for ID data and vice versa, which enables effective OOD detection (Section 3.3{reference-type="ref" reference="sec:inference"}).
{#fig::GMM_toy width="0.95\linewidth" height="0.35\linewidth"}[[fig:teaser]]{#fig:teaser label="fig:teaser"}
VOS offers several compelling advantages compared to existing solutions. (1) VOS is a general learning framework that is effective for both object detection and image classification tasks, whereas previous methods were primarily driven by image classification. Image-level detection can be limiting as an image could be OOD in certain regions while being in-distribution elsewhere. Our work bridges a critical research gap since OOD detection for object detection is timely yet underexplored in literature. (2) VOS enables adaptive outlier synthesis, which can be flexibly and conveniently used for any ID data without manual data collection or cleaning. In contrast, previous methods using outlier exposure [@hendrycks2018deep] require an auxiliary image dataset that is sufficiently diverse, which can be arguably prohibitive to obtain. Moreover, one needs to perform careful data cleaning to ensure the auxiliary outlier dataset does not overlap with ID data. (3) VOS synthesizes outliers that can estimate a compact decision boundary between ID and OOD data. In contrast, existing solutions use outliers that are either too trivial to regularize the OOD estimator, or too hard to be separated from ID data, resulting in sub-optimal performance.
Our key contributions and results are summarized as follows:
- We propose a new framework VOS addressing a pressing issue: unknown-aware deep learning that optimizes for both ID and OOD performance. VOS establishes state-of-the-art results on a challenging object detection task. Compared to the best method, VOS reduces the FPR95 by up to 9.36% while preserving the accuracy on the ID task.

- We conduct extensive ablations and reveal important insights by contrasting different outlier synthesis approaches. We show that VOS is more advantageous than generating outliers directly in the high-dimensional pixel space (e.g., using GAN [@lee2018training]) or using noise as outliers.

- We comprehensively evaluate our method on common OOD detection benchmarks, along with a more challenging yet underexplored task in the context of object detection. Our effort facilitates future research to evaluate OOD detection in a real-world setting.
Problem Setup
We start by formulating the problem of OOD detection in the setting of object detection. Our framework can be easily generalized to image classification when the bounding box is the entire image (see Section [sec:cls]{reference-type="ref" reference="sec:cls"}). Most previous formulations of OOD detection treat entire images as anomalies, which can lead to the ambiguity shown in Figure 1{reference-type="ref" reference="fig::GMM_toy"}. In particular, natural images are composed of numerous objects and components. Knowing which regions of an image are anomalous allows for safer handling of unfamiliar objects. This setting is more realistic in practice, yet also more challenging, as it requires reasoning about OOD uncertainty at the fine-grained object level.
Specifically, we denote the input and label space by $\mathcal{X}=\mathbb{R}^d$ and $\mathcal{Y}=\{1,2,\ldots,K\}$, respectively. Let $\mathbf{x} \in \mathcal{X}$ be the input image, $\mathbf{b} \in \mathbb{R}^4$ be the bounding box coordinates associated with object instances in the image, and $y \in \mathcal{Y}$ be the semantic label for $K$-way classification. An object detection model is trained on in-distribution data $\mathcal{D}=\{(\mathbf{x}_{i},\mathbf{b}_{i}, y_{i})\}_{i=1}^{N}$ drawn from an unknown joint distribution $\mathcal{P}$. We use neural networks with parameters $\theta$ to model the bounding box regression $p_\theta(\mathbf{b} \vert \mathbf{x})$ and the classification $p_\theta(y \vert \mathbf{x}, \mathbf{b})$.
OOD detection can be formulated as a binary classification problem that distinguishes between in- vs. out-of-distribution objects. Let $P_{\mathcal{X}}$ denote the marginal probability distribution on $\mathcal{X}$. Given a test input $\mathbf{x}^* \sim P_{\mathcal{X}}$, as well as an object instance $\mathbf{b}^*$ predicted by the object detector, the goal is to predict $p_\theta(g \vert \mathbf{x}^*, \mathbf{b}^*)$. We use $g=1$ to indicate a detected object being in-distribution, and $g=0$ being out-of-distribution, with semantics outside the support of $\mathcal{Y}$.
{#fig:overview width="100%"}
Method {#sec:method}
Our novel unknown-aware learning framework is illustrated in Figure 2{reference-type="ref" reference="fig:overview"}. The framework encompasses three novel components, addressing the following questions: (1) how to synthesize virtual outliers (Section 3.1{reference-type="ref" reference="sec:gda"}), (2) how to leverage the synthesized outliers for effective model regularization (Section 3.2{reference-type="ref" reference="sec:training"}), and (3) how to perform OOD detection during inference (Section 3.3{reference-type="ref" reference="sec:inference"}).
VOS: Virtual Outlier Synthesis {#sec:gda}
Our framework VOS generates virtual outliers for model regularization, without relying on external data. While a straightforward idea is to train generative models such as GANs [@goodfellow2014generative; @lee2018training], synthesizing images in the high-dimensional pixel space can be difficult to optimize. Instead, our key idea is to synthesize virtual outliers in the feature space, which is more tractable given lower dimensionality. Moreover, our method is based on a discriminatively trained classifier in the object detector, which circumvents the difficult optimization process in training generative models.
Specifically, we assume the feature representation of object instances forms a class-conditional multivariate Gaussian distribution (see Figure [fig:umap]{reference-type="ref" reference="fig:umap"}): $$p_\theta(h(\mathbf{x},\mathbf{b}) \vert y=k)=\mathcal{N}(\boldsymbol\mu_{k}, \mathbf{\Sigma}),$$ where $\boldsymbol\mu_k$ is the Gaussian mean of class $k \in \{1,2,\ldots,K\}$, $\mathbf{\Sigma}$ is the tied covariance matrix, and $h(\mathbf{x},\mathbf{b}) \in \mathbb{R}^m$ is the latent representation of an object instance $(\mathbf{x},\mathbf{b})$. To extract the latent representation, we use the penultimate layer of the neural network. The dimensionality $m$ is significantly smaller than the input dimension $d$.
{width="\linewidth"}
[[fig:umap]]{#fig:umap label="fig:umap"}
To estimate the parameters of the class-conditional Gaussians, we compute the empirical class means $\widehat{\bm\mu}_{k}$ and the tied covariance $\widehat{\mathbf{\Sigma}}$ of the training samples $\{(\mathbf{x}_{i},\mathbf{b}_{i}, y_{i})\}_{i=1}^{N}$:
$$\begin{aligned} \widehat{\bm\mu}_{k}&=\frac{1}{N_{k}} \sum_{i: y_{i}=k} h(\mathbf{x}_i, \mathbf{b}_i), \\ \widehat{\mathbf{\Sigma}}&=\frac{1}{N} \sum_{k} \sum_{i: y_{i}=k}\left(h(\mathbf{x}_i, \mathbf{b}_i)-\widehat{\bm\mu}_{k}\right)\left(h(\mathbf{x}_i,\mathbf{b}_i)-\widehat{\bm\mu}_{k}\right)^{\top}, \label{eq:mean_cov}\end{aligned}$$
where $N_k$ is the number of objects in class $k$, and $N$ is the total number of objects. We use online estimation for efficient training, where we maintain a class-conditional queue with $|Q_k|$ object instances from each class. In each iteration, we enqueue the embeddings of objects to their corresponding class-conditional queues, and dequeue the same number of object embeddings.
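To make this concrete, below is a minimal PyTorch-style sketch of the queue-based estimation of Equation [eq:mean_cov]{reference-type="ref" reference="eq:mean_cov"}. This is our own illustration rather than the released implementation; the class name, method names, and the small ridge term added for numerical stability are assumptions.

```python
import torch

class GaussianEstimator:
    """Online class-conditional Gaussian estimation from per-class FIFO queues."""

    def __init__(self, num_classes: int, feat_dim: int, queue_size: int = 1000):
        self.queues = [torch.empty(0, feat_dim) for _ in range(num_classes)]
        self.queue_size = queue_size

    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        """Enqueue new object embeddings h(x, b); drop the oldest beyond |Q_k|."""
        for k in range(len(self.queues)):
            q = torch.cat([self.queues[k], feats[labels == k]], dim=0)
            self.queues[k] = q[-self.queue_size:]

    def estimate(self):
        """Empirical class means and the tied covariance (assumes non-empty queues)."""
        means = [q.mean(dim=0) for q in self.queues]
        centered = torch.cat([q - mu for q, mu in zip(self.queues, means)], dim=0)
        cov = centered.t() @ centered / centered.shape[0]
        cov = cov + 1e-4 * torch.eye(cov.shape[0])  # ridge for invertibility
        return torch.stack(means), cov
```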
Sampling from the feature representation space. We propose sampling the virtual outliers from the feature representation space, using the multivariate distributions estimated above. Ideally, these virtual outliers should help estimate a more compact decision boundary between ID and OOD data.
To achieve this, we propose sampling the virtual outliers $\mathcal{V}_k$ from the $\epsilon$-likelihood region of the estimated class-conditional distribution: $$\begin{aligned} \mathcal{V}_k= \left\{ \mathbf{v}_k \;\Big\vert\; \frac{1}{(2 \pi)^{m / 2}|\widehat{\mathbf{\Sigma}}|^{1 / 2}} \exp \left(-\frac{1}{2}(\mathbf{v}_k-\widehat{\bm\mu}_k)^{\top} \widehat{\mathbf{\Sigma}}^{-1}(\mathbf{v}_k-\widehat{\bm\mu}_k)\right) < \epsilon \right\}, \label{eq:virtual}\end{aligned}$$ where $\mathbf{v}_k \sim \mathcal{N}(\widehat{\bm\mu}_k,\widehat{\mathbf{\Sigma}})$ denotes the sampled virtual outliers for class $k$, which lie in the sublevel set based on the likelihood. $\epsilon$ is sufficiently small so that the sampled outliers are near the class boundary.
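A sketch of this sampling step, continuing the assumptions above: we draw a per-class candidate pool from $\mathcal{N}(\widehat{\bm\mu}_k,\widehat{\mathbf{\Sigma}})$ and keep only the least likely samples, which implements the $\epsilon$-sublevel set by choosing $\epsilon$ as the $t$-th smallest likelihood in the pool (the selection rule used in our experiments; see Section [sec:experiment]{reference-type="ref" reference="sec:experiment"}):

```python
import torch
from torch.distributions import MultivariateNormal

def sample_virtual_outliers(means: torch.Tensor, cov: torch.Tensor,
                            pool_size: int = 10000, t: int = 1) -> torch.Tensor:
    """Return the t lowest-likelihood samples per class, drawn from N(mu_k, Sigma)."""
    outliers = []
    for mu in means:
        gauss = MultivariateNormal(mu, covariance_matrix=cov)
        pool = gauss.sample((pool_size,))   # candidate virtual outliers
        log_prob = gauss.log_prob(pool)     # log-likelihood of each candidate
        outliers.append(pool[log_prob.argsort()[:t]])
    return torch.cat(outliers, dim=0)       # shape: (K * t, m)
```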
Classification outputs for virtual outliers. For a given sampled virtual outlier $\mathbf{v}\in \mathbb{R}^m$, the output of the classification branch can be derived through a linear transformation: $$f(\mathbf{v}; \theta)= W_\text{cls}^\top\mathbf{v},$$ where $W_\text{cls}\in \mathbb{R}^{m\times K}$ is the weight matrix of the last fully connected layer. We proceed to describe how to regularize the output of virtual outliers for improved OOD detection.
Unknown-aware Training Objective {#sec:training}
We now introduce a new training objective for unknown-aware learning, leveraging the virtual outliers from Section 3.1{reference-type="ref" reference="sec:gda"}. The key idea is to perform the visual recognition task while regularizing the model to produce a low OOD score for ID data and a high OOD score for the synthesized outliers.
Uncertainty regularization for classification.
For simplicity, we first describe the regularization in the multi-class classification setting. The regularization loss should ideally optimize for the separability between the ID vs. OOD data under some function that captures the data density. However, directly estimating $\log p(\mathbf{x})$ can be computationally intractable as it requires sampling from the entire space $\mathcal{X}$. We note that the free energy $E(\mathbf{x};\theta) := - \log \sum_{k=1}^K e^{f_k(\mathbf{x};\theta)}$ equals $-\log p(\mathbf{x})$ up to an unknown additive constant, which can be seen from the following: $$p(y | \mathbf{x}) = \frac{p(\mathbf{x},y)}{p(\mathbf{x})} = \frac{e^{f_y(\mathbf{x};\theta)}}{\sum_{k=1}^K e^{f_k(\mathbf{x};\theta)}},$$ where $f_y(\mathbf{x}; {\theta})$ denotes the $y$-th element of the logit output corresponding to the label $y$. The free energy, i.e., the negative log partition function, was shown to be an effective uncertainty measurement for OOD detection [@liu2020energy].
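To spell out the connection: if the joint density is parameterized by the logits as $p(\mathbf{x}, y) = e^{f_y(\mathbf{x};\theta)}/Z(\theta)$ with an unknown normalizer $Z(\theta)$, then marginalizing over the label gives $$p(\mathbf{x}) = \sum_{k=1}^K p(\mathbf{x}, k) = \frac{\sum_{k=1}^K e^{f_k(\mathbf{x};\theta)}}{Z(\theta)}, \quad \text{hence} \quad \log p(\mathbf{x}) = -E(\mathbf{x};\theta) - \log Z(\theta),$$ so a lower free energy corresponds to a higher data density, up to the unknown constant $\log Z(\theta)$.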
Our idea is to explicitly perform a level-set estimation based on the energy function (threshold at 0), where the ID data has negative energy values and the synthesized outliers have positive energy: $$\begin{aligned}
\mathcal{L}_\text{uncertainty} = \mathbb{E}_{\mathbf{v}\sim \mathcal{V}}~\mathds{1}\{E(\mathbf{v};\theta) > 0\} + \mathbb{E}_{\mathbf{x}\sim \mathcal{D}}~\mathds{1}\{E(\mathbf{x};\theta) \le 0\}.\end{aligned}$$ This is a simpler objective than estimating the density. Since the $0/1$ loss is intractable, we replace it with the binary sigmoid loss, a smooth approximation of the $0/1$ loss, yielding the following: $$\mathcal{L}_\text{uncertainty}=\mathbb{E}_{\mathbf{v}\sim \mathcal{V}} \left[-\log \frac{1}{1+\exp^{- \phi(E(\mathbf{v};\theta))}} \right]+\mathbb{E}_{\mathbf{x} \sim \mathcal{D}} \left[-\log \frac{\exp^{- \phi(E(\mathbf{x};\theta))}}{1+\exp^{- \phi(E(\mathbf{x};\theta))}} \right].
\label{eq:reg_loss}$$ Here $\phi(\cdot)$ is a nonlinear MLP function, which allows learning a flexible energy surface. The learning process shapes the uncertainty surface, which predicts high probability for ID data and low probability for the virtual outliers $\mathbf{v}$. @liu2020energy also employed energy for model uncertainty regularization; however, their loss is based on the squared hinge loss and requires tuning two margin hyperparameters. In contrast, our uncertainty regularization loss is completely hyperparameter-free and much easier to use in practice. Moreover, VOS produces a probabilistic score for OOD detection, whereas [@liu2020energy] relies on the non-probabilistic energy score.
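For illustration, here is a minimal PyTorch sketch of Equation [eq:reg_loss]{reference-type="ref" reference="eq:reg_loss"}, shown for the classification setting (the object-level version in Equation [eq:energy]{reference-type="ref" reference="eq:energy"} additionally weights each class term by the learnable $w_k$). The helper names are assumptions; `phi` stands for the nonlinear MLP $\phi(\cdot)$, e.g., `nn.Sequential(nn.Linear(1, 512), nn.ReLU(), nn.Linear(512, 1))`, applied to the scalar energy:

```python
import torch
import torch.nn.functional as F

def uncertainty_loss(logits_id: torch.Tensor, logits_vos: torch.Tensor, phi) -> torch.Tensor:
    """Binary sigmoid loss: ID objects as positives, virtual outliers as negatives."""
    energy_id = -torch.logsumexp(logits_id, dim=1)    # free energy E(x; theta)
    energy_vos = -torch.logsumexp(logits_vos, dim=1)  # free energy E(v; theta)
    scores = torch.cat([phi(energy_id.unsqueeze(1)),
                        phi(energy_vos.unsqueeze(1))], dim=0).squeeze(1)
    labels = torch.cat([torch.ones_like(energy_id),    # ID: -log sigmoid(-phi(E))
                        torch.zeros_like(energy_vos)]) # outlier: -log sigmoid(phi(E))
    return F.binary_cross_entropy_with_logits(-scores, labels)
```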
Object-level energy score.
In the case of object detection, we replace the image-level energy with the object-level energy score. For an ID object $(\mathbf{x}, \mathbf{b})$, the energy is defined as: $$E(\mathbf{x},\mathbf{b};\theta) = -\log \sum_{k=1}^K w_k\cdot \exp^{f_k((\mathbf{x},\mathbf{b});\theta)}, \label{eq:energy}$$ where $f_k((\mathbf{x},\mathbf{b});\theta)=W^\top_\text{cls}h(\mathbf{x},\mathbf{b})$ is the logit output for class $k$ in the classification branch. The energy score for a virtual outlier is defined analogously. In particular, we will show in Section [sec:experiment]{reference-type="ref" reference="sec:experiment"} that a learnable $\mathbf{w}$ is more flexible than a constant $\mathbf{w}$, given the inherent class imbalance in object detection datasets. Additional analysis on $w_k$ is in Appendix 13{reference-type="ref" reference="sec:app_visual_weight"}.
Overall training objective.
In the case of object detection, the overall training objective combines the standard object detection loss with the uncertainty regularization loss: $$\min_{\theta}~ \mathbb{E}_{(\mathbf{x}, \mathbf{b}, y) \sim \mathcal{D}}\left[\mathcal{L}_\text{cls}+\mathcal{L}_\text{loc}\right]+\beta \cdot \mathcal{L}_\text{uncertainty}, \label{eq:all_loss}$$ where $\beta$ is the weight of the uncertainty regularization, and $\mathcal{L}_\text{cls}$ and $\mathcal{L}_\text{loc}$ are the losses for classification and bounding box regression, respectively. For classification tasks, the objective simplifies by dropping $\mathcal{L}_\text{loc}$. We provide ablation studies in Section [sec:exp_baseline]{reference-type="ref" reference="sec:exp_baseline"} demonstrating the superiority of our loss function.
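Schematically, one training step then combines the pieces above (an illustrative sketch; `loss_cls`, `loss_loc`, `logits_id`, `W_cls`, `beta`, and `optimizer` are assumed to come from the detector's standard training loop):

```python
# One unknown-aware training step, after the regularization kicks in.
means, cov = estimator.estimate()                           # Section 3.1
virtual_outliers = sample_virtual_outliers(means, cov, t=1)
logits_vos = virtual_outliers @ W_cls                       # f(v; theta)
loss = loss_cls + loss_loc + beta * uncertainty_loss(logits_id, logits_vos, phi)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```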
Inference-time OOD Detection {#sec:inference}
During inference, we use the output of the logistic regression uncertainty branch for OOD detection. In particular, given a test input $\mathbf{x}^*$, the object detector produces a bounding box prediction $\mathbf{b}^*$. The OOD uncertainty score for the predicted object $(\mathbf{x}^*, \mathbf{b}^*)$ is given by: $$\begin{aligned}
p_\theta(g \mid \mathbf{x}^*, \mathbf{b}^*) = \frac{\exp^{- \phi(E(\mathbf{x}^*,\mathbf{b}^*))}}{1+\exp^{- \phi(E(\mathbf{x}^*,\mathbf{b}^*))}}.
\label{eq:ood_uncertainty}\end{aligned}$$ For OOD detection, one can exercise the thresholding mechanism to distinguish between ID and OOD objects: $$G(\mathbf{x}^*,\mathbf{b}^*)=\left\{\begin{array}{ll}
1 & \text{if } p_\theta(g \mid \mathbf{x}^*, \mathbf{b}^*)\geq \gamma, \\
0 & \text{if } p_\theta(g \mid \mathbf{x}^*, \mathbf{b}^*) <\gamma.
\end{array}\right.
\label{eq:ood_detection}$$ The threshold $\gamma$ is typically chosen so that a high fraction of ID data (e.g., 95%) is correctly classified. Our framework VOS is summarized in Algorithm [alg:algo]{reference-type="ref" reference="alg:algo"}.
Input: ID data $\mathcal{D}=\{(\mathbf{x}_{i}, \mathbf{b}_{i}, y_{i})\}_{i=1}^{N}$, randomly initialized detector with parameter $\theta$, queue size $|Q_k|$ for Gaussian density estimation, weight for uncertainty regularization $\beta$, and threshold parameter $\epsilon$.
Output: Object detector with parameter $\theta^{*}$, and OOD detector $G$.
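At test time, Equations [eq:ood_uncertainty]{reference-type="ref" reference="eq:ood_uncertainty"} and [eq:ood_detection]{reference-type="ref" reference="eq:ood_detection"} reduce to a sigmoid over the negated learned energy followed by a threshold; a sketch under the same assumed names as above:

```python
import torch

def ood_score(logits: torch.Tensor, phi) -> torch.Tensor:
    """p_theta(g | x, b) = sigmoid(-phi(E)): near 1 for ID objects, near 0 for OOD."""
    energy = -torch.logsumexp(logits, dim=1)
    return torch.sigmoid(-phi(energy.unsqueeze(1))).squeeze(1)

def is_in_distribution(logits: torch.Tensor, phi, gamma: float = 0.5) -> torch.Tensor:
    """G(x, b): True (g = 1) for ID; gamma is chosen so ~95% of ID data is retained."""
    return ood_score(logits, phi) >= gamma
```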
Experimental Results
[[sec:experiment]]{#sec:experiment label="sec:experiment"} In this section, we present empirical evidence to validate the effectiveness of VOS on several real-world tasks, including both object detection (Section 4.1{reference-type="ref" reference="subsec:obj"}) and image classification (Section [subsec:img]{reference-type="ref" reference="subsec:img"}).
Evaluation on Object Detection {#subsec:obj}
Experimental details. We use PASCAL VOC [@DBLP:journals/ijcv/EveringhamGWWZ10] and Berkeley DeepDrive (BDD-100k) [@DBLP:conf/cvpr/YuCWXCLMD20] datasets as the ID training data. For both tasks, we evaluate on two OOD datasets that contain subsets of images from: MS-COCO [@lin2014microsoft] and OpenImages (validation set) [@kuznetsova2020open]. We manually examine the OOD images to ensure they do not contain ID categories. We have open-sourced our benchmark data that allows the community to easily evaluate future methods on object-level OOD detection.
We use the Detectron2 library [@Detectron2018] and train on two backbone architectures: ResNet-50 [@he2016identity] and RegNetX-4.0GF [@DBLP:conf/cvpr/RadosavovicKGHD20]. We employ a two-layer MLP with a ReLU nonlinearity for $\phi$ in Equation [eq:reg_loss]{reference-type="ref" reference="eq:reg_loss"}, with a hidden layer dimension of 512. For each in-distribution class, we use 1,000 samples to estimate the class-conditional Gaussians. Since the threshold $\epsilon$ can be infinitesimally small, we instead choose $\epsilon$ based on the $t$-th smallest likelihood in a pool of 10,000 samples (per-class), generated from the class-conditional Gaussian distribution. A larger $t$ corresponds to a larger threshold $\epsilon$. As shown in Table 2{reference-type="ref" reference="tab:ablation_appendix2_detection"}, a smaller $t$ yields good performance. We set $t=1$ for all our experiments. Extensive details on the datasets are described in Appendix 7{reference-type="ref" reference="sec:dataset"}, along with a comprehensive sensitivity analysis of each hyperparameter (including the queue size $|Q_k|$, coefficient $\beta$, and threshold $\epsilon$) in Appendix 9{reference-type="ref" reference="sec:ablation"}.
Metrics. For evaluating the OOD detection performance, we report: (1) the false positive rate (FPR95) of OOD samples when the true positive rate of ID samples is at 95%; (2) the area under the receiver operating characteristic curve (AUROC). For evaluating the object detection performance on the ID task, we report the common metric of mAP.
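Both metrics can be computed directly from per-object scores (e.g., the score in Equation [eq:ood_uncertainty]{reference-type="ref" reference="eq:ood_uncertainty"}) with standard ROC utilities; a small helper, assuming higher scores indicate ID:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def fpr_at_95_tpr(scores_id: np.ndarray, scores_ood: np.ndarray) -> float:
    """FPR95: OOD false positive rate at the first threshold reaching 95% ID TPR."""
    labels = np.concatenate([np.ones_like(scores_id), np.zeros_like(scores_ood)])
    scores = np.concatenate([scores_id, scores_ood])
    fpr, tpr, _ = roc_curve(labels, scores)
    return float(fpr[np.argmax(tpr >= 0.95)])

def auroc(scores_id: np.ndarray, scores_ood: np.ndarray) -> float:
    labels = np.concatenate([np.ones_like(scores_id), np.zeros_like(scores_ood)])
    return float(roc_auc_score(labels, np.concatenate([scores_id, scores_ood])))
```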
[[sec:exp_baseline]]{#sec:exp_baseline label="sec:exp_baseline"}
VOS outperforms existing approaches. In Table [tab:baseline]{reference-type="ref" reference="tab:baseline"}, we compare VOS with competitive OOD detection methods in the literature. For a fair comparison, all methods use only ID data, without any auxiliary outlier dataset. Our proposed method, VOS, outperforms competitive baselines, including Maximum Softmax Probability [@hendrycks2016baseline], ODIN [@liang2018enhancing], energy score [@liu2020energy], Mahalanobis distance [@lee2018simple], Generalized ODIN [@hsu2020generalized], CSI [@tack2020csi], and Gram matrices [@DBLP:conf/icml/SastryO20]. These approaches rely on a classification model trained primarily for the ID classification task, and can be naturally extended to the object detection model due to the existence of a classification head. The comparison precisely highlights the benefits of incorporating synthesized outliers for model regularization.
Closest to our work is the GAN-based approach for synthesizing outliers [@lee2018training]. Compared to GAN-based synthesis, VOS improves the OOD detection performance (FPR95) by 12.76% on BDD-100k and by 13.40% on PASCAL-VOC (COCO as OOD). Moreover, we show in Table [tab:baseline]{reference-type="ref" reference="tab:baseline"} that VOS achieves stronger OOD detection performance while preserving high accuracy on the original in-distribution task (measured by mAP). This is in contrast with CSI, which degrades the ID task, with mAP decreased by 0.7% on BDD-100k. Details of reproducing the baselines are in Appendix 11{reference-type="ref" reference="sec:reproduce_baseline"}.
Ablation on outlier synthesis approaches. We compare VOS with different synthesis approaches in Table [tab:synthesis]{reference-type="ref" reference="tab:synthesis"}. Specifically, we consider three types of synthesis approaches: (i$^\diamond$) synthesizing outliers in the pixel space, (ii$^\natural$) using noise as outliers, and (iii$^\clubsuit$) using negative proposals from the RPN as outliers. For Type I, we consider GAN-based [@lee2018training] and mixup [@DBLP:conf/iclr/ZhangCDL18] methods. The outputs of the classification branch for outliers are forced to be close to a uniform distribution. For mixup, we consider two different beta distributions, $\operatorname{Beta}(0.4)$ and $\operatorname{Beta}(1)$, and interpolate ID objects in the pixel space. For Type II, we use noise perturbation to create virtual outliers. We consider adding fixed Gaussian noise to the ID features, adding trainable noise to the ID features where the noise is trained to push the outliers away from the ID features, and using fixed Gaussian noise as outliers. Lastly, for Type III, we directly use the negative proposals in the ROI head as the outliers for Equation [eq:reg_loss]{reference-type="ref" reference="eq:reg_loss"}, similar to [@DBLP:journals/corr/abs-2103-02603]. We consider three variants: randomly sampling $n$ negative proposals ($n$ is the number of positive proposals), sampling $n$ negative proposals with larger probability, and using all the negative proposals. All methods are trained under the same setup, with PASCAL-VOC as the in-distribution data and ResNet-50 as the backbone. The loss function is the same as Equation [eq:all_loss]{reference-type="ref" reference="eq:all_loss"} for all variants, with the only difference being the synthesis method.
The results are summarized in Table [tab:synthesis]{reference-type="ref" reference="tab:synthesis"}, where VOS outperforms alternative synthesis approaches both in the feature space ($\clubsuit$, $\natural$) and in the pixel space ($\diamond$). Generating outliers in the pixel space ($\diamond$) is either unstable (GAN) or harmful to the object detection performance (mixup). Introducing noise ($\natural$), especially using fixed Gaussian noise as outliers, is promising. However, Gaussian noise outliers are relatively simple, and may not regularize the decision boundary between ID and OOD as effectively as VOS does. Exploiting the negative proposals ($\clubsuit$) is not effective, because they are distributionally close to the ID data.
Ablation on the uncertainty loss. We perform an ablation on several variants of VOS, trained with different uncertainty losses $\mathcal{L}_\text{uncertainty}$. In particular, we consider: (1) using the squared hinge loss for regularization as in @liu2020energy, (2) using a constant weight $\mathbf{w}=[1,1,...,1]^\top$ for the energy score in Equation [eq:energy]{reference-type="ref" reference="eq:energy"}, and (3) classifying the virtual outliers as an additional $(K+1)$-th class in the classification branch. The performance comparison is summarized in Table [tab:ablation_loss]{reference-type="ref" reference="tab:ablation_loss"}. Compared to the hinge loss, our proposed logistic loss reduces the FPR95 by 10.02% on BDD-100k. While the squared hinge loss in @liu2020energy requires tuning margin hyperparameters, our uncertainty loss is completely hyperparameter-free. In addition, we find that a learnable $\mathbf{w}$ for the energy score is more desirable than a constant $\mathbf{w}$, given the inherent class imbalance in object detection datasets. Finally, classifying the virtual outliers as an additional class increases the difficulty of object classification and does not outperform the alternatives. This ablation demonstrates the superiority of the uncertainty loss employed by VOS.
VOS is effective on alternative architectures. Lastly, we demonstrate that VOS is effective on alternative neural network architectures. In particular, using RegNet [@DBLP:conf/cvpr/RadosavovicKGHD20] as the backbone yields both better ID accuracy and better OOD detection performance. We also explore using intermediate layers for outlier synthesis, and show that applying VOS on the penultimate layer is the most effective. This is expected since the feature representations are the most discriminative at deeper layers. We provide details in Appendix 12{reference-type="ref" reference="sec:intermediate"}.
Comparison with training on real outlier data. We also compare with Outlier Exposure [@hendrycks2018deep] (OE). OE serves as a strong baseline since it relies on real outlier data. We train the object detector on PASCAL-VOC using the same ResNet-50 architecture, and use the OE objective for the classification branch. The real outliers for OE training are sampled from the OpenImages dataset [@kuznetsova2020open]. We perform careful deduplication to ensure there is no overlap between the outlier training data and PASCAL-VOC. Our method achieves OOD detection performance on COCO (AUROC: 88.70%) that favorably matches OE (AUROC: 90.18%), and does not require external data.
Evaluation on Image Classification
[[subsec:img]]{#subsec:img label="subsec:img"} [[sec:cls]]{#sec:cls label="sec:cls"}
[[tab:baseline_cls]]{#tab:baseline_cls label="tab:baseline_cls"}
Going beyond object detection, we show that VOS is also suitable and effective on common image classification benchmarks. We use CIFAR-10 [@cifar] as the ID training data, with standard train/val splits. We train on WideResNet-40 [@zagoruyko2016wide] and DenseNet-101 [@huang2017densely], where we substitute the object detection loss in Equation [eq:all_loss]{reference-type="ref" reference="eq:all_loss"} with the cross-entropy loss. We evaluate on six OOD datasets: Textures [@DBLP:conf/cvpr/CimpoiMKMV14], SVHN [@netzer2011reading], Places365 [@DBLP:journals/pami/ZhouLKO018], LSUN-C [@DBLP:journals/corr/YuZSSX15], LSUN-Resize [@DBLP:journals/corr/YuZSSX15], and iSUN [@DBLP:journals/corr/XuEZFKX15]. The comparisons are shown in Table [tab:baseline_cls]{reference-type="ref" reference="tab:baseline_cls"}, with results averaged over the six test datasets. VOS demonstrates competitive OOD detection results on both architectures without sacrificing ID test classification accuracy (94.84% on the pre-trained WideResNet vs. 94.68% using VOS).
Qualitative Analysis
In Figure 3{reference-type="ref" reference="fig:visual"}, we visualize the predictions on several OOD images, using object detection models trained without virtual outliers (top) and with VOS (bottom), respectively. The in-distribution data is BDD-100k. VOS performs better in identifying OOD objects (in green) than the vanilla object detector, and reduces false positives among the detected objects. Moreover, the confidence score of the false-positive objects of VOS is lower than that of the vanilla model (see the truck in the 3rd column). Additional visualizations are in Appendices [sec:app_visual]{reference-type="ref" reference="sec:app_visual"} and 14{reference-type="ref" reference="sec:app_outlier_visual"}.
{#fig:visual width="100%"}
[[fig:visual]]{#fig:visual label="fig:visual"}
Related work
OOD detection for classification can be broadly categorized into post hoc and regularization-based approaches. In @bendale2016towards, the OpenMax score is developed for OOD detection based on the extreme value theory (EVT). Subsequent work [@hendrycks2016baseline] proposed a simple baseline using maximum softmax probability. Improved algorithms have been proposed, such as ensembling [@DBLP:conf/nips/Lakshminarayanan17], ODIN [@liang2018enhancing], energy score [@liu2020energy], Mahalanobis distance [@lee2018simple], Gram matrices based score [@DBLP:conf/icml/SastryO20], and GradNorm score [@huang2021importance]. Very recently, @sun2021react showed that a simple activation rectification strategy termed ReAct can significantly improve test-time OOD detection. Theoretical understandings on different post-hoc detection methods are provided in [@morteza2022provable]. Different from [@lee2018simple], VOS performs dynamic estimation of class-conditional Gaussian during training, which shapes the uncertainty surface over time using our proposed loss.
Another line of approaches explores model regularization using natural outlier images [@hendrycks2018deep; @mohseni2020self; @DBLP:journals/corr/abs-2106-03917] or images synthesized by GANs [@lee2018training]. However, real outlier data is often infeasible to obtain. Instead, VOS automatically synthesizes virtual outliers, which allows greater flexibility and generality. @tack2020csi applied self-supervised learning for OOD detection, which we compare against in Section [sec:experiment]{reference-type="ref" reference="sec:experiment"}. [@DBLP:journals/ijcv/BlumSNSC21; @DBLP:journals/corr/abs-2107-11264; @Besnier_2021_ICCV] proposed to detect outliers for the semantic segmentation task. @DBLP:conf/visapp/GrcicBS21 trained a generative model and synthesized outliers in the pixel space, which cannot be applied to object detection, where a scene consists of both known and unknown objects. Their regularization is based on entropy maximization, which is different from VOS.
OOD detection for object detection is currently underexplored. [@DBLP:journals/corr/abs-2103-02603] used the energy score [@liu2020energy] to identify OOD data and then labeled it for incremental object detection. In contrast, VOS focuses on OOD detection and adopts a new unknown-aware training objective with a new test-time detection score. Our learning framework is generally applicable to both object detectors and classification models. Moreover, [@DBLP:journals/corr/abs-2103-02603] used the negative proposals as unknown samples for model regularization, which is suboptimal as we show in Table [tab:synthesis]{reference-type="ref" reference="tab:synthesis"}. [@DBLP:journals/corr/abs-2101-05036; @DBLP:journals/corr/abs-2107-04517] focused on uncertainty estimation for localization regression, rather than OOD detection for classification problems. Several works [@DBLP:conf/wacv/DhamijaGVB20; @DBLP:conf/icra/MillerDMS19; @DBLP:conf/icra/MillerNDS18; @DBLP:conf/wacv/0003DSZMCCAS20; @DBLP:journals/corr/abs-2108-03614] used approximate Bayesian methods, such as MC-Dropout [@gal2016dropout], for OOD detection. They require multiple inference passes to generate the uncertainty score, which is computationally expensive on larger datasets and models.
Open-world object detection includes out-of-domain generalization [@DBLP:journals/corr/abs-2108-06753; @DBLP:journals/corr/abs-2104-08381], zero-shot object detection [@DBLP:journals/corr/abs-2104-13921; @DBLP:journals/ijcv/RahmanKP20], and incremental object detection [@DBLP:journals/corr/abs-2002-05347; @DBLP:conf/cvpr/Perez-RuaZHX20]. Most of these works either develop measures to mitigate catastrophic forgetting [@DBLP:journals/corr/abs-2003-08798] or use auxiliary information [@DBLP:journals/ijcv/RahmanKP20], such as class attributes, to perform object detection on unseen data, which is different from our focus on OOD detection.
Conclusion
In this paper, we propose VOS, a novel unknown-aware training framework for OOD detection. Different from methods that require real outlier data, VOS adaptively synthesizes outliers during training by sampling virtual outliers from the low-likelihood region of the class-conditional distributions. The synthesized outliers meaningfully improve the decision boundary between the ID data and OOD data, resulting in superior OOD detection performance while preserving the performance of the ID task. VOS is effective and suitable for both object detection and classification tasks. We hope our work will inspire future research on unknown-aware deep learning in real-world settings.
Reproducibility Statement {#reproducibility-statement .unnumbered}
The authors of the paper recognize the importance and value of reproducible research. We summarize our efforts below to facilitate reproducible results:
- Datasets. We use publicly available datasets, which are described in detail in Section [sec:exp_baseline]{reference-type="ref" reference="sec:exp_baseline"}, Section [sec:cls]{reference-type="ref" reference="sec:cls"}, and Appendix 7{reference-type="ref" reference="sec:dataset"}.

- Baselines. The description and hyperparameters of the OOD detection baselines are explained in Appendix 11{reference-type="ref" reference="sec:reproduce_baseline"}.

- Model training. Our model training on object detection is based on the publicly available Detectron2 codebase: https://github.com/facebookresearch/detectron2. Hyperparameters are specified in Section [sec:exp_baseline]{reference-type="ref" reference="sec:exp_baseline"}, with a thorough ablation study provided in Appendix 9{reference-type="ref" reference="sec:ablation"}.

- Methodology. Our method is fully documented in Section 3{reference-type="ref" reference="sec:method"}, with the pseudo algorithm detailed in Algorithm [alg:algo]{reference-type="ref" reference="alg:algo"}.

- Open Source. The codebase and the dataset will be released for reproducible research. Code is available at https://github.com/deeplearning-wisc/vos.
Ethics statement {#ethics-statement .unnumbered}
Our project aims to improve the reliability and safety of modern machine learning models. Our study can lead to direct benefits and societal impacts, particularly for safety-critical applications such as autonomous driving. Our study does not involve any human subjects or violation of legal compliance. We do not anticipate any potentially harmful consequences to our work. Through our study and releasing our code, we hope to raise stronger research and societal awareness towards the problem of out-of-distribution detection in real-world settings.
Acknowledgement {#acknowledgement .unnumbered}
Research is supported by the Wisconsin Alumni Research Foundation (WARF). We sincerely thank Ziyang (Jack) Cai for helping inspect the OOD datasets, and members of Li's lab for valuable discussions.
Supplementary Material
Experimental details {#sec:dataset}
We summarize the OOD detection evaluation tasks in Table 1{reference-type="ref" reference="tab:task"}. The OOD test data is selected from the MS-COCO and OpenImages datasets, with labels disjoint from the respective ID dataset. The PASCAL-VOC model is trained for a total of 18,000 iterations, and the BDD-100k model for 90,000 iterations. We add the uncertainty regularizer (Equation [eq:reg_loss]{reference-type="ref" reference="eq:reg_loss"}) starting from two-thirds of the way through training. The weight $\beta$ is set to $0.1$. See detailed ablations on the hyperparameters in Appendix 9{reference-type="ref" reference="sec:ablation"}.
::: {#tab:task}
|                             | Task 1                  | Task 2                  |
|-----------------------------|-------------------------|-------------------------|
| ID train dataset            | VOC train               | BDD train               |
| ID val dataset              | VOC val                 | BDD val                 |
| OOD dataset                 | COCO and OpenImages val | COCO and OpenImages val |
| # ID train images           | 16,551                  | 69,853                  |
| # ID val images             | 4,952                   | 10,000                  |
| # OOD images for COCO       | 930                     | 1,880                   |
| # OOD images for OpenImages | 1,761                   | 1,761                   |

: OOD detection evaluation tasks.
:::
Software and hardware {#sec:hardware}
We run all experiments with Python 3.8.5 and PyTorch 1.7.0, using NVIDIA GeForce RTX 2080Ti GPUs.
Effect of hyperparameters {#sec:ablation}
Below we perform a sensitivity analysis for each important hyperparameter[^1]. We use ResNet-50 as the backbone, trained on the in-distribution dataset PASCAL-VOC.
Effect of $\epsilon$. Since the threshold $\epsilon$ can be infinitesimally small, we instead choose $\epsilon$ based on the $t$-th smallest likelihood in a pool of 10,000 samples (per-class), generated from the class-conditional Gaussian distribution. A larger $t$ corresponds to a larger threshold $\epsilon$. As shown in Table 2{reference-type="ref" reference="tab:ablation_appendix2_detection"}, a smaller $t$ yields good performance. We set $t=1$ for all our experiments.
::: {#tab:ablation_appendix2_detection} $t$ mAP$\uparrow$ FPR95 $\downarrow$ AUROC$\uparrow$ AUPR$\uparrow$
1 48.7 **54.69** **83.41** **92.56**
2 48.2 57.96 82.31 88.52
3 48.3 62.39 82.20 88.05
4 48.8 69.72 80.86 89.54
5 48.7 57.57 78.66 88.20
6 48.7 74.03 78.06 91.17
8 48.8 60.12 79.53 92.53
10 47.2 76.25 74.33 90.42
: Ablation study on the number of selected outliers $t$ (per class). :::
Effect of queue size $|Q_k|$. We investigate the effect of the ID queue size $|Q_k|$ in Table 3{reference-type="ref" reference="tab:ablation_appendix1_detection"}, where we vary $|Q_k| \in \{50, 100, 200, 400, 600, 800, 1000\}$. Overall, a larger $|Q_k|$ is more beneficial since the estimation of the Gaussian distribution parameters can be more precise. In our experiments, we set the queue size $|Q_k|$ to $1,000$ for PASCAL-VOC and $300$ for BDD-100k. The queue size is smaller for BDD-100k because some classes have a limited number of object boxes.
::: {#tab:ablation_appendix1_detection} $|Q_k|$ mAP$\uparrow$ FPR95 $\downarrow$ AUROC$\uparrow$ AUPR$\uparrow$
50 48.6 68.42 77.04 92.30
100 48.9 59.77 79.96 89.18
200 48.8 57.80 80.20 89.92
400 48.9 66.85 77.68 89.83
600 48.5 57.32 81.99 91.07
800 48.7 **51.43** 82.26 91.80
1000 48.7 54.69 **83.41** **92.56**
: Ablation study on the ID queue size $|Q_k|$. :::
Effect of $\beta$. As shown in Table 4{reference-type="ref" reference="tab:ablation_appendix3_detection"}, a mild value of $\beta$ generally works well. As expected, a large value (e.g., $\beta=0.5$) will over-regularize the model and harm the performance.
::: {#tab:ablation_appendix3_detection} $\beta$ mAP$\uparrow$ FPR95 $\downarrow$ AUROC$\uparrow$ AUPR$\uparrow$
0.01 48.8 59.20 82.64 90.08
0.05 48.9 57.21 83.27 91.00
0.1 48.7 **54.69** **83.41** **92.56**
0.15 48.5 59.32 77.47 89.06
0.5 36.4 99.33 57.46 85.25
: Ablation study on regularization weight $\beta$. :::
Effect of starting iteration for the regularizer. Importantly, we show that uncertainty regularization should be added in the middle of the training. If it is added too early, the feature space is not sufficiently discriminative for Gaussian distribution estimation. See Table 5{reference-type="ref" reference="tab:ablation_appendix4_detection"} for the effect of starting iteration $Z$. We use $Z=12,000$ for the PASCAL-VOC model, which is trained for a total of 18,000 iterations.
::: {#tab:ablation_appendix4_detection} $Z$ mAP$\uparrow$ FPR95 $\downarrow$ AUROC$\uparrow$ AUPR$\uparrow$
2000 48.5 60.01 78.55 87.62
4000 48.4 61.47 79.85 89.41
6000 48.5 59.62 79.97 89.74
8000 48.7 56.85 80.64 90.71
10000 48.6 49.55 83.22 92.49
12000 48.7 54.69 83.41 92.56
14000 49.0 55.39 81.37 93.00
16000 48.9 59.36 82.70 92.62
: Ablation study on the starting iteration $Z$. Model is trained for a total of 18,000 iterations. :::
Additional visualization results
We provide additional visualization of the detected objects on different OOD datasets with models trained on different in-distribution datasets. The results are shown in Figures 4{reference-type="ref" reference="fig:vi1"}-7{reference-type="ref" reference="fig:vi4"}.
[[sec:app_visual]]{#sec:app_visual label="sec:app_visual"}
{#fig:vi1 width="100%"}
{#fig:vi2 width="100%"}
{#fig:vi3 width="100%"}
{#fig:vi4 width="100%"}
Baselines {#sec:reproduce_baseline}
To evaluate the baselines, we follow the original methods in MSP [@hendrycks2016baseline], ODIN [@liang2018enhancing], Generalized ODIN [@hsu2020generalized], Mahalanobis distance [@lee2018simple], CSI [@tack2020csi], energy score [@liu2020energy], and Gram matrices [@DBLP:conf/icml/SastryO20], and apply them accordingly on the classification branch of the object detectors. For ODIN, the temperature is set to $T=1000$ following the original work. For both ODIN and Mahalanobis distance [@lee2018simple], the noise magnitude is set to $0$ because the region-based object detector is not end-to-end differentiable given the existence of region cropping and ROIAlign. For GAN [@lee2018training], we follow the original paper and use a GAN to generate OOD images. The prediction of the OOD images/objects is regularized to be close to a uniform distribution, through a KL divergence loss with a weight of 0.1. We set the shape of the generated images to be 100$\times$100 and resize them to have the same shape as the real images. We optimize the generator and discriminator using Adam [@DBLP:journals/corr/KingmaB14], with a learning rate of 0.001. For CSI [@tack2020csi], we use the rotations (0$^\circ$, 90$^\circ$, 180$^\circ$, 270$^\circ$) as the self-supervision task. We set the temperature in the contrastive loss to 0.5. We use the features right before the classification branch (with dimension 1024) to perform contrastive learning. The weights of the losses used for classifying shifted instances and for instance discrimination are both set to 0.1 to prevent training collapse. For Generalized ODIN [@hsu2020generalized], we replace and train the classification head of the object detector with the Deconf-C head, which was shown to be the most effective in the original paper.
Virtual outlier synthesis using earlier layer {#sec:intermediate}
In this section, we investigate the effect of using VOS on an earlier layer within the network. Our main results in Table [tab:baseline]{reference-type="ref" reference="tab:baseline"} are based on the penultimate layer of the network. Here, we additionally evaluate the performance using the layer before the penultimate layer, with a feature dimension of $1,024$. The results are summarized in Table 6{reference-type="ref" reference="tab:diff_layers"}. As observed, synthesizing virtual outliers in the penultimate layer achieves better OOD detection performance than the earlier layer, since the feature representations are more discriminative at deeper layers.
::: {#tab:diff_layers} Models FPR95$\downarrow$ AUROC$\uparrow$ mAP$\uparrow$
PASCAL VOC
VOS-final 47.53 88.70 48.9
VOS-earlier 50.24 88.24 48.6
BDD-100k
VOS-final 44.27 86.87 31.3
VOS-earlier 49.66 86.08 30.6
: Performance comparison of employing VOS on different layers. COCO is the OOD data. :::
Visualization of the learnable weight coefficient $w$ in generalized energy score {#sec:app_visual_weight}
To examine whether the learnable weight coefficient $w_k$ in Equation [eq:energy]{reference-type="ref" reference="eq:energy"} captures dataset-specific statistics during uncertainty regularization, we visualize $w_k$ for each in-distribution class, together with the number of training objects of that class, in Figure 8{reference-type="ref" reference="fig:visual_energy_weight"}. We use the BDD-100k dataset [@DBLP:conf/cvpr/YuCWXCLMD20] as the in-distribution dataset and RegNetX-4.0GF [@DBLP:conf/cvpr/RadosavovicKGHD20] as the backbone network. As can be observed, the learned weight coefficients display a trend consistent with the number of training objects per class, which indicates the advantage of using learnable weights rather than a constant weight vector of all 1s.
{#fig:visual_energy_weight width="100%"}
Visualization of the virtual outliers {#sec:app_outlier_visual}
In this section, we visualize the virtual outliers synthesized by VOS using UMAP in Figure 9{reference-type="ref" reference="fig:synthesized_outliers"}. The in-distribution dataset is PASCAL-VOC with a ResNet-50 backbone. Note that we cannot visualize the virtual outliers in the pixel space since they are synthesized in the low-dimensional feature space.
{#fig:synthesized_outliers width="90%"}
As shown in Figure 9{reference-type="ref" reference="fig:synthesized_outliers"}, the virtual outliers reside in the near-boundary region of the in-distribution feature clusters, which helps the model learn a compact decision boundary between ID and OOD objects.
Discussion on the detected, rejected and ignored OOD objects during inference {#sec:app_number_objects}
The focus of VOS is to mitigate the undesirable cases where an OOD object is detected and classified as in-distribution with high confidence. In other words, our goal is to ensure that "if a box is detected, it should faithfully be an in-distribution object rather than OOD". Although generating bounding boxes for OOD data is not the focus of this paper, we do notice that VOS increases the number of boxes detected for OOD data (+25% on the BDD-100k-trained model compared to the vanilla Faster R-CNN).
The number of OOD objects ignored by the RPN largely depends on the confidence score threshold and the NMS threshold. Hence, we found it more meaningful to compare against the vanilla Faster R-CNN under the same default thresholds. Using BDD-100k as the in-distribution dataset and ResNet as the backbone, VOS increases the number of detected OOD boxes by 25% compared to the vanilla object detector. VOS also increases the number of rejected OOD samples by 63%.
[^1]: Note that our sensitivity analysis uses the speckle-noised PASCAL-VOC validation dataset as OOD data, which is different from the actual OOD test datasets in use.