Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)

Statistical Guarantees for the Robustness of Bayesian Neural Networks

Luca Cardelli¹, Marta Kwiatkowska¹, Luca Laurenti¹, Nicola Paoletti², Andrea Patane¹ and Matthew Wicker¹
¹University of Oxford
²Royal Holloway University of London

Abstract

We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide an experimental comparison of several approximate BNN inference techniques on image classification tasks associated with MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of the uncertainty of BNN predictions in adversarial settings.

1 Introduction

Bayesian Neural Networks (BNNs), i.e. neural networks with distributions over their weights, are gaining momentum for their ability to capture the uncertainty within the learning model, while retaining the main advantages intrinsic to deep neural networks [MacKay, 1992; Gal, 2016]. A wide array of attacks and formal verification techniques have been developed for deterministic (i.e. non-Bayesian) neural networks [Biggio and Roli, 2018]. However, to date, only methods based on pointwise uncertainty computation have been proposed for BNNs [Feinman et al., 2017]. To the best of our knowledge, there are no methods directed at providing guarantees on BNNs that fully take into account their probabilistic nature. This is particularly important in safety-critical applications, where uncertainty estimates can be propagated through the decision pipeline to enable safe decision making [McAllister et al., 2017].

In this work, we present a statistical framework to evaluate the probabilistic robustness of a BNN. The method comes with statistical guarantees, i.e., the estimated robustness meets a priori error and confidence bounds. In particular, given an input point $x \in \mathbb{R}^m$ and a (potentially uncountable) bounded set of input points $T \subseteq \mathbb{R}^m$, we aim to compute the probability (induced by the distribution over the BNN weights) that there exists $x' \in T$ such that the BNN prediction on $x'$ differs from that of $x$. Note that this is a probabilistic generalisation of the usual statement of (deterministic) robustness to adversarial examples [Goodfellow et al., 2014].

We formulate two variants of probabilistic robustness. The first variant describes the probability that the deviation of the network's output (i.e., of the class likelihoods) between $x$ and any point in $T$ is bounded. This variant accounts for the so-called model uncertainty of the BNN, i.e., the uncertainty that derives from partial knowledge about the model parameters. The second variant quantifies the probability that the predicted class label for $x$ is invariant for all points in $T$. This accounts for both model uncertainty and data uncertainty, which is related to intrinsic uncertainty in the labels. These properties allow one to estimate, for instance, the probability of the existence of adversarial examples.

The exact computation of such robustness probabilities is, unfortunately, infeasible, as the posterior distribution of a BNN is analytically intractable in general. Hence, we develop a statistical approach, based on the observation that each sample taken from the (possibly approximate) posterior weight distribution of the BNN induces a deterministic neural network. The latter can thus be analysed using existing verification techniques for deterministic networks (e.g. [Huang et al., 2017; Katz et al., 2017; Ruan et al., 2018]). Thus, we can see the robustness of a BNN as a Bernoulli random variable whose mean is the probability that we seek to estimate (see Section 5). In order to do so, we develop a sequential scheme based on Jegourel et al. [2018], a statistical approach for the formal verification of stochastic systems. Namely, we iteratively sample the BNN posterior and check the robustness of the resulting deterministic network with respect to the input subset $T$. After each iteration, we apply the Massart bounds [Massart, 1990] to check whether the current sample set satisfies the a priori statistical guarantees. Thus, we reduce the number of samples to only those needed to meet the required statistical guarantees. This is essential for the computational feasibility of the method, as each sample entails solving a computationally expensive verification sub-problem. Moreover, our method is generally applicable in that the estimation scheme is independent of the choice of the deterministic verification technique.
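To make the estimation scheme concrete, the following sketch shows a minimal Python loop in the spirit of the approach just described: it repeatedly draws a weight vector from the (approximate) posterior, calls an external verification procedure on the induced deterministic network over $T$, and treats each outcome as a Bernoulli sample. For simplicity, it fixes the sample size in advance using the conservative Chernoff-Hoeffding (Okamoto) bound rather than the sequential Massart-based stopping rule of Jegourel et al. [2018] used in the paper; the names sample_posterior_weights and verify_deterministic are hypothetical placeholders for the chosen inference method and deterministic verification tool.

```python
import math

def required_samples(theta, gamma):
    """Chernoff-Hoeffding (Okamoto) bound: number of Bernoulli samples n such that
    P(|p_hat - p| > theta) <= gamma. A conservative stand-in for the sequential
    Massart bounds used in the paper."""
    return math.ceil(math.log(2.0 / gamma) / (2.0 * theta ** 2))

def estimate_robustness(sample_posterior_weights, verify_deterministic, x, T,
                        theta=0.05, gamma=0.05):
    """Estimate the probability that a network sampled from the BNN posterior
    is robust on the region T around x, with absolute error theta and
    confidence 1 - gamma.

    sample_posterior_weights() -> a weight vector w (one deterministic network)
    verify_deterministic(w, x, T) -> True iff the deterministic network f^w
                                     makes the same prediction on all of T as on x
    Both callables are assumptions: any approximate inference method and any
    deterministic verification tool can be plugged in.
    """
    n = required_samples(theta, gamma)
    successes = 0
    for _ in range(n):
        w = sample_posterior_weights()        # one posterior sample -> one deterministic NN
        if verify_deterministic(w, x, T):     # expensive verification sub-problem
            successes += 1
    return successes / n                      # estimate of the robustness probability
```

The sequential Massart-based scheme typically terminates with fewer samples when the estimated probability is close to 0 or 1, which matters here because every sample entails solving an expensive verification sub-problem.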

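The scheme is agnostic to how the per-sample check is carried out: any deterministic verification technique (e.g. those of Huang et al. [2017], Katz et al. [2017], or Ruan et al. [2018]) can play the role of verify_deterministic above. Purely as a self-contained illustration, the sketch below instead uses random falsification over an $L_\infty$ ball around $x$: it can refute robustness but cannot prove it, so it is a heuristic stand-in rather than the formal verification used in the paper. The forward function (returning the logits of the sampled network) and the radius eps are assumptions.

```python
import numpy as np

def falsify_linf_ball(forward, w, x, eps, n_trials=1000, rng=None):
    """Heuristic check of prediction invariance on T = {x' : ||x' - x||_inf <= eps}.
    Returns False as soon as a perturbed point changes the predicted class,
    True if no counterexample is found (NOT a formal proof of robustness).

    forward(w, x) -> logit vector of the deterministic network f^w at input x.
    """
    rng = np.random.default_rng(rng)
    label = int(np.argmax(forward(w, x)))
    for _ in range(n_trials):
        x_prime = x + rng.uniform(-eps, eps, size=x.shape)   # random point in the ball
        if int(np.argmax(forward(w, x_prime))) != label:     # prediction changed
            return False
    return True
```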
We evaluate our method on fully connected and convolutional neural networks, trained on the MNIST handwritten digits dataset [LeCun and Cortes, 2010] and a two-class subset of the German Traffic Sign Recognition Benchmark (GTSRB) [Stallkamp et al., 2012], respectively. We compare the robustness profiles of three different BNN inference methods (Monte Carlo dropout [Gal and Ghahramani, 2016], variational inference [Blundell et al., 2015], and Hamiltonian Monte Carlo [Neal, 2012]), demonstrating that our notion of probabilistic robustness results in an effective model selection criterion and provides insights into the benefits of BNN stochasticity in mitigating attacks.

In summary, the paper makes the following main contributions:
- We define two variants of probabilistic robustness for BNNs, which generalise safety and reachability defined for deterministic networks. These can be used to quantify robustness against adversarial examples.
- Building on analysis techniques for deterministic neural networks, we design a statistical framework for the estimation of the probabilistic robustness of a BNN, which ensures a priori statistical guarantees.
- We evaluate our methods on state-of-the-art approximate inference approaches for BNNs on MNIST and GTSRB, for a range of properties. We quantify the uncertainty of BNN predictions in adversarial settings.

2 Related Work

Most existing methods for the analysis and verification of neural networks are designed for deterministic models. These can be roughly divided into heuristic search techniques and formal verification techniques. While the focus of the former is usually on finding an adversarial example [Goodfellow et al., 2014; Wicker et al., 2018; Wu et al., 2018], verification techniques strive to formally prove guarantees about the robustness of the network with respect to input perturbations [Huang et al., 2017; Katz et al., 2017; Ruan et al., 2018]. Alternatively, statistical techniques posit a specific distribution in the input space in order to derive a quantitative measure of robustness for deterministic networks [Webb et al., 2018; Cohen et al., 2019]. However, this approach may not be appropriate for safety-critical applications, because these typically require a worst-case analysis, and adversarial examples often occupy a negligibly small portion of the input space. Dvijotham et al. [2018] consider a similar problem, i.e., that of verifying (deterministic) deep learning models over probabilistic inputs. Even though they provide stronger probability bounds than the above statistical approaches, their method is not applicable to BNNs.

Bayesian uncertainty estimation approaches have been investigated as a way to flag possible adversarial examples on deterministic neural networks [Feinman et al., 2017], though recent results suggest that such strategies might be fooled by adversarial attacks designed to generate examples with small uncertainty [Grosse et al., 2018]. However, as these methods build adversarial examples on deterministic networks and use uncertainty only at prediction time, their results do not capture the actual probabilistic behaviour of the BNN in adversarial settings. In contrast, our approach allows for the quantitative analysis of the probabilistic robustness of BNNs, yielding probabilistic guarantees for the absence of adversarial examples.

A Bayesian perspective on adversarial attacks is taken by Rawat et al. [2017], where an experimental evaluation of the relationship between model uncertainty and adversarial examples is given. Similarly, Kendall et al. [2015] study the correlation between uncertainty and per-class prediction accuracy in a semantic segmentation problem. These approaches are, however, pointwise, in that the uncertainty information is estimated for one input at a time. Instead, by applying formal verification techniques to the deterministic NNs sampled from the BNN, our method supports worst-case scenario analysis on possibly uncountable regions of the input space. This allows us to obtain statistical guarantees on probabilistic robustness in the form of a priori error and confidence bounds.

Cardelli et al. [2018] present a method for computing probabilistic guarantees for Gaussian processes in Bayesian inference settings, which applies to fully connected BNNs in the limit of infinite width. However, the method, while exact for Gaussian processes, is only approximate for BNNs, and with an error that cannot be computed.

Another relevant line of work aims to derive PAC bounds on the generalisation error of neural networks [Neyshabur et al., 2017; Bartlett et al., 2017]. However, these bounds are not directly applicable to our robustness estimation problem, as our focus is on analysing how, for a given test point, a perturbation applied to that point causes a prediction change, independently of the point's ground truth.

3 Bayesian Neural Networks

In this section we provide background on learning with BNNs and briefly review the approximate inference methods employed in the remainder of the paper. We use $f^w(x) = [f_1^w(x), \ldots, f_C^w(x)]$ to denote a BNN with $C$ output units and an unspecified number (and kind) of hidden layers, where $\mathbf{w}$ is the weight vector random variable. Given a distribution over $\mathbf{w}$ and a weight vector $w \in \mathbb{R}^W$ sampled from the distribution of $\mathbf{w}$, we denote with $f^w(x)$ the corresponding deterministic neural network with weights fixed to $w$. Let $D = \{(x, c) \mid x \in \mathbb{R}^m,\ c \in \{1, \ldots, C\}\}$ be the training set. We consider classification with a softmax likelihood model, that is, assuming that the likelihood function for observing class $h$, for an input $x \in \mathbb{R}^m$ and a given $w \in \mathbb{R}^W$, is given by $\sigma_h(f^w(x)) = \exp(f_h^w(x)) / \sum_{j=1}^{C} \exp(f_j^w(x))$. We define $\sigma(f^w(x)) = [\sigma_1(f^w(x)), \ldots, \sigma_C(f^w(x))]$, the combined vector of class likelihoods, and similarly we denote with $\sigma(f^{\mathbf{w}}(x))$ the associated random variable induced by the distribution over $\mathbf{w}$.

In Bayesian settings, we assume a prior distribution over the weights, i.e. $\mathbf{w} \sim p(w)$ (usually depending on hyperparameters, omitted here for simplicity), so that learning for the BNN amounts to computing the posterior distribution over the weights, $p(w \mid D)$, via the application of Bayes' rule. Unfortunately, because of the non-linearity generally introduced by the neural network architecture, the computation of the posterior cannot be done analytically [MacKay, 1992]. Hence, various approximation methods have been investigated to perform inference with BNNs in practice. Among these methods, in this work we consider Hamiltonian Monte Carlo [Neal, 2012], variational inference through Bayes by backprop [Blundell et al., 2015], and Monte Carlo dropout [Gal, 2016]. We stress, however, that the method we present is independent of the specific inference technique used, as long as it provides a practical way of sampling weights $w$ from the posterior distribution of $\mathbf{w}$ (or an approximation thereof).
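As a concrete reading of the notation above, the following numpy sketch computes the class-likelihood vector $\sigma(f^w(x))$ for one sampled weight vector and averages it over posterior samples, the standard Monte Carlo approximation of the BNN predictive distribution (cf. Eq. (1) below). The forward function, which evaluates the logits $f^w(x)$ of the sampled deterministic network, is a hypothetical placeholder.

```python
import numpy as np

def softmax(logits):
    """Class likelihoods sigma(f^w(x)) for one sampled (deterministic) network."""
    z = logits - np.max(logits)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

def predictive_class_probs(forward, weight_samples, x):
    """Monte Carlo approximation of the BNN predictive distribution at x:
    average the softmax outputs sigma(f^w(x)) over posterior weight samples."""
    probs = np.stack([softmax(forward(w, x)) for w in weight_samples])
    return probs.mean(axis=0)            # one probability per class h = 1, ..., C
```

Drawing weight_samples is cheap for MCD and VI, while for HMC the samples are those produced by the Markov chain, as described next.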

Hamiltonian Monte Carlo (HMC) proceeds by defining a Markov chain whose invariant distribution is $p(w \mid D)$, and relies on Hamiltonian dynamics to speed up the exploration of the space. Differently from the two other methods discussed below, HMC does not make any assumptions on the form of the posterior distribution, and is asymptotically correct. The result of HMC is a set of samples $w_i$ that approximates $p(w \mid D)$.

Variational Inference (VI) proceeds by finding a suitable approximating distribution $q(w) \approx p(w \mid D)$ in a trade-off between approximation accuracy and scalability. The core idea is that $q(w)$ depends on some hyper-parameters that are then iteratively optimized by minimizing a divergence measure between $q(w)$ and $p(w \mid D)$. Samples can then be efficiently extracted from $q(w)$.

Monte Carlo Dropout (MCD) is an approximate variational inference method based on dropout [Gal and Ghahramani, 2016]. The approximating distribution $q(w)$ takes the form of the product between Bernoulli random variables and the corresponding weights. Hence, sampling from $q(w)$ reduces to sampling Bernoulli variables, and is thus very efficient.

We consider the stochastic process $m(x')$ with values in $\{1, \ldots, C\}$, where the probability that $m(x') = h$, for $h \in \{1, \ldots, C\}$, is given by

$$P(m(x') = h) = \int z \, P\big(\sigma_h(f^{\mathbf{w}}(x')) = z \mid D\big)\, dz. \qquad (1)$$

Taking the classification aspect and the Multinoulli distribution into account poses the following problem.

Problem 2. Consider a neural network $f^{\mathbf{w}}$ with training dataset $D$. Let $x$ be a test point and $T \subseteq \mathbb{R}^m$ a bounded set. We compute the probability
$$p_2 = P(\varphi_2(f^{\mathbf{w}}) \mid D), \quad \text{where} \quad \varphi_2(f^{\mathbf{w}}) \equiv \exists x' \in T \ \text{s.t.}\ m(x') \neq m(x).$$
For $0 \leq \eta \leq 1$, we say that $f^{\mathbf{w}}$ is safe with probability at least $1 - \eta$ in $x$ with respect to the set $T$ iff $p_2 \leq \eta$.

An important consequence of Problem 2 is that, for regions of the input space where the