CyLab faculty, students to present at the 32nd USENIX Security Symposium

Aug 9, 2023

Carnegie Mellon faculty and students will present on a wide range of topics at the 32nd USENIX Security Symposium. Held in Anaheim, CA, on August 9-11, the event brings together experts from around the world, who will highlight the latest advances in the security and privacy of computer systems and networks.

Here, we’ve compiled a list of the ten papers co-authored by CyLab Security and Privacy Institute members that are being presented at the event.

Adversarial Training for Raw-Binary Malware Classifiers

Keane Lucas, Samruddhi Pai, Weiran Lin, and Lujo Bauer, Carnegie Mellon University; Michael K. Reiter, Duke University; Mahmood Sharif, Tel Aviv University

Abstract: Machine learning (ML) models have shown promise in classifying raw executable files (binaries) as malicious or benign with high accuracy. This has led to the increasing influence of ML-based classification methods in academic and real-world malware detection, a critical tool in cybersecurity. However, previous work provoked caution by creating variants of malicious binaries, referred to as adversarial examples, that are transformed in a functionality-preserving way to evade detection. In this work, researchers investigate the effectiveness of using adversarial training methods to create malware classification models that are more robust to some state-of-the-art attacks. To train their most robust models, the authors significantly increase the efficiency and scale of creating adversarial examples to make adversarial training practical, which has not been done before in raw-binary malware detectors. The researchers then analyze the effects of varying the length of adversarial training and of training with various types of attacks. They find that data augmentation does not deter state-of-the-art attacks, but that using a generic gradient-guided method, used in other discrete domains, does improve robustness. They also show that in most cases, models can be made more robust to malware-domain attacks by adversarially training them with lower-effort versions of the same attack. In the best case, one state-of-the-art attack's success rate is reduced from 90% to 5%. The researchers also find that training with some types of attacks can increase robustness to other types of attacks. Finally, they discuss insights gained from their results, and how these can be used to more effectively train robust malware detectors.

Are Consumers Willing to Pay for Security and Privacy of IoT Devices?

Pardis Emami-Naeini, Duke University; Janarth Dheenadhayalan, Yuvraj Agarwal, and Lorrie Faith Cranor, Carnegie Mellon University

Abstract: Internet of Things (IoT) device manufacturers provide little information to consumers about their security and data handling practices. Therefore, IoT consumers cannot make informed purchase choices around security and privacy. While prior research has found that consumers would likely consider security and privacy when purchasing IoT devices, past work lacks empirical evidence as to whether they would actually pay more to purchase devices with enhanced security and privacy. To fill this gap, researchers conducted a two-phase incentive-compatible online study with 180 Prolific participants. They measured the impact of five security and privacy factors (e.g., access control) on participants' purchase behaviors when presented individually or together on an IoT label. Findings revealed participants were willing to pay a significant premium for devices with better security and privacy practices. The biggest price differential they found was for de-identified rather than identifiable cloud storage. Mainly due to its usability challenges, the least valuable improvement for participants was to have multi-factor authentication as opposed to passwords. Based on their findings, the authors provide recommendations on creating more effective IoT security and privacy labeling programs.

Assessing Anonymity Techniques Employed in German Court Decisions: A De-Anonymization Experiment

Dominic Deuber and Michael Keuchen, Friedrich-Alexander-Universität Erlangen-Nürnberg; Nicolas Christin, Carnegie Mellon University

Abstract: Democracy requires transparency. Consequently, courts of law must publish their decisions. At the same time, the interests of the persons involved in these court decisions must be protected. For this reason, court decisions in Europe are anonymized using a variety of techniques. To understand how well these techniques protect the persons involved, researchers conducted an empirical experiment with 54 law students, whom they asked to de-anonymize 50 German court decisions. The authors found that all anonymization techniques used in these court decisions were vulnerable, most notably the use of initials. Since even supposedly secure anonymization techniques proved vulnerable, their work empirically reveals the complexity involved in the anonymization of court decisions, and thus calls for further research to increase anonymity while preserving comprehensibility. Toward that end, the authors provide recommendations for improving anonymization quality. Finally, they provide an empirical notion of “reasonable effort,” to flesh out the definition of anonymity in the legal context. In doing so, they bridge the gap between the technical and the legal understandings of anonymity.

A Two-Decade Retrospective Analysis of a University's Vulnerability to Attacks Exploiting Reused Passwords

🏅 Distinguished Paper Award Winner

Alexandra Nisenoff, University of Chicago / Carnegie Mellon University; Maximilian Golla, University of Chicago / Max Planck Institute for Security and Privacy; Miranda Wei, University of Chicago / University of Washington; Juliette Hainline, Hayley Szymanek, Annika Braun, Annika Hildebrandt, Blair Christensen, David Langenberg, and Blase Ur, University of Chicago

Abstract: Credential-guessing attacks often exploit passwords that were reused across a user's online accounts. To learn how organizations can better protect users, researchers retrospectively analyzed their university’s vulnerability to credential-guessing attacks across twenty years. Given a list of university usernames, the authors searched for matches in both data breaches from hundreds of websites and a dozen large compilations of breaches. After cracking hashed passwords and tweaking guesses, they successfully guessed passwords for 32.0% of accounts matched to a university email address in a data breach, as well as 6.5% of accounts where the username (but not necessarily the domain) matched. Many of these accounts remained vulnerable for years after the breached data was leaked, and passwords found verbatim in breaches were nearly four times as likely to have been exploited (i.e., suspicious account activity was observed) as tweaked guesses. Over 70 different data breaches and various username-matching strategies bootstrapped correct guesses. In surveys of 40 users whose passwords the authors guessed, many users were unaware of the risks to their university account or that their credentials had been breached. This analysis of password reuse at their university provides pragmatic advice for organizations to protect accounts.

BotScreen: Trust Everybody, but Cut the Aimbots Yourself

🏅 Distinguished Paper Award Winner

Minyeop Choi, KAIST; Gihyuk Ko, Cyber Security Research Center at KAIST and Carnegie Mellon University; Sang Kil Cha, KAIST and Cyber Security Research Center at KAIST

Abstract: Aimbots, which help players kill opponents in First-Person Shooter (FPS) games, pose a significant threat to the game industry. Although there has been significant research effort to automatically detect aimbots, existing works suffer from either high server-side overhead or low detection accuracy. In this paper, researchers present a novel aimbot detection design and implementation that they refer to as BotScreen, which is a client-side aimbot detection solution for a popular FPS game, Counter-Strike: Global Offensive (CS:GO). BotScreen is the first to detect aimbots in a distributed fashion, thereby minimizing the server-side overhead. It also leverages a novel deep learning model to precisely detect abnormal behaviors caused by using aimbots. The authors demonstrate the effectiveness of BotScreen in terms of both accuracy and performance on CS:GO. Their tool and dataset are publicly available to support open science.

Distance-Aware Private Set Intersection

Anrin Chakraborti, Duke University; Giulia Fanti, Carnegie Mellon University; Michael K. Reiter, Duke University

Abstract: Private set intersection (PSI) allows two mutually untrusting parties to compute an intersection of their sets, without revealing information about items that are not in the intersection. This work introduces a PSI variant called distance-aware PSI (DA-PSI) for sets whose elements lie in a metric space. DA-PSI returns pairs of items that are within a specified distance threshold of each other. This paper puts forward DA-PSI constructions for two metric spaces: (i) Minkowski distance of order 1 over the set of integers (i.e., for integers a and b, their distance is |a − b|); and (ii) Hamming distance over the set of binary strings of length l. In the Minkowski DA-PSI protocol, the communication complexity scales logarithmically in the distance threshold and linearly in the set size. In the Hamming DA-PSI protocol, the communication volume scales quadratically in the distance threshold and is independent of the string length l. Experimental results with real applications confirm that DA-PSI provides more effective matching at lower cost than naïve solutions.
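
For readers who want a concrete picture of what DA-PSI outputs, the following is a minimal, non-private Python sketch of the result described in the abstract, for both metric spaces. It is not the authors' protocol and provides no privacy at all: both sets are compared in the clear. The function names, the sample sets, and the choice to treat "within a threshold" as "distance less than or equal to the threshold" are illustrative assumptions.

```python
from itertools import product


def minkowski1(a: int, b: int) -> int:
    """Minkowski distance of order 1 over the integers: |a - b|."""
    return abs(a - b)


def hamming(x: str, y: str) -> int:
    """Hamming distance between two equal-length binary strings."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(cx != cy for cx, cy in zip(x, y))


def distance_aware_intersection(set_a, set_b, dist, threshold):
    """Return all pairs (a, b) whose distance is at most `threshold`.

    Assumption: this plaintext reference only shows the output a DA-PSI
    protocol would reveal; a real protocol computes it without exposing
    the non-matching items. The inclusive comparison is also assumed.
    """
    return sorted(
        (a, b) for a, b in product(set_a, set_b) if dist(a, b) <= threshold
    )


# Hypothetical sample inputs for each metric space.
int_pairs = distance_aware_intersection({10, 25, 40}, {12, 31, 100}, minkowski1, 2)
bit_pairs = distance_aware_intersection({"0000", "1111"}, {"0001", "1010"}, hamming, 1)
print(int_pairs)  # [(10, 12)]
print(bit_pairs)  # [('0000', '0001')]
```

With a threshold of 0, the sketch reduces to an ordinary set intersection, which is the case standard PSI handles.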

Defining "Broken": User Experiences and Remediation Tactics When Ad-Blocking or Tracking-Protection Tools Break a Website’s User Experience

Alexandra Nisenoff, University of Chicago and Carnegie Mellon University; Arthur Borem, Madison Pickering, Grant Nakanishi, Maya Thumpasery, and Blase Ur, University of Chicago

Abstract: To counteract the ads and third-party tracking ubiquitous on the web, users turn to blocking tools—ad-blocking and tracking-protection browser extensions and built-in features. Unfortunately, blocking tools can cause non-ad, non-tracking elements of a website to degrade or fail, a phenomenon termed breakage. Examples include missing images, non-functional buttons, and pages failing to load. While the literature frequently discusses breakage, prior work has not systematically mapped and disambiguated the spectrum of user experiences subsumed under “breakage,” nor sought to understand how users experience, prioritize, and attempt to fix breakage. The authors fill these gaps. First, through qualitative analysis of 18,932 extension-store reviews and GitHub issue reports for ten popular blocking tools, they developed novel taxonomies of 38 specific types of breakage and 15 associated mitigation strategies. To understand subjective experiences of breakage, the researchers then conducted a 95-participant survey. Nearly all participants had experienced various types of breakage, and they employed an array of strategies of variable effectiveness in response to specific types of breakage in specific contexts. Unfortunately, participants rarely notified anyone who could fix the root causes. The authors discuss how their taxonomies and results can improve the comprehensiveness and prioritization of ongoing attempts to automatically detect and fix breakage.

DiffSmooth: Certifiably Robust Learning via Diffusion Models and Local Smoothing

Jiawei Zhang, UIUC; Zhongzhu Chen, University of Michigan, Ann Arbor; Huan Zhang, Carnegie Mellon University; Chaowei Xiao, Arizona State University; Bo Li, UIUC

Abstract: Diffusion models have been leveraged to perform adversarial purification and thus provide both empirical and certified robustness for a standard model. On the other hand, different robustly trained smoothed models have been studied to improve the certified robustness. This raises a natural question: Can diffusion models be used to achieve improved certified robustness on those robustly trained smoothed models? In this work, researchers first theoretically show that instances recovered by diffusion models are in the bounded neighborhood of the original instance with high probability; and the "one-shot" denoising diffusion probabilistic models (DDPM) can approximate the mean of the generated distribution of a continuous-time diffusion model, which approximates the original instance under mild conditions. Inspired by their analysis, the authors propose a certifiably robust pipeline, DiffSmooth, which first performs adversarial purification via diffusion models and then maps the purified instances to a common region via a simple yet effective local smoothing strategy. They conduct extensive experiments on different datasets and show that DiffSmooth achieves state-of-the-art (SOTA) certified robustness compared with eight baselines. For instance, DiffSmooth improves the SOTA certified accuracy from 36.0% to 53.0% under ℓ2 radius 1.5 on ImageNet.

Powering for Privacy: Improving User Trust in Smart Speaker Microphones with Intentional Powering and Perceptible Assurance

Youngwook Do and Nivedita Arora, Georgia Institute of Technology; Ali Mirzazadeh and Injoo Moon, Georgia Institute of Technology and Massachusetts Institute of Technology; Eryue Xu, Georgia Institute of Technology; Zhihan Zhang, Georgia Institute of Technology and University of Washington; Gregory D. Abowd, Georgia Institute of Technology and Northeastern University; Sauvik Das, Georgia Institute of Technology and Carnegie Mellon University

Abstract: Smart speakers come with always-on microphones to facilitate voice-based interaction. To address user privacy concerns, existing devices come with a number of privacy features: e.g., mute buttons and local trigger-word detection modules. But it is difficult for users to trust that these manufacturer-provided privacy features actually work given that there is a misalignment of incentives: Google, Meta, and Amazon benefit from collecting personal data and users know it. What's needed is perceptible assurance — privacy features that users can, through physical perception, verify actually work. To that end, researchers introduce, implement, and evaluate the idea of "intentionally-powered" microphones to provide users with perceptible assurance of privacy with smart speakers. The authors employed an iterative-design process to develop Candid Mic, a battery-free, wireless microphone that can only be powered by harvesting energy from intentional user interactions. Moreover, users can visually inspect the (dis)connection between the energy harvesting module and the microphone. Through a within-subjects experiment, they found that Candid Mic provides users with perceptible assurance about whether the microphone is capturing audio or not and improves user trust in using smart speakers relative to mute button interfaces.

User Awareness and Behaviors Concerning Encrypted DNS Settings in Web Browsers

Alexandra Nisenoff, Carnegie Mellon University and University of Chicago; Ranya Sharma and Nick Feamster, University of Chicago

Abstract: Recent developments to encrypt the Domain Name System (DNS) have resulted in major browser and operating system vendors deploying encrypted DNS functionality, often enabling various configurations and settings by default. In many cases, default encrypted DNS settings have implications for performance and privacy; for example, Firefox’s default DNS setting sends all of a user’s DNS queries to Cloudflare, potentially introducing new privacy vulnerabilities. In this paper, researchers confirm that most users are unaware of these developments, including the rollout of these new technologies, the changes in default settings, and the ability to customize encrypted DNS configuration to balance user preferences between privacy and performance. The authors’ findings suggest several important implications for the designers of interfaces for encrypted DNS functionality in both browsers and operating systems, to help improve user awareness concerning these settings, and to ensure that users retain the ability to make choices that allow them to balance trade-offs concerning DNS privacy and performance.