Seminar: DGA-based Botnets: Discovery, Classification, and Tracking

Date: February 25, 2013
Talk Title: DGA-based Botnets: Discovery, Classification, and Tracking
Speaker: Roberto Perdisci, Assistant Professor at the University of Georgia
Time & Location: 12:00pm - 1:00pm
CIC Building, Pittsburgh


Lightweight network-based malware detection systems are often based on static domain name blacklists. Because such systems have been fairly successful in the recent past, malware authors have begun employing domain generation algorithms (DGAs) to dynamically produce large numbers of pseudo-random command and control (C&C) domains, of which only a small subset is actually registered and successfully resolves to the current C&C servers. Furthermore, these pseudo-random C&C domains are active for only short periods of time, which renders detection approaches that rely on static domain blacklists ineffective. If we knew how a given DGA works, we could generate all of its domains ahead of time and still identify and block the malware's C&C traffic. However, this usually requires reverse engineering the malware executables, which is not always feasible.
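To make the idea concrete, here is a minimal Python sketch of a date-seeded DGA and of the "generate all domains ahead of time" defense. It is a toy written for illustration, not the algorithm of any real malware family: the names `toy_dga` and `precompute_blocklist`, the SHA-256 construction, the 12-character labels, and the reserved `.example` TLD are all assumptions.

```python
import hashlib
from datetime import date, timedelta
from typing import List, Set


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example") -> List[str]:
    """Hypothetical DGA: derive `count` pseudo-random domains from a seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters 'a'..'p' to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


def precompute_blocklist(seed: str, start: date, days: int, per_day: int = 5) -> Set[str]:
    """If the seed and algorithm are known, every future domain can be listed in advance."""
    blocklist: Set[str] = set()
    for offset in range(days):
        blocklist.update(toy_dga(seed, start + timedelta(days=offset), per_day))
    return blocklist


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2013, 2, 25)):
        print(name)
```

The bot and its operator run the same function, so the operator only needs to register one of the day's names. A defender can do the same only after recovering the seed and the algorithm, which is the reverse-engineering step the abstract describes as not always feasible.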

This talk presents a novel system capable of detecting pseudo-random domains generated by DGA-based malware without reverse engineering the malware. A key observation is that most DGA-generated domains that malware queries are not actually registered and would therefore result in "non-existent domain" responses (NXDs). Our approach uses a combination of clustering and classification algorithms. The clustering algorithm groups NXDs based on similarities in the domain name strings as well as similarities in the groups of machines that queried them. This allows us to discover sets of NXDs generated by previously unknown DGA-based malware. The classification algorithm is used to label NXDs related to previously identified DGAs, and to enable the detection of active DGA-generated C&C domains. We have implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs hosting millions of machines. Over a period of about fifteen months, our system was able to discover several previously unknown DGA-based malware families and allowed us to detect and classify DGA-generated C&C domains with high accuracy.
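The clustering step can be pictured with a small Python sketch. It is not the system presented in the talk: the features (label length and character entropy), the Jaccard overlap of querying hosts, the thresholds, and the union-find grouping are simple stand-ins chosen for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set


def label_of(domain: str) -> str:
    """Return the leftmost label of a domain name."""
    return domain.split(".")[0]


def entropy(s: str) -> float:
    """Shannon entropy (bits per character) of the characters in s."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def names_look_alike(a: str, b: str, max_len_gap: int = 1, max_entropy_gap: float = 0.5) -> bool:
    """Assumed string similarity: close in label length and in character entropy."""
    la, lb = label_of(a), label_of(b)
    return (abs(len(la) - len(lb)) <= max_len_gap
            and abs(entropy(la) - entropy(lb)) <= max_entropy_gap)


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Overlap between two sets of querying hosts."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def cluster_nxds(nxd_hosts: Dict[str, Set[str]], min_host_overlap: float = 0.5) -> List[Set[str]]:
    """Group NXDs that look alike AND were queried by overlapping sets of machines."""
    parent = {d: d for d in nxd_hosts}

    def find(d: str) -> str:
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    for a, b in combinations(nxd_hosts, 2):
        if names_look_alike(a, b) and jaccard(nxd_hosts[a], nxd_hosts[b]) >= min_host_overlap:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for d in nxd_hosts:
        groups.setdefault(find(d), set()).add(d)
    return sorted(groups.values(), key=len, reverse=True)
```

The input maps each NXD to the set of machines that queried it, for example `{"qwkzjxvbnmtr.example": {"h1", "h2"}, ...}`. Requiring both conditions reflects the abstract's point that the clustering combines name-string similarity with the overlap in querying machines, so a mistyped benign domain queried by one unrelated host does not join a DGA cluster.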

Speaker Bio

Roberto Perdisci is an Assistant Professor in the Computer Science department at the University of Georgia, an Adjunct Assistant Professor in the Georgia Tech School of Computer Science, and a faculty member of the UGA Institute for Artificial Intelligence. Before joining UGA, he was a Post-Doctoral Fellow at the College of Computing of the Georgia Institute of Technology, and Principal Scientist at Damballa, Inc., a network security company based in Atlanta, GA.

Dr. Perdisci is a recipient of the 2012 NSF CAREER Award. His research interests are in Computer and Network Security, and in Machine Learning/Data Mining techniques for efficient analysis and modeling of large datasets. In particular, he has been focusing on modeling and detecting Botnets based on their network behavior.