FloCon 2020
Wednesday, January 8 • 8:30am - 9:00am
The Long & Winding Road to “Production-Worthy”

Fraudulent domains are malicious domains that pose as well-known services or websites. Criminal and APT groups use them to target victims, so identifying them is of particular interest to government agencies defending their networks against such attacks. This talk will detail several lessons learned from building and iterating on a production-deployed network defense analytic designed to identify these domains.
Our initial analytic was a heuristic-based approach built on a relatively simple hypothesis. It performed well in operational testing with respect to false positives, but it had a substantial false negative problem that drove our development efforts for the next iteration. For that next version, we extended our heuristic approach and incorporated a machine learning model. Experimental testing led us to incorporate the model in a different way than initially planned, highlighting the classic balance between false negative coverage and false positives. The model proved very valuable both for strengthening the analytic and for validating the hypothesis behind our heuristic approach.
Although we were happy with the experimental results of the second version, we now had a false positive problem. Further complicating matters, the analytic had relatively serious computational shortcomings that prevented it from keeping up with the throughput of data. While we were able to develop a strategy for false positives, extensive profiling of our analytic code pointed to computational problems in our machine learning model that would be non-trivial to solve. We attempted several changes to the model but were ultimately forced to return to the drawing board and implement an entirely new one.
This talk will outline key themes related to developing a “production-worthy” analytic: expanding the scope to solve the operational problem, balancing false negatives and false positives, incorporating software and systems engineering concerns, and measuring performance from several perspectives. We will discuss the specific tools and techniques that we used to overcome the various challenges we faced, and impart the lessons we have learned on our long and winding road to version 3.0 of our fraudulent domain analytic.
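The balance between false negatives and false positives can be illustrated with a small sketch. It is not the analytic described in this talk: the scores, labels, and the `confusion_counts` helper are all hypothetical, and the only assumption is that a model emits a score per domain that is then compared against a threshold.

```python
from typing import Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when domains scoring >= threshold are flagged."""
    tp = fp = fn = tn = 0
    for score, is_fraudulent in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraudulent:
            tp += 1
        elif flagged and not is_fraudulent:
            fp += 1
        elif not flagged and is_fraudulent:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical model scores and ground-truth labels (True = fraudulent).
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [True, True, False, False, True, False]

# Sweep from a strict threshold to a permissive one.
for threshold in (0.7, 0.5, 0.25):
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: fp={fp} fn={fn}")
```

On this toy data, lowering the threshold from 0.7 to 0.25 removes the one false negative but introduces two false positives, which is the trade-off in miniature.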

Attendees Will Learn:
Attendees will learn how to test their analytics from different perspectives. From an operational perspective, we will discuss how to evaluate analytics for coverage of the problem and for false positives, and we will detail approaches for overcoming challenges on either side of that spectrum. From a software perspective, we will discuss how to use code profiling tools to measure the computational performance of analytics. We will provide specific examples of how these tools can improve the quality of an analytic and move a developer closer to “production ready.”
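As a rough illustration of the software perspective, the sketch below profiles a toy heuristic with Python's standard `cProfile` and `pstats` modules. The abstract does not say which profiling tools or language the speakers used, so this choice is an assumption, and `BRANDS`, `edit_distance`, `looks_fraudulent`, and `profile_analytic` are hypothetical stand-ins rather than the analytic from the talk.

```python
import cProfile
import io
import pstats
from typing import List, Tuple

# Hypothetical list of well-known names a fraudulent domain might imitate.
BRANDS = ["paypal", "google", "amazon"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def looks_fraudulent(label: str) -> bool:
    """Toy heuristic: close to a known brand name, but not identical to it."""
    return any(0 < edit_distance(label, brand) <= 2 for brand in BRANDS)


def profile_analytic(domain_labels: List[str]) -> Tuple[List[str], str]:
    """Run the heuristic under cProfile; return flagged labels and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    flagged = [label for label in domain_labels if looks_fraudulent(label)]
    profiler.disable()

    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
    return flagged, report.getvalue()


if __name__ == "__main__":
    flagged, report = profile_analytic(["paypa1", "paypal", "example"] * 200)
    print(f"{len(flagged)} labels flagged")
    print(report)
```

In the printed report, the cumulative-time column shows which function dominates the runtime (here, most likely `edit_distance`), which is the kind of evidence that can indicate whether a component needs tuning or replacing.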

Emily Heath

Capability Area Lead, The MITRE Corporation
Emily Heath is the Capability Area Lead for Cyber Data Analytics and Malware in the Defensive Operations Department at the MITRE Corporation. Her work focuses on the application of machine learning, analytics, and optimization approaches to problems in cybersecurity, ranging from...

Wednesday January 8, 2020 8:30am - 9:00am EST
Regency Ballroom Hyatt Regency Savannah 2 W. Bay Street Savannah GA 31401