A Survey of Visualization Systems for Malware Analysis

Authors

Markus Wagner1,2, Fabian Fischer3, Robert Luh1, Andrea Haberson1, Alexander Rind1,2, Daniel Keim3, Wolfgang Aigner1,2



1St. Poelten University of Applied Sciences, Austria
2Vienna University of Technology, Austria
3University of Konstanz, Germany


Abstract

Due to the increasing threat from malicious software (malware), monitoring of vulnerable systems is becoming increasingly important. The need to log and analyze activity encompasses networks, individual computers, as well as mobile devices. While there are various automatic approaches and techniques available to detect, identify, or capture malware, the actual analysis of the ever-increasing number of suspicious samples is a time-consuming process for malware analysts. The use of visualization and highly interactive visual analytics systems can help to support this analysis process with respect to investigation, comparison, and summarization of malware samples. Currently, there is no survey available that reviews available visualization systems supporting this important and emerging field. We provide a systematic overview and categorization of malware visualization systems and compare and categorize them from the perspective of visual analytics. Additionally, we identify and evaluate data providers and commercial tools that produce meaningful input data for the reviewed malware visualization systems. This helps to reveal data types that are currently underrepresented, enabling new research opportunities in the visualization community.


Overview

This image shows the different stages of malicious software analysis. Every aspect of the process is covered in this survey.

Interactive Exploration

Interactive exploration of the categorization of the covered papers.

Full Text

Postprint version of the conference paper.

Additional Material

This section will include examples of the output data of the different data providers described in section 3. Additionally, we will provide a list of the keywords used in our literature research, as described in section 4.

Data Provider Report Examples

  • Anubis returns a high-level report that lists file, process, registry and network activity.
  • Cuckoo Sandbox's report file returns simple file, registry, and mutex interactions as well as limited static information.
  • CWSandbox is very similar to Anubis and Joe Sandbox: it returns a tidied-up list of the file system, registry, network, and other OS operations the sample performed.

  • FireEye (MAS) returns a textual trace that includes general file information, Yara signature matches and malicious alerts (certain API calls, process activity, etc.) triggered by the sample.
  • Joe Sandbox returns a comprehensive list of system activities and collects dropped files as well as a network trace.
  • Process Monitor (ProcMon) returns an abstracted view of the system’s API activity; its output includes the time and type of each resource access as well as the stack of the respective thread.
  • API Monitor (APIMon) returns an uninterpreted trace of API and system calls.
  • Disassemblers and debuggers yield low-level data (e.g., CPU instructions) that is especially useful for image-based techniques and other raw-data visualization.
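
The reports listed above share a few recurring activity categories (file, registry, network, process). As a rough illustration of how such output could be brought into a common shape before visualization, the following Python sketch counts events per category. The record type `ActivityEvent`, its fields, the provider labels, and the sample events are hypothetical and do not reflect the actual report format of any of the tools named above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityEvent:
    """Hypothetical normalized record; not the schema of any real tool."""
    provider: str   # illustrative label for the tool that produced the event
    category: str   # e.g. "file", "registry", "network", "process"
    operation: str  # e.g. "create", "write", "connect"
    target: str     # resource the operation was applied to


def summarize_by_category(events):
    """Count how many events fall into each activity category."""
    return Counter(event.category for event in events)


# Made-up sample data, for illustration only.
events = [
    ActivityEvent("sandbox-a", "file", "create", "C:\\Temp\\dropped.bin"),
    ActivityEvent("sandbox-a", "file", "write", "C:\\Temp\\dropped.bin"),
    ActivityEvent("sandbox-a", "registry", "set", "HKCU\\Software\\Example"),
    ActivityEvent("sandbox-b", "network", "connect", "192.0.2.10:80"),
]

summary = summarize_by_category(events)
for category, count in sorted(summary.items()):
    print(f"{category}: {count}")
```

A per-category count like this is only one possible starting point; low-level data such as instruction traces would need a different representation.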

Keywords for Literature Research

List of keywords used for the literature research


Figures under CC-BY License

Webpage with CC-BY content


Funding

This work was supported by the Austrian Science Fund (FWF) via the KAVA-Time project (P25489) and the Austrian Federal Ministry for Transport, Innovation and Technology via the KIRAS project (836264).
Additionally, it was partially supported by the DFG Priority Programme "Scalable Visual Analytics: Interactive Visual Analysis Systems of Complex Information Spaces" (SPP 1335).