Inspecting live/historical packet captures (pcaps) is a typical, daily task for a security analyst. Establishing a methodology to analyze a large amount of packet flows, recognize traffic abnormalities and distinguish normal from malicious patterns is an essential skill to develop.

Packet captures can provide great context throughout the different stages of an attack lifecycle. For example, you can extract indicators of compromise (IoCs) for the malware dropped to gain an initial foothold on a targeted network and establish C2 connectivity. They can also reveal which pivoting techniques were used and, ultimately, the metadata surrounding the exfiltration channel.


Scenario

malware-traffic-analysis.net offers a variety of pcap files available for download to enhance your traffic analysis skills. Today we are going to take a look at the January 2020 exercise entitled “Sol-Lightnet”. The goal of this exercise is to write a malware incident report based on the provided packet capture file. We will go through different pcap analysis stages such as writing initial packet filters, exporting file objects and analyzing malware.

The description of the exercise includes more details on the network segment from which the packet capture was taken:

  • LAN segment range: 10.20.30.0/24
  • Domain: sol-lightnet.com
  • Domain controller: 10.20.30.2 (Sol-Lightnet-DC)
  • LAN segment gateway: 10.20.30.1
  • LAN segment broadcast address: 10.20.30.255

Initial analysis

Triaging packet captures can be a time-consuming task, especially if we are dealing with a large number of packet flows and have few leads to follow. Whenever I get my hands on a packet capture file, I like to start off by extracting basic statistics. This includes identifying the protocols in use, the conversations and the associated ports. By using the Tshark network protocol analyzer, we can apply filters to quickly gather these valuable pieces of information.

To start off, we run a Tshark command that counts the packets for each network protocol. The output below shows that the capture file has over 13,000 packets, with ‘Microsoft Netlogon Remote Protocol’ being the most frequent protocol. As the capture was taken from within a Microsoft Windows domain, this should be considered common traffic. Unsurprisingly, there is also a substantial amount of TCP traffic, inherent to the sessions of higher-level protocols. Lastly, the TLS traffic indicates the presence of encrypted (web) traffic.

pierre@gatekeeper:~/Documents/infosec/pcaps$ tshark -n -r 2020-01-30-traffic-analysis-exercise.pcap -T fields -e _ws.col.Protocol | sort | uniq -c | sort -nr | head -10
   6096 RPC_NETLOGON
   4746 TCP
   1249 TLSv1.3
    355 TLSv1.2
    188 SMB2
    166 DNS
    150 LDAP
    130 HTTP
     76 DRSUAPI
     66 NBNS
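
The `sort | uniq -c | sort -nr | head -10` pipeline used above can also be reproduced in a script when the field output needs further processing. The following is a minimal sketch, assuming the one-value-per-line output of Tshark has already been read into a list; the helper name `top_values` and the sample list are made up for illustration.

```python
from collections import Counter


def top_values(lines, limit=10):
    """Count identical non-empty lines and return the most frequent ones."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(limit)


# Hypothetical sample of one-protocol-per-line output (not taken from the pcap).
sample = ["TCP", "DNS", "TCP", "TLSv1.3", "TCP", "DNS"]

for value, count in top_values(sample):
    # Mimic the "count value" layout produced by uniq -c.
    print(f"{count:7d} {value}")
```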

Our next triage step is to search for established connections. In other words, we want a Tshark filter that finds TCP connections that were successfully established with a server. This is roughly comparable to the ‘ESTABLISHED’ connection state shown by the netstat command in Windows. A TCP connection is established with a three-way handshake, and the individual handshake packets are recorded in the pcap. Unfortunately, Tshark/Wireshark has no single field that isolates a completed handshake, so we have to build a display filter around the attributes of the third message (ACK).

During the final stage of the three-way handshake, the client acknowledges the SYN/ACK sent by the server with an ACK message. This acknowledgment carries no payload, so its TCP segment length is 0. Upon reception, the server enters the established state.

Tshark/Wireshark works with relative sequence numbers by default, which means that it displays SEQ and ACK numbers relative to the first segment seen for that conversation. Therefore, the SYN and the SYN/ACK of a three-way handshake both have a relative sequence number of 0, while the third message (ACK) has a relative sequence number of 1 and a relative acknowledgment number of 1.

Combining these three attributes of the ACK message (zero-length payload, SEQ of 1, ACK of 1) gives the following Tshark filter: tcp.seq == 1 and tcp.ack == 1 and tcp.len == 0. Keep in mind that this filter can also match other empty segments sent before any data is exchanged, so the counts are an approximation.

After applying this filter, we end up with the number of established TCP connections grouped by destination port. The output shows that the majority of TCP connections are established on ports 80 and 443, presumably HTTP and HTTPS. We will definitely take a closer look at this traffic in later stages of our analysis.

pierre@gatekeeper:~/Documents/infosec/pcaps$ tshark -n -r 2020-01-30-traffic-analysis-exercise.pcap  -Y "tcp.len == 0 and tcp.seq == 1 and tcp.ack == 1" -T fields -e tcp.dstport | sort | uniq -c | sort -nr
     51 80
     39 443
     23 389
     21 88
      8 49675
      6 135
      3 445
      1 3268
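
To make the reasoning behind the filter more concrete, the sketch below classifies a segment by its position in the three-way handshake using relative sequence and acknowledgment numbers. It is an illustration only: the `Segment` class and `handshake_step` function are hypothetical names, and the "ACK" result is a candidate match for the same reason the display filter is an approximation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    syn: bool          # SYN flag set
    ack_flag: bool     # ACK flag set
    rel_seq: int       # relative sequence number (as displayed by Wireshark)
    rel_ack: int       # relative acknowledgment number
    payload_len: int   # TCP segment length (tcp.len)


def handshake_step(seg):
    """Return 'SYN', 'SYN/ACK', 'ACK' or None for a single segment."""
    if seg.syn and not seg.ack_flag and seg.rel_seq == 0:
        return "SYN"
    if seg.syn and seg.ack_flag and seg.rel_seq == 0 and seg.rel_ack == 1:
        return "SYN/ACK"
    # Same conditions as: tcp.seq == 1 and tcp.ack == 1 and tcp.len == 0
    if (not seg.syn and seg.ack_flag and seg.rel_seq == 1
            and seg.rel_ack == 1 and seg.payload_len == 0):
        return "ACK"
    return None


handshake = [
    Segment(True, False, 0, 0, 0),   # client -> server
    Segment(True, True, 0, 1, 0),    # server -> client
    Segment(False, True, 1, 1, 0),   # client -> server
]
print([handshake_step(s) for s in handshake])
```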

Next, we want to break down the conversations in the capture file. A conversation is simply a unique network connection between two endpoints. The statistics below show the number of packets per source/destination IP-address pair, which gives a good approximation of the busiest conversations. The IP-address 10.20.30.227 is particularly interesting to us, as it is listed in all of the top 10 entries.

pierre@gatekeeper: ~/Documents/infosec/pcaps$ tshark -n -r 2020-01-30-traffic-analysis-exercise.pcap -T fields -e ip.src -e ip.dst | sort | uniq -c | sort -nr | head -10
   3748 10.20.30.227    10.20.30.2
   3670 10.20.30.2      10.20.30.227
   1596 151.101.0.238   10.20.30.227
    854 216.58.193.132  10.20.30.227
    557 104.97.213.249  10.20.30.227
    299 10.20.30.227    216.58.193.132
    265 10.20.30.227    151.101.0.238
    202 172.217.9.170   10.20.30.227
    142 104.81.85.107   10.20.30.227
    124 172.217.14.163  10.20.30.227
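
Because the command above counts each direction separately, the same pair of hosts appears twice. The sketch below shows one possible way to merge both directions into a single conversation total; the function name `merge_directions` is made up, and the counts are copied from the first rows of the output above.

```python
from collections import defaultdict


def merge_directions(directed_counts):
    """Sum packet counts for (src, dst) and (dst, src) into one entry."""
    totals = defaultdict(int)
    for (src, dst), packets in directed_counts.items():
        key = tuple(sorted((src, dst)))  # direction-independent key
        totals[key] += packets
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Packet counts per direction, taken from the Tshark output above.
directed = {
    ("10.20.30.227", "10.20.30.2"): 3748,
    ("10.20.30.2", "10.20.30.227"): 3670,
    ("151.101.0.238", "10.20.30.227"): 1596,
    ("10.20.30.227", "151.101.0.238"): 265,
    ("216.58.193.132", "10.20.30.227"): 854,
    ("10.20.30.227", "216.58.193.132"): 299,
}

for (host_a, host_b), packets in merge_directions(directed):
    print(f"{packets:7d} {host_a} <-> {host_b}")
```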

Given the exercise description, we know that the IP-address 10.20.30.2 belongs to the domain controller. But what about 10.20.30.227? As a final step of the pcap file triage, we will gather more context around this indicator. For example, we should be able to determine the operating system and manufacturer just by analyzing fields of certain network protocols. In this case, we are particularly interested in the DHCP, NBNS, HTTP and Kerberos traffic. The Tshark filter shown after the list below extracts the following asset attributes:

  • Source MAC-address associated with the IP-address: 58:94:6b:77:9b:3c. The OUI corresponds to Intel Corporate.
  • Hostname in the NBNS Registration packets: DESKTOP-4C02EMG. This is the default hostname naming convention in Windows 10.
  • Username in Kerberos traffic: alejandrina.hogue. The logged on user.

pierre@gatekeeper:~/Documents/infosec/pcaps$ tshark -n -r 2020-01-30-traffic-analysis-exercise.pcap -Y "(ip.src == 10.20.30.227) and (bootp or nbns or kerberos.CNameString and !(kerberos.CNameString contains $))" -T fields -e ip.src -e eth.src -e tcp.dstport -e _ws.col.Protocol -e kerberos.CNameString -e _ws.col.Info -E header=y -E separator=, | sort | uniq -c | sort -nr
     24 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Name query NB WPAD<00>
     12 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Refresh NB DESKTOP-4C02EMG<20>
      7 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Registration NB SOL-LIGHTNET<00>
      7 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Registration NB DESKTOP-4C02EMG<20>
      7 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Registration NB DESKTOP-4C02EMG<00>
      3 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Name query NB ZPCKJIKZGJVCK<00>
      3 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Name query NB SRJNZGVKLQXNA<00>
      3 10.20.30.227,58:94:6b:77:9b:3c,,NBNS,,Name query NB NOVVSLYN<00>
      2 10.20.30.227,58:94:6b:77:9b:3c,88,KRB5,alejandrina.hogue,AS-REQ

Examining the HTTP traffic

Recall the high number of established connections over port 80 that we flagged earlier. Because this is standard HTTP traffic, the header and message body are unencrypted and indicators of an infection could ‘hide’ in plain sight. For example, the request method, URI and user agent are ubiquitous fields of an HTTP request header that might contain unusual values.

Let’s examine all of the HTTP requests that were transmitted by DESKTOP-4C02EMG, hereinafter referred to as “VictimPC”. To isolate the HTTP requests, we simply use the Tshark display filter http.request. This does not, however, get rid of the noisy Simple Service Discovery Protocol (SSDP) traffic. Windows uses the SSDP discovery service to discover UPnP devices. SSDP uses HTTP-style messages sent over UDP port 1900, which we can exclude with the subfilter not udp.port eq 1900.

pierre@gatekeeper:~/Documents/infosec/pcaps$ tshark -n -r 2020-01-30-traffic-analysis-exercise.pcap -Y "((ip.src==10.20.30.227)) and (http.request and not udp.port eq 1900)" -T fields -e frame.time -e ip.src -e ip.dst -e tcp.dstport -e http.host -e http.user_agent -e http.request.method -e http.request.uri -E header=y -E separator=* > http_request.csv

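We end up with a delimited timeline of HTTP requests in chronological order. Note that the fields are separated by ‘*’ rather than commas, so the file has to be imported with that delimiter.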


Now some of the HTTP requests are more suspicious than others. Before following up on the timeline entries, I like to visually mark them with different colors:

  • Red entries have a high confidence level of being malicious.
  • Amber entries have a neutral confidence level and can turn red or green after further analysis.
  • Green entries have low confidence level of being malicious and can be ignored.


We now have fewer than 20 red/amber entries to concentrate on. The ‘Follow HTTP Stream’ feature in Wireshark can help us dig into the raw request/response data of the individual HTTP streams.

Row 6: HTTP GET request to gengrasjeepram[.]com/sv.exe - the “MZ” file header, followed by the DOS stub text “This program cannot be run in DOS mode”, indicates that VictimPC downloaded a Windows PE (portable executable) file.
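
The same check can be scripted against a response body that has been exported to disk. The sketch below is a simplified illustration, not a full PE parser: it verifies the “MZ” magic, reads the 4-byte little-endian offset stored at 0x3C in the DOS header and checks for the “PE\0\0” signature at that offset. The function name `looks_like_pe` and the synthetic test buffer are made up for this example.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Rough check for a Windows PE file based on its header signatures."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: offset of the PE header, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic 0x84-byte buffer: DOS header pointing to a PE signature at 0x80.
dos_header = b"MZ" + b"\x00" * (0x3C - 2) + struct.pack("<I", 0x80)
fake_pe = dos_header + b"\x00" * (0x80 - len(dos_header)) + b"PE\x00\x00"

print(looks_like_pe(fake_pe))
print(looks_like_pe(b"<html><body>not an executable</body></html>" + b" " * 64))
```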


Row 7: HTTP GET request to api[.]ipify[.]org - malware often queries various APIs of external IP lookup websites to capture the victim’s public IP-address. In addition, the substring “Windows NT 6.1” is included in the user agent value for this request. This value is interesting to say the least, as VictimPC is almost certainly running Windows 10 instead of Windows 7. Overall, there is a clear discrepancy in the timeline between the user agent column values in green and the ones in red.
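
The user agent discrepancy can be spotted automatically by extracting the “Windows NT x.y” token and mapping it to a client release. The sketch below is a minimal illustration: the mapping only covers a few client editions (the same NT versions are also used by server editions and, for 10.0, by later releases), and the two user agent strings are hypothetical examples rather than values copied from the capture.

```python
import re

# Partial mapping of NT kernel versions to Windows client releases.
NT_VERSIONS = {
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10",
}


def windows_from_user_agent(user_agent):
    """Return the Windows release advertised in a user agent, or None."""
    match = re.search(r"Windows NT (\d+\.\d+)", user_agent)
    if not match:
        return None
    return NT_VERSIONS.get(match.group(1), f"Windows NT {match.group(1)}")


# Hypothetical user agent strings for illustration.
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
suspect_ua = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; Trident/7.0; rv:11.0) like Gecko"

for ua in (browser_ua, suspect_ua):
    print(windows_from_user_agent(ua), "<-", ua)
```

A host that advertises more than one Windows release within the same capture is worth a closer look, although some applications ship with their own hardcoded user agent.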


Row 8: HTTP POST request to twreptale[.]com/4/forum.php - notice that the system information of VictimPC is enclosed in the body of the HTTP request. The different objects (GUID, BUILD, INFO, IP, TYPE, WIN) of the payload seem to be unique characteristics of the malware we’re after.


At this stage, Yara can help us to identify the malware. A Yara rule consists of string signatures that define and classify malware samples. Searching the Yara signature repository for these objects results in a hit on the MALW_hancitor.yar Yara rule.

pierre@gatekeeper:~/opt/yara$ grep -rin --include=\*.yar 'GUID.*BUILD.*INFO.*IP.*TYPE.*WIN'
../MALW_hancitor.yar:21:   $g = "GUID=%I64u&BUILD=%s&INFO=%s&IP=%s&TYPE=1&WIN=%d.%d" ascii fullword

Hancitor is a trojan horse that was first observed in 2014. It is commonly distributed through phishing emails via a Word document with embedded malicious macros or DDE exploit code. A core capability of Hancitor is that it injects a dropped DLL or executable without writing it to disk, also known as a fileless infection. Hancitor mainly serves as an intermediate stage that downloads another malicious payload, e.g. ransomware or data-stealing malware.

After infection, Hancitor starts sending POST requests to a C2 server with information about the infected system. The data is sent as a list of the objects we saw earlier:

  • GUID - unique ID generated from the result of the GetAdaptersAddresses API
  • BUILD - malware build version (also used as key for config decryption/encryption)
  • INFO - computer and account names
  • IP - infected user’s public address
  • TYPE - hardcoded value of “1”
  • WIN - Windows version and the architecture of the infected machine
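
Since the check-in body is a list of key=value pairs joined with “&”, it can be split with a standard query-string parser. The sketch below is illustrative only: the sample body uses made-up placeholder values (including a documentation IP address), and the helper names `parse_checkin` and `missing_fields` are hypothetical.

```python
from urllib.parse import parse_qsl

EXPECTED_FIELDS = ("GUID", "BUILD", "INFO", "IP", "TYPE", "WIN")


def parse_checkin(body: str) -> dict:
    """Split a key=value&key=value request body into a dictionary."""
    return dict(parse_qsl(body, keep_blank_values=True))


def missing_fields(fields: dict) -> list:
    """Return the expected check-in fields that are absent from the body."""
    return [name for name in EXPECTED_FIELDS if name not in fields]


# Placeholder values for illustration; not taken from the capture.
sample_body = (
    "GUID=1234567890123456789&BUILD=example&INFO=EXAMPLE-PC @ EXAMPLE\\user"
    "&IP=203.0.113.10&TYPE=1&WIN=10.0(x64)"
)

fields = parse_checkin(sample_body)
for name in EXPECTED_FIELDS:
    print(f"{name:5s} = {fields.get(name)}")
```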

The response from the server looks to be encoded. As Hancitor has been around for a while, the encoding algorithm is well known. This article mentions the following de-obfuscation routine: “Base64 decode + XOR with the “0x7A” key.” We can use a tool like CyberChef to quickly extract the data from the payload.


Row 9: HTTP GET request to xolightfinance[.]com/4/bhola/images/1 - this is one of the URLs embedded in the payload we just decoded. The response of the server seems to be encrypted. This is probably the stage where the Hancitor malware attempts to download the second-stage malware.


Rows 10-11: HTTP POST requests to twreptale[.]com/mlu/forum.php and twreptale[.]com/d2/about.php - the content of both the request and response message bodies looks odd. A quick Google search on the URIs reveals that these HTTP requests are likely C2 callbacks of the credential-stealing malware Pony. We will confirm this later on, when we take a closer look at the malware sample itself.


Rows 12-14 and 32: more HTTP POST requests to twreptale[.]com - these are Hancitor C2 callbacks. The response of the server can be decoded with the same algorithm we used earlier. The value JHSQARRABw== translates to: “^.ê{n:}”. This is most likely an instruction sent by the C2 server.
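
The de-obfuscation routine quoted earlier (Base64 decode, then XOR with the key 0x7A) is short enough to script instead of using CyberChef. The sketch below applies it to the response value mentioned above; the function names are made up, and the “printable” rendering that replaces control bytes with a dot is only an assumption about how the decoded bytes were displayed.

```python
import base64


def decode_response(encoded: str, key: int = 0x7A) -> bytes:
    """Base64-decode the value and XOR every byte with a single-byte key."""
    raw = base64.b64decode(encoded)
    return bytes(byte ^ key for byte in raw)


def to_printable(data: bytes) -> str:
    """Render bytes as Latin-1 text, replacing control bytes with a dot."""
    return "".join(
        chr(byte) if 0x20 <= byte < 0x7F or byte >= 0xA0 else "."
        for byte in data
    )


decoded = decode_response("JHSQARRABw==")
print(decoded.hex())
print(to_printable(decoded))
```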


Rows 27, 28 and 33 to 39: these amber entries could not be correlated with the infection traffic and can be considered benign.


Basic Malware Analysis

With just a few clicks, Wireshark can extract the HTTP objects from the capture file. In this case we are only interested in the payloads which were downloaded remotely. We can consequently analyze these malware samples to collect host-based IoCs/artifacts and confirm our findings highlighted in the web analysis phase.

  • Hancitor - sv.exe - 995cbbb422634d497d65e12454cd5832cf1b4422189d9ec06efa88ed56891cda
  • Pony (most likely) - 1 - f8e421120a437352ec4f4f73f729b86795917ca1b36b28ff201b2e295ca6f627


Before we detonate the malware in our sandbox, we will parse some of the static properties (e.g. version info, compilation date) from the PE file. The extracted data from ‘sv.exe’ reveals that it is attempting to impersonate a legitimate software utility, the ‘Directory Opus Image Viewer’ (see the version-info fields in the table below). Malware authors commonly build their executables with the same information as legitimate installers, in an attempt to make the binary look more genuine. In addition, the malware sample declares a diverse set of imports. Imports are located in the Import Address Table (IAT), which contains a list of function pointers to APIs located in external DLLs. The functions imported by “sv.exe” suggest that it performs various operations on the system such as process creation, file/registry manipulation and network communication. No surprises here.

   
Static properties of sv.exe:

  • File Size: 80KB (81920 bytes)
  • LegalCopyright: Copyright 1999-2012 GP Software
  • Description: PE32 executable (GUI) Intel 80386, for MS Windows
  • LegalTradeMarks: Directory Opus, PCOpus are trademarks of GPSoftware
  • Comments: Directory Opus Image Viewer
  • OriginalFileName: d8viewer.exe
  • CompanyName: GP Software
  • ProductName: Directory Opus
  • FileDescription: D8Viewer
  • Compile Time: Wed Oct 30 05:38:49 2019 UTC
  • FileVersion: 4, 0, 0, 8
  • Compiler/Packer: Borland Delphi 30

Imports:

  • RegCreateKeyExA (advapi32.dll)
  • GetFileAttributesA (kernel32.dll)
  • VirtualProtect (kernel32.dll)
  • LoadLibraryA (kernel32.dll)
  • GetStartupInfoA (kernel32.dll)
  • CreateProcessA (kernel32.dll)
  • ShellExecuteA (shell32.dll)
  • socket (wsock32.dll)
  • connect (wsock32.dll)
  • closesocket (wsock32.dll)

In order to expose the dynamic behavior of the malware in our sandbox, we will use a tool named “Noriben”. Noriben is a Python script that works in conjunction with Procmon (from Sysinternals) and automatically collects, analyses and reports on runtime indicators of malware. Upon execution of the malware, it aggregates all of its network, process, registry and file events and stores them in a text/CSV timeline file.

After running “sv.exe” in our sandbox, only a small number of events were recorded by Noriben. The output shows that the sample initially established a connection to the IP 54.243.147.226, which corresponds to the ‘IP-lookup domain’ ipify.org (row 7 of the HTTP timeline). No further activity was recorded after this.

-=] Sandbox Analysis Report generated by Noriben v1.8.3
-=] Developed by Brian Baskin: brian @@ thebaskins.com  @bbaskin
-=] The latest release can be found at https://github.com/Rurik/Noriben

-=] Execution time: 34.47 seconds
-=] Processing time: 1.59 seconds
-=] Analysis time: 0.90 seconds

Processes Created:
==================

File Activity:
==================

Registry Activity:
==================
[RegSetValue] sv.exe:7556 > HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\Content\CachePrefix  =  
[RegSetValue] sv.exe:7556 > HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\Cookies\CachePrefix  =  Cookie:
[RegSetValue] sv.exe:7556 > HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\History\CachePrefix  =  Visited:

Network Traffic:
==================
[UDP] svchost.exe:1684 > 192.168.1.254:53
[UDP] 192.168.1.254:53 > svchost.exe:1684
[TCP] sv.exe:7556 > 54.243.147.226:80
[TCP] 54.243.147.226:80 > sv.exe:7556

Unique Hosts:
==================
192.168.1.114
192.168.1.254
54.243.147.226
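
When several samples are detonated, the network section of such a report can be reduced to the outbound connections of one process with a few lines of code. The sketch below assumes the line layout shown in the report above (protocol in brackets, then “process:pid > ip:port”); the function name `outbound_connections` is hypothetical and the parser has only been written against this excerpt.

```python
import re


def outbound_connections(report_text, process):
    """Return (protocol, remote_ip, remote_port) tuples opened by a process."""
    pattern = re.compile(
        r"\[(TCP|UDP)\] " + re.escape(process)
        + r":\d+ > (\d+\.\d+\.\d+\.\d+):(\d+)$"
    )
    results = []
    for line in report_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            proto, ip, port = match.groups()
            results.append((proto, ip, int(port)))
    return results


# Network section copied from the Noriben report above.
network_section = """\
[UDP] svchost.exe:1684 > 192.168.1.254:53
[UDP] 192.168.1.254:53 > svchost.exe:1684
[TCP] sv.exe:7556 > 54.243.147.226:80
[TCP] 54.243.147.226:80 > sv.exe:7556
"""

print(outbound_connections(network_section, "sv.exe"))
```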

Where are the C2 callbacks and how about the download of the second-stage malware? The answer is simple. The local DNS server was unable to resolve the malicious domains associated with the malware, as they have been taken offline. Example below:


At this point, we could use a tool like FakeNet-NG to redirect and capture the initial Hancitor C2 traffic. However, the malware would still not be able to fully execute, e.g. the second-stage malware cannot be downloaded. As the domains have been taken offline, our only option is to look for historical uploads of the malware on online sandboxes.


Online Sandbox Analysis

When dealing with sophisticated malware, nothing beats an experienced reverse engineer manually interacting with the malware in question. This can, however, be a time-consuming process where some form of automation is required. This is where online sandbox services like Any.run and Hybrid Analysis come into play. Besides being an amazing resource for generating threat hunting hypotheses, online sandboxes can support repetitive and frequent malware analysis tasks.

We continue by gathering some quick intel on both files we exported earlier, using a tool called Munin, developed by Florian Roth. It checks for the presence of file hashes and associated attributes on popular online malware analysis platforms such as VirusTotal, URLhaus and Any.run. The real strength of Munin lies in automating lookups for a high number of files. For demonstration purposes, we only run it against the two files, but you get the idea.

As a side note: Munin will only query the different APIs based on the file’s hash value. Hence, you should not be worried about uploading targeted malware when using this tool.


Right off the bat we can see that there are quite a few findings for both malware samples:

  • Both files were first uploaded on 29 January, 2020. At the time of writing, this is just over a month ago.
  • VirusTotal has flagged both malware samples. The first file, ‘sv.exe’, is even associated with an accurate malware family name (Win32/Hancitor.A!MTB and Win32/TrojanDownloader.Hancitor.0) by Kaspersky and ESET-NOD32.
  • The file hashes correspond to different filenames, e.g. “d8viewer.exe” and “2020-1-30-traffic-analysis-exercise.jar”.
  • Many of the malicious domains are taken offline, confirmed by URLHaus.
  • Any.run, Hybrid Analysis and Cape have historical sandbox analysis reports available.

Compared with our offline sandbox run with Noriben, the analysis results of Any.run are far more extensive. At the time this malware sample was uploaded (2 February 2020), the C2 domains were still accessible, so the malware did not pause or terminate unexpectedly. It looks like the Hancitor trojan was spread via a phishing campaign. The malicious Hancitor payload, in this case named “gift.exe”, was embedded in an Excel macro and ultimately dropped in the user’s Documents folder. Hancitor then spawned an instance of the system process svchost and injected it with malicious code. Another payload was downloaded and used as a data theft tool. It is recognized by Any.run as Pony and queries several files/directories of browsers and FTP applications for credentials. It is safe to say that we have strengthened and confirmed our earlier web analysis findings and completed the malware incident report.

Host artifacts

Network activities:
  • sv.exe connected to api[.]ipify[.]org
  • sv.exe connected to twreptale[.]com
  • sv.exe connected to xolightfinance[.]com

File activities:
  • sv.exe accessed files within "C:\ProgramData\FileZilla\"
  • sv.exe accessed files within "%APPDATA%\SmartFTP\"
  • sv.exe accessed files within "C:\ProgramData\VanDyke\Config\Sessions\"
  • sv.exe accessed files within "C:\Users\%USERNAME%\AppData\Local\CuteFTP\"
  • –redacted–

Registry activities:
  • sv.exe set value HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\Cookies\CachePrefix = Cookie:
  • sv.exe set value HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\Cache\History\CachePrefix = Visited:

Process activities: (none listed)
