Melting the ice; Looking into IcedID loaders

By Jonathan Khananshvili, SOC Team Leader and Senior Analyst at Bugsec.

The IcedID (aka BokBot) banking Trojan was first discovered in 2017 and has been active ever since.
To this day, IcedID is spread chiefly via malspam emails, typically containing Office file attachments, and targets large enterprises and financial organizations.

In March, April and May 2021, we observed a vast campaign launched by IcedID (or, more accurately, by “Lunar Spider”).

The campaign included highly targeted phishing emails sent from compromised accounts, carrying malicious XLM 4.0 macros that dropped a PE file used for the actor’s reconnaissance. In most cases, the final stages of the infection lead to the deployment of ransomware such as REvil, Maze and Egregor.

In today’s post, we’ll share the flow and evolution of the initial attack, along with our recommendations for hunting in your enterprise’s network.

Digital documents often contain hidden metadata about a given file.
While threat actors are often aware of this and constantly hide or randomize it, identifying a pattern of hidden metadata within a campaign can be extremely valuable for threat researchers. It helps map all of the campaign’s operations and gives a good understanding of the threat actor, the scale, and maybe even the actor’s specific purposes or targets.

By gathering a massive number of samples that we found connected to this IcedID campaign, we could identify some clear metadata indicators about the actor and the campaign itself, such as the author of the document, its creation date, and the heading pairs, which display a count of each sheet type in the document’s original language.

We’ve found those parameters to be connected to the EtterSilent maldoc builder, which has made huge waves in underground forums in recent months and serves many other actors, such as Gozi and Qbot.

Author: Rabota
Create date: 2015:06:05 18:19:34Z
Heading pair(s): Листы, 1, Макросы Excel 4.0, \d{1,6} (Russian for “Sheets” and “Excel 4.0 Macros”)

The delivery method starts with a phishing email carrying an attached Office document or, alternatively, a URL to download one. We observed many unique spear-phishing emails tailored to the target, mainly using compromised accounts to mimic a conversation with a familiar person and build credibility with the target. It appears that Lunar Spider put effort into improving their SE (social engineering) skills.

Although a large number of unique emails were recorded, it appears that our spider was lazy with the naming convention of the malicious documents. We’ve compiled a list of the four main naming conventions we observed during our research.

Bear with us. In the hunting section below, we will share regexes and hunting recommendations to check whether you have been targeted or compromised.

First, let’s take an example of an attachment from early March.
Opening the Excel document shows a typical social engineering image that asks the user to enable editing and content so the macros can execute.

We can quickly identify that the document is a CDF (compound document file, also known as OLE2) document by reviewing its magic bytes.
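The magic-byte check can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in the research; the function names are our own, and the ZIP signature is included only because the OOXML variant discussed later is a ZIP archive.

```python
# The 8-byte signature that opens every OLE2 / compound document file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# OOXML documents are ZIP archives and start with a local file header instead.
ZIP_MAGIC = b"PK\x03\x04"


def identify_container(header: bytes) -> str:
    """Classify a file by its leading magic bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_container(handle.read(8))
```

Calling identify_file() on a sample path returns "ole2" for the legacy .xls container and "zip" for OOXML documents.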

Although the document contains no VBA and nothing looks odd at first glance, we should take a closer look.

We can review the CDF’s streams and storages inside the document and locate the source of the macro.

A book stream of a CDF document begins with a BOF record and is followed by workbook global records up to the first EOF. The workbook global section contains a BOUNDSHEET record for each sheet in the workbook.

The BOUNDSHEET record sheds some light on the sheet’s type and hopefully gives us an idea of what we’re dealing with.

By reviewing the sheet type field of each record, we can see that two of the three sheets are Excel 4.0 macro sheets. Excel 4.0 macros are a roughly 30-year-old feature of Microsoft Excel that has been gaining popularity among malware authors over the last year.

The interesting part here is that the author of the document decided not to use the “hsState” option, which can mark the given sheets as “hidden” or “very hidden”. This choice is probably related to the number of detections implemented lately for that specific option: XLM 4.0 is not commonly used, and documents that additionally mark such sheets as “hidden” or “very hidden” look extremely odd.
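To make the BOUNDSHEET fields discussed above concrete, here is a hedged Python sketch that walks the records of a workbook stream and reports each sheet’s type and visibility. It assumes the stream bytes have already been extracted from the compound document (for example with a third-party OLE parser, not shown here) and follows the BIFF8 layout: a 4-byte stream position, one byte whose low two bits are hsState, one sheet type byte, then the sheet name. The function and table names are our own.

```python
import struct

BOUNDSHEET = 0x0085  # record id of BOUNDSHEET
EOF = 0x000A         # record id that closes the workbook globals
SHEET_TYPES = {0x00: "worksheet", 0x01: "Excel 4.0 macro sheet",
               0x02: "chart", 0x06: "VBA module"}
VISIBILITY = {0: "visible", 1: "hidden", 2: "very hidden"}


def list_boundsheets(stream: bytes):
    """Return (name, sheet type, visibility) for each BOUNDSHEET record."""
    sheets = []
    offset = 0
    while offset + 4 <= len(stream):
        # Every record starts with a 2-byte id and a 2-byte body length.
        record_id, size = struct.unpack_from("<HH", stream, offset)
        body = stream[offset + 4: offset + 4 + size]
        offset += 4 + size
        if record_id == EOF:
            break  # end of the workbook globals section
        if record_id != BOUNDSHEET or len(body) < 8:
            continue
        # Stream position, hsState byte, sheet type byte, name length, name flags.
        _pos, hs_state, sheet_type, name_len, name_flags = struct.unpack_from(
            "<IBBBB", body)
        raw_name = body[8:]
        if name_flags & 0x01:   # name stored as UTF-16LE
            name = raw_name[: name_len * 2].decode("utf-16-le", "replace")
        else:                   # compressed: one byte per character
            name = raw_name[:name_len].decode("latin-1")
        sheets.append((name,
                       SHEET_TYPES.get(sheet_type, hex(sheet_type)),
                       VISIBILITY.get(hs_state & 0x03, "unknown")))
    return sheets
```

A macro sheet reported as “visible” matches the behaviour described above, while “hidden” or “very hidden” macro sheets are the pattern that many detections key on.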

Reviewing the macros:

Once we understood that two of the sheets contain XLM 4.0 formulas, we could extract and reveal them from both sheets.

The flow of the macro is obfuscated: all the formulas and strings are scrambled across cells and sheets.

The deobfuscated version is the initial dropper, which downloads the 1st stage of the malware:

It names a .dat file using Excel’s NOW() function, which returns a numeric serial number representing the date and time of execution. We found this to be a great pivot point for hunting compromised machines; we’ll discuss it further in the hunting section.

Afterwards, it tries to download this file from three different C2 locations using the URLDownloadToFileA WinAPI function:

188.127.254.114
185.82.219.160
45.140.146.36

As part of the download, it saves the file under a different name, in this example “SOT.GOT{0-2}”.

The last part is the execution of the downloaded file. It uses rundll32 to run “SOT.GOT” with a given function, “DllRegisterServer”.

As you may guess, the “dat” file isn’t a regular data file. It is a PE DLL.

During our study, we were able to follow the evolution of another delivery method used by the actor. The technique runs XLM 4.0 macros from scrambled cells to download and execute the same discovery functionality exactly as before, but this time in an OOXML document rather than the OLE format above.

Since the middle of March, the OOXML version has been the main delivery method associated with the EtterSilent builder.

OOXML is typically stored as a compressed ZIP archive, which gives us the option to navigate through it.
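As a rough sketch of that navigation, Python’s standard zipfile module can list the archive members and pick out the macro sheets. The xl/macrosheets/ prefix is the directory referenced elsewhere in this post; the function name is our own.

```python
import io
import zipfile


def find_macrosheets(document: bytes):
    """Return the archive members that live under xl/macrosheets/."""
    with zipfile.ZipFile(io.BytesIO(document)) as archive:
        return [name for name in archive.namelist()
                if name.startswith("xl/macrosheets/") and name.endswith(".xml")]
```

Reading a sample from disk in binary mode and passing the bytes to find_macrosheets() returns the XML parts worth inspecting for XLM 4.0 formulas.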

In workbook.xml, we can see auto_open pointing at the hidden sheet1, cell AK2.

The content of the sheets’ formulas is in the macrosheets directory. While the XML files contain many unrelated characters, we want to filter only the XLM 4.0 formulas.
We can do that by using BeautifulSoup (bs4) or any regular expression processor we like and simply filtering the <f> </f> tags.

Ref: https://docs.microsoft.com/en-us/office/open-xml/working-with-formulas

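from bs4 import BeautifulSoup
import sys

# Print the text of every <f> (formula) tag found in the macro sheet XML
# whose path is given as the first command-line argument.
with open(sys.argv[1], 'r') as f:
    pars = BeautifulSoup(f.read(), "html.parser")
    for i in pars.find_all('f'):
        print(i.text)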

Figure 2: Exactly the same properties as in the OLE2 figure above

Unpacking the DLL is trivial, since no complicated anti-debugging techniques are used. The packing technique is self-injection (aka PE overwrite): the stub allocates a new memory region, changes the memory permissions, and essentially overwrites the PE’s section with the unpacked code.

First, since it is a DLL, we need to load it into the debugger via rundll32.exe, passing the target DLL and the function to be executed, DllRegisterServer.

Set a breakpoint on VirtualAlloc and follow execution to its return address; the RAX register will then hold the address of the newly allocated buffer.

After the 3rd VirtualAlloc hit, the stub simply writes the unpacked PE to the empty memory region allocated earlier.

Most of the DLL’s capabilities involve generic reconnaissance of the target machine, the results of which are sent to the actor’s C2 server.

The malware calls several WinAPI functions to collect information about the target machine. “RtlGetVersion” and “GetNativeSystemInfo” are used to gather the Windows version, build, and CPU architecture.

This information is eventually placed under the “_gat” parameter within the cookie.

Adapter information is collected using the GetAdaptersInfo function from the IPHLPAPI module.


The machine name and username are collected as well.

Those parameters eventually decide whether the 2nd stage of the malware should be downloaded. In the next episode, we will take a deep dive into IcedID’s C2 servers and see how we can manipulate them.

Hunting Time!

As described at the beginning of the article, we could build a hunting recommendation for our clients by analyzing the campaign behavior to check whether they have been targeted or compromised.

Email gateway:

Using the naming convention list we compiled, you can hunt for delivered or quarantined attachments and URLs associated with IcedID and Qbot.

Pseudocode Queries (SPL Like)

Index=<Email_gateway>    earliest=<Time_Range>
| regex FileName = "^(Document|DEBT|Comission|documents|Complaint|Overdue-Debt|inv|Overdue|Outstanding-Debt|Cancellation |CompensationClaim|Compensation|invoice|Copy|Compaint-Copy|Permission||)(_|-||)\d{8,12}+(_|-)\d{8,12}.(xls|xlsm)$"

Index=<Email_gateway>    earliest=<Time_Range>
| regex FileName = "^(statistics|fedex|Claim|Indebtedness|Contract|services|catalogue|)-\d{8,12}.(xls|xlsm|zip)$"

Detecting attachments with target first and last name: 
Index=<Email_gateway>    earliest=<Time_Range>
| regex FileName = "^[a-zA-Z]+_[a-zA-Z]+(-|\.)(\d{8,12}|\d{2}).(xls|xlsm|zip|doc)$"

NOTE: If you have a proper lookup table with employee details, you can optimize it with that.
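One possible way to apply such a lookup table is to build the name part of the pattern from it, so that only file names carrying real employee names match. The sketch below is an assumption about how this could be done outside the SIEM; the employee list is hypothetical and the function name is our own.

```python
import re


def build_name_regex(employees):
    """Compile a file-name pattern restricted to known (first, last) names."""
    names = "|".join(
        re.escape(f"{first}_{last}") for first, last in employees
    )
    return re.compile(
        rf"^(?:{names})[-.](?:\d{{8,12}}|\d{{2}})\.(?:xls|xlsm|zip|doc)$",
        re.IGNORECASE,
    )


# Hypothetical lookup table; replace with your own employee data.
EMPLOYEES = [("john", "smith"), ("jane", "doe")]
pattern = build_name_regex(EMPLOYEES)
```

Restricting the alternation to known names trades recall for precision: it cuts false positives from unrelated files that merely look like first_last-digits, but it will miss targets missing from the table.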

All of the above apply to mail attachments. However, as described earlier in the article, another delivery method is a direct download link, i.e. a URL rather than an attachment.

In that case, you can use the queries below to look only at the name of the downloaded file:

Index=<Email_gateway>    earliest=<Time_Range>
| mvexpand url
| eval File_name = mvindex(split(<URL_Field>,"/"),-1)
| regex File_name = "^(Document|DEBT|Comission|documents|Complaint|Overdue-Debt|inv|Overdue|Outstanding-Debt|Cancellation |CompensationClaim|Compensation|invoice|Copy|Compaint-Copy|Compaint-Letter|Permission||)(_|-||)\d{8,12}+(_|-)\d{8,12}.(xls|xlsm)$"

Index=<Email_gateway>    earliest=<Time_Range>
| mvexpand url
| eval File_name = mvindex(split(<URL_Field>,"/"),-1)
| regex File_name = "^[a-zA-Z]+_[a-zA-Z]+(-|\.)(\d{8,12}|\d{2}).(xls|xlsm|zip)$"

Index=<Email_gateway>    earliest=<Time_Range>
| mvexpand url
| eval File_name = mvindex(split(<URL_Field>,"/"),-1)
| regex File_name = "^(statistics|fedex|Claim|Indebtedness|Contract|services|catalogue|)-\d{8,12}.(xls|xlsm|zip)$"

Proxy logs (dat and jpg files)

During our research, we found a huge number of domains and IPs associated with the download of the first-stage DLL. It appears that IcedID developed a large-scale operation capable of handling that many C2 servers.

As a result, we tried to find a pattern that can help us identify most of IcedID’s DLL requests regardless of the specific IP or domain involved.

What we can do is understand the “file generator algorithm” that the actor is using, which is the NOW() function of Excel 4.0 followed by a .dat extension.

The NOW() function uses the 1900 date system to represent the date. The date is converted into a serial number representing the number of elapsed days, starting with 1 for January 1, 1900. The second component is the time of day (hours, minutes and seconds), expressed as a fraction of a day by dividing the elapsed seconds by 86400 (60*60*24).

An example:

We can simply illustrate this in Python:

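import datetime as dt

def excel_date(date):
    # Day zero is 30 Dec 1899, not 31 Dec, because Excel treats 1900 as a leap year
    temp = dt.datetime(1899, 12, 30)    # not 31st Dec but 30th
    delta = date - temp
    # Whole days plus the elapsed fraction of the current day
    return float(delta.days) + float(delta.seconds) / 86400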

print(excel_date(dt.datetime.now()))

Notes:

The first two digits have to be 44, given that the campaign took place in 2021
13 is the highest number of digits that can appear in the time component
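Going the other way can help during triage: given a suspicious file name from the proxy logs, we can recover the approximate time the macro ran on the victim machine. The sketch below is a minimal illustration under the assumptions stated in the notes above (a 44xxx day count, a 3 to 13 digit fraction, and either a comma or a dot as the decimal separator); the names are our own.

```python
import datetime as dt
import re

EXCEL_EPOCH = dt.datetime(1899, 12, 30)
# 44xxx covers 2021; accept either decimal separator in the serial number.
NAME_RE = re.compile(r"^(44\d{3})[,.](\d{3,13})\.(?:dat|jpg)$")


def download_time(file_name):
    """Return the datetime encoded in a NOW()-style file name, or None."""
    match = NAME_RE.match(file_name)
    if match is None:
        return None
    # Rebuild the serial number with a dot so float() can parse it.
    serial = float(f"{match.group(1)}.{match.group(2)}")
    return EXCEL_EPOCH + dt.timedelta(days=serial)
```

The recovered value reflects the victim machine’s local clock, so comparing it with the proxy log timestamp may also hint at the host’s time zone.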

Pseudocode query (SPL)

Index=<Proxy_data> earliest=<Time_Range>
| mvexpand url
| eval File_name = mvindex(split(<URL_Field>,"/"),-1)
| regex File_name = "^[4][4]\d{3}(,|\.)\d{3,13}.(dat|jpg)"

Regex (alone):
[4][4]\d{3}\,\d{3,13}(\.|,)(dat|jpg)

EDR:
Rundll32.exe execution with the given functions: DllRegisterServer, pluginint

Pseudocode query (SPL):

Index= <EDR_processes>   earliest=<Time_Range> rundll32.exe AND (DllRegisterServer OR pluginint)

NOTE: false positives can be identified by verifying which DLL the functions were executed from.
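If your EDR lets you export process command lines, the same check can be sketched offline. This is only an illustration: it assumes the common “rundll32 <dll>,<export>” command-line form, and the function name is our own. Returning the DLL path alongside the export makes the false-positive review mentioned in the note easier.

```python
import re

# Export names taken from the hunting note above.
SUSPICIOUS_EXPORTS = ("DllRegisterServer", "pluginint")
RUNDLL_RE = re.compile(
    r'rundll32(?:\.exe)?"?\s+(?P<dll>[^,]+),\s*(?P<export>\w+)', re.IGNORECASE
)


def flag_rundll32(command_line):
    """Return (dll, export) when rundll32 calls a watched export, else None."""
    match = RUNDLL_RE.search(command_line)
    if match is None:
        return None
    export = match.group("export")
    if export.lower() not in {name.lower() for name in SUSPICIOUS_EXPORTS}:
        return None
    # Strip whitespace and quotes so the DLL path can be reviewed directly.
    return match.group("dll").strip().strip('"'), export
```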

Sum up:

The IcedID group continues to spread on a colossal scale using a high-performance maldoc builder sold on the darknet. The primary delivery method uses compromised mail accounts to hijack an existing conversation and increase the chances of fooling the victim. In this article, we’ve looked into IcedID’s delivery method and its loader techniques, breaking the patterns into pieces to provide good hunting searches and ideas for your enterprise.

We highly recommend implementing the YARA rules and following our hunting queries and ideas.

Suspect you have been compromised? Contact us at +972-52-2804516

YARA rules:

OLE:

rule BekaBot1 {
   meta:
      description = "BekaBot1"
      author = "Yonatan K"
      date = "2021-04-26"
   strings:
         $required_1 = { 85 00 ?? 00 [5] 01 }
         $required_2 = "rabota" fullword nocase wide ascii
         $required_3 =  "rundll32" fullword nocase wide ascii  


   condition:
      uint32(0) == 0xE011CFD0 and all of ($required*) and filesize < 400KB
}

OOXML:

rule BekaBot_2 {
   meta:
      description = "BokBot2"
      author = "Yonatan K"
      date = "2021-04-26"
   strings:
        $mxlsx  = /^\x50\x4B\x03\x04/
        $s1 = /xl\/macrosheets\/[a-zA-Z0-9_-]+\.xmlPK/
        $s2 = "mP2[%5p2" fullword nocase wide ascii
        $s3 = "MFJyfR" fullword nocase wide ascii
        $s4 = "3{uOP(z1sY" fullword nocase wide ascii
        $s5 = "Oc=VdMVe]" fullword nocase wide ascii
        $s6 = "FrgWtR%Y" fullword nocase wide ascii
        $s7 = "T't[I/" fullword nocase wide ascii
        $s8 = "/}234K%" fullword nocase wide ascii
        $s9 = "Jnr@Kh}" fullword nocase wide ascii
        $s10 = "HF:'oF" fullword nocase wide ascii
        $s11 = "EOm(n$"  fullword nocase wide ascii
        $s12 = "*TSqrwJ"  fullword nocase wide ascii
        $s13 = "FVx%gh"  fullword nocase wide ascii
        $s14 = "jVpOya"  fullword nocase wide ascii
        $s15 = "3M.1wK"  fullword nocase wide ascii


   condition:
      $mxlsx at 0  and 3 of ($s*) and filesize < 500KB
}

References:

https://www.openoffice.org/sc/compdocfileformat.pdf
https://ntnuopen.ntnu.no/ntnu-xmlui/bitstream/handle/11250/198656/EDidriksen.pdf?sequence=1
https://www.loc.gov/preservation/digital/formats/digformatspecs/Excel97-2007BinaryFileFormat(xls)Specification.pdf

https://www.intel471.com/blog/ettersilent-maldoc-builder-macro-trickbot-qbot/
