Neel Mehta, Billy Leonard, Shane Huntley
Google Security Team
Version 1.0
Published September 5, 2014
TLP: Green

Table of Contents

Introduction
Sofacy Analysis
Sofacy Persistence Mechanism
Sofacy Functionality
Network Communications
Sofacy Indicators
X-Agent Analysis
X-Agent Identifiers
X-Agent Internals
Persistence
Network Communications
Air-Gapped Operations
X-Agent
Appendix A
Appendix B

Introduction

Many sophisticated state-sponsored attackers use multi-stage malware toolkits. First-stage implants are widely distributed, easily discovered, and serve as a simple beachhead. In contrast, complex second-stage implants are typically used sparingly, on only the most interesting systems, after determining there is limited risk of detection by security products. As such, a first-stage tool exists primarily to limit the exposure of second-stage tools, extending their usable shelf life.

This analysis describes one family of malware: a first-stage tool, Sofacy, and an associated second-stage tool, X-Agent. Sofacy is an antivirus industry name, while X-Agent was named by the malware authors. Together these tools are used by a sophisticated state-sponsored group targeting primarily former Soviet republics, NATO members, and other Western European countries. This information has been determined from VirusTotal submissions.

Antivirus detection for both Sofacy and X-Agent is subpar, with plenty of room for improvement. Antivirus detection for Sofacy, based on VirusTotal data, was roughly 36.6%. Detection for X-Agent was lower, at only 34.2%. Our goal in releasing this analysis is to improve antivirus detection for both. Consequently, recipients of this paper are free to share it with interested parties in the security community.

Acknowledgements

This analysis was compiled with the tireless help and extensive expertise of the Google Security Team, especially Heather Adkins, Daniel White, Joachim Metz, Andrew
Lyons, Liam Murphy, Elizabeth Schweinsberg, Matty Pellegrino, Kristinn Guðjónsson, Cory Altheide, Armon Bakhshi, and Mike Wiacek.

VirusTotal Submissions By Country

Analysis of VirusTotal submissions for Sofacy and X-Agent yields insights into the attackers' operations. As a first-stage tool, Sofacy is used relatively indiscriminately against potential targets. X-Agent is reserved for high-priority targets. This is borne out by the data: VirusTotal submissions show that Sofacy was three times more common than X-Agent in the wild, with over 600 distinct samples in the data set.

Proportional differences in the geographical distribution of submissions of the first-stage tool, Sofacy, and the second-stage tool, X-Agent, provide some interesting insights. For example, the Republic of Georgia represents only 3.5% of Sofacy submissions but makes up 28.9% of all X-Agent submissions, more than any other country. This suggests that at one point Georgia was a high-priority target for the attackers. This ratio is likely a lagging indicator of attacker interest: attackers must first be caught, and a compromise can go undetected for years. The same comparison shows attacker interest in Ukraine, Germany, Poland, Denmark, and also Russia. X-Agent submissions from the United States and Canada were proportionally smaller than Sofacy submissions.

[Figure: Sofacy submissions to VirusTotal by country. Legible values: Israel 2.7%, Ukraine 7.1%, USA 22.6%, Georgia 3.5%, Germany 12.4%.]
[Figure: X-Agent submissions to VirusTotal by country. Legible values: Georgia 28.9%, Romania 1.9%.]
[Figure: Submission share ratio of X-Agent to Sofacy in VirusTotal, by country: Georgia, Romania, Russia, Denmark, Poland, Germany, Ukraine, Vietnam, Canada, S. Korea, USA.]

Sofacy Analysis

Sofacy is an above-average first-stage implant. Early variants are more technically complex than recent ones. For example, older samples feature the ability to move seamlessly between processes and harvest credentials. Newer variants are more mature and focused in their design. They provide functionality to detect personal security products, survey infected machines, and install a second-stage tool, all without exposing techniques such as lateral process movement.

Dropped By Boring-Looking Exploits

Sofacy is often delivered by Microsoft Word exploits, as RTF, DOC, or DOCX files (CVE-2012-0158, CVE-2010-3333). It is occasionally delivered by Adobe Acrobat PDF reader exploits. These exploits are often first used by Chinese attackers but have been repurposed by the actors responsible for Sofacy. To achieve this, the Sofacy executable is swapped in for the original exploit's payload, leaving other parts intact, including shellcode.

Sofacy Internals

Position-Independent Code Development

The authors of Sofacy use clever compiler tricks to produce a binary with no dependence on imports, relocations, or initial code position. This allows it to be copied into another process and executed without any additional dependencies or setup requirements.

Assembly language development is slower and more expensive than development in higher-level languages. The ease of higher-level development, however, comes at the cost of new dependencies on OS-specific loaders, which complicate cross-process injection. The authors of Sofacy have found an elegant middle ground, which at first glance might appear to be hand-written assembly but is consistent with the register allocation and pipelining of Microsoft Visual C++.

The entry point is passed two arguments: a pointer to an address in kernel32.dll, and a base address to identify where the code is in memory. Function calls and global variables are then accessed as an offset from this base address. Here are two examples:

    Function call:           lea eax, [esi+193721h] followed by call eax
    Global variable access:  lea eax, [esi+194 85h] followed by a mov involving eax

Each function is passed the code base address as its first argument, and this is used consistently, like a calling convention. Thus the authors have produced a position-independent binary with no external imports by utilizing preprocessor
macros or a similar mechanism to do pointer math for each function call or global variable access.

In order to use system functions, the start function first walks back in memory to find the start of the kernel32 module. Then the code manually walks the export table, hashing the function names to resolve the required imports. A full list of hashes for the imports is shown below.

[Listing: imported_function_hashes. The addresses and hash values are not legible in this copy; the resolved function names follow.]

    CloseHandle, CopyFileW, CreateDirectoryW, CreateEventW, CreateFileA,
    CreateFileMappingW, CreateFileW, CreateMailslotW, CreateMutexW, CreatePipe,
    CreateProcessW, CreateRemoteThread, CreateThread, DeleteFileW, ExitProcess,
    ExitThread, FindClose, FindFirstFileA, FindFirstFileW, FindNextFileA,
    FindNextFileW, FreeLibrary, GetCommandLineW, GetCurrentProcess,
    GetCurrentProcessId, GetEnvironmentVariableW, GetExitCodeProcess,
    GetExitCodeThread, GetFileInformationByHandle, GetFileSize, GetFileTime,
    GetLocalTime, GetModuleFileNameW, GetPrivateProfileStringA,
    GetPrivateProfileStringW, GetProcAddress, GetStartupInfoW, GetTickCount,
    GetTimeZoneInformation, GetVersionExW, GetVolumeInformationW, GlobalAlloc,
    GlobalFree, HeapCreate, HeapDestroy, IsBadReadPtr, LoadLibraryW,
    MapViewOfFile, MultiByteToWideChar, OpenMutexW, OpenProcess, PeekNamedPipe,
    Process32FirstW, ReadFile, ReadProcessMemory, ReleaseMutex, ResumeThread,
    SetCurrentDirectoryW, SetEndOfFile, SetFileAttributesW, SetFilePointer,
    SetFileTime, SetThreadPriority, SetThreadPriorityBoost, Sleep,
    TerminateProcess, TerminateThread, UnmapViewOfFile, VirtualAlloc,
    VirtualAllocEx, VirtualFree, VirtualFreeEx, WaitForSingleObject,
    WideCharToMultiByte, WriteFile, WritePrivateProfileStringA,
    WriteProcessMemory

Loader Functionality

Sofacy persists on infected machines as an encrypted and compressed payload appended to a small loader executable file. The loader decrypts the payload by permuting a 32-bit key and XORing each byte with the lowest eight bits. The last byte of the payload is left untouched.

The 32-bit key is initialized with a literal value in the loader's main function:

    .text:00401025    mov     ...

This value is then modified, often using MMX instructions:

    .text:0040107F    movd    mm0, [ebp+var_28]
    .text:00401083    ...     mm0, 2
    .text:00401087    movd    ..., mm0
    .text:0040113D    mov     eax, [ebp+var_28]
    .text:00401140    ...     eax, 4
    .text:00401143    inc     eax
    .text:00401144    mov     ..., eax

Finally, the key is passed to the decryption function:

    .text:0040117A    push    [ebp+var_28]
    .text:0040117D    push    7D68h
    .text:00401182    push    [ebp+var_8_buffer]
    .text:00401185    call    ...

Here is the equivalent code in C:

    /* The function name is not legible in the source; "decrypt" is a stand-in. */
    void decrypt(unsigned char *payload, size_t len,
                 unsigned int key, unsigned char *out)
    {
        size_t i;

        if (len == 0)
            return;

        for (i = 0; i < len - 1; i++) {
            /* Keystream byte: the lowest eight bits of the key. The exact
               expression is garbled in the source ("key A 1 8 8 0xff"). */
            unsigned char x = key & 0xff;

            out[i] = payload[i] ^ x;

            /* Key permutation. Only the constants 0xea61 and 0x24142871 are
               legible; the multiply-and-add form is an assumption. */
            key = key * 0xea61 + 0x24142871;
        }

        /* last byte is not obfuscated */
        out[i] = payload[i];
    }

LZSS Decompression

The payload contains a decompression stub which implements a simple Lempel-Ziv variant commonly used in malware. Malware analysts will likely recognize the decompression code, with one small change: in addition to the first layer of encryption, each compressed input byte is XORed with a permutation of a hardcoded 32-bit key.

    unsigned int key = ...;
    unsigned char next_byte = input_byte ^ (key & 0xff);
    /* Equivalent of x86 ror instruction */
    key = rotate_right(key, ...);

Dynamic Dependency Resolution

After the loader invokes the entry point of the position-independent code blob, the blob must first resolve dynamic dependencies before doing anything else. Sofacy identifies dynamic dependencies by iterating through a list of files in %windir%\system32, hashing file names, and comparing those hashes to a list of hashes for needed DLLs. Ultimately it depends on at least the following system DLLs:

    kernel32.dll
    user32.dll
    ws2_32.dll
    shlwapi.dll
    advapi32.dll
    iphlpapi.dll
    pstorec.dll
    inetmib1.dll
    snmpapi.dll
    wininet.dll
    setupapi.dll
    shell32.dll
    ole32.dll

String Encryption

Contemporary variants of Sofacy encrypt strings using an algorithm that resembles RC5. The loader contains three distinct blobs of data:

1. Dynamic dependencies, configuration, and C2 servers
2. A list of antivirus and personal security products to detect
3. The actual implant binary

Each variation of RC5 permutes an 8-byte block of data with an 8-byte key using a single round. A 4-byte window of the key is used to encrypt each byte of input data. For example, the first byte of input is encrypted using the first 4 bytes of the key, and the second byte using bytes 2 through 5. Eventually the window wraps: the 6th byte of input uses the last 3 bytes and the first byte of the key.

The four key bytes, along with an 8-bit representation of the input position, are combined to generate an 8-bit value that is XORed with the input byte. Each algorithm is a variation on this theme:

    /* The function name is not legible in the source; "crypt_blob" is a stand-in. */
    void crypt_blob(unsigned char *input, size_t input_length, unsigned char *key)
    {
        for (size_t i = 0; i < input_length; i++) {
            unsigned int x, y;
            unsigned char a, b, c, d;
            unsigned char input_index_char = i & 0xff;
            size_t block_index = i & 7;

            /* 4-byte window into the 8-byte key, wrapping at the end */
            a = key[i % 8];
            b = key[(i + 1) % 8];
            c = key[(i + 2) % 8];
            d = key[(i + 3) % 8];

            /* permute values - get an 8-bit value to xor with input byte */

            input[i] ^= x;
        }
    }

There are at least 6 variations of the permutation algorithm. For most Sofacy samples, one of these six variations can be used to decrypt the three blobs of data.

[Listings: Variations 1 through 6. Each derives x and y from a, b, c, d, input_index_char, and block_index; the exact expressions are not legible in this copy.]

Parameter Store

Recent Sofacy droppers encrypt configuration data and store it in a registry key. This key hangs off HKLM if the dropper has permissions to write there, otherwise HKCU.

The configuration data is stored in a proprietary key-value format. It starts with a 6-byte key, followed by 20 bytes of UINT8 lengths. The remainder of the data is configuration values. Configuration values are identified by their index into the length table.

The parameter store allows run-time updates to the configuration and serves to separate it from the implant binary. For example, the C2 servers cannot be found in the implant binary, and may only be recovered statically from a dropper or by decrypting the data from the parameter store.

Keystroke Logging

Sofacy's keystroke logger attaches its input processing methods to those of the active foreground window. It polls the foreground window, detecting changes as the user switches applications. It also captures process context, such as executable paths and arguments. Captured keystrokes are normalized to Unicode, taking into account the active keyboard layout.

Inter-Instance Communication Via Mailslots

Sofacy communicates with itself over a
mailslot, such as Mailslot\LSAMailSlot. As an example, the keystroke logger uses this mailslot to communicate with the main Sofacy process. As it receives keystrokes, it sends them back over the mailslot as serialized HTML. Another instance of the implant, running in a different process, will read the keystroke log data from the mailslot, encrypt it, and re-transmit it over the C2 network connection.

Persistence Mechanisms

Persistence Via LNK Shortcuts

Sofacy may persist via changes to an existing LNK file in a shell startup folder. This LNK file is invoked each time the user logs in. Sofacy adds a Shell Item to the end of the LNK file.

The shell startup folder locations are determined by reading the following registry keys:

    ...Folders\Startup
    ...Folders\Desktop
    ...Folders\Common Startup
    ...Folders\Common Desktop

Sofacy scans the startup folders for an appropriate pre-existing LNK file. The LNK file's original timestamps are captured, and a small change to the file is made. After modification, the Windows API SetFileTime function is called to restore the file's creation, last access, and last write times. An example LNK file is included in Appendix A.

Persistence Via Windows Shell

Sofacy is also known to persist via Quick Launch folders, Shell Icon Overlay Handlers, and Shell Service Objects.

Older versions of Sofacy may drop itself into one of the following Quick Launch folders:

    ...Data\Microsoft\Internet Explorer\Quick Launch
    ...Data\Microsoft\Internet Explorer\Quick Launch

Shell Icon Overlay Handlers are COM objects that implement the IShellIconOverlayIdentifier interface to show icon overlays, where one icon is displayed on top of another. Icon Overlay handlers are loaded in the context of explorer.exe when each user logs in. This is used by legitimate applications such as TortoiseSVN. Sofacy registers itself as a Shell Icon Overlay Handler by setting the appropriate registry key to the GUID of its registered COM object.

Observed names for the Icon Overlay value: AdvancedStorageShell. The icon overlay handler key points to a registered COM object, a Sofacy DLL.

Sofacy can also persist as a Shell Service Object, another class of COM objects that load on user login. Observed names for Sofacy Shell Service Objects: netids.

Sofacy Functionality

Disabling Error Reporting

To avoid detection, Sofacy systematically disables crash reporting, logging, and post-mortem debugging each time it starts. Several factors make crashes likely. Sofacy is delivered via memory corruption exploits, which are inherently unpredictable. It also performs complicated inter-process inspection and code injection. Finally, the code may have bugs. Any of these factors may lead to crashes which, if logged, are likely to be noticed.

Sofacy disables crash and PC health reporting by setting the relevant registry DWORD values to 0. It suppresses system hard error message display by setting a registry DWORD value to 2. It also disables Dr. Watson or other post-mortem debuggers by deleting the following registry key:

    ...NT\CurrentVersion\AeDebug

Interest in the Physical Location of the Machine

Sofacy tries to read a value, PhysicalLocation_Name, from the system administrative template file, Windows\inf\system.adm. Administrators, especially in large organizations, will populate this field with the physical location of the system in the field. Sofacy gathers this information as part of its machine survey. It is sent back to the malware operator, adding context that may inform operator interest.

Email Credential Harvesting

Sofacy recovers cached email credentials from several sources. Specifically, it can recover saved credentials from Outlook, The Bat!, Eudora, and Becky!.

Local Output Queue

Sofacy temporarily queues data it gathers on disk. This data is LZSS-compressed and encrypted. The location of the queue file is configurable and specified in a registry key. The registry key is subject to frequent change, as is the location of the queue file. In one sample, the queue file location was stored in this key:

    ...cense

Network Communications

Impersonating Legitimate Processes For Network Communication

When communicating with the C2 server, Sofacy will scan a list of running processes, looking for a running web browser or email client. When one is found, it will clone the process arguments exactly, then create a new instance of the process. The main thread of the cloned process is started in a suspended state, and the implant is injected into the new process's address space. The implant is started instead of the original. Sofacy will pick a C2 protocol that matches the cloned process: HTTP, SMTP, or POP3. By doing this, Sofacy mimics legitimate user processes, making it difficult to discern that network traffic originated from malware rather than user actions.

Asymmetric Encryption of Session Keys

Sofacy uses the Windows API to create session keys for C2 communications. It creates an ephemeral RC4 session key and seals it with a hardcoded 1024-bit RSA public key. This sealed session key is included with the data transmitted to the C2 server. As such, only a recipient with the matching private key can decode traffic.

Proxy Awareness

Sofacy will detect proxies configured for WinInet and Firefox. It will then use the correct proxies when connecting outbound to C2 servers.

Sofacy Indicators

Known mailslots for IPC:

    Mailslot\LSAMailSlot

Representative Sample Hashes

[Listing: sample hashes are not legible in this copy.]

Example Signatures

ClamAV and Yara signatures can be used to detect Sofacy.

[Listing: the ClamAV and Yara signature bodies are not legible in this copy.]

Known C2 Servers

C2 domains:

    securitypractic.com
    checkmalware.org
    adawareblock.com
    checkmalware.info
    scanmalware.info
    updatepc.org
    updatesoftware24.com
    testservice24.net
    symanttec.org
    microsofi.org
    microsof-update.com

IP addresses:

    123.100.229.59
    200.74.244.118
    74.52.115.178
    88.198.55.146
    67 18 l 2 18 (partially legible)
    203 117 68 (partially legible)

X-Agent Analysis

X-Agent is a second-stage toolkit complementing Sofacy. Portions of the X-Agent code base can be found in malware
dating back to at least 2004. Somewhere down the line, X-Agent became the internal name for this tool.

The features of X-Agent demonstrate its sophistication. For example, it can operate in an air-gapped environment via an ad-hoc pseudo-network of USB flash drives.

X-Agent is multi-platform capable. With minor changes to platform-specific code, X-Agent will run on Linux instead of Windows. It can also be repackaged in different forms, for example as a DLL, by the addition of a single module.

This analysis applies to X-Agent on two known platforms: Linux and Windows.

X-Agent Identifiers

Windows PE File Resource Locale IDs

Windows Portable Executable resources are localized and include the locale ID of Windows running on build systems. As such, the locale ID may reveal the origin of malware. The locale ID field can be faked, but is often overlooked in malware build environments.

PE resources are organized into a 4-level deep tree, with the third level specifying the locale ID of the resource. This is different from a code page, such as Windows-1251, and is more specific. The Windows resource compiler (Rcdll.dll) uses the default locale ID.

Of 113 X-Agent PE samples observed in VT's dataset, 68 had PE resources. Three unique locale IDs were found in these samples:

    0409    en-US    English (US)
    0419    ru-RU    Russian
    0000    NULL     invalid

Of the 68 samples that contained PE resources, the most common locale ID was ru-RU (Russian).

[Table: locale IDs by number of samples; the values are not legible in this copy.]

Program Database File Paths

Microsoft's Visual C++ compiler may include a fully-qualified path to a program database (PDB) file so that a debugger can locate symbols. This build-time artifact can provide information about the systems used to build the malware. The following PDB paths have been observed in X-Agent samples:

    C:\Documents and ...\Visual Studio 2005\Projects\NET\Mail 1.1\Mail 1.1\obj\Release\rundll32.pdb
    C:\WORK\SOFT\Joiner\joiner 0.1\Release\joiner.pdb
    C:\WORK\SOFT\Joiner\joiner 0.2\Release\joiner.pdb
    d:\Shared\DATA\spec_ver...

X-Agent Internals

X-Agent Framework

The X-Agent framework is a set of components communicating over well-defined methods. Each component is a module, and they communicate over channels. Individual instances of X-Agent are termed agents. Each agent is assigned a unique ID (agent ID), calculated from a hash of the MAC addresses of all network interfaces on the machine.

The X-Agent framework uses the term controller to refer to the software running on the C2 server. Each X-Agent agent communicates with its controller over a C2 channel.

Kernel

The core module in the X-Agent framework is the agent kernel, a small user-mode microkernel. This microkernel can register other modules and communication channels, as well as handle thread management. It has a generic interface to storage and configuration data.

Implant Initialization and Lifetime

On startup, X-Agent's main function registers relevant modules and an external channel. It then starts a channel controller thread, which handles message distribution and channel selection. Finally, X-Agent starts a worker thread for each module. X-Agent continues to run until all these workers terminate, or until operator commands instruct it to exit or uninstall.

Parameter Storage

X-Agent, like Sofacy, can maintain a parameter store that contains C2 servers and other configurable parameters. This would be initialized by the dropper, separating the configuration from the implant on disk. It also allows for runtime configuration changes. For unknown reasons, most X-Agent builds do not use the parameter store in practice.

Windows

The Windows Registry provides the underlying datastore for the parameter store on Windows. Individual parameters are keyed off their registry value name, a hexadecimal number string.

Linux

On Linux, the parameter storage is held in a SQLite database. Each row in the database contains an id column, which serves as the key. Each parameter is then stored as a binary or dword value.

Channels

X-Agent uses channels to structure communication and connections. Channels are used for C2, among other purposes. Multiple channels are multiplexed over a single network connection. External channels are used to communicate with the controller, abstracting the network C2 protocol from higher-level channels.

The following channel types have been found in X-Agent samples:

    HTTP Channel     0x2101, 0x2102
    Mail Channel     0x2302
    Local Channel    0x2301

Channel Controller

The X-Agent channel controller is responsible for passing module messages between external channels and local modules. The channel controller is unaware of any specific C2 protocols. These are abstracted and are entirely the responsibility of the external channel.

The channel controller also passes controller-generated inbound module messages to local modules. It queues these messages in memory as a vector and passes them to the target module.

The channel controller's final responsibility is to control which channels are used for communication, through a channel-changing mechanism exposed via a module command. An operator sitting at a remote console can switch from one external channel to another, switching C2 protocols on the fly. For example, X-Agent might switch from communicating over HTTP to email protocols.

External Channels

External channels are used to multiplex messages from modules to the controller. X-Agent agents must register at least one external channel with the kernel. They imitate legitimate network activity such as web browsing or sending and receiving email.

Local Channels

X-Agent contains a local channel implementation that uses a hidden file for module message I/O. This local channel is used in conjunction with the Net Flash module in air-gapped environments (see below).

The X-Agent kernel will selectively intercept messages to load and unload modules before they are passed to the channel controller. This is conceptually similar to a local channel.

Modules

Each X-Agent component is a module, including the kernel. The modules register with the kernel and are identified by a unique
16-bit ID. The following modules have been observed in X-Agent binaries:

    Kernel                  0x0002
    Remote Key Logger       0x1002
    Process Retranslator    0x1302
    DLL                     0x 602 (one digit not legible)
    Net Flash               0x120

Module classes are derived from a common base class and accessed over the same basic abstract interface. X-Agent modules may override five methods in the module base class. In a compiled X-Agent binary, they appear in the following order in a module vtable:

1. A take message method. This method passes inbound module messages to the modules, which take ownership of them.
2. A give message method, by which the module gives up ownership of outbound module messages to send them to the controller.
3. A get module ID method that returns the 16-bit module ID.
4. A set module ID method that sets the module ID.
5. A worker run method, which is the main function for the module. It is invoked in a dedicated thread started by the kernel.

Module Messages

Module messages are X-Agent's internal message representation. A module message contains the agent ID, a module ID, a command number, a priority, and an opaque data field and size.

The module ID on outbound messages specifies the module that created the message. On inbound messages, the module ID specifies which module should receive the message. Inbound messages are called questions and come from the controller. The destination module will receive these questions and may choose to answer them with a response. Responses are also constructed as module messages.

Some modules will generate messages autonomously. For example, the keystroke logger module will generate module messages containing logged keystrokes.

Module Message Serialization

X-Agent serializes module messages starting with a simple header, followed by an opaque field:

    struct module_message {
        UINT16LE module_id;
        UINT8    command_number;
        UINT8    data[];          /* opaque data field; the field name is not legible in the source */
    };

X-Agent serializes each module message by wrapping it in a raw packet (see Appendix B). That raw packet is then sent over the network to the C2 controller. The size of the C2 message specifies the raw packet size and, subsequently, the module message size.

The protocol design does not include sequence numbers and behaves like an unreliable transport mechanism. Statefulness is tracked completely in response module messages. For example, when X-Agent receives a command to read a file, it responds with a log message that says it read a specific file, followed by the file's contents.

HTML Log Messages

X-Agent log messages are written as HTML and color coded, perhaps to make them easier for human operators to read. Error messages tend to be colored red:

    <font size=4 color=red>process is exist</font>
    <font size=4 color=red>File don't ...

Persistence

On Windows, X-Agent will persist via a Registry Run key, using rundll32.exe to invoke its publicly exported init method. The Run key may be named after its DLL filename on disk, with a registry value that invokes rundll32.exe.

Alternatively, X-Agent may persist as a Windows service, or as a Shell Icon Overlay Handler like Sofacy.

Linux deployments of X-Agent may persist via a .desktop file located in .config/autostart. When installed as root, the X-Agent binary may persist via run level scripts such as rc.local.

Network Communications

Packet Queues

X-Agent uses packet queues to buffer C2 traffic when it is passed between the kernel and the channel controller. The inbound message queue is a vector in memory. Outbound messages are buffered in two local queue files on disk, one each for high and normal priority messages. Each queue holds module messages prefixed by a UINT32LE field.

Observed names for the queue files are:

    edg6E85F986
    edg6EF885E2
    edg6E85F98675.tmp
    zdg6E85F98675.tmp
    edg6EF885E2.tmp
    zdg6EF885E2.tmp

These queue files are most often located in the /tmp directory on Linux and in the path returned by GetTempPath on Windows. Some DLL builds of X-Agent will put these queue files in their working directory instead, although this is less common.

After reading queue files, the channel controller deletes them. They are not securely wiped from disk and may be
recoverable.

External Channels and Encryption

X-Agent HTTP traffic is clear text. SMTP and POP3 channels use TLS and are more challenging to detect on the network.

HTTP External Channel

X-Agent's HTTP external channel is commonly used to talk to the controller. POST requests are used to send messages, while GET requests retrieve inbound messages. An example HTTP external channel session has been provided as a text file and is available via VirusTotal.

All HTTP messages include a magic token value in the URI. POST messages also include a request body containing an encoded module message.

HTTP URI Generation

The full URI for HTTP requests is randomly generated according to a template implicitly agreed upon by both agent and controller.

The base URI for GET and POST requests is generated by selecting a random string from a list. Since this base is ignored by the controller, it is not unusual for it to change between X-Agent versions. In one X-Agent sample, the following list of base URIs was observed:

    watch
    search
    find
    results
    open
    searchl
    close

Parameters for the URI are chosen from a list and appended to the base URI. The following parameter name choices have been observed:

    text, from, ai, age, oe, btnG, oprnd, utm, channel

One of these parameters is agreed upon by the agent and the controller to encode the agent ID, and is henceforth referred to as the HTTP agent ID token. This is used by the controller to track sessions. In the representative sample, the chosen parameter was ai. All other parameters appear to contain meaningless, randomly generated, base64-like data.

Older X-Agent samples used a static URI for HTTP channel requests. This ended with a hardcoded session-tracking parameter named ai. The HTTP agent ID token was simply appended to this base URI.

HTTP Agent ID Token Format and Encoding

The controller will extract the HTTP agent ID token from the correct URI parameter. It is then decoded to identify which agent is communicating.

The HTTP agent ID token is base64 encoded data using the web-safe alphabet (see Appendix B). The encoded string is padded with a 5-byte random prefix so that it looks like valid base64 data.

When encoded as binary data, the HTTP agent ID token starts with a 4-byte XOR key, followed by a 7- or 20-byte magic token value and the agent ID:

    xor_key[4] || magic_token[7 or 20] || agent_id[4]

The XOR key is repeated and extended out to a length of 11 or 24 bytes, then XORed with the magic token and agent ID fields. Current versions use a 7-byte magic token for HTTP data; older versions of X-Agent use a 20-byte ASCII magic token value.

The following steps may be used to decode an HTTP agent ID token:

1. Discard the 5 bytes of prefix data.
2. Base64 decode using the web-safe alphabet (see Appendix B).
3. De-obfuscate by XORing with the repeated XOR key.

The following example demonstrates the decoding operation.

Client (agent) request:

    GET ... HTTP/1.1
    Accept: ...
    Accept-Language: ...
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/5.0 ... Gecko/20100101 Firefox/20.0
    Host: windows-updater.com

Server (controller) response:

    HTTP/1.1 200 OK
    Date: Thu, 12 Jun 2014 22:18:27 GMT
    Server: Apache
    Content-Length: 3
    Connection: Close
    Content-Type: text/plain; charset=UTF-8

    400

In this example, the HTTP agent ID token is in the ai URI parameter:

    ai=oedQJ3vMSQ6j9N7oleYALu8C

To decode, discard the 5 bytes of prefix data. The remaining data must be base64 decoded using the web-safe alphabet (see Appendix B). The first 4 bytes of the decoded data are the XOR key. To continue decoding, XOR the rest with the repeated key. In the result, the first 7 bytes are the expected HTTP magic token. The remaining 4 bytes are the agent ID as a 32-bit little-endian integer:

    43 f0 1c 10

The agent ID in this case was 0x101cf043.

In some situations the high 8 bits of the agent ID may be zero, causing only 3 bytes of the 32-bit agent ID to be base64 encoded. The decoded output for such HTTP agent ID tokens will look truncated, missing the last byte. This is likely unintended.

HTTP Message Format and Encoding

HTTP channel messages are encoded in a format
common to both inbound and outbound messages. Inbound messages are responses to GET requests, and outbound messages are contained in POST request bodies.

The encoding of HTTP channel messages is similar to that of HTTP agent ID tokens. To decode, a 5-byte junk prefix should be discarded and the remaining data base64 decoded with the web-safe alphabet (see Appendix B). The result will be binary data starting with an 11-byte header containing the following fields:

xor_key[4]  magic_token[7]

The following steps will decode an HTTP channel message:

1. Discard the 5-byte prefix from the body.
2. Decode the remainder with the web-safe base64 alphabet.
3. Retrieve the 4-byte XOR key, which is the first 4 bytes of decoded data.
4. XOR the next 11 bytes of the message with the XOR key. This includes the HTTP magic token and the agent ID.
5. Validate that the 7-byte magic token in the header has the expected value. Discard the magic token bytes.

The result of this decoding is a raw packet message, encoded in the previously described format. An example POST request for X-Agent's HTTP channel is available via VirusTotal. The final output is a serialized module message from module 0x1002, command 0x64, with an opaque message body whose contents have a hash of:

Mail External Channel

The SMTP and POP3 channels together make up a common mail channel. The SMTP channel is used to send messages, and the POP3 channel is used to receive them. These channels are an alternative to the HTTP channel, which can both send and retrieve messages. The oldest versions of X-Agent exclusively used mail protocols for C2 communication.

Incongruous Mail Subject Fields - Hardcoded Values or Encoding

X-Agent sends SMTP messages to the controller with specific magic values in the Subject line. The presence of these values is enforced by the C2 controller, and by X-Agent when fetching messages via POP3. The most common Subject line observed contains piradi nomeri, which refers to a Georgian government-issued citizen identification number, similar to a US Social
Security Number. Other versions of X-Agent expect the Subject line to contain an encoded token for session management, much like the HTTP agent ID token. This data is encoded using an encoding method called the P2Scheme. The P2Scheme encodes binary data using the standard base64 alphabet (see Appendix B). The binary data starts with a random 5-byte XOR key, followed by a 7-byte magic subject token and 4 bytes for the agent ID as a 32-bit little-endian integer:

xor_key[5]  magic_subject_token[7]  agent_id[4]

The 5-byte XOR key is repeated and extended out to 11 bytes, covering the magic subject token and the agent ID. The magic subject token, when XOR-decoded, should have the following value:

In other words, the token consists of leading bytes followed by the ASCII string china. The choice of magic token values using Georgian phrases and the word china seems incongruous.

Mail Message Format and Encoding

The mail channel sends and receives messages as multipart MIME email. The first message part contains a 7-bit UTF-8 representation of gamarjoba, which is Georgian for hello. The second message part is a base64-encoded attachment with the filename detaluri.dat. Alternatively, the file name may follow a pattern in which %s is a string representation of the current time. Detaluri means detailed in Georgian. The file may also be called winmail.dat. The attachment contents are a single raw packet message (see Appendix B).

Air-Gapped Operations

Some versions of X-Agent are designed to operate in an environment without an Internet connection, such as an air-gapped network. In this situation, X-Agent relies on human intervention to carry commands and data in and out via writable external media such as USB flash drives. X-Agent will register a local channel for external communication and use a module called Net Flash. The Net Flash module receives notifications from the OS when a new file system on writable external media is mounted. The Net Flash module then checks for incoming module messages in the following locations:

Location                                   Purpose
System Volume                              High priority incoming messages
System Volume Information sys logs data    Normal priority incoming messages
System Volume Information sys logs com     Outbound messages

If these folders do not exist, they are created as hidden system directories. Inbound message files are deleted after they are read.

The X-Agent microkernel contains a message shim for the Net Flash module. When Net Flash is active, this shim intercepts all outbound messages, rerouting them before they reach an external channel. Linux versions of X-Agent also contain this shim, but a Linux version of the Net Flash module has not been observed. This architecture indicates that the X-Agent kernel was designed or specifically adapted to work in air-gapped environments.

Autorun Infection

Perhaps to support infection in air-gapped networks, X-Agent has the ability to spread via autorun invocation on USB flash drives. Some samples have been observed with residual strings from an autorun.inf file:

autorun
open
shell open Explore
Volume Information USBGuard.exe install
shell open Default 1

X-Agent Indicators

Known mutexes:

Known mailslots:

Packet queue file names:

edg6E85F98675.tmp
edg6EF885E2.tmp
zdg6E85F98675.tmp
zdg6EF885E2.tmp

Representative Sample Hashes

Signatures

The following Yara signatures can be used to detect X-Agent:

rule ecksomAlgo rithm
strings:
condition: 1 of them

rule XAG
strings:
$s_uniq1 wide
$s_uniq2 ascii
$s_uniq3 ascii
$s_uniq4 wide
$s_uniq5 wide
$s_uniq6 edg6E85F98675.tmp wide
$s_uniq7 wide
a 4 font size 4 color red comm isn't success /font br wide
6 font size 4 color red com 6 is success /font ascii
font size 4 color red com 7 is success /font ascii
font size 4 color red com isn't success /font ascii
EXC - Cannot create Post Channel ascii
EXC - Cannot create Get Channel ascii
Cannot create ascii
Cannot create ExtChannelToProcessThread ascii
Cannot create ProcToExt Pipe ascii
Cannot create ExtToProc PipeE ascii
Cannot create Process ascii
Calloc 3 error ascii wide
autoru Volume Information USBGuard.exe ascii
size 4 color red comm wide
font size 4 color red comm wide
font size 4 don't create /font br wide
35 width 800 height 500 ascii
2 font size 4 color red file is blocked another process /font br wide
Calloc 1 error Packet lost ascii
Error Broken Pipe ascii
condition: 1 of $s_uniq* or 8 of them

rule
strings:
$s_uniq1 font sizez i color red align center WRITE FILE IS NOT ascii
$s_uniq2 font size 4 color red align center WRITE FILE IS SUCCESS /font br ascii
$s_uniq3 Terminal don't started ascii
$s_uniq6 ascii
ascii
2 rm -f f configiautostartl ascii
mkdir ascii
11AgentKernel ascii
12EAgentModule ascii
$z'W3ResavedApWasdi BFSModule ascii
condition: 1 of or 6 of them

SMTP and POP3 Servers and Accounts

When the mail channel is active, the following SMTP and POP3 servers and accounts have been observed being used for C2. X-Agent binaries contain hard-coded credentials for free webmail providers or presumably compromised accounts.

SMTP and POP3 servers:

smtp.mail.ru
pop.mail.ru
smtp.yandex.ru
smtp.bk.ru
smtp.gmail.com
smtp.mia.gov.ge
mail.mia.gov.ge

SMTP and POP3 accounts:

arkadmo@mail.ru
roe xichard@yandex.ru
john dory@mail.ru
Colin mcrae1968@gmail.com
devil 666 666 13@gmail.com
interppol@gmail.com
obert fastand@gmail.com
jose karreras@bk.ru
kar1 fridrikh@yandex.ru
sarah nyassa@gmail.com
i1ya kasatonov@list.ru
zurab razmadze11@gmail.com
albertborough@yahoo.com
ahmedOmed@outlook.com
shjanashvili@mia.gov.ge
u kakhidze@mia.gov.ge
r gvarjaladze@mia.gov.ge
maia otxmezuri@mia.gov.ge
1 maghradze@mia.gov.ge

C2 Servers and Domains

The following observed C2 domains and IP addresses are most used by the HTTP external channel.

Domain names:

hotfix-update.com
adobeincorp.com
check-fix.com
secnetcontrol.com
checkwinframe.com
testsnetcontrol.com
azureon-line.com
windows-updater.com

IP addresses:

62.205.175.96
63.247.82.242
63.247.82.243
64.92.172.221
64.92.172.222
67.18.172.18
70.85.221.10
74.52.115.118
80.94.84.21
80.94.84.22
81.177.20.109
81.112.20.110
82.103.128.81
82.103.128.82
82.103.132.81
82.103.132.82
83.102.136.86
88.198.55.146
94.23.254.109
201.218.236.26
203.117.68.58
216.244.65.34

Appendix A

Sofacy LNK Persistence File

The following LNK file shows how Sofacy creates persistence using this method. This can also be found in VirusTotal with a hash of:

Windows Shortcut information:
    Contains a link target identifier
    Contains a description string
    Contains a relative path string
    Contains a working directory string
    Contains a command line arguments string
    Contains an icon location string
    Contains an icon location block

Link information:
    Creation time: Jan 06, 2011 21:30:40.983625000 UTC
    Modification time: Aug 14, 2007 02:43:56.000000000 UTC
    Access time: Jan 07, 2011 06:47:58.593750000 UTC
    File size: 622030 bytes
    File attribute flags: 0x00000020 (Should be archived)
    Drive type: Fixed
    Drive serial number: 0xec6d8b11
    Volume label, Local path, Description, Relative path, Working directory, Command line arguments, Icon location (values in extracted order):
        C Program Files Internet Explorer iexplore exe
        Users hpplication
        C Program Files Internet Explorer
        C Program Files Internet Explorer iexplore exe
        iProgramFiles Internet Explorer iexplore exe

Link target identifier:
    Shell item list, number of items: 8
    Shell item 1: Class type: Root folder; Shell folder identifier: ; Shell folder name: My Computer
    Shell item 2: Class type: 0x2f (Volume); Volume name:
    Shell item 3: Class type: 0x31 (File entry: Directory); Name: Documents and Settings; Modification time: Not set (0); File attribute flags: 0x00000010 (Is directory); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: Documents and Settings; Creation time: Not set (0); Access time: Not set (0)
    Shell item 4: Class type: 0x31 (File entry: Directory); Name: All Users; Modification time: Not set (0); File attribute flags: 0x00000010 (Is directory); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: All Users; Creation time: Not set (0); Access time: Not set (0)
    Shell item 5: Class type: 0x31 (File entry: Directory); Name: Application Data; Modification time: Not set (0); File attribute flags: 0x00000010 (Is directory); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: Application Data; Creation time: Not set (0); Access time: Not set (0)
    Shell item 6: Class type: 0x31 (File entry: Directory); Name: Microsoft; Modification time: Not set (0); File attribute flags: 0x00000010 (Is directory); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: Microsoft; Creation time: Not set (0); Access time: Not set (0)
    Shell item 7: Class type: 0x31 (File entry: Directory); Name: MediaPlayer; Modification time: Not set (0); File attribute flags: 0x00000010 (Is directory); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: MediaPlayer; Creation time: Not set (0); Access time: Not set (0)
    Shell item 8: Class type: 0x32 (File entry: File); Name: service.exe; Modification time: Not set (0); File attribute flags: 0x00000020 (Should be archived); Extension block 1: Signature: 0xbeef0004 (File entry extension); Long name: service.exe; Creation time: Not set (0); Access time: Not set (0)

Distributed link tracking data:
    Machine identifier:
    Droid volume identifier:
    Droid file identifier:
    Birth droid volume identifier:
    Birth droid file identifier:

Appendix B

X-Agent C2 Raw Packet Decoding

Base64 Alphabets

X-Agent uses two base64 alphabets during message encoding. The first is a standard base64 alphabet, used for mail messages (SMTP and POP3). HTTP messages are encoded with a different, web-safe base64 alphabet.

Raw Packet Message Format

Raw packets are a generic container and packet format used to transmit module messages over external channels such as HTTP, SMTP or POP3. Raw packets are transmitted one by one, each in its own external channel message. For example, the SMTP mail channel sends each raw packet message as a mail attachment file. The size of the raw packet message is the size of the decoded attachment. Raw packets include the following fields:

agent_id crc 2

The raw packet message format was meant to be abstracted from the external channel, but there is one implementation inconsistency. The HTTP external channel XORs the agent ID field with an XOR key intended to obfuscate the previous header. The mail channels
do not do this, and it is likely an unintentional oversight.

Raw Packet Message CRC Checking

A CRC is calculated over the data and session key fields and then sent as two UINT16LE fields in the packet. The first is a polynomial seed for the CRC-16 algorithm, followed by the calculated good CRC value. Here is an implementation of the CRC check functionality in C:

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Bitwise CRC-16, least significant bit first, with a caller-supplied polynomial. */
    unsigned short crc16(const unsigned char *input, size_t len,
                         unsigned short poly_seed)
    {
        unsigned short result = 0;
        for (size_t i = 0; i < len; i++) {
            unsigned char c = input[i];
            for (int j = 0; j < 8; j++) {
                if (((result & 0xff) ^ c) & 1)
                    result = (result >> 1) ^ poly_seed;
                else
                    result >>= 1;
                c >>= 1;
            }
        }
        return result;
    }

    /* The original function name and parameter list were lost in extraction;
       check_crc and len are placeholders. */
    bool check_crc(const unsigned char *input, size_t len)
    {
        unsigned char header[4];
        unsigned short seed, expected_crc, actual_crc;
        if (len < 4)
            return false;
        memcpy(header, input, 4);
        /* Two UINT16LE fields: polynomial seed, then expected CRC. */
        seed = header[0] | (header[1] << 8);
        expected_crc = header[2] | (header[3] << 8);
        actual_crc = crc16(input + 4, len - 4, seed);
        return actual_crc == expected_crc;
    }

Raw Packet Message Decryption

Raw packet messages are RC4 encrypted using a key built by concatenating a static private key with a public key that changes each packet. A few simple steps can be used to decrypt a raw packet message:

1. Retrieve the agent ID, the first 4 bytes of the message, as a little-endian 32-bit integer. Discard these message bytes from the stream.
2. Retrieve the CRC-16 polynomial seed value and the expected CRC-16 value as the next two UINT16LEs immediately following the agent ID. Discard the CRC bytes (4 in total) from the stream.
3. Calculate the actual CRC of the remaining packet bytes, seeding the CRC with the correct polynomial seed. This should match the expected value.
4. Create the full RC4 key for the message, which starts with a 50-byte static private RC4 key. Then append the last 4 bytes of the message (the public key) to create the full RC4 key. Finally, discard the last 4 bytes of the stream (the public key).
5. Decrypt the remainder of the message stream using the full RC4 key.
6. Check that the last 11 bytes of the message are the magic token bytes. Discard these bytes.

The result is a clear text serialized module message.

National Security Archive, Suite 701, Gelman Library, The George Washington University, 2130 H Street NW,
Washington, D.C. 20037. Phone: 202 994-7000. Fax: 202 994-7005. nsarchiv@gwu.edu
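The three decoding steps given for the HTTP agent ID token (discard the 5-byte prefix, base64 decode with the web-safe alphabet, XOR with the repeated 4-byte key) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the web-safe alphabet is assumed to be the RFC 4648 "base64url" alphabet, which may differ from the one listed in Appendix B; the function names are hypothetical; only the 7-byte magic token variant is handled; and the expected magic value is supplied by the caller because it is not reproduced in this text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOKEN_PREFIX_LEN 5  /* random junk prefix */
#define XOR_KEY_LEN      4
#define MAGIC_LEN        7  /* newer samples; older ones use 20 bytes */

/* Assumption: RFC 4648 base64url alphabet (A-Z, a-z, 0-9, '-', '_'). */
static int b64url_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '-') return 62;
    if (c == '_') return 63;
    return -1;
}

/* Decodes base64url text (padding optional); returns bytes written or -1. */
static long b64url_decode(const char *in, size_t in_len,
                          uint8_t *out, size_t out_cap)
{
    uint32_t acc = 0;
    int bits = 0;
    size_t n = 0;
    for (size_t i = 0; i < in_len && in[i] != '='; i++) {
        int v = b64url_value(in[i]);
        if (v < 0) return -1;
        acc = (acc << 6) | (uint32_t)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n >= out_cap) return -1;
            out[n++] = (uint8_t)((acc >> bits) & 0xff);
        }
    }
    return (long)n;
}

/* Hypothetical helper: returns 0 and stores the agent ID on success, -1 on
   malformed input or magic mismatch. expected_magic may be NULL to skip
   the magic check. */
int decode_http_agent_token(const char *token, const uint8_t *expected_magic,
                            uint32_t *agent_id)
{
    uint8_t buf[64];
    size_t len = strlen(token);
    if (len <= TOKEN_PREFIX_LEN) return -1;

    /* Steps 1 and 2: drop the prefix, then base64 decode. */
    long n = b64url_decode(token + TOKEN_PREFIX_LEN, len - TOKEN_PREFIX_LEN,
                           buf, sizeof buf);
    /* A 3-byte agent ID is tolerated (the truncation case noted in the text). */
    if (n < XOR_KEY_LEN + MAGIC_LEN + 3) return -1;

    /* Step 3: XOR everything after the key with the repeated key. */
    uint8_t *body = buf + XOR_KEY_LEN;
    size_t body_len = (size_t)n - XOR_KEY_LEN;
    for (size_t i = 0; i < body_len; i++)
        body[i] ^= buf[i % XOR_KEY_LEN];

    if (expected_magic && memcmp(body, expected_magic, MAGIC_LEN) != 0)
        return -1;

    /* Agent ID is a 32-bit little-endian integer after the magic token. */
    uint32_t id = 0;
    for (size_t i = 0; i < 4 && MAGIC_LEN + i < body_len; i++)
        id |= (uint32_t)body[MAGIC_LEN + i] << (8 * i);
    *agent_id = id;
    return 0;
}
```

As a usage example, a synthetic token built with XOR key 01 02 03 04, a made-up magic value "ABCDEFG" and the agent ID bytes 43 f0 1c 10 from the worked example encodes to "AQIDBEBAQEBERERH8R4T"; prefixed with five arbitrary characters, it decodes back to agent ID 0x101cf043. The same routine with a 5-byte key and the standard alphabet would approximate the P2Scheme subject token.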
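Steps 4 and 5 of the raw packet procedure in Appendix B (build the full RC4 key from the static private key plus the trailing 4-byte public key, then decrypt the remainder) might look like the following in C. This is a minimal sketch, not the malware's own code: rc4_crypt is a textbook RC4 implementation, decrypt_raw_packet and MAX_STATIC_KEY_LEN are hypothetical names, the static key is passed in by the caller because its value is not reproduced in this text, and the input is assumed to have already had the agent ID and CRC fields stripped and verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PUBLIC_KEY_LEN     4   /* last 4 bytes of the message */
#define MAX_STATIC_KEY_LEN 64  /* hypothetical bound; the text describes a 50-byte key */

/* Textbook RC4: key scheduling, then keystream XOR applied in place. */
static void rc4_crypt(const uint8_t *key, size_t key_len,
                      uint8_t *data, size_t data_len)
{
    uint8_t s[256];
    size_t i, j = 0;
    assert(key_len > 0);

    for (i = 0; i < 256; i++)
        s[i] = (uint8_t)i;
    for (i = 0; i < 256; i++) {
        j = (j + s[i] + key[i % key_len]) & 0xff;
        uint8_t t = s[i]; s[i] = s[j]; s[j] = t;
    }

    i = j = 0;
    for (size_t n = 0; n < data_len; n++) {
        i = (i + 1) & 0xff;
        j = (j + s[i]) & 0xff;
        uint8_t t = s[i]; s[i] = s[j]; s[j] = t;
        data[n] ^= s[(s[i] + s[j]) & 0xff];
    }
}

/* msg holds the bytes left after the agent ID and CRC fields are removed.
   Decrypts the payload in place and returns its length (public key excluded),
   or -1 if the input is too short or the static key is too long. */
long decrypt_raw_packet(uint8_t *msg, size_t msg_len,
                        const uint8_t *static_key, size_t static_key_len)
{
    uint8_t full_key[MAX_STATIC_KEY_LEN + PUBLIC_KEY_LEN];
    if (msg_len < PUBLIC_KEY_LEN || static_key_len > MAX_STATIC_KEY_LEN)
        return -1;

    size_t payload_len = msg_len - PUBLIC_KEY_LEN;

    /* Full key = static private key || per-packet public key. */
    memcpy(full_key, static_key, static_key_len);
    memcpy(full_key + static_key_len, msg + payload_len, PUBLIC_KEY_LEN);

    rc4_crypt(full_key, static_key_len + PUBLIC_KEY_LEN, msg, payload_len);
    return (long)payload_len;
}
```

The sketch stops before step 6; a caller would still compare the trailing bytes of the decrypted payload against the expected magic token and discard them. It can be exercised with the widely published RC4 test vector for key "Secret" and plaintext "Attack at dawn" by treating "Se" as the static key and "cret" as the public key.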