Malware Analysis as a function of intelligence and counterintelligence operations.

Simply stated, intelligence operations are focused on gathering information about an organization’s adversaries.  Counterintelligence operations are focused on limiting, controlling, or identifying the information that an organization’s adversaries gather about the organization.

Often when discussing malware, analysts will speak mechanically about the observable characteristics and capabilities of a sample.  (How is it packed?  How does it maintain persistence? What are the artifacts that let me find it on other machines or detect it on the network?…)  But a lot of what we need to do involves gathering and controlling information.  That is more like intelligence and counterintelligence operations, and it can be as hard as reverse engineering the sample itself.

Where did we get the sample from?  Who knows that we have the sample?  Who are we willing to share the sample with?  Have we seen similar samples that can be grouped or linked to another investigation? What is the sample trying to do (crossing over from the mechanical approach)?  Does a sample demonstrate intelligence about the target, or was it a dummy weapon that just happened to compromise a host?
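
On the question of grouping similar samples: one way to link related samples without sending anything outside the organization is to compare fuzzy hashes locally.  The sketch below assumes the python-ssdeep bindings are installed and takes whatever file paths you point it at; the similarity threshold is arbitrary and would need tuning against known-related samples.

    # group_samples.py - cluster local samples by ssdeep similarity (paths are placeholders)
    import itertools
    import sys

    import ssdeep  # python-ssdeep bindings; assumed to be installed

    def report_similar(paths, threshold=60):
        """Fuzzy-hash each file and print pairs that score above the threshold."""
        hashes = {p: ssdeep.hash_from_file(p) for p in paths}
        for a, b in itertools.combinations(paths, 2):
            score = ssdeep.compare(hashes[a], hashes[b])  # 0 (unrelated) .. 100 (near identical)
            if score >= threshold:
                print(f"{a} <-> {b}: similarity {score}")

    if __name__ == "__main__":
        report_similar(sys.argv[1:])

Nothing leaves the analysis box, so the grouping question can be answered before we decide whether anything needs to be shared.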

When we are conducting our investigation and need to acquire a sample, the sample may be obtained from a host that we control.  These samples may not be droppers or installers, but we can gather a lot of information from the persistent samples left on a host.  Other times we will need to reach out in order to obtain a sample or the original dropper.  We may be examining links in an e-mail, document, or web history; we may find URLs for payloads in a sample we are analyzing; or we may just be doing research on a family of malware.  Some sources are safer than others: a vetted malware dump or malware tracking site is relatively safe, while other sources may be compromised sites hosting the malware, and some may be sites controlled by an adversary.

When we reach out for samples, do we fully consider the potential intelligence we are exposing? Some of the questions we need to ask ourselves include:

  • What IP address does our traffic appear to come from?
  • Who has visibility of the last leg of our route?
  • What does a service provider know about their customer?
  • How normal will the generated traffic appear?

Things like Tor, anonymizers, and virtual private servers (VPS) in the cloud can be used to obscure the source of traffic.  Entrance and exit nodes from Tor, an anonymizer service, or a VPS provider have some degree of visibility of our traffic.  Even our local ISP and whatever DNS service we use have some degree of visibility of our traffic.  Risking exposure of this intelligence is something we want to minimize, but it has to be weighed against the goals of the investigation.  If the investigative goal is focused on short-term response objectives due to exigent circumstances, risking exposure of the intelligence while consuming services may be warranted.  If the investigative goal is to not tip off an adversary as to what an organization is investigating, then a more methodical and discreet approach is called for. If an investigation is in response to a well-known family of malware or is part of a highly visible malware campaign, exposure may not be as risky.  If it appears to be an unknown or custom malware sample or is part of an obviously targeted attack, limiting information exposure may be the most important investigative goal.
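
As a concrete check on the first question in the list above, the apparent source of our research traffic can be verified before any of it touches a site of interest.  This is a minimal sketch, assuming a local Tor client listening on its default SOCKS port (9050) and the requests library with SOCKS support (requests[socks]); the IP echo service is just an example endpoint.

    # check_exit_ip.py - confirm what IP address our research traffic appears to come from
    import requests

    ECHO_URL = "https://icanhazip.com"  # example IP echo service; substitute your own
    TOR_PROXY = {
        "http": "socks5h://127.0.0.1:9050",   # socks5h so DNS lookups also go through Tor
        "https": "socks5h://127.0.0.1:9050",
    }

    direct = requests.get(ECHO_URL, timeout=10).text.strip()
    via_tor = requests.get(ECHO_URL, proxies=TOR_PROXY, timeout=10).text.strip()

    print(f"Direct route exits as: {direct}")
    print(f"Tor route exits as:    {via_tor}")

Comparing the two shows exactly which address a remote site would record for each path, before any traffic touches infrastructure we care about.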

When selecting service providers (free or paid), we need to ask questions that include:

  • What is the provider’s reputation?
  • How strong is their security?
  • Do they have more important customers?
  • How do they make money or why are they offering the service?
  • What do they do with information they gather?
  • Are they bigger than you?
  • What information do they reveal about their consumers?
  • What information do they reveal about the services rendered?

Reputation can go a long way, but the other questions still need to be examined to judge whether the reputation is sufficient to warrant use of a service.  If a provider has been around for a long time, provides quality service, demonstrates strong security of their consumer information, has clear and open policies, and has a clear purpose for offering the service, then consuming the service is likely low risk.  If a service provider exposes submissions for public examination or examination by other customers, use of even the most reputable and useful provider needs to be weighed against the certainty that the information submitted will be exposed. This includes Anti-Virus (AV) product providers.  Useful information could be revealed to an adversary if the AV product provider automatically includes all submissions in signature updates pushed to their other customers.

Some of the other issues we must address concern the signals our actions send.  What if the malware we are investigating is targeted at a limited number of potential victims or a single potential victim?  If we begin to generate traffic with sites under the control or observation of our adversary, then we may be tipping our hand that we are on to them.  If we discover a malware sample and reach out for help, either to other organizations or to service providers, we run the risk that the fact that we have the sample will be exposed.  For example: if we submit all of our discovered malware to a cloud-based AV scanning service for triage, our adversary could simply set up an automated query for the malware's hash value and determine when the malware has been discovered.  This would give our adversary valuable time, early in our investigation, to reconfigure the malware or take offensive action that we could otherwise have avoided had we not allowed the information to be exposed in a cloud service.  This is where we have to use judgment in determining whether the information we gain outweighs the increase in risk due to exposing the information.
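
One way to delay that signal is to work from hash values computed locally and search for them only in sources we already hold or trust, rather than uploading the sample itself.  A minimal sketch using only the standard library (the file path is whatever sample you point it at):

    # hash_sample.py - compute common hash values locally so lookups can be done
    # without uploading the sample itself.
    import hashlib
    import sys

    def hash_file(path):
        md5, sha1, sha256 = hashlib.md5(), hashlib.sha1(), hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):  # stream so large samples stay out of memory
                for h in (md5, sha1, sha256):
                    h.update(chunk)
        return md5.hexdigest(), sha1.hexdigest(), sha256.hexdigest()

    if __name__ == "__main__":
        for name, digest in zip(("MD5", "SHA1", "SHA256"), hash_file(sys.argv[1])):
            print(f"{name}: {digest}")

Even a hash-only lookup can leak if the provider logs queries, so the same weighing of exposure against benefit still applies.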

One of the biggest releases of information happens when we start cleaning malware off of our systems.  When our adversaries see their number of compromised hosts dropping, or appear to lose communication with all of them at once, they may attempt to take counter action.  This is why it is important to control when this signal is sent (it cannot be avoided when it is time to clean).  If the incident response investigation is complete, the malware has been analyzed, all compromised hosts have been identified, all of the stakeholders are in place, and a remediation action plan is ready to be deployed, the adversary will be at a significant disadvantage compared to when they get this signal earlier and can counter during the early phases of the response process.

Last, we have simple information control.  Last year the story of the RSA breach piqued my interest as an intelligence and counterintelligence story.  It appears that early in the response phase, the malicious attachment that resulted in the adversary gaining a foothold was submitted to a cloud service for scanning.  The loss of control of this information later led to some embarrassment because RSA was no longer able to control and limit access to that attachment.  I am not questioning the judgment call because I do not have all the facts that were available to the decision makers at the time.  But the story did bring to light the need to consider these longer-term information control issues.  It is also possible that the adversary became aware of the detection based on the publicly available information that was exposed at the time the sample was submitted to the cloud-based service.

Some other ways we may inadvertently give away intelligence and some things to consider as counters to the risk:

  • A unique or rare browser footprint may be associated with a certain research group or individual.  This would permit an adversary to know when that research effort is examining them. Panopticlick (https://panopticlick.eff.org) is a good way to check whether your total browser footprint is common or rare.  Take steps to make your browser more common.  Use a live CD of a vanilla Linux distribution to do your research browsing.
  • If an adversary has released malware to a limited number of targets or a single target, a user agent string that does not exactly match the malware's may alert them that they are being examined, and by whom.  Full malware analysis in a test network, with something like INetSim, some of the fake tools in the REMnux distro, or FakeNet (full disclosure: I have not had time to test FakeNet yet), can capture this information so exactly matching requests can be generated if needed (see the sketch after this list).
  • Over time, using the same IP address to do research may result in more advanced adversaries identifying traffic sources.  Using Tor, an anonymizer, or a consumer ISP with DHCP can help counter this.
  • Service providers may be gathering information on us.  If we submit 20 or more polymorphic variants from the same malware family, it would be reasonable for the service provider to assume we are responding to an overrun of our network by that malware family.  Could that information alone have commercial use? I have not formulated a defense against this other than giving more weight to not submitting.
  • What does our DNS traffic reveal? Using and rotating DNS providers can limit this information, but unless you have the money to support your own DNS root, you just have to be aware of the information you expose.
  • Even if we don't touch the adversary's hosts, does our Internet research give away information about what we are responding to? Did we remember to turn off prefetching in Firefox before we started to conduct research in Google?  Easy solution: turn off prefetching and any other function that tries to cache potential next stops while browsing.
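
On the user agent point above: once the malware's exact request has been captured in the lab (with INetSim, FakeNet, or similar), any follow-up request we choose to make can reuse the observed headers.  A hedged sketch, where the user agent string and URL are stand-ins for whatever the sample actually produced:

    # mimic_request.py - reproduce the malware's observed HTTP request as closely as possible.
    # The User-Agent value and URL below are placeholders for what was captured in the lab.
    import requests

    OBSERVED_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"  # hypothetical captured value
    TARGET_URL = "http://example.com/check"                             # placeholder, not a real C2

    headers = {
        "User-Agent": OBSERVED_UA,
        # Replicate any other headers the sample sent (Accept, Accept-Language, etc.) as captured.
    }

    resp = requests.get(TARGET_URL, headers=headers, timeout=10, allow_redirects=False)
    print(resp.status_code, len(resp.content))

Whether to send such a request at all is, of course, exactly the judgment call the rest of this post is about.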

It is important for a malware analyst, and for other analysts responding to an incident involving malware, to identify when some action they take could expose sensitive information.  When dealing with scattershot malware that is part of a publicly discussed spam campaign or otherwise high-profile/public incident, the potential cost due to the risk of releasing information is generally low.  But when faced with an unknown sample, or something with limited exposure, the cost due to exposure rises.  These are decisions I have beaten myself up over, vowed to learn from, and committed to making better the next time.
