Antimalware Strategy and cleaning up the act in the midst of plague

by Ash

Funny joke, that is partially a joke



Intro

So let's assume the business environment in your company is quite liberal and user-friendly. Now that you have taken over security operations, you realize just how bad the malware situation is. You have an antivirus (AV) solution installed, but something is still missing: infections still happen, and users complain about slow performance and lost files.

Reasoning

The first question you are going to ask your team is what to do with an infected machine. Sure, you can try to run a full investigation on every infection, but if you are in the middle of a viral outbreak, your immediate actions should concentrate on bringing it under control. Every organization suffers its share of successful malware infections. AV effectiveness depends on strategy, and your strategy should fit your business requirements.

Prerequisites

  1. Identify tools at your disposal
     List the tools you can use to control malware infections: the AV solution in use, Active Directory GPOs to enforce policies, Excel to analyze data, command-line tools, etc.

  2. Decide the roles
     I would suggest that the Security team investigates and confirms the infection, the IT team performs the cleaning, and the Security team verifies afterwards that all is well. Classic segregation of duties. You need to know who will participate in the whole process.

  3. Decide the reporting channels
     Here you might want to rely on the IT ticketing system or use your own incident response system. If you go for the latter, consider how IT will calculate its KPIs.

  4. Review implemented controls
     Make sure you identify what has already been done and why it is not working. Sit down with the participants of the process and review the policies implemented in the AV solution, review the security configuration of the endpoints, identify trends in infections, etc.

  5. Keep track of those requests
     Whether you decide to use the IT service desk ticketing system, simple email, or a dedicated security incident response system, you need to keep track of requests with suspected infections.

Process

The process layout will depend on your organization chart. My assumption is that your security operations team is separate from the IT department and has an established agreement that the IT support team deals with the endpoints and the user side, while your team reports, advises, and verifies.

  1. Every day I generate a 24-hour infection report. This report filters out cleaning events on removable media and keeps only events related to infections on local drives.
  2. Based on this report, PCs with suspected infections are reported to the IT service desk for checking and cleaning. Each PC is reported separately.
  3. The IT service desk updates the request to include the user and arranges access to the PC.
  4. It is important that the user cooperates and is involved in the whole process from the beginning. Should the user fail to cooperate, consider escalation channels.
  5. IT should follow the established check-and-clean procedure. If the machine is infected, it should be scanned and cleaned. This reddit post is a good example of a procedure for cleaning up an infected machine. Any comments or results of the check-and-clean procedure should be recorded in the raised request.
  6. If the event is a false positive, or the infection was already cleaned at the time of the AV alert, the support team simply acknowledges it in the raised request.
  7. Any artifacts should be collected and preserved for the Security operations team to verify. I normally ask for the scan log, a list of files on local drives, a list of processes, a list of services, the installed programs, and information about the user and their access privileges.
  8. The Security team should verify that the cleanup is done and the machine is malware-free. It may turn out that more information is needed or more checks have to be done. All of this should also be reflected in the request.
  9. To improve the process further, make sure to keep appropriate metrics. For some ideas, see the Metrics section below.

[Figure: process flow diagram with swimlanes for Security Operations, IT Support Operations, and User Operations, across the phases Initialize, Remediate, Verify, and Improve.]
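
Step 1 of the process can be sketched in a few lines of Python. This is only an illustration: the CSV layout and the column names (timestamp, host, threat, drive_type) are assumptions made for the example, not the export format of any particular AV console, so adjust them to whatever your product produces.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical AV console export; the column names are assumptions.
SAMPLE_EXPORT = """timestamp,host,threat,drive_type
2017-02-18 09:15:00,PC-001,Trojan.Generic,local
2017-02-18 10:02:00,PC-002,Worm.Autorun,removable
2017-02-18 11:40:00,PC-001,Adware.Bundle,local
2017-02-16 08:00:00,PC-003,Trojan.Old,local
"""


def daily_infection_report(rows, now, window_hours=24):
    """Group detections from the last `window_hours` by host.

    Events on removable media are dropped, so only detections
    on local drives end up in the report.
    """
    cutoff = now - timedelta(hours=window_hours)
    per_host = defaultdict(list)
    for row in rows:
        seen = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        if seen < cutoff or seen > now:
            continue  # outside the reporting window
        if row["drive_type"] != "local":
            continue  # removable media events are filtered out
        per_host[row["host"]].append(row["threat"])
    return dict(per_host)


if __name__ == "__main__":
    events = csv.DictReader(io.StringIO(SAMPLE_EXPORT))
    report = daily_infection_report(events, now=datetime(2017, 2, 18, 17, 0, 0))
    for host, threats in report.items():
        # One line per host, because each PC is reported separately.
        print(host, ", ".join(threats))
```

Each key in the returned dictionary corresponds to one request to the IT service desk.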


Reporting

Reports are an important part of your process, especially when your tool set is limited to standard protection and detection controls such as proxy, AV, and IPS. The reports I use in my daily operations include:

  • A report showing details of threats per host for the last 24 hours. This report is used to aggregate events, conduct the investigation, and report hosts to IT for cleaning. Reported hosts are counted later on as part of the metrics.
  • A trend report showing the number of infections per week. This report shows the effectiveness of your actions.
  • A report showing agent communications for the last 30 days. This one identifies hosts that may have communication problems with the central console.
  • A report showing canceled scheduled tasks. It helps to tune the scheduled scans.
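
If your console cannot produce the trend or agent communication reports directly, both are easy to approximate from exported data. The sketch below is an assumption-laden example: it takes a plain list of detection dates and a host-to-last-contact mapping, both of which you would have to build from your own export, and the function names are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def infections_per_week(detection_dates):
    """Count detections per ISO week, returned in chronological order."""
    counts = Counter()
    for day in detection_dates:
        iso_year, iso_week, _ = day.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return sorted(counts.items())


def stale_agents(last_contact, today, max_days=30):
    """Return hosts whose agent has not contacted the console for more than max_days."""
    limit = timedelta(days=max_days)
    return sorted(host for host, seen in last_contact.items() if today - seen > limit)


if __name__ == "__main__":
    detections = [
        date(2017, 2, 6), date(2017, 2, 8),
        date(2017, 2, 13), date(2017, 2, 18), date(2017, 2, 19),
    ]
    for (year, week), count in infections_per_week(detections):
        print(f"{year}-W{week:02d}: {count}")

    contacts = {"PC-001": date(2017, 2, 17), "PC-009": date(2017, 1, 10)}
    print(stale_agents(contacts, today=date(2017, 2, 18)))
```

Grouping by ISO week keeps the weeks aligned from Monday to Sunday, which makes week-over-week comparison straightforward.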

Metrics

To measure your performance you will need metrics, since nothing brings more clarity to a process than the ability to measure its effectiveness. The metrics I keep in my operations are loosely based on the CIS information security metrics. For the malware control process I keep track of the following:

  • Number of reports of suspected infection
  • Number of confirmed infections
  • Mean response time to take action

Each metric is kept per period: daily, weekly, and monthly.

To do this more efficiently I keep a separate Excel file with some macros that automate most of the calculations.

The number of reported suspected infections helps me measure the effect of changes made to control malware infections. For instance, turning on scheduled scans may increase the number of requests with suspected infections.

The number of confirmed infections shows the effectiveness of the overall process. The fewer confirmed infections, the better.

Finally, mean time to respond measures the average time spent on the investigation and cleaning process. Ideally your mean time should be measured in hours; in practice I measure it in days.
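
The same three numbers can also be calculated outside of Excel. The following is a minimal sketch, assuming each request is exported with the time it was reported, the time action was taken (if any), and whether the infection was confirmed. The Request class and its field names are hypothetical and do not belong to any ticketing system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Request:
    """One suspected-infection request (hypothetical structure)."""
    host: str
    reported_at: datetime
    action_at: Optional[datetime]  # None while nobody has acted on it yet
    confirmed: bool


def malware_metrics(requests):
    """Compute the three metrics for whatever period the requests cover."""
    response_times = [
        r.action_at - r.reported_at for r in requests if r.action_at is not None
    ]
    mean_response = (
        sum(response_times, timedelta()) / len(response_times)
        if response_times
        else None  # no request has been acted on yet
    )
    return {
        "reported": len(requests),
        "confirmed": sum(1 for r in requests if r.confirmed),
        "mean_response": mean_response,
    }


if __name__ == "__main__":
    sample = [
        Request("PC-001", datetime(2017, 2, 13, 9, 0), datetime(2017, 2, 13, 13, 0), True),
        Request("PC-002", datetime(2017, 2, 14, 9, 0), datetime(2017, 2, 15, 5, 0), True),
        Request("PC-003", datetime(2017, 2, 15, 9, 0), None, False),
    ]
    print(malware_metrics(sample))
```

Running it over the requests of a day, a week, or a month gives the figures for that period. Requests that nobody has acted on yet are left out of the mean, so keep an eye on them separately.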

What to look for

  1. If your policy is configured to delete infected files, users might, and one day probably will, complain about missing files that are legitimate. Quarantine is probably the best option in the beginning.
  2. If your investigation process is split between the IT and Security teams, make sure both teams communicate every action to the user before actually taking it.
  3. Is the IT service desk concerned about its KPIs? That may be a blessing in disguise if used correctly, or one of the pitfalls. A KPI for IT support means the need to respond quicker. Make sure they do not skip necessary steps while scanning or cleaning the infected PC; otherwise the IT support team will try to shorten its response time at the cost of a quality cleanup.
  4. Is your scan covering enough “ground” in a reasonable time period? If a scan takes too long and comes with a performance hit, users will try to cancel it.

Conclusion

Yes, I get it, AV struggles to keep up. According to various sources, antivirus vendors estimate that up to a million new malware variants appear every day. It is logical to assume that the time it takes for new malware to appear and hit your environment may well be shorter than your AV update period. Nowadays, malware breaks in using zerg rush tactics. Nothing can stop infection from happening, not even that fancy heuristic algorithm your AV vendor was so upbeat about. But AV solutions are not dead yet. Coupled with a proper process for cleaning house in a controlled manner, they become a useful control and protection mechanism. Every environment is different, and so will be your approach. As long as it is logical and measured, I am pretty sure results will come.

Mentions

Feature photo is by the Spanish photographer Jorge Pérez Higuera.