Domain 4: Incident Management Flashcards
(112 cards)
Security Incidents
Many IT professionals use the terms security event and security incident casually and interchangeably, but this is not correct. Members of a cybersecurity incident response team should use these terms carefully and according to their precise definitions within the organization. The National Institute of Standards and Technology (NIST) offers the following standard definitions for use throughout the U.S. government, and many private organizations choose to adopt them as well:
Event vs security incident vs adverse event
An event is any observable occurrence in a system or network. A security event includes any observable occurrence that relates to a security function. For example, a user accessing a file stored on a server, an administrator changing permissions on a shared folder, and an attacker conducting a port scan are all examples of security events.
An adverse event is any event that has negative consequences. Examples of adverse events include a malware infection on a system, a server crash, and a user accessing a file that they are not authorized to view.
A security incident is a violation or imminent threat of violation of computer security policies, acceptable use policies, or standard security practices. Examples of security incidents include the accidental loss of sensitive information, an intrusion into a computer system by an attacker, the use of a keylogger on an executive’s system to steal passwords, and the launch of a denial-of-service attack against a website.
Computer security incident response teams (CSIRTs)
Computer security incident response teams (CSIRTs) are responsible for responding to computer security incidents that occur within an organization by following standardized response procedures and incorporating their subject matter expertise and professional judgment.
Phases of Incident Response
CSIRT Preparation
CSIRTs do not spring up out of thin air. As much as managers may wish it were so, they cannot simply will a CSIRT into existence by creating a policy document and assigning staff members to the CSIRT. Instead, the CSIRT requires careful preparation to ensure that the CSIRT has the proper policy foundation, has operating procedures that will be effective in the organization’s computing environment, receives appropriate training, and is prepared to respond to an incident.
The preparation phase also includes building strong cybersecurity defenses to reduce the likelihood and impact of future incidents. This process of building a defense-in-depth approach to cybersecurity often includes personnel who might not be part of the CSIRT.
During the preparation process, incident response teams should also define their standard notification and escalation procedures. Remember that anyone in the organization may be the first to identify a potential security incident. Procedures should clearly define how first responders report a potential incident to the CSIRT, the process for notifying the team members of an activation, and the criteria for escalating incident reports to management, as warranted.
What does the CSIRT detection and analysis phase include?
The detection and analysis phase of incident response is one of the trickiest to commit to a routine process. Although cybersecurity analysts have many tools at their disposal that may assist in identifying that a security incident is taking place, many incidents are only detected because of the trained eye of an experienced analyst.
NIST SP 800-61 describes four major categories of security event indicators:
Alerts that originate from intrusion detection and prevention systems, security information and event management systems, antivirus software, file integrity–checking software, and/or third-party monitoring services
Logs generated by operating systems, services, applications, network devices, and network flows
Publicly available information about new vulnerabilities and exploits detected “in the wild” or in a controlled laboratory environment
People from inside the organization or external sources who report suspicious activity that may indicate a security incident is in progress
When any of these information sources indicate that a security incident may be occurring, cybersecurity analysts should shift into the initial validation mode, where they attempt to determine whether an incident is taking place that merits further activation of the incident response process. This analysis is often more art than science and is very difficult work.
NIST recommends the following actions to improve the timeliness and effectiveness of incident analysis:
Profile networks and systems to measure the characteristics of expected activity. This will improve the organization’s ability to identify abnormal activity during the detection and analysis process.
Understand normal behavior of users, systems, networks, and applications. This behavior will vary between organizations, at different times of the day, week, and year and with changes in the business cycle. A solid understanding of normal behavior is critical to recognizing deviations from those patterns.
Create a logging policy that specifies the information that must be logged by systems, applications, and network devices. The policy should also specify where those log records should be stored (preferably in a centralized log management system) and the retention period for logs.
Perform event correlation to combine information from multiple sources. This function is typically performed by a security information and event management (SIEM) system.
Synchronize clocks across servers, workstations, and network devices. This is done to facilitate the correlation of log entries from different systems. Organizations may easily achieve this objective by operating a Network Time Protocol (NTP) server.
Maintain an organization-wide knowledge base that contains critical information about systems and applications. This knowledge base should include information about system profiles, usage patterns, and other information that may be useful to responders who are not familiar with the inner workings of a system.
Capture network traffic as soon as an incident is suspected. If the organization does not routinely capture network traffic, responders should immediately begin packet captures during the detection and analysis phase. This information may provide critical details about an attacker’s intentions and activity.
Filter information to reduce clutter. Incident investigations generate massive amounts of information, and it is basically impossible to interpret it all without both inclusion and exclusion filters. Incident response teams may wish to create some predefined filters during the preparation phase to assist with future analysis efforts.
Seek assistance from external resources. Responders should know the parameters for involving outside sources in their response efforts. This may be as simple as conducting a Google search for a strange error message, or it may involve full-fledged coordination with other response teams.
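The correlation, clock-synchronization, and filtering recommendations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`time`, `source`, `message`), the sample entries, and the function names `to_utc`, `correlate`, and `apply_filters` are hypothetical and do not come from NIST or from any SIEM product.

```python
from datetime import datetime, timezone

def to_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    return datetime.fromisoformat(stamp).astimezone(timezone.utc)

def correlate(*sources):
    """Merge records from several log sources into one UTC-ordered timeline."""
    merged = [dict(rec, time=to_utc(rec["time"])) for src in sources for rec in src]
    return sorted(merged, key=lambda rec: rec["time"])

def apply_filters(records, include=(), exclude=()):
    """Keep records that match any inclusion keyword and no exclusion keyword."""
    kept = []
    for rec in records:
        text = rec["message"].lower()
        if include and not any(word in text for word in include):
            continue
        if any(word in text for word in exclude):
            continue
        kept.append(rec)
    return kept

# Hypothetical sample data: two devices that log in different time zones.
firewall_log = [
    {"time": "2024-03-01T09:15:00-05:00", "source": "firewall", "message": "Denied inbound SSH"},
    {"time": "2024-03-01T09:17:30-05:00", "source": "firewall", "message": "Allowed outbound HTTPS"},
]
server_log = [
    {"time": "2024-03-01T14:16:10+00:00", "source": "server", "message": "Failed SSH login for root"},
]

timeline = correlate(firewall_log, server_log)
ssh_events = apply_filters(timeline, include=("ssh",), exclude=("allowed",))
for rec in ssh_events:
    print(rec["time"].isoformat(), rec["source"], rec["message"])
```

Converting every timestamp to UTC before sorting is what lets the server entry land between the two firewall entries. Without synchronized clocks and a common time base, that ordering could not be trusted.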
What activities are conducted during containment?
Containment is the first activity that takes place during the containment, eradication, and recovery phase, and it should begin as quickly as possible after analysts determine that an incident is underway. Containment activities are designed to isolate the incident and prevent it from spreading further. If that phrase sounds somewhat vague, that’s because containment means very different things in the context of different types of security incidents. For example, if the organization is experiencing active exfiltration of data from a credit card processing system, incident responders might contain the damage by disconnecting that system from the network, preventing the attackers from continuing to exfiltrate information. On the other hand, if the organization is experiencing a denial-of-service attack against its website, disconnecting the network connection would simply help the attacker achieve its objective. In that case, containment might include placing filters on an upstream Internet connection that block all inbound traffic from networks involved in the attack or blocking web requests that bear a certain signature.
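As a rough illustration of the denial-of-service example, the sketch below checks whether an inbound source address belongs to a network that responders have decided to block. The network ranges are reserved documentation addresses chosen for illustration, and `BLOCKED_NETWORKS` and `should_block` are hypothetical names. A real deployment would apply such filters on the upstream router or firewall rather than in a script.

```python
import ipaddress

# Hypothetical networks identified as attack sources (documentation ranges).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]

def should_block(source_ip: str) -> bool:
    """Return True when the source address falls inside a blocked network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in BLOCKED_NETWORKS)

for ip in ("198.51.100.42", "203.0.113.200", "192.0.2.10"):
    print(ip, "block" if should_block(ip) else "allow")
```

Note that 203.0.113.200 is allowed because the /25 range covers only the lower half of that network. Choosing the narrowest range that still covers the attack is one way to limit the collateral damage to legitimate users.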
Who may need to get involved with decisions in the containment phase?
Containment activities typically aren’t perfect and often cause some collateral damage that disrupts normal business activity.
Consider the two examples described in the previous paragraph. Disconnecting a credit card processing system from the network may bring transactions to a halt, potentially causing significant business losses.
Similarly, blocking large swaths of inbound web traffic may render the site inaccessible to some legitimate users. Incident responders undertaking containment strategies must understand the potential side effects of their actions while weighing them against the greater benefit to the organization. Decisions such as these are one of the reasons that senior management may want to have input into the organization’s incident response strategies and tactics.
CONTAINMENT STRATEGY CRITERIA
In the Computer Security Incident Handling Guide, NIST recommends using the following criteria to develop an appropriate containment strategy and weigh it against business interests:
Potential damage to and theft of resources
Need for evidence preservation
Service availability (for example, network connectivity and services provided to external parties)
Time and resources needed to implement the strategy
Effectiveness of the strategy (for example, partial containment and full containment)
Duration of the solution (for example, emergency workaround to be removed in four hours, temporary workaround to be removed in two weeks, or permanent solution)
Note:
Selecting appropriate containment strategies is one of the most difficult tasks facing incident responders. Containment approaches that are too drastic may have an unacceptable impact on business operations. On the other hand, responders who select weak containment approaches may find that the incident escalates to cause even more damage.
Unfortunately, there’s no formula or decision tree that guarantees responders will make the “right” decision while responding to an incident. Incident responders should understand these criteria, the intent of management, and their technical and business operating environment. Armed with this information, responders will be well-positioned to follow their best judgment and select an appropriate containment strategy.
How is network segmentation useful to a CSIRT?
Network Segmentation
Cybersecurity analysts often use network segmentation as a proactive strategy to prevent the spread of future security incidents. For example, the network shown in Figure 8.2 is designed to segment different types of users from each other and from critical systems. An attacker who can gain access to the guest network would not be able to interact with systems belonging to employees or in the data center without traversing the network firewall.
In addition to being used as a proactive control, network segmentation may play a crucial role in incident response. During the early stages of an incident, responders may realize that a portion of systems are compromised but wish to continue to observe the activity on those systems while they determine other appropriate responses. However, they certainly want to protect other systems on the network from those potentially compromised systems.
Why choose isolation over segmentation as a containment strategy?
Isolation vs segmentation
Although segmentation does limit the access that attackers have to the remainder of the network, it sometimes doesn’t go far enough to meet containment objectives. Cybersecurity analysts may instead decide that it is necessary to use stronger isolation practices to cut off an attack. Two primary isolation techniques may be used during a cybersecurity incident response effort: isolating affected systems and isolating the attacker.
Segmentation and isolation strategies carry with them significant risks to the organization. First, the attacker retains access to the compromised system, creating the potential for further expansion of the security incident. Second, the compromised system may be used to attack other systems on the Internet. In the best case scenario, an attack launched from the organization’s network against a third party may lead to some difficult conversations with cybersecurity colleagues at other firms. In the worst case scenario, the courts may hold the organization liable for knowingly allowing the use of their network in an attack. Cybersecurity analysts considering a segmentation or isolation approach to containment should consult with both management and legal counsel.
In the segmentation approach, the quarantine network is connected to the firewall and may have some limited access to other networked systems.
In the isolation approach, the quarantine network connects directly to the Internet and has no access to other systems. In reality, this approach may be implemented by simply altering firewall rules rather than bypassing the firewall entirely. The objective is to allow the attacker to continue accessing the isolated systems but restrict their ability to access other systems and cause further damage.
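The difference between the two approaches can be modeled as a small policy table. This is a simplified sketch, not a firewall configuration: the zone names (`internet`, `limited_internal`, `data_center`) and the function `quarantine_may_reach` are assumptions made for illustration, and the model assumes the quarantine network keeps Internet connectivity in both cases so that the attacker's activity can still be observed.

```python
from enum import Enum

class Containment(Enum):
    SEGMENTATION = "segmentation"
    ISOLATION = "isolation"

# Hypothetical zone names; a real firewall policy would be far more granular.
# Each entry lists the destinations the quarantine network may still reach.
ALLOWED_FROM_QUARANTINE = {
    Containment.SEGMENTATION: {"internet", "limited_internal"},
    Containment.ISOLATION: {"internet"},
}

def quarantine_may_reach(strategy: Containment, destination_zone: str) -> bool:
    """Return True if traffic from the quarantine network is permitted to the zone."""
    return destination_zone in ALLOWED_FROM_QUARANTINE[strategy]

for strategy in Containment:
    for zone in ("internet", "limited_internal", "data_center"):
        verdict = "allow" if quarantine_may_reach(strategy, zone) else "deny"
        print(f"{strategy.value:13} -> {zone:17} {verdict}")
```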
What variation on the isolation approach requires a sandbox system to monitor activity?
What are the benefits?
ISOLATING THE ATTACKER
Isolating the attacker is an interesting variation on the isolation strategy and depends on the use of sandbox systems that are set up purely to monitor attacker activity and that do not contain any information or resources of value to the attacker. Placing attackers in a sandboxed environment allows continued observation in a fairly safe, contained environment. Some organizations use honeypot systems for this purpose.
What is the strongest containment technique available?
Removal
Removal of compromised systems from the network is the strongest containment technique in the cybersecurity analyst’s incident response toolkit. As shown in Figure 8.5, removal differs from segmentation and isolation in that the affected systems are completely disconnected from other networks, although they may still be allowed to communicate with other compromised systems within the quarantine VLAN. In some cases, each suspect system may be physically disconnected from the network so that they are prevented from communicating even with each other. The exact details of removal will depend on the circumstances of the incident and the professional judgment of incident responders.
How could an attacker detect removal from the network?
Removing a system from the network is a common containment step designed to prevent further damage from taking place, but NIST points out in their Computer Security Incident Handling Guide that it isn’t foolproof. The guide presents a hypothetical example of an attacker using a simple ping as a sort of “dead man’s switch” for a compromised system, designed to identify when defenders detect the attack and remove the system from the network.
In this scenario, the attacker simply sets up a periodic ping request to a known external host, such as the Google public DNS server located at 8.8.8.8. This server is almost always accessible from any network and the attacker can verify this connectivity after initially compromising a system.
The attacker can then write a simple script that monitors the results of those ping requests and, after detecting several consecutive failures, assumes that the attack was detected and the system was removed from the network. The script can then wipe out evidence of the attack or encrypt important information stored on the server.
The moral of the story is that although removal is a strong weapon in the containment toolkit, it isn’t foolproof!
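From the defender's side, one way to look for this kind of heartbeat before disconnecting a system is to check whether outbound requests to a single host recur at a nearly constant interval. The sketch below is an assumption-laden illustration, not a technique described in the NIST guide: the function name `looks_periodic`, the jitter threshold, and the sample timestamps are all hypothetical.

```python
from statistics import mean, pstdev

def looks_periodic(timestamps, max_jitter_seconds=2.0, min_events=5):
    """Return True when events recur at a nearly constant interval."""
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    # Gaps between consecutive events, in seconds.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # A steady heartbeat has a positive average gap and very little variation.
    return mean(gaps) > 0 and pstdev(gaps) <= max_jitter_seconds

# Hypothetical epoch-second timestamps of echo requests sent to 8.8.8.8.
scripted_ping = [0, 60, 120, 181, 240, 300]
interactive_use = [0, 5, 190, 200, 900, 905]

print(looks_periodic(scripted_ping))
print(looks_periodic(interactive_use))
```

A regular interval alone does not prove malicious intent, since many legitimate monitoring agents behave the same way. A match is a prompt for closer inspection rather than a verdict.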
Evidence Gathering and Handling
Note:
The primary objective during the containment phase of incident response is to limit the damage to the organization and its resources. That objective may take precedence over other goals, but responders may still be interested in gathering evidence during the containment process. This evidence may be crucial in the continuing analysis of the incident for internal purposes, or it may be used during legal proceedings against the attacker.
If incident handlers suspect that evidence gathered during an investigation may be used in court, they should take special care to preserve and document evidence during their investigation. NIST recommends that investigators maintain a detailed evidence log that includes the following:
Identifying information (for example, the location, serial number, model number, hostname, MAC addresses, and IP addresses of a computer)
Name, title, and phone number of each individual who collected or handled the evidence during the investigation
Time and date (including time zone) of each occurrence of evidence handling
Locations where the evidence was stored
Failure to maintain accurate logs will bring the evidence chain of custody into question and may cause the evidence to be inadmissible in court.
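The four log elements listed above map naturally onto a simple record structure. The following is a minimal sketch, assuming hypothetical class names (`EvidenceItem`, `CustodyEvent`) and made-up sample values. It is not a format prescribed by NIST, and a real evidence log would also need tamper-evident storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass(frozen=True)
class CustodyEvent:
    """One occurrence of evidence handling."""
    handler_name: str
    handler_title: str
    handler_phone: str
    occurred_at: datetime  # must carry a time zone
    storage_location: str

    def __post_init__(self):
        # Reject naive timestamps so the time zone is always recorded.
        if self.occurred_at.tzinfo is None:
            raise ValueError("occurred_at must include a time zone")

@dataclass
class EvidenceItem:
    """Identifying information plus the custody history of one item."""
    location: str
    serial_number: str
    model_number: str
    hostname: str
    mac_addresses: List[str]
    ip_addresses: List[str]
    custody: List[CustodyEvent] = field(default_factory=list)

    def record(self, event: CustodyEvent) -> None:
        """Append a handling event to the chain of custody."""
        self.custody.append(event)

# Hypothetical example values.
laptop = EvidenceItem(
    location="Building A, Room 210",
    serial_number="SN-EXAMPLE-001",
    model_number="MODEL-X",
    hostname="fin-laptop-12",
    mac_addresses=["00:00:5e:00:53:01"],
    ip_addresses=["192.0.2.25"],
)
laptop.record(CustodyEvent(
    handler_name="A. Analyst",
    handler_title="Incident Responder",
    handler_phone="555-0100",
    occurred_at=datetime(2024, 3, 1, 14, 30, tzinfo=timezone.utc),
    storage_location="Evidence locker 3",
))
print(len(laptop.custody), laptop.custody[0].occurred_at.isoformat())
```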
What are the issues in trying to identify the attackers behind an incident?
Identifying Attackers
Identifying the perpetrators of a cybersecurity incident is a complex task that often leads investigators down a winding path of redirected hosts that crosses international borders. Although you might find IP address records stored in your logs, it is incredibly unlikely that they correspond to the actual IP address of the attacker. Any attacker other than the most rank amateurs will relay communications through a series of compromised systems, making it very difficult to trace their actual origin.
Before heading down this path of investigating an attack’s origin, it’s very important to ask yourself why you are pursuing it. Is there really business value in uncovering who attacked you, or would your time be better spent on containment, eradication, and recovery activities? The NIST Computer Security Incident Handling Guide addresses this issue head-on, giving the opinion that “[i]dentifying an attacking host can be a time-consuming and futile process that can prevent a team from achieving its primary goal—minimizing the business impact.”
Law enforcement officials may approach this situation with objectives that differ from those of the attacked organization’s cybersecurity analysts. After all, one of the core responsibilities of law enforcement organizations is to identify criminals, arrest them, and bring them to trial. That responsibility may conflict with the core cybersecurity objectives of containment, eradication, and recovery. Cybersecurity and business leaders should take this conflict into consideration when deciding whether to involve law enforcement agencies in an incident investigation and the degree of cooperation they will provide to an investigation that is already underway.
Once the cybersecurity team successfully contains an incident, it is time to move on to the eradication phase of the response. What happens during Incident Eradication and Recovery?
The primary purpose of eradication is to remove any of the artifacts of the incident that may remain on the organization’s network. This could include the removal of any malicious code from the network, the sanitization of compromised media, and the securing of compromised user accounts.
The recovery phase of incident response focuses on restoring normal capabilities and services. It includes reconstituting resources and correcting security control deficiencies that may have led to the attack. This could include rebuilding and patching systems, reconfiguring firewalls, updating malware signatures, and similar activities. The goal of recovery is not just to rebuild the organization’s network but to do so in a manner that reduces the likelihood of a successful future attack.
During the eradication and recovery effort, cybersecurity analysts should develop a clear understanding of the incident’s root cause. This is critical to implementing a secure recovery that corrects control deficiencies that led to the original attack. After all, if you don’t understand how an attacker breached your security controls in the first place, it will be hard to correct those controls so the attack doesn’t reoccur. Understanding the root cause of an attack is a completely different activity than identifying the attacker. Root cause assessment is a critical component of incident recovery while, as mentioned earlier, identifying the attacker can be a costly distraction.
What reconstruction and reimaging needs to occur during an incident?
During an incident, attackers may compromise one or more systems through the use of malware, web application attacks, or other exploits. Once an attacker gains control of a system, security professionals should consider it completely compromised and untrustworthy. It is not safe to simply correct the security issue and move on because the attacker may still have an undetected foothold on the compromised system. Instead, the system should be rebuilt, either from scratch or by using an image or backup of the system from a known secure state.
Rebuilding and/or restoring systems should always be done with the incident root cause analysis in mind. If the system was compromised because it contained a security vulnerability, as opposed to through the use of a compromised user account, backups and images of that system likely have that same vulnerability. Even rebuilding the system from scratch may reintroduce the earlier vulnerability, rendering the system susceptible to the same attack. During the recovery phase, administrators should ensure that rebuilt or restored systems are remediated to address known security issues.
What patching strategy should analysts use?
Patching Systems and Applications
During the incident recovery effort, cybersecurity analysts will patch operating systems and applications involved in the attack. This is also a good time to review the security patch status of all systems in the enterprise, addressing other security issues that may lurk behind the scenes.
Cybersecurity analysts should first focus their efforts on systems that were directly involved in the compromise and then work their way outward, addressing systems that were indirectly related to the compromise before touching systems that were not involved at all. Figure 8.6 shows the phased approach that cybersecurity analysts should take to patching systems and applications during the recovery phase.
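The phased ordering described above amounts to sorting systems by how closely they were involved in the compromise. Here is a small sketch under stated assumptions: the `Involvement` levels, the `patch_order` function, and the host names are hypothetical and serve only to illustrate the ordering.

```python
from enum import IntEnum

class Involvement(IntEnum):
    """Lower values are patched first."""
    DIRECT = 1      # directly involved in the compromise
    INDIRECT = 2    # indirectly related to the compromise
    NONE = 3        # not involved at all

def patch_order(inventory):
    """Return host names ordered from most to least involved."""
    # sorted() is stable, so hosts in the same phase keep their original order.
    return [name for name, level in sorted(inventory, key=lambda item: item[1])]

# Hypothetical inventory.
systems = [
    ("hr-laptop-07", Involvement.NONE),
    ("web-01", Involvement.DIRECT),
    ("db-01", Involvement.INDIRECT),
    ("web-02", Involvement.DIRECT),
]
print(patch_order(systems))
```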
Sanitization and Secure Disposal
During the recovery effort, cybersecurity analysts may need to dispose of or repurpose media from systems that were compromised during the incident. In those cases, special care should be taken to ensure that sensitive information that was stored on that media is not compromised. Responders don’t want the recovery effort from one incident to lead to a second incident!
Generally speaking, there are three options available for the secure disposition of media containing sensitive information: clear, purge, and destroy. NIST defines these three activities in NIST SP 800-88: Guidelines for Media Sanitization:
Clear applies logical techniques to sanitize data in all user-addressable storage locations for protection against simple non-invasive data recovery techniques; this is typically applied through the standard Read and Write commands to the storage device, such as by rewriting with a new value or using a menu option to reset the device to the factory state (where rewriting is not supported).
Purge applies physical or logical techniques that render target data recovery infeasible using state-of-the-art laboratory techniques. Examples of purging activities include overwriting, block erase, and cryptographic erase activities when performed through the use of dedicated, standardized device commands. Degaussing is another form of purging that uses extremely strong magnetic fields to disrupt the data stored on a device.
Destroy renders target data recovery infeasible using state-of-the-art laboratory techniques and results in the subsequent inability to use the media for storage of data. Destruction techniques include disintegration, pulverization, melting, and incinerating.
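For quick review, the example techniques named in the three definitions above can be arranged as a lookup table. This sketch is only a study aid: the technique labels are paraphrased from the definitions, the names `Sanitization`, `TECHNIQUE_CATEGORY`, and `techniques_for` are hypothetical, and the table is not a complete list of what SP 800-88 permits for any given media type.

```python
from enum import Enum

class Sanitization(Enum):
    CLEAR = "clear"
    PURGE = "purge"
    DESTROY = "destroy"

# Example techniques taken from the definitions above (labels paraphrased).
TECHNIQUE_CATEGORY = {
    "rewrite with a new value": Sanitization.CLEAR,
    "factory reset": Sanitization.CLEAR,
    "block erase": Sanitization.PURGE,
    "cryptographic erase": Sanitization.PURGE,
    "degaussing": Sanitization.PURGE,
    "disintegration": Sanitization.DESTROY,
    "pulverization": Sanitization.DESTROY,
    "melting": Sanitization.DESTROY,
    "incineration": Sanitization.DESTROY,
}

def techniques_for(category: Sanitization):
    """Return the example techniques that belong to a sanitization category."""
    return sorted(name for name, cat in TECHNIQUE_CATEGORY.items() if cat is category)

for category in Sanitization:
    print(category.value, "->", ", ".join(techniques_for(category)))
```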
What steps are required to validate the Recovery Effort of an incident?
Validating the Recovery Effort
Before concluding the recovery effort, incident responders should take time to verify that the recovery measures put in place were successful. The exact nature of this verification will depend on the technical circumstances of the incident and the organization’s infrastructure. Four activities that should always be included in these validation efforts follow:
Validate that only authorized user accounts exist on every system and application in the organization. In many cases, organizations already undertake periodic account reviews that verify the authorization for every account. This process should be used during the recovery validation effort.
Verify the proper restoration of permissions assigned to each account. During the account review, responders should also verify that accounts do not have extraneous permissions that violate the principle of least privilege. This is true for normal user accounts, administrator accounts, and service accounts.
Verify that all systems are logging properly. Every system and application should be configured to log security-related information to a level that is consistent with the organization’s logging policy. Those log records should be sent to a centralized log repository that preserves them for archival use. The validation phase should include verification that these logs are properly configured and received by the repository.
Conduct vulnerability scans on all systems. Vulnerability scans play an important role in verifying that systems are safeguarded against future attacks. Analysts should run thorough scans against systems and initiate remediation workflows where necessary.
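The first two validation activities are essentially set comparisons: accounts that exist versus accounts that are authorized, and permissions held versus permissions approved. The sketch below illustrates that idea with hypothetical function names (`review_accounts`, `extra_permissions`) and made-up data. It assumes the authorized lists come from an existing periodic account review.

```python
def review_accounts(authorized, present):
    """Compare accounts found on a system against the authorized list."""
    authorized, present = set(authorized), set(present)
    return {
        "unauthorized": sorted(present - authorized),  # exist but were never approved
        "missing": sorted(authorized - present),       # approved but not restored
    }

def extra_permissions(approved, actual):
    """Return permissions each account holds beyond what was approved."""
    findings = {}
    for account, permissions in actual.items():
        surplus = set(permissions) - set(approved.get(account, ()))
        if surplus:
            findings[account] = sorted(surplus)
    return findings

# Hypothetical data for one server.
accounts = review_accounts(
    authorized=["alice", "bob", "svc-backup"],
    present=["alice", "svc-backup", "support2"],
)
surplus = extra_permissions(
    approved={"alice": ["read"], "svc-backup": ["read", "backup"]},
    actual={"alice": ["read", "admin"], "svc-backup": ["read", "backup"]},
)
print(accounts)
print(surplus)
```

Any account in the unauthorized list, or any entry in the surplus permissions, is a candidate for follow-up before the recovery effort is declared complete.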
What are the two things that are part of the Post-Incident Activity?
Once the recovery effort is complete, the CSIRT enters the post-incident activity phase of incident response. During this phase, team members conduct a lessons-learned review and ensure that they meet internal and external evidence retention requirements.
How does the lessons learned review work?
The lessons-learned review should be led by an independent facilitator who was not involved in the incident response and who is perceived by everyone involved as an objective outsider. This allows the facilitator to productively guide the discussion without participants feeling that the facilitator is advancing a hidden agenda. NIST recommends that lessons-learned processes answer the following questions:
Exactly what happened and at what times?
What was the root cause of the incident?
How well did staff and management perform in responding to the incident?
Were the documented procedures followed? Were they adequate?
What information was needed sooner?
Were any steps or actions taken that might have inhibited the recovery?
What would the staff and management do differently the next time a similar incident occurs?
How could information sharing with other organizations have been improved?
What corrective actions can prevent similar incidents in the future?
What precursors or indicators should be watched for in the future to detect similar incidents?
What additional tools or resources are needed to detect, analyze, and mitigate future incidents?