4. Types of Security Procedures
4.1 System Security Audits
Most businesses undergo some sort of annual financial auditing as a
regular part of their business life. Security audits are an
important part of running any computing environment. Part of the
security audit should be a review of any policies that concern system
security, as well as the mechanisms that are put in place to enforce
them.
4.1.1 Organize Scheduled Drills
Although not something that would be done each day or week,
scheduled drills may be conducted to determine if the procedures
defined are adequate for the threat to be countered. If your
major threat is one of natural disaster, then a drill would be
conducted to verify your backup and recovery mechanisms. On the
other hand, if your greatest threat is from external intruders
attempting to penetrate your system, a drill might be conducted to
actually try a penetration to observe the effect of the policies.
Drills are a valuable way to test that your policies and
procedures are effective. On the other hand, drills can be time-
consuming and disruptive to normal operations. It is important to
weigh the benefits of the drills against the possible time loss
which may be associated with them.
4.1.2 Test Procedures
If the choice is made not to use scheduled drills to examine
your entire security procedure at one time, it is important to
test individual procedures frequently. Examine your backup
procedure to make sure you can recover data from the tapes. Check
log files to be sure that the information which is supposed to be
logged to them is actually being logged.
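The two spot checks named above, restoring a file from backup and
confirming that a log is still being written, can be partly
automated. The following Python sketch is illustrative only: the
function names and the one-hour staleness threshold are assumptions,
not part of any particular site's procedure.

```python
import filecmp
import os
import time


def restore_matches(original_path, restored_path):
    """Return True if a file restored from backup is byte-for-byte
    identical to the copy it is supposed to reproduce."""
    # shallow=False forces a content comparison, not just a stat() check.
    return filecmp.cmp(original_path, restored_path, shallow=False)


def log_is_fresh(log_path, max_age_seconds=3600, now=None):
    """Return True if the log file was modified within max_age_seconds.
    The one-hour default is an assumed threshold; tune it per log."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(log_path)) <= max_age_seconds
```

A log that fails the freshness check is not necessarily broken (a
quiet system may log little), so a failure is a prompt to look, not
a verdict.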
When a security audit is mandated, great care should be used in
devising tests of the security policy. It is important to clearly
identify what is being tested, how the test will be conducted, and
what results are expected from the test. All of this should be
documented and included in, or as an adjunct to, the security
policy document.
It is important to test all aspects of the security policy, both
procedural and automated, with a particular emphasis on the
automated mechanisms used to enforce the policy. Tests should be
defined to ensure a comprehensive examination of policy features.
That is, if a test is defined to examine the user logon process,
it should be explicitly stated that both valid and invalid user
names and passwords will be used to demonstrate proper operation
of the logon program.
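A documented logon test of this kind can be written down as a table
of cases with the expected result for each. The sketch below assumes
a hypothetical authenticate(user, password) callable that wraps the
real logon program; demo_authenticate and its single account are
stand-ins invented for illustration.

```python
def run_logon_tests(authenticate, cases):
    """Run each (user, password, expected) case and return the failures."""
    failures = []
    for user, password, expected in cases:
        actual = authenticate(user, password)
        if actual != expected:
            failures.append((user, password, expected, actual))
    return failures


# Hypothetical stand-in for the logon program under test. A real system
# would never hold passwords in clear text like this.
_ACCOUNTS = {"alice": "Vq7!wren"}


def demo_authenticate(user, password):
    return _ACCOUNTS.get(user) == password


CASES = [
    ("alice", "Vq7!wren", True),      # valid name, valid password
    ("alice", "wrong-pass", False),   # valid name, invalid password
    ("mallory", "Vq7!wren", False),   # invalid name
    ("", "", False),                  # empty input
]
```

An empty list from run_logon_tests(demo_authenticate, CASES) means
every case behaved as documented; any entry returned identifies the
case that needs investigation.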
Keep in mind that there is a limit to the reasonableness of tests.
The purpose of testing is to ensure confidence that the security
policy is being correctly enforced, and not to "prove" the
absoluteness of the system or policy. The goal should be to
obtain some assurance that the reasonable and credible controls
imposed by your security policy are adequate.
4.2 Account Management Procedures
Procedures to manage accounts are important in preventing
unauthorized access to your system. It is necessary to decide
several things: Who may have an account on the system? How long may
someone have an account without renewing his or her request? How do
old accounts get removed from the system? The answers to all these
questions should be explicitly set out in the policy.
In addition to deciding who may use a system, it may be important to
determine what each user may use the system for (is personal use
allowed, for example). If you are connected to an outside network,
your site or the network management may have rules about what the
network may be used for. Therefore, it is important for any security
policy to define an adequate account management procedure for both
administrators and users. Typically, the system administrator would
be responsible for creating and deleting user accounts and generally
maintaining overall control of system use. To some degree, account
management is also the responsibility of each system user in the
sense that the user should observe any system messages and events
that may be indicative of a policy violation. For example, a message
at logon that indicates the date and time of the last logon should be
reported by the user if it indicates an unreasonable time of last
logon.
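What counts as an unreasonable time depends on the user. As a rough
sketch, assuming a user who normally works on weekdays between
assumed hours of 07:00 and 19:59, a last-logon time could be
screened like this (the function name and the hours are
illustrative):

```python
from datetime import datetime


def logon_looks_unusual(last_logon, first_hour=7, last_hour=19):
    """Return True if last_logon falls on a weekend or outside the
    hours first_hour..last_hour (inclusive) that this user works."""
    if last_logon.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (first_hour <= last_logon.hour <= last_hour)
```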
4.3 Password Management Procedures
A policy on password management may be important if your site wishes
to enforce secure passwords. These procedures may range from asking
or forcing users to change their passwords occasionally to actively
attempting to break users' passwords and then informing the user of
how easy it was to do. Another part of password management policy
covers who may distribute passwords - can users give their passwords
to other users?
Section 2.3 discusses some of the policy issues that need to be
decided for proper password management. Regardless of the policies,
password management procedures need to be carefully set up to avoid
disclosing passwords. The choice of initial passwords for accounts
is critical. In some cases, users may never log in to activate an
account; thus, the initial password should not be
easily guessed. Default passwords should never be assigned to
accounts: always create new passwords for each user. If there are
any printed lists of passwords, these should be kept off-line in
secure locations; better yet, don't list passwords.
4.3.1 Password Selection
Perhaps the most vulnerable part of any computer system is the
account password. Any computer system, no matter how secure it is
from network or dial-up attack, Trojan horse programs, and so on,
can be fully exploited by an intruder if he or she can gain access
via a poorly chosen password. It is important to define a good
set of rules for password selection, and distribute these rules to
all users. If possible, the software which sets user passwords
should be modified to enforce as many of the rules as possible.
A sample set of guidelines for password selection is shown below:
- DON'T use your login name in any form (as-is,
reversed, capitalized, doubled, etc.).
- DON'T use your first, middle, or last name in any form.
- DON'T use your spouse's or child's name.
- DON'T use other information easily obtained about you.
This includes license plate numbers, telephone numbers,
social security numbers, the make of your automobile,
the name of the street you live on, etc.
- DON'T use a password of all digits, or all the same
letter.
- DON'T use a word contained in English or foreign
language dictionaries, spelling lists, or other
lists of words.
- DON'T use a password shorter than six characters.
- DO use a password with mixed-case alphabetics.
- DO use a password with non-alphabetic characters (digits
or punctuation).
- DO use a password that is easy to remember, so you don't
have to write it down.
- DO use a password that you can type quickly, without
having to look at the keyboard.
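Several of the guidelines above are mechanical enough for
password-setting software to enforce. The following sketch shows one
way such checks might look. The function name, the message strings,
and the caller-supplied names and dictionary arguments are all
assumptions; the last two guidelines (easy to remember, quick to
type) cannot be checked by a program.

```python
def password_violations(password, login, names=(), dictionary=frozenset()):
    """Return a list of guideline violations; empty means none found.

    names: personal strings to reject (user's names, family names, etc.).
    dictionary: a set of lower-case words to reject outright.
    """
    problems = []
    lowered = password.lower()
    login_l = login.lower()

    # As-is, capitalized, and doubled forms all contain the login name;
    # the reversed form is checked separately.
    if login_l and (login_l in lowered or login_l[::-1] in lowered):
        problems.append("uses login name")
    if any(n and n.lower() in lowered for n in names):
        problems.append("uses a personal name")
    if password.isdigit():
        problems.append("all digits")
    if len(password) > 1 and len(set(password)) == 1:
        problems.append("one repeated character")
    if lowered in dictionary:
        problems.append("dictionary word")
    if len(password) < 6:
        problems.append("shorter than six characters")
    if not (any(c.islower() for c in password)
            and any(c.isupper() for c in password)):
        problems.append("not mixed case")
    if password.isalpha():
        problems.append("no non-alphabetic characters")
    return problems
```

Returning every violation at once, rather than stopping at the
first, lets the software tell the user everything that needs fixing
in a single attempt.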
Methods of selecting a password which adheres to these guidelines
include:
- Choose a line or two from a song or poem, and use the
first letter of each word.
- Alternate between one consonant and one or two vowels, up
to seven or eight characters. This provides nonsense
words which are usually pronounceable, and thus easily
remembered.
- Choose two short words and concatenate them together with
a punctuation character between them.
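The three methods above are simple enough to sketch in code. The
helper names and the punctuation set below are assumptions, and the
consonant-vowel generator is simplified to exactly one vowel after
each consonant. Random choices use Python's secrets module rather
than random, since the output is meant to be hard to guess.

```python
import secrets

CONSONANTS = "bcdfghjklmnpqrstvwxyz"
VOWELS = "aeiou"
PUNCTUATION = "!#%&*+-=?"  # assumed set; adjust to what your system accepts


def initials(line):
    """First letter of each word in a line of a song or poem."""
    return "".join(word[0] for word in line.split())


def pronounceable(length=8):
    """Nonsense word alternating one consonant and one vowel."""
    return "".join(
        secrets.choice(CONSONANTS if i % 2 == 0 else VOWELS)
        for i in range(length)
    )


def joined_words(first, second):
    """Two short words joined by a randomly chosen punctuation character."""
    return first + secrets.choice(PUNCTUATION) + second
```

For example, initials("Twinkle twinkle little star how I wonder")
returns "TtlshIw". A result from any of these helpers should still
be checked against the selection guidelines before it is used.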
Users should also be told to change their password periodically,
usually every three to six months. This makes sure that an
intruder who has guessed a password will eventually lose access,
as well as invalidating any list of passwords he/she may have
obtained. Many systems enable the system administrator to force
users to change their passwords after an expiration period; this
software should be enabled if your system supports it [5, CURRY].
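Where the system does not provide expiration itself, the check is a
simple date comparison. The sketch below assumes a 180-day limit
(the upper end of the three-to-six-month range) and a hypothetical
mapping from user name to the date the password was last changed.

```python
from datetime import date, timedelta


def password_expired(last_changed, today, max_age_days=180):
    """Return True if the password is older than max_age_days."""
    return today - last_changed > timedelta(days=max_age_days)


def users_due_for_change(last_changed_by_user, today, max_age_days=180):
    """Return the sorted names of users whose passwords have expired."""
    return sorted(
        user
        for user, changed in last_changed_by_user.items()
        if password_expired(changed, today, max_age_days)
    )
```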
Some systems provide software which forces users to change their
passwords on a regular basis. Many of these systems also include
password generators which provide the user with a set of passwords
to choose from. The user is not permitted to make up his or her
own password. There are arguments both for and against systems
such as these. On the one hand, by using generated passwords,
users are prevented from selecting insecure passwords. On the
other hand, unless the generator is good at making up passwords that
are easy to remember, users will begin writing them down in order to
remember them.
4.3.2 Procedures for Changing Passwords
How password changes are handled is important to keeping passwords
secure. Ideally, users should be able to change their own
passwords on-line. (Note that password changing programs are a
favorite target of intruders. See section 4.4 on configuration
management for further information.)
However, there are exceptional cases which must be handled
carefully. Users may forget passwords and not be able to get onto
the system. The standard procedure is to assign the user a new
password. Care should be taken to make sure that the real person
is requesting the change and gets the new password. One common
trick used by intruders is to call or send a message to a system
administrator and request a new password. Some external form of
verification should be used before the password is assigned. At
some sites, users are required to show up in person with ID.
There may also be times when many passwords need to be changed.
If a system is compromised by an intruder, the intruder may be
able to steal a password file and take it off the system. Under
these circumstances, one course of action is to change all
passwords on the system. Your site should have procedures for how
this can be done quickly and efficiently. What course you choose
may depend on the urgency of the problem. In the case of a known
attack with damage, you may choose to forcibly disable all
accounts and assign users new passwords before they come back onto
the system. In some places, users are sent a message telling them
that they should change their passwords, perhaps within a certain
time period. If the password isn't changed before the time period
expires, the account is locked.
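The notify-then-lock approach described above reduces to two
questions: has the grace period run out, and has the user changed
the password since the notice was sent? A minimal sketch follows,
assuming a three-day grace period and a hypothetical mapping from
user name to the time of the last password change.

```python
from datetime import datetime, timedelta


def accounts_to_lock(last_changed, notice_sent, now, grace=timedelta(days=3)):
    """Return the sorted names of users who have not changed their
    password since the notice, once the grace period has expired."""
    if now < notice_sent + grace:
        return []  # users still have time to comply
    return sorted(
        user for user, changed in last_changed.items()
        if changed < notice_sent
    )
```

In the case of a known attack with damage, the same function with a
grace period of zero lists every account that still carries a
pre-incident password.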
Users should be aware of what the standard procedure is for
passwords when a security event has occurred. One well-known
spoof reported by the Computer Emergency Response Team (CERT)
involved messages sent to users, supposedly from local system
administrators, requesting them to immediately change their
password to a new value provided in the message. These
messages were not from the administrators, but from intruders
trying to steal accounts. Users should be warned to immediately
report any suspicious requests such as this to site
administrators.
4.4 Configuration Management Procedures
Configuration management is generally applied to the software
development process. However, it is certainly applicable in an
operational sense as well. Consider that, since many of the
system level programs are intended to enforce the security policy, it
is important that these be "known" as correct. That is, one should
not allow system level programs (such as the operating system, etc.)
to be changed arbitrarily. At the very least, the procedures should
state who is authorized to make changes to systems, under what
circumstances, and how the changes should be documented.
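One way to keep system programs "known" as correct is to record a
checksum of each when it is installed by an authorized change, and
to compare against that record later. The sketch below uses SHA-256
from Python's hashlib; the choice of algorithm, the function names,
and the idea of a stored baseline are assumptions for illustration.
The baseline itself must be kept where an intruder cannot alter it
(off-line or on read-only media).

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Map each path to its current digest."""
    return {path: file_digest(path) for path in paths}


def changed_files(baseline, current):
    """Return sorted paths that were added, removed, or modified
    relative to the baseline."""
    all_paths = set(baseline) | set(current)
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))
```

Any path reported by changed_files should correspond to a documented,
authorized change; one that does not is worth investigating.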
In some environments, configuration management is also desirable as
applied to physical configuration of equipment. Maintaining valid
and authorized hardware configuration should be given due
consideration in your security policy.
4.4.1 Non-Standard Configurations
Occasionally, it may be beneficial to have a slightly non-standard
configuration in order to thwart the "standard" attacks used by
some intruders. The non-standard parts of the configuration might
include different password encryption algorithms, different
configuration file locations, and rewritten or functionally
limited system commands.
Non-standard configurations, however, also have their drawbacks.
By changing the "standard" system, these modifications make
software maintenance more difficult by requiring extra
documentation to be written, software modification after operating
system upgrades, and, usually, someone with special knowledge of
the changes.
Because of the drawbacks of non-standard configurations, they are
often only used in environments with a "firewall" machine (see
section 3.9.1). The firewall machine is modified in non-standard
ways since it is susceptible to attack, while internal systems
behind the firewall are left in their standard configurations.
5. Incident Handling
This section of the document will supply some guidance to be applied
when a computer security event is in progress on a machine, network,
site, or multi-site environment. The operative philosophy in the
event of a breach of computer security, whether it be an external
intruder attack or a disgruntled employee, is to plan for adverse
events in advance. There is no substitute for creating contingency
plans for the types of events described above.
Traditional computer security, while quite important in the overall
site security plan, usually focuses heavily on protecting systems from
attack, and perhaps monitoring systems to detect attacks. Little
attention is usually paid to how to actually handle the attack when
it occurs. The result is that when an attack is in progress, many
decisions are made in haste and can be damaging to tracking down the
source of the incident, collecting evidence to be used in prosecution
efforts, preparing for the recovery of the system, and protecting the
valuable data contained on the system.
5.1.1 Have a Plan to Follow in Case of an Incident
Part of handling an incident is being prepared to respond before
the incident occurs. This includes establishing a suitable level
of protections, so that if the incident becomes severe, the damage
which can occur is limited. Protection includes preparing
incident handling guidelines or a contingency response plan for
your organization or site. Having written plans eliminates much
of the ambiguity which occurs during an incident, and will lead to
a more appropriate and thorough set of responses. Second, part of
protection is preparing a method of notification, so you will know
who to call and the relevant phone numbers. It is important, for
example, to conduct "dry runs," in which your computer security
personnel, system administrators, and managers simulate handling
an incident.
Learning to respond efficiently to an incident is important for
numerous reasons. The most important benefit is directly to human
beings--preventing loss of human life. Some computing systems are
life critical systems, systems on which human life depends (e.g.,
by controlling some aspect of life-support in a hospital or
assisting air traffic controllers).
An important but often overlooked benefit is an economic one.
Having both technical and managerial personnel respond to an
incident requires considerable resources, resources which could be
utilized more profitably if an incident did not require their
services. If these personnel are trained to handle an incident
efficiently, less of their time is required to deal with that
incident.
A third benefit is protecting classified, sensitive, or
proprietary information. One of the major dangers of a computer
security incident is that information may be irrecoverable.
Efficient incident handling minimizes this danger. When
classified information is involved, other government regulations
may apply and must be integrated into any plan for incident
handling.
A fourth benefit is related to public relations. News about
computer security incidents tends to be damaging to an
organization's stature among current or potential clients.
Efficient incident handling minimizes the potential for negative
publicity.
A final benefit of efficient incident handling is related to legal
issues. It is possible that in the near future organizations may
be sued because one of their nodes was used to launch a network
attack. In a similar vein, people who develop patches or
workarounds may be sued if the patches or workarounds are
ineffective, resulting in damage to systems, or if the patches or
workarounds themselves damage systems. Knowing about operating
system vulnerabilities and patterns of attacks and then taking
appropriate measures is critical to circumventing possible legal
problems.
5.1.2 Order of Discussion in this Session Suggests an Order for
a Plan
This chapter is arranged such that a list may be generated from
the Table of Contents to provide a starting point for creating a
policy for handling ongoing incidents. The main points to be
included in a policy for handling incidents are:
o Overview (what are the goals and objectives in handling the
incident).
o Evaluation (how serious is the incident).
o Notification (who should be notified about the incident).
o Response (what should the response to the incident be).
o Legal/Investigative (what are the legal and prosecutorial
implications of the incident).
o Documentation Logs (what records should be kept from before,
during, and after the incident).
Each of these points is important in an overall plan for handling
incidents. The remainder of this chapter will detail the issues
involved in each of these topics, and provide some guidance as to
what should be included in a site policy for handling incidents.
5.1.3 Possible Goals and Incentives for Efficient Incident
Handling
As in any set of pre-planned procedures, attention must be placed
on a set of goals to be obtained in handling an incident. These
goals will be placed in order of importance depending on the site,
but one such set of goals might be:
Assure integrity of (life) critical systems.
Maintain and restore data.
Maintain and restore service.
Figure out how it happened.
Avoid escalation and further incidents.
Avoid negative publicity.
Find out who did it.
Punish the attackers.
It is important to prioritize actions to be taken during an
incident well in advance of the time an incident occurs.
Sometimes an incident may be so complex that it is impossible to
do everything at once to respond to it; priorities are essential.
Although priorities will vary from institution-to-institution, the
following suggested priorities serve as a starting point for
defining an organization's response:
o Priority one -- protect human life and people's
safety; human life always has precedence over all
other considerations.
o Priority two -- protect classified and/or sensitive
data (as regulated by your site or by government
regulations).
o Priority three -- protect other data, including
proprietary, scientific, managerial and other data,
because loss of data is costly in terms of resources.
o Priority four -- prevent damage to systems (e.g., loss
or alteration of system files, damage to disk drives,
etc.); damage to systems can result in costly down
time and recovery.
o Priority five -- minimize disruption of computing
resources; it is better in many cases to shut a system
down or disconnect from a network than to risk damage
to data or systems.
An important implication for defining priorities is that once
human life and national security considerations have been
addressed, it is generally more important to save data than system
software and hardware. Although it is undesirable to have any
damage or loss during an incident, systems can be replaced; the
loss or compromise of data (especially classified data), however,
is usually not an acceptable outcome under any circumstances.
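If pending response actions are tracked in software, the suggested
priorities can be encoded directly so that the most urgent work
always sorts first. This is only a sketch: the enumeration names and
the (priority, description) pair format are assumptions, and each
site should substitute its own ordering.

```python
from enum import IntEnum


class Priority(IntEnum):
    HUMAN_LIFE = 1       # priority one
    SENSITIVE_DATA = 2   # priority two
    OTHER_DATA = 3       # priority three
    SYSTEM_DAMAGE = 4    # priority four
    DISRUPTION = 5       # priority five


def order_actions(actions):
    """Sort (priority, description) pairs, most urgent first. The sort
    is stable, so actions of equal priority keep their original order."""
    return sorted(actions, key=lambda action: action[0])
```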
Part of handling an incident is being prepared to respond before
the incident occurs. This includes establishing a suitable level
of protections so that if the incident becomes severe, the damage
which can occur is limited. Protection includes preparing
incident handling guidelines or a contingency response plan for
your organization or site. Written plans eliminate much of the
ambiguity which occurs during an incident, and will lead to a more
appropriate and thorough set of responses. Second, part of
protection is preparing a method of notification so you will know
who to call and how to contact them. For example, every member of
the Department of Energy's CIAC Team carries a card with every
other team member's work and home phone numbers, as well as pager
numbers. Third, your organization or site should establish backup
procedures for every machine and system. Having backups
eliminates much of the threat of even a severe incident, since
backups preclude serious data loss. Fourth, you should set up
secure systems. This involves eliminating vulnerabilities,
establishing an effective password policy, and other procedures,
all of which will be explained later in this document. Finally,
conducting training activities is part of protection. It is
important, for example, to conduct "dry runs," in which your
computer security personnel, system administrators, and managers
simulate handling an incident.
5.1.4 Local Policies and Regulations Providing Guidance
Any plan for responding to security incidents should be guided by
local policies and regulations. Government and private sites that
deal with classified material have specific rules that they must
follow.
The policies your site makes about how it responds to incidents
(as discussed in sections 2.4 and 2.5) will shape your response.
For example, it may make little sense to create mechanisms to
monitor and trace intruders if your site does not plan to take
action against the intruders if they are caught. Other
organizations may have policies that affect your plans. Telephone
companies often release information about telephone traces only to
law enforcement agencies.
Section 5.5 also notes that if any legal action is planned, there
are specific guidelines that must be followed to make sure that
any information collected can be used as evidence.
5.2.1 Is It Real?
This stage involves determining the exact problem. Of course
many, if not most, signs often associated with virus infections,
system intrusions, etc., are simply anomalies such as hardware
failures. To assist in identifying whether there really is an
incident, it is usually helpful to obtain and use any detection
software which may be available. For example, widely available
software packages can greatly assist someone who thinks there may
be a virus in a Macintosh computer. Audit information is also
extremely useful, especially in determining whether there is a
network attack. It is extremely important to obtain a system
snapshot as soon as one suspects that something is wrong. Many
incidents cause a dynamic chain of events to occur, and an initial
system snapshot may do more good in identifying the problem and
any source of attack than most other actions which can be taken at
this stage. Finally, it is important to start a log book.
Recording system events, telephone conversations, time stamps,
etc., can lead to a more rapid and systematic identification of
the problem, and is the basis for subsequent stages of incident
handling.
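A paper log book is often the safest choice while a system is
suspect, but where an electronic log is kept (ideally on a separate,
trusted machine), consistent time stamps and append-only writes make
it far more useful later. The following sketch assumes a simple
one-line-per-entry text format; the function names and the format
are illustrative.

```python
from datetime import datetime, timezone


def log_entry(text, author, when=None):
    """Format one log-book line with a UTC time stamp. A timezone-aware
    datetime is expected; a naive one is treated as local time."""
    if when is None:
        when = datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    return f"{stamp} [{author}] {text}"


def append_entry(path, text, author, when=None):
    """Append an entry; existing entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as book:
        book.write(log_entry(text, author, when) + "\n")
```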
There are certain indications or "symptoms" of an incident which
deserve special attention:
o System crashes.
o New user accounts (e.g., the account RUMPLESTILTSKIN
has inexplicably been created), or high activity on
an account that has had virtually no activity for
months).
o New files (usually with novel or strange file names,
such as data.xx or k).
o Accounting discrepancies (e.g., in a UNIX system you
might notice that the accounting file called
/usr/admin/lastlog has shrunk, something that should
make you very suspicious that there may be an
intruder).
o Changes in file lengths or dates (e.g., a user should
be suspicious if he/she observes that the .EXE files in
an MS-DOS computer have inexplicably grown
by over 1800 bytes).
o Attempts to write to system (e.g., a system manager
notices that a privileged user in a VMS system is
attempting to alter RIGHTSLIST.DAT).
o Data modification or deletion (e.g., files start to
disappear).
o Denial of service (e.g., a system manager and all
other users become locked out of a UNIX system, which
has been changed to single user mode).
o Unexplained, poor system performance (e.g., system
response time becomes unusually slow).
o Anomalies (e.g., "GOTCHA" is displayed on a display
terminal or there are frequent unexplained "beeps").
o Suspicious probes (e.g., there are numerous
unsuccessful login attempts from another node).
o Suspicious browsing (e.g., someone becomes a root user
on a UNIX system and accesses file after file in one
user's account, then another's).
None of these indications is absolute "proof" that an incident is
occurring, nor are all of these indications normally observed when
an incident occurs. If you observe any of these indications,
however, it is important to suspect that an incident might be
occurring, and act accordingly. There is no formula for
determining with 100 percent accuracy that an incident is
occurring (possible exception: when a virus detection package
indicates that your machine has the nVIR virus and you confirm
this by examining contents of the nVIR resource in your Macintosh
computer, you can be very certain that your machine is infected).
It is best at this point to collaborate with other technical and
computer security personnel to make a decision as a group about
whether an incident is occurring.
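Some of the symptoms listed above, such as numerous unsuccessful
login attempts from another node, can be counted from audit records.
The sketch below assumes the records have already been reduced to
(source_host, succeeded) pairs; that record format, the function
name, and the threshold of five failures are all assumptions. As
noted above, a match is a reason to look further, not proof of an
incident.

```python
from collections import Counter


def suspicious_sources(attempts, threshold=5):
    """attempts: iterable of (source_host, succeeded) pairs.
    Return hosts with at least `threshold` failed logins,
    most active first."""
    failures = Counter(host for host, succeeded in attempts if not succeeded)
    return [host for host, count in failures.most_common() if count >= threshold]
```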
Along with the identification of the incident is the evaluation of
the scope and impact of the problem. It is important to correctly
identify the boundaries of the incident in order to effectively
deal with it. In addition, the impact of an incident will
determine its priority in allocating resources to deal with the
event. Without an indication of the scope and impact of the
event, it is difficult to determine a correct response.
In order to identify the scope and impact, a set of criteria
should be defined which is appropriate to the site and to the type
of connections available. Some of the issues are:
o Is this a multi-site incident?
o Are many computers at your site affected by this
incident?
o Is sensitive information involved?
o What is the entry point of the incident (network,
phone line, local terminal, etc.)?
o Is the press involved?
o What is the potential damage of the incident?
o What is the estimated time to close out the incident?
o What resources could be required
to handle the incident?
5.3 Possible Types of Notification
When you have confirmed that an incident is occurring, the
appropriate personnel must be notified. Who is notified, and how,
is very important in keeping the event under control from both a
technical and an emotional standpoint.
First of all, any notification to either local or off-site
personnel must be explicit. This requires that any statement (be
it an electronic mail message, phone call, or fax) provides
information about the incident that is clear, concise, and fully
qualified. When you are notifying others who will help you to
handle an event, a "smoke screen" will only divide the effort and
create confusion. If a division of labor is suggested, it is
helpful to provide information to each section about what is being
accomplished in other efforts. This will not only reduce
duplication of effort, but allow people working on parts of the
problem to know where to obtain other information that would help
them resolve a part of the incident.
Another important consideration when communicating about the
incident is to be factual. Attempting to hide aspects of the
incident by providing false or incomplete information may not only
prevent a successful resolution to the incident, but may even
worsen the situation. This is especially true when the press is
involved. When an incident severe enough to gain press attention
is ongoing, it is likely that any false information you provide
will not be substantiated by other sources. This will reflect
badly on the site and may create enough ill-will between the site
and the press to damage the site's public relations.
5.3.3 Choice of Language
The choice of language used when notifying people about the
incident can have a profound effect on the way that information is
received. When you use emotional or inflammatory terms, you raise
the expectations of damage and negative outcomes of the incident.
It is important to remain calm both in written and spoken
communications.
Another issue associated with the choice of language is the
notification to non-technical or off-site personnel. It is
important to accurately describe the incident without undue alarm
or confusing messages. While it is more difficult to describe the
incident to a non-technical audience, it is often more important.
A non-technical description may be required for upper-level
management, the press, or law enforcement liaisons. The
importance of these notifications cannot be overestimated and may
make the difference between handling the incident properly and
escalating to some higher level of damage.
5.3.4 Notification of Individuals
o Point of Contact (POC) people (Technical, Administrative,
Response Teams, Investigative, Legal, Vendors, Service
providers), and which POCs are visible to whom.
o Wider community (users).
o Other sites that might be affected.
Finally, there is the question of who should be notified during
and after the incident. There are several classes of individuals
that need to be considered for notification. These are the
technical personnel, administration, appropriate response teams
(such as CERT or CIAC), law enforcement, vendors, and other
service providers. These issues are important for the central
point of contact, since that is the person responsible for the
actual notification of others (see section 5.3.6 for further
information). A list of people in each of these categories is an
important time saver for the POC during an incident. It is much
more difficult to find an appropriate person during an incident
when many urgent events are ongoing.
In addition to the people responsible for handling part of the
incident, there may be other sites affected by the incident (or
perhaps simply at risk from the incident). A wider community of
users may also benefit from knowledge of the incident. Often, a
report of the incident once it is closed out is appropriate for
publication to the wider user community.
5.3.5 Public Relations - Press Releases
One of the most important issues to consider is when to release
information to the general public through the press, who should
release it, and how much to release. There are many factors to
consider in making this decision.
First and foremost, if a public relations office exists for the
site, it is important to use this office as liaison to the press.
The public relations office is trained in the type and wording of
information released, and will help to assure that the image of
the site is protected during and after the incident (if possible).
A public relations office has the advantage that you can
communicate candidly with its staff, and it provides a buffer
between the constant press attention and the need of the POC to
maintain control over the incident.
If a public relations office is not available, the information
released to the press must be carefully considered. If the
information is sensitive, it may be advantageous to provide only
minimal or overview information to the press. It is quite
possible that any information provided to the press will be
quickly reviewed by the perpetrator of the incident. On the other
hand, as discussed above, misleading the press can often backfire
and cause more damage than releasing sensitive information.
While it is difficult to determine in advance what level of detail
to provide to the press, some guidelines to keep in mind are:
o Keep the technical level of detail low. Detailed
information about the incident may provide enough
information for copy-cat events or even damage the
site's ability to prosecute once the event is over.
o Keep the speculation out of press statements.
Speculation about who is causing the incident or about
the motives is very likely to be in error and may cause
an inflamed view of the incident.
o Work with law enforcement professionals to assure that
evidence is protected. If prosecution is involved,
assure that the evidence collected is not divulged to
the press.
o Try not to be forced into a press interview before you are
prepared. The popular press is famous for the "2am"
interview, where the hope is to catch the interviewee off
guard and obtain information otherwise not available.
o Do not allow the press attention to detract from the
handling of the event. Always remember that the successful
closure of an incident is of primary importance.
5.3.6 Who Needs to Get Involved?
There now exist a number of incident response teams (IRTs) such
as the CERT and the CIAC. (See sections 188.8.131.52.1 and 184.108.40.206.4.)
Teams exist for many major government agencies and large
corporations. If such a team is available for your site, the
notification of this team should be of primary importance during
the early stages of an incident. These teams are responsible for
coordinating computer security incidents over a range of sites and
larger entities. Even if the incident is believed to be contained
to a single site, it is possible that the information available
through a response team could help in closing out the incident.
In setting up a site policy for incident handling, it may be
desirable to create an incident handling team (IHT), much like
those teams that already exist, that will be responsible for
handling computer security incidents for the site (or
organization). If such a team is created, it is essential that
communication lines be opened between this team and other IHTs.
Once an incident is under way, it is difficult to open a trusted
dialogue with other IHTs if none has existed before.
A major topic still untouched here is how to actually respond to an
event. The response to an event will fall into the general
categories of containment, eradication, recovery, and follow-up.
The purpose of containment is to limit the extent of an attack.
For example, it is important to limit the spread of a worm attack
on a network as quickly as possible. An essential part of
containment is decision making (i.e., determining whether to shut
a system down, to disconnect from a network, to monitor system or
network activity, to set traps, to disable functions such as
remote file transfer on a UNIX system, etc.). Sometimes this
decision is trivial: shut the system down if the system is
classified or sensitive, or if proprietary information is at risk.
In other cases, it is worthwhile to risk having some damage to the
system if keeping the system up might enable you to identify an
intruder.
The third stage, containment, should involve carrying out
predetermined procedures. Your organization or site should, for
example, define acceptable risks in dealing with an incident, and
should prescribe specific actions and strategies accordingly.
Finally, notification of cognizant authorities should occur during
Once an incident has been detected, it is important to first think
about containing the incident. Once the incident has been
contained, it is now time to eradicate the cause. Software may be
available to help you in this effort. For example, eradication
software is available to eliminate most viruses which infect small
systems. If any bogus files have been created, delete them at
this point. In the case of virus infections, it is
important to clean and reformat any disks containing infected
files. Finally, ensure that all backups are clean. Many systems
infected with viruses become periodically reinfected simply
because people do not systematically eradicate the virus from
Once the cause of an incident has been eradicated, the recovery
phase defines the next stage of action. The goal of recovery is
to return the system to normal. In the case of a network-based
attack, it is important to install patches for any operating
system vulnerability which was exploited.
One of the most important stages of responding to incidents is
also the most often omitted---the follow-up stage. This stage is
important because it helps those involved in handling the incident
develop a set of "lessons learned" (see section 6.3) to improve
future performance in such situations. This stage also provides
information which justifies an organization's computer security
effort to management, and yields information which may be
essential in legal proceedings.
The most important element of the follow-up stage is performing a
postmortem analysis. Exactly what happened, and at what times?
How well did the staff involved with the incident perform? What
kind of information did the staff need quickly, and how could they
have gotten that information as soon as possible? What would the
staff do differently next time? A follow-up report is valuable
because it provides a reference to be used in case of other
similar incidents. Creating a formal chronology of events
(including time stamps) is also important for legal reasons.
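As an illustration only, the chronology described above can be
assembled by merging time-stamped records from several sources and
sorting them. The sketch below is in Python; the record layout
(time stamp, source, description), the function names, and the sample
events are assumptions made for this example and are not part of any
prescribed procedure.

```python
from datetime import datetime, timezone

def build_chronology(*sources):
    """Merge (timestamp, source, description) records and sort by time."""
    merged = [record for source in sources for record in source]
    return sorted(merged, key=lambda record: record[0])

def format_chronology(records):
    """Render each record as one line with an ISO 8601 time stamp."""
    return [f"{ts.isoformat()}  [{src}]  {text}" for ts, src, text in records]

# Hypothetical sample records, for illustration only.
phone_log = [
    (datetime(1991, 7, 1, 14, 5, tzinfo=timezone.utc),
     "phone", "Called response team"),
]
system_log = [
    (datetime(1991, 7, 1, 13, 50, tzinfo=timezone.utc),
     "audit", "Unexpected login on host A"),
    (datetime(1991, 7, 1, 14, 30, tzinfo=timezone.utc),
     "audit", "Remote file transfer disabled"),
]

chronology = build_chronology(phone_log, system_log)
for line in format_chronology(chronology):
    print(line)
```

Keeping all time stamps in one time zone (UTC here) avoids ambiguity
when records from different systems are combined.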
Similarly, it is important to obtain, as quickly as possible, a
monetary estimate of the damage the incident caused in terms of
any loss of software and files, hardware damage, and manpower
costs to restore altered files, reconfigure affected systems, and
so forth. This estimate may become the basis for subsequent
prosecution activity by the FBI, the U.S. Attorney General's
5.4.1 What Will You Do?
o Restore control.
o Relation to policy.
o Which level of service is needed?
o Monitor activity.
o Constrain or shut down system.
5.4.2 Consider Designating a "Single Point of Contact"
When an incident is under way, a major issue is deciding who is in
charge of coordinating the activity of the multitude of players.
A major mistake that can be made is to have a number of "points of
contact" (POC) that are not pulling their efforts together. This
will only add to the confusion of the event, and will probably
lead to wasted or ineffective effort.
The single point of contact may or may not be the person "in
charge" of the incident. There are two distinct roles to fill
when deciding who shall be the point of contact and the person in
charge of the incident. The person in charge will make decisions
as to the interpretation of policy applied to the event. The
responsibility for the handling of the event falls onto this
person. In contrast, the point of contact must coordinate the
effort of all the parties involved with handling the event.
The point of contact must be a person with the technical expertise
to successfully coordinate the effort of the system managers and
users involved in monitoring and reacting to the attack. Often
the management structure of a site is such that the administrator
of a set of resources is not a technically competent person with
regard to handling the details of the operations of the computers,
but is ultimately responsible for the use of these resources.
Another important function of the POC is to maintain contact with
law enforcement and other external agencies (such as the CIA, DoD,
U.S. Army, or others) to assure that multi-agency involvement
Finally, if legal action in the form of prosecution is involved,
the POC may be able to speak for the site in court. The
alternative is to have multiple witnesses that will be hard to
coordinate in a legal sense, and will weaken any case against the
attackers. A single POC may also be the single person in charge
of evidence collected, which will keep the number of people
accounting for evidence to a minimum. As a rule of thumb, the
more people that touch a potential piece of evidence, the greater
the possibility that it will be inadmissible in court. The
section below (Legal/Investigative) will provide more details for
consideration on this topic.
5.5.1 Establishing Contacts with Investigative Agencies
It is important to establish contacts with personnel from
investigative agencies such as the FBI and Secret Service as soon
as possible, for several reasons. Local law enforcement and local
security offices or campus police organizations should also be
informed when appropriate. A primary reason is that once a major
attack is in progress, there is little time to call various
personnel in these agencies to determine exactly who the correct
point of contact is. Another reason is that it is important to
cooperate with these agencies in a manner that will foster a good
working relationship, and that will be in accordance with the
working procedures of these agencies. Knowing the working
procedures in advance and the expectations of your point of
contact is a big step in this direction. For example, it is
important to gather evidence that will be admissible in a court of
law. If you don't know in advance how to gather admissible
evidence, your efforts to collect evidence during an incident are
likely to be of no value to the investigative agency with which
you deal. A final reason for establishing contacts as soon as
possible is that it is impossible to know the particular agency
that will assume jurisdiction in any given incident. Making
contacts and finding the proper channels early will make
responding to an incident go considerably more smoothly.
If your organization or site has a legal counsel, you need to
notify this office soon after you learn that an incident is in
progress. At a minimum, your legal counsel needs to be involved
to protect the legal and financial interests of your site or
organization. There are many legal and practical issues, a few of
which are:
1. Whether your site or organization is willing to risk
negative publicity or exposure to cooperate with legal
2. Downstream liability--if you leave a compromised system
as is so it can be monitored and another computer is damaged
because the attack originated from your system, your site or
organization may be liable for damages incurred.
3. Distribution of information--if your site or organization
distributes information about an attack in which another
site or organization may be involved or the vulnerability
in a product that may affect ability to market that
product, your site or organization may again be liable
for any damages (including damage of reputation).
4. Liabilities due to monitoring--your site or organization
may be sued if users at your site or elsewhere discover
that your site is monitoring account activity without
Unfortunately, there are no clear precedents yet on the
liabilities or responsibilities of organizations involved in a
security incident or who might be involved in supporting an
investigative effort. Investigators will often encourage
organizations to help trace and monitor intruders -- indeed, most
investigators cannot pursue computer intrusions without extensive
support from the organizations involved. However, investigators
cannot provide protection from liability claims, and these kinds
of efforts may drag out for months and may take considerable effort.
On the other side, an organization's legal counsel may advise
extreme caution and suggest that tracing activities be halted and
an intruder shut out of the system. This in itself may not
provide protection from liability, and may prevent investigators
from identifying anyone.
The balance between supporting investigative activity and limiting
liability is tricky; you'll need to consider the advice of your
counsel and the damage the intruder is causing (if any) in making
your decision about what to do during any particular incident.
Your legal counsel should also be involved in any decision to
contact investigative agencies when an incident occurs at your
site. The decision to coordinate efforts with investigative
agencies is most properly that of your site or organization.
Involving your legal counsel will also foster the multi-level
coordination between your site and the particular investigative
agency involved, which in turn results in an efficient division of
labor. Another result is that you are likely to obtain guidance
that will help you avoid future legal mistakes.
Finally, your legal counsel should evaluate your site's written
procedures for responding to incidents. It is essential to obtain
a "clean bill of health" from a legal perspective before you
actually carry out these procedures.
5.5.2 Formal and Informal Legal Procedures
One of the most important considerations in dealing with
investigative agencies is verifying that the person who calls
asking for information is a legitimate representative from the
agency in question. Unfortunately, many well-intentioned people
have unknowingly leaked sensitive information about incidents,
allowed unauthorized people into their systems, etc., because a
caller has masqueraded as an FBI or Secret Service agent. A
similar consideration is using a secure means of communication.
Because many network attackers can easily reroute electronic mail,
avoid using electronic mail to communicate with other agencies (as
well as others dealing with the incident at hand). Non-secured
phone lines (e.g., the phones normally used in the business world)
are also frequent targets for tapping by network intruders, so be
There is no established set of rules for responding to an incident
when the U.S. Federal Government becomes involved. Except by
court order, no agency can force you to monitor, to disconnect
from the network, to avoid telephone contact with the suspected
attackers, etc. As discussed in section 5.5.1, you should
consult your legal counsel on the matter, especially before
taking an action that your organization has never taken. The
particular agency involved may ask you to leave an attacked
machine on and to monitor activity on this machine, for example.
Your complying with this request will ensure continued cooperation
of the agency--usually the best route towards finding the source
of the network attacks and, ultimately, terminating these attacks.
Additionally, you may need some information or a favor from the
agency involved in the incident. You are likely to get what you
need only if you have been cooperative. Of particular importance
is avoiding unnecessary or unauthorized disclosure of information
about the incident, including any information furnished by the
agency involved. The trust between your site and the agency
hinges upon your ability to avoid compromising the case the agency
will build; keeping "tight lipped" is imperative.
Sometimes your needs and the needs of an investigative agency will
differ. Your site may want to get back to normal business by
closing an attack route, but the investigative agency may want you
to keep this route open. Similarly, your site may want to close a
compromised system down to avoid the possibility of negative
publicity, but again the investigative agency may want you to
continue monitoring. When there is such a conflict, there may be
a complex set of tradeoffs (e.g., interests of your site's
management, amount of resources you can devote to the problem,
jurisdictional boundaries, etc.). An important guiding principle
is related to what might be called "Internet citizenship" [22,
IAB89, 23] and its responsibilities. Your site can shut a system
down, and this will relieve you of the stress, resource demands,
and danger of negative exposure. The attacker, however, is likely
to simply move on to another system, temporarily leaving others
blind to the attacker's intention and actions until another path
of attack can be detected. Provided that there is no damage to
your systems and others, the most responsible course of action is
to cooperate with the participating agency by leaving your
compromised system on. This will allow monitoring (and,
ultimately, the possibility of terminating the source of the
threat to systems just like yours). On the other hand, if there
is damage to computers illegally accessed through your system, the
choice is more complicated: shutting down the intruder may prevent
further damage to systems, but might make it impossible to track
down the intruder. If there has been damage, the decision about
whether it is important to leave systems up to catch the intruder
should involve all the organizations affected. Further
complicating the issue of network responsibility is the
consideration that if you do not cooperate with the agency
involved, you will be less likely to receive help from that agency
in the future.
5.6 Documentation Logs
When you respond to an incident, document all details related to the
incident. This will provide valuable information to yourself and
others as you try to unravel the course of events. Documenting all
details will ultimately save you time. If you don't document every
relevant phone call, for example, you are likely to forget a good
portion of information you obtain, requiring you to contact the
source of information once again. This wastes your time and that of
others, something you can ill afford. At the same time, recording
details will provide evidence for prosecution efforts, provided the
case moves in this direction. Documenting an incident also will help
you perform a final assessment of damage (something your management
as well as law enforcement officers will want to know), and will
provide the basis for a follow-up analysis in which you can engage in
a valuable "lessons learned" exercise.
During the initial stages of an incident, it is often infeasible to
determine whether prosecution is viable, so you should document as if
you are gathering evidence for a court case. At a minimum, you
should record:
o All system events (audit records).
o All actions you take (time tagged).
o All phone conversations (including the person with whom
you talked, the date and time, and the content of the conversation).
The most straightforward way to maintain documentation is keeping a
log book. This allows you to go to a centralized, chronological
source of information when you need it, instead of requiring you to
page through individual sheets of paper. Much of this information is
potential evidence in a court of law. Thus, when you initially
suspect that an incident will result in prosecution or when an
investigative agency becomes involved, you need to regularly (e.g.,
every day) turn in photocopied, signed copies of your logbook (as
well as media you use to record system events) to a document
custodian who can store these copied pages in a secure place (e.g., a
safe). When you submit information for storage, you should in return
receive a signed, dated receipt from the document custodian. Failure
to observe these procedures can result in invalidation of any
evidence you obtain in a court of law.
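Where notes are also kept electronically, each entry can be time
tagged and chained to the previous entry with a cryptographic digest,
so that later alteration of an earlier entry is detectable. The
following Python sketch is an illustration under stated assumptions:
the entry fields, the function names, the sample entries, and the
choice of SHA-256 are all hypothetical. It supplements, and does not
replace, the signed log book and document custodian described above,
and it makes no claim about admissibility in court.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder digest that precedes the first entry

def _digest(fields, previous_digest):
    """Hash an entry's fields together with the previous entry's digest."""
    payload = json.dumps(fields, sort_keys=True) + previous_digest
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_entry(log, kind, detail, when=None):
    """Append a time-tagged entry chained to the one before it."""
    when = when or datetime.now(timezone.utc)
    fields = {"time": when.isoformat(), "kind": kind, "detail": detail}
    previous = log[-1]["digest"] if log else GENESIS
    entry = dict(fields, digest=_digest(fields, previous))
    log.append(entry)
    return entry

def verify_log(log):
    """Return True only if every entry still matches its chained digest."""
    previous = GENESIS
    for entry in log:
        fields = {key: entry[key] for key in ("time", "kind", "detail")}
        if entry["digest"] != _digest(fields, previous):
            return False
        previous = entry["digest"]
    return True

# Hypothetical entries covering the three kinds of record listed above.
log = []
append_entry(log, "system", "audit record: login from unknown host")
append_entry(log, "action", "disabled remote file transfer")
append_entry(log, "phone", "spoke with site contact; agreed to keep monitoring")
print(verify_log(log))
```

Because each digest depends on the one before it, changing any earlier
entry causes verify_log to fail for that entry and all later ones.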
6. Establishing Post-Incident Procedures
In the wake of an incident, several actions should take place. These
actions can be summarized as follows:
1. An inventory should be taken of the systems' assets,
i.e., a careful examination should determine how the
system was affected by the incident,
2. The lessons learned as a result of the incident
should be included in a revised security plan to
prevent the incident from re-occurring,
3. A new risk analysis should be developed in light of the
incident,
4. An investigation and prosecution of the individuals
who caused the incident should commence, if it is
All four steps should provide feedback to the site security policy
committee, leading to prompt re-evaluation and amendment of the
policy.
6.2 Removing Vulnerabilities
Removing all vulnerabilities once an incident has occurred is
difficult. The key to removing vulnerabilities is knowledge and
understanding of the breach. In some cases, it is prudent to remove
all access or functionality as soon as possible, and then restore
normal operation in limited stages. Bear in mind that removing all
access while an incident is in progress will obviously notify all
users, including the alleged problem users, that the administrators
are aware of a problem; this may have a deleterious effect on an
investigation. However, allowing an incident to continue may also
open the likelihood of greater damage, loss, aggravation, or
liability (civil or criminal).
If it is determined that the breach occurred due to a flaw in the
systems' hardware or software, the vendor (or supplier) and the CERT
should be notified as soon as possible. Including relevant telephone
numbers (also electronic mail addresses and fax numbers) in the site
security policy is strongly recommended. To aid prompt
acknowledgment and understanding of the problem, the flaw should be
described in as much detail as possible, including details about how
to exploit the flaw.
As soon as the breach has occurred, the entire system and all its
components should be considered suspect. System software is the most
probable target. Preparation is key to recovering from a possibly
tainted system. This includes checksumming all tapes from the vendor
using a checksum algorithm which (hopefully) is resistant to
tampering. (See sections 220.127.116.11, 18.104.22.168.) Assuming original
vendor distribution tapes are available, an analysis of all system
files should commence, and any irregularities should be noted and
referred to all parties involved in handling the incident. It can be
very difficult, in some cases, to decide which backup tapes to
recover from; consider that the incident may have continued for
months or years before discovery, and that the suspect may be an
employee of the site, or otherwise have intimate knowledge of or
access to the systems. In all cases, the pre-incident preparation
will determine what recovery is possible. In the worst case,
restoration from the original manufacturers' media and a
re-installation of the systems
will be the most prudent solution.
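The checksum comparison described above can be sketched as follows.
This is a minimal Python illustration, not a definitive tool: the
function names and the temporary demonstration files are hypothetical,
and SHA-256 is an assumed choice of algorithm. In practice the
baseline must be taken from trusted media before an incident and
stored where an intruder cannot alter it.

```python
import hashlib
import os
import tempfile

def file_sha256(path, chunk_size=65536):
    """Compute the SHA-256 digest of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot(root):
    """Map each regular file under root (by relative path) to its digest."""
    result = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(directory, name)
            if os.path.isfile(full) and not os.path.islink(full):
                result[os.path.relpath(full, root)] = file_sha256(full)
    return result

def compare(baseline, current):
    """Report files added, removed, or changed relative to the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(path for path in baseline
                          if path in current and baseline[path] != current[path]),
    }

def _write(path, data):
    with open(path, "wb") as handle:
        handle.write(data)

# Demonstration on throwaway files standing in for system files.
with tempfile.TemporaryDirectory() as root:
    _write(os.path.join(root, "login"), b"original contents")
    _write(os.path.join(root, "ls"), b"unchanged contents")
    baseline = snapshot(root)                  # taken before the incident
    _write(os.path.join(root, "login"), b"modified contents")
    _write(os.path.join(root, "extra"), b"unexpected file")
    report = compare(baseline, snapshot(root))

print(report)
```

Any file listed under "added" or "changed" is an irregularity to be
noted and referred to the parties handling the incident.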
Review the lessons learned from the incident and always update the
policy and procedures to reflect changes necessitated by the
incident.
6.2.1 Assessing Damage
Before cleanup can begin, the actual system damage must be
discerned. This can be quite time consuming, but should yield
some insight into the nature of the incident, and aid
investigation and prosecution. It is best to compare against
previous backups or original tapes when possible; advance
preparation is
the key. If the system supports centralized logging (most do), go
back over the logs and look for abnormalities. If process
accounting and connect time accounting are enabled, look for
patterns of system usage. To a lesser extent, disk usage may shed
light on the incident. Accounting can provide much helpful
information in an analysis of an incident and subsequent
prosecution.
Once the damage has been assessed, it is necessary to develop a
plan for system cleanup. In general, bringing up services in the
order of demand to allow a minimum of user inconvenience is the
best practice. Understand that the proper recovery procedures for
the system are extremely important and should be specific to the
It may be necessary to go back to the original distributed tapes
and recustomize the system. To facilitate this worst case
scenario, a record of the original systems setup and each
customization change should be kept current with each change to
the system.
6.2.3 Follow up
Once you believe that a system has been restored to a "safe"
state, it is still possible that holes and even traps could be
lurking in the system. In the follow-up stage, the system should
be monitored for items that may have been missed during the
cleanup stage. It would be prudent to utilize some of the tools
mentioned in section 22.214.171.124 (e.g., COPS) as a start. Remember,
these tools don't replace continual system monitoring and good
systems administration procedures.
6.2.4 Keep a Security Log
As discussed in section 5.6, a security log can be most valuable
during this phase of removing vulnerabilities. There are two
considerations here. First, keep logs of the procedures
that have been used to make the system secure again. This should
include command procedures (e.g., shell scripts) that can be run
on a periodic basis to recheck the security. Second, keep logs of
important system events. These can be referenced when trying to
determine the extent of the damage of a given incident.
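As one example of a procedure that could be rerun periodically to
recheck security, the Python sketch below lists regular files that
any user may write to. It is an illustration only: the function name
and demonstration files are hypothetical, it assumes POSIX-style
permission bits, and a real recheck would cover many more conditions
than this single one.

```python
import os
import stat
import tempfile

def world_writable(root):
    """Return sorted paths of regular files under root writable by anyone."""
    findings = []
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                findings.append(path)
    return sorted(findings)

# Demonstration on throwaway files with explicitly set permissions.
with tempfile.TemporaryDirectory() as root:
    for name, mode in (("tight", 0o644), ("loose", 0o666)):
        path = os.path.join(root, name)
        with open(path, "w") as handle:
            handle.write("example")
        os.chmod(path, mode)
    found = [os.path.basename(path) for path in world_writable(root)]

print(found)
```

Recording the output of each run in the security log gives a history
against which later runs can be compared.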
6.3 Capturing Lessons Learned
6.3.1 Understand the Lesson
After an incident, it is prudent to write a report describing the
incident, method of discovery, correction procedure, monitoring
procedure, and a summary of lessons learned. This will aid in the
clear understanding of the problem. Remember, it is difficult to
learn from an incident if you don't understand the source.
6.3.2.1 Other Security Devices, Methods
Security is a dynamic, not static, process. Each site depends
on the nature of the security available to it, and on the array
of devices and methods that will help promote security.
Keeping up with the security area of the computer industry and
its methods will help a security manager take advantage of the
latest technology.
6.3.2.2 Repository of Books, Lists, Information Sources
Keep an on-site collection of books, lists, information
sources, etc., as guides and references for securing the
system. Keep this collection up to date. Remember, as systems
change, so do security methods and problems.
6.3.2.3 Form a Subgroup
Form a subgroup of system administration personnel that will be
the core security staff. This will allow discussions of
security problems and multiple views of the site's security
issues. This subgroup can also act to develop the site
security policy and make suggested changes as necessary to
ensure site security.
6.4 Upgrading Policies and Procedures
6.4.1 Establish Mechanisms for Updating Policies, Procedures,
If an incident is based on poor policy, then unless the policy is
changed, one is doomed to repeat the past. Once a site has
recovered from an incident, site policy and procedures should be
reviewed to encompass changes to prevent similar incidents. Even
without an incident, it would be prudent to review policies and
procedures on a regular basis. Reviews are imperative due to
today's changing computing environments.
6.4.2 Problem Reporting Procedures
A problem reporting procedure should be implemented to describe,
in detail, the incident and the solutions to the incident. Each
incident should be reviewed by the site security subgroup to allow
understanding of the incident with possible suggestions to the
site policy and procedures.