A SHORT NOTE ON RELIABILITY OF SECURITY SYSTEMS
Jozwiak Ireneusz J., Laskowski Wojciech
Wroclaw University of Technology, Wroclaw, Poland
Keywords
computer security, reliability, computer incidents
Abstract
Telecommunication systems are becoming a key component of critical infrastructure, and one of their main elements is the computer system. Organizations that may be involved in crisis management (e.g. government agencies) need to know the consequences of security weaknesses in their systems. Moreover, they should have a tool for analysing the results of decisions made in a security context. The following question is often raised: why do security systems fail? To answer it, this paper discusses aspects of reliability and analyses security systems from that point of view. We hope that this approach will allow us to obtain some characteristics of the occurrence of security incidents, and we hope to use our results to build metrics for security attributes. In addition, we present the thesis that predicting the occurrence of incidents is impossible, so we should focus on registering the types of incidents. On such a foundation we can formulate conclusions about weaknesses in the configuration or administration of information systems. In our research we have observed that, for some classes of information systems, availability incidents are the most dangerous. We conclude that only the use of technologies with good reliability characteristics can lead to a solution of this problem.
1. Introduction
The problem of the reliability of security systems was discussed by Anderson in several publications, e.g. [1] or [2]. Another example is paper [11], where system reliability is viewed from a game-theoretical perspective; this work can easily be applied to the security domain. One of the most popular practical models of security systems is the so-called 'defence-in-depth' model [3]. Such a model directs our attention to the basic models of system reliability: serial or parallel systems [4], [7]. Many components of security systems can be characterized by one of these structures. For example, an access control subsystem, a firewall, an IDS and antivirus software can be considered as a mixed structure (Figure 1) with three serial elements and one element parallel to this structure.
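The mixed structure described above can be illustrated with the basic serial and parallel reliability formulas. The following is a minimal sketch, assuming independent element failures; the numerical reliabilities and the names `series`, `parallel`, `chain` and `redundant_branch` are hypothetical and are not taken from the paper.

```python
from math import prod


def series(reliabilities):
    """Serial structure: every element must work, so reliabilities multiply."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Parallel structure: the system works if at least one element works."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical element reliabilities (placeholders, not measured values).
chain = [0.99, 0.95, 0.90]   # three serial elements
redundant_branch = 0.80      # one element parallel to the serial chain

mixed = parallel([series(chain), redundant_branch])
print(f"serial chain alone: {series(chain):.5f}")
print(f"mixed structure:    {mixed:.5f}")
```

With these placeholder values the parallel element raises the reliability of the whole structure above that of the serial chain alone, which is the usual motivation for redundancy.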
The use of reliability techniques influences security systems. A good example is the problem of placing an IDS in redundant networks [10]. Another example is operating systems: a very large number of modules, software applications or services induces many security problems. There are many areas where security vulnerabilities are present, e.g. the authorization subsystem, remote services, etc. There is a set of security holes which can be viewed as a serial or parallel structure (in the sense of basic reliability models). In this paper we present some empirical data from our research on incidents connected with the security of information systems.
Figure 1. The scheme of a typical security system - an example
2. Security incidents
Over a period of approximately two years we observed three kinds of security systems. These systems (their models are presented in Figure 2) can be characterized as follows:
1. System A - a stand-alone system, not connected to any network; access to this system is limited to a small number of users.
2. System B - a specialized networked system, separated from public networks (several workstations).
3. System C - a networked system, connected to a public operator's network (a dozen workstations).
[Figure 2 diagram; legible panel labels: a) System A, b) System B]
Figure 2. The models of observed information systems
The physical structures of these systems are less important for our research. Moreover, the size, role and location of these systems are intentionally not disclosed at this time. Taking into consideration the three attributes of information (confidentiality, integrity and availability), these systems were observed in order to register specific incidents: virus incidents in System A and System C, and availability incidents in System B. By availability incidents we understand breaks in the proper working of the system, e.g. loss of communications or servicing of elements of the network infrastructure. The preliminary results are presented in Table 1.
Table 1. The number of observed incidents
Type of system | Number of virus incidents | Number of availability incidents | Period of observations
System A | 2 | - | 1 year
System B | - | 137 | 2 years
System C | 41 | - | 1.5 years
In System A we noticed two different kinds of macro viruses [12]. The virus incidents in System C were connected with worms (mainly from the Sasser 'family'), trojans or loggers [12]. The most interesting observations are connected with System B, where over 130 incidents were noticed. A reliability-analysis methodology was therefore used to describe the characteristics of events in this system. We are interested in the mean time between incidents and in the frequencies of incidents.
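The two quantities of interest can be computed directly from a log of incident timestamps. The sketch below is only an illustration: the function name `incident_statistics` and the sample timestamps are hypothetical and are not the data behind Table 1, and the frequency is assumed to be the reciprocal of the mean time between incidents.

```python
from datetime import datetime
from statistics import mean, stdev


def incident_statistics(timestamps):
    """Return (mean gap, std. dev. of gaps, incidents per day); gaps are in days."""
    ordered = sorted(timestamps)
    if len(ordered) < 3:
        raise ValueError("need at least three incidents to estimate a deviation")
    gaps = [
        (later - earlier).total_seconds() / 86400.0
        for earlier, later in zip(ordered, ordered[1:])
    ]
    mean_gap = mean(gaps)
    # Assumption: frequency is the reciprocal of the mean time between incidents.
    return mean_gap, stdev(gaps), 1.0 / mean_gap


# Hypothetical incident log, not the observed data.
log = [
    datetime(2005, 1, 1, 8, 0),
    datetime(2005, 1, 3, 8, 0),
    datetime(2005, 1, 4, 8, 0),
    datetime(2005, 1, 10, 8, 0),
]
mean_gap, std_gap, per_day = incident_statistics(log)
print(f"mean time between incidents: {mean_gap:.2f} days")
print(f"standard deviation:          {std_gap:.2f} days")
print(f"frequency:                   {per_day:.3f} incidents per day")
```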
3. Analysis of incidents
The preliminary results are presented in Table 2 and in Figures 3, 4 and 5. These figures present the number of incidents, their duration and the time periods between incidents. In the case of System B we focus on the overall number of incidents and the time between incidents (Figure).
Table 2. A comparison of mean values and standard deviations of data about incidents
Type of system | Mean time between incidents [day] | Std. dev. of time between incidents [day] | Mean time of incident [min] | Std. dev. of time of incident [min]
System B | 5.25 | 7.30 | 129.48 | 184.77
System C | 11.90 | 16.54 | - | -
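The values in Table 2 can be cross-checked against the incident counts and observation periods of Table 1. The sketch below assumes a year of 365 days and uses a hypothetical helper `cross_check`; it compares the time spanned by the gaps with the observation window and computes the coefficient of variation (standard deviation divided by mean).

```python
# Values copied from Table 1 and Table 2; a year is assumed to have 365 days.
DAYS_PER_YEAR = 365

systems = {
    "System B": {"incidents": 137, "years": 2.0, "mean_gap": 5.25, "std_gap": 7.30},
    "System C": {"incidents": 41, "years": 1.5, "mean_gap": 11.90, "std_gap": 16.54},
}


def cross_check(incidents, years, mean_gap, std_gap):
    """Return (days between first and last incident, window in days, CV)."""
    # n incidents are separated by n - 1 gaps.
    span_days = (incidents - 1) * mean_gap
    window_days = years * DAYS_PER_YEAR
    coefficient_of_variation = std_gap / mean_gap
    return span_days, window_days, coefficient_of_variation


for name, row in systems.items():
    span, window, cv = cross_check(**row)
    print(f"{name}: span {span:.1f} of {window:.1f} days, CV {cv:.2f}")
```

For both systems the span implied by the mean fits inside the observation window. The coefficient of variation is above 1 in both cases, so the gaps are more dispersed than an exponential (constant-rate) model, whose coefficient of variation equals 1, would give.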
[Plot title: "Time periods between the virus incidents - System B"; horizontal axis: the successive incidents]
Figure 3. Graphical representation of observed length of time periods between incidents. System C.
Figure 4. Histogram of observed time between the virus incidents. System C.
[Plot axes: time period between incidents [day] against successive incidents]
Figure 5. Graphical representation of observed length of time periods between incidents. System B.
[Histogram title: "Histogram - time between incidents"; horizontal axis: time between incidents [day], with marks at 0.05, 0.5, 5, 10, 15 and 30; vertical axis from 0 to 0.6]
Figure 6. Histogram of observed time between the availability incidents. System B.
Figure 7. Graphical representation of observed length of time periods of availability incidents. System B.
Figure 8. Histogram of observed time periods of availability incidents. System B.
Basic statistical analyses were carried out in order to determine the frequencies of incidents and to derive empirical distributions.
For System C and virus incidents, the occurrence of events has the characteristic presented in Figure 4.
For System B and availability incidents, the characteristics presented in Figures 6 and 7 were derived.
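An empirical distribution of this kind can be obtained by counting the observed gaps in bins and dividing by the number of observations. The following is a minimal sketch: the bin limits are the marks read from the axis of Figure 6 (treating them as upper bin limits is an assumption), and `relative_frequencies` and `sample_gaps` are hypothetical names and data.

```python
from bisect import bisect_left

# Upper bin limits in days, read from the axis of Figure 6 (assumed binning).
BIN_LIMITS = [0.05, 0.5, 5, 10, 15, 30]


def relative_frequencies(gaps, limits=BIN_LIMITS):
    """Share of gaps in each bin (previous limit, limit]; last entry is overflow."""
    if not gaps:
        raise ValueError("no gaps to summarise")
    counts = [0] * (len(limits) + 1)
    for gap in gaps:
        # bisect_left returns the index of the first limit that is >= gap.
        counts[bisect_left(limits, gap)] += 1
    return [count / len(gaps) for count in counts]


# Hypothetical gaps in days, not the observed data.
sample_gaps = [0.02, 0.3, 0.4, 1.0, 2.5, 4.0, 7.0, 12.0, 20.0, 45.0]
shares = relative_frequencies(sample_gaps)
labels = [f"<= {limit} d" for limit in BIN_LIMITS] + [f"> {BIN_LIMITS[-1]} d"]
for label, share in zip(labels, shares):
    print(f"{label:>10}: {share:.2f}")
```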
4. Reliability of security systems
When the security of information systems is considered, three attributes need to be analysed: confidentiality, availability and integrity. According to reliability theory, key measures are the probability of failure and the time between failures. For security systems such metrics are lacking. In general, security can be seen as a subjective category, so it is very difficult to find adequate metrics or measures of security attributes. It seems, however, that the reliability context and analogies should be helpful. Another question is whether such metrics can help in decision making during the process of ensuring security. It seems that measuring security is impossible, or at least possible only in a very limited scope. In the authors' opinion, every technique that can be used to limit uncertainty during decision making (in the computer security domain) is worth considering.
Figure 9. Metrics for analysing security attributes
The observations made by the authors can be helpful, first of all, in analysing aspects of availability. Looking for the probability distribution of the occurrence of incidents, we can observe the shape of the distributions presented in Figure 10 and Figure 11.
Figure 10. Probability distribution of observed time between the availability incidents. System B.
Figure 11. Probability distribution of observed time periods of availability incidents. System B.
The main conclusion from this preliminary analysis is that the most probable time between incidents lies in the range from 0.01 to 1 day. It means that, for this system, the attention of operators should be focused first of all on the control of transmission links and devices. When it comes to virus incidents, our observations indicate that the system should be supervised every day (the shape of the characteristic in Figure 4 shows that, with high probability, incidents may occur more often than once per 5-day period).
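As a rough point of comparison for the 5-day remark, the probability of a gap shorter than a given limit can be computed under an exponential model. This model is an assumption introduced here for illustration only; it is not the authors' method, and Table 2 shows a standard deviation larger than the mean, which an exponential model does not reproduce. The function name `prob_gap_shorter_than` is hypothetical.

```python
import math


def prob_gap_shorter_than(limit_days, mean_gap_days):
    """P(T <= limit) if the time between incidents T were exponential."""
    return 1.0 - math.exp(-limit_days / mean_gap_days)


# Mean time between incidents for System C, taken from Table 2.
MEAN_GAP_SYSTEM_C = 11.90

p_five_days = prob_gap_shorter_than(5.0, MEAN_GAP_SYSTEM_C)
print(f"P(next incident within 5 days) = {p_five_days:.2f}")
```

The empirical histogram remains the primary evidence; the exponential figure only gives a baseline against which the observed clustering of short gaps can be judged.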
The distributions of availability incidents are presented in Figure 10 and Figure 11. The expected duration of an availability incident is approximately 2 hours.
The reliability of security systems is connected with the proper implementation of their software and hardware components. Reconfigurable hardware elements, which are flexible and easy to implement, can be used. This problem was discussed and presented e.g. in [8] or [6]. Using reconfigurable hardware can significantly increase the reliability of, e.g., cryptographic systems. What is more, the speed of data transmission is a very important parameter. For example, the results of the implementation of the cryptographic device CRYPTON [12] are presented in Table 3.
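The parameters reported in Table 3 are related to each other by simple arithmetic, which can be checked as follows. This is a sketch under the assumption that Mb/s means 10^6 bits per second; the variable names are hypothetical.

```python
# Values copied from Table 3.
CLOCK_PERIOD_NS = 52
BLOCK_TIME_NS = 630
THROUGHPUT_MBIT_S = 203.2   # assumed to mean 10**6 bits per second

frequency_mhz = 1e3 / CLOCK_PERIOD_NS        # reciprocal of 52 ns, in MHz
blocks_per_second = 1e9 / BLOCK_TIME_NS      # one block every 630 ns
bits_per_block = THROUGHPUT_MBIT_S * 1e6 / blocks_per_second
cycles_per_block = BLOCK_TIME_NS / CLOCK_PERIOD_NS

print(f"clock frequency:        {frequency_mhz:.1f} MHz")
print(f"blocks per second:      {blocks_per_second:,.0f}")
print(f"implied block size:     {bits_per_block:.1f} bits")
print(f"clock cycles per block: {cycles_per_block:.1f}")
```

The clock frequency and the number of operations per second follow from the clock period and the block time respectively, and the reported speed corresponds to a data block of about 128 bits.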
Table 3. The chosen parameters of reconfigurable device CRYPTON [12]
Device | Reconfigurable device CRYPTON
Clock period [ns] | 52
Frequency [MHz] | 19.2
Encryption (decryption) speed [Mb/s] | 203.2
Time to encrypt (decrypt) one data block [ns] | 630
Number of encryptions (decryptions) per second | 1 587 301
5. Conclusion
Our observations support the thesis that collecting data for analysing security is a very complex practical problem.
What is more, analysing these data requires new, more accurate methods. This is a general problem of IDS systems. Many methods of artificial intelligence are used in this domain, e.g. machine learning, data mining or neural networks. Exploring the data to discover dependencies connected with incidents is a real and still open problem. We face a kind of paradox: either we have a huge amount of data and problems with exploring it, or we suffer from a lack of accurate data. This problem becomes visible when a security incident must be assessed quickly. In such a situation a fast decision is very often needed: is this an incident or not? We are still conducting research on a new method for security assessment. Our method is based on the preliminary preparation of data for scaling early intrusion detection systems using simulation, and it needs the characteristics of incident frequencies presented in this paper. In many respects our analysis is very similar to reliability analysis. We focus on answering the question: why do security systems fail? This is the key direction in constructing our method: finding the cause-effect dependencies in incident analysis in order to induce rules for IDS systems. The first element of these observations is to determine how often incidents take place.
As far as the reliability of security systems is concerned, it is worth underlining the wide spectrum of threats which should be considered. One of these subjects is the implementation of hardware devices in high-speed technology with good reliability characteristics.
The occurrence of computer incidents is rather unpredictable. It is very hard to obtain characteristics such as probability distributions. Institutions do not publish data about incidents. We can only collect our own data or gather data from other sources, like CERT (Computer Emergency Response Team). Another solution is to prepare data using simulation.
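One possible way of preparing such data by simulation is to draw synthetic times between incidents from a distribution matched to the observed mean and standard deviation. The sketch below uses a gamma distribution fitted by the method of moments to the System B values of Table 2. The choice of the gamma family, the function name `simulate_gaps` and the seed are assumptions made for illustration; they are not the simulation method of the authors.

```python
import random
from statistics import mean, stdev

# Moments of the time between incidents for System B, from Table 2 (days).
TARGET_MEAN = 5.25
TARGET_STD = 7.30


def simulate_gaps(n, target_mean=TARGET_MEAN, target_std=TARGET_STD, seed=0):
    """Draw n synthetic gaps from a gamma law with the given mean and std. dev."""
    shape = (target_mean / target_std) ** 2   # k = mean^2 / variance
    scale = target_std ** 2 / target_mean     # theta = variance / mean
    rng = random.Random(seed)                 # fixed seed for repeatability
    return [rng.gammavariate(shape, scale) for _ in range(n)]


gaps = simulate_gaps(20000)
print(f"simulated mean: {mean(gaps):.2f} days, std. dev.: {stdev(gaps):.2f} days")
```

A gamma law with a shape parameter below 1 concentrates probability on short gaps while allowing occasional long ones, which is qualitatively in line with a standard deviation larger than the mean; other families could be fitted in the same way.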
To conclude, we can say that only the implementation of heterogeneous environments, combining software and hardware as well as commercial and open source components, can ensure a good level of reliability. Consequently, in this way we can increase the level of security of information systems.
References
[1] Anderson, R. (1993). Why Cryptosystems Fail. 1st Conference on Computer and Communication Security. VA, USA.
[2] Anderson, R. (2001). Security engineering. A Guide to Building Dependable Distributed Systems. John Wiley & Sons Inc.
[3] Hazlewood, V. (2007). Defense-in-depth. An Information Assurance Strategy for the Enterprise, San Diego 2006, (http://security.sdsc.edu/DefenseInDepthWhitePaper.pdf, February 2007)
[4] Jozwiak, I.J. (1992). The reliability and functional model of computer network with branched structure. Microelectronics and Reliability. Vol. 32, no. 3, 345-349.
[5] Jozwiak, I.J. (1996). The failure time random variable modeling. Microelectronics and Reliability. vol. 36, 10, 1525-1529.
[6] Jozwiak, I. & Laskowski, W. (2003). Reconfigurable hardware and safety and reliability of computer systems. Risk Decision and Policy Journal. Philadelphia.
[7] Kolowrocki, K. (2004). Reliability of Large Systems. Amsterdam-Boston-Heidelberg-London-New York-Oxford-Paris-San Diego-San Francisco-Singapore-Sydney-Tokyo, Elsevier.
[8] Laskowski, W. (2001). Układy programowalne jako narzędzia wspomagające kryptograficzną ochronę danych. Przegląd Telekomunikacyjny 3, 178-183 (in Polish).
[9] Liderman, K. (2003). A guide for security administrators. Warszawa (in Polish).
[10] SANS Institute, Intrusion detection FAQ. (2007). (on line: http://www.sans.org/resources/idfaq).
[11] Varian, H. (2002). System reliability and free riding. Workshops on Economics and Information Security. Berkeley, (on line: http://citeseer.ist.psu.edu/527418.html).
[12] Virus Encyclopedia, CA. (2007). (http://www3.ca.com/securityadvisor/virusinfo/browse.aspx).
STABILITY AND SAFETY OF SHIPS: HOLISTIC AND RISK APPROACH
Kobylinski Lech
Foundation for Safety of Navigation, Gdansk, Poland
Keywords
maritime safety, risk analysis, ships' stability, ice accretion on ship
Abstract
The present stability regulations, developed over the years by IMO, reached a definite conclusion with the adoption of the Revised Draft of the Intact Stability Code. The criteria included there are design criteria of a prescriptive nature, based mainly on statistics of stability casualties. Currently IMO is considering the development of criteria based on ship performance; the concept of such criteria is, however, not yet agreed. The existing criteria work comparatively well for the majority of conventional ships; however, with the advent of very large and sophisticated ships with non-conventional features, those criteria may be inadequate. The author advances the idea of applying safety assessment and risk analysis, using a holistic and system approach to stability. Safety against capsizing (or a LOSA accident) is a complex system in which design, operational, environmental and human factors have to be taken into account. Although this seems to be a very complex task, in the opinion of the author it may be manageable and could be applied to the safety assessment of highly sophisticated and costly ships.
1. Introduction
One of the most important aspects of safety is safety against capsizing. In modern times capsizing does not happen often, but when it does, the consequences are usually catastrophic and the ship is lost, quite often with all hands on board. When the number of lives lost is large, public opinion reacts to such accidents acutely, almost hysterically, as for example in the case of the ESTONIA disaster, and the consequences of the accident for the maritime world may be rather serious. That is why safety against capsizing is an important issue.
In order to avoid the possibility of capsizing, criteria for ship stability were developed. Some simple criteria were proposed quite a long time ago, in the middle of the nineteenth century, but the most recent criteria were developed and recommended by the International Maritime Organisation (a United Nations agency) in the late sixties and early seventies of the last century. Those criteria are used to this day in some countries; recently they were included in the Code of Intact Stability for All Types of Ships developed by IMO, and they will become compulsory under the provisions of the SOLAS Convention in 2009.
The existing criteria are design oriented, and their essence consists of the specification of critical values of some stability parameters. In spite of the fact that some ships satisfying those criteria capsized, the general opinion is that the great majority of ships are reasonably safe.
The existing criteria may, however, not be applicable to some types of modern ships incorporating novel design features. There is no previous experience relating to the safety and stability of those ships, and satisfying the existing criteria may not assure the required level of safety. Because of this, the Maritime Safety Committee of IMO recently included in its work programme an item requiring the development of performance-oriented criteria for ships of novel type.
Performance-oriented criteria according to this definition, and also according to the understanding of the majority of members of the IMO SLF Sub-committee, are criteria that take into account scenarios of capsizing of the ship in a seaway. However, forces of the sea are not the main hazard posed to the ship. Analysis of the causes of stability accidents reveals that in more than 80% of casualties the human factor is the principal cause; in the remaining accidents, factors such as cargo shift, icing or other heeling moments are often the initiating events. Therefore, the author proposed that, instead of developing additional prescriptive criteria, the provision already included in the SOLAS Convention (Chapter II-1, Part B-1, regulation 25-1.3) may be used. It allows the Administration to apply, under certain conditions, alternative methods if it is satisfied that at least the same degree of safety as represented by the existing requirements is achieved.
If the formulation of this provision (rather often used in IMO instruments) is understood as specifying the objectives, it opens the way to the application of the holistic and risk-based approach. Chantelave [3] discussed this problem. Obviously, as the application of risk analysis is not an easy task, the provision should be supplemented by guidance to the Administration.
A full risk analysis for a particular ship or group of ships requires large resources that were not available to the author. Therefore the risk analysis was executed on a limited scale; in particular, the group of experts consisted of only a few persons. The purpose of the exercise was to investigate the possibilities of applying the holistic and risk approach to stability problems and to create a basis for the possible content of the guidance mentioned above. Only some parts of the analysis are reported in this paper; the other parts of the exercise will be published elsewhere.
2. Holistic and system approach
As mentioned above, the existing criteria are design criteria intended to be applied during the design stage of a ship. However, even a preliminary analysis of stability casualties shows that design features of the ship are neither the most important nor the most frequent cause of casualties. A casualty, called LOSA (loss of stability accident) in the following [16], is usually the result of a sequence of events that involve environmental conditions, the ship loading condition, ship handling aspects and the human factor in general. Therefore, in order to make a safety assessment, a holistic approach to the ship stability system is needed.
The ship stability system is rather complicated. However, in most cases it can be considered as consisting of four basic elements: ship, environment, cargo and operation (see Figure 1). The Venn diagram in this figure stresses the strong interactions between the four elements. The use of the system approach to stability criteria was proposed by the author quite a long time ago and was partly applied in the development of the Intact Stability Code [12], but in general stability requirements remain basically design oriented to this day. Analysis of LOSA casualties reveals that the causes of a casualty may be attributed to:
- functional aspects resulting from the reliability characteristics of the technical system, hence the stability characteristics of the ship;
- operational aspects resulting from the actions of the personnel handling the system, hence crew members but also ship management, cargo handling, marine administration and the organisation of the owner's company;
- external causes resulting from factors independent of the designers, builders and operators of the technical system, hence the ship environment and climatology [4], [5].
Figure 1. Four-fold Venn diagram for ship stability system
The human factor plays an important part in all four elements of the system. Human and organisational errors (HOE) are, according to some authors, responsible for approximately 80% of all marine casualties [17]; other sources state definitely that this percentage is 80% [23]. In order to achieve a sufficient level of safety with respect to stability, all elements of the stability system have to be taken into account. Since less than 20% of all casualties are caused by faulty or bad design of the ship, the existing safety requirements, which refer mainly to design features of the ship, cannot ensure a sufficient level of safety, in particular with regard to ships having novel design features. The only way out would be to use a risk-based approach.
3. Prescriptive versus risk-based approach
In many fields of technology, risk analysis is nowadays performed when planning highly sensitive and costly enterprises. The Maritime Safety Committee of IMO recommended using this approach in the IMO rule-making process [11]. In spite of this recommendation, and in spite of the fact that risk analysis is performed as a rule in, for example, the offshore industry, experts on stability are hesitant to use this approach and still prefer the development of prescriptive criteria.
The conventional prescriptive approach to the problem of safety, which has been used for a very long time, takes the form of a recipe defining maximum or minimum values of some parameters. This approach is now being replaced by safety assessment and risk analysis. In place of rigid formulae, whose disadvantage is insufficient flexibility with respect to innovation and which can be changed only in small steps, the new risk-based requirements are oriented towards attaining the target, that is, the safety of the system.
Traditional regulations related to stability are of a prescriptive nature and are usually based on deterministic calculations. They are formulated in such a way that a ship dimension or other characteristic (e.g. metacentric height) must be greater (or smaller) than a certain prescribed quantity. Prescriptive regulations can be developed on the basis of statistics, model tests and full-scale trials. In some cases probabilistic calculations may also be used as a basis for prescriptive regulations.
The basic dichotomy in the conception of safety requirements is the prescriptive approach versus the risk-based approach. The main shortcoming of prescriptive regulations is that they bind designers and do not allow the introduction of novel design solutions. They are based on experience gained with existing objects and are not suitable for novel types. Usually they were amended after serious casualties had happened. The risk involved and the level of safety achieved with the application of prescriptive regulations are not known [15].
Opposite to the prescriptive regulations is the risk-based approach. In the risk-based approach the regulations specify the objectives to be reached, that is, the safe performance of an object. The risk-based approach can be described as a goal-oriented, performance-based approach, usually utilizing probabilistic calculations. However, it is possible to imagine. The advantages of the risk-based approach are obvious. It gives a free hand to the designers to develop new solutions, it allows optimal decisions to be taken from the point of view of economy, and the risk to the public and to the environment is assessed and accepted.
All existing stability regulations are of a prescriptive nature. At present, however, the need to apply the risk-based approach is recognized and actually recommended. Up to now, however, there have been very few attempts to apply this approach, even partially, to stability problems.
The risk-based approach according to the IMO recommendation is formalized and includes the following steps:
1. Identification of hazards
2. Risk assessment
3. Risk control options
4. Cost-benefit assessment, and
5. Recommendations for decision making
4. Hazard identification
The first step of a risk analysis is to carry out the hazard identification and ranking procedure (HAZID). Hazards can be identified using several different methods.
The IMO resolution includes general guidance on the methodology of hazard identification. With respect to stability, hazard identification can be achieved using standard methods involving the evaluation of available data in the context of the functions and systems relevant to the type of ship and its mode of operation. Stability is considered assuming that the ship is intact, and the accident evaluated is called LOSA (loss of stability accident). It covers capsizing, which means taking an upside-down position, but also a situation where the amplitudes of rolling motion or heel exceed a limit that makes operating or handling the ship impossible for various reasons: loss of power, loss of manoeuvrability or the necessity to abandon the ship. In the latter situation the ship may be salvaged [16].
According to the general recommendation, the method of hazard identification comprises a mixture of creative and analytical techniques. The creative element is necessary in order to ensure that the process is proactive and is not limited to hazards that materialized in the past. For this purpose a group of experts should be created, consisting of specialists in design, operation, management and the human factor.
Hazard identification was based on:
1. Analysis of historical data on LOSA accidents.
2. Statistical analyses of the causes of accidents available in various sources, inter alia in [1], [8], [9], [10].
3. Detailed descriptions of LOSA accidents. For this purpose 20 casualties described in detail were analysed.
4. Analysis of a few accidents using the TRIPOD methodology [22].
5. Evaluation by experts using the Delphic method.
6. Analysis by the group of experts.
The group of experts was requested to evaluate the results of all the above analyses and to propose a list and ranking of hazards. Because of the resources available, engineering analysis was preferred to expert analysis as defined in [7].
The expert group recognized that the number of hazards, defined as potential situations threatening the ship's stability, is large when all elements of the stability system are considered, and therefore decided to consider the following hazards at the first level:
1. critical stability
2. forces of the sea
3. cargo shift
4. icing
5. human factor - management
6. external heeling moments
7. cargo and ballast operations
8. fire and explosion
Figure 2 shows the fault tree for the first level. It shows all eight groups of hazards connected by an "OR" gate; this, however, does not preclude two or more hazards being present at the same time. The system is rather complex, because at the lower levels of the fault trees there are strong interconnections between different factors. This is shown in the example of the fault tree (Figure 4).
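The effect of an "OR" gate on the probability of the top event can be sketched as follows. The calculation assumes that the input events are independent, which is a strong simplification given the interconnections mentioned above, and all probabilities are hypothetical placeholders, since no probabilities were attached to the first-level hazards. The function name `or_gate_probability` is also hypothetical.

```python
from math import prod


def or_gate_probability(event_probabilities):
    """Probability that at least one input event occurs, assuming independence."""
    return 1.0 - prod(1.0 - p for p in event_probabilities)


# Hypothetical probabilities for the eight first-level hazard groups
# (placeholders for illustration; not taken from the paper).
hazards = {
    "critical stability": 0.010,
    "forces of the sea": 0.020,
    "cargo shift": 0.015,
    "icing": 0.002,
    "human factor - management": 0.030,
    "external heeling moments": 0.005,
    "cargo and ballast operations": 0.008,
    "fire and explosion": 0.004,
}

top_event = or_gate_probability(hazards.values())
print(f"top event (LOSA) probability under independence: {top_event:.4f}")
print(f"simple sum of probabilities (upper bound):       {sum(hazards.values()):.4f}")
```

Because the gate is inclusive, several hazards may occur together; the result is therefore slightly lower than the simple sum of the individual probabilities.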
In the above list, insufficient stability is defined as stability characteristics that do not meet the current IMO requirements. Cargo shifting was singled out because, in more than 300 LOSA casualties, cargo shift occurred in about 40% of cases. Fire is important because fire-fighting water can reduce stability and cause capsizing (example: NORMANDIE in New York harbour in 1942). Forces of the sea include the action of waves and wind. This may be the most difficult hazard to evaluate because of the complex hydrodynamic structural model of the behaviour of the ship in a seaway. External heeling moments comprise heeling moments other than those caused by forces of the sea and shifting of cargo. This category includes heeling moments caused by water on deck, by centrifugal force when turning, by fishing gear pull, by tow rope forces, etc.
The ranking of hazard frequency adopted in the application of the Delphic method consisted of five groups (1 to 5), as proposed in [6] (frequent, probable, occasional, remote and unlikely), which differs from the IMO recommendation [11]. The ranking indexes are related to probabilities, but this was not revealed to the participants of the exercise, because the assessment of probability seems to be very subjective and does not lead to reliable results. This is shown in Table 1.
The rankings proposed by the group of experts, who took into consideration all the above-mentioned results, differ within rather wide limits. That is understandable, because hazard probability is obviously different for different types of ships and for different modes of operation. For example, icing need not be considered a hazard for ships operating in the Mediterranean, but requires a high ranking for ships operating at high latitudes. The same applies to shifting of cargo, because in some ships there is no cargo that can shift. Therefore no probabilities were attached to hazards at the first level. However, an example of an averaged ranking estimated by the group of nine experts is shown in Table 2.
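Averaging the ranking indexes given by a group of experts can be sketched as below. The orientation of the scale (1 = frequent, 5 = unlikely) is assumed from the order in which the groups are listed in the text, and the votes are hypothetical; they are not the values of Table 2. The name `average_ranking` is also hypothetical.

```python
# Five-group scale from the text; the orientation 1 = frequent ... 5 = unlikely
# is an assumption based on the order in which the groups are listed.
SCALE = {1: "frequent", 2: "probable", 3: "occasional", 4: "remote", 5: "unlikely"}


def average_ranking(votes_by_hazard):
    """Average the ranking indexes given by the experts for each hazard."""
    averages = {}
    for hazard, votes in votes_by_hazard.items():
        if not votes or any(vote not in SCALE for vote in votes):
            raise ValueError(f"rankings for {hazard!r} must be integers from 1 to 5")
        averages[hazard] = sum(votes) / len(votes)
    return averages


# Hypothetical votes of nine experts (illustration only).
votes = {
    "cargo shift": [2, 2, 3, 2, 1, 2, 3, 2, 2],
    "icing": [5, 4, 5, 3, 5, 4, 5, 5, 4],
}
averages = average_ranking(votes)
for hazard, score in averages.items():
    print(f"{hazard}: {score:.2f} (nearest group: {SCALE[round(score)]})")
```

The spread of the individual votes is lost in the average, which is why the wide differences between experts noted above should be reported alongside it.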