The overall methodology discussed in this paper is schematically represented in Fig. 1.
For a given security problem and a given power system, security cases are first generated via a random sampling approach, in a sufficiently broad and diverse range so as to screen all situations deemed relevant. Second, each case is pre-analyzed in terms of security by simulating numerically various possibly harmful contingencies. At this step massive parallelism may be exploited to speed up this off-line simulation phase.
Figure 1: Machine learning framework for security assessment
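The data base generation step can be sketched as follows. This is a minimal illustration only: the attribute names, the contingency list and the `simulate` function are hypothetical stand-ins for a real power system model and a real numerical simulation tool, neither of which is specified here. A thread pool is used merely to show where parallelism enters; an actual study would more likely distribute the simulations over processes or machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical list of contingencies to screen for each case.
CONTINGENCIES = ["line_A_outage", "line_B_outage", "gen_1_trip"]
SEVERITY_MW = {"line_A_outage": 100.0, "line_B_outage": 150.0, "gen_1_trip": 300.0}


def sample_case(rng):
    """Draw one random operating situation (hypothetical attributes)."""
    return {
        "load_mw": rng.uniform(500.0, 1500.0),
        "gen_mw": rng.uniform(400.0, 1600.0),
        "lines_in_service": rng.randint(8, 10),
    }


def simulate(case, contingency):
    """Toy stand-in for a numerical simulation; returns a security margin."""
    reserve = case["gen_mw"] - case["load_mw"]
    topology_bonus = 50.0 * (case["lines_in_service"] - 8)
    return reserve + topology_bonus - SEVERITY_MW[contingency]


def pre_analyze(case):
    """Attach the security information for every contingency to one case."""
    margins = {c: simulate(case, c) for c in CONTINGENCIES}
    secure = all(m >= 0.0 for m in margins.values())
    return {**case, "margins": margins, "secure": secure}


def build_database(n_cases, seed=0, workers=4):
    """Sample cases sequentially, then pre-analyze them in parallel."""
    rng = random.Random(seed)
    cases = [sample_case(rng) for _ in range(n_cases)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pre_analyze, cases))


database = build_database(200)
```

Each case is independent of the others, which is why the pre-analysis phase lends itself to parallel execution.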
The resulting data bases typically comprise several thousand cases, for each of which security information has been gathered with respect to several tens of disturbances. The relevant information is then extracted by statistical learning techniques. These must be able to: (i) identify the relevant attributes among those used to describe the system states; (ii) build a model that explains the relationship between the relevant attributes and the security status and/or that can predict the security of new situations, different from those in the data base.
More generally, the generic problem of supervised learning from examples can be formulated as follows:
Given a learning set of examples of associated input/output pairs, derive a general model for the underlying input/output relationship, which may be used to explain the observed pairs and/or predict output values for any new unseen input.
In the context of power system security, an example corresponds to a given operating situation, described in terms of its electrical state and topology. The input attributes would be (hopefully) relevant parameters describing it and the output could be information concerning its security, in the form of a discrete classification (e.g. secure / marginal / insecure) or of a numerical value derived from security margins or indices.
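To make the notion of an example concrete, the sketch below shows one possible in-memory representation of a learning set. The attribute names, the margin values and the width of the "marginal" band are assumptions chosen for illustration; they are not taken from any particular study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    """One operating situation: input attributes plus a numerical output."""
    attributes: Dict[str, float]  # hypothetical input attributes
    margin: float                 # hypothetical security margin (numerical output)


def security_class(margin: float, marginal_band: float = 50.0) -> str:
    """Derive a discrete classification from the numerical margin.

    The band width of 50.0 is an arbitrary illustrative value.
    """
    if margin < 0.0:
        return "insecure"
    if margin < marginal_band:
        return "marginal"
    return "secure"


learning_set: List[Example] = [
    Example({"load_mw": 900.0, "line_flow_mw": 300.0}, margin=120.0),
    Example({"load_mw": 1100.0, "line_flow_mw": 420.0}, margin=20.0),
    Example({"load_mw": 1300.0, "line_flow_mw": 510.0}, margin=-35.0),
]

labels = [security_class(e.margin) for e in learning_set]
```

The same examples can thus feed either a classification problem (using `labels`) or a regression problem (using `margin` directly).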
The solution of this overall learning problem is decomposed into several subtasks.
Representation consists of (i) choosing appropriate input attributes to represent the power system state, (ii) defining the output security information, and (iii) choosing a class of models suitable to represent input/output relations.
The representation problem is left to the engineer. In the context of power system security, a compromise has to be found between the use of very elementary standard operating parameters and more or less sophisticated compound features, known to show strong correlation with security.
Feature selection aims at reducing the dimensionality of the input space by dismissing attributes which do not carry useful information to predict the considered security information. This allows the more or less local nature of many security problems to be exploited.
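As an illustration, the sketch below applies one very simple filter: attributes are ranked by the absolute value of their linear correlation with the security margin, and those below a threshold are dismissed. This is only one of many possible criteria and is not claimed to be the one used by the methods discussed in this paper. The data are synthetic, and the attribute names and the threshold of 0.2 are assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_features(rows, target, threshold=0.2):
    """Keep attributes whose |correlation| with the target reaches the threshold."""
    scores = {
        name: abs(pearson([r[name] for r in rows], target)) for name in rows[0]
    }
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Synthetic data: the margin depends on two "local" attributes only;
# "remote_load" is irrelevant by construction.
rng = random.Random(1)
rows = [
    {"local_flow": rng.random(), "local_voltage": rng.random(), "remote_load": rng.random()}
    for _ in range(500)
]
margin = [
    1.0 - 2.0 * r["local_flow"] + r["local_voltage"] + rng.gauss(0.0, 0.1)
    for r in rows
]

selected, scores = select_features(rows, margin)
```

In this toy setting the remote attribute is dismissed, which mirrors the local nature of many security problems. A linear correlation filter would miss attributes that matter only through nonlinear effects or interactions, so it should be read as a sketch of the idea rather than a recommended criterion.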
Model selection (or learning per se) will typically try to identify, within the predefined class of models, the one which best fits the learning states. This generally requires choosing both a model structure and its parameters, using an appropriate search technique.
The distinction between feature selection and model selection is somewhat arbitrary, and some methods actually solve these two problems simultaneously rather than successively.
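A small example of a method that solves both problems at once is a single-threshold rule fitted by exhaustive search: choosing the test attribute is a form of feature selection, while choosing the threshold and the direction of the test is model selection. The sketch below is a toy under stated assumptions; the attribute names and data are hypothetical, and the model class (one threshold on one attribute) is deliberately the simplest possible.

```python
def fit_stump(examples, labels):
    """Exhaustively search for the single-attribute threshold rule with the
    fewest errors on the learning set.

    examples: list of dicts mapping attribute name -> value
    labels:   list of booleans, True meaning "secure"
    Returns (errors, attribute, threshold, secure_below).
    """
    best = None
    for attr in examples[0]:
        values = sorted({e[attr] for e in examples})
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            for secure_below in (True, False):
                errors = sum(
                    ((e[attr] < threshold) == secure_below) != label
                    for e, label in zip(examples, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, attr, threshold, secure_below)
    return best


# Hypothetical learning set: security is governed by the line flow only.
examples = [
    {"line_flow_mw": 200.0, "remote_load_mw": 700.0},
    {"line_flow_mw": 250.0, "remote_load_mw": 300.0},
    {"line_flow_mw": 320.0, "remote_load_mw": 650.0},
    {"line_flow_mw": 400.0, "remote_load_mw": 350.0},
    {"line_flow_mw": 450.0, "remote_load_mw": 720.0},
    {"line_flow_mw": 500.0, "remote_load_mw": 310.0},
]
labels = [True, True, True, False, False, False]

errors, attribute, threshold, secure_below = fit_stump(examples, labels)
```

Here the search retains `line_flow_mw` and ignores `remote_load_mw` without any separate feature selection step.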
Interpretation and validation are very important in order to understand the physical meaning of the synthetic model and to determine its range of validity. They consist of testing the model on a set of unseen test examples and comparing the information it provides with prior expertise about the security problem.
From the interpretation and validation point of view, as we will see, some of the methods provide rather black-box information, difficult to interpret, while some others provide explicit and transparent models, easier to compare with prior knowledge.
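The testing part of validation can be sketched as follows. The rule being tested (secure if the line flow is below 360 MW) and the test examples are hypothetical. Besides the overall error rate, the sketch separates false alarms from missed insecure states, since the two kinds of error usually do not have the same practical consequences.

```python
def predict_secure(example):
    """Hypothetical transparent rule: secure if the line flow is below 360 MW."""
    return example["line_flow_mw"] < 360.0


def validate(model, test_set):
    """Compare model predictions with the known security status of unseen cases.

    test_set: list of (example, is_secure) pairs not used during learning.
    """
    false_alarms = 0     # predicted insecure, actually secure
    missed_insecure = 0  # predicted secure, actually insecure
    for example, is_secure in test_set:
        predicted = model(example)
        if is_secure and not predicted:
            false_alarms += 1
        elif predicted and not is_secure:
            missed_insecure += 1
    n = len(test_set)
    return {
        "error_rate": (false_alarms + missed_insecure) / n,
        "false_alarms": false_alarms,
        "missed_insecure": missed_insecure,
    }


test_set = [
    ({"line_flow_mw": 210.0}, True),
    ({"line_flow_mw": 340.0}, True),
    ({"line_flow_mw": 370.0}, True),
    ({"line_flow_mw": 390.0}, False),
    ({"line_flow_mw": 480.0}, False),
]

report = validate(predict_secure, test_set)
```

Because the rule is explicit, its threshold can also be compared directly with prior engineering knowledge about the line in question, which is the interpretation side of this subtask.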
Model use consists of applying the model to predict security of new situations on the basis of the values assumed by the input parameters, and if necessary to ``invert'' the model in order to provide information on how to modify input parameters so as to achieve a security enhancement goal.
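Both directions of model use can be illustrated with a linear margin model, chosen here only because it is trivial to invert. The coefficients, attribute names and target margin are hypothetical, and real models may require a search procedure rather than a closed-form inversion.

```python
# Hypothetical linear model: margin = INTERCEPT + sum(COEFFS[a] * x[a]).
INTERCEPT = 150.0
COEFFS = {"line_flow_mw": -0.5, "gen_reserve_mw": 0.25}


def predict_margin(state):
    """Forward use: predict the security margin of a new situation."""
    return INTERCEPT + sum(COEFFS[name] * state[name] for name in COEFFS)


def invert(state, control, target_margin):
    """Inverse use: change in one controllable input that reaches the target.

    Returns the required change of `control` (0.0 if the target is already met).
    """
    shortfall = target_margin - predict_margin(state)
    if shortfall <= 0.0:
        return 0.0
    return shortfall / COEFFS[control]


state = {"line_flow_mw": 400.0, "gen_reserve_mw": 100.0}
margin_before = predict_margin(state)

# How much must the line flow change to reach a margin of 25?
delta = invert(state, "line_flow_mw", target_margin=25.0)
new_state = {**state, "line_flow_mw": state["line_flow_mw"] + delta}
margin_after = predict_margin(new_state)
```

In this toy case the model predicts a negative margin for the initial state and suggests reducing the line flow by 100 MW to reach the target.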
As for model use in fast decision making, computing speeds may differ by several orders of magnitude from one technique to another, but all the methods discussed in this paper are sufficiently fast in the context of power system security. It is also worth noting that they can all exploit parallelism quite easily, if deemed necessary.