Random signals and their characteristics

Since all information signals and interference are random and can be predicted only with a certain degree of probability, probability theory is used to describe such signals. For this purpose statistical characteristics are used, which are obtained by carrying out numerous experiments under identical conditions.

All random phenomena studied by probability theory can be divided into three groups:
- random events;
- random variables;
- random processes.

A random event is any fact that, as a result of an experiment, may or may not occur.
Examples of random events are the appearance of interference at the input of a receiver or the reception of a message containing an error.
Random events are denoted by Latin letters A, B, C.

The numerical characteristics of a random event are:
1. Frequency of occurrence of a random event:

F(A) = m/N, (40)

where m is the number of experiments in which this event occurred;
N is the total number of experiments performed.

As follows from expression (40), the frequency of occurrence of a random event cannot exceed 1, since the number of experiments in which this event occurred cannot exceed the total number of experiments.
2. Probability of occurrence of a random event:

P(A) = lim (m/N) as N tends to infinity. (41)

That is, the probability of occurrence of a random event is the limit of its frequency of occurrence as the number of experiments performed grows without bound. The probability of occurrence of an event cannot exceed 1. A random event with a probability equal to one is a certain event, i.e. it will definitely happen; events that have already occurred therefore have this probability.
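For illustration, this limiting behaviour of the frequency can be simulated numerically; in the following minimal sketch the event probability 0.3 and the numbers of experiments are arbitrary assumptions:

```python
# Minimal sketch: the frequency of occurrence m/N approaches the probability of the
# event as the number of experiments N grows. The probability 0.3 is an assumed value.
import random

true_p = 0.3  # assumed probability of the random event

for N in (100, 10_000, 1_000_000):
    m = sum(1 for _ in range(N) if random.random() < true_p)  # experiments in which the event occurred
    print(f"N = {N:>9}: frequency m/N = {m / N:.4f}")
# The frequency never exceeds 1 and tends to true_p as N increases without bound.
```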
A random variable is a quantity that changes randomly from one experiment to another.
Examples of random variables are the noise amplitude at the receiver input or the number of errors in a received message. Random variables are denoted by Latin letters X, Y, Z, and their values by x, y, z.
Random variables are discrete and continuous.
A discrete random variable is one that can take on only a countable set of values (for example, the number of pieces of equipment or the number of telegrams, since these can take only the integer values 1, 2, 3, ...).
A continuous random variable is one that can take on any value from a certain range (for example, the noise amplitude at the receiver input can take any value, just as an analog information signal can).

The numerical (statistical) characteristics describing random variables are:
1. Probability distribution function.

F(x) = P(X ≤ x) (42)

This function shows the probability that the random variable X will not exceed a specific value x. If the random variable X is discrete, then F(x) is also a discrete (step) function; if X is continuous, then F(x) is a continuous function.
2. Probability density.

p(x) = dF(x)/dx (43)

This characteristic shows the probability that the value of the random variable falls into a small interval dx in the vicinity of the point x (the shaded area in the figure).

3. Expected value (mathematical expectation).

For a discrete random variable

M[X] = Σ xi·P(xi) (summation over i = 1, ..., n), (44)

where xi are the values of the random variable;
P(xi) is the probability of occurrence of these values;
n is the number of possible values of the random variable.

For a continuous random variable

M[X] = ∫ x·p(x) dx (integration over all possible values of x), (45)

where p(x) is the probability density of the continuous random variable.

In its meaning, the mathematical expectation shows the average and most probable value of a random variable, i.e. the value that the random variable takes most often. Expression (44) is used if the random variable is discrete, and expression (45) if it is continuous. The notation M[X] is generally accepted for the mathematical expectation of the random variable given in square brackets, but the notations mx or m are sometimes used.

4. Dispersion.

For a discrete random variable

D[X] = Σ (xi − M[X])²·P(xi) (summation over i = 1, ..., n), (46)

and for a continuous random variable

D[X] = ∫ (x − M[X])²·p(x) dx. (47)

Dispersion quantitatively characterizes the degree of spread of the results of individual experiments relative to the average value. The notation D[X] for the variance of a random variable is generally accepted, but the notation σ²x can also be used. Expression (46) is used to calculate the variance of a discrete random variable, and (47) for a continuous one. Taking the square root of the variance gives a quantity called the standard deviation (σx).
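The following minimal sketch illustrates formulas (44) and (46) for a discrete random variable; the values and probabilities are chosen arbitrarily for the example:

```python
# Minimal sketch of the mathematical expectation (44), dispersion (46) and standard
# deviation of a discrete random variable with assumed values and probabilities.
xs = [0, 1, 2, 3]          # values x_i of the random variable
ps = [0.1, 0.4, 0.3, 0.2]  # probabilities P(x_i); they must sum to 1

m_x = sum(x * p for x, p in zip(xs, ps))               # M[X] by formula (44)
d_x = sum((x - m_x) ** 2 * p for x, p in zip(xs, ps))  # D[X] by formula (46)
sigma_x = d_x ** 0.5                                   # standard deviation

print(m_x, d_x, sigma_x)   # 1.6 0.84 0.9165...
```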

All characteristics of a random variable can be shown using Figure 22.

Figure 22 - Characteristics of a random variable

A random process is a function of time t whose value at any fixed instant of time is a random variable. For example, Figure 23 shows a diagram of some random process observed as the result of three experiments. If we determine the values of these functions at a fixed time t1, the values obtained turn out to be random variables.

Figure 23 - Ensemble of realizations of a random process

Thus, the observation of any random variable (X) in time is a random process X(t). For example, information signals (telephone, telegraph, data transmission, television) and noise (narrowband and broadband) are considered as random processes.
A single observation of a random process is called a realization xk(t). The set of all possible realizations of one random process is called an ensemble of realizations. For example, Figure 23 shows an ensemble of realizations of a random process consisting of three realizations.

To characterize random processes, the same characteristics are used as for random variables: probability distribution function, probability distribution density, mathematical expectation and variance. These characteristics are calculated in the same way as for random variables. Random processes are of various types. However, in telecommunications, most random signals and noises are stationary ergodic random processes.

Stationary is a random process whose characteristics F(x), P(x), M[X] and D[X] do not depend on time.
An ergodic process is one in which time averaging over one of the realizations leads to the same results as statistical averaging over all realizations. Physically this means that all realizations of an ergodic process are similar to each other, so measurements and calculations of the characteristics of such a process can be carried out using any one of its realizations.
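This property can be illustrated numerically; in the sketch below the ergodic process is modelled, purely as an assumption for the example, by stationary noise with a mathematical expectation of 2, and the time average of one realization is compared with the ensemble average at a fixed instant:

```python
# Minimal sketch: for a stationary ergodic process, averaging one realization over
# time gives practically the same result as averaging the ensemble at a fixed time t1.
import numpy as np

rng = np.random.default_rng(0)
n_realizations, n_samples, m_true = 500, 5000, 2.0

# ensemble of realizations: stationary noise with mathematical expectation 2.0 (assumed model)
ensemble = m_true + rng.normal(0.0, 1.0, size=(n_realizations, n_samples))

time_average = ensemble[0].mean()           # time averaging of one realization x_k(t)
ensemble_average = ensemble[:, 100].mean()  # statistical averaging over all realizations at t1

print(f"time average     = {time_average:.3f}")
print(f"ensemble average = {ensemble_average:.3f}")  # both are close to 2.0
```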
In addition to the four characteristics above, random processes are also described by the correlation function and the power spectral density.

The correlation function characterizes the degree of relationship between the values of a random process at the times t and t + τ, where τ is the time shift:

B(τ) = (1/Tn) ∫[0, Tn] xk(t)·xk(t + τ) dt,

where Tn is the observation time of the realization xk(t).

Power spectral density shows the distribution of the power of a random process over frequency:

G(f) = ΔP/Δf,

where ΔP is the power of the random process falling within a small frequency band Δf.

Thus, the observation of a random phenomenon in time is a random process, its occurrence is a random event, and its value is a random variable.

For example, the observation of a telegraph signal at the output of a communication line for some time is a random process, the appearance of its discrete element “1” or “0” at the reception is a random event, and the amplitude of this element is a random variable.

The mathematical apparatus for analyzing stationary random signals is based on the ergodicity hypothesis. According to the ergodicity hypothesis, the statistical characteristics of a large number of arbitrarily chosen realizations of a stationary random signal coincide with the statistical characteristics of one realization of a sufficiently large length. This means that averaging over a set of realizations of a stationary random signal can be replaced by averaging over time of one rather long realization. This greatly facilitates the experimental determination of the statistical characteristics of stationary signals and simplifies the calculation of systems under random influences.

Let us determine the main statistical characteristics of a stationary random signal given as a single realization on a finite time interval (Fig. 11.1.1, a).

Numerical characteristics. Numerical characteristics of a random signal are the mean value (mathematical expectation) and variance.

The average value of the signal over a finite time interval T is

x̄ = (1/T) ∫[0, T] x(t) dt.

If the averaging interval, i.e. the realization length T, tends to infinity, then according to the ergodicity hypothesis the time-averaged value is equal to the mathematical expectation of the signal:

mx = lim (T→∞) (1/T) ∫[0, T] x(t) dt.

Fig. 11.1.1. Realizations of stationary random signals

In what follows, for brevity, we will omit the limit sign in front of integrals over time. In doing so, either the sign ≈ will be used instead of the sign =, or the calculated statistical characteristics will be understood as their estimates.

In practical calculations, when the finite realization is given as N discrete values separated from each other by equal time intervals (see Fig. 11.1.1, a), the average value is calculated using the approximate formula

x̄ ≈ (1/N) Σ xi (summation over i = 1, ..., N).

A stationary random signal can be considered as the sum of a constant component equal to the mean value mx and a variable component x°(t) corresponding to the deviations of the random signal from the mean:

x(t) = mx + x°(t).

The variable component x°(t) is called the centered random signal.

Obviously, the average value of the centered signal is always zero.

Since the spectrum of the signal x(t) coincides (apart from the constant component) with the spectrum of the corresponding centered signal x°(t), in many (but not all!) problems of calculating automatic systems the signal x°(t) can be considered instead of the signal x(t).

The dispersion Dx of a stationary random signal is equal to the average value of the squared deviations of the signal from the mathematical expectation mx, i.e.

Dx = lim (T→∞) (1/T) ∫[0, T] (x(t) − mx)² dt. (11.1.10)

Dispersion D x is a measure of the spread of instantaneous signal values ​​around the mathematical expectation. The greater the ripple of the variable component of the signal relative to its constant component, the greater the dispersion of the signal. The variance has the dimension of x squared.

Dispersion can also be considered as the average power of the variable component of the signal.

Often, the standard deviation is used as a measure of the spread of a random signal.

For the calculation of automatic systems, the following property is important:

the variance of the sum or difference of independent random signals is equal to the sum (!) of the variances of these signals, i.e.

Dx1±x2 = Dx1 + Dx2. (11.1.11)
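A quick numerical check of this property, with arbitrarily chosen variances 4 and 9 for two independent signals, might look as follows:

```python
# Minimal sketch of property (11.1.11): for independent random signals the variance
# of both the sum and the difference equals the sum of the variances.
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 2.0, 200_000)   # D[x1] = 4 (assumed)
x2 = rng.normal(0.0, 3.0, 200_000)   # D[x2] = 9 (assumed)

print(np.var(x1 + x2))  # approximately 13 = 4 + 9
print(np.var(x1 - x2))  # also approximately 13: the cross term vanishes for independent signals
```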

Mathematical expectation and variance are important numerical parameters of a random signal, but they do not characterize it completely: they give no indication of the rate at which the signal changes in time. Thus, for example, for the random signals x1(t) and x2(t) (Fig. 11.1.1, b, c) the mathematical expectations and variances are the same, but the signals nevertheless clearly differ from each other: the signal x1(t) changes more slowly than the signal x2(t).

The intensity of change of a random signal in time can be characterized by one of two functions - correlation or spectral density function.

Correlation function. The correlation function of a random signal x(t) is the mathematical expectation of the products of the instantaneous values of the centered signal separated by the time interval τ, i.e.

Rx(τ) = lim (T→∞) (1/T) ∫[0, T] x°(t)·x°(t + τ) dt, (11.1.12)

where τ is a variable shift between the instantaneous values of the signal (see Fig. 11.1.1, a). The shift τ varies from zero to some value τmax. Each fixed value of τ corresponds to a certain numerical value of the function Rx(τ).

The correlation function (also called autocorrelation) characterizes the degree of correlation (tightness of connection) between previous and subsequent signal values.

As the shift τ increases, the connection between the values x°(t) and x°(t + τ) weakens, and the ordinates of the correlation function (Fig. 11.1.2, a) decrease.

This basic property of the correlation function can be explained as follows. For small shifts τ, the products under the integral sign in (11.1.12) consist of factors x°(t) and x°(t + τ) that, as a rule, have the same sign; therefore most of the products are positive and the value of the integral is large. As the shift increases, more and more factors with opposite signs fall under the integral sign, and the value of the integral decreases. For very large shifts

Fig. 11.1.2. Correlation function (a) and spectral density (b) of a random signal

the factors x°(t) and x°(t + τ) are practically independent; the number of positive products becomes equal to the number of negative products, and the value of the integral tends to zero. It also follows from this reasoning that the correlation function decreases the faster, the faster the random signal changes in time.

It follows from the definition of the correlation function that it is an even function of the argument τ, i.e.

Rx(−τ) = Rx(τ),

therefore only positive values of τ are usually considered.

The initial value of the correlation function of the centered signal is equal to the signal dispersion, i.e.

Rx(0) = Dx.

This equality is obtained from expression (11.1.12) by substituting τ = 0.

The correlation function of a specific signal is determined from an experimentally obtained realization of this signal. If the signal realization is obtained as a continuous chart record of length T, then the correlation function is determined using a special computing device - a correlator (Fig. 11.1.3, a), which implements formula (11.1.12). The correlator consists of a delay block BZ, a multiplication block BU and an integrator I. To determine several ordinates, the delay block is tuned in turn to the various shifts τ1, τ2, ...

If the realization is a set of discrete signal values obtained at regular intervals Δt (see Fig. 11.1.1, a), then the integral (11.1.12) is approximately replaced by the sum

Rx(mΔt) ≈ (1/(N − m)) Σ x°i·x°(i+m) (summation over i = 1, ..., N − m),

which is calculated using a computer.
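One possible form of such a computation is sketched below; the test signal (low-pass filtered noise) and the number of samples are arbitrary assumptions, and the estimate divides by N − m as in the sum above:

```python
# Minimal sketch of the discrete estimate of the correlation function replacing the
# integral (11.1.12) by a sum over the centered samples of one realization.
import numpy as np

def correlation_function(x, max_shift):
    """Estimate R_x(m*dt) for shifts m = 0..max_shift from one realization."""
    x0 = x - x.mean()                       # centered signal
    n = len(x0)
    return np.array([np.mean(x0[: n - m] * x0[m:]) for m in range(max_shift + 1)])

rng = np.random.default_rng(2)
noise = rng.normal(size=5000)
x = np.convolve(noise, np.ones(20) / 20, mode="same")  # slowly changing test signal (assumed)

r = correlation_function(x, max_shift=50)
print(r[0])    # R_x(0) equals the dispersion D_x of the signal
print(r[:5])   # the ordinates decrease as the shift grows
```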

Fig. 11.1.3. Algorithmic schemes for calculating the ordinates of the correlation function (a) and the spectral density (b)

To obtain sufficiently reliable information about the properties of a random signal, the realization length T and the discretization interval Δt must be selected from the conditions

T >> TL,   Δt << TH,

where TL and TH are the periods of the lowest-frequency and highest-frequency components of the signal, respectively.

Spectral density. Let us now define the spectral characteristic of a stationary random signal. Since the function x°(t) is not periodic, it cannot be expanded into a Fourier series (2.23). On the other hand, because of its unlimited duration the function is not integrable and therefore cannot be represented by the Fourier integral (2.28). However, if we consider the random signal on a finite interval T, the function becomes integrable and its direct Fourier transform exists:

X(jω) = ∫[0, T] x°(t)·e^(−jωt) dt.

The Fourier image X(jω) of a non-periodic signal characterizes the distribution of the relative signal amplitudes along the frequency axis and is called the amplitude spectral density, while the function |X(jω)|² characterizes the distribution of the signal energy among its harmonics (see 2.2). Obviously, if the function |X(jω)|² is divided by the duration T of the random signal, it will describe the distribution of the power of the finite signal among its harmonics. If we now let T tend to infinity, the function |X(jω)|²/T tends to the limit

Sx(ω) = lim (T→∞) |X(jω)|²/T, (11.1.18)

which is called the power spectral density of the random signal. In what follows, the function Sx(ω) will be called, for brevity, the spectral density.
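A minimal sketch of this definition, assuming an arbitrary test signal (a 5 Hz harmonic in noise) and sampling step, estimates the spectral density of a finite realization as |X(jω)|²/T using the FFT:

```python
# Minimal sketch of definition (11.1.18): the power spectral density of a finite
# realization is estimated as |X|^2 / T, with X computed by the FFT.
import numpy as np

rng = np.random.default_rng(3)
dt, n = 0.01, 4096
t = np.arange(n) * dt
x = np.sin(2 * np.pi * 5 * t) + rng.normal(0.0, 0.5, n)   # assumed test signal: 5 Hz harmonic + noise
x0 = x - x.mean()                                          # centered signal

X = np.fft.rfft(x0) * dt                  # finite Fourier transform of the realization
T = n * dt
S = np.abs(X) ** 2 / T                    # spectral density estimate
f = np.fft.rfftfreq(n, dt)                # frequency axis, Hz

print(f[np.argmax(S)])                    # the peak lies near 5 Hz, revealing the periodic component
```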

Along with the mathematical definition (11.1.18) of the spectral density, a simpler, physical interpretation can be given: the spectral density of a random signal x(t) characterizes the distribution of the squares of the relative amplitudes of the signal harmonics along the frequency axis ω.

According to definition (11.1.18), the spectral density is an even function of frequency. As ω → ∞ the function Sx(ω) usually tends to zero (Fig. 11.1.2, b), and the faster the signal changes in time, the wider its graph.

Individual peaks in the spectral density graph indicate the presence of periodic components of a random signal.

Let us find the relation between the spectral density and the signal dispersion. We write Parseval's equality (2.36) for the finite realization and divide its left and right parts by T:

(1/T) ∫[0, T] (x°(t))² dt = (1/2π) ∫[−∞, +∞] (|X(jω)|²/T) dω.

As T → ∞, the left side of this equality tends to the dispersion Dx of the signal [see (11.1.10)], and the integrand on the right side tends to the spectral density Sx(ω), i.e. we obtain one of the main formulas of statistical dynamics:

Dx = (1/2π) ∫[−∞, +∞] Sx(ω) dω. (11.1.20)

Since the left side of equality (11.1.20) is the total dispersion of the signal, each elementary component under the integral sign can be considered as the dispersion, or the square of the amplitude, of the harmonic with frequency ω.

Formula (11.1.20) is of great practical importance, since it makes it possible to calculate the dispersion of a signal from its known spectral density; in many problems of calculating automatic systems the dispersion serves as an important quantitative characteristic of quality.

The spectral density can be found from an experimentally obtained realization of the signal using a spectrum analyzer (Fig. 11.1.3, b) consisting of a bandpass filter PF with a narrow bandwidth, a squarer Kv and an integrator I. To determine several ordinates, the bandpass filter is tuned in turn to different transmission frequencies ω1, ω2, ...

The relationship between the functional characteristics of a random signal. N. Wiener and A. Ya. Khinchin were the first to show that the functional characteristics of a stationary random signal are related to each other by the Fourier transform: the spectral density is the image of the correlation function, i.e.

Sx(ω) = ∫[−∞, +∞] Rx(τ)·e^(−jωτ) dτ, (11.1.21)

and the correlation function, accordingly, is the original of this image, i.e.

Rx(τ) = (1/2π) ∫[−∞, +∞] Sx(ω)·e^(jωτ) dω. (11.1.22)

If we expand the factors e^(±jωτ) using the Euler formula and take into account that Rx(τ) and Sx(ω) are even functions while sin ωτ is an odd function, then expressions (11.1.21) and (11.1.22) can be transformed to the following form, more convenient for practical calculations:

Sx(ω) = 2 ∫[0, +∞] Rx(τ)·cos ωτ dτ, (11.1.23)

Rx(τ) = (1/π) ∫[0, +∞] Sx(ω)·cos ωτ dω. (11.1.24)

Substituting the value τ = 0 into expression (11.1.24), we obtain formula (11.1.20) for calculating the variance.

The relations connecting the correlation function and the spectral density have all the properties inherent in the Fourier transform. In particular, the wider the graph of the function Rx(τ), the narrower the graph of the function Sx(ω), and vice versa: the faster the function Rx(τ) decreases, the more slowly the function Sx(ω) decreases (Fig. 11.1.4). Curves 1 in both figures correspond to a slowly changing random signal (see Fig. 11.1.1, b), whose spectrum is dominated by low-frequency harmonics. Curves 2 correspond to a rapidly changing signal x2(t) (see Fig. 11.1.1, c), whose spectrum is dominated by high-frequency harmonics.

If a random signal changes in time very sharply and there is no correlation between its previous and subsequent values, then the function Rx(τ) has the form of a delta function (see Fig. 11.1.4, a, line 3). The spectral density graph in this case is a horizontal straight line in the frequency range from 0 to ∞ (see Fig. 11.1.4, b, straight line 3). This indicates that the amplitudes of the harmonics are the same over the entire frequency range. Such a signal is called ideal white noise (by analogy with white light, in which, as is known, the intensity of all components is the same).

Fig. 11.1.4. Relationship between the correlation function (a) and the spectral density (b)

Note that the concept of "white noise" is a mathematical abstraction. Physical signals in the form of white noise are not feasible, since according to formula (11.1.20), an infinitely wide spectrum corresponds to an infinitely large dispersion, and, consequently, an infinitely large power, which is impossible. However, real signals with a finite spectrum can often be roughly considered as white noise. This simplification is justified in cases where the signal spectrum is much wider than the bandwidth of the system affected by the signal.

For all random signals operating in real physical systems, there is a correlation between previous and subsequent values. This means that the correlation functions of real signals differ from the delta function and have a finite, non-zero decay duration. Accordingly, the spectral densities of real signals always have a finite width.

Characteristics of the relationship between two random signals. To describe the probabilistic relationship between two random signals, the cross-correlation function and the mutual spectral density are used.

The cross-correlation function of stationary random signals x1(t) and x2(t) is determined by the expression

Rx1x2(τ) = lim (T→∞) (1/T) ∫[0, T] x°1(t)·x°2(t + τ) dt. (11.1.25)

The function Rx1x2(τ) characterizes the degree of connection (correlation) between the instantaneous values of the signals x1(t) and x2(t) separated from each other by the time shift τ. If the signals are not statistically related (not correlated) with each other, then Rx1x2(τ) = 0 for all values of τ.

For the cross-correlation function, the following relation follows from definition (11.1.25):

Rx1x2(τ) = Rx2x1(−τ). (11.1.26)

The correlation function of the sum (difference) of two correlated signals is determined by the expression

Rx1±x2(τ) = Rx1(τ) + Rx2(τ) ± [Rx1x2(τ) + Rx2x1(τ)]. (11.1.27)

The mutual spectral density of random signals x1(t) and x2(t) is defined as the Fourier image of the cross-correlation function:

Sx1x2(ω) = ∫[−∞, +∞] Rx1x2(τ)·e^(−jωτ) dτ. (11.1.28)

It follows from definition (11.1.28) and property (11.1.26) that

Sx1x2(ω) = Sx2x1(−ω).

The spectral density of the sum (difference) of the random signals x1(t) and x2(t) is

Sx1±x2(ω) = Sx1(ω) + Sx2(ω) ± [Sx1x2(ω) + Sx2x1(ω)]. (11.1.29)

If the signals x1(t) and x2(t) are not correlated with each other, then expressions (11.1.27) and (11.1.29) are simplified:

Rx1±x2(τ) = Rx1(τ) + Rx2(τ),   Sx1±x2(ω) = Sx1(ω) + Sx2(ω). (11.1.31)

Relations (11.1.31), as well as (11.1.11), mean that the statistical characteristics Rx(τ), Sx(ω) and Dx of the sum of several random signals uncorrelated with each other are always equal to the sum of the corresponding characteristics of these signals (regardless of the sign with which the signals enter this sum).

Typical random impacts. The real random impacts affecting industrial control objects are very diverse in their properties. However, by resorting to a certain idealization in their mathematical description, one can single out a limited number of typical (standard) random impacts. The correlation functions and spectral densities of typical impacts are fairly simple functions of the arguments τ and ω, and their parameters, as a rule, can easily be determined from experimental realizations of the signals.

The simplest typical impact is white noise with a limited bandwidth. The spectral density of this impact (Fig. 11.1.5, a) is described by the function

Sx(ω) = S0 for |ω| ≤ ωc,   Sx(ω) = 0 for |ω| > ωc,

where S0 is the intensity of the white noise and ωc is the cutoff frequency of its spectrum. The signal dispersion, according to (11.1.20), is

Dx = (1/2π) ∫[−ωc, +ωc] S0 dω = S0·ωc/π. (11.1.33)

The correlation function, according to (11.1.24), in this case has the form

Rx(τ) = (1/π) ∫[0, ωc] S0·cos ωτ dω = (S0·ωc/π)·sin(ωcτ)/(ωcτ). (11.1.34)

Taking (11.1.33) into account, function (11.1.34) can be written in the following form:

Rx(τ) = Dx·sin(ωcτ)/(ωcτ). (11.1.35)

The graph of the function (11.1.35) is shown in fig. 11.1.5, b.

Fig. 11.1.5. Spectral densities and correlation functions of typical random signals

Most often in practical calculations one encounters signals with an exponential correlation function (Fig. 11.1.5, d)

Rx(τ) = Dx·e^(−αx·|τ|). (11.1.36)

Applying transformation (11.1.23) to the correlation function (11.1.36), we find the spectral density (Fig. 11.1.5, c)

Sx(ω) = 2·Dx·αx/(αx² + ω²). (11.1.37)

The larger the parameter αx, the faster the correlation function decreases and the wider the spectral density graph becomes. The ordinates of the function Sx(ω) decrease with increasing αx. As αx → ∞, the signal under consideration approaches ideal white noise.
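The consistency of the pair (11.1.36)-(11.1.37) with formula (11.1.20) can also be verified numerically; in the sketch below the values Dx = 4 and αx = 2 are arbitrary assumptions:

```python
# Minimal sketch: integrating the spectral density (11.1.37) over all frequencies and
# dividing by 2*pi reproduces the dispersion D_x, in accordance with (11.1.20).
import numpy as np

d_x, alpha = 4.0, 2.0                              # assumed dispersion and parameter alpha_x
w = np.linspace(-2000.0, 2000.0, 2_000_001)        # wide frequency grid (rad/s)
s = 2.0 * d_x * alpha / (alpha ** 2 + w ** 2)      # spectral density (11.1.37)

d_from_spectrum = np.sum(s) * (w[1] - w[0]) / (2.0 * np.pi)   # formula (11.1.20)
print(d_from_spectrum)                             # approximately 4.0, i.e. equal to D_x
```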

In approximate calculations, the parameter αx can be determined directly from the realization of the signal, from the average number of crossings of the time axis by the centered signal per unit time.

Often a random signal contains a latent periodic component. Such a signal has an exponential-cosine correlation function (Fig. 11.1.5, e)

Rx(τ) = Dx·e^(−αx·|τ|)·cos βxτ. (11.1.38)

The parameter βx of this function determines the average value of the "period" 2π/βx of the latent component, and the parameter αx characterizes the relative intensity of the other random components superimposed on the periodic component. If αx is small compared with βx, the relative level of these components is small and the mixed signal is close to harmonic. If αx is commensurate with βx, the level of the random components is comparable to the "amplitude" of the periodic component. For sufficiently large values of αx, the correlation function (11.1.38) practically coincides (to within 5 %) with the exponential function (11.1.36).

Using fuzzy logic methods to determine the classification characteristics of random processes

A.M. Prokhorenkov (1), N.M. Kachala (2)

1 Polytechnic Faculty, Department of Automation and Computer Engineering

2 Faculty of Economics, Department of Information Systems

Annotation. The paper considers the issues of the need to classify random processes that take place in process control systems, analyzes the informative features and existing approaches to the classification of processes. An approach is proposed in which the classification features are the class of the process (stationary, non-stationary), the type of process (additive, multiplicative, additive-multiplicative) and the type of the deterministic component. An algorithm for classifying random processes by one implementation is proposed, based on the use of nonparametric criteria, the Hurst exponent, the Bayesian classification procedure, and fuzzy logic.

Abstract. In the paper, the necessity of classifying random processes in industrial control systems is considered. Informative features and existing approaches to classification are analyzed. A new approach is suggested, in which the classification features are the class of the process (stationary or non-stationary), the type of the process (additive, multiplicative or additive-multiplicative) and the kind of the deterministic component. A realization-based algorithm for the classification of random processes is proposed; it relies on non-parametric criteria, the Hurst exponent, a Bayesian classification procedure and fuzzy logic.

1. Introduction

At present, one of the main directions for improving automatic control systems (ACS) is to increase the accuracy of control and stabilization of technological parameters within fairly narrow limits.

An important role in solving the problem of increasing control accuracy is assigned to the measuring subsystem, which is part of the ACS. The random nature of disturbing influences and controlled variables implies the use of a procedure for statistical processing of measurement results, which causes the presence of such error components as statistical error and error caused by the inadequacy of the processing algorithm to a real random process. The reason for the latter type of error is the error in the classification of the observed process. For example, by classifying a non-stationary process as stationary, one can increase the methodological error in estimating the mathematical expectation by increasing the smoothing interval. In turn, the complication of the measurement algorithm in order to reduce the methodological error, as a rule, leads to an increase in the instrumental error. Establishing a priori the class of the process largely determines the algorithm for processing measurement results and hardware.

In ACS, the need to classify random processes is also due to the requirements of a reasonable transition from the analysis of an ensemble of implementations to the analysis of a single implementation. In addition, knowledge of the process class is necessary to describe its dynamics, predict its future values, and select control algorithms.

2. Analysis of informative features and approaches to the classification of random processes

A common approach in classifying objects of any nature, including random processes, is to identify informative features. The analysis performed showed that the informative features used in the classification of processes are quite diverse and are determined by the classification goal set by the authors.

All observable processes X(t), which characterize physical phenomena, in the most general form can be classified as deterministic and random.

A deterministic process is defined by a single realization described by a given function of time. Because of the inevitable influence of various factors external and internal with respect to the control system, a deterministic process is an abstraction. For this reason, in the practice of studying processes a quasi-deterministic process is considered, whose realizations are described by time functions of a given form x(t; a1, a2, ..., an), where a1, a2, ..., an are time-independent random parameters.

In contrast to a deterministic process, a random process is represented as a random function X(t, ω), where t is time and ω ∈ Ω, with Ω the space of elementary events. At any instant of time the function X(t, ω) can take different values with a known or unknown distribution law.

Assigning a process to the class of random can be due either to its physical nature or to the conditions of its study, leading to insufficient a priori data. If the classification is based on the causes of the occurrence of randomness, then non-singular and singular processes can be distinguished. The first group includes processes for which it is impossible to trace the nature of cause-and-effect relationships, since they are the result of a superposition of a large number of elementary processes. For non-singular processes, it is fundamentally impossible to predict instantaneous values. For processes of the second group, in the presence of a certain amount of data, the prediction of their instantaneous values ​​becomes reliable. Singular processes can be either random or deterministic. In control systems for technological objects, all processes should be considered as random, and for processing the results of observations in real time, the reason for the randomness of the process does not play a role.

In the theory of random processes, the most general classification is the classification "by time" and "by state" (Wentzel, Ovcharov, 2000; Kovalenko et al., 1983; Levin, 1989). According to these features, four classes can be distinguished: 1) processes with discrete states and discrete time; 2) processes with discrete states and continuous time; 3) processes with continuous states and discrete time; 4) processes with continuous states and continuous time.

The processes occurring in automatic control systems are random processes with continuous states and continuous time. The use of digital measuring technology leads to the need to consider processes at discrete times and assign them to the first or third class.

An exhaustive characteristic of a random process is its multidimensional distribution law:

Fn(x1, t1; x2, t2; ...; xn, tn) = P{X(t1) < x1, X(t2) < x2, ..., X(tn) < xn}.

In practice, as a rule, one-dimensional or two-dimensional distribution laws of a random process are considered, since they contain a sufficient amount of information about the properties of a random process, and the increase in the amount of information when using higher-order probabilistic characteristics turns out to be insignificant. In addition, the determination of multidimensional probabilistic characteristics is associated with great difficulties in the hardware implementation of algorithms for their calculation.

Taking into account the change of probabilistic characteristics over time, random processes are divided into stationary (SSP) and non-stationary (NSP) processes. The probabilistic characteristics of an SSP are the same in all of its sections. The condition of stationarity in the narrow sense is the invariance of the n-dimensional probability density with respect to a time shift τ. The conditions of stationarity in the broad sense are limited to the requirements that the mathematical expectation M[X(t)] and the variance D[X(t)] be independent of time and that the correlation function depend only on the time shift τ, that is:

M[X(t)] = const,   D[X(t)] = const,   Rx(t1, t2) = Rx(τ),   τ = t2 − t1.

In practice, in most cases, the correlation function is a fairly complete characteristic of an SSP; therefore, one is usually limited to establishing the stationarity of a process in the broad sense.

The structure of a random process can be established by the correlation function or by the known distribution density.

Depending on the type of the distribution law, normal, uniform, Rayleigh, Poisson and other random processes can be distinguished. A deviation from the classical form of the distribution indicates non-stationarity of the process. From a single realization of limited length it is difficult to judge the distribution law of a random process with sufficient accuracy, and in most applied cases of analysis the researcher has no information about the form of the distribution function. The type of the process is then either postulated, or the distribution function is not taken into account in the analysis.

More complete information about the dynamic properties of the process can be obtained from the correlation function. A typical correlation function of an SSP is a symmetric decreasing function. The presence of fluctuations in the correlation function indicates periodicity of the random process. If the correlation function is aperiodically damped, then the random process is considered broadband. A multiband random process is characterized by a triangular correlation function. Processes that are stationary in the broad sense have correlation functions that, as τ increases without bound, tend to a constant value or are periodic functions of τ.

Stationary processes whose correlation functions contain an exponential factor with a negative argument are ergodic. The tendency of the correlation function towards some constant value other than zero is usually a sign of a non-ergodic process.

Determination of the statistical characteristics of random processes is fundamentally possible in two ways: determination by one realization and by an ensemble of realizations. If the probabilistic characteristics of the process obtained by averaging over time are equal to those found by averaging over the ensemble, then the random process is ergodic. Processes that do not have the ergodicity property can only be processed over an ensemble of realizations.

A priori knowledge of the ergodicity of the process greatly simplifies the algorithmic support of information-measuring and information-control complexes. Under the conditions of real technological processes and control systems, it is impossible to check the global ergodicity of processes, and it is accepted as a hypothesis.

Non-stationary processes are characterized by a change of their statistical characteristics in time, and this can be taken into account when performing the classification. From this point of view, one usually singles out processes with a time-varying mean value; with a time-varying mean square; with time-varying mean and mean square; and with a time-varying frequency structure (Bendat, Piersol, 1989). Such a classification reflects the change in time of the estimates of the probabilistic characteristics.

The above analysis showed that there cannot be a unified classification of processes due to the independence of classification features and the diversity of classification purposes. There are several approaches to the classification of processes. A significant part of the authors seek to systematize information about random processes in order to show all their diversity (Ventzel, Ovcharov, 2000; Kovalenko et al., 1983; Levin, 1989; Shakhtarin, 2002). The most general approach to the classification of both stationary and non-stationary processes is associated with their continuous or discrete representation (Ventzel and Ovcharov, 2000; Kovalenko et al., 1983; Levin, 1989).

In applied cases, the specifics of the tasks whose solution must be preceded by classification of the observed processes are taken into account. For example, in (Tsvetkov, 1973; 1984; 1986) processes in metrology are classified according to the features of stationarity and ergodicity in order to identify the causes of the methodological error in measuring the statistical characteristics of random processes and to analyze their influence. In radio engineering, classification according to the spectral properties of signals is widely used (Levin, 1989). To justify the transition from the analysis of an ensemble of realizations to the analysis of individual realizations, it is proposed in (Bendat, Piersol, 1989) to perform a classification by types of non-stationarity, considering at the same time the behavior in time of the estimates of the statistical characteristics.

Thus, the currently existing approaches to the classification of random processes do not make it possible to develop an algorithm for analyzing a process from a single realization in order to identify the nature of its non-stationarity, the type of its deterministic components and the characteristics needed for solving problems of operational monitoring and control of technological processes. In this regard, solutions aimed at generalizing and improving the existing approaches to the classification of random processes are relevant.

3. Classification of random processes according to one implementation

Random processes occurring in control systems can be represented as the result of the combined action of a deterministic useful signal and stationary interference. In the general case, the effect of the interference on the useful signal can be expressed by an operator V acting on the deterministic components and the noise. Depending on the type of the operator V, the following signal models are distinguished (Kharkevich, 1965):

additive model X(t) = φ1(t) + e(t); (1)

multiplicative model X(t) = φ2(t)·e(t); (2)

additive-multiplicative model X(t) = φ1(t) + φ2(t)·e(t), (3)

where φ1(t), φ2(t) are deterministic functions of time, and e(t) is a stationary random process with zero mathematical expectation me = 0 and constant variance De.
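For illustration, realizations of the three models can be generated numerically; in the following sketch the deterministic components φ1(t), φ2(t) and the noise parameters are arbitrary assumptions:

```python
# Minimal sketch of the signal models (1)-(3) with assumed deterministic components.
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 10.0, 1000)
e = rng.normal(0.0, 1.0, t.size)      # stationary noise e(t): m_e = 0, D_e = 1

phi1 = 0.5 * t                        # assumed deterministic component phi1(t), a linear trend
phi2 = 1.0 + 0.3 * t                  # assumed deterministic component phi2(t) scaling the noise

x_additive = phi1 + e                 # model (1): non-stationary in MO
x_multiplicative = phi2 * e           # model (2): non-stationary in dispersion
x_add_mult = phi1 + phi2 * e          # model (3): non-stationary in MO and in dispersion

half = t.size // 2
print(np.var(x_multiplicative[:half]), np.var(x_multiplicative[half:]))  # the dispersion grows with time
```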

An example of an additive process is the output signal of a measuring instrument when the useful signal is added to the instrument's internal noise. A change in the stiffness of the pressure gauge sensor membrane, a change in the gain of the amplifier, a change in the reference voltage in a digital voltmeter, and others are the causes of the multiplicative error of measuring systems, which is described by a multiplicative model. In many cases, the non-stationary process of errors can be described in the form of an additive-multiplicative model.

In engineering practice, processes that are stationary in a broad sense are usually considered, while the behavior of the mathematical expectation, variance, and correlation function is estimated in time. Therefore, when classifying non-stationary processes, one should proceed from the analysis of the same characteristics.

Taking into account the assumptions made, the mathematical expectation mX, the variance DX and the correlation function RX of the random processes represented by models (1-3) have the following form:

for the additive model

mX(t) = φ1(t);   DX(t) = De;   RX(t1, t2) = Re(t1, t2); (4)

for the multiplicative model

mX(t) = 0;   DX(t) = φ2²(t)·De;   RX(t1, t2) = φ2(t1)·φ2(t2)·Re(t1, t2); (5)

for the additive-multiplicative model

mX(t) = φ1(t);   DX(t) = φ2²(t)·De;   RX(t1, t2) = φ2(t1)·φ2(t2)·Re(t1, t2). (6)

It follows from the above relations that the mathematical expectation for the additive and additive-multiplicative models depends on the deterministic component φ1(t). The dispersion and the correlation function of the additive model are fully determined by the properties of the stationary noise, while for the multiplicative and additive-multiplicative models these probabilistic characteristics are also determined by the deterministic component φ2(t).

Expressions (4) and (6) show that for processes represented by additive and additive-multiplicative models, the mathematical expectation can be estimated from one implementation using one or another operation equivalent to low-pass filtering.

If the variance of the noise e(t) is constant, then the mean square of the multiplicative and additive-multiplicative processes (and thereby an estimate of the variance) can also be determined from one realization (Bendat, Piersol, 1989).

Thus, for the processes represented by models (1-3), there is no need to check the ergodic properties of a non-stationary random process.

The accuracy of estimating the statistical characteristics depends on the type and parameters of the deterministic components φ1(t) and φ2(t) (Prokhorenkov, 2002), so the classification of processes according to the type of non-stationarity should be supplemented by a classification according to the type of the deterministic components.

Classification should be considered as a necessary preliminary stage in the study of random processes, intended to reveal their properties before the main statistical processing is carried out; in this sense the classification should reflect the algorithm for analyzing the observed process. With this in mind, a classification of random processes represented by a single realization of the process under study was developed (Fig. 1). The classification features chosen are the class of the process and the type of non-stationarity: non-stationarity in mathematical expectation (MO), non-stationarity in dispersion, non-stationarity in correlation function (CF), as well as the laws of variation of the mathematical expectation and the dispersion. The deterministic components considered in the proposed classification are those most common in engineering practice: linear, exponential, periodic, and periodically damped.

The classification tree in Fig. 1 is organized as follows: the realization of the random process is first divided into processes stationary and non-stationary in MO; each of these branches is then subdivided into processes stationary (SP) and non-stationary (NSP) in dispersion and in CF; for the non-stationary branches, the law of variation of the deterministic component is identified as linear, exponential, periodic, or periodically damped.

Fig. 1. Classification of random processes represented by one realization

4. Statement of the problem of classification of random processes

In the general case, classification is understood as the division of the considered set of objects or phenomena into groups that are homogeneous in a certain sense, or the assignment of each object of a given set to one of previously known classes. In the second case we have a classification problem in the presence of training samples ("classification with training"). In its classical form, the solution of this problem consists in performing a mapping of the form

R = (r1, r2, ..., rn) → di ∈ (d1, d2, ..., dm),

i.e. the assignment of an object, described by the vector of informative features R = (r1, r2, ..., rn), to one of the predetermined classes d1, d2, ..., dm.

The processes represented by models of the form (1-3) belong to the class of non-stationary random processes. To identify non-stationary properties, it is proposed to use nonparametric criteria (Kendall, Stuart, 1976), the Hurst exponent (Feder, 1991) and correlograms, from the results of which the vector of informative features R is formed.

A significant majority of nonparametric criteria respond to changes in the estimate of the mathematical expectation. Thus, nonparametric criteria without preliminary processing of the observed series make it possible to single out two classes of processes "stationary in terms of mathematical expectation" and "nonstationary in terms of mathematical expectation".

By the value of the Hurst exponent, one can judge both the stationarity of the process in terms of mathematical expectation, and the form of the deterministic component. This allows us to consider a priori three classes of processes: stationary with respect to mathematical expectation; non-stationary in terms of mathematical expectation, changing according to a monotonic law; non-stationary in mathematical expectation, changing according to a periodic law.
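A simplified sketch of estimating the Hurst exponent by R/S analysis is given below; the window sizes and test signals are arbitrary assumptions, and the code is only a rough illustration of the feature used here:

```python
# Rough sketch of R/S analysis: the Hurst exponent is the slope of log(R/S) versus
# log(n). H near 0.5 suggests noise, H close to 1 suggests a monotone component.
import numpy as np

def hurst_rs(x, window_sizes=(16, 32, 64, 128, 256)):
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_values = []
        for start in range(0, len(x) - n + 1, n):
            seg = x[start:start + n]
            dev = np.cumsum(seg - seg.mean())   # cumulative deviation from the segment mean
            r = dev.max() - dev.min()           # range of the cumulative deviation
            s = seg.std()                       # standard deviation of the segment
            if s > 0:
                rs_values.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs_values)))
    return np.polyfit(log_n, log_rs, 1)[0]      # slope = Hurst exponent estimate

rng = np.random.default_rng(5)
print(hurst_rs(rng.normal(size=2048)))                                        # noise: H near 0.5 (biased upward for short series)
print(hurst_rs(np.linspace(0.0, 1.0, 2048) + 0.001 * rng.normal(size=2048)))  # noisy monotone trend: H close to 1
```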

As noted in Section 2, the correlation function carries information about the dynamic properties of the process under study. The output of the correlogram beyond the 95% confidence interval allows, to a certain extent, to judge how the process under study differs from white noise.

The first problem is that the classification procedure cannot simultaneously single out the classes of processes non-stationary in mathematical expectation and those non-stationary in dispersion; this makes it necessary to apply the classification procedure twice.

The second problem is that informative features are given on different scales. The result of applying each non-parametric criterion separately is measured on a dichotomous scale, and the sign can take two values: "a random process does not contain a deterministic component" - "the process contains a deterministic component", or "0" and "1". And the Hurst exponent is measured on a quantitative scale and takes values ​​in the range from zero to one.

Randomness criteria have different efficiency for different types of deterministic components of non-stationary random processes; therefore, under conditions of limited a priori information about the properties of the process under study, the decision on the class of the process should be made from the results of applying a set of criteria. In this regard, it is proposed to form a certain generalized classification feature. The classification based on the nonparametric criteria is proposed to be built on the Bayesian procedure for binary features (Afifi, Azen, 1982). The estimates obtained in this way are further considered as the generalized result of applying the nonparametric criteria, and the posterior probability is taken as the classification feature. In this case the measurement scale becomes the same as that of the Hurst exponent.

The third problem is related to the dependence of the values ​​of the selected classification features on the length of the implementation and the parameters of the process under study, which are unknown at the stage of process classification. Therefore, one should look for an answer to the question: "To what extent does the process under study belong to this or that class?". Due to this formulation of the question, it is proposed to use fuzzy logic methods to classify processes.

5. Bayesian classification procedure

It is required to classify the process X(t) based on the presence or absence of n events. The number of events (features) is equal to the number of nonparametric criteria considered. For each j-th event (j = 1, 2, ..., n) we define a random variable

Tj = 1 if the j-th event is present, and Tj = 0 if the j-th event is absent.

In our case, Tj = 1 if, according to criterion j, a tendency of change of the mathematical expectation is present in the process X(t) under study, and Tj = 0 otherwise.

The probability that, for an object of class di, the feature Tj takes the value one is denoted by pij, j = 1, 2, ..., n. Since the nonparametric criteria make it possible to divide the set of processes under study into stationary and non-stationary ones, in this case m = 2.

The distribution law of Tj for class di has the form:

fi(Tj) = pij^Tj · (1 − pij)^(1 − Tj).

The results Tj of applying the nonparametric criteria are independent, so the joint distribution law fi(T) for class di can be written as

fi(T) = Π fi(Tj) (product over j = 1, ..., n).

Assume that the prior probabilities are equal, q1 = q2 = 0.5, and that the costs of misclassification are equal. The cost of misclassification in this case is associated with the losses that may occur when a stationary process is assigned to the class of non-stationary processes or, conversely, a non-stationary process is assigned to the class of stationary processes. The conditional probability Pr(di | T) that the process under study belongs to class di for a given vector of observations (the posterior probability) is determined by the formula (Afifi, Azen, 1982):

Pr(di | T) = qi · Π pij^Tj·(1 − pij)^(1 − Tj) / Σk qk · Π pkj^Tj·(1 − pkj)^(1 − Tj),

where the products are taken over j = 1, ..., n and the sum over k = 1, ..., m.

The process X(t) is assigned to the class for which the value Pr(di | T) is maximal. The values pij are estimated from a training sample of processes belonging to all the considered models (1-3) and containing various types of deterministic components. Let s1 and s2 be the numbers of processes non-stationary and stationary in MO, respectively, s = s1 + s2. Denote by sij the number of processes of class i for which non-stationarity in MO is detected by criterion j; then pij is estimated by the ratio sij/si.

For each newly arriving process X(t), characterized by the vector of feature values (T1, ..., Tn), the estimate of the posterior probability has the form

Pr(di | T) = Π (sij/si)^Tj·(1 − sij/si)^(1 − Tj) / Σk Π (skj/sk)^Tj·(1 − skj/sk)^(1 − Tj). (7)
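A minimal sketch of this Bayesian procedure for binary features is given below; the probabilities pij and the feature vectors are invented for the example:

```python
# Minimal sketch of the Bayesian classification procedure for binary features:
# the posterior probability of each class is computed from assumed estimates p_ij.
import numpy as np

# p[i][j]: estimated probability that criterion j gives T_j = 1 for class i
# (values assumed for the example; in practice they come from a training sample).
p = np.array([
    [0.90, 0.80, 0.85],   # class d1: non-stationary in MO
    [0.10, 0.15, 0.20],   # class d2: stationary in MO
])
q = np.array([0.5, 0.5])  # equal prior probabilities

def posterior(t):
    """Posterior probabilities Pr(d_i | T) for the binary feature vector t, as in formula (7)."""
    t = np.asarray(t)
    likelihood = np.prod(p ** t * (1 - p) ** (1 - t), axis=1)
    weighted = q * likelihood
    return weighted / weighted.sum()

print(posterior([1, 1, 0]))   # most criteria detect non-stationarity: class d1 dominates
print(posterior([0, 0, 0]))   # no criterion detects non-stationarity: class d2 dominates
```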

6. Proposed fuzzy classification procedure

Each classification feature Kj is specified by a linguistic variable characterized by a triple of elements <Kj, Tj, Πj>, where Kj is the name of the variable and Tj is a term-set, each element of which is represented as a fuzzy set on the universal set Πj.

The universal set of values of the Hurst exponent is ΠH = [0, 1]. Values of H in the region 0.4 < H < 0.6 define the white-noise region in the fuzzy sense. Values of H in the vicinity of 0.3 ± 0.1 indicate the presence of a periodic component in the time series under consideration. Values of H close to one indicate the presence of a monotone component in the process under study.

Let us define the term-set as the names of the possible components of non-stationary random processes: "periodic", "stationary", "monotone". The membership functions are defined as a double (two-sided) Gaussian function specified by the relation

μ(x; σ1, c1, σ2, c2) = exp(−(x − c1)²/(2σ1²)) for x < c1,   1 for c1 ≤ x ≤ c2,   exp(−(x − c2)²/(2σ2²)) for x > c2.

This membership function reflects the fact that each type of process is characterized by a certain range of values of the Hurst exponent: the core of the fuzzy set is non-empty. Studies have shown that the probability of erroneously attributing a process containing a periodic component to noise is higher than the probability of erroneously attributing a noisy monotone process to noise. The asymmetric double Gaussian function makes it possible to reflect this fact. The membership functions of the linguistic variable "Hurst exponent" before tuning of the fuzzy model are shown in Fig. 2a.

The universal set of values for the estimate of the posterior probability (7) is ΠPr = [0, 1]. Estimate values close to one indicate the presence of a deterministic component in the series under study, and values close to zero indicate randomness of the series. The term-set of the variable "nonparametric criteria" is defined as ("stationary", "non-stationary"). The terms can be formalized using the double Gaussian membership function (Fig. 2b).
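A sketch of such double Gaussian membership functions, with parameter values assumed solely for illustration and roughly matching the Hurst-exponent regions described above, might look as follows:

```python
# Minimal sketch of an asymmetric double (two-sided) Gaussian membership function:
# a Gaussian left branch, a flat core between c1 and c2, and a Gaussian right branch.
import numpy as np

def gauss2(x, sigma1, c1, sigma2, c2):
    x = np.asarray(x, dtype=float)
    mu = np.ones_like(x)                 # core of the fuzzy set: membership equal to 1
    left, right = x < c1, x > c2
    mu[left] = np.exp(-((x[left] - c1) ** 2) / (2 * sigma1 ** 2))
    mu[right] = np.exp(-((x[right] - c2) ** 2) / (2 * sigma2 ** 2))
    return mu

h = np.linspace(0.0, 1.0, 11)
print(gauss2(h, 0.05, 0.4, 0.10, 0.6))   # term "stationary" (white-noise region), assumed parameters
print(gauss2(h, 0.05, 0.2, 0.05, 0.4))   # term "periodic", assumed parameters
print(gauss2(h, 0.10, 0.8, 0.10, 1.0))   # term "monotone", assumed parameters
```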

Let us call the third linguistic variable "correlogram"; its universal set of values is ΠK. The classification rules relating the values of the input variables to the output classes are collected in a fuzzy knowledge base; wlp denotes the weight coefficient of the rule with number lp.

As the solution, the class with the maximum degree of membership is chosen:

d*: μd*(X*) = max(μd1(X*), μd2(X*), ..., μdm(X*)),

where the symbol * denotes the vector of values ​​of the classification features of the process under study.

Tuning is finding the parameters of the membership functions of the input variables and the weight coefficients of the rules that minimize the deviation between the desired and actual behavior of the fuzzy classifier on the training set.

The proximity criterion can be defined in various ways. In this work we used the criterion proposed in (Shtovba, 2002). The training sample is formed from L data pairs connecting the inputs X = (x1, x2, ..., xn) with the output y of the dependence under study: (Xq, yq), q = 1, 2, ..., L. Let us introduce the following notation: P is the vector of parameters of the membership functions of the input terms; W is the vector of weight coefficients of the knowledge-base rules; F(Xq, P, W) is the result of inference on the fuzzy knowledge base with parameters (P, W) for the input values Xq; μdi(yq) is the degree of membership of the value of the output variable y in the q-th pair of the training sample to the solution di; μdi(Xq, P, W) is the degree of membership of the output of the fuzzy model with parameters (P, W) to the solution di, determined by formula (8) for the input values from the q-th pair of the training sample. As a result, the optimization problem takes the following form:

(1/L) · Σq δq · Σi (μdi(yq) − μdi(Xq, P, W))² → min over (P, W),

where the outer sum is taken over q = 1, ..., L, the inner sum over i = 1, ..., m, and δq = 1 if yq = F(Xq, P, W), while δq is a penalty weight otherwise.

Fig. 3. Membership function of the linguistic variable "Hurst exponent" after tuning