# ~.cal File

This section describes an optional output file (`~.cal`) that is produced only when indirect calibration is used. It displays a great deal of information related to the problem of shifting the hypocentroid of a relocation (either uncalibrated or calibrated with the direct method) to best match a set of calibration locations that were specified by the cal_ command.

If there is only a single calibration event the calculation is straight-forward. The uncertainty of the calibrated hypocentroid will be the matrix sum of the covariance matrix for the calibration location and the cluster vector of the calibration event. With multiple calibration events the process is much more complicated. To fully understand it, careful inspection of the source code in the module `cal_shift.f90` is required. That code has a lengthy comment section describing the approach, reproduced here:

When calibration locations are available for one of more cluster events, calculate the shift needed to bring the cluster into best alignment (the “optimal” shift vector). The original method (mode = 0) was just to take the average of all the individual shift vectors. this is still supported in the code but to use it you would need to edit the value of “mode” in the call to this subroutine and recompiled. The preferred way (mode= 1, hard-wired in the code) is to take into account the uncertainties of the calibration data and the cluster vectors, through their covariance matrices, and also to consider the possibility of bias that is not included in the formal covariance matrices. Covariance matrices for calibration locations are input by the user, using the command “cal_”.

The calibration event covariance matrices (rcv) are assumed to be already scaled to a 90% confidence level. This could be done, for example, by applying a utility program that converts from the confidence ellipse calculated in some other code (e.g., a single event location code) to an equivalent covariance matrix. In the case of InSAR analysis or geologic information (fault trace) it may be seat-of-the-pants. The covariance matrices for cluster vectors (ccv) are first converted to 90% confidence ellipses with the normal Bayesian approach, then converted back to raw covariances (subroutine ell2cv), after which they can be added to the calibration event covariances (because they are completely independent estimates), yielding the combined covariances (scv). These are converted (no scaling) directly to 90% confidence ellipses for the combination. The area of these combined confidence ellipses is used for inverse weighting of the shift vectors estimated for each calibration event, to get the estimate of the optimal shift vector for the entire cluster.

The uncertainty of the optimal shift vector (gcv) is based on the weighted (same weights as for the individual shift vectors) combination of the combined covariancess (scv), expressed as a confidence ellipse (90%). This is only part of the uncertainty, however. There is additional uncertainty, which can be thought of as associated with the “unmodelled” or “biased” part of the indirect calibration problem; it arises if any of the calibration event locations are biased. Such bias could arise with local network solutions if the velocity model is poorly chosen, or if readings from too great a distance have been used, or if some phases are mis-identified or outliers are not properly weighted. It can also arise if a serious error (e.g., an outlier reading, with poor azimuthal coverage) has caused bias in the estimate of a cluster vector in the hypocentroidal decomposition analysis. Such bias is seen by comparing the individual estimates of shift vector to the optimal shift vector. If there are departures that are statistically unlikely, given the alleged confidence of the calibration locations and cluster vectors, then there is a need to account for the unmodeled error. Graphically, it is easily seen by plotting residual shift vectors (subtract the optimal shift vector) and comparing to the confidence ellipses of each shift vector (scv).

Instead of taking the RMS of the residual shift vectors to estimate the level of inconsistency (this is too sensitive to outliers and overestimates the uncertainty), I do a test based on the coverage statistic for the residual calibration shift vectors, for which the cumulative binomial probability distribution gives the probability of the observed number of “uncovered” vectors. See ‘rdbt_test2’ for more explanation. If the null hypothesis is rejected, the test is repeated after adding a small amount to each covariance matrix, equivalent to adding a circular uncertainty. When the null hypothesis cannot be rejected, this extra covariance is converted to a “radius of doubt”, which is added to the modelled uncertainty (gcv) to yield an “augmented GT covariance matrix” (agcv). The area of the equivalent ellipse is converted to a circular area, and the radius of that circle is a useful measure of the calibration level of the cluster. The calibration levels of individual events (accv) are found by adding their cluster vector covariances to agcv.

I brought back the older algorithm (rdbt_test1) for radius of doubt (v8.1) that is based on a test of the hypothesis that all residual shift vectors have zero length. I have kept the test based on coverage statistics for now, but it has the problem that we seldom have more than a couple calibration events and the 90% coverage requirement translates into 100% coverage requirement if there are fewer than 10 calibration events. If there are 10 or more calibration events then the larger of the two estimates is used. Otherwise the test based on zero-length residual vectors is used.

These concepts are illustrated with the `~.cal` file from a cluster based on nuclear tests at the Chinese test site Lop Nor. The Chinese government has never released official information on these tests but many research projects have established likely associations between specific tests and test sites (boreholes and adits) observable in satellite imagery. The hypocentroid was estimated from only P arrivals (command phyp) in the distance ranage 30-90°, which normally provides a fairly good estimate of the absolute location, but certainly not a calibrated one. The first part of the file is:

On 15 calibration events: Event 5 Cluster vector CV for event 5 kcrit2: 5.463 90% confidence ellipse: -62.033 2.661 3.060 Area: 25.575 km**2 Calibration location CV for event 5 90% confidence ellipse: 0.000 1.000 1.000 Area: 3.142 km**2 Weights for event 5 w12 = 0.348 w3 = 1.000 w4 = 21.033 Covariance matrices for calibration event 5 ccv: 1.622 0.173 0.173 1.388 0.000 0.012 sccv: 8.860 0.945 0.945 7.581 0.000 0.038 rcv: 1.000 0.000 0.000 1.000 1.000 0.010 scv: 9.860 0.945 0.945 8.581 1.000 0.048

Event 5 is the first of 15 events in the cluster that are being used for indirect calibration. The confidence ellipses for the event’s cluster vector and the associated calibration location are converted to covariance matrices and combined (`scv`), an estimate of the uncertainty of the calibration shift for each calibration event. The first four covariance elements listed refer to the epicenter, the fifth one is focal depth and the sixth is origin time.

The weight factor `w12` will be used for weighting the multiple estimates of epicentral shift when they combined (inverse weighting, so larger area = less weight). Weight `w3` refers to focal depth. Weight `w4` refers to origin time. The same display follows for each of the other 14 calibration events.

Using the weights for each calibration event a weighted mean is calculated for all the calibration shift vectors.

Weighted mean (REF-SOL) calibration shift: Latitude : -0.045 deg Longitude: -0.080 deg Depth : 0.000 km OT : 0.252 sec

This output is also printed to the terminal window. The depth shift is zero because all events had their depth set to the calibration depth. Next, we need to determine the uncertainty of the estimate of calibration shift. To do that we consider the consistency of the shift vectors from individual events. We start with a weighted average of all the shift vector covariance matrices.

Weighted average CV for all calibration events gcv: 1.900 0.170 0.170 1.660 1.000 0.028

Subtracting the weighted mean shift of each parameter from the individual shift vectors produces residual vectors, listed in the next bit of output. If our estimate of uncertainty is reasonable, most of those vectors should lie inside the corresponding ellipse. The coverage parameter `covp` is less than 1.0 in that case.

Residual calibration shift vectors and COVP, based on SGCV iev dtiev ddiev rvl dx dy covp 5 0.676 0.000 3.360 -2.992 -1.529 0.999 6 0.093 0.000 2.271 -1.782 1.408 0.713 8 0.073 0.000 1.675 -1.422 0.885 0.911 9 -0.035 0.000 0.434 0.202 -0.384 0.053 10 0.321 0.000 3.468 -1.089 3.293 1.858 11 0.250 0.000 1.728 -0.056 -1.727 0.894 15 -0.058 0.000 0.740 -0.408 0.618 0.171 16 -0.234 0.000 0.646 0.466 0.448 0.125 18 -0.511 0.000 1.303 1.028 -0.801 0.607 19 0.444 0.000 2.016 -2.004 0.215 1.167 22 0.071 0.000 0.940 0.869 0.358 0.290 23 -0.023 0.000 0.129 0.119 0.050 0.005 25 -0.125 0.000 1.033 1.026 0.120 0.365 26 -0.145 0.000 1.599 1.304 -0.925 0.903 28 -0.104 0.000 1.454 -1.318 0.614 0.758 Coverage: 13 / 15 = 0.867

The statistical test on coverage is done a little later. First is a test of the null hypothesis that all these residual vectors have zero length. A variation of this test was introduced in the original paper on hypocentroidal decomposition (Jordan and Sverdrup (1981), where the null hypothesis was that all cluster vectors have zero length, i.e., all events in the cluster could have occurred at the same location.

Radius of doubt test based on null hypothesis that all residual cluster vectors have zero length Critical value = 37.1 at 90% Individual event contributions to kocs2: 5 0.98734 6 0.59863 8 1.01530 9 0.37448 10 2.20194 11 2.68454 15 0.10395 16 0.25038 18 2.48718 19 1.35064 22 0.18115 23 0.24012 25 1.67360 26 3.31006 28 0.84308 rdbt_test = 0.00; observed value = 18.30; null hypothesis cannot be rejected rdbt1 = 0.00

If the observed value of the statistic `kocs2` exceeds the critical value, the null hypothesis is rejected, in which case we would add a small increment of circular uncertainty (“radius of doubt”) to the individual covariance matrices and try again. In this case the null hypothesis cannot be rejected, so we move on to the other test of consistency of the calibration shift vectors.

Radius of doubt based on coverage statistics Threshold precentage of probability for radius of doubt test: 0.10 rdbt nr nrc k coverage P(X k) rdbt_test = 0.000 15 13 2 0.867 0.451; null hypothesis cannot be rejected rdbt2 = 0.00

As mentioned in the excerpt from the code documentation above, this test only makes sense if there are at least 10 calibration events. For this cluster there are 15 calibration events, so we can consider this as an alternative to the test of the null hypothesis that all residual vectors could actually be zero length. Both tests result in a zero radius of doubt, but if one test produced a larger radius of doubt than the other we would take the larger one.

In the following output the value for the epicenter is the radius of doubt calculated above (zero in this case). The values for depth and origin time are based on the robust estimate of spread of the residuals. The RMS values are also shown for reference.

Inconsistency (between multiple calibration data) terms for augmented GT covariance: Epicenter: 0.000 km (rms = 1.786 km) Depth : 0.000 km (rms = 0.000 km) OT : 0.250 sec (rms = 0.284 sec)

The previous estimate of the covariance matrix for the calibration shift (`gcv`) is updated by adding the results of these tests on inconsistency. We call this the “augmented GT covariance matrix” (`agcv`).

Augmented GT covariance matrix agcv: 1.900 0.170 0.170 1.660 1.000 0.090

Now `agcv` is converted to a confidence ellipse, as our best estimate of the uncertainty of the calibration shift:

Uncertainties from the augmented GT covariance matrix: Confidence ellipse: -62.687 1.254 1.410 Area of ellipse: 5.554 km**2 Equivalent circular radius (CE level) = 1.330 km Depth shift uncertainty: 1.000 OT shift uncertainty: 0.301

The confidence ellipse is defined by three values, the azimuth of the semi-minor axis and the lengths of the semi-minor and -major axes. The final block of output provides a listing of the total uncertainty (including cluster vector) of calibrated hypocenter for each event in the cluster. For calibration events the coverage test (`covp`) now applies to the calibrated epicenter. This test can be done in one of three ways, distinguished by the choice of which uncertainty estimate to apply:

- Traditional: calibration events get the uncertainties of the calibration data.
- Systematic: all events done same way, add calibration uncertainty to cluster vector uncertainty.
- Optimal: use traditional or systematic method, whichever has shorter semi-major axis.

The user can control which mode is used with the command ctyp. Only the first 10 events are shown here:

Cumulative uncertainty of calibration-shifted cluster events iev alpha xl yl area eqr ddep dot covp caltype 1 83.769 3.527 4.618 51.161 4.035 1.000 0.381 0.000 2 -54.243 2.504 2.989 23.512 2.736 1.000 0.355 0.000 3 -44.588 2.907 5.222 47.692 3.896 1.000 0.358 0.000 4 -71.458 2.061 2.492 16.131 2.266 1.000 0.340 0.000 5 -62.134 2.941 3.369 31.130 3.148 1.000 0.358 1.097 systematic 6 -64.958 2.472 3.064 23.794 2.752 1.000 0.343 0.828 systematic 7 -42.097 1.805 2.155 12.217 1.972 1.000 0.341 0.000 8 -66.582 1.434 1.944 8.762 1.670 1.000 0.330 1.351 systematic 9 -49.256 1.556 1.873 9.157 1.707 1.000 0.330 0.075 systematic 10 -11.258 2.335 2.745 20.129 2.531 1.000 0.356 2.198 systematic