Empirical Reading Errors
This section focuses on the estimation and usage of empirical reading errors in mloc. There are other aspects of reading errors that are covered in the main Reading Errors section.
The concept of “reading error” or “picking error” in earthquake location is normally understood as an estimate of the uncertainty of the reading of the arrival time (“pick”) of a specific seismic phase on the seismogram of a specific earthquake. Seismic analysts rarely provide their own estimate of that uncertainty beyond a qualitative characterization as “emergent” or “impulsive”, and earthquake location codes that employ a quantitative estimate of reading error, e.g., for inverse weighting, normally use an ad hoc value based on phase type.
Empirical Reading Error is a related concept, based on multiple event relocation, i.e., simultaneous location analysis of a clustered group of earthquakes. Many seismic stations observe the same seismic phase for multiple events in the cluster. The resulting multiple observations of the same station-phase provide an opportunity to carry out a statistical analysis which leads to an estimate of the uncertainty of those readings that is based on the readings themselves, thus empirical. It would be more correct to refer to this as an empirical reading uncertainty, but we follow the traditional terminology.
It is important to appreciate that this concept of empirical reading error includes contributions to the scatter of readings beyond reading error per se. For example, it absorbs differences in travel time through a heterogeneous Earth for events that are not exactly co-located, as well as scatter arising from the different philosophies of arrival time picking used by different analysts, changes in station equipment, irregularities in timing systems, differences in the precision to which picks are reported, etc.
Each arrival time reading of a given station-phase is assigned the same empirical reading error. Although this obviously falls short of the ideal of having a reliable estimate of the uncertainty of each reading, it is a significant improvement over the traditional methods for handling uncertainties in arrival time data. Because the arrival time readings are weighted inversely to their reading errors (whatever the source) in the location algorithm, the specification of reading errors has a major impact on the estimated hypocenters and their uncertainties.
How Empirical Reading Errors are Determined
Empirical Reading Errors are estimated from the distribution of residuals for a given station-phase (for example, the Pn phase at station TUC) for a specific cluster. The number of samples can range from two to several hundred. It is not uncommon to have multiple independent readings of the same phase at the same station for the same event. The analysis is done on the set of residuals obtained by removing a theoretical arrival time for each reading, based on a standard Earth model and the current hypocenters of the events in the cluster.
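The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not mloc code (mloc itself is written in Fortran): the function name, the tuple layout, and the travel-time values are all hypothetical, and the theoretical arrival times are assumed to have been computed already from the Earth model and the current hypocenters.

```python
from collections import defaultdict

def residuals_by_station_phase(readings):
    """Group residuals (observed minus theoretical time) by station-phase.

    Each reading is a hypothetical tuple:
    (station, phase, observed_time_s, theoretical_time_s).
    """
    groups = defaultdict(list)
    for station, phase, observed, theoretical in readings:
        groups[(station, phase)].append(observed - theoretical)
    return dict(groups)

# Made-up values for illustration only (seconds after origin time).
readings = [
    ("TUC", "Pn", 61.5, 61.0),    # event 1
    ("TUC", "Pn", 64.0, 64.5),    # event 2
    ("TUC", "Pn", 64.25, 64.5),   # event 2, second independent reading
    ("TUC", "Sn", 110.0, 109.0),  # event 1
]
groups = residuals_by_station_phase(readings)
```

Each list in `groups` is the sample from which one empirical reading error would be estimated; multiple readings of the same station-phase for one event simply add more samples to that list.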
The spread of the residuals must be estimated with a robust estimator, i.e., one that is not sensitive to outliers, which are very common in arrival time data sets. We employ the estimator Sn (no relation to the seismic phase) proposed by Croux and Rousseeuw (1992). This measure of scale or spread has three desirable properties: 1) it requires no assumptions about the nature of the underlying distribution, 2) it requires no estimate of the central tendency (e.g., the mean or median) of the distribution, and 3) it reduces to the standard deviation if applied to a Gaussian distribution. It is also very easy to compute (see subroutine croux in mloclib_statistics.f90).
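For readers who want to see the estimator in action, here is a minimal Python sketch of Sn in its naive O(n²) form. It is not a translation of the croux subroutine: the function name is hypothetical, the constant 1.1926 is the usual consistency factor that makes Sn match the standard deviation for Gaussian data, and the small-sample correction factors are omitted. Whether mloc applies those corrections is not stated here.

```python
def sn_scale(values, c=1.1926):
    """Naive O(n^2) sketch of the Sn scale estimator.

    Sn = c * lomed_i( himed_j |x_i - x_j| ), with j running over all
    values including j == i. No finite-sample correction is applied.
    """
    x = list(values)
    n = len(x)
    if n < 2:
        raise ValueError("Sn needs at least two values")
    inner = []
    for xi in x:
        diffs = sorted(abs(xi - xj) for xj in x)
        inner.append(diffs[n // 2])        # high median: rank floor(n/2) + 1
    inner.sort()
    return c * inner[(n + 1) // 2 - 1]     # low median: rank floor((n+1)/2)

clean = sn_scale([1.0, 2.0, 3.0, 4.0, 5.0])
with_outlier = sn_scale([1.0, 2.0, 3.0, 4.0, 100.0])
```

Note that the function never computes a mean or median of the data themselves, only of pairwise differences, which is why no estimate of central tendency is required. Replacing one value with a gross outlier changes the estimate only modestly, whereas the sample standard deviation of the second list would be dominated by the outlier.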
Cleaning
An important aspect of the relocation process consists of multiple cycles in which the current estimates of empirical reading error are used to identify outlier readings, which are then flagged so that they will not be used in subsequent relocations. In the following relocation, estimates of empirical reading errors will tend to be smaller because of the filtering of outliers and improvement in the locations of the clustered events. Therefore the process of identifying outliers is iterative and it must be repeated until convergence. In this context, convergence means that the distribution of residuals for a given station-phase is consistent with the current estimate of spread. As outlier readings are flagged, the distribution is expected to evolve toward a normal distribution with standard deviation equal to the empirical reading error. We generally continue this cleaning process until all readings used in the relocation are within 3σ of the mean for that station-phase, where σ is the current estimate of empirical reading error for the relevant station-phase.
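The convergence test for a single station-phase can be sketched as follows. This is a simplified illustration under several assumptions, not the procedure mloc implements: the hypocenters are held fixed (in mloc the cluster is relocated between cleaning cycles, so the residuals themselves change), only the single most deviant reading is flagged per cycle so that a gross outlier cannot drag the mean far enough to reject good readings, and the function names and residual values are hypothetical. The compact `sn_scale` helper is a naive Sn without small-sample corrections.

```python
import statistics

def sn_scale(values, c=1.1926):
    """Naive Sn scale estimator (no finite-sample correction)."""
    x = list(values)
    n = len(x)
    inner = sorted(sorted(abs(xi - xj) for xj in x)[n // 2] for xi in x)
    return c * inner[(n + 1) // 2 - 1]

def clean_station_phase(residuals, n_sigma=3.0):
    """Flag outliers until every used residual is within n_sigma * sigma
    of the mean of the used residuals.

    Returns (used, sigma): a list of booleans (False = flagged) and the
    final spread estimate for the readings still in use.
    """
    used = [True] * len(residuals)
    sigma = None
    while sum(used) >= 2:
        kept = [r for r, u in zip(residuals, used) if u]
        sigma = sn_scale(kept)
        mean = statistics.fmean(kept)
        # Index of the most deviant reading that is still in use.
        worst = max((i for i, u in enumerate(used) if u),
                    key=lambda i: abs(residuals[i] - mean))
        if abs(residuals[worst] - mean) <= n_sigma * sigma:
            break                      # converged: nothing beyond the limit
        used[worst] = False            # flag it and re-estimate
    return used, sigma

# Made-up residuals (seconds) for one station-phase, with one gross outlier.
residuals = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 8.0]
used, sigma = clean_station_phase(residuals)
```

In this toy case the 8.0 s residual is flagged in the first cycle and the remaining six readings pass the 3σ test in the second, so the loop stops after two cycles.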
More detail on the cleaning process is provided elsewhere.
Output File
Every run of mloc produces a ~.rderr file containing the empirical reading errors calculated from the current run. It also carries the empirical reading errors that were used in the current run and some other statistics. To use these empirical reading errors in a future run, use the command rfil to reference the desired ~.rderr file.