MLOC Station Code Problems

The correct assignment of geographic coordinates to the station codes in arrival time datasets is an essential aspect of every relocation algorithm. mloc is extremely flexible with regard to station data, but that very flexibility leads to some rather complex scenarios. Failure to resolve station coordinate problems occasionally leads to data being used incorrectly, but much more often leads to data dropping out of the relocation, with consequences ranging from nil to severe. mloc produces a considerable amount of information on the subject and provides tools to help solve problems of this type, but to use that information wisely you need some understanding of how mloc handles station coordinates. Ultimately it is the user’s responsibility to review the output carefully, especially in the early stages of a new cluster analysis, to ensure that the available data are used correctly. The purpose of this section is to give you the necessary background and a sense of the strategies that have proven useful.

Before reading this section it is highly recommended that you read the section on Station Data and the one on Supplemental Station Files. A few main points to remember:

The master station file distributed with mloc includes only station codes registered at the International Registry of Seismograph Stations (IR), although some of the coordinates have been corrected slightly from what you’ll find at the IR website. If arrival time data came from the ISC and the station code matches an entry in the master station file, it is almost certainly correct.
New stations are being registered at the IR constantly, so if you are using ISC data for recent events you may find that the master station file needs to be updated with the new codes. Just do it.
Do not add non-registered station codes to your master station file. Put them in supplemental station files.
mloc reads supplemental station files before the master station file and puts all of it in a list. When matching a station code to coordinates, the first successful match is selected. If any code in a supplemental station file conflicts with one in the master station file, the supplemental file wins. For this reason it is wise to keep supplemental station files as short as possible. Unneeded entries may step on entries from the master station file (case 1 below).
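
The lookup order described above can be illustrated with a short Python sketch. This is not mloc's code: the StationEntry type, the function names, the station codes and the coordinates are all hypothetical, and the station files are stood in for by in-memory lists. It only shows why an entry in a supplemental file shadows a master-file entry with the same code.

```python
from typing import List, NamedTuple, Optional


class StationEntry(NamedTuple):
    """Hypothetical in-memory stand-in for one line of a station file."""
    code: str
    lat: float
    lon: float
    source: str


def build_station_list(supplemental_files: List[List[StationEntry]],
                       master_file: List[StationEntry]) -> List[StationEntry]:
    """Concatenate entries in read order: supplemental files first, master last."""
    stations: List[StationEntry] = []
    for entries in supplemental_files:
        stations.extend(entries)
    stations.extend(master_file)
    return stations


def find_station(stations: List[StationEntry], code: str) -> Optional[StationEntry]:
    """Return the first entry whose code matches, or None if the code is missing."""
    for entry in stations:
        if entry.code == code:
            return entry
    return None


# Made-up codes and coordinates, for illustration only.
master = [StationEntry("AAA", 10.0, 20.0, "master"),
          StationEntry("BBB", 30.0, 40.0, "master")]
supp_local = [StationEntry("AAA", -5.0, 100.0, "supp_local")]

stations = build_station_list([supp_local], master)
hit_aaa = find_station(stations, "AAA")  # the supplemental entry shadows the master entry
hit_bbb = find_station(stations, "BBB")  # only the master file defines this code
hit_zzz = find_station(stations, "ZZZ")  # a missing station
```

In this sketch every arrival carrying the code AAA gets the supplemental coordinates, including any that should have used the master-file entry, which is why short supplemental files are safer.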

During the Run

The first hint that you have station code problems will occur during the terminal session in which you run mloc. After the part where the event files are read there will be a summary of the number of station codes that failed the date range (operational epoch) test or were not found at all. If nothing is listed, you are using only the master station file, and all your data come from the ISC Bulletin, then you are in the clear and should have no worries about station code problems.

If, however, you are using one or more supplemental station files or if any of your data come from stations that may not have been registered with the International Registry of Seismograph Stations (IR) at the ISC, the absence of warnings at this stage cannot be taken at face value. There could still be problems:

1. The coordinates of an unregistered code defined in a supplemental station file might have been applied to an arrival in your dataset that should be using the coordinates from the IR.
2. Coordinates from a station code registered at the IR (i.e., the master station file) may have been assigned to an unregistered, conflicting station code in your dataset (perhaps from a local network).
3. The same code could be defined in more than one of your supplemental station files.

Case 1 is rare. Case 2 is common when using data acquired from somewhere other than the ISC. Case 3 would be revealed in the station output file (below) as a case of conflicting station codes. The result of all three scenarios is the same: incorrect coordinates are assigned to a station code. The calculated epicentral distance is therefore very likely to be in error, so the theoretical travel time is also in error, probably by a lot, and the reading is likely to fall out of the relocation as a gross outlier. These cases will be hiding among “legitimate” cases of gross outliers in the Bad Data section of the ~.phase_data file, but there are clues to help pick out the station code problems. The main one is to check the authorship of the pick (an excellent reason for including authorship in your phase records) and consider whether it makes sense to find a pick from that source at that epicentral distance. Another clue is to compare the epicentral distance with the one listed (hopefully) in the event file.
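
The distance comparison mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions, not mloc's calculation: it uses the haversine formula on a sphere, the function names are hypothetical, and the 1 degree tolerance is an arbitrary choice for the example.

```python
import math


def epicentral_distance_deg(lat1: float, lon1: float,
                            lat2: float, lon2: float) -> float:
    """Great-circle distance in degrees between two points on a sphere (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the argument of asin above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def distance_mismatch(computed_deg: float, listed_deg: float,
                      tolerance_deg: float = 1.0) -> bool:
    """True if the computed and listed distances differ by more than the tolerance.

    The tolerance is an assumed value chosen for illustration.
    """
    return abs(computed_deg - listed_deg) > tolerance_deg


# Made-up example: the event file lists a local-distance reading (2 degrees),
# but the coordinates assigned to the station code put it 90 degrees away.
computed = epicentral_distance_deg(0.0, 0.0, 0.0, 90.0)
suspicious = distance_mismatch(computed, 2.0)
```

A large mismatch of this kind, for a pick whose author is a local network, suggests that the wrong coordinates were assigned to the station code rather than that the reading itself is bad.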

The Station Output File

After a run of mloc, the first place to check for station code problems is the ~.stn file. There are three main types of problems to look for: cases of failed date range, station conflicts and missing stations.

Failed Date Range

It is not unusual to have a few “failed date range” warnings, even in the plain vanilla case of using only ISC data with only the master station file, because many of the entries for operational epoch are in error. To fix those problems, one can clear the operational epoch fields in the master station file (they are optional), or extend the operational epoch as necessary by pasting in the new epoch-limiting value (listed in the entry in the ~.stn file).
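
A date range test of this kind can be sketched in Python as below. The function name, the use of datetime.date, and the convention that a blank (None) limit means "no limit" are assumptions made for illustration; the sketch does not reproduce the field layout of the master station file.

```python
from datetime import date
from typing import Optional


def within_epoch(reading_date: date,
                 epoch_start: Optional[date],
                 epoch_end: Optional[date]) -> bool:
    """Return True if the reading falls inside the operational epoch.

    A limit of None stands for a blank (cleared) epoch field and is
    treated as unbounded, which is an assumption of this sketch.
    """
    if epoch_start is not None and reading_date < epoch_start:
        return False
    if epoch_end is not None and reading_date > epoch_end:
        return False
    return True


# Made-up dates: a reading one year after the recorded closing date fails,
# and passes once the closing date is cleared or extended.
reading = date(2005, 6, 1)
fails = within_epoch(reading, date(1990, 1, 1), date(2004, 6, 1))
passes_cleared = within_epoch(reading, None, None)
passes_extended = within_epoch(reading, date(1990, 1, 1), date(2005, 6, 1))
```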

Duplicates and Conflicts

If any supplemental station files have been used, there may be cases where the same station code appears in both a supplemental file and the master station file. These instances are laid out with a comparison of coordinates and categorized as “pure duplicate”, “minor differences in coordinates” or “station conflict”. Duplicates can be left, but I prefer to remove them from the supplemental station file. Minor differences won’t matter for stations that are only used to estimate the cluster vectors, but if the station is one that will be used for direct calibration the issue should be investigated. Looking at the competing coordinates in Google Earth is sometimes helpful.
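
The three categories can be pictured with the following Python sketch. The category names come from the station output file, but the function and the 0.01 degree threshold separating a minor difference from a conflict are assumptions for illustration; the criterion mloc actually applies is not stated here. The sketch also ignores longitude wrap-around at 180 degrees.

```python
from typing import Tuple

Coords = Tuple[float, float]  # (latitude, longitude) in degrees


def classify_duplicate(supp: Coords, master: Coords,
                       minor_tol_deg: float = 0.01) -> str:
    """Categorize two coordinate pairs that share a station code.

    minor_tol_deg is an assumed threshold, not a value taken from mloc.
    """
    dlat = abs(supp[0] - master[0])
    dlon = abs(supp[1] - master[1])
    if dlat == 0.0 and dlon == 0.0:
        return "pure duplicate"
    if dlat <= minor_tol_deg and dlon <= minor_tol_deg:
        return "minor differences in coordinates"
    return "station conflict"


# Made-up coordinates for illustration.
same = classify_duplicate((10.0, 20.0), (10.0, 20.0))
minor = classify_duplicate((10.001, 20.0), (10.0, 20.0))
conflict = classify_duplicate((-5.0, 100.0), (10.0, 20.0))
```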

If a conflict is found for a station that is in the arrival time dataset from a source other than the ISC, then it’s working as it should; you want the entry in the supplemental station file to override the one in the master station file. If the conflict is between two supplemental station files you’ll need to ensure the correct supplemental station file is loaded first, or else edit the file with the irrelevant entry.

Missing Stations

The next section of the station output file lists all the cases of missing stations. It is good practice (but admittedly tedious) to take the time to resolve these problems, especially in cases where there are many instances of the missing station; those instances provide information that would help resolve cluster vectors. However, even a case with a single instance could provide important information in a direct calibration if it’s in the appropriate distance range. The best strategy is to check first at the IR, and if the code is found, add it to the master station file. If not, you’ll need to check with the source of the data you are using and make an entry for it in a supplemental station file.

The remainder of the station output file lists all stations used in the relocation, their coordinates and the source (i.e., one of the supplemental station files or the master station file) of the coordinates. This is sometimes useful in tracking down a station problem.

The Worst Case Scenario

The discussion so far has been based on the assumption that all station codes in the dataset are unique, i.e., a given station code always refers to a single geographic location. Fortunately this assumption is usually true, but when data from several different sources are being combined in a relocation it may happen that arrival time data exist from two distinct stations (i.e., different locations) with the same code. There are several ways to deal with this awkward situation.

Ignorance is Bliss

By far the easiest way to deal with a station conflict is to ignore it. That means that the coordinates for the station in question that were introduced in the supplemental station file will be applied everywhere, even to arrivals that should be taking their coordinates from the master station file. Those arrivals will fall out of the relocation as outliers. The use of a supplemental station file typically means that those stations are at local distance range and therefore of considerable importance to a calibration analysis. In most cases the readings that are lost will be at teleseismic range and it’s likely there will only be one or a few instances in the dataset. If there is only one reading it will not contribute to the cluster vectors in any case, and if it is not in the distance range to be used for the hypocentroid in direct calibration its loss is irrelevant. When there are several instances the loss of those data can probably still be accepted with virtually no consequence to the quality of the relocation, but it must be considered.

Change Station Codes

If a station conflict involves only a small number of readings the conflict can be managed by altering the station code in the supplemental station file (and the relevant event files) in some manner. The provision in the MNF format for carrying the station code in two fields, one of which is treated as archival, makes this approach more palatable than losing the original station code in the dataset completely.

The Right Way

The correct way to handle a station code conflict when you need to use data from both locations is to expand the identification of the station beyond the traditional station code. This is why the so-called ADSLC system was proposed. The MNF v1.3 format used for event files supports the full ADSLC description, but mloc uses only the agency and deployment fields to resolve station code conflicts. Only the master station file format (which can also be used for a supplemental station file) and the generic and NEIC supplemental station file formats carry agency and deployment codes.

To use agency and deployment as well as station code to resolve station code conflicts in mloc, the command radf is used. This allows you to turn on the use of agency and deployment codes for specific stations, i.e., the ones for which there is a conflict. For every other station code the agency and deployment fields are treated as blank in mloc, regardless of what is in those fields in the event files or station files. It is necessary to ensure that the agency and deployment fields in phase records with the station code of interest are filled in with the values corresponding to the appropriate entry in the master station file and supplemental station files. In the NEIC format the agency is implicit (“FDSN”) and the deployment code is the 2-letter FDSN network code. Most entries in the master station file use “ISC” as the agency and “IR” as the deployment, but there are exceptions so it may be necessary to inspect the station entry in the master station file before setting the fields in your event files.
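
The matching behavior described above can be sketched in Python. This is a conceptual illustration only: the function names, the tuple layout, the station codes, the network code “XX” and the coordinates are hypothetical, and the sketch says nothing about the syntax of the radf command itself. The point is that agency and deployment take part in the match only for the station codes that have been switched on, and are treated as blank for all others.

```python
from typing import Dict, Iterable, Set, Tuple

Key = Tuple[str, str, str]      # (station code, agency, deployment)
Coords = Tuple[float, float]    # (latitude, longitude) in degrees


def station_key(code: str, agency: str, deployment: str,
                radf_codes: Set[str]) -> Key:
    """Build the matching key for a station entry or a phase record.

    Agency and deployment are kept only for codes switched on via radf;
    for every other code they are blanked, so only the code matters.
    """
    if code in radf_codes:
        return (code, agency, deployment)
    return (code, "", "")


def build_table(entries: Iterable[Tuple[str, str, str, float, float]],
                radf_codes: Set[str]) -> Dict[Key, Coords]:
    """Index entries in read order; the first entry for a given key wins."""
    table: Dict[Key, Coords] = {}
    for code, agency, deployment, lat, lon in entries:
        table.setdefault(station_key(code, agency, deployment, radf_codes), (lat, lon))
    return table


# Made-up entries, supplemental file first, then the master station file.
entries = [
    ("XYZ", "FDSN", "XX", -5.0, 100.0),  # local station from a supplemental file
    ("XYZ", "ISC", "IR", 10.0, 20.0),    # registered station with the same code
    ("QQQ", "ISC", "IR", 30.0, 40.0),    # a code with no conflict
]
radf_codes = {"XYZ"}
table = build_table(entries, radf_codes)

local = table[station_key("XYZ", "FDSN", "XX", radf_codes)]
registered = table[station_key("XYZ", "ISC", "IR", radf_codes)]
other = table[station_key("QQQ", "anything", "at all", radf_codes)]
```

If a phase record for XYZ carried agency and deployment values that match neither entry, the lookup in this sketch would fail outright, which mirrors the need to fill in those fields carefully for the station codes of interest.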

The radf command works this way only in mloc v10.5.0 and later. The command exists in earlier versions, but there it used a different strategy that proved to be impractical.