MLOC Data Format MNF v1.3

MLOC Native Format (MNF) v1.3

The current version is v1.3.3, released October 11, 2017 with mloc v10.3.4. mloc issues a warning if a file in an earlier format is read but, depending on the details, the file may still be processed correctly.

MNF v1.3 is the only format supported by mloc for input of arrival time data for individual events, i.e., the data for each event is stored in a separate file. Event files have the filename suffix .mnf. The strongly recommended naming convention is to base the body of the filename on some estimate of the origin time of the event, down to the nearest second:

YYYYMMDD.HHMM.SS.mnf

It is not important that the origin time represented in the filename be especially accurate, only that it be distinguishable from other events in the cluster.
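As an illustration, the naming convention can be generated from an origin-time estimate. This is a minimal Python sketch; the function name mnf_filename is an assumption made for this example and is not part of mloc. Fractional seconds are simply dropped, which is adequate given that the filename only has to distinguish events within a cluster.

```python
from datetime import datetime


def mnf_filename(origin_time: datetime) -> str:
    """Build an event filename of the form YYYYMMDD.HHMM.SS.mnf."""
    # Fractional seconds are dropped; the name only needs to be
    # distinguishable from other events in the cluster.
    return origin_time.strftime("%Y%m%d.%H%M.%S") + ".mnf"


# Made-up origin time, for illustration only.
print(mnf_filename(datetime(2017, 10, 11, 3, 27, 45)))
```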

MNF Bulletins

MNF v1.3 also supports the formatting of seismic bulletins, consisting of the concatenation of two or more individual event files in MNF v1.3 format. There is no limit on the size of a bulletin other than practical concerns related to storage, viewing and editing. I have worked comfortably with a bulletin of ~44,000 Iranian earthquakes that is 260 MB in size.

The only difference between an event file and a bulletin is that the bulletin’s first record is a bulletin record and the end-of-file record appears only once, as the last record of the bulletin. In a bulletin the format record is required only once, before any events are read, but it is legal to include one in each event block.

Defined Record Types

MNF v1.3 has 11 defined record types, of which 6 are considered “data-carrying”:

Record Types of MNF v1.3
Flag	Record Type	Minimum Length	Full Length
B	Bulletin	1	121
F	Format	15	15
E	Event	1	121
I	ID	1	51
H	Hypocenter	74	121
D	Depth	9	121
M	Magnitude	8	121
P	Phase	55	121
#	Comment	1	121
S	Stop event	1	4
EOF	End of file	3	3

Unlike all other record types, which are distinguished by the flag in column 1, the end-of-file record (EOF) uses columns 1-3; it has no other arguments. It causes processing of a data file to end, so it would normally only be found once, at the end of the file, whether it holds a single event or multiple events.

The “natural” line length for MNF files is 121 characters, because most of the information-carrying record types have fields defined to this length.

Data for a single event is carried in a block of records that must start with an event record “E” and end with a stop record “S”. Within the block, data is carried in a combination of hypocenter “H”, depth “D”, magnitude “M”, and phase reading “P” records. At least one hypocenter record “H” is required, but multiple estimates of hypocenter are permitted. ID, magnitude and depth records are optional and there is no limit to how many can be supplied of each. Phase reading records are also optional, in the case of an earthquake catalog. The MNF format can represent a catalog but it is not designed to do so in an efficient manner, since it requires a minimum of three lines (an event record, at least one hypocenter record, and a stop record) for each event.
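The block structure described above can be sketched as a small reader that groups records into event blocks. This is a hedged Python illustration, not mloc's own logic: the function name split_events is hypothetical, and treating a missing hypocenter record or an unclosed block as an error is an assumption about how a strict reader might behave.

```python
def split_events(lines):
    """Group records into event blocks (E ... S), stopping at the EOF record."""
    events = []
    current = None  # records of the block being read, or None outside a block
    for raw in lines:
        line = raw.rstrip("\r\n")
        # EOF occupies columns 1-3, so test it before the single-letter flags
        # (an event record has "E" in column 1 and a blank in column 2).
        if line[:3] == "EOF":
            break
        flag = line[:1]
        if flag == "E":
            if current is not None:
                raise ValueError("event record found inside an open event block")
            current = [line]
        elif flag == "S":
            if current is None:
                raise ValueError("stop record without a preceding event record")
            current.append(line)
            if not any(rec.startswith("H") for rec in current):
                raise ValueError("event block has no hypocenter record")
            events.append(current)
            current = None
        elif current is not None:
            current.append(line)
        # Records outside a block (B, F, #) are skipped in this sketch.
    if current is not None:
        raise ValueError("last event block is not closed by a stop record")
    return events


# Made-up records; only the flags in column 1 matter for this example.
sample = [
    "F   MNF v1.3.3",
    "E",
    "H = (hypocenter fields)",
    "P   (phase fields)",
    "STOP",
    "E -",
    "H   (hypocenter fields)",
    "STOP",
    "EOF",
    "text after EOF is never read",
]
blocks = split_events(sample)
print(len(blocks), [len(block) for block in blocks])
```

The second block in the sample is the three-line minimum mentioned above: an event record, one hypocenter record, and a stop record.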

Some of the fields in an MNF-formatted dataset are not required for mloc processing, but they carry information which is often important to retain for interpretation of results and for maintaining compatibility of data products with the NEIC, the ISC and other agencies that have standardized on formats such as the IASPEI Seismic Format (ISF) for data exchange. Conversely, MNF does not carry some (actually, many) fields which would be considered essential for a general-purpose seismic bulletin format such as ISF, because it is optimized for use in relocation studies using mloc.

Several record types carry a “usage” flag (always in column 3) that determines the way (or if) the information in that record will be used. Some records have an “ID” field to carry an ID number that may have been assigned elsewhere, typically in a relational database (e.g., EvID, OrID, ArrID). Event IDs have their own record type because they can be useful in some aspects of mloc’s processing. Comment records (flag “#”) are supported; they can be inserted anywhere in an MNF-formatted file. Obviously, any text that occurs after the first EOF record will not be processed and can be considered as a comment, regardless of the formatting, but the use of the EOF record in this manner is not recommended.

Except for the requirement to start each event block with an event record and end the block with a stop record, there are few requirements on the order of records within an event block. In most cases the recommended order would be event record, ID record(s), hypocenter record(s), depth record(s), magnitude record(s), phase reading records, followed by a stop record.

There must be at least one hypocenter record, but multiple hypocenter estimates can be carried. The usage flag should be used to determine which of several hypocenter records will be honored for the starting location in mloc (or is otherwise the “preferred” location); otherwise precedence will be determined by the software reading the file, probably either the first hypocenter record encountered or the last. The same principle applies to depth and magnitude records: the usage flag should be used to specify a preferred value, if there is one, rather than relying on the sequence of records.

Format Versions

Each data file, whether for a single event or multiple events, should contain a format version record (“F”) before the first event record (“E”). The format version record provides a version number for the MNF format in which the file is written. A program that reads an MNF file should check the version number to be sure it will correctly interpret the data records. A bulletin or catalog made up of event files written in different MNF format versions could be processed by including the necessary format version records (“F”) when the format changes for the next event, but this is not recommended.

Starting with MNF v1.3.3 and mloc v10.3.4 (October 11, 2017 release), format versions are expected to be complete, i.e., “1.3.3” rather than “1.3” which was adequate for previous versions of mloc.
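A reader's version check might look like the following Python sketch. The function names are hypothetical. Two layout details are assumptions: the exact spacing of the readability text in columns 1:9, and the left-justification of the version string within columns 10:15 (the reader strips blanks, so it accepts either justification).

```python
def format_record(version: str) -> str:
    """Write a format record with the version in columns 10:15 (a6)."""
    if len(version) > 6:
        raise ValueError("version does not fit in the a6 field")
    # Columns 1:9 hold "F" plus readability text; only the "F" is required.
    return "F   MNF v" + version.ljust(6)


def read_version(record: str) -> str:
    """Extract the version string from columns 10:15 of a format record."""
    if record[:1] != "F":
        raise ValueError("not a format record")
    return record[9:15].strip()


def is_complete(version: str) -> bool:
    """True for a three-part version such as 1.3.3, False for 1.3."""
    parts = version.split(".")
    return len(parts) == 3 and all(part.isdigit() for part in parts)


record = format_record("1.3.3")
print(repr(record), read_version(record), is_complete(read_version(record)))
```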

Depths

The hypocenter record normally (but not always) carries an estimate of focal depth, but there may be additional estimates of depth that should be carried. The depth record is intended to carry information on depth that is considered credible, but was not associated with a particular hypocenter determination. The most common source of such depth estimates is waveform analysis of some kind. Multiple depth records are permitted and one of them can be designated as preferred by the usage flag ‘=’ in column 3. Depth records can carry an optional depth code flag which mloc uses to keep track of the nature of depth constraint. It is highly recommended to use the standard flags. For example, the focal depth in the preferred hypocenter record will be taken as the preferred depth by default in mloc, but it would be overridden if a following depth record meets these requirements:

The depth record has been designated as preferred by the usage flag “=” in column 3
The depth record carries a depth code that is considered “constrained”
The hypocenter record’s depth estimate does not carry a depth code that is considered to be “constrained”
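The three conditions above can be written as a small selection function. This Python sketch is only an illustration of the rule as stated here: the names DepthEstimate and preferred_depth are hypothetical, and because the set of depth codes that count as “constrained” is not listed in this document, it is passed in as an argument. The code "x" in the usage example is a placeholder, not a real mloc depth code.

```python
from typing import NamedTuple


class DepthEstimate(NamedTuple):
    depth_km: float
    code: str   # depth code, " " if absent
    usage: str  # usage flag from column 3, " " if absent


def preferred_depth(hypocenter_depth, depth_records, constrained_codes):
    """Apply the override rule: a depth record wins only if it is flagged
    '=', carries a constrained code, and the hypocenter's depth does not."""
    for record in depth_records:
        if (record.usage == "="
                and record.code in constrained_codes
                and hypocenter_depth.code not in constrained_codes):
            return record
    return hypocenter_depth


# Placeholder code set and made-up depths, for illustration only.
codes = {"x"}
from_hypocenter = DepthEstimate(10.0, " ", "=")
from_depth_record = DepthEstimate(15.0, "x", "=")
print(preferred_depth(from_hypocenter, [from_depth_record], codes).depth_km)
```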

Like the hypocenter record, a depth record can carry an asymmetric estimate of uncertainty. The depth uncertainties are optional. Focal depth and both uncertainties can be given to a tenth of a kilometer, but the decimal point should be present even if the value is only given to the nearest kilometer, to ensure correct reading by Fortran formatted read statements. The authorship field can be used for comments about the nature of the depth estimate. There is no concept of author ID for depth.

Magnitudes

Magnitude estimates are carried in a magnitude record, one magnitude estimate per record. Multiple magnitude estimates can be carried. A usage flag can be used to select a preferred estimate from among multiple records, but is not required. If no magnitude record is marked as preferred, mloc uses the first one encountered as a measure of magnitude. In cases where it is desired to carry a magnitude associated with a specific reading, a magnitude record could be interspersed with phase reading records; in any case the author field could be used to indicate the desired association. Magnitudes can be carried to two decimal places. In any case the decimal point should always be given, to ensure correct reading by Fortran formatted read statements.
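The selection rule described here (the record flagged “=” in column 3 if there is one, otherwise the first record encountered) is simple enough to sketch in Python. The function name pick_preferred is hypothetical, and returning the first flagged record when several carry “=” is an assumption, since that case is not specified.

```python
def pick_preferred(records):
    """Return the record flagged '=' in column 3, else the first record."""
    for record in records:
        if record[2:3] == "=":  # column 3 is index 2
            return record
    return records[0] if records else None


# Made-up magnitude records: magnitude in columns 5:8, scale in 10:14.
magnitudes = [
    "M   5.60 mb",
    "M = 6.10 Mw",
    "M   5.90 Ms",
]
print(pick_preferred(magnitudes))
print(pick_preferred([magnitudes[0], magnitudes[2]]))
```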

Station Codes

The concept of station codes is evolving away from the traditional strategy of attempting to carry all necessary information about a seismic station (plus the installed instrumentation and who operates it) in a single code of 4 or 5 characters. The MNF format implements the New IASPEI Station Coding Standard:

Agency.Deployment.Station.Location.Channel

or “ADSLC” formulation, using the fixed format display standard described in the defining documentation. In this format, what used to be known as the “station code” is carried in 3-5 characters.

mloc’s support for the ADSLC protocol is complete in the sense that the phase record has a field for each element, but in fact mloc uses only the station field routinely; the deployment field can also be used optionally to resolve station-naming conflicts, but the implementation of this feature is complex, not entirely bug-free and seldom worth the trouble. One reason for this disappointing state of affairs is that the ISC has not yet implemented the ADSLC protocol throughout their database and the ISF data format returned from searches of the ISC Bulletin only carries the station field. Even if that problem were solved, however, the extensive aliasing incorporated in the ADSLC standard makes the simple question “are these two stations the same?” challenging for a program like mloc that attempts to integrate data from many sources. Strategies for dealing with station naming issues are discussed here.

Station codes are carried twice in phase records. One instance is required and that is the field read by mloc. The station code is normally repeated in the second field (if it appears at all; it is optional), but it can be different in cases where a network established a station with a convenient code and only later decided to register the station with the International Registry (IR), at which point they discovered that their favorite code was already taken. The station would then be registered in the IR with a different code, but it is not uncommon for the network to continue using the original code in their internal processing, so if one acquires data directly from the network there will be a station conflict. The solution is to edit the station code that is read by mloc to be the one that is registered.

Phase Names

Seismic phase name is carried twice in the MNF format. The first instance (columns 24:31 of a “P” record, see below) is read by mloc, and then it is subject to the various procedures within mloc that may change the syntax of the phase name or change the phase ID completely. The second instance (columns 66:73) carries the phase name as reported by the original source. It is sometimes useful to change the input phase name from the original phase name, to assist mloc’s phase identification algorithm in determining the correct (or at least the desired) phase identification. It is also useful to retain the original phase name unchanged for reference. The MNF format also carries a position for a special flag (“!”) which informs mloc that the input phase name should not be changed during relocation.

ID Numbers

For informational and forensic purposes, MNF includes fields for identification numbers for various kinds of data that are provided by seismological centers. The only use that mloc makes of such information is to use event IDs (EvIDs), if provided, to correctly match events with the relocation results of a previous run, as carried in an hdf-formatted file.

Conventionally, ID numbers assigned to events (“EvID”), hypocenters (“OrID”), phase readings (“ArrID”) and other kinds of data in relational databases are integer numbers. Standards on how many digits are carried vary from system to system, but 10-digit integers are likely to be necessary before long, especially for ArrIDs. The MNF format does not enforce a data type on those IDs; they are character fields as far as mloc is concerned. Even so, ArrIDs and OrIDs should be right-justified to aid in correct reading of integers by a Fortran code. Although 10-character ArrIDs are specified in MNF, it has proven necessary to provide larger fields for EvIDs and OrIDs.

The larger field for EvIDs is driven by NEIC, where current practice places no practical limit on the length of an event ID, which is no longer an integer, but a combination of letters (e.g. network codes) and digits. The 40-character field provided in the MNF format should be adequate to handle these, but mloc only reads the first 10 characters from this field, because this is assumed to be adequate to distinguish between events in a cluster. An EvID of less than 10 characters can be placed anywhere in the 10-character field (columns 12:21) that mloc reads; it will be right-justified inside mloc.
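The reading of an EvID described above (first 10 characters of the 40-character field, then right-justified) can be sketched in Python as follows. The function name read_evid is hypothetical and the sample IDs are made up.

```python
def read_evid(id_record: str) -> str:
    """Return the EvID from columns 12:21, right-justified in 10 characters."""
    field = id_record[11:21]  # columns 12:21 are indices 11..20
    return field.strip().rjust(10)


# Columns 1:4 hold the flag and usage code, 5:10 the ID source,
# column 11 is blank, and the 40-character ID field starts in column 12.
long_id = "I = " + "SRC".ljust(6) + " " + "ab12345678xyz"
short_id = "I   " + " " * 6 + " " + "12345"
print(repr(read_evid(long_id)))
print(repr(read_evid(short_id)))
```

A long ID is truncated to its first 10 characters, so IDs within a cluster need to differ within those characters to be distinguished.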

Prior to v1.3.3 of the MNF format, only a single EvID could be entered for each event; it was carried at the end of the event record. In v1.3.3, EvIDs are read from one or more ID records. The format also includes a character field to identify the source of the EvID. The current version of mloc can still read an older MNF file in “1.3” format and extract an EvID carried in an event record. The first-encountered ID record will be the preferred value in mloc by default, or the usage code (“=”) can be used to specify among multiple records.

The OrID field in a hypocenter record (columns 104:121) is 18 characters in length because it is used by mloc to record the cluster name/series code (left-justified, starting in column 104). For magnitude and phase records, the associated 10-character ID field is in columns 112:121. Although 10-digit integer IDs can be carried in the ID field, it should be noted that software using 32-bit integers may have problems processing integers of more than 9 digits.
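The 9-digit caution follows from the range of a signed 32-bit integer, whose largest value is 2147483647: every 9-digit ID fits, but only some 10-digit IDs do. A short Python check, with the hypothetical function name fits_int32, makes the boundary explicit.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit integer


def fits_int32(id_field: str) -> bool:
    """True if the ID field is a non-negative integer that fits in 32 bits."""
    text = id_field.strip()
    return text.isdigit() and int(text) <= INT32_MAX


print(fits_int32(" 999999999"))  # largest 9-digit value
print(fits_int32("2147483647"))  # exactly INT32_MAX
print(fits_int32("2147483648"))  # one more than INT32_MAX
```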

Defined Record Types

The concept of “optional” fields in the descriptions of record types is specifically in the context of use by mloc. Fields that are optional are indicated in the tables below. In writing MNF-formatted data files it is advisable to pad lines to the full length of that record type, which is based on the defined fields, not the required fields, and it is not unwise to pad all lines to 121 characters, the full length of the longest defined record types, regardless of record type.
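Padding can be applied mechanically when writing a file. The Python sketch below takes the full lengths from the table of record types; the function name pad_record and the uniform option are assumptions made for this example.

```python
# Full record lengths from the table of record types (EOF handled separately).
FULL_LENGTH = {"B": 121, "F": 15, "E": 121, "I": 51, "H": 121,
               "D": 121, "M": 121, "P": 121, "#": 121, "S": 4}


def pad_record(line: str, uniform: bool = False) -> str:
    """Pad a record with blanks to its full length, or to 121 if uniform."""
    line = line.rstrip("\r\n")
    if line[:3] == "EOF":
        width = 3
    else:
        width = FULL_LENGTH.get(line[:1])
        if width is None:
            raise ValueError("unrecognized record flag")
    if uniform:
        width = 121
    # ljust never truncates, so over-long lines are returned unchanged.
    return line.ljust(width)


print(len(pad_record("E")), len(pad_record("S")), len(pad_record("S", uniform=True)))
```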

Bulletin Record

Column	Description
1:1	Record format flag “B”
5:121	Bulletin description, optional (a117)

Format Record

Column	Description
1:1	Record format flag “F”
10:15	Format version (a6)

Note: It is useful for readability to include extra text, such that columns 1:9 read “F MNF v”, but all that is required is the “F” in column 1 and the version number in columns 10:15.

Event Record

Column	Description
1:1	Record format flag “E”
3:3	Usage flag, optional (a1)
5:121	Annotation, optional (a117)

The only non-blank usage flag for an event record is “-”, indicating that there is no phase data.

ID Record

Column	Description
1:1	Record format flag “I”
3:3	Usage flag, optional (a1)
5:10	ID source, optional (a6)
12:51	Event ID, optional (a40)

The only recognized usage code is “=”, which makes this entry preferred, regardless of its position relative to other ID records. Otherwise the first ID record encountered will be used. An ID record in which the Event ID field is left blank is legal. If there are multiple ID records and it is desired to have mloc ignore them, an empty ID record with usage code set to make it the preferred ID could be used.

Hypocenter Record

Column	Description
1:1	Record format flag “H”
3:3	Usage flag, optional
5:8	Year (i4)
10:11	Month (i2)
13:14	Day (i2)
16:17	Hour (i2)
19:20	Minute (i2)
22:26	Seconds (f5.2)
28:32	OT uncertainty, optional (f5.2)
35:42	Latitude (f8.4)
44:52	Longitude (f9.4)
54:56	Smin azimuth, optional (i3)
58:62	Error ellipse Smin, optional (f5.2)
64:68	Error ellipse Smaj, optional (f5.2)
70:74	Focal Depth, optional (f5.1)
76:76	Depth code, optional (a1)
78:82	Plus depth uncertainty, optional (f5.1)
84:88	Minus depth uncertainty, optional (f5.1)
90:93	GTCNU, optional (a4)
95:102	Author, optional (a8)
104:121	Origin or cluster ID, optional (a18)

Note: mloc recognizes a hypocenter record in which the usage flag is “=” as the preferred hypocenter for setting the starting location. If no hypocenter record carries the usage flag, the first hypocenter record encountered will be taken as preferred by mloc. Other software may behave differently.

Both depth uncertainties are provided as positive numbers. “Plus” depth uncertainty is on the deeper side; “Minus” uncertainty is shallower, and therefore should not be greater than the focal depth in absolute value. If only one depth uncertainty is encountered, it should be interpreted as a symmetric uncertainty. It is important to put the decimal point into the depth field, even if precision is only to the nearest kilometer (or more), to ensure correct reading by a Fortran formatted read statement.

No information on the statistical level of the uncertainties of origin time, depth, or epicenter is provided in the MNF format because there is so little standardization at present. Such values are commonly interpreted as ± 1 σ for origin time and depth, but confidence ellipses are usually calculated at 90% or 95% confidence levels.

The “GTCNU” field carries a four-character code relating to calibration status (I prefer this term to “ground truth” level). It could be the GTX formulation (e.g., Bondar et al. (2004)), but I have developed the GTCNU nomenclature to provide much more detailed information on the subject of what hypocentral parameters are considered to be calibrated (i.e., thought to be bias-free). The GTCNU nomenclature is documented fully elsewhere.

The placement of a traditional (10-digit integer) “OrID” value within the 18-character field is optional, but it is probably best to right-justify it (i.e., in columns 112:121) to facilitate reading as an integer. mloc writes the cluster name and series number as a character string that is left-justified in the field, beginning at column 104.
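Because the hypocenter record is fixed-column, it can be read by slicing. The following Python sketch parses a subset of the fields using the column numbers from the table above; it is an illustration, not mloc's Fortran reader. All function names are hypothetical, the sample values are made up, and the place helper exists only to build a correctly aligned sample line.

```python
def col(line, first, last):
    """Return columns first..last (1-based, inclusive), as in the table."""
    return line[first - 1:last]


def opt_float(text):
    """Blank optional fields become None; otherwise read as a float."""
    text = text.strip()
    return float(text) if text else None


def parse_hypocenter(line):
    if line[:1] != "H":
        raise ValueError("not a hypocenter record")
    line = line.rstrip("\r\n").ljust(121)  # tolerate unpadded records
    plus = opt_float(col(line, 78, 82))
    minus = opt_float(col(line, 84, 88))
    if plus is None or minus is None:
        # A single depth uncertainty is interpreted as symmetric.
        plus = minus = plus if plus is not None else minus
    return {
        "usage": col(line, 3, 3),
        "year": int(col(line, 5, 8)),
        "month": int(col(line, 10, 11)),
        "day": int(col(line, 13, 14)),
        "hour": int(col(line, 16, 17)),
        "minute": int(col(line, 19, 20)),
        "seconds": float(col(line, 22, 26)),
        "latitude": float(col(line, 35, 42)),
        "longitude": float(col(line, 44, 52)),
        "depth_km": opt_float(col(line, 70, 74)),
        "depth_code": col(line, 76, 76),
        "depth_plus_km": plus,
        "depth_minus_km": minus,
        "gtcnu": col(line, 90, 93).strip(),
        "author": col(line, 95, 102).strip(),
        "orid": col(line, 104, 121).strip(),
    }


def place(fields, width=121):
    """Build a blank-padded line from (first_column, text) pairs."""
    buffer = [" "] * width
    for first, text in fields:
        buffer[first - 1:first - 1 + len(text)] = text
    return "".join(buffer)


# Made-up hypocenter, for illustration only.
sample = place([(1, "H"), (3, "="), (5, "2017"), (10, "10"), (13, "11"),
                (16, "03"), (19, "27"), (22, "45.20"), (35, " 34.1234"),
                (44, "  45.6789"), (70, " 12.0"), (78, "  3.0")])
hypocenter = parse_hypocenter(sample)
print(hypocenter["latitude"], hypocenter["depth_km"], hypocenter["depth_minus_km"])
```

In the sample only the “plus” depth uncertainty is given, so the parser reports the same value for the “minus” side.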

Depth Record

Column	Description
1:1	Record format flag “D”
3:3	Usage flag, optional
5:9	Depth (f5.1)
11:11	Depth code, optional (a1)
13:17	Plus depth uncertainty, optional (f5.1)
19:23	Minus depth uncertainty, optional (f5.1)
25:121	Authorship and comments, optional (a97)

Note: Both depth uncertainties are provided as positive numbers. “Plus” depth uncertainty is on the deeper side; “Minus” uncertainty is shallower, and therefore should not be greater than the focal depth in absolute value. If only one depth uncertainty is encountered, it should be interpreted as a symmetric uncertainty. It is important to put the decimal point into the depth field, even if precision is only to the nearest kilometer (or more), to ensure correct reading by a Fortran formatted read statement. The depth code in column 11 is an optional character flag that informs mloc about the nature of the depth constraint.
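When writing depth records from a program, a format specification that always emits the decimal point avoids the Fortran reading problem mentioned in the note. The Python sketch below is one way to do it; the function name format_depth_record and its arguments are hypothetical, and the check on the “minus” uncertainty turns the recommendation in the note into a hard error, which is an assumption.

```python
def format_depth_record(depth_km, code=" ", plus_km=None, minus_km=None,
                        usage=" ", author=""):
    """Write a depth record padded to 121 characters."""
    def f51(value):
        # Equivalent of Fortran f5.1: always writes the decimal point.
        if value is None:
            return " " * 5
        text = f"{value:5.1f}"
        if len(text) != 5:
            raise ValueError(f"{value} does not fit in an f5.1 field")
        return text

    if minus_km is not None and minus_km > depth_km:
        raise ValueError("minus uncertainty exceeds the focal depth")
    # Columns: flag 1, usage 3, depth 5:9, code 11, plus 13:17,
    # minus 19:23, authorship and comments 25:121.
    line = (f"D {usage} {f51(depth_km)} {code} "
            f"{f51(plus_km)} {f51(minus_km)} {author[:97]}")
    return line.ljust(121)


# Made-up depth estimate given to the nearest kilometer.
record = format_depth_record(12, plus_km=3, usage="=", author="waveform modeling")
print(repr(record.rstrip()))
```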

Magnitude Record

Column	Description
1:1	Record format flag “M”
3:3	Usage flag, optional (a1)
5:8	Magnitude (f4.2)
10:14	Magnitude scale, optional (a5)
16:110	Author and comments, optional (a95)
112:121	Magnitude ID, optional (a10)

Note: mloc recognizes a magnitude record in which the usage flag is “=” as the preferred magnitude for this event. mloc uses only the first two characters of magnitude type. Magnitude ID would normally be the OrID from a hypocenter record; it should be right-justified in the field.

Phase Reading Record

Column	Description
1:1	Line format flag “P”
3:3	Usage flag, optional (a1)
5:9	Station code (a5)
12:17	Epicentral distance, optional (f6.2)
19:21	Azimuth, event to station, optional (i3)
23:23	Prevent phase re-identification flag, optional (“!”)
24:31	Input phase name, optional (a8)
33:36	Arrival time year (i4)
38:39	Arrival time month (i2)
41:42	Arrival time day (i2)
44:45	Arrival time hour (i2)
47:48	Arrival time minute (i2)
50:55	Arrival time seconds (f6.3)
57:58	Reading Precision, optional (i2)
60:64	TT residual, optional (f5.1)
66:73	Original phase name, optional (a8)
75:79	Agency, optional (a5)
81:88	Deployment or network, optional (a8)
90:94	Station, optional (a5)
96:97	Location, optional (a2)
99:101	Channel, optional (a3)
103:110	Author, optional (a8)
112:121	Arrival ID, optional (a10)

Note: It is helpful, but not required, to place a “dot” between the ADSLC fields. The usage flag carries the variable “fcode” (also known as phase reading flags) in mloc. The defined phase reading flags are listed here.
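A phase reading record can be read by column slicing in the same way as the other fixed-column records. The Python sketch below extracts a subset of the fields listed in the table; the function and key names are hypothetical, the sample reading (including the station code) is made up, and combining the arrival time fields into a single datetime value is a convenience of this example rather than anything mloc is known to do.

```python
from datetime import datetime, timedelta


def col(line, first, last):
    """Return columns first..last (1-based, inclusive), as in the table."""
    return line[first - 1:last]


def parse_phase(line):
    if line[:1] != "P":
        raise ValueError("not a phase reading record")
    line = line.rstrip("\r\n").ljust(121)  # tolerate unpadded records
    # Adding the seconds as a timedelta keeps the fractional part.
    arrival = datetime(int(col(line, 33, 36)), int(col(line, 38, 39)),
                       int(col(line, 41, 42)), int(col(line, 44, 45)),
                       int(col(line, 47, 48)))
    arrival += timedelta(seconds=float(col(line, 50, 55)))
    adslc_columns = ((75, 79), (81, 88), (90, 94), (96, 97), (99, 101))
    return {
        "fcode": col(line, 3, 3),
        "station": col(line, 5, 9).strip(),
        "keep_phase": col(line, 23, 23) == "!",
        "phase": col(line, 24, 31).strip(),
        "arrival": arrival,
        "original_phase": col(line, 66, 73).strip(),
        "adslc": tuple(col(line, a, b).strip() for a, b in adslc_columns),
        "arrival_id": col(line, 112, 121).strip(),
    }


def place(fields, width=121):
    """Build a blank-padded line from (first_column, text) pairs."""
    buffer = [" "] * width
    for first, text in fields:
        buffer[first - 1:first - 1 + len(text)] = text
    return "".join(buffer)


# Made-up reading, for illustration only.
sample = place([(1, "P"), (5, "ABCD"), (23, "!"), (24, "Pn"), (33, "2017"),
                (38, "10"), (41, "11"), (44, "03"), (47, "28"),
                (50, "12.345"), (66, "P"), (90, "ABCD")])
reading = parse_phase(sample)
print(reading["station"], reading["phase"], reading["arrival"].isoformat())
```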

Comment Record

Column	Description
1:1	Line format flag “#”
2:121	Comment, optional (a120)

Stop Record

Column	Description
1:1	Line format flag “S”

Note: Only the “S” in column 1 is required, but for better readability it is useful to write “STOP” in columns 1:4.

End of File Record

Column	Description
1:3	Line format flag “EOF”