Society of Exploration Geophysicists
We introduce two new traveltime picking schemes developed specifically for crosshole ground-penetrating radar (GPR) applications. The main objective is to automate, at least partially, the traveltime picking procedure and to provide first-arrival times that are closer in quality to those of manual picking approaches. The first scheme is an adaptation of a method based on cross-correlation of radar traces collated in gathers according to their associated transmitter-receiver angle. A detector is added to isolate the first cycle of the radar wave and to suppress secondary arrivals that might be mistaken for first arrivals. To improve the accuracy of the arrival times obtained from the crosscorrelation lags, a time-rescaling scheme is implemented to resize the radar wavelets to a common time-window length. The second method is based on the Akaike information criterion (AIC) and continuous wavelet transform (CWT). It is not tied to the restrictive criterion of waveform similarity that underlies crosscorrelation approaches, which is not guaranteed for traces sorted in common ray-angle gathers. It has the advantage of being fully automated. The performance of the new algorithms is tested with synthetic and real data. In all tests, the approach that adds first-cycle isolation to the original crosscorrelation scheme improves the results. In contrast, the time-rescaling approach brings limited benefits, except when strong dispersion is present in the data. In addition, the performance of crosscorrelation picking schemes degrades for data sets with disparate waveforms despite the high signal-to-noise ratio of the data. In general, the AIC-CWT approach is more versatile and performs well on all data sets. Only with data showing low signal-to-noise ratios is the AIC-CWT superseded by the modified crosscorrelation picker.
Radar traveltime tomography has become increasingly popular for hydrogeologic, environmental, and geotechnical applications (Tronicke et al., 2004; Gloaguen et al., 2007; Johnson et al., 2007). Such studies are possible because in the radar regime, the wave velocity is a function mainly of the dielectric permittivity distribution in the ground (Davis and Annan, 1989) and because the permittivity of free water is well contrasted relative to that in most geologic materials (Olhoeft, 1981). Experimental and theoretical models transform the dielectric permittivity into volumetric water content or porosity, depending on saturation conditions (Topp et al., 1980; Endres and Knight, 1992; Sihvola, 2000). Hence, mapping the radar-wave velocity field within the ground can lead to a permittivity distribution map of the imaged region and consequently characterize the region's porosity or water-content distribution.
Three general approaches exist to reconstruct the velocity field from borehole radar measurements. The most common approach regroups ray-based methods, which assume that radar-wave propagation can be described by the laws of geometric optics (Olsson et al., 1992; Zhou and Fullagar, 2001; Gloaguen et al., 2005). In this approach, the velocity field is constructed by the inversion of the first-arrival traveltime data. Another approach, also based on the inversion of first-arrival traveltime data, relies on Fresnel volume sensitivities that account for the finite frequency of the crosshole radar signal to improve the resolution (Day-Lewis et al., 2005; Buursink et al., 2008). The last approach, based on inverting the full radar waveform (Ernst et al., 2007; Kuroda et al., 2007), has yet to gain widespread use, mostly because of huge computing requirements. Obviously, the velocity model obtained from ray- and Fresnel-volume-based tomographic reconstruction depends on the quality of the first-arrival times, which is largely dependent on the fidelity of observation of the radar phase and the consistency of timing measurements. Therefore, the collection of high-quality first-arrival times is a prerequisite for determining accurate dielectric-permittivity, porosity, or water-content results.
To obtain insight into the repercussion of timing errors on velocity estimates, consider a wave traveling at constant velocity V over a distance l from a transmitter to a receiver. The traveltime of the wave over this distance is $t = l/V$. The variations of velocity with respect to traveltime and distance are

$$dV = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial l}\,dl = -\frac{l}{t^2}\,dt + \frac{1}{t}\,dl. \qquad (1)$$

Errors in traveltime $\Delta t$ and positioning $\Delta l$ lead to an error in velocity:

$$\Delta V = -\frac{l}{t^2}\,\Delta t + \frac{1}{t}\,\Delta l. \qquad (2)$$
Assuming exact positioning of the transmitter and receiver ($\Delta l = 0$) and defining $\Delta V / V$ as the relative error of the velocity estimate, we see from equation 2 that the latter equals the negative of the relative error of the traveltime, $-\Delta t / t$. Hence, overestimating the traveltime by a given proportion leads to underestimating velocity in the same proportion. It is interesting that, given a constant absolute timing error, the relative error decreases with increasing travel distance. It is therefore likely that the relative timing errors vary within a crosshole data set. This results from the particular acquisition geometry, which typically involves wide angular coverage and hence significant variation in traveled distances. Inconsistent relative timing errors undoubtedly introduce artifacts in the reconstruction of the velocity fields.
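A quick numerical check of this relationship, with illustrative (not paper-derived) values for distance, velocity, and timing error:

```python
import numpy as np

# Illustrative values: a 10 m Tx-Rx separation, a radar velocity of 0.1 m/ns.
l = 10.0                 # Tx-Rx distance (m)
V = 0.1                  # true velocity (m/ns)
t = l / V                # true traveltime (ns)

dt = 2.0                 # constant absolute timing error (ns)
V_est = l / (t + dt)     # velocity estimated from the mistimed pick

rel_err_V = (V_est - V) / V   # relative velocity error
rel_err_t = dt / t            # relative traveltime error

# To first order, the relative velocity error is the negative of the
# relative traveltime error.
print(rel_err_V, -rel_err_t)

# The same absolute error dt matters less over a longer travel distance:
t_far = 2 * l / V
print(dt / t_far)        # half the relative timing error
```

Doubling the travel distance halves the relative timing error, which is why a constant picking error maps into angle-dependent velocity artifacts.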
Nowadays, manual picking by a well-trained human operator is still the best method to obtain reliable first-arrival times. However, some errors can result from the highly repetitive nature of the work or from inadequate working procedures. It is also a very time-consuming task. Quick, accurate first-arrival picking is of great importance when processing large data sets. Although automatic picking schemes exist, they were developed mainly for earthquake seismology and exploration seismic studies. These methods must be adapted and improved for radar applications because of the different nature of the radar wavelet and the poor signal-to-noise ratio (S/N) often encountered for large transmitter-receiver offsets.
In the framework of ground-penetrating radar (GPR) traveltime tomography, adaptations of the crosscorrelation picker are proposed by Messinger (2004) and Irving et al. (2007), whereas Tronicke (2007) devises a scheme based on a statistical criterion. Messinger (2004) proposes the combination of crosscorrelation and short-term averaging/long-term averaging (STA/LTA) techniques to design an automated picking procedure, whereas Irving et al. (2007), in a scheme also based on crosscorrelation, apply the technique to data organized in common-ray-angle gathers.
Although generally performing well, techniques based on crosscorrelation can suffer because similarity of radar waveforms is not always met. On the other hand, methods based on statistical criteria are applied to each trace individually and do not require similarity to be effective. Tronicke (2007) finds that one such criterion, the Akaike information criterion (AIC), performs well except for low-S/N data, and uses ad hoc time windowing based on an apparent velocity to guide the algorithm. In all cases, postprocessing quality control (QC) is required to validate the output of the picking and correct outliers manually.
In this contribution, we propose two alternative approaches to improve the accuracy of picking and decrease the number of postprocessing corrections for outliers inherent to automated methods. One method is derived from the approach of Irving et al. (2007) and is based on crosscorrelation of traces collated by ray-angle interval. To mitigate the effect of differences in waveform shape caused by late arrivals (such as waves refracted at the air-ground or saturated-unsaturated interfaces, or reflected waves) and differences in the first-break phase, a window is applied to isolate the first-break wavelet to crosscorrelate and pick only the first arrival.
The second method operates on each trace independently to bypass the restrictive criterion of waveform similarity. It is based on an improvement of the AIC picker used for P- and/or S-wave picking in seismology (Maeda, 1985) by introducing the continuous wavelet transform (CWT). The CWT acts as a band-pass filter well localized in time. The analysis of the phase part of the transform allows us to better distinguish the change of the radar wavelet phase. Our pickers are added to bh_tomo, a MATLAB borehole georadar 2D tomography package described in Giroux et al. (2007).
Modified crosscorrelation picker
The first method builds on the work of Irving et al. (2007), whose algorithm relies on the crosscorrelation of traces with a reference waveform for which the arrival time is known. The originality of the approach resides in the reference waveform being obtained iteratively from the data set through a specific sorting of traces.
First, the traces are aligned using crosscorrelation, with the reference trace initially being the one with the highest S/N. Then, the aligned traces are stacked to produce a new reference waveform, which is used to realign the traces. This process is repeated until all traces are aligned properly. Finally, the arrival time is picked manually on the reference trace, with the arrival times of the other traces determined from the crosscorrelation lags. The method is said to be semiautomatic because the user must pick the reference trace and intervene during the alignment of the traces to reject those that cannot be aligned properly.
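A minimal numpy sketch of this iterative alignment-and-stacking loop (illustrative only, not the authors' implementation; the trace-rejection and manual steps are omitted, and `n_iter` is a fixed count rather than a convergence test):

```python
import numpy as np

def align_and_stack(traces, ref, n_iter=3):
    """Iteratively align traces on a reference by crosscorrelation,
    then restack the aligned traces into a new reference."""
    n, m = traces.shape
    lags = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        aligned = np.empty_like(traces)
        for i, tr in enumerate(traces):
            xc = np.correlate(tr, ref, mode="full")
            lags[i] = np.argmax(xc) - (m - 1)   # lag of best alignment
            aligned[i] = np.roll(tr, -lags[i])  # note: np.roll wraps around
        ref = aligned.mean(axis=0)              # stack -> new reference
    return lags, ref
```

The crosscorrelation lags returned here are what become the picked times once the reference trace is picked manually.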
To mitigate the effect of changes in the GPR phases on crosscorrelation, Irving et al. (2007) determine a number of different reference waveforms that represent ranges in transmitter-receiver angle in which the arriving pulses have a similar shape. This is achieved by gathering the traces by ray-angle intervals. As they illustrate, this step is essential to obtain high-quality results. The main assumption behind this step is that waveforms recorded at similar angles have similar shapes. Irving et al. (2007) point out clearly that significant problems might arise in situations in which this assumption is violated, such as for data collected above the water table or near the surface, where sharp velocity contrasts, refracted arrivals, and changes in antenna coupling impact greatly the waveforms and frequency content.
To overcome the latter limitation, we propose preprocessing the data in a manner that isolates the first cycle of the GPR waveform (referred to as first-cycle isolation). By doing so, waveform similarity is increased, and interference with scattered arrivals during crosscorrelation can be partially, if not completely, avoided. This step is typically applied to traces with moderate to high S/N because the scattered arrivals usually are not visible on traces with low S/N.
The preprocessing step proceeds as follows: First, each trace is normalized to its maximum absolute value. Then, the traces are crosscorrelated with a wavelet of the form (Arcone, 1991) (3) where f is chosen to be the dominant frequency of the signal. To automate the procedure, the beginning of the first cycle is picked at the first time lag exceeding a given crosscorrelation amplitude threshold, typically 0.25. A time window slightly wider than the wavelet of equation 3 and beginning slightly before the above time-lag datum is applied to nullify the undesirable signals (see Figure 1). In this manner, we ensure the sought onset is not altered. The time window is extended with cosine tapers at the leading and trailing edges to smooth out transitions. Note that the wavelet of equation 3 could be substituted by any other wavelet that would be more adapted to the data at hand. However, because we need to use only a time window to suppress later arrivals and do not need to determine the onset time itself, equation 3 works well enough. Once this step is completed, the procedure of Irving et al. (2007) can then be conducted using the windowed signal.
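The first-cycle isolation step can be sketched as follows; the 0.25 threshold comes from the text, while the window width (1.5 wavelet lengths) and the taper length are illustrative assumptions:

```python
import numpy as np

def isolate_first_cycle(trace, wavelet, threshold=0.25, taper_len=5):
    """Window a trace around its first cycle, nulling later arrivals."""
    tr = trace / np.max(np.abs(trace))                 # normalize the trace
    # Crosscorrelation lags >= 0 between trace and reference wavelet.
    xc = np.correlate(tr, wavelet, mode="full")[len(wavelet) - 1:]
    xc = xc / np.max(np.abs(xc))
    above = np.nonzero(xc > threshold)[0]
    if above.size == 0:
        return trace                                   # nothing exceeds threshold
    start = max(above[0] - taper_len, 0)               # begin slightly early
    stop = min(start + int(1.5 * len(wavelet)), len(trace))
    win = np.zeros(len(trace))
    win[start:stop] = 1.0
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(taper_len) / taper_len))
    win[start:start + taper_len] = ramp                # cosine taper, leading edge
    win[stop - taper_len:stop] = ramp[::-1]            # cosine taper, trailing edge
    return trace * win
```

Starting the window slightly before the threshold crossing ensures the sought onset itself is not attenuated by the taper.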
Another problem that arises when using crosscorrelation for onset picking is that when the dominant frequencies of the wavelets vary, the widths of the wavelets vary and, therefore, they differ from the reference wavelet (Figure 2a). An immediate consequence is that the crosscorrelation lags will not fall exactly at the onset, leading to an error in the picked time. To help minimize this problem, the ability to stretch the first cycles to a common length is added to the procedure. The time scale of each trace is rescaled by a ratio of $f_d/\bar{f}_d$, where $f_d$ is the dominant frequency of the trace and $\bar{f}_d$ is the average dominant frequency over all traces. Because the crosscorrelation picking procedure requires that the time scales be consistent, all stretched traces must be interpolated to a common time scale. An example of the results obtained with this procedure is shown in Figure 2, illustrating the improved similarity between the rescaled traces and the reference obtained after stacking. After completing the picking procedure, one must scale back the obtained times to their original time frames.
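A minimal sketch of the stretch-and-interpolate step (the names `f_dom` and `f_mean` follow the description above; the common time scale is assumed, for simplicity, to be the original axis `t`):

```python
import numpy as np

def rescale_trace(trace, t, f_dom, f_mean):
    """Stretch a trace's time axis by f_dom/f_mean so its first cycle
    matches the average wavelet length, then resample onto the common
    time scale t."""
    t_stretched = t * (f_dom / f_mean)       # rescaled time axis of this trace
    return np.interp(t, t_stretched, trace)  # resample onto the common scale
```

A sinusoid of frequency `f_dom` becomes, after rescaling, a sinusoid of frequency `f_mean`, which is exactly the wavelet-width equalization the correction aims for.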
Obviously, the quality of the time-rescaling correction depends on an accurate estimate of the dominant frequency of the first cycle of the radar wavelet. This estimation is a delicate task because at this step the exact location of the first cycle in the radargram is unknown. As a first estimate, a low-pass Chebyshev filter is typically applied before computing the frequency spectra, and the autoregressive estimator of Burg (Marple, 1987) is used on the whole trace. Autoregressive (AR), moving-average (MA), and autoregressive moving-average (ARMA) models are prediction models that allow relatively simple parametric descriptions of the second-order statistics of random processes, which can be used to efficiently compute their power spectral density. Spectral density estimates obtained using AR models present sharp peaks and offer good spectral resolution (Marple, 1987). In Burg's method, the power spectral density is obtained from the coefficients of an AR model that minimizes the sample-prediction-error variance in the least-squares sense. The frequency sample interval in discrete Fourier spectra is the sampling rate divided by the number of samples N used in the discrete Fourier transform. If the frequency spacing is too large, ambiguities can arise in determining the frequency content if the narrowband signal components have center frequencies that fall between the N frequency evaluation points (Marple, 1987). Hence, a potential source of error is the low resolution in estimating the dominant frequency of the radar wavelet resulting from a relatively large time discretization step and a limited number of samples. To mitigate this, we pad the ends of the traces with zeros to increase the number of points used in the Fourier transform and to decrease the frequency spacing.
The procedure to find the dominant frequency is iterative to avoid the outliers that are frequent with low-S/N data. The algorithm is initiated with the order n of the autoregressive prediction model set to two. If the frequency of maximum energy falls outside a prescribed range around the nominal frequency of the radar antennas, the order n is incremented and the spectrum is recalculated. If a dominant frequency cannot be found before n reaches a prescribed maximum order, the time-rescaling correction is not performed. Note that the Chebyshev low-pass filter is applied for spectral estimation only, i.e., the picking is performed on the unfiltered traces to avoid possible bias caused by phase distortion.
In summary, the preprocessing procedure for one trace includes the following steps:
1) Low-pass filter the trace and store the original trace.
2) Apply zero padding to the trace.
3) Set the order n of the AR model to two.
4) Compute the power spectrum using Burg's algorithm.
5) Set the dominant-frequency estimate to the frequency at which the power is maximum.
6) If the estimate falls outside the prescribed range and n is below the maximum allowed order, increment n, recompute the power spectrum using Burg's algorithm, and set the estimate to the frequency at which the power is maximum. Repeat this step as needed.
7) Retrieve the original, unfiltered trace. If the dominant-frequency estimate lies inside the prescribed range, perform first-cycle isolation if desired. If time rescaling is desired, perform the time-rescaling correction, tag the trace as rescaled, and store the rescaling factor for future reverse correction of the picked time. Otherwise, leave the trace intact.
After applying this sequence to the whole data set, the traces are gathered by their respective associated ray angle and the picking procedure can be initiated.
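The dominant-frequency search at the heart of this sequence can be sketched as below. For brevity, a zero-padded FFT periodogram stands in for Burg's AR estimator and the iterative order increase, and the accepted band of 0.5 to 1.5 times the nominal antenna frequency is an assumed, not quoted, choice:

```python
import numpy as np

def dominant_frequency(trace, dt, f_nom, pad_factor=8):
    """Estimate the dominant frequency; return None if no plausible
    peak is found (in which case time rescaling would be skipped)."""
    n = len(trace)
    nfft = pad_factor * n                     # zero padding tightens frequency spacing
    spec = np.abs(np.fft.rfft(trace - trace.mean(), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, dt)
    f_d = freqs[np.argmax(spec)]              # frequency of maximum power
    lo, hi = 0.5 * f_nom, 1.5 * f_nom         # assumed plausibility band around f_nom
    return f_d if lo <= f_d <= hi else None   # None -> skip time rescaling
```

The `pad_factor` plays the role of the zero padding described above: it reduces the frequency spacing from $f_s/N$ to $f_s/(\text{pad\_factor}\cdot N)$, sharpening the location of the spectral peak.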
AIC-CWT picker
In radar crosshole measurements, there can exist several possible paths followed by the EM energy between the transmitter (Tx) and receiver (Rx), e.g., direct, reflected, refracted, tube waves. In other words, different wave types can be present in a radar trace; therefore, similarity between traces recorded at different locations is never guaranteed. As such, it is appealing to have an automated picking method that does not rely on waveform similarity. For this reason, the AIC picker is chosen as a starting basis because of its simplicity, efficiency, and possible automatic implementation.
For any method, and for the AIC in particular, the quality of the picked times is dependent on the S/N of the recorded traces. Figure 3a shows an example of a typical radar trace and the parameters used in defining the S/N retained in this paper. First, it is necessary to define the strength of the radar signal as

$$A_s = \operatorname{std}(r_i), \quad i \in \left[\, i_M - \frac{1}{f_d\,\Delta t},\; i_M \,\right], \qquad (4)$$

where r is the discretized trace, std is the standard deviation, i is the sample index spanning the interval from $i_M - 1/(f_d\,\Delta t)$ to $i_M$, $i_M$ is the time index of the maximum amplitude of the radar trace, $\Delta t$ is the sampling interval, and $f_d$ is the dominant frequency of the trace. The S/N is defined as

$$\mathrm{S/N} = \frac{A_s}{A_n}, \qquad (5)$$

where $A_n$ is the standard deviation of the noise (usually calculated with the last 100 samples of the radar trace).
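This S/N measure can be sketched as below, assuming the signal window spans roughly one dominant period ending at the amplitude maximum (the exact window of equation 4 may differ):

```python
import numpy as np

def signal_to_noise(trace, dt, f_dom, n_noise=100):
    """Signal strength from a one-period window ending at the amplitude
    maximum, noise strength from the trace tail; S/N is their ratio."""
    i_max = np.argmax(np.abs(trace))                 # index of maximum amplitude
    n_cycle = max(int(round(1.0 / (f_dom * dt))), 1) # samples per dominant period
    a_signal = np.std(trace[max(i_max - n_cycle, 0):i_max + 1])
    a_noise = np.std(trace[-n_noise:])               # last n_noise samples
    return a_signal / a_noise
```

Using the trace tail for the noise estimate assumes, as in the text, that no coherent energy arrives in the last samples of the record.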
The distortion of the shape of the radar waveform (amplitude and phase) is caused by various phenomena and depends essentially on intrinsic ground absorption and scattering, radiation patterns and antenna coupling, wave interference (transmitted, refracted, and reflected), and Tx-Rx distance. In general, the longer the wavepath, the higher the distortion. The S/N is usually high for a small Tx-Rx angle (shorter wavepaths) and can be very low for a high Tx-Rx angle (longer wavepaths). Figure 3b shows an example of a radar trace for a high Tx-Rx angle. For this trace, the S/N is low and the radar signal is attenuated, with the frequency content different from that of the trace with a small Tx-Rx angle shown in Figure 3a.
Before picking, it is advisable to remove some noise to improve the quality of onset timing. Our experience shows that the following preprocessing steps usually work well: trend removal, data selection based on S/N level, and background noise removal by stationary wavelet transform denoising. Trend removal is carried out by linear interpolation. However, in rare cases the trend can be nonlinear, and approaches such as polynomial detrending or high-pass filtering should be used. After this step, all traces that have a low S/N — typically less than five — are discarded. The final preprocessing step is the translation-invariant denoising using the stationary wavelet transform (SWT) (Mallat, 1999).
The theory and numerical implementation of the discrete wavelet transform (DWT) and the SWT are well documented in the literature (Daubechies, 1992; Mallat, 1999). The DWT consists of decomposing the signal in a series of approximations and details using a filter bank and decimation. More precisely, the signal is convolved with two filters constructed from a scaling function (a low-pass filter) and from a wavelet function (a high-pass filter). The two filtered signals are decimated by suppression of one sample out of two. The high-frequency filtered signal (detail coefficients) is stored, and the process is repeated with the low-frequency filtered signal (approximation coefficients). At each level, the scale increases in a dyadic manner resulting from the decimation. In this way, the original signal is decomposed into several scales according to its frequency components.
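One level of this filter-bank decomposition can be illustrated with the Haar filter pair (a minimal stand-in for the Daubechies filters adopted later in the text):

```python
import numpy as np

def haar_dwt_level(signal):
    """One DWT level: low-pass and high-pass filtering followed by
    decimation by two (suppression of one sample out of two)."""
    h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # scaling (low-pass) filter
    g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # wavelet (high-pass) filter
    approx = np.convolve(signal, h)[1::2]     # approximation coefficients
    detail = np.convolve(signal, g)[1::2]     # detail coefficients
    return approx, detail
```

Repeating the call on the approximation coefficients yields the next dyadic scale; the orthonormal filters preserve the signal energy across approximation and detail.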
In contrast to the DWT, the SWT is time invariant, a desirable property for signal reconstruction and denoising (Coifman and Donoho, 1995). The SWT algorithm differs from the DWT algorithm in that for each level, the decimation step is suppressed and the filter is upsampled by inserting zeros between filter coefficients. This way, the lengths of the approximation and detail coefficients are the same as the length of the original signal. For low S/N data, the SWT denoising technique outperforms classical Fourier-based low-pass filtering when the noise variance is high near the lower cutoff frequency of the filter (Dolabdjian et al., 2002). In fact, low-pass filters typically introduce a time shift and a distortion of nonstationary signals such as radar recordings. Conversely, the SWT denoising technique is more selective: The wavelet coefficients of the radar signal are concentrated on only a few large coefficients compared to the noise, which has its energy spread over a large number of smaller coefficients. The denoising technique relies on a thresholding rule that is based on a noise model that removes the low-valued coefficients and on an inverse wavelet transform that reconstructs the radar signal with little loss of detail.
The denoising process can be summarized in three points: compute the SWT of the radar trace, apply a threshold strategy on the detail coefficients of the wavelet transform, and reconstruct the radar trace by synthesizing the denoised coefficients. The threshold strategy involves determining a base noise level in accordance with the data noise model. Coefficients below this threshold are considered noise (and should, therefore, be set to zero), with coefficients above this threshold defining the signal that should be kept and perhaps modified.
Several types of thresholding exist, with hard and soft thresholding the most common (Mallat, 1999). The hard thresholding approach we use sets the coefficients below the threshold to zero and does not modify the others. For thresholding selection rules, the universal thresholding scheme (Donoho and Johnstone, 1994) is used and nonwhite Gaussian noise assumed. Therefore, the universal threshold is calculated level by level. After several tests, a Daubechies wavelet with two vanishing moments (Mallat, 1999, p. 241) is chosen as a mother wavelet. It has a short support and permits a good reconstruction of discontinuities. The stationary wavelet transform is done using three levels of detailed decomposition.
Automatic denoising with hard thresholding does not remove residual spikes. To overcome this problem, the denoised traces are processed with a median filter of very short length, typically five samples, to preserve the radar waveform as much as possible. The result of the preprocessing sequence on a real radar trace is presented in Figure 4. The residual noise is highly smoothed and the shape of the radar waveform is well preserved.
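The hard thresholding and the short median filter can be sketched with numpy alone (the SWT itself is omitted here; `hard_threshold` would be applied to the detail coefficients of the transform):

```python
import numpy as np

def hard_threshold(coeffs, thr):
    """Hard thresholding: zero coefficients below thr, keep others intact."""
    out = coeffs.copy()
    out[np.abs(out) < thr] = 0.0
    return out

def despike(trace, kernel=5):
    """Short median filter (5 samples, as in the text) to remove residual
    spikes while preserving the radar waveform."""
    k = kernel // 2
    padded = np.pad(trace, k, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel)
    return np.median(windows, axis=1)
```

A five-sample median removes isolated spikes completely while leaving monotonic segments of the waveform untouched, which is why such a short kernel suffices.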
The AIC is based on the assumption that the radar trace can be separated into locally stationary segments, each modeled as an autoregressive process. The intervals before and after the onset time are assumed to be two different stationary time series. Maeda (1985) defines the AIC function as

$$\mathrm{AIC}(k) = k \log\{\operatorname{var}(r[1,k])\} + (N - k - 1) \log\{\operatorname{var}(r[k+1,N])\}, \qquad (6)$$

where r is the discretized radar trace of N samples, k is the sample index at which the trace is divided, and var is the variance. The global minimum of the AIC function defines the onset point of the first arrival. In some cases, multiple wave arrivals can occur in a radar trace, and the AIC picker will choose the wavelet that has the strongest phase. In other words, the AIC function contains several local minima, each corresponding to the onset of a particular wavelet, but the global minimum corresponds to the wavelet that has the strongest phase (Zhang et al., 2002). Thus, we must guide the AIC picker by choosing an appropriate time window that contains the onset of the first arrival. Our experience shows that for crosshole radar data, the minimum of the AIC function computed over a trace truncated in time immediately after its maximum amplitude corresponds to the first-arrival onset in most cases. We use this minimum in our algorithm.
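Equation 6 translates directly into code; this sketch scans all split points, skipping degenerate zero-variance segments, and returns the index of the global minimum:

```python
import numpy as np

def aic_pick(trace):
    """Maeda's AIC function: the global minimum over the (pre-truncated)
    trace marks the onset of the first arrival."""
    r = np.asarray(trace, dtype=float)
    n = len(r)
    aic = np.full(n, np.nan)
    for k in range(1, n - 1):
        v1 = np.var(r[:k + 1])        # variance of the segment before the split
        v2 = np.var(r[k + 1:])        # variance of the segment after the split
        if v1 > 0 and v2 > 0:         # skip degenerate (constant) segments
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.nanargmin(aic)), aic
```

As recommended above, the trace should be truncated just after its maximum amplitude before calling `aic_pick`, so the global minimum is the first-arrival onset rather than a later strong phase.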
One noticeable shortcoming of the AIC picker is that the onset time it provides is always somewhat late (or early) compared to the true time, depending on the level and frequency content of the noise. This is illustrated in Figure 5, where the relative error on the time picked with the AIC estimator increases from 0.13% for the noise-free trace to 3.29% for a trace having an S/N of 10.5. Although its performance is improved by the preprocessing steps, better picking precision should be achieved. To obtain a higher level of accuracy, we propose a correction to the AIC method based on the complex continuous-wavelet transform (CWT).
The wavelet transform of a signal allows its analysis in terms of local time-frequency compositions. This is achieved by decomposing the signal over dilated (or scaled) and translated wavelets. The CWT is different from the SWT: in the SWT, the wavelet is sampled in a dyadic manner for fast multilevel evaluation, whereas in the CWT the evaluation can be performed at any desired scale. The 1D continuous wavelet transform of a radar trace $r(t)$ can be seen as the convolution of $r(t)$ with a scaled version of a mother wavelet $\psi$ that acts as a dilated band-pass filter (Mallat, 1999). Evaluated at time b and scale a, the CWT is given by

$$W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} r(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt, \qquad (7)$$

where $\psi^{*}$ is the complex conjugate of the translated and scaled wavelet.
For this application, the complex Morlet wavelet is chosen as the mother wavelet because it offers a good trade-off between time localization and frequency resolution. It is defined as (Teolis, 1998)

$$\psi(t) = \frac{1}{\sqrt{\pi f_b}}\, e^{2i\pi f_c t}\, e^{-t^2 / f_b}, \qquad (8)$$

where $f_b$ is the bandwidth parameter and $f_c$ is the wavelet center frequency. The parameter $f_c$ is set equal to the dominant frequency of the radar trace to evaluate the CWT at the scale of the radar signal. The phase of the CWT reaches π at minima in the signal and crosses zero at the maxima. This can be seen in Figure 5a–d, which shows synthetic traces with an increasing level of noise and the corresponding phase responses of the CWT. In particular, the CWT phase response is a consistent estimator of the phase of the radar trace. Hence, the time at which the phase reaches π nearest to the AIC pick can be used as the onset of the first arrival, provided the first oscillation is positive. As shown in Figure 5, this correction improves the accuracy of the picked times; in this example, the largest relative error is 1% for the AIC-CWT method compared to 3.29% for the AIC alone.
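A single-scale sketch of this phase analysis, using a complex Morlet built with assumed parameter values (`f_b = 1`; the discretized convolution stands in for equation 7 at one scale):

```python
import numpy as np

def morlet_phase(trace, dt, f_c, f_b=1.0):
    """Convolve the trace with a complex Morlet centered on f_c and
    return the phase of the result: it crosses zero at trace maxima
    and reaches +/- pi at trace minima."""
    t = np.arange(-3.0 / f_c, 3.0 / f_c, dt)          # finite wavelet support
    psi = (np.pi * f_b) ** -0.5 * np.exp(2j * np.pi * f_c * t) * np.exp(-t**2 / f_b)
    cwt = np.convolve(trace, np.conj(psi[::-1]), mode="same")
    return np.angle(cwt)
```

Scanning this phase signal for the π value nearest the AIC pick gives the refined onset described above.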
The approach can be summarized as follows:
1) After preprocessing the data, the traces are truncated in time at their maximum amplitude and the AIC is calculated on all truncated traces; the onset pick is determined by finding the AIC minimum.
2) For each trace, the dominant frequency of the radar wavelet is calculated and then used to scale the complex wavelet transform. If the dominant frequency cannot be found with the iterative procedure, the frequency at the maximum of the amplitude spectrum is taken.
3) The position of the first break of the radar wavelet is extracted from the phase part of the complex wavelet transform. We then ensure the CWT correction falls near the AIC pick: if the CWT time lies within a prescribed tolerance of the AIC time, the first-arrival pick is the CWT time; otherwise, the AIC time is retained.
4) After all traces are processed, common-receiver-gather plots with associated first-arrival picks and uncertainties are displayed for visual QC.
PERFORMANCE OF THE PICKERS
To evaluate the performance of the proposed algorithms, we measure the accuracy of the picks and the amount of manual work required at the postprocessing QC step. To achieve this, the true traveltimes of the traces must be known. Synthetic data are generated by finite-difference time-domain (FDTD) modeling and the true traveltimes extracted from the modeled traces.
Two models are considered: the first mimics a heterogeneous saturated sand aquifer, and the second includes a vadose zone and the air-ground interface (see Figure 6a and b). The physical properties (permittivity, conductivity) of both models are determined with a program developed by Giroux and Chouteau (2008). An FDTD scheme in cylindrical coordinates is then used to compute the synthetic crosshole radargrams. The source function is a Ricker wavelet. The grid-cell size and time step are chosen small enough to avoid numerical dispersion and ensure stability. Figure 6c and d shows one transmitter gather for each of the models. Traces obtained from model 1 exhibit a large similarity, whereas traces obtained from model 2 show a greater degree of dissimilarity, with superimposed arrivals related to refracted and reflected waves at the air-ground interface.
Real radar traces are contaminated by various sources of noise, the level of which obviously impacts the accuracy and reliability of any automated picking procedure. This aspect is studied independently by adding three levels of noise to the synthetic traces. Uniformly distributed pseudorandom number sets are scaled to represent noise of 0.1%, 0.2%, and 0.5% of the maximum absolute value of all traces for both data sets and are added to the traces to yield a total of eight data sets, six with noise and two without. An example of the increasing level of noise on a synthetic trace is shown in Figure 5.
The first measure of performance is the time deviation of each picking scheme relative to the true onset times. The true traveltimes are obtained from the noise-free traces by selecting the first time index in the FDTD traces at which the absolute amplitude rises above zero. The FDTD time step is much smaller than the usual sampling interval of field equipment. Hence, to obtain more representative results for the picking tests, the synthetic traces are downsampled from the FDTD time step to a time step typical of field-survey acquisition. The original crosscorrelation scheme of Irving et al. (2007), labeled XC, is compared to the modified crosscorrelation with first-cycle isolation (XC1), the modified crosscorrelation with first-cycle isolation and time rescaling (XC2), and the AIC-CWT picker (AIC).
Figure 7 shows the time deviation relative to the true onset for two levels of noise for model 1 (the heterogeneous model). Figure 7a and b shows the time deviation versus S/N; a positive time deviation means a late pick. Figure 7c and d shows the cumulative curves of absolute relative time deviation and S/N. The cumulative time-deviation curves are asymptotic to the percentage of picked traces, i.e., the ratio of (semi)automatically picked traces to the number of traces in the data set.
From these figures, it appears that the modified crosscorrelation pickers work well for traces having both high and low S/N, with almost all traces picked with less than a 2% deviation. The modifications brought to the original crosscorrelation scheme also improve the results, although the time-rescaling step does not appear to bring significant benefit when compared to first-cycle isolation alone. This is explained by the fact that for these data sets about 95% of the traces have approximately the same dominant frequency. There is almost no dispersion, and the time-rescaling correction is not justified in this instance. Although performing well with cleaner radargrams, the accuracy of the AIC-CWT picker degrades for noisier data. For example, only 44% of the traces have less than 2% deviation on the noisier data set (Figure 7d). When comparing Figure 7a and b, one notices the AIC-CWT picker has the tendency to pick late on noisy data. This decrease in performance at low S/N is observed by Tronicke (2007).
Although the radar traces of model 1 have a high degree of similarity, the waveforms obtained with model 2 (the vadose zone model) are more disparate. Not surprisingly, the performance of the crosscorrelation pickers degrades as the similarity between the traces diminishes. This is visible in Figure 8, which shows the time deviation from the true onset for two levels of noise for model 2. There are many late outliers for all three crosscorrelation-based pickers, even at high S/N values. In most cases, these outliers result from unpicked air-ground refracted arrivals, which have lower amplitudes than the direct arrivals. Although easily spotted in postprocessing, these outliers require manual correction unless, of course, only direct or perhaps reflected arrivals are sought (Tronicke et al., 2001). Nevertheless, the crosscorrelation pickers have absolute deviations of less than 2% for more than 80% of the traces. Here again, there is almost no dispersion in the data, and the time-rescaling step brings no benefit. The AIC-CWT picker has the advantage of not generating outliers but shows a severe degradation of accuracy, caused by the very low overall S/N (e.g., only 13% of the traces have an S/N above 20 for the 0.5% added noise level).
Tables 1 and 2 summarize the results by listing the mean and standard deviation of the timing error for both models at all noise levels. For the data set obtained with model 1, the modified crosscorrelation pickers have a clear advantage, except in the ideal, noise-free scenario, when the AIC-CWT and crosscorrelation pickers perform equally well. Among the crosscorrelation pickers alone, the increased similarity brought by the isolation of the first cycle manifests itself as an overall improvement over the original algorithm. The results for model 2 are different. For noisy data, the mean of the deviation times is comparable for all methods. However, the AIC-CWT has an advantage, again in the noise-free scenario but also in terms of deviation around the mean, which is at worst four times lower than that of the crosscorrelation pickers.
It is important to remember that in processing the arrival times with the crosscorrelation methods, the user can reject the traces that cannot be aligned properly because of a strong dissimilarity with other traces in the common-ray-angle gather. This is done to improve the shape of the stacked reference trace, which impacts the precision of the picks. These rejected traces can be processed manually afterward. On the other hand, the AIC-CWT picker processes all of the traces but sometimes generates outliers for low S/N traces. These outliers must be discarded after visual inspection or picked manually. With the AIC-CWT picker, very noisy traces are rejected on the basis of their S/N by fixing a rejection S/N threshold, and many picking runs can be carried out to choose the best threshold.
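The S/N-based rejection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the amplitude-ratio definition of S/N used here is an assumption (the paper's equation 5 may differ), and the function names and window length are ours.

```python
import numpy as np

def snr(trace, onset_guess, win=50):
    """Crude per-trace S/N: peak amplitude near the expected onset divided
    by the standard deviation of the pre-arrival noise window.
    (Assumed proxy; the paper's equation 5 may define S/N differently.)"""
    noise = trace[:max(onset_guess - win, 1)]
    signal = trace[onset_guess:onset_guess + win]
    return np.max(np.abs(signal)) / (np.std(noise) + 1e-12)

def reject_noisy(traces, onsets, threshold=20.0):
    """Return indices of traces kept for automatic picking; the rest are
    flagged for manual picking. Several runs with different thresholds
    can be compared, as suggested in the text."""
    return [i for i, (t, o) in enumerate(zip(traces, onsets))
            if snr(t, o) >= threshold]
```

In practice, the threshold would be varied over a few picking runs and the value giving the best trade-off between coverage and outliers retained.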
For all methods, picks of unsatisfactory accuracy can be adjusted manually in postprocessing. Some degree of manual intervention is required by all automatic picking methods, but it should be kept as low as possible. Hence, the second measure of performance relates to the number of automatically processed traces and to the amount of manual work needed to achieve the final results.
Tables 3 and 4 list the percentages of automatically or semiautomatically processed traces, manually adjusted picks, and overall processed traces. The results are comparable for all methods, except for the AIC-CWT picker on the noisier data sets, for which more manual postprocessing is needed. It is worth noting, however, that the crosscorrelation methods are semiautomatic and require some level of interaction from the user; their advantage over the AIC-CWT picker therefore holds for manual postprocessing alone.
To confirm the conclusions obtained from the synthetic data, the algorithms are tested further with two sets of data collected in the field. The first was obtained in a limestone quarry where a large campaign was carried out to delineate clay-filled karsts. This data set exhibits a very low overall S/N, with only 25% of the traces having a S/N larger than 20. The radar waveforms in this example show little distortion in shape, and only some traces show superposed and interfering waves.
The second data set was acquired on a concrete structure (Giroux et al., 2006). In contrast to the limestone quarry data set, it is characterized by heavy distortion of the shape of the radar wavelet. The disparity in radar waveforms is caused mainly by the intrinsic dispersive properties of the concrete at the site, in which electrical resistivity measurements performed on 66 coring samples yielded an average of . The disparity is also caused by the presence of reinforcement bars and other reflectors. The second data set is also fairly noisy, with low-frequency and inductive wow events of small to moderate amplitudes, in addition to the usual high-frequency noise. With this data set, a high S/N (as defined in equation 5) can be misleading because the low-frequency noise tends to be interpreted as signal. A RAMAC system from Malå was used, with antennas having nominal frequencies of 250 and for the first and second data collections, respectively.
Figure 9 shows the time deviations of the automated picking methods from manually picked onsets for the two data sets; the manual picks are considered the correct arrival times. Initially, 68.1% of the total number of traces could be picked correctly by hand for the limestone quarry data set. This percentage is only 44.3% for the concrete structure data set. The accuracy of our performance criterion is reduced because the automated methods might pick onsets rejected by the operator during manual picking (this happened for the AIC-CWT picker, as reflected by percentages above 100% in Table 5). Hence, some picks that might in fact be correct are not included in the results shown in Figure 9 because of the absence of reference data. Although not rigorously verifiable, we expect this effect to be marginal.
For the limestone quarry data set, all automated pickers perform equally well, with the modified crosscorrelation methods picking more than 81.5% of the traces with less than a 2% deviation; the AIC-CWT lags slightly behind, with 68.2% of the traces having the same precision. Here, the percentages are calculated with respect to the number of manually picked traces. Therefore, in terms of all available data (including traces not picked manually), the percentages are lower, i.e., on the order of 55% and 46% for the modified crosscorrelation and AIC-CWT pickers, respectively.
In agreement with the results obtained with the synthetic data, the AIC-CWT method is affected more by low S/N than the other methods, as shown in Figure 9a. Also of interest is the fact that the time-rescaling correction improves the results significantly below 4% deviation, i.e., XC2 picked 10% more traces with less than 2% deviation than XC1. This is explained by the strong dispersion observed in the data, as shown in Figure 10.
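The time-rescaling correction discussed above amounts to resampling each isolated first-cycle wavelet to a common window length before crosscorrelation. A minimal sketch, assuming the wavelets are available as 1-D sample arrays; the linear interpolation and function name are our choices, not necessarily the authors':

```python
import numpy as np

def rescale_wavelet(wavelet, n_out):
    """Stretch (or compress) an isolated first-cycle wavelet to a common
    number of samples n_out by linear interpolation. The stretch factor
    must be carried along so that lags picked on the rescaled traces can
    be mapped back to the original time axis."""
    n_in = len(wavelet)
    x_out = np.linspace(0.0, n_in - 1, n_out)
    stretched = np.interp(x_out, np.arange(n_in), wavelet)
    factor = (n_in - 1) / (n_out - 1)  # original samples per resampled sample
    return stretched, factor
```

Under strong dispersion, wavelets of different dominant periods are thus brought to a comparable support, which is why the correction pays off only in dispersive data such as these.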
For the concrete structure data set, the crosscorrelation methods do not appear appropriate. For these schemes, the time deviations are below 2% for at best 24.5% of the manually picked traces. The poor performance of the crosscorrelation pickers is explained by the fact that sorting into common-ray-angle gathers fails to guarantee waveform similarity. Although an improvement over the original crosscorrelation scheme, the first-cycle detector used in the XC1 and XC2 approaches fails to isolate the first wavelet oscillation in a sufficient number of cases, even with high S/N data, as seen in Figure 9b. This is caused by the presence of a highly distorted and relatively low-amplitude first arrival on a significant number of traces. Figure 11 illustrates the traces corresponding to ray angles between 7.5° and 12.5°. On the concrete structure data set, the AIC-CWT picker performs better: it picked about 61% of first arrivals within 2% of the manual picks, as seen in Figure 9d.
With regard to the amount of manual postprocessing work performed, Table 5 lists the performance statistics as percentages calculated with respect to the number of manually picked traces. For the limestone data set, the amount of postprocessing work is relatively similar for all methods; the modified crosscorrelation pickers required slightly less manual work than the AIC-CWT method, at the cost of a slightly lower final number of picked traces.
The situation is different for the concrete data set. For the crosscorrelation pickers, a large number of traces had to be rejected manually in the semiautomated phase of the processing. Because of the lack of similarity within the common-angle gathers, it was difficult to obtain a satisfactory reference waveform from the stacking procedure. This is reflected in the percentages of automatically picked traces for the XC, XC1, and XC2 pickers, which are 67%, 82%, and 76%, respectively. A considerable amount of manual work is also needed in postprocessing to pick the rejected traces and correct the outliers.
The situation is different for the AIC-CWT approach, for which the percentage of automatically picked traces is higher than the number of manually picked traces (120%). In postprocessing, only 26% of the traces were adjusted manually, about half the amount required by the crosscorrelation pickers. A large proportion of these adjusted picks were discarded, the final number of picked traces reaching 113%. The concrete structure data set is an example in which the AIC-CWT method presents a clear advantage over the crosscorrelation methods.
Two new algorithms have been developed to automate first-break picking of crosshole radar data, each having its own strengths and deficiencies. In the first algorithm, a technique based on crosscorrelation of traces sorted into common ray-angle gathers was modified to incorporate a detector designed to isolate the first cycle of the radar signal and a time-rescaling operator to stretch the isolated wavelets to a common length in time. First-cycle isolation improves the results in all tests performed. On the other hand, the time-rescaling correction proves useful only when strong dispersion is observed in the data. The modified crosscorrelation picker is very accurate, even for data with low S/N, as long as the shapes of the radar waveforms are not too disparate. When the waveforms are disparate, the second algorithm, based on the AIC with a correction derived from the CWT, is more robust.
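The AIC component of the second algorithm can be illustrated with the standard two-segment formulation (after Maeda, 1985), which places the onset at the minimum of a criterion balancing the variances of the presumed noise and signal segments. The sketch below is ours and omits the CWT-derived correction and the windowing used in the actual method:

```python
import numpy as np

def aic_pick(trace):
    """Maeda-type AIC onset picker: each candidate split point k models the
    trace as two stationary segments; the AIC minimum marks the most
    likely noise-to-signal transition."""
    x = np.asarray(trace, dtype=float)
    n = len(x)
    aic = np.full(n, np.inf)
    for k in range(1, n - 1):
        v1 = np.var(x[:k])   # variance of the presumed noise segment
        v2 = np.var(x[k:])   # variance of the presumed signal segment
        if v1 > 0 and v2 > 0:
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.argmin(aic))
```

On a clean trace, the minimum falls very close to the true onset; on noisy traces, the CWT correction of the full method is what restores robustness.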
The only drawback of the AIC-CWT picker is its performance degradation at moderately low S/N. Both methods require careful postprocessing quality control to correct the outliers produced by the automated procedures. Based on our relatively short experience with the new algorithms, we estimate that, postprocessing included, the gain in processing time with respect to manual picking is a factor of three to four for the crosscorrelation pickers and perhaps an order of magnitude for the AIC-CWT picker.
One shortcoming of the modified crosscorrelation picker is the inefficiency of the first-cycle detector on highly distorted data sets. A possible improvement would be the development of pattern-recognition criteria to quantify the degree of similarity between traces; sorting the radar traces into groups following these criteria, rather than by ray angle, is likely to produce better results. We are also investigating the advantages of a first-cycle detector based on the AIC-CWT picker. As a final note, both proposed methods require an estimate of the dominant frequency of the radar signal and some form of band-pass filtering. This leads us to suggest that a relatively high time sampling rate be used during data acquisition to achieve high resolution of the frequency spectra.
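The dominant-frequency estimate that both methods require can be obtained from the peak of the amplitude spectrum. A minimal sketch, assuming a single 1-D trace; the zero-padding length and peak-picking strategy are our choices, not the authors' procedure:

```python
import numpy as np

def dominant_frequency(trace, dt, n_fft=4096):
    """Estimate the dominant frequency of a radar trace as the peak of its
    amplitude spectrum. Zero-padding to n_fft refines the frequency grid,
    which is one reason a high time sampling rate at acquisition helps."""
    spec = np.abs(np.fft.rfft(trace, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=dt)
    spec[0] = 0.0  # ignore the DC component
    return freqs[np.argmax(spec)]
```

For example, a trace sampled at 1 ns and dominated by a 100-MHz arrival yields an estimate within a fraction of a spectral bin of 100 MHz.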
This material is based on work supported, in part, by the Natural Sciences and Engineering Research Council of Canada (operating grant RGPIN848 from M.C.), and by the Fonds québécois de la recherche sur la nature et les technologies under contract ER-115473. The authors thank Anaïs Chabagno for testing the algorithms and the associate editor and three anonymous reviewers for feedback that greatly improved the manuscript.
- Received September 18, 2008.
- Revision received December 22, 2008.