Various aspects of the wavelet approach to nonparametric regression are considered, with the overall aim of extending the scope of wavelet techniques to irregularly spaced data, to regularly spaced datasets of arbitrary size, to heteroscedastic and correlated data, and to data that contain outliers. The core of the methodology is an algorithm for finding all of the variances and within-level covariances in the wavelet table of a sequence with given covariance structure. If the original covariance matrix is band-limited, then the algorithm is linear in the length of the sequence. The variance calculation algorithm allows data on any set of independent variable values to be treated, by first interpolating to a fine regular grid of suitable length, and then constructing a wavelet expansion of the gridded data. Various thresholding methods are discussed and investigated. Exact risk formulas for the mean square error of the methodology for a given design are derived. Good performance is obtained by noise-proportional thresholding, with thresholds somewhat smaller than the classical universal threshold. Outliers in the data can be removed or downweighted, and aspects of such robust techniques are developed and demonstrated in an example. Another natural application is to correlated data, where the covariance of the wavelet coefficients is not due to an initial grid transform but rather is an intrinsic feature. The use of the method in these circumstances is demonstrated by an application to data synthesized in the study of ion channel gating. Our basic approach has many other potential applications, some of which are discussed briefly.
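The variance calculation rests on a linear-algebra fact that can be sketched directly: for any linear wavelet transform with matrix W, data with covariance Sigma yield coefficients with covariance W Sigma Wᵀ. The sketch below is a minimal illustration using one level of an orthonormal Haar transform and an MA(1)-type band-limited covariance, not the paper's linear-time algorithm; the function names and the example covariance are assumptions for illustration.

```python
import numpy as np

def haar_matrix(n):
    """One level of the orthonormal Haar transform as an n x n matrix
    (n must be even): first n/2 rows give approximations, last n/2 details."""
    W = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for k in range(n // 2):
        W[k, 2 * k] = s                      # approximation coefficients
        W[k, 2 * k + 1] = s
        W[n // 2 + k, 2 * k] = s             # detail coefficients
        W[n // 2 + k, 2 * k + 1] = -s
    return W

def coefficient_covariance(W, Sigma):
    """Covariance of the wavelet coefficients of data with covariance Sigma."""
    return W @ Sigma @ W.T

n = 8
W = haar_matrix(n)
# band-limited (tridiagonal) covariance: variance 1, lag-1 correlation 0.4
Sigma = np.eye(n) + 0.4 * (np.eye(n, k=1) + np.eye(n, k=-1))
C = coefficient_covariance(W, Sigma)
detail_variances = np.diag(C)[n // 2:]
```

For this Sigma each detail coefficient (x_{2k} - x_{2k+1})/sqrt(2) has variance (1 + 1 - 2*0.4)/2 = 0.6, which the computed `detail_variances` reproduce; for white noise (Sigma = I) the orthonormal transform leaves the identity covariance unchanged.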

The 'Signal plus Noise' model for nonparametric regression can be extended to the case of observations taken at the vertices of a graph. This model includes many familiar regression problems. This article discusses the use of the edges of a graph to measure roughness in penalized regression. Distance between estimate and observation is measured at every vertex in the L2 norm, and roughness is penalized on every edge in the L1 norm. Thus the ideas of total variation penalization can be extended to a graph. The resulting minimization problem presents special computational challenges, so we describe a new and fast algorithm and demonstrate its use with examples. The examples include image analysis, a simulation applicable to discrete spatial variation, and classification. In our examples, penalized regression improves upon kernel smoothing in terms of identifying local extreme values on planar graphs. In all examples we use fully automatic procedures for setting the smoothing parameters. Supplemental materials are available online.
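The objective being minimized can be written down concretely: a squared (L2) data-fit term at every vertex plus an L1 penalty on every edge difference. The toy sketch below, which is not the fast algorithm the abstract describes, minimizes this objective by brute-force grid search on a two-vertex graph to show the characteristic "fusion" effect of the L1 edge penalty; the graph, data, and penalty value are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def objective(f, y, edges, lam):
    """Penalized graph regression: squared error at each vertex plus an
    L1 (total variation) penalty on each edge difference."""
    fit = sum((fi - yi) ** 2 for fi, yi in zip(f, y))
    rough = sum(abs(f[u] - f[v]) for u, v in edges)
    return fit + lam * rough

# toy graph: two vertices joined by a single edge
y = (0.0, 0.8)
edges = [(0, 1)]
lam = 1.0          # penalty exceeds |y[0] - y[1]|, so the estimate fuses

grid = np.linspace(-0.5, 1.5, 201)
best = min(itertools.product(grid, grid),
           key=lambda f: objective(f, y, edges, lam))
```

Because the penalty lam = 1 exceeds the gap |y0 - y1| = 0.8, the minimizer sets both fitted values equal to the mean 0.4: the nondifferentiability of the L1 edge penalty at zero produces exactly constant pieces, which is why it identifies local extreme values so cleanly, and also why the minimization needs special-purpose algorithms.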

Suppose that we observe independent random pairs (X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n). Our goal is to estimate regression functions such as the conditional mean or beta-quantile of Y given X, where 0 < beta < 1. In order to achieve this we minimize criteria such as, for instance, sum_{i=1}^n rho(f(X_i) - Y_i) + lambda . TV(f) among all candidate functions f. Here rho is some convex function depending on the particular regression function we have in mind, TV(f) stands for the total variation of f, and lambda > 0 is some tuning parameter. This framework is extended further to binary or Poisson regression, and to include localized total variation penalties. The latter are needed to construct estimators adapting to inhomogeneous smoothness of f. For the general framework we develop noniterative algorithms for the solution of the minimization problems which are closely related to the taut string algorithm (cf. Davies and Kovac, 2001). Further, we establish a connection between the present setting and monotone regression, extending previous work by Mammen and van de Geer (1997). The algorithmic considerations and numerical examples are complemented by two consistency results.
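The role of rho in this criterion can be made concrete with a minimal sketch: squared loss targets the conditional mean, while the check loss targets the conditional beta-quantile. The sketch below only evaluates the criterion for a piecewise-constant candidate f given by its values at the design points; it does not solve the minimization, and the function names and example data are assumptions.

```python
def tv(fvals):
    """Total variation of a vector: sum of absolute successive differences."""
    return sum(abs(a - b) for a, b in zip(fvals[1:], fvals[:-1]))

def rho_mean(r):
    """Squared loss: minimizing it targets the conditional mean."""
    return r * r

def make_rho_quantile(beta):
    """Check loss: minimizing it targets the conditional beta-quantile."""
    def rho(r):                     # r = f(X_i) - Y_i
        return (1 - beta) * r if r >= 0 else -beta * r
    return rho

def criterion(fvals, yvals, rho, lam):
    """sum_i rho(f(X_i) - Y_i) + lambda * TV(f), with f represented by its
    values at the (sorted) design points."""
    return sum(rho(f - y) for f, y in zip(fvals, yvals)) + lam * tv(fvals)

y = [0.0, 1.0, 0.9, 2.0]
f = [0.5, 0.5, 1.5, 1.5]            # a piecewise-constant candidate
val_mean = criterion(f, y, rho_mean, lam=0.5)
val_median = criterion(f, y, make_rho_quantile(0.5), lam=0.5)
```

For beta = 0.5 the check loss reduces to half the absolute loss, so the same framework covers median regression; a localized penalty would simply replace the single lambda with edge-dependent weights inside `tv`.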

The problem of bivariate density estimation is studied with the aim of finding the density function with the smallest number of local extreme values that is adequate for the given data. Adequacy is defined via the Kuiper metric. The concept of the taut-string algorithm, which provides adequate approximations with a small number of local extrema, is generalised to two- and higher-dimensional data using Delaunay triangulation and diffusion filtering. The results are based on the one-dimensional equivalence between the taut-string algorithm and the method of solving the discrete total variation flow equation. The generalisation and some modifications are developed, and their performance for density estimation is demonstrated.
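The discrete total variation flow that underlies this equivalence can be sketched in one dimension. The version below is a minimal illustration, not the scheme used in the paper: it takes explicit Euler steps on a smoothed TV gradient, where the smoothing parameter eps, the step size, and the starting vector are assumptions chosen so the scheme stays stable.

```python
import numpy as np

def tv_flow_step(u, dt=0.02, eps=0.01):
    """One explicit Euler step of regularized 1D total variation flow:
    du_i/dt = g(u_{i+1} - u_i) - g(u_i - u_{i-1}),
    with g(d) = d / sqrt(d^2 + eps) smoothing the sign function,
    and Neumann (replicated) boundaries."""
    up = np.pad(u, 1, mode="edge")          # replicate the endpoints
    d = np.diff(up)                         # forward differences, length n+1
    g = d / np.sqrt(d * d + eps)
    return u + dt * (g[1:] - g[:-1])

def total_variation(u):
    return float(np.sum(np.abs(np.diff(u))))

u = np.array([0.0, 1.0, 0.0, 1.0, 0.0])    # oscillatory initial data
tv0 = total_variation(u)
for _ in range(20):
    u = tv_flow_step(u)
tv20 = total_variation(u)
```

The flow moves peaks down and valleys up, so the total variation decreases monotonically and spurious local extrema are removed first, which is the behaviour the taut-string construction reproduces exactly in one dimension.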