The options discussed in this chapter concern testing for appropriate functional specification, ordinary least squares (OLS) models, generalized least squares (GLS) models, minimum absolute deviation (L1) models, MINIMAX models, and weighted least squares (WLS) models. The appropriate syntax for running these options is contained in the regression, reg and robust commands of Stokes (2000b). It is assumed that the user has entered his data in B34S and is ready to estimate the model. After a detailed discussion of the statistics available, a number of examples are shown.

Keywords: Generalized least squares (GLS), Minimum absolute deviation (L1), MINIMAX, Weighted least squares (WLS), BLUS residual analysis, BAYES option, Constrained least squares (CLS), Best regression, Stepwise regression, Heteroskedasticity tests, Davidson-MacKinnon leverage analysis, Least Trimmed Squares (LTS) analysis

The econometric analysis discussed in this chapter concerns models with restrictions on the range of the dependent variable. The B34S commands discussed in this chapter include the following:

probit: Runs a single-equation probit model for a 0-1 dependent variable

mprobit: Runs a single-equation probit model for a dependent variable with up to ten ordered states

loglin: Runs up to a four-equation logit model for 0-1 dependent variables

mloglin: Runs up to a 69-equation logit model with each dependent variable having up to 20 states

tobit: Runs a single-equation tobit model in which the dependent variable is bounded from above or below

In addition, the Poisson model is discussed and estimated with the matrix command. Good general references for this chapter are Maddala (1983) and Daganzo (1979). Promising non-parametric methods such as the Random Forest Model are discussed in Chapter 17.

Keywords: Logit, Probit, Tobit, Multinomial Logit, Multinomial Probit, Poisson Models

In section 4.1, after first discussing the basic simultaneous equations model, the constrained reduced form, the unconstrained reduced form and the final form are introduced. In section 4.2 the theory behind the QR approach to simultaneous equations modeling as developed by Jennings (1980) is discussed in some detail. The simeq command performs estimation of systems of equations by the methods of OLS, limited information maximum likelihood (LIML), two-stage least squares (2SLS), three-stage least squares (3SLS), iterative three-stage least squares (I3SLS), seemingly unrelated regression (SUR) and full information maximum likelihood (FIML), using code developed by Les Jennings (1973, 1980). The Jennings code is unique in that it implements the QR approach to estimate systems of equations, which results in both substantial savings in time and increased accuracy. The estimation methods are well known and covered in detail in such books as Johnston (1963, 1972, 1984), Kmenta (1971, 1986), and Pindyck and Rubinfeld (1976, 1981, 1990) and will only be sketched here. What will be discussed are the contributions of Jennings and others. The discussion of these techniques closely follows material in Jennings (1980) and Strang (1976).

Section 4.3 illustrates estimation of variants of the Kmenta model, while section 4.4 illustrates an exactly identified model. Section 4.5 shows how OLS, LIML, 3SLS and FIML can be estimated using the matrix command. The code there is intended for illustration and benchmarking, not for production use. Section 4.6 shows the matrix command subroutines LS2 and GAMEST, which estimate single-equation 2SLS and GMM models, respectively.

Keywords: Estimation of Structural Models, OLS, LIML, LS2, LS3, ILS3, Generalized method of moments (GMM), Two-stage least squares, Three-stage least squares, B34S matrix implementation

The literature on the analysis of panel data is comprehensively covered in Wooldridge (2002) and Baltagi (2005). Greene (2003) and Wooldridge (2006, Chapters 13-14) provide a more accessible treatment. The goals of this chapter include discussion of the B34S ecomp command, which implements the Freiden (1973) error-components program as heavily modified by Henry, McDonald and Stokes (1976), and a number of examples programmed with the B34S matrix command that illustrate one-way and two-way fixed effects models. The ecomp command implements two ways to estimate one-way random effects models. The Nerlove approach (1971b) can be used when there is variance within the panels. The Henry, McDonald and Stokes (1976) approach to random effects estimation can be used where there is no variance within the panel, such as when a gender dummy or ethnicity dummy is used.

The error-components model provides a way in which a data set containing pooled-time-series and cross-section data can be modeled without having to assume the error structure is the same across time and across regions. The ecomp command will estimate a model that is the same as a pooled-cross-section time-series model with a cross-section-specific dummy variable. The ecomp procedure can also be used on a purely time-series model, if correction with seasonal dummies is desired, but the specific seasonal dummy coefficients are not needed. The ecomp procedure also allows seasonal harmonic analysis. An example showing these more novel uses of the ecomp procedure will be presented later.
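The equivalence noted above, between the fixed-effects estimate and a pooled regression with cross-section-specific dummy variables, can be illustrated with the within (group-demeaning) transformation. The following is a minimal Python sketch, not B34S code and not the ecomp algorithm; the function names and the small data set are hypothetical and were constructed so that each cross section has its own intercept and a common slope of 2.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope from regressing group-demeaned y on group-demeaned x.

    For one regressor this equals the slope from a pooled regression
    that includes a separate dummy variable for every group.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        sums[g][0] += xi
        sums[g][1] += yi
        sums[g][2] += 1
    means = {g: (s[0] / s[2], s[1] / s[2]) for g, s in sums.items()}
    sxy = sxx = 0.0
    for g, xi, yi in zip(groups, x, y):
        dx = xi - means[g][0]
        dy = yi - means[g][1]
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx


def pooled_slope(x, y):
    """Slope from a pooled regression with one common intercept."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    sxx = sum((xi - xm) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical panel: intercepts 1, 5 and 10, common slope 2.
groups = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
x = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 16.0, 18.0, 20.0]

fe_slope = within_slope(groups, x, y)
ols_slope = pooled_slope(x, y)
print(fe_slope, ols_slope)
```

In this constructed example the pooled slope is pushed away from 2 because the group intercepts are correlated with x, while the within estimate recovers the common slope.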

The literature on random effects error components analysis is vast and there are a large number of ways to estimate random effects models, many of which are discussed in Baltagi (2005). For random effects models a number of important techniques are provided.

Keywords: Error-Component Model, Fixed-effect model, Balanced model, Unbalanced model

The B34S transprob command estimates and decomposes a Markov probability model. The basic code for estimation of the Markov model was adapted from Lee, Judge and Zellner (1970). The decomposition of the Markov model implements the approach suggested by Theil (1972, Chap. 5). A good discussion of the theory of Markov chains is contained in Isaacson and Madsen (1976). Recent economic applications are contained in Stokey and Lucas (1989, Chaps. 8, 11, and 12). After an initial overview of the model, the Theil (1972) decompositions are discussed. The Kosobud and Stokes (1979) model of OPEC behavior is used to illustrate an economics application of Markov model building.
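As a rough illustration of what estimating a transition probability matrix means, the sketch below computes row-normalized transition counts from a single observed sequence of states. This is an assumption made for brevity: it is the simplest micro-data estimator and is not the transprob algorithm, which is adapted from Lee, Judge and Zellner (1970). The function name and the sample sequence are hypothetical.

```python
def transition_matrix(states, n_states):
    """Estimate P[i][j] = Pr(next state = j | current state = i).

    Counts observed i -> j transitions and divides each row by its total.
    Rows for states that are never left are returned as zeros.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical two-state sequence (states coded 0 and 1).
sequence = [0, 0, 1, 0, 1, 1, 0, 0, 1, 1]
P = transition_matrix(sequence, 2)
for row in P:
    print(row)
```

Each estimated row sums to one, which is the defining restriction on a transition probability matrix.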

Keywords: Markov model, Decomposing the Transition Probability Matrix, Estimating the Transition Probability Matrix

The B34S commands bjiden and bjest control identification and estimation of Box-Jenkins (1976) models. The basic code was originally developed by David Pack (1977) and was extensively modified by the author. This code also runs under the matrix command where the autobj command can be used to automatically identify an ARIMA model. The basic reference for the theory that underlies these commands is the classic work of Box and Jenkins (1976), which has been updated as Box, Reinsel and Jenkins (1994). Nelson (1973) provides a simplified treatment of univariate models, while a more modern treatment is found in Enders (1995) and Hamilton (1994). Since these excellent references are available, only a brief discussion of the identification and estimation steps will be given here. Zellner and Palm (1974) outline how the Box-Jenkins univariate and transfer function models are special cases of the more general vector autoregressive moving-average models. The simultaneous equations model is shown to be another special case. The btiden and btest commands to estimate these more complex models are discussed in Chapter 8, while the simultaneous equations command simeq is discussed in Chapter 4. A frequency interpretation of the VAR model is discussed in Chapter 12.

Keywords: Box-Jenkins ARIMA, Autocorrelation, Transfer function

The B34S commands btiden and btest control the identification and estimation of VAR and VARMA models. The basic code for this section was obtained from the Department of Statistics, University of Wisconsin, and is documented in Tiao et al. (1979) and Tiao and Box (1981). After first discussing how VAR and VARMA models relate to structural models and the concept of Granger (1969) causality, VAR and VARMA model identification is reviewed. These models are used to test for Granger (1969) causality. The Hinich (1982) nonlinearity test is shown to be an added diagnostic tool to determine whether the model is correctly specified. The VAR and VARMA models are illustrated using the gas furnace data, which is also used in Chapter 12 to illustrate the frequency approach to VAR model estimation proposed by Geweke (1982a, 1982b, 1982c, 1984).

Keywords: VAR, VARMA, Hinich test, Stock-Watson test, Moving Stock-Watson test

In recent years there has been increasing interest in the development of tests to determine the correct functional specification of linear and nonlinear models. Potential specification problems include the stability of the coefficients over time and the effect of omitted variables on the estimated coefficients. This chapter discusses in some detail both the recursive residual technique (RR), outlined in the classic article by Brown, Durbin, and Evans (1975), and the less well known but powerful Flexible Least Squares (FLS) approach developed by Kalaba-Tesfatsion (1989). Both procedures involve tests for equation misspecification that do not make assumptions about the nature of the possible problem. The recursive residual techniques have been implemented in the rr procedure and as an option of the matrix command OLSQ. The FLS approach is implemented as the matrix command FLS.

The recursive residuals can be shown to be of the LUS class (linear unbiased scalar), of which the BLUS residual (see sec. 2.9) is a special case. Many of the BLUS tests can be applied to the recursive residuals, with the advantage that the calculation of the RR is computationally simpler than the corresponding calculation of BLUS residuals. Recursive residual analysis involves calculating updated coefficient vectors as additional observations are added to the regression. Using such methods, tests can be made for parameter stability. Analogous tests are not possible with the BLUS technique. The FLS approach develops coefficient estimates that yield vector-minimal sums of squared residuals (measurement error) plus weighted dynamic errors [measured as the weighted discrepancy ] conditional on the observation. As the weights increase, the solution approaches the OLS model. Statistical tests on the means of individual coefficients are used to suggest whether the model is stable for that variable. Unless the model is perfectly specified, the measurement error of an FLS model is in general less than that of the corresponding OLS model.
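The idea of updating the coefficient vector as observations are added can be made concrete for a regression with an intercept and one regressor. The sketch below is a Python illustration under stated assumptions, not the rr procedure or the OLSQ option: it uses the standard definition of the recursive residual, the one-step-ahead prediction error scaled by sqrt(1 + x_t'(X'X)^{-1}x_t), where X holds the observations before t. Function names and data are hypothetical.

```python
import math


def recursive_residuals(x, y):
    """Recursive residuals for y = a + b*x, starting at the third observation.

    At each step the model is fitted to all earlier observations and the
    scaled one-step-ahead prediction error is recorded.
    """
    n = sx = sxx = sy = sxy = 0.0
    resid = []
    for t in range(len(x)):
        if t >= 2:
            det = n * sxx - sx * sx              # determinant of X'X
            b = (n * sxy - sx * sy) / det        # slope from earlier data
            a = (sy - b * sx) / n                # intercept from earlier data
            # Quadratic form x_t'(X'X)^{-1}x_t with x_t = (1, x[t])
            q = (sxx - 2.0 * x[t] * sx + n * x[t] ** 2) / det
            resid.append((y[t] - a - b * x[t]) / math.sqrt(1.0 + q))
        n += 1.0
        sx += x[t]
        sxx += x[t] ** 2
        sy += y[t]
        sxy += x[t] * y[t]
    return resid


def ols_rss(x, y):
    """Residual sum of squares from the full-sample regression."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x)
    b = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / sxx
    a = ym - b * xm
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data with a slope change after the fifth observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 14.5, 17.2, 19.8]

w = recursive_residuals(x, y)
rss_from_rr = sum(wi * wi for wi in w)
rss_full = ols_rss(x, y)
print(w)
print(rss_from_rr, rss_full)
```

A useful check is that the squared recursive residuals sum to the full-sample OLS residual sum of squares, which is the identity that cusum-type stability tests build on.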

Keywords: Ordinary least squares (OLS), Recursive residuals (RR), Flexible least squares (FLS), Linear unbiased scalar (LUS), Cusum test

The qr command in B34S allows calculation of a high accuracy solution to the OLS regression problem and is discussed in section 10.1. Using this command the principal component regression can be calculated to give further insight into possible rank problems and is discussed in section 10.3. The matrix command provides substantially more capability and can be used to provide additional insight. After an initial survey of the theory, a number of examples are presented. Further examples on this important topic are presented in section 16.7 which involves the matrix command and illustrates accuracy gains due to data precision and alternative methods of calculation. Section 10.3 discusses the ridge, lasso, least trimmed squares and elastic net models which can be used to shrink the variables on the right hand side and/or remove outliers. Section 10.4 is devoted to boosting while extended examples for many of the procedures discussed are given in section 10.5.
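To show why a QR factorization gives a route to the OLS solution that avoids forming X'X, the sketch below factors X = QR and solves R b = Q'y by back-substitution. It is a Python illustration only: it uses modified Gram-Schmidt for brevity, which is an assumption and should not be read as the method used by the qr command. Function names and data are hypothetical.

```python
import math


def qr_mgs(X):
    """Thin QR factorization by modified Gram-Schmidt.

    Returns Q as a list of k orthonormal columns and R as a k-by-k
    upper-triangular matrix. Assumes X has full column rank.
    """
    n, k = len(X), len(X[0])
    Q = [[X[i][j] for i in range(n)] for j in range(k)]  # work on columns
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        R[j][j] = math.sqrt(sum(v * v for v in Q[j]))
        Q[j] = [v / R[j][j] for v in Q[j]]
        for m in range(j + 1, k):
            R[j][m] = sum(a * b for a, b in zip(Q[j], Q[m]))
            Q[m] = [b - R[j][m] * a for a, b in zip(Q[j], Q[m])]
    return Q, R


def ols_qr(X, y):
    """Solve the least squares problem via R b = Q'y."""
    Q, R = qr_mgs(X)
    k = len(R)
    qty = [sum(q * yi for q, yi in zip(Q[j], y)) for j in range(k)]
    beta = [0.0] * k
    for j in range(k - 1, -1, -1):  # back-substitution
        s = qty[j] - sum(R[j][m] * beta[m] for m in range(j + 1, k))
        beta[j] = s / R[j][j]
    return beta


# Hypothetical exact relationship y = 1 + 2*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[1.0, xi] for xi in xs]
y = [1.0 + 2.0 * xi for xi in xs]

beta = ols_qr(X, y)
print(beta)
```

Because the normal equations are never formed, the conditioning of the problem is that of X rather than of X'X, which is the source of the accuracy gain in ill-conditioned problems.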

Keywords: Ordinary least squares (OLS), QR (High Accuracy) estimation, Principal-component regression, Ridge regression, Lasso, Elastic Net models, General linear models (GLM), Shrinkage models, LTS routines for resistant estimation, Boosting

The B34S matrix command contains a general programming language with commands that allow estimation of a wide variety of nonlinear models. This facility contains a superset of the capability that was contained in the older, and now replaced, nonlin command that estimated a nonlinear least squares model that was specified by the user in a Fortran subroutine. Stokes (1991, 1997 chapter 11) contains a discussion of this older approach which, although it required a dynamic link to a Fortran program, was extremely fast. Using the matrix command, the user specifies the model in a 4th generation programming language where a number of specific commands can be used to solve the model depending on the problem.

Keywords: Nonlinear modeling, Nonlinear estimation, ARCH, GARCH

The varfreq command allows decomposition of the VAR model into the frequency domain to provide additional insight into the dynamic structure implicit in the model, using techniques developed by Geweke (1982a, 1982b, 1982c, 1984). In economics the concepts of "short run" and "long run" usually mean short lag and long lag. For example, in the area of monetary policy, the income effect, which measures a positive correlation between changes in the monetary base and interest rates, is thought to take up to 16 months. The shorter term liquidity effect, or negative correlation between changes in the monetary variable and the interest rate, is thought to take only a few months (Stokes and Neuburger 1979). Clearly, in these cases the concept of long and short run can be appropriately thought of in terms of time lag. However, in other situations we might define "short run" to mean high frequency and "long run" to mean low frequency. As an example, consider the behavior of prices during the long deflation in the United States after the Civil War until nearly the turn of the century. It might be reasonable to suppose that a rational person would regard the long run (low frequency) component of the price movements as "expected price movements" and the short run (high frequency) movements as "unexpected price movements." Both measures are present in a VAR model such as equation (8.1-9). The varfreq command allows these effects to be measured.

The kfilter command provides the capability to estimate a state space model with use of an identification strategy suggested by Aoki (1987). A state space model can be written in VARMA form and vice versa. The Aoki approach uses the covariance matrices between the past data and future realizations of time series to build the Hankel matrix of covariance matrices. Aoki (1987) argues that the numerically determined rank of this Hankel matrix is the dimension of the state space model. The kfilter command provides an automatic means by which a state space model can be estimated. It will provide a baseline forecast by which ARIMA and VARMA model forecasts can be tested.

The topic of unit roots in time series models impacts all time series model building procedures. The B34S polysolv command is used to test models for unit roots. The dangers of unit roots are illustrated with sample data and various tests are performed using the pgmcall procedure. Finally, a very brief introduction to ARCH, GARCH and ARCH-M model building is shown. A more extensive discussion is in Stokes (200x, Chapter 7).
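Checking a model for unit roots by solving its lag polynomial can be illustrated for an AR(2) process y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t. The Python sketch below is not the polysolv command; it solves the characteristic equation z^2 - phi1*z - phi2 = 0 directly, under the convention that the process is stationary when every root lies strictly inside the unit circle. The function names and parameter values are hypothetical.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of z**2 - phi1*z - phi2 = 0 (the inverse-root convention)."""
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    return (phi1 + disc) / 2.0, (phi1 - disc) / 2.0


def largest_modulus(phi1, phi2):
    """Modulus of the largest characteristic root."""
    return max(abs(root) for root in ar2_roots(phi1, phi2))


# (1 - 1.5B + 0.5B^2) factors as (1 - B)(1 - 0.5B): one root on the unit circle.
unit_root_case = largest_modulus(1.5, -0.5)

# A stationary case: both roots lie inside the unit circle.
stationary_case = largest_modulus(0.5, 0.3)

print(unit_root_case, stationary_case)
```

A largest modulus equal to one signals a unit root, so the series would need differencing before an ARMA model is fitted. A value above one indicates an explosive process.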

Keywords: Frequency decomposition, Kalman filtering, Unit roots, ARCH, GARCH

The optcontrol command implements the Chow (1975, 1981) optimal control program in B34S. Because this program has already been well documented above, the decision was made to support both the old command syntax and the B34S command language. This chapter briefly describes what is being calculated and illustrates some of the output from running the Klein-Goldberger model.

Keywords: Optimal control analysis

The B34S regression command, which is discussed in Chapter 2, contains a number of tests for equation specification. These include serial correlation, heteroskedasticity, lack of normality of the error and nonlinearity using BLUS analysis. Often when a model failed a specification test, the user was forced to try various model alternatives to try to correct the problem. The bispec sentence, which was documented in Chapter 8 and which is callable under the rr, bjiden, bjest, btiden, btest and matrix commands, implements more comprehensive tests for Gaussianity and nonlinearity developed by Hinich (1982). An important feature of these tests was that they could be applied to raw series or series filtered with linear models, since any nonlinearity in the data would pass through the linear filter. If the exact form of the nonlinearity was known, then the nonlinear capability in the matrix command, which was documented in Chapter 11, could be used. The problem was that the exact form of the nonlinearity is usually not known.

The rr procedure, which was documented in Chapter 9, can be used to perform recursive residual tests on time series and cross sectional models. In many cases it is possible to detect a potential nonlinearity problem, only to be faced with a long trial-and-error process in attempting to correct the misspecification. A major assumption of any linear procedure is that the coefficients are stable no matter what the level of the explanatory variable and, in the case of a time series model, for all time periods. If the problem is a level-specific variation in the coefficients, and the data is appropriately sorted, the Hinich (1982) test will often indicate nonlinearity. Using recursive residual analysis, such a problem may be seen as a lack of stability of the coefficients that is a function of the level of a specific variable. The researcher is then faced with experimenting with interaction and power terms in the model to attempt to correct the stability problem in the plot of the coefficients, with the goal that in the final model the coefficients are invariant to the level of any specific variable. Achieving this in practice is often a long and difficult process, further complicated by higher order interactions by level. Fortunately, as shown below, the GAM can provide some insight into which, if any, of the right hand side variables are nonlinear.

The GAM, ACE, MARSpline and approaches, which are documented in this chapter, allow the user to specify models in which the coefficients differ by level. An advantage of these procedures is that the estimated nonlinearity is determined by the algorithm. In the MARSpline approach, in addition to simple knots, complex nonlinear interaction terms can be specified. The GAM and related ACE and LOESS approaches can be thought of as diagnostic tools whereby nonlinearity can be visually detected. These procedures will be discussed next and examples will be provided to illustrate their use. A number of non-parametric alternatives such as Projection Pursuit, Random Forest, Exploratory Projection Pursuit, Regularized Discriminant Analysis, Recursive Covering and Cluster Analysis are discussed in some detail in Chapter 17.

The MARSpline procedure provides a systematic nonlinear estimation strategy that is particularly powerful in situations where there are large numbers of right-hand variables and low-order interaction effects. The equation switching model, in which the slope of the model suddenly changes for a given value of the X variable, is a special case of the MARS model. The MARSpline procedure can detect and fit models in situations where there are distinct breaks in the model, such as are found if there is a change in the underlying probability density function of the coefficients and where there are complex variable interactions. After an overview of MARS, GAM, ACE and modeling, some examples will be presented to illustrate the use of these procedures. These include the gas furnace data, which was discussed in Chapters 7 and 8. Next, five sample datasets of 15,000 observations will be built and discussed. Logit options in the MARSpline command are used to reanalyze the McManus data discussed in Chapter 3. Datasets suggested by Friedman and Breiman will be studied. Finally, the Sinai-Stokes (1972) and Benzing (1989) production function data are analyzed with the MARSpline approach. The procedure provides a complementary approach to the MARSpline technique that is particularly good in situations where there are few variables but the nonlinear surface is smooth and of high order, i.e., not involving a sharp local structure. The approach has been found to be useful in situations in which the MARSpline approach is not feasible. The GAM, LOESS_MV and ACE class of models provide an alternative way to search for and detect nonlinearity. These are documented in a number of places including the seminal reference Hastie-Tibshirani (1990).
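The equation switching special case mentioned above can be illustrated with a single hinge function max(0, x - knot), the building block of MARS-type models. The Python sketch below is a deliberate simplification and not the MARSpline command: it searches a list of candidate knots and keeps the one whose one-hinge regression has the smallest sum of squared errors, whereas a full MARS fit uses mirrored hinge pairs, interactions and forward and backward stepwise selection. All names and data are hypothetical.

```python
def hinge(x, knot):
    """Right hinge basis function: zero up to the knot, linear after it."""
    return max(0.0, x - knot)


def fit_one_knot(x, y, candidates):
    """Pick the knot whose model y = a + b*hinge(x, knot) has minimum SSE.

    Returns (sse, knot, intercept, slope) for the best candidate.
    """
    n = len(x)
    ym = sum(y) / n
    best = None
    for knot in candidates:
        h = [hinge(xi, knot) for xi in x]
        hm = sum(h) / n
        shh = sum((hi - hm) ** 2 for hi in h)
        if shh == 0.0:
            continue  # hinge is constant over the sample; skip this knot
        slope = sum((hi - hm) * (yi - ym) for hi, yi in zip(h, y)) / shh
        intercept = ym - slope * hm
        sse = sum((yi - intercept - slope * hi) ** 2 for hi, yi in zip(h, y))
        if best is None or sse < best[0]:
            best = (sse, knot, intercept, slope)
    return best


# Hypothetical switching model: flat at 1 until x = 4, then slope 2.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * hinge(xi, 4.0) for xi in x]

sse, knot, intercept, slope = fit_one_knot(x, y, candidates=x[1:-1])
print(knot, intercept, slope, sse)
```

In this constructed example the search recovers the break point, which is the sense in which the algorithm, rather than the user, determines the estimated nonlinearity.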

Keywords: General additive models (GAM), Multivariate adaptive regression splines (MARSpline), Alternating conditional expectations (ACE), LOESS

For single series the B34S spectral command calculates the Fourier cosine and sine coefficients, the periodogram and, if weights are supplied, the spectrum. The spectral command provides substantially more capability than the spectral option available under the B34S bjiden and bjest sentences. For multiple series the spectral paragraph calculates the real part of the cross periodogram, the imaginary part of the cross periodogram, the cospectral density estimate, the quadrature-spectrum estimate, the amplitude, the coherency squared and the phase spectrum. After a brief discussion of the theory, the spectrum is calculated for generated AR data with positive and negative coefficients. While spectral analysis usually proceeds by a Fourier decomposition of the series into cosine and sine coefficients, it is possible to use OLS methods to make these calculations. The advantage of the OLS approach is that it highlights the relationship between time and frequency domain approaches. The matrix command also contains substantial spectral programming capability that includes spectral, cspectral and fft commands together with complex variable manipulation capability.
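The link between the Fourier decomposition and OLS can be checked numerically. The Python sketch below, which is not B34S code, computes the cosine and sine coefficients at one Fourier frequency 2*pi*k/n in two ways: directly from the Fourier sums, and by regressing the series on cos and sin terms through the 2-by-2 normal equations. It assumes 0 < k < n/2 so that the two regressors are orthogonal over the sample. The function names and the generated series are hypothetical.

```python
import math


def fourier_coeffs(x, k):
    """Cosine and sine coefficients at frequency 2*pi*k/n from Fourier sums."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    a = 2.0 / n * sum(xt * math.cos(w * t) for t, xt in enumerate(x))
    b = 2.0 / n * sum(xt * math.sin(w * t) for t, xt in enumerate(x))
    return a, b


def ols_coeffs(x, k):
    """Same coefficients from an OLS regression of x on cos and sin terms."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    c = [math.cos(w * t) for t in range(n)]
    s = [math.sin(w * t) for t in range(n)]
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    cx = sum(u * v for u, v in zip(c, x))
    sx = sum(u * v for u, v in zip(s, x))
    det = cc * ss - cs * cs
    return (ss * cx - cs * sx) / det, (cc * sx - cs * cx) / det


# Hypothetical series: one cycle pair at k = 2 plus a component at k = 5.
n = 16
x = [3.0 * math.cos(2.0 * math.pi * 2 * t / n)
     + 1.5 * math.sin(2.0 * math.pi * 2 * t / n)
     + 0.5 * math.cos(2.0 * math.pi * 5 * t / n)
     for t in range(n)]

a_f, b_f = fourier_coeffs(x, 2)
a_o, b_o = ols_coeffs(x, 2)
power = a_f ** 2 + b_f ** 2  # squared amplitude at this frequency
print(a_f, b_f, a_o, b_o, power)
```

The two routes agree because the cosine and sine regressors at a Fourier frequency are orthogonal, and the periodogram ordinate at that frequency is proportional to the squared amplitude.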

Wavelet analysis provides a means by which a series can be filtered in a manner that has certain advantages over the windowed Fourier transformation (WFT). Torrence and Compo (1998) stress that the WFT has the disadvantage of being inefficient since it "imposes a scale or 'response interval' T into the analysis," a problem not found with the wavelet transformation. They advocate using wavelets as a means by which a data series can be filtered from noise. Hastie-Tibshirani-Friedman (2001, 149) note the ability of wavelets to represent a series and provide a means by which a sparse representation can be found. "Wavelet bases are very popular in signal processing and compression, since they are able to represent both smooth and/or locally bumpy functions in an efficient way – a phenomenon dubbed time and frequency localization. In contrast, the traditional Fourier basis allows only frequency localization." In section 15.4 the Torrence-Compo wavelet implementation is discussed and examples are shown.

Keywords: Spectral analysis, Wavelets, Windowed Fourier transformation (WFT), Spectrum, Periodogram

The B34S matrix command is a full featured 4th generation programming language that allows users to customize calculations that are not possible with the built-in B34S procedures. This chapter provides an introduction to the matrix command language with special emphasis on both linear algebra applications and the general design of the language. This chapter should be thought of as an introduction and overview of the power of the programming capability with an emphasis on applications. At many junctures, the reader is pointed to other chapters for further discussion of the theory underlying the application. The role of this chapter is to provide a reference to matrix command basics. An additional and no less important goal is to discuss a number of tests of linpack and lapack eigenvalue, Cholesky and LU routines for speed and accuracy.

Section 16.1 provides a brief introduction to the matrix command language. All commands in the language are listed by type, but because of space limitations are not illustrated in any detail. Since the matrix command has a running example for all commands, the user is encouraged to experiment with commands of special interest by first reading the help file and next running one or more of the supplied examples. To illustrate the power of the system, a program to perform a fast Fourier transform with real*8 and complex*16 data is shown. A user subroutine filter illustrates how the language can be extended. The help file for the schur factorization, which is widely used in rational expectations models, is provided as an example to both show capability and illustrate what a representative command help file contains. Section 16.2 provides an overview of some of the nonlinear capability built into the language and motivates why knowledge of this language is important. The solutions to a variety of problems are illustrated but not discussed in any detail. Section 16.3 discusses many of the rules of the matrix language while section 16.4 illustrates matrix algebra applications. In sections 16.5 and 16.6 refinements to eigenvalue analysis and inversion speed issues are illustrated. Section 16.7 shows the gains obtained by real*16 and VPA math calculations.

Keywords: B34S matrix language, user-defined subroutines

In contrast to Chapter 14, which was concerned with nonlinear methods that implicitly assumed normally distributed errors, the procedures discussed in this chapter represent nonparametric nonlinear model alternatives. When confronted with a problem where nonlinearity cannot be reasonably assumed away, there are a number of possible ways to proceed.

Option 1: Model exact functional form. Direct Estimation of a nonlinear specification is clearly the best choice if the model is known for certain.

Option 2: GAM and ACE Models. The GAM model is an especially valuable nonlinear exploratory tool that investigates possible nonlinearity by fitting polynomials to the right hand side variables. Graphic analysis of the smoothed series gives feedback on whether there is low dimensional nonlinearity. ACE models smooth both the right and left hand side variables and make estimation of such models as possible. While neither method detects variable interactions, both allow manual incorporation of interaction variables in the model. Comparison of GAM leverage plots with OLS plots indicates the type of nonlinearity that is being estimated.

Option 3: MARSpline Modeling provides an automatic method to identify locally linear partitions of the data based on threshold values and potential interactions among the variables. As a special case of MARSpline modeling, lags of the dependent variable can be included in the modeling dataset to handle time series applications in the spirit of Threshold Autoregressive (TAR) models. Model nonlinearity can be displayed using leverage plots that map the knots and interactions found at specific regions in the n-dimensional nonlinear space. The MARS estimator is of the shrinkage class that provides a way to reduce the number of explanatory variables while allowing for the possibility of nonlinearity. An advantage of MARS over GAM and ACE models is that 2-3 way interactions can be detected and modeled. Graphical analysis of the knot vectors that are identified and used in the OLS estimation step involving transformed data can be inspected to identify specific thresholds present in the data.

Option 4: LOESS Modeling. This approach is especially useful if there is substantial local structure. It is not suitable for large datasets or where there are many right hand side variables.

Option 5: Recursive Covering Models. This approach, discussed in Friedman (1996a) and Hastie-Tibshirani-Friedman (2001, 415), unifies the advantages of the K nearest neighbor modeling approach that involves a large number of overlapping regions based on the training dataset and a CART approach that identifies a smaller number of highly customized disjoint regions.

Option 6: The Projection Pursuit model is a nonparametric multiple regression approach that only assumes continuous derivatives for the regression surface. In contrast to recursive partitioning methods that can be characterized as local averaging procedures, projection pursuit models a regression surface as a sum of general smooth functions of linear combinations of the predictor variables. Graphical analysis using leverage plots can be employed to interpret the estimated model.

Option 7: Exploratory Projection Pursuit Analysis can be used to explore possible nonlinearity in multivariate data by assigning a numerical index to every projection that is a function of the projected data density. The number of large values in the projection index indicates the complexity of the data.

Option 8: Random Forest Modeling. A random forest model uses bagging to improve the performance of a CART type model. The performance of a classification problem is improved by estimating many models and by voting to select the appropriate class. For a problem involving a continuous left-hand side variable, averaging is used to improve the out-of-sample performance. The basic idea of the random forest method is to randomly select a bagged dataset, estimate a model using a fixed number of randomly selected input variables and, using this model, make predictions for the out-of-bag data. This is repeated multiple times. The random forest technique is especially suitable for classification problems involving many possible outcomes. While probit and logit models can be used when there are a small number of classes, such as in the models discussed in Chapter 3, these methods are not suitable for research problems containing large numbers of classes. Random Forest models can be used successfully in such cases, as well as in cases typically addressed through classical probit and logit models. For continuous left hand side variable problems, random forest methods are suitable for high dimension nonlinear problems involving many right hand side variables. For near linear models or for models with few right hand side variables, random forest models will not perform as well.

Option 9: If there is no left-hand side variable, the options listed above are not applicable. Cluster analysis that includes both k-means and hierarchical models attempts to place variables in a predetermined number of classes and can be used for exploratory data analysis.
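The bagging and out-of-bag steps described under Option 8 can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the B34S implementation and not a full random forest: each "tree" is a single-split regression stump, one input variable is drawn at random per tree, and predictions for a continuous left-hand side variable are averaged over the trees for which an observation was out of bag. All names, the seed and the data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    pairs = sorted(zip(xs, ys))
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2.0, ml, mr)
    return best[1:]


def predict_stump(stump, x):
    threshold, ml, mr = stump
    if threshold is None:  # no valid split was found; predict the mean
        return ml
    return ml if x <= threshold else mr


def oob_predictions(X, y, n_trees=50, seed=0):
    """Average out-of-bag predictions over bagged one-variable stumps."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    totals, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(n_features)          # random input variable
        stump = fit_stump([X[i][feature] for i in bag], [y[i] for i in bag])
        in_bag = set(bag)
        for i in range(n):
            if i not in in_bag:                      # out-of-bag observation
                totals[i] += predict_stump(stump, X[i][feature])
                counts[i] += 1
    return {i: totals[i] / counts[i] for i in range(n) if counts[i] > 0}


# Hypothetical data: y steps from 0 to 10 on the first variable only.
X = [[i / 40.0, ((7 * i) % 40) / 40.0] for i in range(40)]
y = [10.0 if row[0] > 0.5 else 0.0 for row in X]

preds = oob_predictions(X, y)
oob_mse = sum((preds[i] - y[i]) ** 2 for i in preds) / len(preds)
print(len(preds), oob_mse)
```

The out-of-bag mean squared error gives an estimate of out-of-sample performance without holding back a separate test set, since each prediction uses only models that never saw that observation.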

The goal of the rest of this chapter is to outline the use of options 5-9.

Keywords: General additive models (GAM), Multivariate adaptive regression splines (MARSpline), Alternating conditional expectations (ACE), LOESS, Cluster analysis, Projection pursuit regression (PPREG), Exploratory projection pursuit analysis, Recursive covering, Random Forest, Bagging