The econometric strategy is based on a methodology developed in Carneiro et al. (2003) and Heckman et al. (2006). I extend their modelling framework by allowing the measure of LOC to depend on a set of observable determining variables, an extension based on work by Fahrmeir and Raach (2006). The model is estimated using a Bayesian Markov Chain Monte Carlo algorithm. Borghans et al. (2008) see latent factor theory as a crucial tool connecting psychology and economics, and they refer to the work of Carneiro et al. (2003) as a successful example of incorporating psychometric questions into an economic outcome model in a way that addresses the problem of endogeneity.

The econometric model is based on two elements: a personality model, built on a traditional factor model, and an employment model that includes latent factors as explanatory variables. The two models are estimated simultaneously, which takes the unobservable nature of the latent factor into account. Treating the latent factor as if it were observable is less efficient than estimating all parameters simultaneously while recognizing that the latent variable is an estimated rather than an observed quantity. On the other hand, if the latent factor is misspecified, the error carries over to the estimation of the remaining parameters. The model allows the measure of LOC to depend on observable variables. In particular, we are interested in whether immigrants and their children occupy different positions on the LOC scale.

### 4.1 The LOC model

The first element of the model is a classic factor model. Factor models were developed in psychology to measure intelligence (Spearman 1904). Latent factor models have also been used to measure other personality traits, in political science to measure concepts, and in financial economics to measure latent concepts that influence financial markets.

The main idea of factor models is to take a set of measures for a concept such as “intelligence”, “discipline”, “peace” or “beliefs about the stock market”, to divide the joint variation among these measures into a common part *θ* and a random part *ε*, and to estimate the common part *θ* and its effect on the measures, indicated by *α*. In this paper, *θ* denotes the LOC, which is measured using a set of questions related to the LOC (Rotter 1966). The model is a simultaneous equation model of the five psychometric questions above. Each psychometric question is modelled as an ordered probit model. All five questions are assumed to depend on a latent factor *θ*, the LOC, and an independent random error term *ε*^{M}. The psychometric questions all depend differently on the latent factor: each question has a different factor loading *α*^{M}, which can be interpreted as the coefficient of the latent factor in a regression of *M* on *θ*. The model takes the form:

$$\begin{array}{@{}rcl@{}} M_{1} & = & \{1,2,3\} \\ M_{1}^{\ast} & = & \alpha^{M_{1}} \theta +\varepsilon^{M_{1}} \\ M_{2} & = & \{1,2,3\} \\ M_{2}^{\ast} & = & \alpha^{M_{2}} \theta +\varepsilon^{M_{2}} \\ M_{3} & = & \{1,2,3\} \\ M_{3}^{\ast} & = & \alpha^{M_{3}} \theta +\varepsilon^{M_{3}} \\ M_{4} & = & \{1,2,3\} \\ M_{4}^{\ast} & = & \alpha^{M_{4}} \theta +\varepsilon^{M_{4}} \\ M_{5} & = & \{1,2,3\} \\ M_{5}^{\ast} & = & \alpha^{M_{5}} \theta +\varepsilon^{M_{5}} \end{array} $$
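As a rough illustration, the measurement system above can be simulated as follows. This is a minimal sketch, not the paper's estimation code: the factor loadings, cutpoints and sample size are hypothetical values chosen for illustration, and the cutpoints are assumed shared across the five items for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000                                       # hypothetical sample size
alpha = np.array([1.0, 0.8, 0.6, 1.2, 0.9])    # hypothetical factor loadings alpha^{M_j}
cutpoints = np.array([-0.5, 0.5])              # hypothetical cutpoints (assumed common)

theta = rng.normal(size=n)                     # latent LOC factor theta
eps = rng.normal(size=(n, 5))                  # independent measurement errors eps^{M_j}

# latent underlying variables M*_j = alpha^{M_j} * theta + eps^{M_j}
m_star = theta[:, None] * alpha + eps

# ordered responses in {1, 2, 3}: category depends on which cutpoint interval M* falls in
m = 1 + (m_star > cutpoints[0]).astype(int) + (m_star > cutpoints[1]).astype(int)
```

Each column of `m` mimics one of the five ordered-probit measurement equations; items with larger loadings co-move more strongly with the latent factor.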

Appendix 2 explains how this model can be identified on the basis of the identification of latent factor models.

### 4.2 The employment model

The second element of the model is an employment model. The latent factor *θ*, estimated through the model above, is treated as an additional explanatory variable in the employment equation. The model takes the form:

$$\begin{array}{@{}rcl@{}} D & = & \{0,1\} \\ D^{\ast} & = & {\beta_{0}^{D}} +\alpha^{D} \theta +\beta^{D} X +\varepsilon^{D} \end{array} $$

### 4.3 The simultaneous equation model

Both models described above are estimated jointly as a simultaneous equation model. The model is a linear parametric simultaneous equation model with an embedded factor model structure, as described above. The simultaneous equation model contains the equations for the economic outcome *D* and for the measures *M*. In this paper, the latent concept LOC is endogenized, so I add another equation to the simultaneous equation model to determine *θ*.

The model then takes the following form:

$$\begin{array}{@{}rcl@{}} D & = & \{0,1\} \\ D^{\ast} & = & {\beta_{0}^{D}} +\alpha^{D} \theta +\beta^{D} X +\varepsilon^{D} \\ M & = & \{1,2,3\} \\ M^{\ast} & = & \alpha^{M} \theta +\varepsilon^{M} \\ \theta & = & \gamma W +\varepsilon^{\theta } \end{array} $$
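The full data-generating process implied by these three equations can be sketched in a few lines. Again, all coefficient values, covariate dimensions and distributional choices here are hypothetical placeholders, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500                                        # hypothetical sample size

W = rng.normal(size=(n, 2))                    # covariates of the latent factor equation
X = rng.normal(size=(n, 3))                    # covariates of the employment equation
gamma = np.array([0.4, -0.3])                  # hypothetical indirect-effect coefficients
beta0, beta = -0.2, np.array([0.5, 0.1, -0.4]) # hypothetical direct-effect coefficients
alpha_D = 0.7                                  # hypothetical loading of theta in D*

theta = W @ gamma + rng.normal(size=n)         # theta = gamma * W + eps^theta
d_star = beta0 + alpha_D * theta + X @ beta + rng.normal(size=n)
D = (d_star > 0).astype(int)                   # probit employment indicator in {0, 1}
```

The measurement equations for *M* would then be generated from the same draw of `theta`, which is what ties the employment and personality blocks together.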

where *D* is an employment indicator and *M* signifies the psychometric measures for LOC. Since *M* and *D* are categorical variables, we impose a probit structure on them, so *D*^{∗} and *M*^{∗} denote the latent underlying variables for the probit models for *D* and *M*. *X* comprises the control variables in the employment equation (called *direct effects*): age, gender, immigrant status, education level, language spoken at home, marital status and whether there are any children under 16 at home. These variables were chosen on the basis of a Mincer equation with typical control variables for immigrants. *W* comprises the control variables for the latent factor equation (called *indirect effects*): age, gender, whether religion is important, immigrant status, education level, language spoken at home and the time spent in Germany. The importance of religion was added because it is commonly identified as correlated with the LOC (Kahoe 1974). Note that for identification of the model, *X* and *W* could be identical.

Appendix 3 lists the assumptions needed to identify this model.

#### 4.3.1 Estimation: the Gibbs sampler

The model is estimated by a Bayesian Markov Chain Monte Carlo routine. The likelihood function of the model under the assumption of independently and identically distributed observations is given by

$$\begin{array}{@{}rcl@{}} & & \prod\limits_{i =1}^{N}f (M_{i},D_{i},M_{i}^{\ast},D_{i}^{\ast },\theta_{i}\vert X_{i},W_{i},\alpha,\beta,\gamma,c) \\ & = & \prod\limits_{i =1}^{N}f (M_{i}^{\ast},D_{i}^{\ast},\theta_{i}\vert X_{i},W_{i},\alpha,\beta,\gamma,c) \prod\limits_{i =1}^{N}f (M_{i},D_{i}\vert \theta_{i},M_{i}^{\ast},D_{i}^{\ast},X_{i},W_{i},\alpha,\beta,\gamma,c) \\ & = & \prod\limits_{i =1}^{N}f (M_{i}^{\ast},D_{i}^{\ast},\theta_{i}\vert X_{i},W_{i},\alpha,\beta,\gamma,c) \prod\limits_{i =1}^{N}f (M_{i},D_{i}\vert M_{i}^{\ast},D_{i}^{\ast},c) \end{array} $$

where the factor loadings are written as *α*=(*α*^{M},*α*^{D}) and the coefficients as *β*=*β*^{D}. The first simplification follows from the product rule. The second step follows from the fact that the ordinal responses are determined solely by the underlying variables \(D_{i}^{\ast }\) and \(M_{i}^{\ast }\) and by the cutpoints *c*. We can factor the likelihood function \(f (M_{i}^{\ast },D_{i}^{\ast },\theta _{i}\vert X_{i},W_{i},\alpha,\beta,\gamma,c)\) into \(f (M_{i}^{ \ast },\theta _{i}\vert.)f (D_{i}^{\ast },\theta _{i}\vert.)\) due to the conditional independence assumptions above. The likelihood functions of \( D_{i}^{\ast }\) and \(M_{i}^{\ast }\) written separately are

$$\begin{array}{@{}rcl@{}} \prod\limits_{i =1}^{N}\left[f(M_{i}^{\ast},\theta_{i}\vert \alpha,\gamma,c,M_{i},W_{i}) \left\{\sum_{k_{M} =1}^{K_{M}}1 (M_{i} =k_{M})1 (c_{k_{M} -1} <M_{i}^{\ast} <c_{k_{M}})\right\}\right] \\ \prod\limits_{i =1}^{N}\left[f(D_{i}^{\ast},\theta_{i}\vert \alpha,\beta,\gamma,D_{i},X_{i},W_{i}) \left\{\sum_{k_{D} =1}^{K_{D}}1 (D_{i} =k_{D})1 (c_{k_{D} -1} <D_{i}^{\ast} <c_{k_{D}})\right\}\right] \end{array} $$

Each of the factors \(f (M_{i}^{\ast },\theta _{i}\vert.)\) and \(f (D_{i}^{\ast },\theta _{i}\vert.)\) needs to be multiplied by two indicators: an indicator which equals one if the observation \(M_{i}\) (\(D_{i}\)) falls in category \(k_{M}\) (\(k_{D}\)), and an indicator requiring that \(M_{i}^{\ast }\) (\(D_{i}^{\ast }\)) fall between the two cutpoints \(c_{k_{M} -1}\) (\(c_{k_{D} -1}\)) and \(c_{k_{M}}\) (\(c_{k_{D}}\)) corresponding to its category.
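The double-indicator construction can be made concrete with a small helper function. This is an illustrative sketch: the function name and the example cutpoints are hypothetical, and the cutpoint vector is augmented with infinite endpoints so the lowest and highest categories are handled uniformly.

```python
import numpy as np

def category_indicator(m_obs, m_star, cutpoints):
    """Return 1 if the latent value m_star lies in the cutpoint interval
    implied by the observed category m_obs (categories are 1..K)."""
    # augment cutpoints with -inf and +inf so category k maps to (c[k-1], c[k])
    c = np.concatenate(([-np.inf], cutpoints, [np.inf]))
    return int(c[m_obs - 1] < m_star < c[m_obs])
```

With cutpoints (−0.5, 0.5), a latent value of 0.0 is consistent with category 2 but not with categories 1 or 3, which is exactly the truncation the indicators impose on the likelihood contributions.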

*θ* is unobservable and will be estimated. To make transparent the mechanism by which \(\theta_{i}\) influences \(M_{i}^{\ast }\) and \(D_{i}^{\ast }\), we integrate out \(\theta_{i}\) and obtain the distributions of \( M_{i}^{\ast }\) and \(D_{i}^{\ast }\) conditional on the parameters of the model and on the data.

$$\begin{array}{@{}rcl@{}} f (M_{i}^{\ast }\vert \alpha,c,\gamma,M_{i},W_{i}) & = & { \int \limits_{\theta }}f (M_{i}^{\ast}\vert \alpha,c,\theta_{i},M_{i}) f (\theta_{i}\vert \gamma,W_{i}) d\theta_{i} \\ f (D_{i}^{\ast }\vert \alpha,\beta,\gamma,D_{i},X_{i},W_{i}) & = & { \int \limits_{\theta }}f (D_{i}^{\ast }\vert \alpha,\beta,c,\theta_{i},D_{i},X_{i}) f (\theta_{i}\vert \gamma,W_{i}) d\theta_{i} \end{array} $$
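An integral of this form can be approximated by simple Monte Carlo: draw *θ* from \(f(\theta_{i}\vert \gamma,W_{i})\) and average the conditional density over the draws. The sketch below does this for the employment equation under illustrative assumptions (standard normal errors, hypothetical parameter values); the function name is invented for this example.

```python
import numpy as np

def norm_pdf(x, mean=0.0, sd=1.0):
    """Standard normal density, written out to keep the sketch self-contained."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def marginal_density_d_star(d_star, x, w, alpha_d, beta0, beta, gamma,
                            draws=10_000, rng=None):
    """Approximate f(D* | .) by integrating theta out via Monte Carlo."""
    rng = rng or np.random.default_rng(0)
    theta = w @ gamma + rng.normal(size=draws)     # draws from f(theta | gamma, W)
    mean = beta0 + alpha_d * theta + x @ beta      # conditional mean of D* given theta
    return norm_pdf(d_star, mean).mean()           # average of conditional densities

# hypothetical parameter values, for illustration only
val = marginal_density_d_star(0.0, np.zeros(3), np.zeros(2),
                              alpha_d=0.7, beta0=-0.2,
                              beta=np.array([0.5, 0.1, -0.4]),
                              gamma=np.array([0.4, -0.3]))
```

In the actual model this integration has to be done jointly over all latent quantities, which is why the likelihood is high-dimensional and the Gibbs sampler is used instead of direct numerical integration.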

The likelihood function of the model is thus a high-dimensional integral, which cannot be solved analytically and must be evaluated numerically. Markov Chain Monte Carlo methods provide a way to estimate the parameters of interest by sampling rather than by solving the integral directly. The main advantage of the Gibbs sampler is its relative computational ease.

The Gibbs sampler is a Bayesian method. The Bayesian paradigm specifies statistical models through a joint posterior distribution, composed of two elements: a prior distribution and a likelihood function. The prior distribution contains the researcher's beliefs about the parameters before taking into account the information in the data. The prior is combined with the likelihood function, which contains the information in the data. The joint posterior distribution is obtained by multiplying the priors with the likelihood, and it can be written as

$$\begin{array}{@{}rcl@{}} & & f (\beta,\alpha,\gamma,\theta_{i},M^{\ast},D^{\ast},c\vert M,D,X,W) \\ & \propto & f (\beta) f (\alpha) f (\gamma) f (c) \prod \limits_{i =1}^{N}f (M_{i},D_{i},M_{i}^{\ast},D_{i}^{\ast},\theta_{i}\vert X_{i},W_{i},\alpha,\beta,\gamma,c) \end{array} $$

where *f*(*β*)*f*(*α*)*f*(*γ*)*f*(*c*) are the priors for the coefficients of *X*, the factor loadings, the coefficients of *W* and the cutpoints.

The Gibbs sampler is an algorithm which samples from this joint posterior distribution sequentially. The idea of the Gibbs sampler is to sample one of the elements among \(M_{i}^{\ast },D_{i}^{\ast },\beta,\alpha,\gamma,c\) and *θ* at a time, conditioning on the last sampled values of the remaining elements and on the data. This procedure is equivalent to sampling sequentially from a set of conditional distributions. Each conditional distribution is a conditional posterior distribution of a parameter given the last sampled values of the other parameters and the data. These conditionals, each of which constitutes one step of the Gibbs sampling algorithm, are called “full conditionals”. The closed form of the full conditionals follows from the properties of the model. After a sufficient number of iterations, the algorithm converges under a set of regularity conditions, and the sampled values are draws from the true posterior.^{4} The algorithm for the model in this paper ran for 100,000 iterations, and convergence statistics do not indicate a failure to converge. In the following, I derive the full conditionals of the model.

First, a value is sampled from the posterior conditional distribution (or full conditional) of the latent underlying variables, then from the posterior conditional distribution of the factor loadings and so forth. For the second iteration, the same procedure is repeated, conditioning on the sampled values from the first iteration. The very first iteration starts with a set of specified initial values. The algorithm is not sensitive to the choice of the starting values.
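The conditioning pattern just described can be illustrated on a deliberately simple toy problem: a Gibbs sampler for a bivariate normal with correlation *ρ*, where each full conditional is a univariate normal. This is not the paper's sampler, only a minimal runnable sketch of the same iterate-and-condition logic, including burn-in and arbitrary starting values.

```python
import numpy as np

def gibbs_bivariate_normal(rho=0.8, iters=20_000, burn=2_000, seed=0):
    """Toy Gibbs sampler: each step draws one block from its full conditional,
    conditioning on the most recently sampled value of the other block."""
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0                       # arbitrary starting values
    sd = np.sqrt(1.0 - rho ** 2)          # conditional standard deviation
    samples = np.empty((iters, 2))
    for t in range(iters):
        x = rng.normal(rho * y, sd)       # full conditional of x given current y
        y = rng.normal(rho * x, sd)       # full conditional of y given updated x
        samples[t] = (x, y)
    return samples[burn:]                 # discard burn-in draws

draws = gibbs_bivariate_normal()
```

After burn-in, the retained draws are (correlated) samples from the joint target, so their sample correlation recovers *ρ*; the paper's sampler cycles through its full conditionals in exactly this fashion, just with many more blocks.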

Appendix 4 describes the conditional posterior distributions underlying the respective elements of the model; namely of the latent underlying variables, the factor loadings, the direct coefficients, the cutpoints, the latent factors and the indirect coefficients.