2.1 The dynamic assimilation model
In order to investigate the effect of differential employment status persistence on the relative employment outcomes, we specify a dynamic random-effects probit model for both immigrants and native Swedes. The assimilation model explicitly controls for the past employment status, several observed and unobserved individual characteristics, and endogenous initial values. The dynamic employment-generating process of immigrants (I) is specified as follows:
$$ {d}_{it}^I=1\left(x{\hbox{'}}_{it}^I{\beta}^I+{\lambda}^I{d}_{i,t-1}^I+{\varphi}^I ag{e}_{it}^I+\phi ys{m}_{it}+{\displaystyle {\sum}_j{\psi}_j{C}_j+}{\displaystyle {\sum}_k{\theta}_k^I{\Pi}_k^I+{u}_{it}^I>0}\right) $$
(1)
$$ {u}_{it}^I={\eta}_i^I+{\varepsilon}_{it}^I, $$
(2)
$$ {d}_{i1}^I=1\left(z{\hbox{'}}_{i1}^I{\beta}_1^I+{u}_{i1}^I>0\right), $$
(3)
where \( {d}_{it} \) is a binary dependent variable indicating whether an immigrant is employed in the current period t; i denotes the individual, i = 1, …, n (the total number of individuals), and t denotes the period in the panel dataset, t = 1, …, \( {T}_i \) (unbalanced panel); \( {x}_{it} \) is a vector of socio-demographic and economic characteristics (such as educational attainment level, marital status, and non-labor income); β is the corresponding vector of parameters to be estimated; and \( {d}_{i,t-1} \) is the (observed) binary lagged dependent variable indicating whether immigrant i was employed in the previous period (t − 1). We interpret the parameter λ as structural state dependence, following Heckman (1981); see footnote 2.
The variables age and ysm (years-since-migration) are the two key variables of our assimilation model; their second-order terms (\( ag{e}^2 \) and \( ys{m}^2 \)) enter the estimated specification but are omitted here to simplify the presentation. Immigrants arrive in different cohorts, \( {C}_j \), and these yearly indicator variables aim to capture unobserved characteristics specific to the arrival year, i.e., cohort fixed effects. Transitory macroeconomic fluctuations in the Swedish economy (such as upward or downward trends in unemployment rates during the observation period) may affect the employment prospects of immigrants and natives differently. To control for these fluctuations, the period effects, \( {\Pi}_k^I \), are included for the k observation periods.
In order to calculate the relative employment outcomes of immigrants, we need to estimate the dynamic employment probability-generating function of native Swedes (N):
$$ {d}_{it}^N=1\left(x{\hbox{'}}_{it}^N{\beta}^N+{\lambda}^N{d}_{i,t-1}^N+{\varphi}^N ag{e}_{it}^N+{\displaystyle \sum_k}{\theta}_k^N{\Pi}_k^N+{u}_{it}^N>0\right), $$
(4)
$$ {u}_{it}^N={\eta}_i^N+{\varepsilon}_{it}^N, $$
(5)
$$ {d}_{i1}^N=1\left(z{\hbox{'}}_{i1}^N{\beta}_1^N+{u}_{i1}^N>0\right), $$
(6)
where the variables years-since-migration and arrival cohorts, which are not relevant for the data-generating process of natives, are excluded. The definition of other terms is the same as in the case of immigrants.
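To make the structure of (1)–(3) concrete, the following minimal sketch simulates a stripped-down version of the process with a single covariate. It is an illustration under stated assumptions, not the estimation code used in this paper: the function names, the single standard normal covariate, and all parameter values are hypothetical, and the initial period is simply generated without a lagged term. The sketch shows why the lagged status and the individual effect must both be controlled for: raw persistence in employment appears both when λ > 0 (structural state dependence) and when λ = 0 but the individual effect has positive variance (spurious state dependence).

```python
import random

def simulate_employment(n, T, beta, lam, sigma_eta, seed=0):
    """Simulate d_it = 1(beta*x_it + lam*d_{i,t-1} + eta_i + eps_it > 0).

    All inputs are hypothetical illustration values, not estimates.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        eta = rng.gauss(0.0, sigma_eta)       # time-invariant individual effect
        history = []
        for t in range(T):
            x = rng.gauss(0.0, 1.0)           # one stand-in covariate
            eps = rng.gauss(0.0, 1.0)         # unit-variance probit error
            index = beta * x + eta + eps
            if t > 0:                         # the lag enters from period 2 on
                index += lam * history[-1]
            history.append(1 if index > 0 else 0)
        panel.append(history)
    return panel

def raw_persistence(panel):
    """P(d_t = 1 | d_{t-1} = 1) - P(d_t = 1 | d_{t-1} = 0), pooled transitions."""
    counts = {0: [0, 0], 1: [0, 0]}           # lag -> [transitions, employed now]
    for history in panel:
        for prev, cur in zip(history, history[1:]):
            counts[prev][0] += 1
            counts[prev][1] += cur
    return counts[1][1] / counts[1][0] - counts[0][1] / counts[0][0]

# No true state dependence, but heterogeneous individuals:
spurious = raw_persistence(simulate_employment(2000, 10, 1.0, 0.0, 1.5))
# True state dependence, homogeneous individuals:
structural = raw_persistence(simulate_employment(2000, 10, 1.0, 1.0, 0.0))
print(round(spurious, 2), round(structural, 2))
```

Both printed numbers are clearly positive, so a raw comparison of transition rates cannot by itself separate the two sources of persistence.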
2.1.1 Identification
The model given in (1) is not identified. The period effect is a linear combination of the arrival cohort and years-since-migration, since the calendar year at any cross section is the sum of the years-since-migration and the year in which the individual immigrated (\( {\Pi}_k={C}_j+ysm \)). An additional restriction must therefore be imposed: either that the period effect is the same for immigrants and native Swedes, or that the cohort effect is the same across arrival cohorts, so that it can be dropped from the model. Several strategies are used in the literature to identify this model. The assumption that the period effects of immigrants and natives are the same (i.e., \( {\Pi}_k^I={\Pi}_k^N;\forall k \)) is credible if there are no changes in macroeconomic conditions or, if there is a change, the responsiveness of immigrants and natives to it is the same. The drawback of this assumption is that changing macroeconomic conditions may influence the price paid for the skills of immigrants and natives differently. Thus, if immigrants and native Swedes are in fact not equally sensitive to changing macroeconomic conditions, this restriction can lead to a severe bias in the estimates of the arrival cohort and years-since-migration effects (Barth et al. 2004). In fact, Sweden (and the other Nordic countries) experienced a sharp economic downturn during our sample period, 1990–2000. Thus, a model that assumes equal period effects would be biased and, most importantly, this bias might affect the state dependence parameters of natives and immigrants differently. To deal with this issue, our first strategy is to use local unemployment rates (at the municipality level for each observation period), following the wage curve approach suggested in Barth et al. (2004). Second, we group the arrival cohorts into 5-year intervals (the details are given below), which allows us to control for age, years-since-migration, and year fixed effects simultaneously in the same model specification.
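The identification problem can be illustrated with a small numerical check. The sketch below is a simplification: it uses linear terms for cohort, years-since-migration, and period rather than the indicator variables of the model, and the records are made-up arrival and observation years. It shows that adding the calendar period to a design that already contains the arrival cohort and years-since-migration does not raise the rank of the design matrix, so the three effects cannot be separated without a restriction.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by exact Gaussian elimination (Fractions avoid rounding issues)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue                          # no pivot: column is dependent
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical immigrant records: (arrival year, observation year).
records = [(1975, 1990), (1975, 1995), (1980, 1990),
           (1980, 2000), (1985, 1992), (1988, 1999)]

# Columns: constant, arrival cohort, years since migration, calendar period.
design = [[1, cohort, period - cohort, period] for cohort, period in records]
without_period = [row[:3] for row in design]

rank_full = matrix_rank(design)               # four columns
rank_restricted = matrix_rank(without_period) # three columns
print(rank_full, rank_restricted)
```

The two ranks coincide even though the first matrix has one more column, which is the numerical counterpart of the exact linear dependence between period, cohort, and years-since-migration.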
2.1.2 Stochastic specifications
The error terms in (1) and (4) are composed as in (2) and (5). The first part, \( {\eta}_i \), is the time-invariant unobserved individual effect; controlling for it is crucial for identifying structural state dependence. The second part, \( {\varepsilon}_{it} \), is the usual error term, which is assumed to be normally distributed with zero mean and unit variance, the normalization needed to identify the probit model. The specification is a random-effects model that assumes that observed and unobserved characteristics are orthogonal. Yet, this assumption is very strong and can easily be violated (for instance, immigrants’ unobserved taste for work can be correlated with experience and education). We therefore use the quasi-fixed effects model, also known as the correlated random-effects model, of Chamberlain (1984) or Mundlak’s (1978) formulation. The actual disturbance process \( {\varepsilon}_{it} \) is assumed to be serially uncorrelated. However, in this model, allowing for an unobserved individual effect induces serial correlation in the composite error \( {u}_{it} \). The correlation between the composite errors of any two distinct periods is \( \mathrm{Corr}\left({u}_{it},{u}_{is}\right)=\frac{\sigma_{\eta}^2}{\sigma_{\eta}^2+1};\left(t,s=1,\dots, {T}_i;t\ne s\right), \) where \( {\sigma}_{\eta}^2 \) is the variance of the unobserved individual effects.
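The correlation expression can be verified in two short steps. The derivation assumes, as in the text, that \( {\eta}_i \) is uncorrelated with \( {\varepsilon}_{it} \), that \( {\varepsilon}_{it} \) has unit variance, and that \( {\varepsilon}_{it} \) is serially uncorrelated. For the composite error \( {u}_{it}={\eta}_i+{\varepsilon}_{it} \) of (2) and (5) and any \( t\ne s \):

$$ \mathrm{Var}\left({u}_{it}\right)=\mathrm{Var}\left({\eta}_i\right)+\mathrm{Var}\left({\varepsilon}_{it}\right)={\sigma}_{\eta}^2+1,\quad \mathrm{Cov}\left({u}_{it},{u}_{is}\right)=\mathrm{Var}\left({\eta}_i\right)+\mathrm{Cov}\left({\varepsilon}_{it},{\varepsilon}_{is}\right)={\sigma}_{\eta}^2, $$

$$ \mathrm{Corr}\left({u}_{it},{u}_{is}\right)=\frac{\mathrm{Cov}\left({u}_{it},{u}_{is}\right)}{\sqrt{\mathrm{Var}\left({u}_{it}\right)\mathrm{Var}\left({u}_{is}\right)}}=\frac{\sigma_{\eta}^2}{\sigma_{\eta}^2+1}. $$

As a purely numerical illustration, \( {\sigma}_{\eta}^2=1 \) implies a correlation of 0.5, and the correlation approaches 1 as the variance of the individual effect grows relative to the unit variance of the transitory error.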
The dynamic models specified in (1–3) and (4–6) are estimated using the random-effects specification (see footnote 3). Such an approach requires correct specification of the distribution of the initial values and of the unobserved individual effects. The log-likelihood function of our dynamic model is specified as follows:
$$ \mathrm{Log}(L)={\displaystyle \sum_{i=1}^n} \ln \left[{\displaystyle {\int}_{-\infty}^{\infty}\left\{{f}_{i1}\left({d}_{i1}\Big|{\left\{{X}_{it}\right\}}_{t=1}^{T_i},{\eta}_i\right){\displaystyle {\prod}_{t=2}^{T_i}{f}_{it}\left({d}_{it}\Big|{d}_{i,t-1},{X}_{it},{\eta}_i;\beta \right)}\right\}f\left({\eta}_i\right)d{\eta}_i}\right], $$
(7)
$$ {f}_{it}\left({d}_{it}\Big|{d}_{i,t-1},{X}_{it},{\eta}_i;\beta \right)=\Phi \left[\left(2{d}_{it}-1\right)\left(X{\mathit{\hbox{'}}}_{it}\beta +\lambda {d}_{i,t-1}+{\sigma}_{\eta }{\eta}_i\right)\right], $$
(8)
where \( {X}_{it} \) is a vector including all observed variables (except the lagged dependent variable); β is a vector of the corresponding parameters; and Φ is the distribution function of the standard normal random variable. The likelihood function in (7) can easily be maximized when the conditional distribution of the initial values \( {f}_{i1}\left({d}_{i1}\Big|{\left\{{X}_{it}\right\}}_{t=1}^{T_i},{\eta}_i\right) \) is known. The initial values play an important role in identifying the magnitude of the structural state dependence and disentangling it from spurious factors leading to persistence (Heckman 1981; Wooldridge 2005). The initial values problem arises when some individuals enter the employment status-generating process before the observation period. Many immigrants (and of course native Swedes) entered the Swedish labor market much earlier than the study period (1990–2000). That is, a substantial portion of individuals had an employment history before entering the panel (1990 in our case). Thus, the first-wave employment status cannot be taken as exogenous. Assuming exogeneity would be very strong and can lead to biased and inconsistent estimators of the structural model parameters (Heckman 1981). The sample initial employment status must instead be considered endogenous, with a probability distribution conditioned on observed and unobserved individual characteristics.
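The structure of (7) and (8) can be sketched for a single individual as follows. This is a minimal illustration, not the estimator used in this paper: it assumes that \( {\eta}_i \) in (8) is standard normal (so that \( {\sigma}_{\eta } \) scales it), replaces the integral with a simple midpoint rule on a truncated grid, and takes the initial-period probability as a user-supplied placeholder function `init_prob`. All names and numerical values are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def individual_likelihood(d, xb, lam, sigma_eta, init_prob,
                          n_nodes=400, bound=8.0):
    """Likelihood contribution of one individual, integrating out eta.

    d         : list of 0/1 outcomes for t = 1..T_i
    xb        : list of index values X'_it beta (xb[0] is not used)
    init_prob : callable eta -> P(d_i1 = 1 | eta), a placeholder for f_i1
    """
    step = 2.0 * bound / n_nodes
    total = 0.0
    for k in range(n_nodes):
        eta = -bound + (k + 0.5) * step               # midpoint node
        p1 = init_prob(eta)
        value = p1 if d[0] == 1 else 1.0 - p1         # f_i1(d_i1 | eta)
        for t in range(1, len(d)):
            index = xb[t] + lam * d[t - 1] + sigma_eta * eta
            value *= norm_cdf((2 * d[t] - 1) * index) # Eq. (8)
        total += value * norm_pdf(eta) * step         # weight by f(eta)
    return total

# Hypothetical inputs for one person observed over three periods.
sigma = 1.2
likelihood = individual_likelihood(
    d=[1, 1, 0], xb=[0.0, 0.3, -0.2], lam=0.8, sigma_eta=sigma,
    init_prob=lambda eta: norm_cdf(0.1 + sigma * eta))
print(likelihood)
```

The sample log-likelihood in (7) is the sum of the logarithms of such contributions over individuals. A useful sanity check is that the contributions of all possible outcome sequences of a given length sum to one.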
There are two main methods to solve this problem: Heckman’s (1981) reduced form approximation and Wooldridge’s (2005) method. Heckman’s method is based on available pre-sample information with which the conditional distribution of the initial values can be approximated via a reduced form. This approximation allows a flexible specification of the relationships between the initial values and observed and unobserved individual characteristics. Wooldridge (2005) introduces a simple and novel alternative to Heckman’s reduced form approximation. He suggests modeling the unobserved individual effects conditional on the initial values and the time-varying exogenous variables, in a way similar to the correlated random-effects model of Chamberlain (1984), through an auxiliary distribution for the unobserved individual effects. In the present paper, we use the Wooldridge method as we have a long panel dataset (Arulampalam and Stewart 2009; Akay 2012); see footnote 4. In this approach, to solve the initial values problem, the auxiliary distribution of the unobserved individual effects for immigrants and natives is specified as follows:
$$ {\eta}_i^I={\pi}_0^I+{\pi}_1^I{d}_{i1}^I+{\pi}_2^I{\overline{z}}_i^I+{\alpha}_i^I, $$
(9)
$$ {\eta}_i^N={\pi}_0^N+{\pi}_1^N{d}_{i1}^N+{\pi}_2^N{\overline{z}}_i^N+{\alpha}_i^N, $$
(10)
where \( {d}_{i1}^I \) and \( {d}_{i1}^N \) are the first-period employment statuses of immigrants and natives, and \( {\overline{z}}_i^I \) and \( {\overline{z}}_i^N \) are the within means of the time-variant variables that are considered to be correlated with the unobserved individual effect \( {\eta}_i \). The time-variant variables used in (9) and (10) are age, years-since-migration (for immigrants), education, non-labor income, number of children at home, local unemployment rates, and national unemployment rates in the arrival year. \( {\alpha}_i \) is the new unobserved individual effect, which is now assumed to be normally distributed with zero mean and finite variance and to be uncorrelated with the observed characteristics.
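In practice, (9) and (10) amount to augmenting each person-period record with the first-period employment status and the individual means of the time-variant variables before estimating a standard random-effects probit. The sketch below illustrates only this data preparation step. The function name, the dictionary layout, and the example values are hypothetical, and the means are taken over all observed periods, which is an assumption of the sketch rather than something stated in the text.

```python
def augment_panel(panel, time_varying):
    """Build estimation rows for t >= 2 with the terms of Eqs. (9)-(10).

    panel        : dict id -> list of per-period dicts, ordered by time,
                   each with key 'd' (0/1) and the time-varying variables
    time_varying : names of the variables whose within means are added
    """
    rows = []
    for pid, periods in panel.items():
        d_initial = periods[0]["d"]                   # d_i1
        means = {v: sum(p[v] for p in periods) / len(periods)
                 for v in time_varying}               # within means (z-bar_i)
        for t in range(1, len(periods)):
            row = {"id": pid, "t": t + 1,
                   "d": periods[t]["d"],
                   "d_lag": periods[t - 1]["d"],      # d_{i,t-1}
                   "d_initial": d_initial}
            for v in time_varying:
                row[v] = periods[t][v]
                row["mean_" + v] = means[v]
            rows.append(row)
    return rows

# Two made-up individuals in an unbalanced panel.
panel = {
    "a": [{"d": 0, "age": 30, "ysm": 2},
          {"d": 1, "age": 31, "ysm": 3},
          {"d": 1, "age": 32, "ysm": 4}],
    "b": [{"d": 1, "age": 45, "ysm": 10},
          {"d": 1, "age": 46, "ysm": 11}],
}
rows = augment_panel(panel, ["age", "ysm"])
for row in rows:
    print(row)
```

The first period of each individual contributes no row of its own; it enters only through `d_initial` and the within means, which is how the conditioning on the initial values works in this approach.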
2.1.3 The estimators of employment assimilation
The approach adopted here to measure employment assimilation is based on the idea that assimilation has occurred when immigrants’ employment probabilities catch up with those of natives (following Borjas 1985, 1999). To calculate the relative employment probabilities, we simulate the conditional expected values of the dynamic and static models over the life cycle of natives and immigrants. The simplified conditional expectations of the dynamic system generating the employment probabilities are written as follows:
$$ {E}^I\left[{d}_{it}=1\Big|{X}_{it},{d}_{i,t-1}, ag{e}_{it},ys{m}_{it},{\alpha}_i\right]=\Phi \left[X{\mathit{\hbox{'}}}_{it}{\widehat{\beta}}^I+{\widehat{\lambda}}^I{d}_{i,t-1}+{\widehat{\varphi}}^I ag{e}_{it}\left({\tau}_0+\tau \right)+{\widehat{\phi}}^Iys{m}_{it}\left({\tau}_0\right)\right], $$
(11)
$$ {E}^N\left[{d}_{it}=1\Big|{X}_{it},{d}_{i,t-1}, ag{e}_{it},{\alpha}_i\right]=\Phi \left[X{\mathit{\hbox{'}}}_{it}{\widehat{\beta}}^N+{\widehat{\lambda}}^N{d}_{i,t-1}+{\widehat{\varphi}}^N ag{e}_{it}\left({\tau}_0+\tau \right)\right]. $$
(12)
Our strategy to calculate the employment probability differentials is as follows. First, we assume that the labor market entry age is \( {\tau}_0=20 \) for each immigrant and native. We predict the employment probability for each immigrant and native at age = 20 and ysm = 0 to calculate the initial employment probability differential upon arrival. We then increase τ up to 45 (implying that the age of the individuals is increased up to 65) and predict the employment probabilities for each year since migration. One important issue is how to treat past employment status \( {d}_{i,t-1} \) and the unobserved individual effects \( {\alpha}_i \) in the simulations. There are several possible strategies for simulating the life cycle employment probabilities in the case of the dynamic model. Yet, since one of our aims is to compare the results from the dynamic and the static model, the simulation strategy should allow us to directly compare the outcomes of these models. We also note that our estimation strategy assumes that the lagged employment status is “observed”; see Eqs. (1–3) and (4–6). Thus, the straightforward approach is simply to treat past employment status as one of the other (observed) control variables, conditional on the initial conditions specifications defined in (9) and (10). This implies that the difference between the static and dynamic models in our simulations is only due to past employment status and the first-year employment status. Our strategy to deal with the unobserved individual effect is to simply evaluate it at the mean for everyone, \( {\alpha}_i=0 \), both for the dynamic and the static model. The standard errors of the predictions are also calculated for each age and years-since-migration combination to construct the 95% confidence intervals.
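The simulation steps described above can be sketched as follows. This is an illustration only: every coefficient is a made-up number rather than an estimate from this paper, the remaining covariates are collapsed into a single hypothetical `base_index`, the lagged employment status is held fixed at one value, the quadratic terms in age and ysm are included, and ysm is set equal to τ (years elapsed since arrival at age 20), which is one reading of the procedure. The unobserved effect is evaluated at \( {\alpha}_i=0 \), and standard errors are not computed.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def employment_profile(coef, base_index, d_lag, immigrant,
                       entry_age=20, horizon=45):
    """Predicted employment probabilities for tau = 0..horizon, alpha_i = 0."""
    probs = []
    for tau in range(horizon + 1):
        age = entry_age + tau
        index = (base_index + coef["lam"] * d_lag
                 + coef["age"] * age + coef["age2"] * age ** 2)
        if immigrant:                          # ysm terms only for immigrants
            index += coef["ysm"] * tau + coef["ysm2"] * tau ** 2
        probs.append(norm_cdf(index))
    return probs

# Hypothetical coefficients, chosen only to produce a catch-up pattern.
native_coef = {"lam": 1.0, "age": 0.08, "age2": -0.001}
immigrant_coef = {"lam": 1.2, "age": 0.08, "age2": -0.001,
                  "ysm": 0.07, "ysm2": -0.001}

native = employment_profile(native_coef, -1.0, d_lag=1, immigrant=False)
immig = employment_profile(immigrant_coef, -2.0, d_lag=1, immigrant=True)

# Employment probability differential by years since migration.
gap = [p_i - p_n for p_i, p_n in zip(immig, native)]
# First year since migration in which the differential is closed, if any.
catch_up = next((tau for tau, g in enumerate(gap) if g >= 0), None)
print(round(gap[0], 3), catch_up)
```

In this sketch the initial differential at ysm = 0 is negative and is closed at the value of τ reported by `catch_up`; with estimated coefficients, the same loop would be run for each individual and the predictions averaged.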