5.1 Variable construction
This paper takes advantage of a unique database that links inventors to the general population in Sweden from 1985 to 2007 (Jung and Ejermo 2014; Zheng and Ejermo 2015). The data on inventors and inventions were extracted from the Worldwide Patent Statistics (PATSTAT) database provided by the European Patent Office (EPO). An immigrant is considered an inventor if s/he is listed as an inventor on at least one patent application to the EPO at any time between his/her migration and 2007. We use patent applications rather than granted patents because it usually takes a long time for a patent to be granted (on average, 5 years in our data).
The linkage between inventors and the population was done in two steps. First, inventors’ social security numbers (SSNs)Footnote 8 were identified by matching the names and addresses provided by the EPO to those offered by a commercial company or a 1990 Swedish population directory. Second, inventors were linked to the entire population by their SSNs. Detailed personal information on inventors and the population was obtained from Statistics Sweden, which has collected such information for all Swedish residents from 1985 onwards. Among the identified inventors, 10.9% are foreign-born, and they contributed 11.6% (by fractional count) of the identified Swedish patent applications (Zheng and Ejermo 2015).Footnote 9
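The two-step linkage and the fractional count can be illustrated with a minimal sketch. It assumes the common convention that an application with n listed inventors credits each of them with 1/n, and that unmatched inventors are simply left out of the identified total; the paper does not spell out either detail. All names, addresses, and identifiers are invented, and the helper names (`normalize`, `link`, `fractional_credits`) are hypothetical.

```python
from collections import defaultdict

# Toy records; every name, address, and identifier below is invented.
directory = {  # (name, address) -> personal identifier
    ("anna berg", "storgatan 1, lund"): "ID001",
    ("li wei", "kungsgatan 5, malmo"): "ID002",
}
population = {  # personal identifier -> register information
    "ID001": {"foreign_born": False},
    "ID002": {"foreign_born": True},
}
applications = {  # application -> inventors as listed on the application
    "EP1": [("Anna Berg", "Storgatan 1, Lund"), ("Li Wei", "Kungsgatan 5, Malmo")],
    "EP2": [("Li Wei", "Kungsgatan 5, Malmo")],
    "EP3": [("Anna Berg", "Storgatan 1, Lund"), ("No Match", "Unknown 0")],
}


def normalize(text):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


def link(name, address):
    """Step 1: name and address -> identifier. Step 2: identifier -> register record."""
    pid = directory.get((normalize(name), normalize(address)))
    if pid is None or pid not in population:
        return None
    return pid


def fractional_credits(apps):
    """Credit each linked inventor with 1/n of an application listing n inventors."""
    credits = defaultdict(float)
    for inventors in apps.values():
        for name, address in inventors:
            pid = link(name, address)
            if pid is not None:
                credits[pid] += 1.0 / len(inventors)
    return dict(credits)


credits = fractional_credits(applications)
foreign = sum(c for pid, c in credits.items() if population[pid]["foreign_born"])
foreign_share = foreign / sum(credits.values())
print(credits, round(foreign_share, 3))
```

In this toy example, the foreign-born inventor accounts for half of the identified inventors but for a larger share of the fractionally counted applications, which is the kind of difference between the two percentages reported above.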
We focus on those who immigrated at the age of 18–64 in the period 1990–2007 (1985–2007 in the descriptive analysis), as these immigrants could potentially be in the workforce. Compared with immigrants from other regions, immigrants from the EU-15 who migrated at these ages are more likely to be economic migrants. This is reflected in their higher employment rates (around 62–66% within 2 years of migration between 1985 and 2007, including both fully and partly employed), shown in Table 7 in Appendix 1.
We use education data to proxy for skills at the time of immigration. As data on immigrants’ education level at the time of migration are poorly registered, we imputed the missing data as follows. First, if immigrants did not enroll in any education in Sweden after they arrived, we assumed that their education level at the time of migration is the same as the first record shown in later years. Second, if immigrants enrolled in secondary high school education in Sweden after they arrived, we assumed that they had primary education before they arrived. Third, if immigrants enrolled in post-secondary high school (≥ 13 years of schooling) after they arrived, we assumed that they had a secondary high school education (10–12 years of schooling) before they arrived. In total, we imputed 42% of the data on education level for immigrants at the time of migration. Figure 5 shows that the percentage of missing data on education level in each year becomes more homogeneous after imputation than in the original data. In addition, a comparison of the regression results with and without imputation (Tables 2 and 4 vs. Table 8) suggests that using the imputed data is unlikely to create problems, although imputation errors may remain (see the detailed discussion in the last paragraph of Section 7.1).
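The three imputation rules can be written as a small function. This is only a sketch of the rules as stated above: the function and argument names are hypothetical, and using the first enrollment after arrival when a person enrolled more than once is an assumption, since the text does not cover that case.

```python
PRIMARY, SECONDARY, POST_SECONDARY = "primary", "secondary", "post-secondary"


def impute_arrival_education(observed, first_later_record, first_enrollment):
    """Return the education level at migration, imputing it when it is missing.

    observed           -- level registered at migration, or None if missing
    first_later_record -- first level registered in later years, or None
    first_enrollment   -- level of the first education enrolled in after
                          arrival in Sweden, or None if never enrolled
                          (using the first enrollment is an assumption)
    """
    if observed is not None:
        return observed                # nothing to impute
    if first_enrollment is None:
        return first_later_record      # rule 1: carry the first later record back
    if first_enrollment == SECONDARY:
        return PRIMARY                 # rule 2: enrolled in secondary -> primary
    if first_enrollment == POST_SECONDARY:
        return SECONDARY               # rule 3: enrolled in post-secondary -> secondary
    return None                        # no rule applies; the value stays missing


print(impute_arrival_education(None, POST_SECONDARY, None))
print(impute_arrival_education(None, SECONDARY, SECONDARY))
print(impute_arrival_education(None, POST_SECONDARY, POST_SECONDARY))
```

Cases that none of the rules cover return `None`, so they remain missing rather than being guessed.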
We measure the skill composition of immigrants in two ways. First, we compare the education structure of immigrants before and after the liberalization across three education levels: (a) the low educated, who arrived with at most a primary school education (≤ 9 years of schooling); (b) the middle educated, who arrived with a secondary school education; and (c) the high educated, who arrived with a post-secondary school education. Second, we use inventor data to investigate the effect of the liberalization reform on skills, which has not been done before. Specifically, we investigate the probability that an immigrant becomes an inventor between the time of immigration and the end of the study period (i.e., 2007).
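Both skill measures can be sketched in a few lines. The schooling thresholds follow the definitions in this section (≤ 9, 10–12, and ≥ 13 years); the function names and the representation of patent applications as a list of application years are assumptions made for illustration.

```python
def education_level(years_of_schooling):
    """Map years of schooling at arrival to the three education levels."""
    if years_of_schooling <= 9:
        return "low"       # at most primary school
    if years_of_schooling <= 12:
        return "middle"    # secondary school
    return "high"          # post-secondary school


def became_inventor(migration_year, application_years, end_year=2007):
    """True if at least one EPO application falls between migration and end_year."""
    return any(migration_year <= year <= end_year for year in application_years)


print(education_level(9), education_level(12), education_level(13))
print(became_inventor(1996, [1994, 2001]))   # an application after migration counts
print(became_inventor(1996, [1994]))         # an application before migration does not
```

Applications filed before the year of migration are ignored, in line with the definition of an immigrant inventor given at the start of this section.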
For the origin of immigrants, we observe only the broad region of origin, not the country. Knowing the country of origin would have been useful for controlling for or examining country-specific effects. The regions of origin are divided into the following groups:
(a) “The EU-15”. Finland, Denmark, and Sweden (the destination country) are excluded because they are part of the Nordic countries, where a free movement policy was enacted in 1954. The EU-15 is our treatment group, which was affected by the migration reform in 1994.
(b) “Other developed regions”. This includes the other Nordic countries (Finland, Denmark, Norway, and Iceland), North America (Canada, the USA, Central America, and the Caribbean countries), and Oceania.
(c) “All other regions” (excluding the rest of Europe and the former Soviet Union (SU)). This group includes “Other developed regions” (group b) as well as Asia, Africa, and South America.
(d) The rest of Europe and the former SU.
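Because group (c) contains group (b), an immigrant can belong to more than one group. A minimal sketch of this grouping follows; the region labels and function names are hypothetical and do not reproduce the coding of the underlying register data.

```python
# Hypothetical labels for the broad regions of origin.
GROUPS = {
    "EU-15": {"EU-15 (excl. Finland, Denmark, Sweden)"},
    "Other developed regions": {"Other Nordic countries", "North America", "Oceania"},
    "Rest of Europe and former SU": {"Rest of Europe and former SU"},
}
# Group (c) is group (b) plus Asia, Africa, and South America.
GROUPS["All other regions"] = GROUPS["Other developed regions"] | {
    "Asia", "Africa", "South America"
}


def groups_of(region):
    """Return, in alphabetical order, every group that contains the region."""
    return sorted(name for name, members in GROUPS.items() if region in members)


def is_treated(region):
    """The EU-15 is the treatment group affected by the 1994 reform."""
    return "EU-15" in groups_of(region)


print(groups_of("Oceania"))
print(groups_of("Asia"))
print(is_treated("EU-15 (excl. Finland, Denmark, Sweden)"))
```

Keeping the overlap explicit makes it clear that results based on “All other regions” are not independent of those based on “Other developed regions”.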
5.2 Trends in immigration group composition
Figure 1 shows the number of immigrants over time by region of origin as defined above. We can observe the following:
First, migration from the EU-15 to Sweden shows a steady increase, which took place after 1995. In fact, immigration from the EU-15 more than doubled by 2006.
Second, the migration trend for “Other developed regions” is also quite stable and largely in line with that of the EU-15. The main difference is that during the period 1988–1990, a relatively large inflow of immigrants from Denmark and Norway occurred (Statistics Sweden 1989, 1990, 1991). We use this group as our benchmark for comparison with the EU-15.
Third, migration from “All other regions” (which also includes “Other developed regions”) fluctuates strongly, with an even sharper spike in 1989. This spike reflects a large inflow of refugees from developing countries, such as Chile and Lebanon (Statistics Sweden 1989, 1990, 1991). The boom in 2006 is mainly due to large increases in refugees from countries such as Iran, Iraq, Lebanon, and Somalia (Statistics Sweden 2016). Immigration from this group occurs more often for humanitarian reasons, and the data could be more erratic. Because of this large heterogeneity, “All other regions” is not a good choice for comparison. We nevertheless included the group as a robustness check against the results that use “Other developed regions” as the baseline. As expected, the empirical results (available upon request) are quite noisy.Footnote 10 Therefore, we do not report them.
Finally, for the rest of Europe and the former SU, we see that, as in Austria (Huber and Bock-Schappelwein 2014), a massive inflow of immigrants took place in 1993 and 1994 because of the large number of refugees that resulted from the breakup of Yugoslavia and the ensuing wars. Immigration from the former SU also doubled in 1991 compared with 1990 because of the breakup of the SU (Statistics Sweden 2016). This influx of immigrants may have differed substantially from other immigrant flows in terms of motivation and education.Footnote 11 Refugee waves can clearly distort the interpretation of the policy reforms that we study. Therefore, we exclude this group from the regression analyses that compare other groups with immigrants from the EU-15.Footnote 12
Figure 2 shows the proportion of low-, middle-, and high-education levels in the year of immigration for each immigrant group by region of origin. For comparison, it also shows the proportion of each education level for the Swedish-born population. Graph A shows that the share of the low educated drops markedly for all groups over time. The developed regions clearly have lower shares of the low educated. The EU-15 experienced a marked fall in the share of the low educated in the 1989–1993 period, followed by a small rise in the 1994–1995 period, seemingly corroborating our theoretical prediction. There is a similar trend for immigrants from the rest of Europe and the former SU in those years. It is possible that those changes are due to the refugee crisis in the former Yugoslavia, the democratization of formerly communist regimes in Eastern Europe, and the loosening of emigration restrictions in those regions.
Like graph A, graph B shows declining trends in the share of middle-educated immigrants for each immigrant group. By contrast, the share of the middle educated among the Swedish-born is rising. This is explained by the fact that nowadays almost every young person has a secondary (gymnasium) education, which is now seen as almost a prerequisite for getting a job. Simultaneously with the large increase in immigrants in 1993 and 1994, the share of middle-educated immigrants from the rest of Europe and the former SU spiked in 1993. The share for this group remained at a stable, relatively high level through 1998, fell in 1999–2003, and then increased again. Among immigrants from “Other developed regions” and “All other regions,” the share followed a U-shaped pattern in the years 1992–1995. For the EU-15, however, the share decreased consistently until 2001, after which a slight increase can be observed.
Unlike graphs A and B, graph C shows generally growing trends in the shares of the highly educated in each group over the whole period. The share of the highly educated in each immigrant group was higher than for the Swedish population after 1990, seemingly reinforcing Grogger and Hanson’s (2011) conclusion that emigrants are generally drawn from the highly educated part of the distribution. Again, the rest of Europe and the former SU showed a somewhat irregular pattern, increasing sharply in 1991–1992 and then dropping sharply in 1993 and 1994. The EU-15 has the highest share of highly educated immigrants. Interestingly, this group showed a declining share of the highly educated in the period 1993–1995, but a decline is also observed for “Other developed regions” and “All other regions” in the period 1994–1995. These patterns suggest that other things were going on during the period of observation and reinforce the need to control for underlying trends.
Of the identified foreign-born inventors, 23.3% came from the EU-15 and 31.5% from “Other developed regions”; they contributed 24.1% and 30.8% of the identified Swedish patent applications (by fractional count), respectively. Figure 6 in Appendix 1 shows a growing trend of patent applications by these two groups as well as by Swedish-born inventors since 1994. This may be related to the economic recovery from the financial crisis at the beginning of the 1990s as well as Sweden’s accession to the EEA, which have led to an increased inflow of highly skilled immigrants from the EU-15 members to Sweden.
In sum, the descriptive analysis shows that, after the EEA was formed in 1994, the number of immigrants from the EU-15 increased slightly. During 1994–1995, the share of low-educated immigrants increased and the shares of the middle- and highly educated decreased, although the absolute number of immigrants at each education level grew.
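The last point, that a share can fall while the corresponding absolute number grows, is simple arithmetic. The sketch below uses invented counts that are not taken from the data; the function name `shares` is hypothetical.

```python
def shares(counts):
    """Convert counts per education level into shares of the total inflow."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


# Invented inflows by education level before and after the reform.
before = {"low": 100, "middle": 200, "high": 300}
after = {"low": 200, "middle": 250, "high": 350}

share_before = shares(before)
share_after = shares(after)

for level in before:
    print(level, before[level], after[level],
          round(share_before[level], 4), round(share_after[level], 4))
```

Every count rises in this example, but the low-educated inflow grows fastest, so its share rises from about 17% to 25% while the middle and high shares fall.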