Stepanov V.V. 2. Paper "Constructing Databases for the Russian Census Project" at the working meeting "Russian Census Workshop", Brown University, Watson Institute, USA, March 2002

Constructing Databases for the Russian Census Project

Valery Stepanov

Institute of Ethnology and Anthropology

The Institute of Ethnology and Anthropology (IEA), one of the key centers involved in designing the tools that the 2002 census will use to measure the ethnic and linguistic identity of the population, has a large amount of data at its disposal. However, a significant portion of this data does not lend itself to scholarly analysis because it has not been organized systematically in a way that would give researchers easy access. This is particularly true of the current documentation held by IEA in connection with its official correspondence with the State Committee on Statistics (Goskomstat), the State Duma of the Russian Federation (the lower chamber of parliament), and other central and regional government bodies of Russia.

It is also difficult for researchers to obtain information on the results of the trial (preparatory) censuses, as well as a full account of the evolution of census tools, legislative acts, and technical instructions. Many of the documents on the 2002 census that have been amassed over the past few years exist only on paper and in a single copy. Other documents are kept on the computers of various IEA staff offices (secretarial, reception) and of individual scholars. In addition, there is no list or catalogue of these documents, nor is there any guarantee that they are securely stored (no safeguards are in place against computer damage or duplication). Another problem is that the documents held electronically share no universal format. They were created in different programs (some are in e-mail form), which further complicates both storage and access for researchers. It is very important for the Russian Census Project to organize IEA’s informational resources along systematic lines, by placing most of the materials in databases and establishing regular access to them for scholars.

Some efforts have already been made to achieve these goals. Part of IEA’s census-related materials has been transferred to databases, which are systematically updated. This has been made possible by the Institute’s participation in two projects headed by Director Valery Tishkov. The first, “The Development of the Electronic Archive of the Institute of Ethnology and Anthropology of the Russian Academy of Sciences,” is financed by the Russian Humanitarian Scientific Foundation. The second, financed by the MacArthur Foundation, supports the Network for Ethnological Monitoring and Early Detection of Conflicts (EAWARN). About a dozen databases (among others) containing information on the upcoming census were created in connection with these projects. These databases include federal and regional legislation, materials elucidating the mass media’s preparations for the census, and the results of the Soviet censuses and censuses held in some of the New Independent States.

The problem that remains, however, is that neither of these projects can support the creation of additional databases on the census, because their focus is not on studying issues specifically linked to the census. Yet conducting a project on the census itself successfully requires some thirty specialized databases (see attached list). Another problem is that only a small segment of the necessary electronic search engines consists of simple (statistical) databases. Most of the databases needed to study the census are complex systems designed to hold authentic copies of documents. Putting these systems in place will require a certain financial outlay, as well as organizational efforts to create databases exclusive to the project.

It would seem that the project’s databases, under the working title The Russian Census Project, should be organized around several large thematic categories:

•Organization of the Census

•Debates on the Census

•Preparation and Conduct of the Census

•Instruments of the Census

•Statistical Data

•Cartography of the Census

•Bibliography
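One possible way to picture this organization is a small catalogue in which every database record carries its thematic category. The sketch below is illustrative only and is not the structure IEA actually uses: the names Category, DatabaseEntry, constructed, and group_by_category are hypothetical, and the two sample titles are taken from the list of proposed databases at the end of this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The seven thematic categories named in the text."""
    ORGANIZATION = "Organization of the Census"
    DEBATES = "Debates on the Census"
    PREPARATION = "Preparation and Conduct of the Census"
    INSTRUMENTS = "Instruments of the Census"
    STATISTICS = "Statistical Data"
    CARTOGRAPHY = "Cartography of the Census"
    BIBLIOGRAPHY = "Bibliography"


@dataclass(frozen=True)
class DatabaseEntry:
    """One project database (hypothetical record layout)."""
    category: Category
    title: str
    # Hypothetical flag: True if already constructed and updated by IEA.
    constructed: bool = False


def group_by_category(entries):
    """Return a dict mapping every category to the titles filed under it."""
    grouped = {category: [] for category in Category}
    for entry in entries:
        grouped[entry.category].append(entry.title)
    return grouped


# Two sample records; titles come from the list of proposed databases.
entries = [
    DatabaseEntry(
        Category.ORGANIZATION,
        "Legislative acts and procedural documents connected to the census",
        constructed=True,
    ),
    DatabaseEntry(
        Category.BIBLIOGRAPHY,
        "Bibliography of publications on the 2002 census",
    ),
]
catalogue = group_by_category(entries)
```

Keeping the category on each record, rather than in separate per-category files, would let a single catalogue be filtered or regrouped as the list of databases grows.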

Clearly, the database should contain up-to-date information on the instrumental part of the census; that is, the instructions given to the project’s participants should include information on the characteristics of the mechanism behind the All-Russian census. This information would be provided by three of the thematic categories listed above. “Organization of the Census” would include materials on the organizational activity of Goskomstat in preparing and conducting the census, federal and regional legislative acts, and IEA documents. “Instruments of the Census” would provide detailed information on the method used to measure identity among the population, i.e. it would hold data on the questionnaire and on the principles governing the coding process. The third category, “Preparation and Conduct of the Census,” would include information on the actual course of the census, drawn from various sources of data. Particularly important among this data will be the materials of the Network of Ethnological Monitoring (EAWARN).

Another important function of the database will be to assist in the project’s task of throwing light on the debates surrounding the issue of identity among the population in the census (“Debates on the Census” category). This data should cover all of the main spheres of these debates (scholarly, governmental, and public), and reflect their most important aspects, i.e. the political, ethno-cultural (socio-cultural), and scholarly aspects. Furthermore, the database should contain information on existing sociological surveys and scholarly works on the subject since this kind of activity on the eve of the census is an independent factor influencing mass consciousness and the adoption of administrative decisions.

The database will also allow the project’s members to evaluate the validity of the 2002 census, i.e. to assess the degree of distortion in the data obtained by the census. Consequently, the instructions to the researchers should include not only comparable data from previous Soviet censuses, but also data from the current population count of Russia, as well as census materials from states formerly part of the Soviet Union (“Statistical Data” category). Statistical data presented graphically, primarily in cartographic form, would be a useful supplement (“Cartography of the Census” category).

Bibliographies of the literature on the most debated questions connected to identity would be another useful aid. These would include lists of scholarly and public-political publications on issues having to do with the complex (or politically loaded) identity of a number of groups among the population of the regions of the Russian Federation, specifically in the northern Caucasus, the Volga region, and southern Siberia (“Bibliography” category). The bibliography of publications on the 2002 All-Russian census should be as complete as possible.

A very important function of the project’s database is to make the project’s activity available on the Internet. To achieve this, a dedicated project website will have to be created, which would use the database to provide up-to-date information on the preparation and conduct of the census, publish analytical materials written by project participants, and discuss debated issues. The site would also allow free access to certain parts of the database.

List of Databases Proposed for the Project

*already constructed and updated by IEA

Organization of the Census

1. Problems and errors associated with the 1997 and 2000 trial censuses*

2. Preparations for the census, conducted by Goskomstat in the regions*

3. Legislative acts and procedural documents connected to the census (federal and regional levels)*

4. Current documents of IEA, Goskomstat, and the State Duma on the census*

5. Who is who in the organization and conduct of the 2002 All-Russian census

Debates on the Census

1. Scholarly discussions on census issues (unpublished materials)*

2. Debates on the census in Russian government bodies (federal and regional levels): political motives behind support for or opposition to the census; the relationship between “center” and “regions” (as a power issue)

3. Public debates on the census, preceding the census (federal center and regions)*

4. Public debates on the census, following the census (federal center and regions)

5. List of sociological surveys and scholarly works on the census

Preparation and Conduct of the Census

1. Information on the course of the census from Goskomstat*

2. Expert analytical materials of the Network of Ethnological Monitoring and Early Detection of Conflicts (EAWARN)*

3. Publications of the EAWARN bulletin

Instruments of the Census

1. Lists of ethnic groups used in the Soviet censuses, compared to the analogous lists prepared for the national censuses of the New Independent States (including the list of the 1994 Russian micro-census)*

2. All versions (evolution) of the lists of ethnic groups prepared for the 2002 All-Russian census, with commentary by the experts*

3. “Dictionaries of Nationalities and Languages” of the Soviet censuses and the New Independent States

4. All versions (evolution) of the “Dictionary of Nationalities and Languages” prepared for the 2002 All-Russian census, with commentary by the experts

5. Archive of the questionnaires of the Soviet censuses and the New Independent States

6. Evolution of the questionnaire prepared for the 2002 All-Russian census*

7. Comparative data on the instructions issued to census-takers in the Soviet censuses and on those prepared for the conduct of the census in the New Independent States

Statistical Data

1. Results of the previous Soviet censuses, as a comparative base for analyzing the results of the 2002 All-Russian census*

2. Statistical census data, lists, and the evaluations of experts on the ethnic and linguistic composition of the population of the New Independent States

3. Statistical data on the ethnic composition of the population of the Russian Federation, as per the vital registration of births and deaths in the 1990s (this will involve expenses). This and other data, in conjunction with data from the 1979 and 1989 censuses and the 1994 Russian micro-census, will allow for controlled comparisons of the 2002 results, making it possible to assess discrepancies in the data that depend on (socio-political) factors influencing the census

4. Statistical data on the ethnic composition of the population of the Russian Federation, as per data on the migration of the population (this will involve expenses)*

Cartography of the Census

1. Cartographic presentation of the statistical distribution of ethnic and linguistic identity, as per data from Soviet censuses

2. Cartographic presentation of the statistical distribution of ethnic and linguistic identity, as per data from censuses in the New Independent States

3. Cartographic presentation of the statistical distribution of the ethnic composition of migrants by region in the Russian Federation*

4. Cartographic presentation of the statistical distribution of births and deaths (vital registration) among the population of the regions of the Russian Federation, taking into account ethnic composition

Bibliography

1. Bibliography of scholarly and public-political literature on particularly divisive issues connected to ethnic and linguistic identity

2. Bibliography of publications on the 2002 census

Translated by Maria Salomon Arel