The story so far:
The Central Government has announced that the next Census would take place in 2027 and that it would collect information on castes. Can such a massive data exercise be made more useful through a restructuring of the existing Census process?
How is the Census conducted?
The first phase of the Census, called house-listing, would probably be conducted between April and September 2026. This stage lists all the dwelling units in the country where people live, along with several characteristics of the houses and households.
The second phase, called population enumeration, would be conducted in 2027 and would collect information on several key socio-economic characteristics of the population. This is also the stage at which caste would be recorded.
Why is caste being recorded?
The recording of an individual’s caste was last done in the 1941 Census. However, that data could not be processed because of the economic constraints of the Second World War. Thus, effectively, the last Census to provide data on caste was the 1931 Census, which is now too outdated to use for any purpose.
Prime Minister Narendra Modi is reported to have said that caste enumeration as part of the Census is a step the government is taking to bring the marginalised and those left behind in every field into the mainstream. However, given the limitations of the Census as a method of data collection as well as the design of the Census questionnaire, it is doubtful whether this objective can be fulfilled. A restructuring of the Census questionnaires could make more useful data easily available and so further the objectives indicated by the Union government.
What are the problems with the questionnaires?
It is presumed that the questions that were included in the draft questionnaires for the 2021 Census may more or less remain the same for the 2027 Census. In the 2021 draft, the question on caste was restricted to those belonging to Scheduled Castes (SC) as in the past Censuses.
By making this question applicable to all castes, except Scheduled Tribes (ST), and with consequent changes in instructions and the software used for electronic data collection, data on castes can be collected. The practical difficulties of collecting data on castes are not within the scope of this article. Once caste is recorded, information on specific castes can be derived from the other questions canvassed: literacy/educational levels; age at marriage; mother tongue and other languages known; status of the individual as a main worker, marginal worker or non-worker; seeking/available for work; broad classification of industry/occupation of the workers; place of birth/previous residence; and data on child birth and survival.
While the data on ‘mother tongue and other languages known’ may not be of much importance in assessing the socio-economic status of various castes, information on participation in economic activity and its broad classification may be of use. However, the data on unemployment derived from the response to the question “whether seeking/available for work” suffer from conceptual issues and a lack of attention in data collection. For example, this question has a reference period of one year, but it is not clearly stated how long within that year a person should have been seeking or available for work to be classified as unemployed. Though this question has been asked in every Census starting from 1981, it has never yielded useful data.
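The ambiguity in the "seeking/available for work" question can be made concrete with a small sketch. The figures below are invented purely for illustration, and the function name and thresholds are assumptions, not anything defined by the Census. The point is only that the same responses give different unemployment counts depending on a minimum duration that the instructions never fix.

```python
# Hypothetical data: months each respondent spent seeking or being available
# for work during the one-year reference period. Values are invented.
months_seeking = [0, 1, 2, 3, 6, 6, 9, 12]


def count_unemployed(months, min_months):
    """Count respondents who sought work for at least `min_months` months."""
    return sum(1 for m in months if m >= min_months)


# The count changes with a threshold the Census instructions leave unspecified.
for threshold in (1, 3, 6):
    print(threshold, count_unemployed(months_seeking, threshold))
```

If different enumerators implicitly apply different thresholds, the resulting counts are not comparable across areas, let alone across castes.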
Information on ‘child births and survival’ collected in the Census suffers from serious quality issues. These questions, included in the Census from 1981, have outlived their utility as similar information is better collected through the National Family Health Surveys. Getting any reliable caste-wise data from these questions is almost impossible.
Information on migration may be important in assessing whether people of certain castes are more prone to migration. However, data from previous Censuses seem to indicate that a large percentage of migrants are either not counted or not recorded as migrants.
Thus, the only information that would be available for comparing castes is that on education, age at marriage and participation in economic activity.
The Census does collect other information that would help in moving towards the objectives stated by the Union government for the inclusion of caste in the Census, but using it would need some serious restructuring of the Census questionnaires and process.
How should the Census questionnaires be restructured?
The main objective of the house listing phase is to prepare a list of all dwelling units where people are living or are likely to be living at the time of the Census. This framework helps in carving out new enumeration blocks as required and thus helps balance the workload of the enumerators. Several questions relating to quality of housing, amenities available to households and assets owned, have been asked during this phase from the 1991 Census onwards.
However, in the 1981 Census, these questions were in the household schedule canvassed during the second phase of the census, that is the population enumeration phase.
Transferring these questions from the house-list schedule to the household schedule would make it easier to link information on quality of housing, amenities and assets to other aspects of the population. As there is a time gap of six to nine months between the house-listing and population enumeration phases, linking the information on the basis of house number, name of the head of the household, etc. may introduce errors. Such errors may seriously affect the reliability of data, especially for small communities.
Taking the questions out of the house-listing schedule would also help enumerators concentrate on listing all buildings, be they residential, partly residential or non-residential, along with the number of people living in them. Improved house-lists would help in better coverage of the Census. This is very important in urban areas, which have higher omission rates in most Censuses.
Such linkages or transfer of questions have not been adopted in either the 2011 Census or in the planning of the 2021 Census (which was postponed due to the COVID-19 pandemic).
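Why linking the two phases after the fact is fragile can be shown with a minimal sketch. All records, field names and the matching rule here are hypothetical; this is not the procedure used by the Census office. It simply joins the two data sets on house number and name of the head of the household, and shows how small changes during the gap between phases break the link.

```python
# Hypothetical house-list records from phase one, keyed by
# (house number, name of head of household).
house_list = {
    ("12", "Ramesh Kumar"): {"wall": "kutcha", "electricity": False},
    ("13", "Sita Devi"): {"wall": "pucca", "electricity": True},
    ("14", "Abdul Khan"): {"wall": "pucca", "electricity": True},
}

# Hypothetical enumeration records collected months later in phase two.
enumeration = [
    {"house_no": "12", "head": "Ramesh Kumar", "literate": False},
    {"house_no": "13", "head": "Seeta Devi", "literate": True},   # name spelt differently
    {"house_no": "14A", "head": "Abdul Khan", "literate": True},  # house renumbered
]


def link(records, houses):
    """Attach housing data to each record whose exact key exists in `houses`."""
    linked, unlinked = [], []
    for rec in records:
        key = (rec["house_no"], rec["head"])
        if key in houses:
            linked.append({**rec, **houses[key]})
        else:
            unlinked.append(rec)
    return linked, unlinked


linked, unlinked = link(enumeration, house_list)
print(len(linked), "linked;", len(unlinked), "unlinked")
```

In this toy case two of the three households fail to link. Collecting the housing questions in the same schedule as the population questions removes the need for such a join altogether.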
Without such data in the Census, it is not possible to answer questions like, “What is the literacy rate of persons living in kutcha houses without electricity, and is it significantly lower than that of others?” or “What proportion of the workforce in urban areas lives in kutcha houses?”, etc.
The Census should be able to provide answers to the above questions, disaggregated by caste. Only then can the data be used for identifying marginalised communities and the extent of the disparities between them. Though collecting accurate data through a Census on many of these variables is not an easy process, and though the quality of data might suffer, it is the best alternative as of now.
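The kind of tabulation described above becomes straightforward once housing and individual characteristics sit in the same record. The sketch below uses invented person-level records and placeholder caste labels "A" and "B"; the field names and the helper function are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical person-level records carrying both individual and housing data.
persons = [
    {"caste": "A", "house": "kutcha", "literate": False},
    {"caste": "A", "house": "kutcha", "literate": True},
    {"caste": "A", "house": "pucca", "literate": True},
    {"caste": "B", "house": "kutcha", "literate": False},
    {"caste": "B", "house": "pucca", "literate": True},
    {"caste": "B", "house": "pucca", "literate": True},
]


def literacy_rate_by(records, *keys):
    """Return the literacy rate for each combination of the given fields."""
    totals = defaultdict(int)
    literate = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += 1
        literate[group] += int(rec["literate"])
    return {group: literate[group] / totals[group] for group in totals}


rates = literacy_rate_by(persons, "caste", "house")
for group, rate in sorted(rates.items()):
    print(group, rate)
```

The same function can be called with any combination of fields, for example `literacy_rate_by(persons, "house")`, which is what makes a single linked record more useful than two separately tabulated schedules.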
Should some questions be omitted?
There is a need to make the Census leaner by dropping unnecessary questions. Several questions on amenities available to the household or assets owned by them may have become redundant. For example, ownership of mobile phones or computers may not be as important now as it was five years ago. Similarly, questions on households' access to bank accounts might be omitted. A shorter questionnaire would help the enumerator concentrate on getting more accurate responses to the remaining questions.
The Census has been providing caste/tribe-wise data on several socio-economic variables. It is doubtful whether these data have been used to identify the most backward castes/tribes or for similar exercises that could aid policy/program formulation. Hopefully, the caste-wise data thrown up by the upcoming Census will be put to better use in policy and program formulation, and not only in decisions regarding the percentages for reservation.
The author is a retired officer from the Indian Statistical Service and a former Deputy Registrar General.