We have spent many weeks digitizing census data, tediously comparing spreadsheets and meticulously analyzing the entries. But the question remains: what can be done with this data? What is the point of digitizing this information?
Digitizing this census data opens doors to obtaining many different types of information. As you know, the data includes information about race, status in the household, age, gender, birthplace, occupation, and more. Digitization will allow us to draw conclusions and make connections between these pieces of data.
For example, I digitized the census data for Ward 6, District 59 of Harrisburg. After compiling all the information into a single spreadsheet, I was able to make assertions about the different people living in this district. I first decided to look at the race portion of the data. Using Microsoft Excel, I was able to quickly and easily enter a formula [=COUNTIF(range, criteria)] to determine the number of times the word "white" appears in the race column. Since the race data is in column I, my formula is =COUNTIF(I1:I1000,"white"). Note that the quotation marks in the formula must be straight quotes, or Excel will reject it. This tells me that 988 people in this set of data are recorded as white. I then repeat the process using "black," which gives me a total of 11. Using these numbers, I can determine that about 99% of the people in this district were white, while only about 1% were black.
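For anyone who would rather script this step than type formulas, the same count and percentage can be sketched in Python. This is only an illustration under a few assumptions: the function names count_values and percentages are my own, and the race column is rebuilt here from the two totals reported above rather than read from the real spreadsheet. Like COUNTIF, the count ignores capitalization.

```python
from collections import Counter


def count_values(column):
    """Count how often each value appears, ignoring case and stray spaces."""
    return Counter(v.strip().lower() for v in column if v.strip())


def percentages(counts):
    """Convert raw counts into percentages of all counted entries."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Stand-in for the race column, rebuilt from the totals reported above
# (988 "white" and 11 "black"); the real data would come from the spreadsheet.
race_column = ["white"] * 988 + ["black"] * 11

race_counts = count_values(race_column)
race_share = percentages(race_counts)

print(race_counts["white"], race_counts["black"])
print(race_share)
```

Working out the percentages this way gives roughly 98.9% and 1.1%, which is where the rounded figures of 99% and 1% come from.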
Excel also allows us to draw conclusions about averages in the data. This feature may be most applicable in the age column of our data. Once again, I enter a formula [=AVERAGE(range)] to determine the average age. Using column K for age, the formula =AVERAGE(K1:K1000) tells me that the average age of people in this district is about 27.
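The average can be sketched in Python as well. Again, this is only an illustration: the function name average_age is my own, and the sample ages below are made-up values, not entries from the census. Excel's AVERAGE skips blank cells and text (such as a column header) when given a range, so the sketch does the same.

```python
from statistics import mean


def average_age(column):
    """Average the numeric entries, skipping blanks and text like AVERAGE does."""
    ages = [
        v for v in column
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return mean(ages)


# Made-up sample values for illustration only: a header, a blank cell,
# a missing entry, and four numeric ages.
sample_ages = ["Age", 34, 29, "", 5, None, 40]

print(average_age(sample_ages))
```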
There are a vast number of Excel formulas that can be used to make data inferences far beyond the two I have mentioned here.
Once all the census data is finalized, we will be able to use a GIS (Geographic Information System) to draw deeper conclusions such as spatial relationships between variables like gender and age, race and occupation, and more.