Elucd Case Study: Problems and solutions
While reading the Elucd case study, I put myself in their shoes to understand how they collect and process their data, how they draw conclusions that the data justifies, and how they make sure there are no negative consequences.
I read the report Dirty Data and Bad Predictions by Rashida Richardson, which summarizes how predictive policing is conducted using ‘dirty data’: data recorded at times when society leaned toward corruption, misreporting of cases, racism, and other forms of discrimination in order to serve politically beneficial ends. This made me wonder about one of the three challenges that Elucd faces:
Dashboard. Elucd collects a range of quantitative and qualitative data. How do they make sense of data that is collected in a way that is useful and not ethically askew?
Language as a barrier
Problem:
It is essential to collect the data in its real form, without bias, and in a way that preserves the tone of the individual providing it. Data is often lost in translation: the magnitude of the problem can be distorted, the subject of the complaint can be misunderstood, and so on. Many local neighborhoods could also be biased toward police officers who interact with them in their local language. Given these factors, the data Elucd receives could differ from the reality on the ground.
Possible Solutions:
It would be nice if Google Translate could solve this problem. However, Google Translate builds its translations from statistical patterns between two languages, such as how often word pairs occur together. This means it cannot always put a translation into proper context without the help of a human, which brings back the same problem of losing meaning in translation. A more effective idea would be for Elucd to hire a local individual who speaks both English and the language of the neighborhood being researched to translate the information back to Elucd. This may help Elucd understand the local tone of the residents without distorting the data.
Qualitative vs Quantitative Data
Problem:
Much of the data that Elucd receives would be in the form of numbers and words. How should the words and numbers be quantified and weighed in proportion to each other?
Let’s say Cindy complains that there is a lot of noise outside her house because a McDonald’s just opened there.
How loud is this noise, expressed as a percentage? Does ‘a lot of noise’ mean more than 60%? How does Elucd quantify the data in its true form without skewing it?
Possible Solutions:
Elucd could average its data as a way to make sense of what it is quantifying. Elucd could visit the house next to Cindy’s and ask Marie how she feels about the noise outside. Is it a lot? Is it bearable? At what times of the day is it too loud? This would help Elucd build brackets for quantifying its data as an average, with different brackets for different kinds of complaints. For example, if the noise in Cindy’s complaint turns out to be bearable, its bracket could range from 35% to 55%.
Another solution would be to hand these brackets to Cindy and Marie and let them mark the percentage at which the noise bothers them. While quantifying any data could skew, to some extent, the true essence of a complaint made in words, averaging considers the same complaint from a wider range of people. Even if it is not perfectly accurate, it still addresses the problem for more people.
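The bracket idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Elucd actually works: the names `BRACKETS`, `bracket_for`, and `summarize` are hypothetical, the ratings for Cindy and Marie are made up, and only the 35% to 55% "bearable" range comes from the example above.

```python
from statistics import mean

# Hypothetical brackets as (low, high, label) percent ranges.
# Only the 35-55 "bearable" range comes from the text; the others are assumed.
# Each range includes its lower bound and excludes its upper bound.
BRACKETS = [
    (0, 35, "barely noticeable"),
    (35, 55, "bearable"),
    (55, 100, "a lot"),
]


def bracket_for(score):
    """Return the label of the bracket that a 0-100 score falls into."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for low, high, label in BRACKETS:
        if low <= score < high:
            return label
    # A score of exactly 100 belongs to the last bracket.
    return BRACKETS[-1][2]


def summarize(ratings):
    """Average the residents' self-reported scores and find the bracket."""
    average = mean(ratings.values())
    return average, bracket_for(average)


# Made-up ratings: each resident marks how much the noise bothers them.
ratings = {"Cindy": 70, "Marie": 50}
average, label = summarize(ratings)
print(f"Average rating: {average}% ({label})")
```

With more neighbors added to `ratings`, a single strong opinion has less influence on the average, which is the point of asking a wider range of people.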
Biases
Problem:
Elucd will often be carrying out research in neighborhoods whose complaints hold biases, which could distort both the reality and the magnitude of the problem. For instance, an individual from a certain neighborhood may feel threatened by the presence of another individual who does not share their race, sexual orientation, religious beliefs, etc. While the person being reported may not be doing anything illegal, he or she may still be reported as suspicious or offensive by a resident of the neighborhood.
Possible Solution:
In cases like this, where biases are involved, Elucd could do a rough study or some groundwork on the neighborhood and the demographics of its residents to understand the types of complaints coming in. This could help them gauge how real the problem is. For example, if the neighborhood has no history of singling out one race in its complaints, and its complaints are legitimate ones that could be charged lawfully, then this data would be considered ‘clean data’.