- LEAVE FIRST PAGE BLANK AND DON’T CHANGE THE FILE NAME
- Do not change the file name when you submit the file to me.
- All original work
- IN-TEXT CITATIONS AND REFERENCES
- CHECK SPELLING
- Read the directions completely and make sure you follow them
- 230-250 words minimum for each answer
Here is the original question:
Davenport and Harris refer to the art and science of analytics. Wheelan points out that it is easy to use analytics to mislead.
As analytic savvy managers, you’ll need to assess both the question/issue being analyzed and the findings presented to you. What are some of the criteria you’ll use to determine if the question/issue being studied is appropriate for analytics and the findings accurate?
What I need you to do is the following:
- Read the discussion answers in these 3 posts from classmates and respond to each in a discussion manner, as you have been doing in my other assignments (230+ words each). I'm interested in substance more than anything else in the discussion.
- remember to cite all sources—when relevant—in order to avoid plagiarism.
- All original work
As Wheelan points out throughout chapter 1, statistics are a way in which we can easily reference numbers and complex data sets, collapsing complex information into a single number (Wheelan, p. 2). However, as managers we must be cognizant of the fact that statistics and data may not always be the most accurate reflection of the truth, in that there are levels of nuance we lose when "collapsing" these data sets. As Davenport and Harris note, there is an opportunity to question everything, and taking information at face value forfeits an opportunity to discover a potential error or an unexamined path. We must question what we are told and, if needed, review the data ourselves.
To mitigate errors and ensure that findings are accurate, I would ask questions of whoever was presenting me with the data; if they weren't able to answer the questions, I would assume that there was additional digging that needed to be done and comparisons to run. Specifically, I would ask how they drew their conclusions, what their hypotheses were going into their research, what their methodology was, and whether they could define their variables.
As an aside, while reading I found myself asking questions that I wondered whether the rest of the class would be interested in answering, or whether they had similar questions:
- What role does coincidence play in probability? Wheelan does explain how to account for coincidence, but I still think it is interesting to consider.
- How do hypotheses on human behavior develop without human bias? Or information bias? Or Davenport and Harris’s “golden gut” reference?
- Why aren't there other methods of sampling to poll large groups of people quickly and cost-effectively, given the rise of social media and advances in technology?
As an analysis manager, if a data sample is presented to me, the first criterion I would require is that the information be collected from an accredited source. I would need to verify that the information given to me is authentic. For example, if a clinical trial was completed and the findings were ready for analysis, I would require that the data be collected from a nationally accredited research institution.
Secondly, I would want to ensure that the information was collected ethically: whoever gathered it had the right to do so and to share it with the appropriate organization for analysis. This also coincides with the population that the information was collected from. Is there a way to identify whether there was bias during the collection process, or alteration of the raw data that was provided to me? An example supporting this criterion is the Institutional Review Board (IRB) for clinical trials, an independent organization that monitors data collection and the safety of a clinical trial. An independent review organization that stands behind the data I am provided would justify whether it is appropriate for analytics and support that the findings are accurate.
Additionally, I would need general information on the findings: What is being collected and why? What is the sample size? Where is this information being gathered? How long has this information been collected? How has this information been gathered? And what is the methodology of the collection? These are all factors that are important in confirming whether these findings are appropriate for analytics. I would need to be able to answer all of the above questions to effectively complete my job.
The final criterion is that the information I receive needs to correlate with the end goals of the company. If my company wants to learn how to improve sales of a beauty line on the west coast, but I am only provided information from customers on the east coast, I would not consider those findings accurate. However, if I were given findings on a national scale and asked to determine a trend among consumers from a specific region, that would be a more appropriate selection.
Based on my own attempts at doing analysis on small and large data sets, I have discovered at times that I will simply find a solution to a problem or question that no one asked or, to some degree, cares about. It isn't always the case, because it goes back to the intent of the study and the question I asked myself. In my recent experience, I worked at a firm where I questioned whether there was bias in the hiring and promotion practices. Initially it was a gut feeling, so I took the steps to do my own analysis. I went through the organization chart and focused primarily on director and higher positions, classifying and organizing each individual by the obvious attributes: title, ethnicity, gender, and rank. I recorded the data into spreadsheets and then a data visualization tool, did some analysis, and found that, based on the numbers, only about 1% of the 600 positions were filled by Black employees. I drilled down further to the division and department level and found the same disparity. I also referenced recent reports by the Organization of Governmental Accountability as confirmation of what was occurring in the industry.
At each step I asked: is my analysis correct? Did I create an error in the analysis based on how I grouped and organized the information? Was I trying to read something into the data that wasn't there? I checked and rechecked my process, methods, and results. All were good. Even with all of that, I could not determine whether there was explicit bias in hiring and promoting. Based on the numbers, I could only assume there was implicit bias. Establishing explicit bias would require not just what was in the org chart but actual practices, performance reviews, and so on — data I did not have access to. Given my experience, the following criteria seem appropriate.
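The spreadsheet workflow described above (classify each position by an attribute, then compute each group's share of the total) can be sketched in a few lines of Python. All names, fields, and figures here are illustrative assumptions constructed to mirror the 1%-of-600 disparity in the post, not the author's actual data:

```python
# Hypothetical sketch of the org-chart analysis described above.
# The data below is illustrative, not the author's actual records.
from collections import Counter

def representation_by_group(positions, field):
    """Return each group's share of positions as a fraction of the total."""
    counts = Counter(p[field] for p in positions)
    total = len(positions)
    return {group: n / total for group, n in counts.items()}

# Toy org chart: 600 director-and-above positions, 6 of them (1%)
# held by Black employees, mirroring the disparity in the post.
org_chart = (
    [{"title": "Director", "ethnicity": "Black"}] * 6
    + [{"title": "Director", "ethnicity": "Other"}] * 594
)

shares = representation_by_group(org_chart, "ethnicity")
print(f"{shares['Black']:.1%}")  # prints 1.0%
```

The same function can be rerun with `field="gender"` or on a department-level subset, which is essentially the "drill down" step the post describes.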
- The first thing that I find most compelling is really to ask, “What problem are you attempting to solve or what question was asked?”
- Where did the data come from or how was it collected?
- What processes around the data ensured it was accurate and uncompromised when it was created in its raw state? While it was stored?
- How relevant is the data to the problem being solved or the question being asked?
- What processes ensured the analysis was done in an open, standard, and unbiased way?
- What tools, methods, and processes were used to conduct the analysis and to confirm the validity of the findings?
- Can the analysis be backed up by similar or other studies?
- Are the findings supported by real world examples?
- What story does the analysis reveal, and is there a counter-narrative (the glass half full vs. half empty)?
I think all of the above are supported by Charles Wheelan in Naked Statistics: Stripping the Dread from the Data, where he writes:
“Even in the best of circumstances, statistical analysis rarely unveils “the truth.” We are usually building a circumstantial case based on imperfect data. As a result, there are numerous reasons that intellectually honest individuals may disagree about statistical results or their implications. At the most basic level, we may disagree on the question that is being answered.”
There is no perfect analysis, and data can sometimes be questionable when it comes to collection and quality. Even with these gaps, there is still an opportunity to better understand the results, because the gaps may force questions about the data and, at times, about the analyst's intent.
APA 790 words