Leveraging the Benefits of a Structured Approach
In this article I will illustrate a structured approach to statistical analysis. Structured approaches provide the following benefits:
- Discrete: efforts in each step don’t overlap with other steps.
- Measurable: since each step’s efforts are discrete, reporting progress or forecasting completion is simple. The same benefit applies to resource budgeting.
- Collaborative: project managers may assign multiple resources to work on separate steps simultaneously.
Structured Data Analysis
- Define Hypothesis
- Decompose Hypothesis Into Tests
- Evaluate Tests
- Report Test Results
For the following steps, I’ll use examples from my article Tableau: A/B Testing Grades vs. Study Hours.
1) Define Hypothesis
Create a statement about the data that is believed to be true. This step focuses the effort of the entire analysis, so it is the most important.
For example, if I suspect that students who use a tutor achieve higher grades and want analysis to validate that suspicion, I would state my hypothesis as “Tutor Usage Delivers Higher Grades.”
For this article, I’ll begin with the following dataset containing 100 records showing each student’s grade and whether they used a tutor.
2) Decompose Hypothesis into Tests
With a hypothesis defined, test its veracity rather than assuming it, to guard against your own biases.
Break down the central tenets of the hypothesis into measurable, achievable tests so they may be worked on collaboratively within your team.
It is important to make no assumptions in the tests, so that each claim supporting the hypothesis is proven on its own, resulting in strong support for the hypothesis. Failure of any test proves that a portion of the hypothesis is based on a faulty assumption.
For this example, I would create the following tests:
a) For each grade (A-F), what percentage of students used a tutor?
b) From Test A, which grades contain the highest percentage of tutor usage?
c) From Test B, does a pattern exist between the three highest grades and tutor usage (an increase, a decrease, or neither)?
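Test A reduces to a simple grouped calculation, which a short Python sketch can make concrete. The records below are hypothetical and are not the article's 100-record dataset; the function name `tutor_usage_by_grade` and the `(grade, used_tutor)` record layout are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical sample records as (grade, used_tutor) pairs.
# These are illustrative only, NOT the 100-record dataset from the article.
records = [
    ("A", True), ("A", True),
    ("B", True), ("B", False),
    ("C", True), ("C", False),
    ("D", False),
    ("F", False),
]


def tutor_usage_by_grade(rows):
    """Return {grade: percent of students with that grade who used a tutor}."""
    totals = defaultdict(int)   # students per grade
    tutored = defaultdict(int)  # students per grade who used a tutor
    for grade, used_tutor in rows:
        totals[grade] += 1
        if used_tutor:
            tutored[grade] += 1
    return {grade: 100.0 * tutored[grade] / totals[grade] for grade in totals}


usage = tutor_usage_by_grade(records)
for grade in sorted(usage):
    print(f"{grade}: {usage[grade]:.0f}% used a tutor")
```

Keeping Test A as its own function mirrors the structured approach: Tests B and C can consume its output without knowing how the percentages were computed.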
3) Evaluate Tests
From my analysis, I found the following results.
While evaluating my tests, I determined the following:
a) For each grade (A-F), what percentage of students used a tutor?
From grade F up to grade A, tutor usage was 13%, 25%, 64%, 80%, and 96%, respectively.
b) From Test A, which grades contain the highest percentage of tutor usage?
Grades C, B, and A contain the highest tutor usage (64%, 80%, and 96%).
c) From Test B, does a pattern exist between the three highest grades and tutor usage (an increase, a decrease, or neither)?
Tutor usage increased with each higher grade from ‘C’ through ‘A’ (64%, 80%, 96%), and the largest single jump, from 25% to 64%, occurs on reaching the ‘C’ grade.
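As a quick check of Tests B and C, the reported percentages can be run through a few lines of Python. This is a sketch rather than the method used in the original analysis, and it assumes the five reported values map to grades F, D, C, B, and A in that order.

```python
# Tutor usage (%) per grade as reported for Test A.
# Assumption: the five values correspond to grades F, D, C, B, A in that order.
usage = {"F": 13, "D": 25, "C": 64, "B": 80, "A": 96}
order = ["F", "D", "C", "B", "A"]  # lowest grade to highest

# Test B: the three grades with the highest tutor usage.
top_three = sorted(usage, key=usage.get, reverse=True)[:3]

# Test C: walk those grades from lowest to highest and classify the trend.
ranked = [grade for grade in order if grade in top_three]
values = [usage[grade] for grade in ranked]
pairs = list(zip(values, values[1:]))
if all(a < b for a, b in pairs):
    trend = "increase"
elif all(a > b for a, b in pairs):
    trend = "decrease"
else:
    trend = "neither"

# Change in tutor usage (percentage points) between adjacent grades.
steps = {f"{a}->{b}": usage[b] - usage[a] for a, b in zip(order, order[1:])}
largest_step = max(steps, key=steps.get)

print("Top three grades by tutor usage:", top_three)
print("Trend across those grades:", trend)
print("Largest step:", largest_step, f"(+{steps[largest_step]} points)")
```

With the reported figures, the top three grades are A, B, and C, the trend is an increase, and the largest step is the 39-point rise from D to C.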
4) Report Test Results
My analysis concludes that, excluding other factors, increased tutor usage results in increased grades.