Essay type: | Quantitative research papers |
Categories: | Company Data analysis Statistics |
Pages: | 3 |
Wordcount: | 556 words |
The purpose of the analysis was to help a company understand four main properties of its data: central tendency, variability, the shape of the distribution, and the pattern of the relationships between the variables. The Analysis ToolPak add-in in Excel made it easy to compute descriptive statistics for the multiple variables of the home values dataset. To further aid understanding of the data summary, a frequency histogram, a boxplot, and scatterplots were plotted.
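As a rough illustration of the kind of summary the ToolPak produces, the sketch below computes a few of the same statistics with Python's standard library. The `home_values` list and the `describe` helper are hypothetical and are not taken from the essay's dataset.

```python
import statistics

# Hypothetical home values (in thousands); NOT the essay's data.
home_values = [120, 135, 150, 150, 160, 175, 410]


def describe(values):
    """Return a handful of descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "range": max(values) - min(values),
    }


summary = describe(home_values)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this toy sample the single large value pulls the mean well above the median, which is the kind of gap that hints at a right-skewed variable.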
Frequency histograms show how often the values of a variable fall within each interval of a dataset. They help organize and understand the distribution of the dataset. The boxplot is a descriptive way of dividing data into quartiles. It is usually plotted as a rectangle whose edges mark the lower and upper quartiles, with a line inside the rectangle marking the median of the distribution. Lines (whiskers) drawn from either side of the rectangle extend to the smallest and largest values that are not outliers. Values that fall far from the central measure are outliers and are represented by points lying outside the whiskers, as in the Home value boxplot.
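The quartile and outlier logic behind a boxplot can be sketched as follows. This assumes the common 1.5 × IQR rule for flagging outliers, which the essay does not state explicitly; the data and the `iqr_outliers` helper are hypothetical.

```python
import statistics

# Hypothetical home values (in thousands); NOT the essay's data.
home_values = [120, 135, 150, 150, 160, 175, 410]


def iqr_outliers(values, k=1.5):
    """Return (q1, q3, outliers) using the k * IQR fence rule (assumed here)."""
    # "inclusive" treats the data as the full population of interest.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, q3, outliers


q1, q3, outliers = iqr_outliers(home_values)
print(f"Q1={q1}, Q3={q3}, outliers={outliers}")
```

Different tools use slightly different quartile conventions, so the fences may not match a given Excel chart exactly.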
Skewness and kurtosis are measures of the shape of a distribution rather than of central tendency. Skewness measures the asymmetry of the distribution. When it is positive, as for Home value, HH Inc, and Per Cap Inc at 1.5377619, 0.608919, and 0.74437462, respectively, the variable is positively skewed (skewed right), meaning the right tail of the distribution is longer than the left. When it is negative, as for Pct Owner Occ at -1.24581, the variable is negatively skewed (skewed left), meaning the left tail is longer than the right.
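A minimal sketch of the skewness calculation, assuming the adjusted Fisher-Pearson sample formula (the one Excel's `SKEW` function uses). The sample lists and the `sample_skewness` helper are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


# Hypothetical samples; NOT the essay's data.
right_tailed = [1, 2, 2, 3, 3, 3, 10]
left_tailed = [-x for x in right_tailed]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive: longer right tail
print(sample_skewness(left_tailed))   # negative: longer left tail
print(sample_skewness(symmetric))     # approximately zero
```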
To compare the degree of relationship between the variables, scatterplots and correlation coefficients are equally important. A scatterplot uses dots to represent individual observations and visualizes the relationship between two variables through the position of the dots along the horizontal and vertical axes. To gauge the strength of the relationship, the correlation coefficient, denoted by r, is computed for each pair of variables. It measures the linear relationship, where values from 0.7 to 1.0 indicate a strong positive linear relationship. From the summary, Home value/HH Inc, Home value/Per Cap Inc, and HH Inc/Per Cap Inc all had a positive linear correlation at 0.8192799, 0.729917, and 0.92955, respectively. In contrast, Home value/Pct Owner Occ, HH Inc/Pct Owner Occ, and Per Cap Inc/Pct Owner Occ had a negative linear correlation at -0.6089918, -0.5089918, and -0.34966, respectively.
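The coefficient r can be computed directly from its definition. The sketch below is a plain-Python version of Pearson's formula; the paired values and the `pearson_r` helper are hypothetical and only illustrate the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations; NOT the essay's data.
household_income = [40, 55, 60, 72, 90]
home_value = [110, 150, 155, 190, 260]
pct_owner_occ = [80, 74, 70, 66, 52]

r_pos = pearson_r(household_income, home_value)
r_neg = pearson_r(household_income, pct_owner_occ)
print(f"income vs home value: r = {r_pos:.3f}")
print(f"income vs owner occupancy: r = {r_neg:.3f}")
```

Because r captures only linear association, it is worth reading it alongside the scatterplot rather than in place of it.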
From the findings of the descriptive statistics, it can be concluded that:
- The Home value column of the dataset contains values that lie abnormally far from the other observations. This could arise from experimental errors or variability, and such data should be cleaned to remove the outliers.
- The majority of the variables, Home value, HH Inc, and Per Cap Inc, are positively skewed at 1.5377619, 0.608919, and 0.74437462, respectively. It can be concluded that the right tail of these distributions is longer than the left tail.
- From the scatterplot visualization and correlation coefficients, three pairs of variables had a positive correlation between 0.7 and 1.0. HH Inc/Per Cap Inc, Home value/HH Inc, and Home value/Per Cap Inc correlated at 0.929558967, 0.819279971, and 0.72991782, respectively. It can be concluded that, among the pairs involving Home value, the strongest positive linear relationship is between Home value and HH Inc, as supported by the correlation coefficient of about 0.819.
Cite this page
Paper Sample on Exploring Data Analysis: Central Tendency, Variability, and Relationships Between Variables. (2023, Oct 12). Retrieved from https://speedypaper.net/essays/paper-sample-on-exploring-data-analysis-central-tendency-variability-and-relationships-between-variables