Data Display Basics
When choosing the best visual representation for a dataset, there are a number of factors to consider. This guide does not cover all of them; it introduces the basics to help you get started.
Levels of Measurement:
One of the most basic factors to consider is the level of measurement of the variable(s) being displayed. In general, categorical variables (nominal, ordinal) are best displayed with pie charts and bar graphs. Continuous variables (interval, ratio) are better displayed with histograms and scatterplots. Let’s use some examples to explore why.
Descriptive Statistics – Categorical variables such as gender, level of education, or race/ethnicity are best represented by proportions or frequencies. It’s not possible to calculate a meaningful mean gender (How do you interpret a mean of 0.73?), so a visual showing frequencies or percentages, such as a bar graph or pie chart, is best.
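To make this concrete, here is a minimal sketch in Python of the frequencies and percentages such a visual would show. The education responses are made-up values for illustration only, and the text bars simply stand in for a real bar graph.

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration only).
education = [
    "High school", "Bachelor's", "High school", "Graduate",
    "Bachelor's", "High school", "Bachelor's", "High school",
]

# Frequency of each category.
counts = Counter(education)
total = len(education)

# Percentage of respondents in each category.
percentages = {level: 100 * n / total for level, n in counts.items()}

# Simple text bar chart: one row per category, one '#' per respondent.
for level, n in counts.most_common():
    print(f"{level:<12} {'#' * n} {n} ({percentages[level]:.1f}%)")
```

The same counts or percentages are what you would hand to a charting tool to draw the bars or the slices of a pie.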
Scale Variables – Unlike categorical variables, continuous variables do not fall into a small number of distinct groups, so pie charts and bar graphs are not effective. With height, for example, each distinct value would get its own slice of the pie, and each slice would represent very few individuals. You could group the heights into classes, but in doing so, you’ve turned the variable into a categorical one. Histograms are one way to represent scale variables: they divide the range of values into adjacent intervals and show how many observations fall in each, so the order and spacing of the values are preserved.
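As a rough sketch of how a histogram is built, the Python example below counts how many values fall into each interval. The heights, the 5 cm bin width, and the starting edge of 155 cm are all assumptions chosen for illustration, not recommendations.

```python
# Hypothetical heights in centimeters (made-up data for illustration only).
heights = [158.2, 161.5, 163.0, 165.4, 167.1, 168.8, 170.3,
           171.9, 172.5, 174.0, 176.6, 179.2, 181.7, 185.3]

bin_width = 5   # assumed bin width in cm
start = 155     # assumed lower edge of the first bin
n_bins = 7      # bins cover 155 up to (but not including) 190

# Count the heights in each half-open interval [lower, lower + bin_width).
bin_counts = [0] * n_bins
for h in heights:
    index = int((h - start) // bin_width)
    bin_counts[index] += 1

# Text histogram: adjacent intervals in order, one '#' per person.
for i, count in enumerate(bin_counts):
    lower = start + i * bin_width
    print(f"{lower}-{lower + bin_width} cm | {'#' * count}")
```

Because the intervals sit next to each other in numeric order, the overall shape of the distribution stays visible, which a pie chart of the same classes would hide.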
Comparing Means between Groups – Sometimes we want to compare groups on some outcome variable, meaning we are using two variables: one categorical, one continuous. In this case, we could use a bar chart with the independent variable (grouping variable) on the x-axis and the mean of the dependent variable (outcome variable) on the y-axis, so that each bar shows one group’s mean.
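The bar heights for this kind of chart are just the group means. Here is a small Python sketch under assumed data: the group names "Control" and "Treatment" and their scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (group, score) pairs (made-up data for illustration only).
records = [
    ("Control", 70), ("Control", 74), ("Control", 66),
    ("Treatment", 80), ("Treatment", 85), ("Treatment", 78),
]

# Collect the outcome values for each level of the grouping variable.
scores_by_group = defaultdict(list)
for group, score in records:
    scores_by_group[group].append(score)

# One mean per group: these are the bar heights (y-axis values).
group_means = {group: mean(scores) for group, scores in scores_by_group.items()}

# Text bar chart: one '#' for roughly every 5 points of the group mean.
for group, m in group_means.items():
    print(f"{group:<10} {'#' * round(m / 5)} {m:.1f}")
```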
Comparing Two Continuous Variables – Another common scenario is looking for a relationship between two scale variables, as in a correlation or regression analysis. Scatterplots are an excellent visual for this type of data because each point shows one observation’s values on both variables.
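For paired data like this, the sketch below pairs up the values as scatterplot points and computes the Pearson correlation coefficient by hand. The variable names and values (hours studied, exam score) are hypothetical, and pearson_r is a helper written just for this example.

```python
from math import sqrt

# Hypothetical paired measurements (made-up data for illustration only).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 58, 61, 70, 74, 81]

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Each (x, y) pair is one point on the scatterplot.
points = list(zip(hours_studied, exam_score))

r = pearson_r(hours_studied, exam_score)
print(f"{len(points)} points, r = {r:.3f}")
```

A value of r close to 1 or -1 corresponds to points that fall close to a straight line on the scatterplot, while a value near 0 corresponds to no clear linear pattern.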