Improving Visualizations of Hierarchical Qualitative Data

Improving Visualizations of Hierarchical Qualitative Data

Visualizing qualitative data can be difficult if care is not taken for hierarchical characteristics. Variables representing levels of feelings can be presented in a horizontal range to improve comprehension. The online bank, Simple, includes a poll in its newsletter to account holders and often asks for levels of confidence with financial topics. Here’s how to present hierarchical qualitative data in a few different ways based on visualizations from Simple’s monthly newsletter.

To represent qualitative data, careful consideration should be given to:

  • Graph Type
  • Logical Order of Data
  • Color Scheme

Original Graphs

Graph 1

In September, Simple’s poll question was: “How confident do you feel making big purchases in today’s financial environment?” Here is the visualization that accompanied it.

A pie graph created by Simple bank shows levels of confidence for account holder confidence making big purchases in today's financial climate.
Simple’s pie chart of its September survey results for: “How confident do you feel making big purchases in today’s financial environment?”

Although the legend is presented in a sensible high-to-low order, this graph is pretty confusing. The choice of a pie chart muddles the range of emotions being presented. The viewer’s eye, if moving clockwise, hits ‘Not at all Confident’ at about the same time as ‘Very Confident’. The color palette has no inherent significance for the survey responses. It does not travel on an easily understood color spectrum of high to low.

Graph 2

In November, Simple’s poll question was: “How do you feel about the money you’ll be spending this holiday season?” Below is the graph that illustrated these results.

A bar chart shows
Simple’s bar chart of November survey results for: “How do you feel about the money you’ll be spending this holiday season?”

Simple’s graph shows various emotions, but does not show them in any particular order, whether by percentage or type of feeling. Similar to the pie chart, the color palette does not have any particular significance.

Improved Graphs

Using Python and matplotlib’s horizontal stacked bar chart, I created different representations of the survey data for big purchase confidence and feelings about holiday spending. A bar chart presents results for viewers to read logically from left to right.

Graph 1

A horizontal bar chart shows Simple's survey results from high to low confidence levels for making big purchases in today's financial climate.
A horizontal stacked bar chart shows a variation of Simple’s September survey results.

I associated the levels of confidence with a green to red spectrum to signify the range of positive to negative feelings. Another variation could have been a monochrome spectrum where a dark shade moving to a lighter shades would signify decreasing confidence.

Graph 2

A horizontal stacked bar chart shows a range of emotions for holiday spending.
A horizontal stacked bar chart shows a variation of Simple’s November survey results.

I arranged the emotions from negative to positive feelings so they could show a spectrum. The color palette reflects the movements from troubled to excited by moving from red to green.

References

The survey data, as mentioned, comes from Simple‘s monthly newsletter.

This article from matplotlib on discrete distribution provided me with the base for these graphs. The main distinction is that I only included one bar to achieve the singular spectrum of survey results. I found variations of tree maps and waffle plots did not divide sections horizontally in rectangles as well as the stacked bar plot would.

Code

Visual #1 – September Survey Data

category_names1 = ['very \nconfident', 'somewhat \nconfident', 'mixed \nfeelings', 'not really \nconfident', 'not at all \nconfident']
results1 = {'': [14,16,30,19,21]}

def survey1(results, category_names):

    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    category_colors = plt.get_cmap('RdYlGn_r')(
        np.linspace(0.15, 0.85, data.shape[1]))

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.invert_yaxis()
    ax.xaxis.set_visible(False)
    ax.set_xlim(0, np.sum(data, axis=1).max())

    for i, (colname, color) in enumerate(zip(category_names, category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths
        ax.barh(labels, widths, left=starts, height=0.5,
                label=colname, color=color)
        xcenters = starts + widths / 2

        r, g, b, _ = color
        text_color = 'white' if r * g * b < 0.5 else 'darkgrey'
        for y, (x, c) in enumerate(zip(xcenters, widths)):
            ax.text(x, y, str(int(c))+'%', ha='center', va='center',
                    color=text_color, fontsize=20, fontweight='bold',
                   fontname='Gill Sans MT')
    ax.legend(ncol=len(category_names), bbox_to_anchor=(0.007, 1),
              loc='lower left',prop={'family':'Gill Sans MT', 'size':'15'})
    ax.axis('off')
    return fig, ax

survey1(results1, category_names1)

plt.suptitle(t ='How confident do you feel making big purchases in today\'s financial environment?', x=0.515, y=1.16, 
    fontsize=22, style='italic', fontname='Gill Sans MT')
#plt.savefig('big_purchase_confidence.jpeg', bbox_inches = 'tight')
plt.show()

Visual #2 – November Survey Data

category_names2 = ['in a pickle','worried','fine','calm','excited']
results2 = {'': [14,32,16,29,9]}


def survey2(results, category_names):

    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    category_colors = plt.get_cmap('RdYlGn')(
        np.linspace(0.15, 0.85, data.shape[1]))

    fig, ax = plt.subplots(figsize=(10.5, 4))
    ax.invert_yaxis()
    ax.xaxis.set_visible(False)
    ax.set_xlim(0, np.sum(data, axis=1).max())

    for i, (colname, color) in enumerate(zip(category_names, 
category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths
        ax.barh(labels, widths, left=starts, height=0.5,
                label=colname, color=color)
        xcenters = starts + widths / 2

        r, g, b, _ = color
        text_color = 'white' if r * g * b < 0.5 else 'darkgrey'
        for y, (x, c) in enumerate(zip(xcenters, widths)):
            ax.text(x, y, str(int(c))+'%', ha='center', va='center',
                    color=text_color, fontsize=20, fontweight='bold', fontname='Gill Sans MT')
    ax.legend(ncol=len(category_names), bbox_to_anchor=(- 0.01, 1),
              loc='lower left', prop={'family':'Gill Sans MT', 'size':'16'})
    ax.axis('off')
    return fig, ax


survey2(results2, category_names2)
plt.suptitle(t ='How do you feel about the money you\'ll be spending this holiday season?', x=0.509, y=1.1, fontsize=22,
            style='italic', fontname='Gill Sans MT')
#plt.savefig('holiday_money.jpeg', bbox_inches = 'tight')
plt.show()