# Introduction

The goal of Highlights in Canvs MRX is to help researchers understand, immediately and in plain language, the biggest and most statistically significant takeaways from their open-ended responses.

To start, best-in-class coding of topics and emotions on every open-ended response lets researchers immediately see how people feel about a given topic compared to all others, and which topics or emotions any given cohort is expressing.

# Understanding the Algorithm

The Canvs Highlights algorithm uncovers meaningful patterns in open-ended responses by automatically running an array of statistical comparisons across topics, emotions, and all of the filters included during the upload process.

The algorithm reports five main types of “highlights” (if they are available):

## Filter-Topics:

The algorithm identifies topics that are over-indexed for a certain subgroup (e.g., Age 18-25) within a filter (e.g., Age groups), by comparing reactions by that subgroup to reactions by all other subgroups.

**Example**: “[Female] subgroup mentioned *2.9x more Expensive* compared to Male.”
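To make the “2.9x” figure concrete, here is a minimal sketch of how such a multiplier could be computed. It assumes the multiplier is the ratio of two mention rates (the share of the focal subgroup's open-ends that mention the topic, divided by the same share in the comparison group). The function name `over_index_ratio` and the counts are hypothetical and for illustration only; this is not necessarily how Canvs computes the figure.

```python
def over_index_ratio(focal_hits, focal_total, other_hits, other_total):
    """Ratio of the focal group's mention rate to the comparison group's rate.

    Assumption: "Nx more" is read as a simple ratio of proportions.
    """
    if focal_total <= 0 or other_total <= 0:
        raise ValueError("group totals must be positive")
    focal_rate = focal_hits / focal_total
    other_rate = other_hits / other_total
    if other_rate == 0:
        # No mentions in the comparison group: the ratio is unbounded.
        return float("inf")
    return focal_rate / other_rate


# Hypothetical counts: 58 of 200 Female open-ends mention "Expensive",
# versus 20 of 200 Male open-ends.
ratio = over_index_ratio(58, 200, 20, 200)
print(f"{ratio:.1f}x more")
```

A ratio on its own says nothing about reliability, which is why the algorithm also applies the significance checks described under “Statistical Details”.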

## Topic-Emotions:

The algorithm identifies emotions that are over-indexed for a certain topic, by comparing reactions about a focal topic to all other reactions.

**Example**: “[Quality] topic received *1.8x more Enjoyment* compared to the average of all other topics.”

## Filter-Emotions:

The algorithm identifies emotions that are over-indexed for a certain subgroup (e.g., Age 18-25) within a filter (e.g., Age groups), by comparing reactions by that subgroup to reactions by all other subgroups.

**Example**: “[Age 18-25] subgroup expressed *2.6x more Unsure* compared to the other 6 Age Range subgroups.”

The algorithm considers both commonly accepted statistical criteria and practical significance when determining whether a specific emotion or topic is “over-indexed”. On the statistical side, the algorithm incorporates both small-sample and multiple-comparison corrections when calculating statistical significance (see “Statistical Details” below for more explanation). On the practical side, a highlight is included only if there are 10 or more “relevant” open-ends in the dataset. For instance, a Filter-Topics highlight for the Age 18-25 subgroup and the topic “Storyline” will be included only if there are at least 3 open-ends by the Age 18-25 subgroup that mentioned “Storyline”.

## Wave-Topic:

The algorithm identifies topics that are over-indexed for a certain wave, by comparing reactions in a focal wave to all other reactions.

## Wave-Emotion:

The algorithm identifies emotions that are over-indexed for a certain wave, by comparing reactions about a focal wave to all other reactions.

# Statistical Details

The algorithm conducts every pairwise comparison between a subgroup’s proportion (e.g., {age 18-25, love%} for a “filter-emotions” comparison) and its complement (i.e., {not age 18-25, love%}). For each pairwise comparison, an Agresti-Coull correction is first applied when estimating the sample proportions, to correct for small-sample error. Then a test of equality of proportions is performed, which yields a z-statistic as its test statistic. The z-statistic is converted to a p-value and compared to a significance threshold determined by a Bonferroni correction, which accounts for multiple comparisons. Comparisons with p-values smaller than the threshold are considered statistically significant.
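The steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Canvs implementation: it assumes the Agresti-Coull adjustment uses z = 1.96 (adding z²/2 successes and z² trials), a pooled two-sided z-test, and a base significance level of 0.05 divided by the number of comparisons. The function names and the example counts are hypothetical.

```python
import math

Z_95 = 1.96  # assumed critical value for the Agresti-Coull adjustment


def agresti_coull(successes, n, z=Z_95):
    """Return the adjusted proportion and adjusted sample size."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    return p_adj, n_adj


def two_proportion_test(x1, n1, x2, n2):
    """Pooled z-test of equal proportions on Agresti-Coull adjusted estimates.

    Returns the z-statistic and a two-sided p-value (two-sided is an assumption).
    """
    p1, m1 = agresti_coull(x1, n1)
    p2, m2 = agresti_coull(x2, n2)
    pooled = (p1 * m1 + p2 * m2) / (m1 + m2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / m1 + 1 / m2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni_significant(comparisons, alpha=0.05):
    """Flag each comparison whose p-value is below alpha / number of comparisons."""
    if not comparisons:
        return {}
    threshold = alpha / len(comparisons)
    flags = {}
    for label, (x1, n1, x2, n2) in comparisons.items():
        _, p_value = two_proportion_test(x1, n1, x2, n2)
        flags[label] = p_value < threshold
    return flags


# Hypothetical counts: (subgroup hits, subgroup total, complement hits, complement total)
comparisons = {
    "Age 18-25 / Love": (40, 100, 60, 400),
    "Age 18-25 / Dislike": (12, 100, 50, 400),
}
flags = bonferroni_significant(comparisons)
print(flags)
```

Because the Bonferroni threshold shrinks as the number of comparisons grows, a dataset with many filters, topics, and emotions needs stronger evidence before any single comparison is reported as a highlight.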

You can read more about our overall methodology at www.canvs.ai/methodology.
