A lot of manual work goes into the collection, analysis, and dissemination of user insights. The most common pain point we hear from our community of researchers, product managers, and designers at Riley can be distilled into this one sentence:
“I want to make better data-informed decisions, but there’s just so much data out there, and I don’t always have the time or know-how to synthesize everything in a meaningful and impactful way that can influence my stakeholders.”
As someone who has witnessed the incredible transformation of the user insights industry (research and product analytics) over the past decade — both as an individual contributor and eventually as a functional leader — it pains me to see how much time we still spend on manual tasks when preparing and analyzing user insights. It’s also surprising how inconsistent the methods of gathering and analyzing insights continue to be, especially when the expectation is that these insights will influence business-wide strategies.
Here are the two most impactful ways you can use machine learning to make your user insights more scalable and effective:
Preparation: Finding Unique User Cohorts
What:
Instead of creating pivot tables against psychographic survey data or manually scouring through your CRM to build user cohorts, you can use machine learning models to automatically segment users into meaningful cohorts. These cohorts can be segmented based on behavior, demographics, or preferences, allowing you to create more targeted research and analysis specific to each segment.
How:
- Identify the sources of data that you would like to use. For this example, let’s assume we work at an e-commerce company and want to use a combination of shopping frequency, length of time browsing on the app, location, and age as variables.
- Run a clustering model to identify initial segments. The model I use most frequently is k-means, which partitions data into a chosen number of clusters (k) by assigning each data point to the cluster whose centroid (the cluster’s mean point) is closest.
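To make "closest centroid" concrete, here is a minimal sketch of the assignment step on its own, using NumPy. The toy numbers, the two centroids, and the function name `assign_to_nearest_centroid` are all made up for illustration; they are not taken from real data.

```python
import numpy as np

# Hypothetical toy data: each row is a user, columns are
# (purchases per month, minutes browsing per session).
users = np.array([
    [1.0, 5.0],
    [2.0, 6.0],
    [9.0, 30.0],
    [10.0, 28.0],
])

# Two hypothetical centroids (k = 2), e.g. "casual" and "heavy" shoppers.
centroids = np.array([
    [1.5, 5.5],
    [9.5, 29.0],
])


def assign_to_nearest_centroid(points, centers):
    """Return, for each point, the index of the closest center (Euclidean distance)."""
    # distances[i, j] = distance from point i to center j
    distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)


labels = assign_to_nearest_centroid(users, centroids)
print(labels)  # [0 0 1 1]
```

The full k-means procedure alternates this assignment step with recomputing each centroid as the mean of its assigned points, until the assignments stop changing.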
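Here is a sample script that you can use to run a k-means analysis for segmentation, assuming the data you collected in Step 1 is in .csv format. Note that it clusters on numerical columns only, so a text field such as location is left out unless you convert it to numbers first: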
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
# Step 1: Load your data from a CSV file
data = pd.read_csv('your_data.csv')
# Step 2: Select the features/columns you want to cluster on
# Assuming we want to cluster based on all numerical columns in the dataset
# If you have specific columns, use X = data[['column1', 'column2']] instead of the line below
X = data.select_dtypes(include=['float64', 'int64']) # Select only numerical columns
# Step 3: Apply KMeans clustering
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0) # Initialize KMeans with 3 clusters; n_init is set explicitly because its default differs across scikit-learn versions
kmeans.fit(X) # Fit the model to the data
# Step 4: Add the cluster labels back to the original dataframe
data['Cluster'] = kmeans.labels_
# Step 5: Optional - Visualize the clusters (only for 2D or 3D data)
plt.scatter(X.iloc[:, 0], X.iloc[:, 1], c=data['Cluster'], cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red', marker='X')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('KMeans Clustering')
plt.show()
# Step 6: Optional - Save the dataframe with clusters back to CSV
data.to_csv('clustered_data.csv', index=False)
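Two practical caveats with the script above: k-means is distance-based, so columns measured in large units (such as age in years) can outweigh columns in small units, and the choice of 3 clusters is arbitrary. The sketch below shows one common way to handle both, under stated assumptions: it one-hot encodes a text column, standardizes every feature, and compares several values of k using the silhouette score. The data is synthetic and the column names (`shopping_frequency`, `browsing_minutes`, `age`, `location`) are hypothetical stand-ins for the e-commerce example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the e-commerce data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 150
data = pd.DataFrame({
    "shopping_frequency": rng.poisson(4, n),
    "browsing_minutes": rng.normal(20, 8, n).clip(min=1),
    "age": rng.integers(18, 70, n),
    "location": rng.choice(["north", "south", "west"], n),
})

# One-hot encode the text column so it can take part in distance calculations.
features = pd.get_dummies(data, columns=["location"], dtype=float)

# Standardize so that no single column dominates because of its units.
X = StandardScaler().fit_transform(features)

# Try several cluster counts and record the silhouette score for each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("Best k by silhouette score:", best_k)
```

A higher silhouette score suggests tighter, better-separated clusters, but treat it as a guide rather than a verdict: the cohorts still need to make sense to the people who will use them.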
Analysis: Distilling Themes Consistently in Qualitative Insights
What:
Think about the last time you read through hundreds of pages of customer research notes. You may have spent hours clustering key insights into themes using post-it notes or tagging them in a spreadsheet. Teams I have spoken to often mention that, in addition to being time-consuming, this process is inconsistent: different people interpret the same data differently and produce varying insights.
Instead of manually analyzing qualitative data, you can use natural language processing (NLP) to distill key themes from all of your data more objectively. Because an NLP model applies the same mathematical procedure to every piece of text, no matter who runs the analysis, you can feel more confident that you are generating consistent insights from your qualitative data.
How:
There are numerous NLP models, but they require significant investments to build, optimize, and maintain. We decided to build Riley to make this technology easily accessible to you! In just minutes, you can use Riley to generate consistent insights from all of your unstructured qualitative data.
If you’d like to see what we’re building at Riley, you can sign up here or send me a message!
I’ll share more tips on how you can use machine learning to make collecting, interpreting, and applying user insights to your decision-making easier in the coming weeks. I’d also love to hear about how you’re currently doing this in your role today!