Instacart Customer Analysis

Project Overview
In this project, I analyzed four datasets (customer, product, orders, and sales data) from an online grocery delivery service application to identify patterns in customer spending habits through customer segmentation. The company is looking to improve its customer targeting for ad campaigns.
Most of the data I had access to was fabricated, which made it prone to bias and to unrealistic results.
Although the data was fake, I removed columns that contained customers' personal data to practice ensuring data privacy.
I conducted a comprehensive customer analysis for Instacart as part of my data immersion course at CareerFoundry to improve my skills in using Python libraries for data analysis.
Tools and Techniques
Datasets were cleaned, wrangled, and merged in Python using the NumPy and pandas libraries. Visualizations were created with the Matplotlib and Seaborn libraries.
Data Cleaning
Before conducting exploratory analysis, I wrangled, cleaned, and merged all four datasets to create the final analysis-ready dataset, which contained over 32 million records. This step involved making sure the data was in a consistent format, addressing unexplained missing values, and removing duplicate records.
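The cleaning steps described above might look roughly like the following sketch. It uses two tiny stand-in tables rather than the real data, and every column name (order_id, user_id, days_since_prior_order, first_name, marital_status) is a hypothetical placeholder, not the project's actual schema.

```python
import pandas as pd

# Toy stand-ins for two of the four datasets; all column names are hypothetical.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": ["10", "11", "11", "12"],      # join key stored as strings
    "days_since_prior_order": [None, 7.0, 7.0, 30.0],
})
customers = pd.DataFrame({
    "user_id": [10, 11, 12],
    "first_name": ["Ann", "Bo", "Cy"],        # personal data, dropped below
    "marital_status": ["married", "single", "married"],
})

# Consistent format: give the join key the same dtype in both tables.
orders["user_id"] = orders["user_id"].astype("int64")

# Duplicates: keep only the first copy of each fully identical row.
orders = orders.drop_duplicates()

# Missing values: count them per column before deciding how to handle them.
missing_per_column = orders.isna().sum()

# Privacy: drop columns that hold personal data before merging.
customers = customers.drop(columns=["first_name"])

# Merge; indicator=True adds a "_merge" column recording each row's match status.
merged = orders.merge(customers, on="user_id", how="left", indicator=True)
```

Inspecting merged["_merge"].value_counts() after each merge is one way to confirm that no rows were left unmatched before moving on.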
Exploratory Analysis
During this phase, statistical aggregations yielded valuable insights for the sales and marketing team. I used visualizations to illustrate the patterns in customer spending habits, focusing on the busiest and slowest days of the week and times of day in terms of customer order frequency.
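As an illustration of the kind of aggregation involved, the sketch below counts orders per day of the week and per hour on a small made-up table. The column names order_day_of_week and order_hour_of_day are assumptions chosen for the example.

```python
import pandas as pd

# Made-up orders table; column names are assumptions for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6],
    "order_day_of_week": [0, 0, 0, 1, 3, 3],
    "order_hour_of_day": [10, 10, 14, 10, 9, 14],
})

# Number of orders per day of the week, sorted by day rather than by count.
orders_per_day = orders["order_day_of_week"].value_counts().sort_index()

# Labels of the days with the most and the fewest orders.
busiest_day = orders_per_day.idxmax()
slowest_day = orders_per_day.idxmin()

# Number of orders per hour of the day.
orders_per_hour = orders.groupby("order_hour_of_day")["order_id"].count()
```

Either result can be charted directly, for example with orders_per_day.plot.bar(), which draws a Matplotlib bar chart.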

Customer Profiling and Analysis
I classified customers by loyalty and demographics, which required me to derive new variables and flags from the existing columns. The loc indexer in the pandas library made this process straightforward. Again, with the aid of charts, I was able to show where customers stand in terms of loyalty status, product popularity, and spending habits.
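A minimal sketch of deriving a loyalty flag with loc is shown here. The max_order column, the flag labels, and the thresholds of 10 and 40 orders are all hypothetical values picked for the example, not the cut-offs used in the project.

```python
import pandas as pd

# Made-up customer table; "max_order" (orders placed so far) is a hypothetical column.
df = pd.DataFrame({"user_id": [1, 2, 3, 4], "max_order": [3, 10, 25, 41]})

# Each loc call selects the rows matching a condition and writes the flag
# into the new "loyalty_flag" column. Thresholds are illustrative only.
df.loc[df["max_order"] > 40, "loyalty_flag"] = "Loyal customer"
df.loc[(df["max_order"] > 10) & (df["max_order"] <= 40), "loyalty_flag"] = "Regular customer"
df.loc[df["max_order"] <= 10, "loyalty_flag"] = "New customer"
```

Because the three conditions cover every possible value without overlapping, each customer ends up with exactly one flag. The same pattern extends to demographic flags built from columns such as age or income.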
Findings and Recommendations
Because I was working with simulated rather than actual data, the results differed significantly from my initial expectations, particularly regarding the popularity of products among different customer segments. Product popularity varied only slightly across customer categories.
Even so, based on the charts, the marketing team may want to focus on married customers, as they make up the highest-spending group.
Had significant differences in product popularity been observed among customer segments, a next step would have been to build a suitable machine learning model, such as a decision tree, to target customers more efficiently.

