Stitch Fix SQL Interview Question

Question:  You have been given the below table on transactions from users. Write a query to get the number of users and total products bought per latest transaction date where each user is bucketed into their latest(last) transaction date.

Explanations & Definition: every user will have a last(lastest) transaction date, also multiple users can have the same latest transaction date. Also, it’s possible that a user had multiple transactions on his/her latest transaction date. Total Products should reflect the count of all products purchased, not necessarily distinct products.

Difficulty Level: Medium, SourceCrackingProductTag: Stitch Fix SQL Interview Question

Assumptions: No NULLS present; DBMS: Spark SQL

Output Columnslatest_trans_date, num_users, total_products_bought

Approach:

Steps: First calculate the last (latest) transaction date for each user and the total products bought, save that as a CTE, now using the CTE find the final answer.

select latest_trans_date, count(distinct user_id) as num_users, count(product_id) as total_products_bought
--- here for the total_products_bought column value any column from the subquery can be used inside the count aggregation function
from(
    select user_id, transaction_id, product_id, transaction_date,
    max(transaction_date) over(PARTITION by user_id) as latest_trans_date 
    from user_transactions)
where latest_trans_date = transaction_date --- filter only those transactions happening on the last purchase date
group by latest_trans_date
order by latest_trans_date

Leave a Comment