Opendoor SQL Interview Question

Question: Assume you are given the below table on house prices of listed properties from various zip codes. Write a query to get the top 5 zip codes by market share of house prices for all zip codes. Also return the corresponding market share (in %, 2 decimal) as well and filter zip codes with less than 100 properties listed.

Explanation & Definition: market share of house price of a zip code = total value of properties in that zip code/ total value of all properties

Difficulty Level: Easy, SourceCrackingProductTag: Opendoor SQL Interview Question

Assumptions: No NULLS present; DBMS: Spark SQL

Output Columnszip_code, total_price, market_share

Approach:

Step-1: for each zip code, find the total property price, divide this price value by the total property price across all zip codes to get the corresponding market share.

Solution:

--- Method-1:
select zip_code,
 sum(price) as total_price,
 round(100*sum(price)/(sum(sum(price)) over()),2) as market_share
 from housing
 group by zip_code
 having count(house_id) >=100
 order by market_share DESC
 limit 5

--- Method-2:
with zip_level_prices as (
     select zip_code,
     sum(price) as total_price
     from housing
     group by zip_code
     having count(house_id) >=100)

select zip_code, total_price, round(100*total_price/(sum(total_price) over()),2) as market_share
from zip_level_prices
order by 3 desc
limit 5

Note: the solution shared above in method-1 will most likely work only on Spark SQL since other dbms don’t allow group by aggregation and window function aggregation in a single query.

to tackle that issue, use the results of zip_code, total price in a Common table expression (CTE) or in a subquery then using that result calculate the market share, as shown in the 2nd Method

Leave a Comment