一道sql问题:Reviews of Categories

https://platform.stratascratch.com/coding/10049-reviews-of-categories?code_type=3

Find the top business categories based on the total number of reviews. Output the category along with the total number of reviews. Order by total reviews in descending order.

yelp_business

business_id:
varchar
name:
varchar
neighborhood:
varchar
address:
varchar
city:
varchar
state:
varchar
postal_code:
varchar
latitude:
float
longitude:
float
stars:
float
review_count:
int
is_open:
int
categories:
varchar

答案是:
select 1
union all
select n+1 from num where n<12
)
select
substring_index(substring_index(categories,’;’,n),’;’,-1) as category,
sum(review_count) as review_cnt
from
yelp_business
inner join
num
on
n <= char_length(categories) - char_length(replace(categories,’;’,’’)) + 1
group by
category

可以求老师和uu们解释一下吗

我看了一下具体数据,这个solution其实主要是把categories这个column从一行变成多行.

Auto Detailing;Automotive 变成两个row:
Auto Detailing
Automotive

先用with recursive生成一个temp table,里面只有一个column叫n,这个column里面是连续的整数从1到12(我猜测是因为categories里面最多能有12个值
然后就是query里面的substring_index, 可以去搜一下mysql的doc,简单理解就是根据n找到第n个分号的category,因为n是从1-12,但是categories里面不一定有这么多个category,所以用 n <= char_length(categories) - char_length(replace(categories,’;’,’’)) + 1 来知道categories里面每一行有多少个n

1 个赞