news 2026/4/18 5:14:15

SQL窗口函数(使用场景)

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
SQL窗口函数(使用场景)

1. 常见排名和排序

-- 行号 ROW_NUMBER() OVER (ORDER BY sales DESC) AS rank -- 排名(允许并列) RANK() OVER (ORDER BY sales DESC) AS rank -- 密集排名 DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rank -- 分位数 NTILE(4) OVER (ORDER BY salary DESC) AS quartile

2. 时间序列分析

-- 移动平均 SELECT date, sales, AVG(sales) OVER ( ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW ) AS 7_day_avg FROM daily_sales; -- 环比增长 SELECT month, revenue, (revenue - LAG(revenue, 1) OVER (ORDER BY month)) / LAG(revenue, 1) OVER (ORDER BY month) * 100 AS growth_rate FROM monthly_revenue;

3. 累计计算

-- 累计求和 SELECT date, sales, SUM(sales) OVER ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS running_total FROM sales; -- 累计百分比 SELECT customer, revenue, revenue / SUM(revenue) OVER () * 100 AS pct_total, SUM(revenue) OVER ( ORDER BY revenue DESC ROWS UNBOUNDED PRECEDING ) * 100.0 / SUM(revenue) OVER () AS cum_pct FROM customers;

4. 数据比较

-- 与前一行比较 SELECT date, temperature, temperature - LAG(temperature, 1) OVER (ORDER BY date) AS diff_prev, LEAD(temperature, 1) OVER (ORDER BY date) - temperature AS diff_next FROM weather_data; -- 与分组内第一行比较 SELECT department, employee, salary, FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) as top_salary, salary - FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) as gap_from_top FROM employees;

5. 高级分析场景

-- 会话分析(找出用户连续访问) SELECT user_id, login_time, LAG(logout_time) OVER (PARTITION BY user_id ORDER BY login_time) as prev_logout, CASE WHEN login_time <= LAG(logout_time) OVER (PARTITION BY user_id ORDER BY login_time) THEN 0 ELSE 1 END as is_new_session FROM user_sessions; -- 查找重复记录 SELECT *, ROW_NUMBER() OVER (PARTITION BY email, name ORDER BY created_at) as dup_count FROM users WHERE dup_count > 1; -- 计算留存率 WITH user_activity AS ( SELECT user_id, login_date, MIN(login_date) OVER (PARTITION BY user_id) as first_login, LEAD(login_date) OVER (PARTITION BY user_id ORDER BY login_date) as next_login FROM logins ) SELECT first_login as cohort_date, COUNT(DISTINCT user_id) as cohort_size, COUNT(DISTINCT CASE WHEN next_login = first_login + INTERVAL '1 day' THEN user_id END) as day1_retained FROM user_activity GROUP BY first_login;

6. 复杂业务场景

-- 漏斗分析 WITH funnel AS ( SELECT user_id, MAX(CASE WHEN event = 'view' THEN 1 ELSE 0 END) as viewed, MAX(CASE WHEN event = 'click' THEN 1 ELSE 0 END) as clicked, MAX(CASE WHEN event = 'purchase' THEN 1 ELSE 0 END) as purchased FROM events GROUP BY user_id ) SELECT COUNT(*) as total_users, SUM(viewed) as viewers, SUM(clicked) as clickers, SUM(purchased) as buyers, 100.0 * SUM(clicked) / NULLIF(SUM(viewed), 0) as click_rate FROM funnel; -- 间隔计算 SELECT user_id, event_time, EXTRACT(EPOCH FROM (event_time - LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) )) as seconds_since_last_event FROM events;

7. 性能优化技巧

-- 避免自连接 -- 传统方式(需要自连接) SELECT a.id, a.value, MAX(b.value) as max_so_far FROM table a JOIN table b ON a.id >= b.id GROUP BY a.id, a.value; -- 使用窗口函数(更高效) SELECT id, value, MAX(value) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) as max_so_far FROM table;

最佳实践建议:

  1. 注意性能:窗口函数在大量数据上可能较慢,合理使用分区

  2. 结合索引:ORDER BY子句中的字段建议有索引

  3. 使用FILTER(如果数据库支持):

    AVG(salary) FILTER (WHERE department = 'Sales') OVER () as avg_sales_salary
  4. 明确窗口范围:明确指定ROWS或RANGE避免歧义

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/4/17 13:13:13

PowerDMIS跳转功能

PowerDMIS 中的“跳转”功能是一个非常实用和高效的工具&#xff0c;它允许您在程序的各个部分之间快速导航&#xff0c;尤其是在元素定义、评价和尺寸结果之间建立直观的链接。 1.跳转到引用元素 这个功能用于从评价的尺寸结果&#xff0c;反向定位到它所引用的原始几何元素 。…

作者头像 李华
网站建设 2026/4/13 17:15:47

如何看谷歌刚刚发布的Nano Banana Pro模型?

谷歌最近发布的 Nano Banana Pro 模型引起了人工智能领域的广泛关注&#xff0c;作为一款新一代的 图像生成模型&#xff0c;它不仅继承了前辈们的优势&#xff0c;还在多个维度上进行了突破和创新。那么&#xff0c;为什么Nano Banana Pro会成为焦点&#xff1f;它到底具备哪些…

作者头像 李华
网站建设 2026/4/10 15:35:55

飞桨PaddlePaddle入门与核心机制详解

飞桨PaddlePaddle入门与核心机制详解 在深度学习领域&#xff0c;选择一个合适的框架往往决定了项目从研发到落地的效率。当你面对TensorFlow、PyTorch等国际主流框架时&#xff0c;是否曾考虑过一个更贴近中国开发者需求的选择&#xff1f;百度开源的飞桨&#xff08;PaddlePa…

作者头像 李华
网站建设 2026/4/16 10:46:52

高并发下的图片上传问题处理

图片上传优化与7层负载均衡实践&#xff1a;解决高并发场景下的带宽瓶颈 写在前面 最近在做一个充电桩安装业务系统&#xff0c;安装师傅每天安装完充电桩后需要上传大量的现场图片&#xff08;每单可能10-20张&#xff0c;每张2-5MB&#xff09;。随着业务量增长&#xff0c;每…

作者头像 李华
网站建设 2026/4/17 7:49:11

Excalidraw使用技巧:高效绘图与AI转化

Excalidraw 使用技巧&#xff1a;高效绘图与 AI 转化 你有没有遇到过这样的场景&#xff1f;在一场远程会议中&#xff0c;产品经理刚讲完需求&#xff0c;你立刻被点名&#xff1a;“来&#xff0c;画个流程图梳理一下。”这时候&#xff0c;打开绘图软件、拖拽形状、对齐连线…

作者头像 李华