MySQL 计算新登录用户次日留存率
2026/5/6 9:05:29 网站建设 项目流程

本文介绍了三种计算MySQL中新登录用户次日留存率的SQL解法。


解法一使用自连接和子查询,兼容性好且性能优秀;


解法二基于MySQL 8.0+的窗口函数;


解法三通过WITH子句实现清晰逻辑。


核心步骤包括:

1)获取用户首次登录日期;

2)判断次日是否登录;

3)计算留存率(留存用户数/新用户总数)。


文章还对比了不同写法,并扩展了按日期、设备统计及多日留存率的场景,最后强调了整数除法转换和日期处理等注意事项。


MySQL 计算新登录用户次日留存率


解法一:使用自连接 + 子查询(推荐)

sql

SELECT ROUND( COUNT(DISTINCT t1.user_id) * 1.0 / (SELECT COUNT(DISTINCT user_id) FROM login WHERE (user_id, date) IN (SELECT user_id, MIN(date) FROM login GROUP BY user_id)), 3 ) AS p FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) t1 JOIN login l ON t1.user_id = l.user_id AND l.date = DATE_ADD(t1.first_date, INTERVAL 1 DAY);

解法二:使用窗口函数(MySQL 8.0+)

sql

SELECT ROUND( COUNT(DISTINCT CASE WHEN l.date = DATE_ADD(t.first_date, INTERVAL 1 DAY) THEN l.user_id END) * 1.0 / COUNT(DISTINCT t.user_id), 3 ) AS p FROM ( SELECT user_id, MIN(date) OVER (PARTITION BY user_id) AS first_date FROM login ) t JOIN login l ON t.user_id = l.user_id GROUP BY t.first_date;

解法三:使用两个子查询(最清晰)

sql

WITH first_login AS ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ), retention AS ( SELECT f.user_id, f.first_date, CASE WHEN l.date IS NOT NULL THEN 1 ELSE 0 END AS is_retained FROM first_login f LEFT JOIN login l ON f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) ) SELECT ROUND(SUM(is_retained) * 1.0 / COUNT(*), 3) AS p FROM retention;

原理说明

计算步骤:

  1. 找出每个用户首次登录日期

    sql

    SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id
  2. 判断次日是否登录

    • 用户1:首次登录2020-10-12,次日(2020-10-13)有登录 → 留存 ✅

    • 用户2:首次登录2020-10-12,次日(2020-10-13)有登录 → 留存 ✅

    • 用户3:首次登录2020-10-12,次日(2020-10-13)无登录 → 未留存 ❌

    • 用户4:首次登录2020-10-13,次日(2020-10-14)无登录 → 未留存 ❌

  3. 计算留存率

    • 留存用户数:2

    • 新用户总数:4

    • 留存率:2/4 = 0.5 → 0.500

完整示例(建表 + 数据 + 查询)

sql

-- 建表 CREATE TABLE `login` ( `id` int(4) NOT NULL, `user_id` int(4) NOT NULL, `client_id` int(4) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`id`) ); -- 插入数据 INSERT INTO login VALUES (1,2,1,'2020-10-12'), (2,3,2,'2020-10-12'), (3,1,2,'2020-10-12'), (4,2,2,'2020-10-13'), (5,4,1,'2020-10-13'), (6,1,2,'2020-10-13'), (7,1,2,'2020-10-14'); -- 查询(解法一详细版) SELECT ROUND( COUNT(DISTINCT retained.user_id) * 1.0 / COUNT(DISTINCT first.user_id), 3 ) AS p FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) first LEFT JOIN login retained ON first.user_id = retained.user_id AND retained.date = DATE_ADD(first.first_date, INTERVAL 1 DAY);

各种写法对比

写法1:COUNT + CASE WHEN

sql

SELECT ROUND( COUNT(CASE WHEN l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) THEN 1 END) * 1.0 / COUNT(*), 3 ) AS p FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) f LEFT JOIN login l ON f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY);

写法2:使用 EXISTS

sql

SELECT ROUND( COUNT(*) * 1.0 / ( SELECT COUNT(DISTINCT user_id) FROM login WHERE (user_id, date) IN (SELECT user_id, MIN(date) FROM login GROUP BY user_id) ), 3 ) AS p FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) f WHERE EXISTS ( SELECT 1 FROM login l WHERE f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) );

扩展场景

场景1:计算不同日期的次日留存率

sql

SELECT first_date, ROUND( COUNT(CASE WHEN l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) THEN 1 END) * 1.0 / COUNT(*), 3 ) AS retention_rate FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) f LEFT JOIN login l ON f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) GROUP BY first_date ORDER BY first_date;

场景2:计算3日、7日留存率

sql

SELECT ROUND( COUNT(CASE WHEN l3.date IS NOT NULL THEN 1 END) * 1.0 / COUNT(*), 3 ) AS retention_3day FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) f LEFT JOIN login l3 ON f.user_id = l3.user_id AND l3.date = DATE_ADD(f.first_date, INTERVAL 3 DAY);

场景3:按设备统计留存率

sql

SELECT client_id, ROUND( COUNT(CASE WHEN l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) THEN 1 END) * 1.0 / COUNT(*), 3 ) AS retention_rate FROM ( SELECT user_id, MIN(date) AS first_date, client_id FROM login GROUP BY user_id ) f LEFT JOIN login l ON f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) GROUP BY client_id;

MySQL vs SQLite 语法对比

操作MySQLSQLite
日期加一天DATE_ADD(date, INTERVAL 1 DAY)date(date, '+1 day')
四舍五入ROUND(x, 3)ROUND(x, 3)
防止整数除法* 1.0 /* 1.0 /

验证计算过程

sql

-- 查看每个用户的新登录日期和次日登录情况 SELECT f.user_id, f.first_date, CASE WHEN l.date IS NOT NULL THEN '留存' ELSE '未留存' END AS status, l.date AS next_day_login FROM ( SELECT user_id, MIN(date) AS first_date FROM login GROUP BY user_id ) f LEFT JOIN login l ON f.user_id = l.user_id AND l.date = DATE_ADD(f.first_date, INTERVAL 1 DAY) ORDER BY f.user_id;

结果:

user_idfirst_datestatusnext_day_login
12020-10-12留存2020-10-13
22020-10-12留存2020-10-13
32020-10-12未留存NULL
42020-10-13未留存NULL

注意事项

  1. 新用户定义:首次登录的用户,不是所有登录用户

  2. 次日定义:首次登录日期的后一天(DATE_ADD(first_date, INTERVAL 1 DAY)

  3. 一天多次登录:只要次日有登录即可,使用DISTINCT去重

  4. 整数除法:必须* 1.0CAST(... AS DECIMAL)

  5. 保留三位小数:使用ROUND(..., 3)


推荐使用解法一(自连接 + 子查询),兼容性好,逻辑清晰,性能优秀。

需要专业的网站建设服务?

联系我们获取免费的网站建设咨询和方案报价,让我们帮助您实现业务目标

立即咨询