hivesql连续日期统计最大逾期未逾期案例
hivesql连续日期统计最大逾期/未逾期案例
1、虚表(测试表和数据) create test_table as
select a.cust_no, a.r_date, a.yqts from (
select '123' as cust_no, '20231101' as r_date, 0 as yqts
union all
select '123' as cust_no, '20231102' as r_date, 1 as yqts
union all
select '123' as cust_no, '20231103' as r_date, 2 as yqts
union all
select '123' as cust_no, '20231104' as r_date, 3 as yqts
union all
select '123' as cust_no, '20231105' as r_date, 0 as yqts
union all
select '123' as cust_no, '20231106' as r_date, 0 as yqts
union all
select '123' as cust_no, '20231107' as r_date, 1 as yqts
) a
2、加入有表test_table,数据如上,0标识未逾期,1标识逾期,求连续最大无逾期和逾期的天数,以上数据可直接看出最大连续逾期天数为3、未逾期未2。
3、首先将测试数据排序,让数据连续
select a.cust_no, a.r_date, a.yqts from test_table a where 1=1 order by a.cust_no, a.r_date asc
4、将数据分组并给是否逾期打上标识1逾期,2未逾期,多行转一行
select tx.cust_no, count(distinct tx.r_date) as sum_ctn, concat_ws('',collection_list(case when cast(tx.yqts as double) > 0 then '1' else '2' end)) as sfflag from (
select a.cust_no, a.r_date, a.yqts from test_table a where 1=1 order by a.cust_no, a.r_date asc
) tx
执行结果:
cust_no sum_ctn sfflag
123 7 2111221
5、使用正则表达式分别替换,2111221改字符串中包含逾期和未逾期,将逾期1*都用A替换掉(同时也是分割符),剩下就是未逾期的,逾期的同理
select ty.cust_no, ty.sum_ctn, regexp_replace(ty.sfflag, '\2+', 'A') as yqflag, regexp_replace(ty.sfflag, '\1+', 'A') as wyqflag from (
'123' as cust_no 7 as sum_ctn '2111221' as sfflag
) ty
执行结果:
cust_no sum_ctn yqflag wyqflag
123 7 A111A1 2A22A
6、使用炸裂函数explode结合lateral view将数据拆分未多行,使用一次lateral view会生成一次虚表
select tz.cust_no, max(sum_ctn) as sum_ctn, max(length(yqlength)) as yqlength, max(length(wyqlength)) as wyqlength from (
123 as cust_no 7 as sum_ctn A111A1 as yqflag 2A22A as wyqflag
) tz
lateral view explode(split(yqflag,'A')) tb1 as yqlength
lateral view explode(split(wyqflag,'A')) tb2 as wyqlength
group by tz.cust_no
- IDE模式下安装Win7改回AHCI后蓝屏重启的解决方法
- 各种品牌型号的打印机恢复出厂设置的方法汇总
- uniapp中input框聚焦
- 优思学院:SPC控制线为什么是±3西格玛?
- 股市助手:实时股市快讯,真人语音播报,助您第一时间获取最新资讯(自己写的分享给需要的人)
- 【Python】基础练习题
- acedGetKword关键字使用例子
- P9838 挑战 NPC IV ( NOIP模拟赛T3 )
- MATLAB中train函数用法
- 力扣第496题 下一个更大元素 I C++ 暴力
- 【Linux】第十六站:进程地址空间
- 利用Python群组分析方法剖析客户行为
- 什么是智能井盖?万宾科技的智能井盖传感器的效果
- [100天算法】
- Linux上C++通过LDAP协议使用kerberos认证AES加密连接到AD服务器
- BMC Helix解决方案落地亚马逊云科技中国区域,同时上线Marketplace
- 机器视觉系统的组成