Flattening deeply nested JSON into rows and columns

Background

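Our company's middle-platform product has to collect the JSON returned by external API endpoints and ingest it into the data lake. Sometimes that JSON is nested quite deeply. The sample payload used in the test at the end of this post is one such case.

In that payload, the outermost level is the response object, and data holds the business data, which is nested as school / class / student.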

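When the data is ingested into the lake, it has to be split into rows and columns from the perspective of the innermost level, the student: one row per student, with each row also carrying the fields of the enclosing class, school, and response object.

Because the JSON returned by the external APIs we integrate with has no uniform, fixed structure, we need an algorithm that traverses and drills into every level of objects and arrays in order to flatten the JSON.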
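Before the full implementation, the traversal idea can be sketched in a few lines. This is only an illustration and not the code used in this post: the class name FlattenSketch and the helpers flatten, isObjectArray, and obj are hypothetical, and the sketch returns each row as a map instead of linking nodes to their parents. Scalar fields are copied into the current row; nested objects and arrays of objects are drilled into, with every child inheriting the row built so far.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlattenSketch {

    // True for a non-empty list whose first element is a JSON object.
    static boolean isObjectArray(Object v) {
        return v instanceof List && !((List<?>) v).isEmpty() && ((List<?>) v).get(0) instanceof Map;
    }

    // Flattens one object. "name" is the key the object sits under (null for the root),
    // "inherited" holds the columns collected from its ancestors.
    static List<Map<String, Object>> flatten(String name, Map<String, Object> obj,
                                             Map<String, Object> inherited) {
        Map<String, Object> row = new LinkedHashMap<>(inherited);
        List<Map.Entry<String, Object>> nested = new ArrayList<>();
        for (Map.Entry<String, Object> e : obj.entrySet()) {
            Object v = e.getValue();
            if (v instanceof Map || isObjectArray(v)) {
                nested.add(e); // drill into these after all scalars are collected
            } else {
                row.put(name == null ? e.getKey() : name + "." + e.getKey(), v);
            }
        }
        if (nested.isEmpty()) {
            return Collections.singletonList(row); // innermost level: one finished row
        }
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map.Entry<String, Object> e : nested) {
            Object v = e.getValue();
            List<?> items = v instanceof Map ? Collections.singletonList(v) : (List<?>) v;
            for (Object item : items) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) item;
                rows.addAll(flatten(e.getKey(), child, row));
            }
        }
        return rows;
    }

    // Hypothetical helper for building sample objects: obj("key1", value1, "key2", value2, ...).
    static Map<String, Object> obj(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> school = obj(
                "school", "School A",
                "class", Arrays.asList(
                        obj("name", "Class 1", "student", Arrays.asList(
                                obj("name", "Zhang", "age", 6),
                                obj("name", "Wang", "age", 7)))));
        for (Map<String, Object> row : flatten(null, school, new LinkedHashMap<>())) {
            System.out.println(row);
        }
    }
}
```

With this sample, each student becomes one row that repeats the school and class fields. Note that sibling nested fields each contribute their own rows; the sketch does not build a cross product between them.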

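I looked at some JSON flattening libraries online, such as Json2Flat, but their flattening did not quite work for this case: nested arrays that span multiple levels are not supported.

So I decided to implement the flattening myself.

The key code is as follows: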

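import java.util.Map;

// One JSON object in the tree, linked to the object that contains it.
public class LinkedNode {

    // Enclosing object; null for the root.
    private final LinkedNode parent;
    // Key under which this object appears in its parent; null for the root.
    private final String parentName;
    // Fields of this object, as parsed from the JSON.
    private final Map<String, Object> data;

    public LinkedNode(LinkedNode parent, String parentName, Map<String, Object> data) {
        this.parent = parent;
        this.parentName = parentName;
        this.data = data;
    }

    // Getters used by JSONFlatProcessor.
    public LinkedNode getParent() {
        return parent;
    }

    public String getParentName() {
        return parentName;
    }

    public Map<String, Object> getData() {
        return data;
    }
}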

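import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class JSONFlatProcessor {

    // Leaf nodes (objects without a nested object or object array); each one becomes a row.
    private LinkedList<LinkedNode> nodes;
    // Column names, rebuilt on every getColumn() call.
    private LinkedList<String> column;
    // Row values, rebuilt on every getData() call.
    private List<Object[]> data;

    // Visits one JSON object: records it as a leaf, or drills into its nested objects and arrays.
    @SuppressWarnings("unchecked")
    public void find(LinkedNode parent, String parentName, Map<String, Object> data) {
        LinkedNode node = new LinkedNode(parent, parentName, data);
        if (!hasObjectOrArray(data)) {
            nodes.add(node);
        } else {
            for (Map.Entry<String, Object> entry : data.entrySet()) {
                if (entry.getValue() instanceof Map) {
                    find(node, entry.getKey(), (Map<String, Object>) entry.getValue());
                } else if (isObjectArray(entry.getValue())) {
                    find(node, entry.getKey(), (List<Map<String, Object>>) entry.getValue());
                }
            }
        }
    }

    // Visits every element of an array of objects; all elements share the same parent and key.
    public void find(LinkedNode parent, String parentName, List<Map<String, Object>> data) {
        for (Map<String, Object> item : data) {
            find(parent, parentName, item);
        }
    }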

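    // True if any field of the object is a nested object or a non-empty array of objects.
    protected boolean hasObjectOrArray(Map<String, Object> item) {
        for (Object field : item.values()) {
            if (field instanceof Map || isObjectArray(field)) {
                return true;
            }
        }
        return false;
    }

    // True for a non-empty list whose first element is an object. Only the first element is
    // inspected, so arrays are assumed to be homogeneous; empty arrays and arrays of scalars
    // are treated as plain values.
    protected boolean isObjectArray(Object object) {
        return object instanceof List
                && !CollectionUtils.isEmpty((List<?>) object)
                && ((List<?>) object).get(0) instanceof Map;
    }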

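    // Entry point for payloads whose root is an array of objects.
    public JSONFlatProcessor process(List<Map<String, Object>> data) {
        nodes = new LinkedList<>();
        find(null, null, data);
        return this;
    }

    // Entry point for payloads whose root is a single object.
    public JSONFlatProcessor process(Map<String, Object> data) {
        nodes = new LinkedList<>();
        find(null, null, data);
        return this;
    }

    // Leaf nodes found by the last process() call; one per output row.
    public LinkedList<LinkedNode> getNodes() {
        return nodes;
    }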

    // Builds the header from the first leaf node and its ancestors, so every row is assumed
    // to have the same shape as the first one.
    public List<String> getColumn() {
        if (CollectionUtils.isEmpty(nodes)) {
            return Collections.emptyList();
        }
        column = new LinkedList<>();
        collectColumn(nodes.getFirst());
        return column;
    }

    // Prepends this node's scalar field names, then walks up to the root. Each name is
    // prefixed with the key the node sits under (not the full path); root fields have no prefix.
    protected void collectColumn(LinkedNode node) {
        List<String> innerColumn = new ArrayList<>(node.getData().size());
        for (Map.Entry<String, Object> entry : node.getData().entrySet()) {
            if (!(entry.getValue() instanceof Map || isObjectArray(entry.getValue()))) {
                innerColumn.add(null == node.getParentName()
                        ? entry.getKey()
                        : String.format("%s.%s", node.getParentName(), entry.getKey()));
            }
        }
        column.addAll(0, innerColumn);
        if (null != node.getParent()) {
            collectColumn(node.getParent());
        }
    }

    // Builds one row per leaf node, in the same column order as getColumn().
    public List<Object[]> getData() {
        if (CollectionUtils.isEmpty(nodes)) {
            return Collections.emptyList();
        }
        data = new ArrayList<>(nodes.size());
        for (LinkedNode node : nodes) {
            LinkedList<Object> container = new LinkedList<>();
            collectData(node, container);
            data.add(container.toArray());
        }
        return data;
    }

    // Prepends this node's scalar values, then walks up to the root so that ancestor
    // values end up to the left of descendant values.
    protected void collectData(LinkedNode node, LinkedList<Object> container) {
        List<Object> innerData = new ArrayList<>(node.getData().size());
        for (Map.Entry<String, Object> entry : node.getData().entrySet()) {
            if (!(entry.getValue() instanceof Map || isObjectArray(entry.getValue()))) {
                innerData.add(entry.getValue());
            }
        }
        container.addAll(0, innerData);
        if (null != node.getParent()) {
            collectData(node.getParent(), container);
        }
    }

    // Minimal null-safe emptiness check, so no external utility library is needed.
    protected static class CollectionUtils {
        public static boolean isEmpty(Collection<?> collection) {
            return (collection == null || collection.isEmpty());
        }
    }

}

import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Map;

public class MainTests {

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        String jsonStr = "{\"code\":200,\"requestId\":\"1680177848458\",\"data\":[{\"school\":\"xxx市第一实验小学\",\"no\":\"1001\",\"class\":[{\"name\":\"一(1)班\",\"teacher\":\"吴老师\",\"student\":[{\"name\":\"张同学\",\"age\":6},{\"name\":\"王同学\",\"age\":7}]}]},{\"school\":\"xxx市第二实验小学\",\"no\":\"1002\",\"class\":[{\"name\":\"一(2)班\",\"teacher\":\"陈老师\",\"student\":[{\"name\":\"欧阳同学\",\"age\":6}]}]}]}";
        // Jackson parses JSON objects into maps and JSON arrays into lists.
        ObjectMapper jsonMapper = new ObjectMapper();
        // If the root of the payload is an array, read it as a List and pass that to process() instead:
        // List<Map<String, Object>> list = jsonMapper.readValue(jsonStr, List.class);
        Map<String, Object> map = jsonMapper.readValue(jsonStr, Map.class);

        JSONFlatProcessor processor = new JSONFlatProcessor().process(map);
        System.out.println("Row count: " + processor.getNodes().size());
        System.out.println("Columns: " + processor.getColumn());
        System.out.println("First row: " + jsonMapper.writeValueAsString(processor.getData().get(0)));
    }

}

Output:

Row count: 3

Columns: [code, requestId, data.school, data.no, class.name, class.teacher, student.name, student.age]

First row: [200,"1680177848458","xxx市第一实验小学","1001","一(1)班","吴老师","张同学",6]
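Once the column names and rows are available, they still have to be written out in some tabular form. The post does not say which format the lake ingestion expects, so the following is only a sketch that assumes CSV as an intermediate format. The class RowsToCsv and its methods are hypothetical names; the inputs have the same shape as the values returned by getColumn() and getData(). Because the header is derived from the first row only, the sketch also rejects any row whose width differs from the header.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowsToCsv {

    // Quotes a value only when it contains a comma, a double quote, or a line break.
    static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = String.valueOf(value);
        if (s.contains(",") || s.contains("\"") || s.contains("\n") || s.contains("\r")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    // Joins the values of one row into a single CSV line.
    static String joinRow(Object[] values) {
        List<String> cells = new ArrayList<>(values.length);
        for (Object value : values) {
            cells.add(escape(value));
        }
        return String.join(",", cells);
    }

    // Renders a header plus rows as CSV text; fails fast on rows that do not match the header width.
    static String toCsv(List<String> columns, List<Object[]> rows) {
        List<String> lines = new ArrayList<>();
        lines.add(joinRow(columns.toArray()));
        for (Object[] row : rows) {
            if (row.length != columns.size()) {
                throw new IllegalArgumentException(
                        "Row has " + row.length + " values but the header has " + columns.size());
            }
            lines.add(joinRow(row));
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        // Sample values made up for this sketch.
        List<String> columns = Arrays.asList("data.school", "class.name", "student.name", "student.age");
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"School A", "Class 1, Grade 1", "Zhang", 6});
        rows.add(new Object[] {"School A", "Class 1, Grade 1", "Wang", 7});
        System.out.println(toCsv(columns, rows));
    }
}
```

The width check matters when the payload is not uniform, for example when one school has no class array at all: its leaf node would then produce a shorter row than the header taken from the first leaf.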
