Python: connecting to Oracle and counting missing values for every column of a given table

Reference: Connecting to Oracle from Python -- a detailed guide to using sqlalchemy and cx_Oracle

Missing-value count for a given Oracle table -- based on cx_Oracle

import pandas as pd
import cx_Oracle as orcl

# Count the missing (NULL) values of every column in a table
def missing_count(table_name, where_condition=None, **engine):
    # where_condition may be a str (raw SQL condition) or a dict of column -> value.
    # ConnectOracle is not defined in this post; it is presumably the cx_Oracle
    # wrapper from the reference article, whose select_oracle returns a DataFrame.
    sql_tab_columns = ("select column_name from user_tab_columns "
                       "where table_name = '{}'".format(table_name))
    db = ConnectOracle(**engine)
    #sql_select.encode('utf-8')
    columns = db.select_oracle(sql=sql_tab_columns)
    # Build the select list: one NULL counter per column
    ss = ', '.join('sum(decode({0}, null, 1, 0)) as {0}'.format(col)
                   for col in columns.COLUMN_NAME)
    # Build the where clause
    wh = ''
    if where_condition:
        if isinstance(where_condition, str):
            wh = ' where ' + where_condition
        elif isinstance(where_condition, dict):
            conds = []
            for key, value in where_condition.items():
                if isinstance(value, str):
                    # String values are quoted
                    conds.append("t.{} = '{}'".format(key, value))
                else:
                    conds.append('t.{} = {}'.format(key, value))
            wh = ' where ' + ' and '.join(conds)
    sql_select = 'select count(*) as counts, {} from {} t{}'.format(ss, table_name, wh)
    #print(sql_select)
    res = db.select_oracle(sql=sql_select)
    # One row comes back; return it as a Series indexed by column name
    return pd.Series(res.values.tolist()[0], index=res.columns)
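The SQL that missing_count assembles can be previewed without a database connection. The following is a minimal sketch, not part of the original post: the helper name build_missing_count_sql and the sample table and column names are made up for illustration, and the logic simply mirrors the string building in the function above.

```python
def build_missing_count_sql(table_name, column_names, where_condition=None):
    """Return the SQL that counts rows and NULLs per column (no database access)."""
    # One NULL counter per column, mirroring missing_count.
    select_items = ', '.join(
        'sum(decode({0}, null, 1, 0)) as {0}'.format(col) for col in column_names
    )
    where_clause = ''
    if isinstance(where_condition, str) and where_condition:
        where_clause = ' where ' + where_condition
    elif isinstance(where_condition, dict) and where_condition:
        conditions = []
        for key, value in where_condition.items():
            # Strings are quoted; other values are inserted as-is.
            template = "t.{} = '{}'" if isinstance(value, str) else 't.{} = {}'
            conditions.append(template.format(key, value))
        where_clause = ' where ' + ' and '.join(conditions)
    return 'select count(*) as counts, {} from {} t{}'.format(
        select_items, table_name, where_clause
    )


# Hypothetical table and columns, for illustration only.
sql = build_missing_count_sql('DEMO_TABLE', ['ID', 'NAME'], {'is_dead': 0, 'city': 'KM'})
print(sql)
```

Printing the statement before running it makes it easier to spot a malformed where clause or an unexpected column name.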

Missing-value count, version 2 -- based on sqlalchemy

import pandas as pd
#import cx_Oracle as orcl
from sqlalchemy import create_engine

# Count the missing (NULL) values of every column in a table
def missing_count(table_name, where_condition=None, **config):
    # where_condition may be a str (raw SQL condition) or a dict of column -> value.
    # Define the database connection, e.g.
    # 'oracle://qmcbrt:qmcbrt@10.85.31.20:1521/tqmcbdb'
    # The database-name part of the URL uses different syntax in different versions.
    engine = 'oracle://{username}:{passwd}@{host}:{port}/{sid}'.format(**config)
    db = create_engine(engine)
    # Query the column names, used to build the select list
    sql_tab_columns = "select column_name from user_tab_columns where table_name = '{}'".format(table_name)
    columns = pd.read_sql_query(sql_tab_columns, db)
    # Build the select list: one NULL counter per column
    ss = ', '.join('sum(decode({0}, null, 1, 0)) as {0}'.format(col)
                   for col in columns.column_name)
    # Build the where clause
    wh = ''
    if where_condition:
        if isinstance(where_condition, str):
            wh = ' where ' + where_condition
        elif isinstance(where_condition, dict):
            conds = []
            for key, value in where_condition.items():
                if isinstance(value, str):
                    # String values are quoted
                    conds.append("t.{} = '{}'".format(key, value))
                else:
                    conds.append('t.{} = {}'.format(key, value))
            wh = ' where ' + ' and '.join(conds)
    # Full select statement
    sql_select = 'select count(*) as counts, {} from {} t{}'.format(ss, table_name, wh)
    res = pd.read_sql_query(sql_select, db)
    # One row comes back; return it as a Series indexed by column name
    return res.iloc[0, :]
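Both versions paste the where values straight into the SQL text, which breaks on values containing quotes and is open to SQL injection. A possible alternative, sketched here under assumptions, is to emit named bind placeholders (:p0, :p1, ...) and pass the values separately. The helper name build_where_with_binds and the p0/p1 naming scheme are hypothetical and not from the original post. Column names cannot be bound, so they are still interpolated and should come from a trusted source.

```python
def build_where_with_binds(where_condition):
    """Turn a dict of column -> value into (' where ...', params) using named binds."""
    if not where_condition:
        return '', {}
    conditions = []
    params = {}
    for i, (column, value) in enumerate(where_condition.items()):
        bind_name = 'p{}'.format(i)  # hypothetical naming scheme for bind variables
        # The column name is interpolated; only the value becomes a bind variable.
        conditions.append('t.{} = :{}'.format(column, bind_name))
        params[bind_name] = value
    return ' where ' + ' and '.join(conditions), params


clause, params = build_where_with_binds({'is_normal': 1, 'is_dead': 0})
print(clause)
print(params)
```

With SQLAlchemy, the clause and the parameters could then presumably be supplied as pd.read_sql_query(sqlalchemy.text(sql_select), db, params=params). This usage is an assumption about how the pieces would fit together and has not been run against the database in this post.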

Example

config = {
    'username': 'qmcb',
    'passwd': 'qmcb',
    'host': 'localhost',
    'port': '1521',
    'sid': 'tqmcbdb'
}

where_condition = {
    'is_normal': 1,
    'is_below_16': 0,
    'is_xs': 0,
    'is_cj': 0,
    'is_dead': 0,
    'AAE138_is_not_null': 0,
    'is_dc': 0,
    'is_px': 0
}

# Count the missing values in table QMCB_KM_2019_1_31_1
missing_count('QMCB_KM_2019_1_31_1', where_condition, **config)
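missing_count returns raw counts, with the total row count under counts. If a missing rate per column is wanted, it can be derived from that result. The sketch below assumes the result behaves like a mapping with an items() method (a pandas Series does, and so does a plain dict). The helper name missing_rates and the sample numbers are made up for illustration and are not real query results.

```python
def missing_rates(counts):
    """Convert the output of missing_count into per-column missing rates."""
    items = dict(counts.items())
    # The total may be labelled 'counts' or 'COUNTS' depending on the driver.
    total_key = next(k for k in items if str(k).lower() == 'counts')
    total = items.pop(total_key)
    if total == 0:
        # An empty table (or an empty filter result) has no missing values to rate.
        return {column: 0.0 for column in items}
    return {column: missing / total for column, missing in items.items()}


# Made-up numbers for illustration only.
example = {'counts': 200, 'id': 0, 'name': 50, 'birthday': 20}
rates = missing_rates(example)
print(rates)
```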
