[Python] Importing data from Excel files into MySQL
Github Link
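Requirements
There are 2000+ folders, each containing several Excel files. The data in these Excel files needs to be imported into MySQL.
The first row of every Excel file is invalid data.
Besides the data already in the Excel files, one extra column named "at_company" must be added, with the value 821.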
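The two data rules above (skip the first row, add an at_company column) can be illustrated without any Excel library. This is a minimal sketch, assuming a sheet has already been read into a list of rows; the helper name `split_sheet` and the sample rows are hypothetical and not part of the original program.

```python
AT_COMPANY = "821"  # constant value required for the extra column


def split_sheet(rows):
    """Drop the invalid first row; return (column names, data rows).

    Assumes rows[1] holds the column titles, as in the files described above.
    """
    if len(rows) < 2:
        return [], []
    titles = [str(title).strip().lower().replace(" ", "_") for title in rows[1]]
    col_names = ["at_company"] + titles
    # Prepend the constant at_company value to every data row.
    data = [[AT_COMPANY] + list(row) for row in rows[2:]]
    return col_names, data


sample = [
    ["exported by some tool"],  # invalid first row (made-up content)
    ["Name", "Join Date"],      # column titles
    ["Alice", "2015-01-02"],
    ["Bob", "2015-03-04"],
]
col_names, data = split_sheet(sample)
print(col_names)
print(data)
```

The program in this post gets the same effect differently: it declares at_company with a default value in the table definition, so the value never has to be added to each row.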
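Procedure
(1) Get the list of Excel files, and derive from each file name the name of the table to be created later;
(2) Connect to MySQL
(3) Create the tables
(4) Insert the data
(5) Disconnect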
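The five steps can be sketched end to end as follows. To stay runnable without a MySQL server, this sketch swaps in Python's built-in `sqlite3` module for mysql-connector-python and uses hard-coded rows instead of an Excel file; the table name, column names, and rows are made up for illustration. Note that `sqlite3` uses `?` placeholders where mysql-connector-python uses `%s`.

```python
import sqlite3

# (1) Pretend this came from one Excel file; names and rows are made up.
table_name = "sample_table"
col_names = ["name", "city"]
rows = [("Alice", "Berlin"), ("Bob", "Paris")]

# (2) Connect (in-memory SQLite as a stand-in for MySQL).
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# (3) Create the table; the extra at_company column defaults to '821'.
columns_sql = ", ".join(name + " varchar(150)" for name in col_names)
cursor.execute(
    "create table " + table_name
    + " (id integer primary key autoincrement, "
    + "at_company varchar(10) default '821', " + columns_sql + ")"
)

# (4) Insert the data with a parameterized statement.
placeholders = ", ".join("?" for _ in col_names)
insert_sql = "insert into %s (%s) values (%s)" % (
    table_name, ", ".join(col_names), placeholders)
cursor.executemany(insert_sql, rows)
conn.commit()

stored = cursor.execute(
    "select at_company, name, city from " + table_name + " order by id"
).fetchall()
print(stored)

# (5) Disconnect.
cursor.close()
conn.close()
```

Because at_company is left out of the insert statement, every stored row picks up the default value.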
Dependencies
1. xlrd # to read excel files
2. mysql-connector-python # to work with MySQL
Source code
# -*- coding: utf-8 -*-
import datetime
import os
import sys

import mysql.connector
import xlrd

'''
the main function to import data
username: username of mysql database
password: password for username
database: a specific database in mysql
datapath: the absolute path or relative path of data folder
'''
def importDataHelper(username, password, database, datapath):
    '''import data helper'''
    '''
    Step 0: Validate input database parameters
    '''
    try:
        conn = mysql.connector.connect(user=username, password=password, database=database, use_unicode=True)
    except mysql.connector.errors.ProgrammingError as e:
        print(e)
        return -1
    '''
    Step 1: Traverse files in datapath, store file paths and corresponding table names in lists
    lists[0] is the list of files paths
    lists[1] is the list of table names
    '''
    lists = getFilesList(datapath)
    nfiles = len(lists[0])
    '''
    Step 2: Store data in mysql via a for-loop
    '''
    cursor = conn.cursor()
    for file_idx in range(0, nfiles):
        file_path = lists[0][file_idx]
        print("processing file(%d/%d):[ %s ]" % (file_idx+1, nfiles, file_path))
        table_name = lists[1][file_idx]
        num = storeData(file_path, table_name, cursor)
        if num >= 0:
            print("[ %d ] data have been stored in TABLE:[ %s ]" % (num, table_name))
            conn.commit()
    cursor.close()
    '''
    Step 3: Close connection
    '''
    conn.close()

'''
get files list in the dir, including the files in its sub-folders
the returned list contains two elements: the first element is a list of file paths
and the second element is a list of table names (used for creating tables in the database)
'''
def getFilesList(dir):
    path_list = []
    table_list = []
    file_name_list = os.listdir(dir)
    for file_name in file_name_list:
        path = os.path.join(dir, file_name)
        if os.path.isdir(path):
            '''get the files in sub folder recursively'''
            tmp_lists = getFilesList(path)
            path_list.extend(tmp_lists[0])
            table_list.extend(tmp_lists[1])
        else:
            path_list.append(path)
            '''convert file name to mysql table name'''
            file_name = file_name.split('.')[0] #remove .xls
            # file_name = file_name.split('from')[0] #remove characters after 'from'
            file_name = file_name.strip() #remove redundant space at both ends
            file_name = file_name.replace(' ','_') #replace ' ' with '_'
            file_name = file_name.replace('-','_') #replace '-' with '_'
            file_name = file_name.lower() #convert all characters to lowercase
            table_list.append(file_name)
    return [path_list, table_list]

'''
store the data of file file_path in table table_name
file_path: file location
table_name: name of the table that will be created in database
cursor: a mysql cursor
'''
def storeData(file_path, table_name, cursor):
    ret = 0
    '''open an excel file'''
    workbook = xlrd.open_workbook(file_path)
    '''get the first sheet'''
    sheet = workbook.sheet_by_index(0)
    '''get the number of rows and columns'''
    nrows = sheet.nrows
    ncols = sheet.ncols
    '''get column names (row index 1, because the first row is invalid data)'''
    col_names = []
    for i in range(0, ncols):
        title = sheet.cell(1, i).value
        title = title.strip()
        title = title.replace(' ','_')
        title = title.lower()
        col_names.append(title)
    '''create table in mysql'''
    sql = 'create table '\
        +table_name+' (' \
        +'id int NOT NULL AUTO_INCREMENT PRIMARY KEY, ' \
        +'at_company varchar(10) DEFAULT \'821\', '
    for i in range(0, ncols):
        sql = sql + col_names[i] + ' varchar(150)'
        if i != ncols-1:
            sql += ','
    sql = sql + ')'
    try:
        cursor.execute(sql)
    except mysql.connector.errors.ProgrammingError as e:
        print(e)
        # return -1
    '''insert data'''
    #construct sql statement
    sql = 'insert into '+table_name+'('
    for i in range(0, ncols-1):
        sql = sql + col_names[i] + ', '
    sql = sql + col_names[ncols-1]
    sql += ') values ('
    sql = sql + '%s,'*(ncols-1)
    sql += '%s)'
    #get parameters (data starts at row index 2)
    for row in range(2, nrows):
        #start every row with an empty list, so a failed insert cannot leak values into the next row
        parameter_list = []
        for col in range(0, ncols):
            cell_type = sheet.cell_type(row, col)
            cell_value = sheet.cell_value(row, col)
            if cell_type == xlrd.XL_CELL_DATE:
                #date cells are stored as floats; convert them to a readable string
                dt_tuple = xlrd.xldate_as_tuple(cell_value, workbook.datemode)
                meta_data = str(datetime.datetime(*dt_tuple))
            else:
                meta_data = cell_value
            parameter_list.append(meta_data)
        try:
            cursor.execute(sql, parameter_list)
            ret += 1
        except mysql.connector.errors.ProgrammingError as e:
            print(e)
            # return -1
    return ret

if __name__ == "__main__":
    if len(sys.argv)<5:
        print("Missing Parameters")
        sys.exit()
    elif len(sys.argv)>5:
        print("Too Many Parameters")
        sys.exit()
    username = sys.argv[1]
    password = sys.argv[2]
    database = sys.argv[3]
    datapath = sys.argv[4]
    importDataHelper(username, password, database, datapath)
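The step-by-step string concatenation in storeData can also be written with `str.join`. The following is a minimal sketch of that idea, not the author's code: `build_statements` is a hypothetical helper, and it assumes the table and column names are already safe identifiers, since identifiers cannot be passed as query parameters.

```python
def build_statements(table_name, col_names):
    """Return (create_sql, insert_sql) for one table.

    The statements match the ones assembled in storeData, up to whitespace.
    """
    column_defs = ", ".join(name + " varchar(150)" for name in col_names)
    create_sql = (
        "create table " + table_name + " ("
        "id int NOT NULL AUTO_INCREMENT PRIMARY KEY, "
        "at_company varchar(10) DEFAULT '821', "
        + column_defs + ")"
    )
    # One %s placeholder per column; the driver fills them in at execute time.
    insert_sql = "insert into %s (%s) values (%s)" % (
        table_name,
        ", ".join(col_names),
        ", ".join(["%s"] * len(col_names)),
    )
    return create_sql, insert_sql


create_sql, insert_sql = build_statements("staff", ["name", "join_date"])
print(create_sql)
print(insert_sql)
```

Only the cell values go through `%s` placeholders; the table and column names are still pasted into the SQL text, so they should come from a trusted or sanitized source.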
Readme file (written for my advisor, hence in English)
Two dependency modules need to be installed:
1. xlrd # to read excel files
2. mysql-connector-python # to work with MySQL
Directory Structure:
data_path: test files
ImportDataProgram.py: the main program
Procedure:
(1) Get the paths and names of all the files that need to be stored
(2) Connect to MySQL
(3) Create a table for each file
(4) Insert data into each table
Usage:
0. Create a new database in MySQL.
For example, after logging in to mysql in a terminal, you can use the following command
"create database test_database" to create a database named 'test_database';
you can replace "test_database" with any other name you like.
1. Choose the username, password, database (created in step 0) and datapath; ImportDataProgram.py reads them from the command line.
2. Run ImportDataProgram.py with the following command:
python ImportDataProgram.py [username] [password] [database] [datapath]
# username: your MySQL username
# password: the corresponding password
# database: the database you specified
# datapath: the directory of excel files
e.g.
python ImportDataProgram.py root root test_database data_path
PS:
(1) The Length of Data in Tables
I am not sure of the maximum length of the data, so every data column in the
MySQL tables is limited to 150 characters (see function
storeData(file_path, table_name, cursor); the code is
" sql = sql + col_names[i] + ' varchar(150)' "). You can adjust it according
to your requirements.
(2) Table Names
You can change the rules for table names; the code follows the comment
'''convert file name to mysql table name''' in function getFilesList(dir).
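As one possible way to customize the table-name rule, here is a minimal sketch that combines `os.walk` with a regular expression. The names `to_table_name` and `list_files_and_tables` are hypothetical, and the rule itself is an assumption that differs from getFilesList: it strips only the last extension, and it replaces every run of characters other than lowercase ASCII letters, digits, and underscores, so non-ASCII file names would also collapse to underscores. The demo part needs Python 3 for `tempfile.TemporaryDirectory`.

```python
import os
import re
import tempfile


def to_table_name(file_name):
    """Turn a file name such as 'Sales Report-2015.xls' into 'sales_report_2015'."""
    base = os.path.splitext(file_name)[0].strip().lower()
    # Replace each run of characters outside [0-9a-z_] with a single '_'.
    return re.sub(r"[^0-9a-z_]+", "_", base)


def list_files_and_tables(root):
    """Walk root recursively; return parallel lists of file paths and table names."""
    paths, tables = [], []
    for folder, _subfolders, file_names in os.walk(root):
        for file_name in sorted(file_names):
            paths.append(os.path.join(folder, file_name))
            tables.append(to_table_name(file_name))
    return paths, tables


# Demo on a throwaway directory with one nested, empty file.
with tempfile.TemporaryDirectory() as demo_root:
    sub_folder = os.path.join(demo_root, "folder_a")
    os.makedirs(sub_folder)
    open(os.path.join(sub_folder, "Sales Report-2015.xls"), "w").close()
    paths, tables = list_files_and_tables(demo_root)

print(tables)
```

Restricting names to a small character set also makes it safer to paste them into a create table statement, since table names cannot be passed as query parameters.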
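Pitfalls encountered and how they were fixed:
(2) Reading Excel files with xlrd in Python [1 (brief)][2 (more detailed)]
(3) Date cells read from Excel files with xlrd in Python turn into floats [stack overflow][2 Chinese blog]
(4) Traversing the files under a directory in Python [1]
(5) Connecting to MySQL from Python [1]
(6) Formatted output with print (%) in Python, and suppressing the default newline (,) [print usage]
(7) String concatenation in Python [string operations]
(8) Concatenating lists in Python [concatenating lists]
(9) String replacement in Python [string replacement]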
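On pitfall (3): Excel stores a date as a serial number (days since a fixed epoch), which is why date cells arrive as floats. The program above converts them with `xlrd.xldate_as_tuple` and the workbook's datemode. The sketch below only illustrates the underlying arithmetic and does not use xlrd; it assumes the 1900 date system and serial values of 61 or more (dates from 1900-03-01 onward), and the helper name `excel_serial_to_datetime` is hypothetical.

```python
import datetime

# Day zero of the 1900 date system, valid for serials >= 61 (assumption above).
EXCEL_EPOCH_1900 = datetime.datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial date (1900 date system, serial >= 61) to a datetime."""
    # The integer part counts days; the fractional part is the time of day.
    return EXCEL_EPOCH_1900 + datetime.timedelta(days=serial)


converted = excel_serial_to_datetime(42005.5)
print(converted)
```

Workbooks saved with the 1904 date system use a different epoch, which is the reason xlrd asks for the datemode.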