程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
您现在的位置: 程式師世界 >> 編程語言 >  >> 更多編程語言 >> Python

How Python batch modifies and replaces SQL scripts in Excel power query

編輯:Python

background

Our current BI The technology stack is mainly Python,MySQL,Excel&PowerBI, It's usually ETL After data source and cleaning , It through Power Query Load into Excel Inside , In this way, the simplest Excel Slicer interaction , It is also convenient for us to automatically refresh and distribute reports .
We take an existing model as an example , This model is called Turn to introduction of numerical control I .xlsx. It uses our regular technology chain

Python do ETL,SQL Write the script ,Excel Loading display


problem

Use Power Query A good model is usually written and fixed SQL Script , It won't change easily .
In this model , Altogether 14 individual Power Query Inquire about , Namely :

  • Update date prompt x2
  • The monthly dimension x5 individual OD Hierarchy
  • Day dimension x5 individual OD Hierarchy

Currently, the fixed date of the month dimension is this month 1 From the th to the same day , The daily dimension is the same day , So in SQL In other words, it is fixed to

# During the month 1 Japan 
date_sub(curdate(),interval day(curdate())-1 day)
# On the day 
curdate()

However, the idea of the demand side is Very unstable Of , For example, suddenly want yesterday's , For example, every month 5 I want a copy of last month's …

No problem , There are more ways than difficulties


programme

We have Python,Python Yes pywin32. This module has no documentation , But this module is directly applied VBA.
We just need to write it py Script , Set the date as the parameter , It is good to modify parameters every time
Let's say we expand it one month ahead , From this month 1 It was expanded to last month 1 Number

from win32com.client import Dispatch # Import pywin32
import os # Used to process paths 
#1 Parameters 
FILENAMES = [
'【 republic 】【 month 】 Transfer to table 1 .xlsx',
'【 republic 】【 month 】 Transfer to table 2 .xlsx',
'【 republic 】【 month 】 Transfer to table 3 .xlsx',
'【 republic 】【 month 】 Introduction trend .xlsx'
]
FOLDER = r'E:\Onedrive\doc\for_share\Python_eco\excel'
#2 Handle 
app = Dispatch('excel.application') # open Excel Program 
app.Visible = -1 # Display program 
for file in FILENAMES: # Circulation table 
filepath = os.path.join(FOLDER,file) # Merge paths into full paths 
wkb = app.Workbooks.Open(filepath) # open Excel file 
for p in wkb.Queries: # Traverse Power Queries
replacePair = [
('date_sub(curdate(),interval day(curdate())-1 day)',
'date_sub(date_sub(curdate(),interval day(curdate())-1 day),interval 1 month)')
] # replace text Replace with text 
fml = p.formula
for a in replacePair:
fml = fml.replace(a[0],a[1])
p.formula = fml
print('**REPLACED')
wkb.Save()
wkb.Close(1)
print('**DONE')

summary

  1. The main problem is pywin32 No, Excel Operation related instruction documents
  2. The second problem is that there are very few on the Internet VBA perhaps Python operation Power Query Out of the box code
  3. The key code is wkb.Queries and p.formula these two items.

pack

Now package this block into a function , according to Python International practice

def replacePowerQuery(filelist,replaceTuple):
''' take Excel Inside Power Query SQL Script replaces some words '''
from win32com.client import Dispatch
app = Dispatch('excel.application')
app.Visible = -1
for file in filelist:
wkb = app.Workbooks.Open(file)
for p in wkb.Queries:
fml = p.formula
for a in replaceTuple:
fml = fml.replace(a[0],a[1])
p.formula = fml
print('**REPLACED')
wkb.Save()
wkb.Close(1)
app.Quit()
print('**DONE')
  • filelist Parameters are lists or tuples , Element is the full file path
  • replaceTuple Nesting tuples for lists or tuples
# Examples of use 
filelist = [
r'E:\Onedrive\doc\for_share\Python_eco\excel\【 republic 】【 month 】 Transfer to table 1 .xlsx',
r'E:\Onedrive\doc\for_share\Python_eco\excel\【 republic 】【 month 】 Transfer to table 2 .xlsx',
r'E:\Onedrive\doc\for_share\Python_eco\excel\【 republic 】【 month 】 Transfer to table 3 .xlsx',
r'E:\Onedrive\doc\for_share\Python_eco\excel\【 republic 】【 month 】 Introduction trend .xlsx'
]
replaceTuple = [
('date_sub(curdate(),interval 1 day)','date_sub(curdate(),interval day(curdate()) day)'),
('date_sub(max(everyday),interval 1 day)','date_sub(max(everyday),interval day(max(everyday)) day)')
]
replacePowerQuery(filelist,replaceTuple)

  1. 上一篇文章:
  2. 下一篇文章:
Copyright © 程式師世界 All Rights Reserved