一、 概述
随着信息系统业务需求快速增长,业务系统关联日益复杂,信息系统性能问题逐渐突显,一旦出现信息系统性能问题及不可用问题,将严重影响信息系统的稳定运行及用户体验。
结合运维实践,数据库性能问题是造成信息系统性能下降和非停的重要原因之一,如何进行常态化的数据库性能趋势分析,及时发现数据库性能衰减“病灶”,常态化提升信息系统性能,避免救火式性能优化,成为衡量信息系统管理部门运营能力的重要指标之一。
二、研究目标
使用python语言进行Oracle数据库性能趋势分析。
三、工具介绍
开发语言:python
2.7
数据库:Oracle
11.2.0.4
Web框架:Django
图形展示工具:echart
四、算法介绍
核心算法由运行可靠率、资源竞争率、进程等待率和SQL稳定率四部分组成,如下图所示,本文主要以SQL稳定率为例:
Trend =100-100*sum(( c_time-h_time) /h_time)
说明:
Trend: 表示信息系统性能趋势(%)
c_time: 前一小时SQL平均执行时间(秒)
h_time: 3个月内SQL平均执行时间(秒)
五、效果展示
(1)、系统性能趋势:
(2)、TOPSQL性能趋势分析
(3)、TOPSQL日性能趋势分析
(4)、TOPSQL月性能趋势分析
六、核心代码
核心代码分为数据采集层、数据转换层、web展示层。
(1)、数据采集层:
点击(此处)折叠或打开
-
def get_topsql_info(username,password,ip,port,dbname,c_type,param=0,b_param=0):
- s_top10 = ”
- #s_snap_id = 0
- print oracle_link_target
- if c_type == ‘sql_topsql’:
-
sql_topsql=“
- select round(Elapsed_Time, 2) Elapsed_Time,
- round(cpu_time, 2) cpu_time,
- Executions,
- round(elap_per_exec, 2) elap_per_exec,
- round(total_db_time, 2) total_db_time,
- sql_id,
- substr(nvl(sql_module, ‘ ** SQL module Not Available ** ‘), 1, 30) sql_module,
- sql_text
- from (select nvl((sqt.elap / 1000000), to_number(null)) Elapsed_Time,
- nvl((sqt.cput / 1000000), to_number(null)) CPU_Time,
- sqt.exec Executions,
- decode(sqt.exec,
- 0,
- to_number(null),
- (sqt.elap / sqt.exec / 1000000)) Elap_per_Exec,
- (100 *
- (sqt.elap /
- (SELECT sum(e.VALUE) – sum(b.value)
- FROM DBA_HIST_SYS_TIME_MODEL e, DBA_HIST_SYS_TIME_MODEL b
-
WHERE B.SNAP_ID = “+str(b_param)+“
-
AND E.SNAP_ID = “+str(param)+“
- AND B.DBID = (select dbid from v$database)
- AND E.DBID = (select dbid from v$database)
- AND B.INSTANCE_NUMBER =
- (select instance_number from v$instance)
- AND E.INSTANCE_NUMBER =
- (select instance_number from v$instance)
- and e.STAT_NAME = ‘DB time’
- and b.stat_name = ‘DB time’))) Total_DB_Time,
- sqt.sql_id,
- to_char(decode(sqt.module,
- null,
- null,
- ‘Module: ‘ || sqt.module)) SQL_Module,
- nvl(to_char(substr(st.sql_text, 1, 30)),
- ‘ ** SQL Text Not Available ** ‘) SQL_Text
- from (select sql_id,
- max(module) module,
- sum(elapsed_time_delta) elap,
- sum(cpu_time_delta) cput,
- sum(executions_delta) exec
- from dba_hist_sqlstat
- dba_hist_sqltext st
- where st.sql_id(+) = sqt.sql_id
- order by nvl(sqt.elap, -1) desc, sqt.sql_id)
- where rownum < 100
- “
- elif c_type == ‘top10’:
- #a list of top10: m_top10
- m_top10=get_hsql_info(t,‘top10’)
- #after get top10
- #end get top10
- for h_sql_id in m_top10:
- l_sql_id = h_sql_id[0]
- s_top10 = s_top10+“,'”+l_sql_id+“‘”
- s_top10 = s_top10.strip(‘,’)
- sql_hsql_top10=“select sql_id,to_char(substr(sql_text,1,2000)) sql_text,length(sql_text) sql_length,command_type from dba_hist_sqltext t where t.sql_id in (“+s_top10+‘)’
- else:
- cmd=sql_tablespace
- #print s_top10
- #print log_cmd_i
- cmd =“”
- if c_type == ‘sql_topsql’:
- cmd=sql_topsql
- elif c_type == ‘top10’:
- cmd=sql_hsql_top10
- else:
- cmd=sql_tablespace
- #print len(m_top10)
- print ‘before get topsql exe sql: ‘
- print cmd
- print ‘get db shell: ‘
- conn = cx_Oracle.connect(oracle_link_target)
- cursor = conn.cursor()
- cur = cursor.execute(cmd)
- db_list = cur.fetchall()
- #print ‘before return db_list’
- #print db_list
- return db_list
- cursor.close()
- conn.close()
(2)、数据转换层
点击(此处)折叠或打开
-
select row_number() over(partition by ip order by to_number(total_db_time) desc) rn,
- ip,
- db_name,
- sql_id,
- decode(elap_per_exec, ‘0’, 0.01, elap_per_exec) elap_per_exec,
- decode(elap_avg_exec, ‘0’, 0.01, elap_avg_exec) elap_avg_exec,
- decode(sign(decode(elap_avg_exec, ‘0’, 0.01, elap_avg_exec) – decode(elap_per_exec, ‘0’, 0.01, elap_per_exec)),
- 1,
- ‘up’,
- –1,
- ‘down’,
- ‘equ’) sql_status,
- round((decode(elap_avg_exec, ‘0’, 0.01, elap_avg_exec) –
- decode(elap_per_exec, ‘0’, 0.01, elap_per_exec)) /
- decode(elap_avg_exec, ‘0’, 0.01, elap_avg_exec),
- 2) sql_cont,
- executions,
- total_db_time,
- substr(sql_module, 1, 12) sql_module,
- substr(sql_text, 1, 12) sql_text,
- ch_date
- from (select rownum rn,
- d.ip,
- d.db_name,
- d.sql_id,
- replace(d.elap_per_exec, ‘None’, 0) elap_per_exec,
- e.elap_avg_exec,
- d.executions,
- d.sql_module,
- d.sql_text,
- d.ch_date,
- d.total_db_time
- from hsql.h_topsql d,
- (select b.ip,
- b.sql_id,
- round(avg(replace(b.elap_per_exec, ‘None’, 0)),
- 2) elap_avg_exec
- from hsql.h_topsql_bak b
- group by b.ip, b.sql_id) e
- where d.sql_id = e.sql_id
- and d.ip = e.ip)));
(3)、web展示层
点击(此处)折叠或打开
-
def topsql_line_servlet(request):
- cursor = conn.cursor()
-
query = “select ip,
- (select service_name
- from hsql.h_instance h
- where h.ip = b.ip
- and rownum = 1) service_name,
- sql_id,
- executions,
- elap_per_exec,
- to_char(ch_date, ‘hh34:mi’) sj,
- to_char(ch_date, ‘yyyy-mm-dd’) rq
- from hsql.h_topsql b
- where ch_date > trunc(sysdate)
- order by sj”
- print query
- cursor.execute(query)
- resultset = cursor.fetchall()
- cursor.close()
- conn.close()
七、总结
通过Oracle性能趋势分析工具的应用可以进行细粒度的数据库性能管理,及时发现潜在的信息系统性能衰减隐患,通过持续性、常态化的信息系统性能优化,优化信息系统提升,提升用户体验。