Difference: 201104 (3 vs. 4)

Revision 4 2011-04-25 - KanBowen

Line: 1 to 1
 
META TOPICPARENT name="Maintenance"
-- ShiJingyan - 2011-04-15
Line: 38 to 38
  To improve domain name resolution performance on torqsrv, all compute nodes it manages have been defined in /etc/hosts.
Changed:
<
<
2011-04-24 (Sunday)
>
>
2011-04-20 (Wednesday)

To check how often jobs are requeued in the PBS system, a script that collects requeue statistics was written:

/root/kanbw/tracejobTest.py

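import os
import re

# Server log for the day being analysed
path = "/var/spool/pbs/server_logs/20110420"
fr = open(path, "r")
contents = fr.read()
fr.close()
# Collect every job ID of the form <number>.torqsrv.ihep.ac.cn
jobs = re.findall(r"\d+\.torqsrv\.ihep\.ac\.cn", contents)
jobs = [int(re.findall(r"\d+", job)[0]) for job in jobs]
freshJobs = []  # job numbers already processed

fw = open("result", "w")
rerunDict = {}
usersDict = {}
queueDict = {}
for job in jobs:
    # Each job appears many times in the log; handle it only once
    if job in freshJobs:
        continue
    freshJobs.append(job)
    # Ask tracejob for the full history of this job
    fr = os.popen("tracejob %d" % job)
    contents = fr.read()
    fr.close()
    # Count how many times the job was rerun; skip jobs that never were
    rerun = len(re.findall("Rerun", contents))
    if rerun > 0:
        rerunDict[job] = rerun
    else:
        continue
    # First user@host entry that does not contain "root"
    users = [u for u in re.findall(r"\w+@\w+\.ihep\.ac\.cn", contents) if "root" not in u]
    if len(users) > 0:
        usersDict[job] = users[0]
    else:
        continue
    # First "queue = name" entry
    queues = re.findall(r"queue ?= ?\w+", contents)
    if len(queues) > 0:
        queue = queues[0].replace(" ", "")
        queueDict[job] = queue.split("=")[1]
    else:
        continue
    # One tab-separated record: job, rerun count, user, queue
    line = "%d\t%d\t%s\t%s\n" % (job, rerunDict[job], usersDict[job], queueDict[job])
    print(line)
    fw.write(line)
fw.close()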

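Because of concern that it could affect the operation of PBS and Maui, this script is run only when the number of jobs in the system is low.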
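
The parsing steps of the script can be separated from the tracejob call, so that they can be checked offline without putting any load on the PBS server. The following is a minimal sketch under that assumption, not the script in use: the function name parse_tracejob and the sample text (including the user alice and the host node01) are hypothetical, while the regular expressions are the ones the script uses.

```python
import re


def parse_tracejob(contents):
    """Return (rerun_count, user, queue) for one job, or None if a field is missing.

    `contents` is the text printed by tracejob for a single job.
    """
    rerun = len(re.findall("Rerun", contents))
    if rerun == 0:
        return None
    # First user@host entry that does not contain "root"
    users = [u for u in re.findall(r"\w+@\w+\.ihep\.ac\.cn", contents)
             if "root" not in u]
    if not users:
        return None
    # First "queue = name" entry
    queues = re.findall(r"queue ?= ?\w+", contents)
    if not queues:
        return None
    queue = queues[0].replace(" ", "").split("=")[1]
    return rerun, users[0], queue


# Hypothetical sample text, shaped only to match the regular expressions;
# real tracejob output will look different.
sample = (
    "owner = alice@node01.ihep.ac.cn, queue = besq\n"
    "Job Rerun at request of root@torqsrv.ihep.ac.cn\n"
    "Job Rerun at request of root@torqsrv.ihep.ac.cn\n"
)
result = parse_tracejob(sample)
```

With the parsing isolated like this, only the tracejob invocation itself needs to wait for a quiet period on the server.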

2011-04-24 (Sunday)

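  After discussion, all nodes in offlineq are merged with besq and dp2q for shared use, but offlineq has the highest priority.
  • For the former offlineq nodes (bws0303-322), change the entry in the nodes file from bws0303.ihep.ac.cn np=8 bes3-farm-besq-offline to bws0303.ihep.ac.cn np=8 bes3-farm-besq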
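
The nodes-file change above could be applied to all twenty nodes in one pass. The following is only a sketch of that idea, not the procedure that was actually used: the function name retag_nodes is hypothetical, the node range and property names come from the entry above, and the bws0400 line in the sample is made up to show that nodes outside the range are left alone.

```python
import re

OLD_PROP = "bes3-farm-besq-offline"
NEW_PROP = "bes3-farm-besq"


def retag_nodes(text, first=303, last=322):
    """Replace OLD_PROP with NEW_PROP on the lines for bws<first>..bws<last>."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        # Node lines start with a host name such as bws0303.ihep.ac.cn
        m = re.match(r"bws(\d{4})\.ihep\.ac\.cn$", fields[0]) if fields else None
        if m and first <= int(m.group(1)) <= last:
            fields = [NEW_PROP if f == OLD_PROP else f for f in fields]
            line = " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Sample nodes-file content; the second line is a hypothetical out-of-range node.
sample = (
    "bws0303.ihep.ac.cn np=8 bes3-farm-besq-offline\n"
    "bws0400.ihep.ac.cn np=8 bes3-farm-besq-offline\n"
)
updated = retag_nodes(sample)
```

Matching whole fields rather than substrings avoids touching any other property that merely contains the old name.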
Line: 53 to 65
  CLASSCFG[offlineq] QDEF=bes PRIORITY=450
Added:
>
>
2011-04-25 (Monday)

In the PBS system, nodes bws0316--bws0391 were taken out and reinstalled with a 64-bit system, except for bws0365 (which Huang Qiulan is using for AFS_PBS testing).
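
For scripting the reinstallation, the list of affected nodes can be generated rather than typed by hand. This is a small sketch assuming the four-digit bws naming seen elsewhere on this page; the function name reinstall_nodes is hypothetical.

```python
def reinstall_nodes(first=316, last=391, excluded=(365,)):
    """Return node names bws<first>..bws<last>, skipping the excluded numbers."""
    return ["bws%04d" % n for n in range(first, last + 1) if n not in excluded]


nodes = reinstall_nodes()
```

The resulting list can then be fed to whatever tool drains and reinstalls the nodes.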

 