April

4.1~4.8 (jobs and transfers normal)

(1) Total jobs: 12660
(2) Breakdown: production: 5430, 98%; analysis: 6350, 60% (application errors)
 
* Weekly transfer summary:
Transfers normal
Status: 13.7 TB in, 4.3 TB out

4.9~4.15 (jobs and transfers normal)

(1) Total jobs: 26018
(2) Breakdown: production: 7663, 98%; analysis: 18090, 70% (FileOpen error with fallback)
 
* Weekly transfer summary:
Transfers normal
Status: 15 TB in, 3.9 TB out

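4.16~4.22 (CMS workflow bug; pilots produced large logs, >80 GB)

(1) Total jobs: 15467 (CMS workflow bug; pilots produced large logs, >80 GB; already under discussion in a ticket)
(2) Breakdown: production: 4993, 80% (problem with the central WMAgent; all T2s at around 75%); analysis: 9950, 60%
 
* Weekly transfer summary:
Transfers normal
Status: 13 TB in, 6.3 TB out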

4.22~4.29 (downtime for dCache and DPM)

(1) Total jobs: 14842
(2) Breakdown: production: 2190, 93%; analysis: 8345, 60% (application errors)
 
* Weekly transfer summary:
Transfers normal
Status: 9.1 TB in, 3.5 TB out

4.30~5.6 (jobs and transfers normal)

(1) Total jobs: 15450
(2) Breakdown: production: 2680, 70% (CMSSW exception); analysis: 5786, 60% (IO trap, killed by system)
 
* Weekly transfer summary:
Transfers normal
Status: 11.3 TB in, 3.8 TB out

April summary

(1) Total jobs: 88575
(2) Breakdown: production: 25039, 85% (CMSSW exception); analysis: 55980, 60% (IO trap, killed by system, output files not found)
 
* Monthly transfer summary:
Transfers normal
Status: 61.2 TB in, 21.3 TB out
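
As a cross-check on the monthly transfer figures, the weekly volumes listed above can be added up with a short script. This is only a sketch: the dictionary and function names are made up for illustration, and the values are copied by hand from the weekly entries. Because the last listed week (4.30~5.6) runs into May, the sum need not equal the monthly figure reported above.

```python
# Weekly transfer volumes in TB as (in, out), copied from the weekly entries.
# Assumption: the keys are just the week labels used in this log; the last
# week (4.30~5.6) extends into May, so the period summed here is not exactly
# the calendar month.
weekly_transfers_tb = {
    "4.1~4.8": (13.7, 4.3),
    "4.9~4.15": (15.0, 3.9),
    "4.16~4.22": (13.0, 6.3),
    "4.22~4.29": (9.1, 3.5),
    "4.30~5.6": (11.3, 3.8),
}


def total_transfers(weeks):
    """Return (total_in_tb, total_out_tb), each rounded to one decimal."""
    total_in = sum(volume_in for volume_in, _ in weeks.values())
    total_out = sum(volume_out for _, volume_out in weeks.values())
    return round(total_in, 1), round(total_out, 1)


total_in, total_out = total_transfers(weekly_transfers_tb)
print(f"in: {total_in} TB, out: {total_out} TB")
```

The same pattern applies to the job counts; only the tuples would change.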

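5.7~5.13 (jobs and transfers normal)

(1) Total jobs: 25361
(2) Breakdown: production: 2637, 50% (CMSSW exception; CMS T2 sites overall saw about 70,000 of these errors, concentrated on the 8th and the 12th)
             analysis: 16429, 60% (Config file read error, FileOpen error with fallback, killed by system)
 
* Weekly transfer summary:
Transfers normal
Status: 8.6 TB in, 4.3 TB out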

5.14~5.20 (jobs and transfers normal)

(1) Total jobs: 20145
(2) Breakdown: production: 3148, 99%
             analysis: 16741, 60% (FileOpen error with fallback, user application errors)
 
* Weekly transfer summary:
Transfers normal
Status: 8.9 TB in, 4.1 TB out

5.21~5.27 (jobs and transfers normal)

(1) Total jobs: 17563
(2) Breakdown: production: 3033, 91%
             analysis: 8901, 60% (FileOpen error with fallback, CMSSW exception)
 
* Weekly transfer summary:
Transfers normal
Status: 12.3 TB in, 4.35 TB out

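5.28~6.3 (transfer problems)

(1) Total jobs: 23569
(2) Breakdown: production: 9144, 95%
             analysis: 8901, 60% (internal error in crab stageout script, killed by system)
 
* Weekly transfer summary:
Status: 8.9 TB in, 4.6 TB out
* Problems:
(1) In the SAM tests, both the SE and the CE showed brief connection errors, similar to what was seen for ATLAS.
(2) For data transfers, downloads showed intermittent connection errors, as with ATLAS. For outbound transfers, the disk holding the transfer test data had a problem and the data could not be read properly; Xiaofei has been notified to handle it.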

May summary

(1) Total jobs: 90346
(2) Breakdown: production: 36032, 87% (mainly due to CMSSW exceptions)
             analysis: 47432, 60%
 
* Monthly transfer summary:
Status: 42.1 TB in, 19.1 TB out

6.3~6.10

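(1) Total jobs: 36320
(2) Breakdown: production: 22123, 98%
             analysis: 8128, 53% (mainly FileOpen errors; there was a burst of high access at the time, with 1.9 Gb/s out and 1.5 Gb/s in)
 
* Weekly transfer summary:
Status: 14.1 TB in, 9.1 TB out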
SAM availability 99.3%
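
To put the burst rates quoted for 6.3~6.10 in perspective, a rate can be converted to the daily volume it would correspond to if sustained. This is a minimal sketch, assuming "Gb/s" means gigabits per second and decimal units (1 TB = 10^12 bytes). The helper name is hypothetical, and the log does not say how long the burst lasted.

```python
SECONDS_PER_DAY = 86400


def gbps_to_tb_per_day(rate_gbps):
    """Convert a sustained rate in gigabits per second to terabytes per day.

    Assumes decimal units: 1 Gb = 1e9 bits, 1 TB = 1e12 bytes.
    """
    bytes_per_second = rate_gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e12


# Rates taken from the 6.3~6.10 entry: 1.9 Gb/s out, 1.5 Gb/s in.
for label, rate in (("out", 1.9), ("in", 1.5)):
    daily = gbps_to_tb_per_day(rate)
    print(f"{label}: {rate} Gb/s is about {daily:.1f} TB/day if sustained")
```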

6.11~6.17

* Weekly job summary:
(1) Total jobs: 41460
(2) Breakdown: production: 9073, 99%
             analysis: 22417, 60% (mainly application errors, output files not found)
 
* Weekly transfer summary:
Status: 15.5 TB in, 5.6 TB out
SAM availability 100%

6.18~6.24

* Weekly job summary:
(1) Total jobs: 42132
(2) Breakdown: production: 11260, 97%
             analysis: 30874, 63% (mainly application errors)
 
* Weekly transfer summary:
Status: 20.2 TB in, 7.1 TB out
SAM availability 100%

-- ZhangXiaomei - 2015-04-22

Topic revision: r13 - 2015-06-24 - ZhangXiaomei