Wolfgang Ehrenfeld, Andreas Gellrich, Yves Kemp, Steve Aplin
- kswapd problem: seen at other sites, but vague; waiting for new kernel
- InfiniBand errors: will upgrade firmware in blades, might need downtime for big switch upgrade
- one shared SL4 node, WN reorganization; ATLAS and CMS will discuss SL4 usage with users
- Lustre additions
- Lustre group quotas possible; they guarantee space for a given directory (new action item: set up groups)
- template for storage problem reports presented and accepted
- downtime tomorrow
- longer downtime in September?
- request: show helpdesk tickets of the last months (new action item)
old-1 | NAF/ATLAS | CMT problem | Wolfgang and Stephan Wiesand looked into it, problem also seen on Grid |
1005-1 | NAF | present SL4 work group server usage | show usage at next NUC, open as long as we have a need for SL4 |
1005-2 | ATLAS/CMS | plans for SL4 work group servers | reduce to one shared SL4 IN (done), talk to users, open as long as we have a need for SL4 |
1005-3 | NAF | email notification for /scratch monitoring (instance full) | person away, still open |
1005-4 | NAF | technical constraints for Lustre extension | might end in 1007-1 |
1005-5 | NAF | automatic Lustre space clean-up | expert meeting still foreseen |
1006-1 | NUC | CPU, dCache, Lustre resource requirements | needed for next PRC |
1006-2 | NAF | SGE queue for downloads/IO? | first ideas dismissed |
1006-4 | NAF | form for storage report on NAF web | draft proposed, add to FAQ, review it |
1006-3 | NAF/ATLAS | reboot procedure for (ATLAS) WGS | done |
1006-5 | NUC | ideas for accounting | by e-mail, at next NUC, or at an extra meeting; no progress |
1006-6 | NAF | dCache upgrade time line (10 GE in HH) | 30% done |
- account cleanups
- new accounts
- missing .OldFiles for early ATLAS users
- Login problems (9 June, 12 July (DESY AFS server))
- Lustre problems on tcx050 (21 June)
- High load problem (tcx080 - 28 June, tcx060 - 6 July)
- Host unavailable (tcx080 - 6 July)
- Lustre full (ZN, 13 July)
- Problems with iLumiCalc and DB (2x, not working)
- ATLAS software question
- ganga problem with old release (not fixed)
- ganga setup problem with oversized sandbox (NAF config in work)
- ATLAS storage question (2x)
- gsidcap problem in athena
- software installation problems (15.6.11)
- Files written to non ST (user informed, need some follow up on ATLAS documentation)
- dCache problem (doors overloaded, user informed and advised)
- Grid: large output sandboxes at DESY-HH (9 July, affected other jobs?)
There was a lively discussion on both topics. There was consensus that we should get feedback from our users now that the LHC experiments have first data to analyze.
1007-1 | NAF | setup Lustre groups for ATLAS, CMS, ILC | |
1007-2 | NAF | report on helpdesk tickets | at every NUC |
1007-3 | ATLAS/CMS/LHCb | ICHEP review; get feedback |
- Wednesday, August 11 2010, 1 am