ATLAS: Period from July 14th to September 15th (Wolfgang Ehrenfeld):
--------------------------------------------------------------------
Accounts:
New accounts (22x)
Problems with restored account (fixed)
AFS quota (3x)
ATLAS software:
ATLAS software request (delayed because everyone was on vacation -> usually handled within a day)
ATLAS software problem (missing compiler -> better install script)
ATLAS software setup problem (missing cache in requirements file -> better script, better solution provided by ATLAS: asetup)
ATLAS DB release not installed in time (not available in ganga on SGE -> automatic installation not working due to low AFS quota)
ganga: conditions access on SGE (local setup dropped -> fixed in ganga)
General ATLAS software problem
Transient problems:
Various problems with login (AFS instabilities -> working on the NAF is painful, problem needs to be solved ASAP)
Many work group server reboots
Reboot request for tcx080 (7.8.2010, InfiniBand instabilities -> need to be handled better by IT)
Transient Lustre problem (14.8.2010, 19.8.2010)
Transient dq2 problem (16.8.2010)
Work group server:
PROOF Lite incident (PROOF Lite on many work group servers, correlated with AFS instabilities, memory leaks -> loose memory limits on work group servers)
Batch system:
Problems with multicore jobs (not scheduled in time, which makes PROOF Lite unusable except on work group servers)
Storage:
Low/no disk space on Lustre (2x)
Slow access to Lustre (not understood)
gsidcap access does not work in Athena out of the box
dCache dcap access hangs (a few times)
dCache IO is not sufficient (average job efficiency below 10%)
Grid:
Problems with conditions access (PoolFileCatalog is broken, problems with installation on CREAM, still ongoing)