XFEL-IT brainstorming on
Zoom: XFEL-IT zoom link
# XFEL IT on containers
## Draft agenda:
- Use cases from XFEL side
- Current setup of XFEL by Robert, Janusz, Luis
- Use cases from IT side
- Technical proposals on how to implement and what is available
## Attendees
IT
- Stefan D.
- Juergen
- Yves
- Sven
- Tigran
- Tobias Klann
- Marina
- Raren
XFEL
- Michael Schuh
- Janusz S.
- Robert Rocca
- Luis
- Gan-Petro
- Krzysztof Wrona
## XFEL status by Robert
DA-service
- not important
- important, but not critical
- Auth services
- Python repos
- some web stuff
- IdP federated with Keycloak
- infrastructure things, like Traefik
- non-essential
- CI runners
- bots to post to local chats/tickets
- integral metrics and so on
None of them is essential. All services are accessed only internally, from the control network.
### Pain points
The service owner is the only person who knows how a service runs. Most things started with docker-compose. Everything runs on a single server, with no redundancy.
Desired: a standard way of doing this, including configuration and easy redundancy.
Expectations from a Kubernetes-based solution (low-hanging fruit): a GitOps approach with tools like Argo CD or Flux, where everything is version controlled and deployed in a continuous-deployment style, with HA and rolling updates. Long term: autoscaling, affinity, live debugging.
> IT: many of the immediate tasks (the low-hanging fruit) are already in place.
CI/CD will need some access to data.
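The GitOps expectation above boils down to comparing a version-controlled desired state with what is actually running, and deriving the actions that close the gap. The following is only a minimal sketch of that idea: `ServiceSpec`, `plan_reconcile` and all service names and image tags are hypothetical, and real tools such as Argo CD or Flux work on Kubernetes manifests rather than on Python dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """Desired or observed state of one service (hypothetical, simplified)."""
    image: str
    replicas: int


def plan_reconcile(desired: dict[str, ServiceSpec],
                   actual: dict[str, ServiceSpec]) -> list[str]:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(f"create {name}")
        elif desired[name] != actual[name]:
            actions.append(f"update {name}")
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"delete {name}")
    return actions


# Desired state as it would be read from the Git repository (made-up values).
desired = {
    "auth": ServiceSpec(image="auth:1.4", replicas=2),
    "ci-runner": ServiceSpec(image="runner:2.0", replicas=3),
}
# State currently observed on the cluster (made-up values).
actual = {
    "auth": ServiceSpec(image="auth:1.3", replicas=2),
    "chat-bot": ServiceSpec(image="bot:0.9", replicas=1),
}

print(plan_reconcile(desired, actual))
```

Because the plan is computed from the repository contents alone, the knowledge of how a service runs lives in version control rather than with a single service owner.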
### Luis on what is in place
There are a lot of applications, and the developers want to control how they are deployed. Since 2016 many apps have been dockerized. Most of them run on VMs in dev, test and prod environments; prod needs multiple VMs for redundancy. These are good candidates to deploy on k8s. Currently there are 4 big players: Puppet to install the OS; databases, which are external; storage (Q: is there S3 that can be used?); and, finally, certificate handling. Some apps need external access, some are internal, some are accessed over Wi-Fi, but all must be accessible from the control network. Currently everything is started manually. 7 services are mission critical (someone is on call): mymdc, gitlab, zulip, ....
The storage used by the services is NetApp.
## XFEL-k8s infrastructure
3 control nodes, 3 workers, shared storage on NFS. Set up with low-level kubeadm.
Used by:
- applications deployments
- gitlab runners
- Pod creation over the API
- goal to replace custom docker deployment
- accessing data on GPFS
CSI provider for GPFS?
Monitoring with Zabbix. The goal is to keep deployment simple and to re-use decommissioned hardware.
## DESY-IT k8s infrastructure
- DESY-IT uses S3 with Ceph.
- now many people run systems
- Tobias and Stefan
- for dms we need subdomains, but not there yet.
- what about zero downtime?
- the problem is k8s updates: the LB changes its IP address, and failover sometimes takes too long.
- for shared volumes we use the NFS provider from NetApp.
- network partitioning is done by selecting internal or DMZ networks beforehand.
- are there plans to use overlay networks?
- We tried the kernel driver. We separate networks per project.
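On the zero-downtime question above, one part of the answer at application level is a rolling update: replicas are replaced in small batches so that some always keep serving. The toy simulation below illustrates only that idea, in the spirit of the `maxUnavailable` setting of a Kubernetes Deployment. The function `rolling_update` and its parameters are hypothetical, and it does not address the load balancer address change or the slow failover mentioned in the discussion.

```python
def rolling_update(replicas: int, max_unavailable: int = 1) -> list[int]:
    """Simulate replacing every replica in batches.

    Returns the number of replicas still serving during each batch.
    """
    if not 0 < max_unavailable < replicas:
        raise ValueError("max_unavailable must be between 1 and replicas - 1")
    serving_per_step = []
    remaining = replicas  # replicas still running the old version
    while remaining > 0:
        batch = min(max_unavailable, remaining)
        # While the batch restarts, all other replicas keep serving.
        serving_per_step.append(replicas - batch)
        remaining -= batch
    return serving_per_step


steps = rolling_update(replicas=3, max_unavailable=1)
print(steps)  # number of serving replicas at each step
```

With a single replica the function refuses to run, which mirrors the pain point raised earlier: a service on one server without redundancy cannot be updated without an interruption.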
### Apps by DESY in k8s
- gitlab runners
- mattermost
- hifis services
- harbor
## Next Steps
- XFEL is ready to try.
AP: IT will provide an internal k8s cluster to play with. Access from the control network is desired. Tobias and Robert will get in contact.
- Can we build up common knowledge? Some regular exchange? XFEL will get a new person working on it (and on open data).
- XFEL has a task to investigate commercial clouds. k8s can be a technology that works in both worlds.
- IT can show how data access can be performed. We should organize another meeting on data analysis.
_by Tigran Mkrtchyan, 29.04.2025_