October 17, 2018 online meeting minutes


present: Sergey Boyarinov, Nathan, Raffaella, William Gu, Ben, Chris, Pasyuk, Gagik

Since we are in production mode, with no more (significant) changes until the end of the run, we discussed the remaining reliability issues and ways to fix them.

1. daq crashes observed so far:

  • rcServer losing mysql access: will check whether it keeps a permanent connection, and will reconnect if the connection is lost
  • 'false slave' problem fixed: slave roc's were looking for slaves and registering them by addSlave(); this should be done by masters only
  • roc's dma memory leak: working with Bryan Moffit to find a better way to manage DMA memory in vme roc's
  • tdc errors: probably related to hot channels; error messages have to be processed in a smarter way
  • fiber problems: we have messages reporting on trigger fibers, and TI fibers will be watched; TIs are much more stable with firmware 8.1, but some problems seem to remain
  • adcft1vtp and adcft2vtp were swapped and the problem with the adcft1vtp link disappeared; it was probably a bad contact
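
The reconnect-on-lost-connection idea from the rcServer item can be sketched as follows. This is only an illustration in Python, not the rcServer code: the class name `ReconnectingClient`, the `connect` factory and the `operation` callable are all hypothetical, and a real client would catch its database driver's own error type rather than the built-in `ConnectionError`.

```python
import time


class ReconnectingClient:
    """Keeps one long-lived connection and reopens it when it is lost.

    `connect` is a hypothetical zero-argument factory returning a
    connection object; it stands in for whatever the real client uses.
    """

    def __init__(self, connect, max_retries=3, delay=1.0):
        self._connect = connect
        self._max_retries = max_retries
        self._delay = delay
        self._conn = None  # opened lazily on first use

    def execute(self, operation):
        """Run operation(conn), reconnecting and retrying on ConnectionError."""
        last_error = None
        for attempt in range(self._max_retries + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                return operation(self._conn)
            except ConnectionError as exc:
                # Drop the broken connection so the next attempt reopens it.
                last_error = exc
                self._conn = None
                if attempt < self._max_retries:
                    time.sleep(self._delay)
        raise last_error
```

The connection is held between calls (the "permanent" part) and is only reopened after a failure, so a single dropped connection costs one retry instead of a crash.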

2. activemq messaging fragmentation and process control: the one big topic 'daq' is being split into smaller topics to avoid performance issues. Topic 'runlog' was introduced for dbrouter, epics_monitor, run_log_update, ipcbank2et and the rest of the run_log_.. programs; topic 'control' was introduced for them as well. It helped, so we'll split the scaler reporting topics as well.
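
As a rough illustration of the topic split, the mapping from program to topic could look like the Python sketch below. The program names and the 'runlog' and 'daq' topic names come from the item above; the function `topic_for`, the prefix rule for the remaining run_log_.. programs, and the choice of 'daq' as the fallback are assumptions made for this example, and no messaging library calls are shown.

```python
# Programs named in the minutes as publishing on the 'runlog' topic.
RUNLOG_PROGRAMS = {
    "dbrouter",
    "epics_monitor",
    "run_log_update",
    "ipcbank2et",
}


def topic_for(program, default="daq"):
    """Return the topic a program should publish on.

    Assumption: every program whose name starts with 'run_log_' belongs
    to 'runlog', and everything else stays on the original 'daq' topic.
    """
    if program in RUNLOG_PROGRAMS or program.startswith("run_log_"):
        return "runlog"
    return default
```

Keeping the mapping in one table means a further split, such as a separate topic for scaler reporting, only changes this lookup and not the individual programs.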

3. camac1 remote control: Nathan is working on controlling CANBUS from EPICS so that we can roc_reboot the camac crate

4. spare electronics pool: will collect boards lying around unused and bring them to Chris; clean up the counting house, move spares to the counting room, and restore test setups after crates and CPUs arrive; VME power supplies and fan trays from the lab will be used to restore the tags crate and to keep a spare in the counting room; a full inventory will be conducted in January

5. we'll make ET select physics events containing scalers