June 3, 2009 online meeting minutes
present: Sergey Boyarinov, Sergey Pozdnyakov, Ben Raydo, Chris Cuevas
1. HV problems: on May 23 and May 30, A1532 power supplies in the hvec1 crate broke and were replaced; the broken ones are on the way to CAEN for repair. We need to understand the cause to prevent this in the future. There were no HV alarms, so a significant loss of beam time was recorded; shift personnel are now instructed to clear the monitoring histograms' global section every 30 minutes. Data monitoring alarms are needed! The reason for the missing EPICS alarms must be understood and fixed.
One A1535 board was modified by Armen according to CAEN instructions; this should take care of the 'full crate' problem (the goal is to use the board's internal clock instead of an external one). The board will be tested, then the remaining boards will be modified, and a test with the 16-slot hvec1 will be conducted during downtime.
We had a short meeting with CAEN people and discussed on-site repairs and calibrations. CAEN will provide us with quotes for single-channel units and a calibration board, along with repair instructions.
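The data monitoring alarm requested in item 1 could, as one possibility, compare two successive snapshots of per-channel hit counts and flag channels that collected no new hits while data taking continued. The sketch below is illustrative only: the function name, the list-of-counts representation, and the threshold are assumptions, not part of the existing monitoring software.

```python
def find_silent_channels(previous, current, min_new_hits=1):
    """Return indices of channels that gained fewer than min_new_hits
    hits between two snapshots of cumulative per-channel counts.

    previous, current: sequences of cumulative hit counts (hypothetical
    representation of a monitoring histogram), one entry per channel.
    """
    if len(previous) != len(current):
        raise ValueError("snapshots must cover the same channels")
    return [
        channel
        for channel, (old, new) in enumerate(zip(previous, current))
        if new - old < min_new_hits
    ]


# Example: channel 1 collected no new hits between the two snapshots.
silent = find_silent_channels([10, 20, 30], [15, 20, 42])
print(silent)  # [1]
```

A check like this would only be meaningful while beam is on and the run is active; otherwise every channel would look silent.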
2. DAQ crashes: after recent modifications by Sergey B. the DAQ crash behavior changed: in most cases roc_reboot is no longer required, and a cancel-reset-download sequence is usually enough to restart. However, new problems have shown up; in particular, during prestart the EB sometimes reports a missing link with some ROC while the ROC claims the link was established. This must be investigated. Most ROCs run stably for weeks, while sc2, for example, crashes quite often. Sergey is monitoring this and trying to find fixes.
It was also found that after running for a long time (weeks) without a reboot, ROCs started to fail one after another (DCs, for example). A 'preventive' reboot procedure was implemented: we reboot all ROCs at least once a week during downtime.
To improve general performance, clondaq1-daq1 and clondaq2-daq1 were moved to 10 Gbit network cards, and a clondb1-daq1 interface was added to the clondb1 server (all 68 subnets). PMCs were configured to talk to mysql on clondb1 through clondb1-daq1. The software must be fixed so that ROCs retry if the database does not respond.
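The retry behavior requested for the ROC database access could look roughly like the sketch below: try the operation, and on a connection failure wait and try again with a growing delay, giving up after a fixed number of attempts. This is written in Python for illustration only; the function name, parameters, and the use of ConnectionError are assumptions and do not reflect the actual ROC code or its database client.

```python
import time


def call_with_retry(operation, attempts=5, initial_delay=1.0,
                    backoff=2.0, sleep=time.sleep):
    """Call operation() and return its result.

    On ConnectionError, wait and retry, multiplying the delay by
    `backoff` after each failure. The last failure is re-raised once
    `attempts` calls have been made.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: report the failure to the caller
            sleep(delay)
            delay *= backoff


# Example with a stand-in for a database query that fails twice.
_calls = []

def _flaky_query():
    _calls.append(1)
    if len(_calls) < 3:
        raise ConnectionError("database did not respond")
    return "ok"

print(call_with_retry(_flaky_query, sleep=lambda seconds: None))  # ok
```

Passing the sleep function as a parameter keeps the sketch easy to test; a real implementation would also need to decide which errors are worth retrying.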
3. Logs and database: log file / database recovery procedures were restored and adjusted, the missing software was finished, and the procedure will be enforced starting today.
4. SILO problems: the moving-to-silo process got stuck last night; CC switched to another staging file server (sfs59) and it works now, but it is still unclear what happened with the old staging file server (sfs56). We will ask CC again to let us buy a dedicated file server for online purposes only, and may again consider expanding our RAID system (50% more space would be good).
Inventory database development: Sergey P. will give a presentation on the new inventory db features at the Monday meeting.
JLAB Discriminator development
Ben presented test results from the prototype board with 2 channels populated; they can be found at [JLAB Discriminators]. We need to know the maximum pulse amplitude it can handle, and to improve the handling of small pulses (about 10 mV). The maximum pulse width on the timing output will be decreased to 25 ns.
CLAS12 Trigger: we briefly discussed the MUX board redesign; it should produce long/short pulses with digitally defined width (4 ns jitter) and be able to mask out bad channels.