Process Management

Latest revision as of 13:11, 10 April 2026

In addition to runcontrol-managed processes, other DAQ-related processes run on the machine specified by the environment variable CLON_DAQ. To start those processes, run the following command on any clondaq* machine as clasrun:

start_process_management

It starts ipc_process_manager, which is controlled by the config file $CLON_PARMS/processes/ipc_process_manager.cfg. It runs permanently, making sure that all processes specified in the config file are alive, and restarts them if needed. To stop all those processes, run:

stop_process_management

To check running processes:

check_process_management
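
The keep-alive behavior described above can be pictured with a minimal shell sketch. This is not the actual ipc_process_manager (which is a perl script driven by its config file); the function names is_alive and ensure_running, and the use of pgrep to detect a process by name, are assumptions made only for illustration.

```shell
#!/bin/sh
# Hypothetical keep-alive pass; NOT the real ipc_process_manager.
# is_alive and ensure_running are illustrative names.

# Succeed if a process with exactly this name is currently running.
is_alive() {
    pgrep -x "$1" >/dev/null 2>&1
}

# $1 = process name; remaining arguments = command line used to start it.
# Reports the state and starts the command in the background if it is missing.
ensure_running() {
    name=$1
    shift
    if is_alive "$name"; then
        echo "$name: alive"
    else
        echo "$name: not running, restarting"
        "$@" &
    fi
}

# A manager would repeat such a pass periodically, for example:
#   while :; do ensure_running some_process some_process --its-options; sleep 60; done
# (some_process and its options are placeholders, not real CLON commands)
```

The real manager reads its process list from the config file; the sketch only shows the check-then-restart idea.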


old info, can be incorrect

NOTE: make sure the clascron account exists on all machines participating in cron job-based monitoring and that it is in the onliners group. If not, modify the /etc/passwd, /etc/shadow and /etc/group files using examples from existing machines. Scripts started by cron jobs use ssh -n node ... statements, so clascron must be able to ssh without a password: log in to every machine running related cron jobs (clon10, clon00, clon01) as clascron and make sure you can ssh to all other machines without a password. If necessary, fix ~/.ssh/known_hosts.
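
One way to verify the passwordless ssh requirement is sketched below. The function name check_passwordless_ssh is hypothetical and not part of the CLON scripts; the sketch relies on the standard OpenSSH options BatchMode=yes (fail instead of prompting for a password or host key confirmation) and ConnectTimeout.

```shell
#!/bin/sh
# Sketch only: check_passwordless_ssh is an illustrative name, not a CLON script.

# Try a non-interactive ssh to each node given as an argument.
# Prints one line per node and returns non-zero if any node failed.
check_passwordless_ssh() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes disables password prompts and host key questions,
        # so a node that would need either is reported as FAILED.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, run as clascron on each machine in turn:
#   check_passwordless_ssh clon10 clon00 clon01
```

A FAILED line can mean either a missing key or a stale or missing entry in ~/.ssh/known_hosts, so both are worth checking.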

Generic information

The directory $CLON_PARMS/processes contains 6 configuration files for the CLON process management system:

ipc_process_manager.cfg      <- ipc_process_manager(perl script) <- control_ipc_process_manager (csh script)
ipc_critical_processes.cfg   <- ipc_process_monitor(perl script) <- cronjobs

To start/stop 'ipc_process_manager':

control_ipc_process_manager start clasrun
control_ipc_process_manager stop clasrun

Currently used components:

epics_monitor
ipcbank2et
dbrouter
run_log_update
#clas_epics_server
#alarm_handler
#alarm_server
#alarm_browser

CLAS-era stuff:

critical_processes.cfg
process_manager.cfg
remote_critical_processes.cfg
sys10_critical_processes.cfg

To watch ipc messages:

java clonjava/ipc_monitor -a clasrun