Scratch: Difference between revisions

Boiarino (talk | contribs)

'''Sergey Boyarinov's TODO list'''

- get TIBCO license

- equipment list with DB, labels (with Sergey P.)

- JLAB discriminators: ask Volker to push it

- need 1881M ADCs, at least a few modules

- test sy527 which arrived from repair (with George Jacobs)

- buy labels for both labeling machines

- ET system debugging (with Carl)

- NTP servers on Solaris, update Solaris post-install page

- replug AC power to emergency generators

- learn, start and test auto-shutdown software on clons

- on Nerses's request: install S99caRepeater and S99logServer scripts,
  update corresponding procedure for Solaris (ask Paul Letta if necessary)

'''Sergey Boyarinov's COMPLETED list'''

Latest revision as of 11:08, 31 August 2011

test setup error aug 31, 2011:

...
0x9478760 (ROLS_LOOP): INFO: User Go ... 
proc_thread: waiting=  31106 processing=     234 microsec per event (nev=59)
net_thread:  waiting=  21361    sending=      2 microsec per event (nev=108)
proc_thread: waiting=  12613 processing=    106 microsec per event (nev=66)
proc_thread: waiting=  13237 processing=    137 microsec per event (nev=66)
net_thread:  waiting=  13001    sending=      1 microsec per event (nev=129)
proc_thread: waiting=  12753 processing=    110 microsec per event (nev=65)
proc_thread: waiting=  13227 processing=    106 microsec per event (nev=66)
0x9478760 (ROLS_LOOP): tdc1190ReadBoardDmaDone: WRONG: nbytes_save[4]=176, res=0 => mbytes=176
0x9478760 (ROLS_LOOP): [ 4] ERROR: tdc1190ReadEvent[Dma] returns -2
0x9478760 (ROLS_LOOP): [ 5] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): [ 6] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): [ 7] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): [ 8] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 0] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 1] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 2] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 3] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 4] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 5] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 6] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 7] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): tdc1190ReadStart: [ 8] not ready ! (nev=0)
0x9478760 (ROLS_LOOP): [ 0] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): [ 1] ERROR: tdc1190ReadEvent[Dma] returns 0
0x9478760 (ROLS_LOOP): [ 2] ERROR: tdc1190ReadEvent[Dma] returns 0
...............
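
The proc_thread and net_thread lines above report the average waiting and busy time per event. A minimal Python sketch for pulling those numbers out of a saved log, assuming the line layout shown above stays the same. The names parse_timing and busy_fraction are made up for illustration and are not part of CODA:

```python
import re

# Matches lines such as:
#   proc_thread: waiting= 31106 processing=     234 microsec per event (nev=59)
#   net_thread:  waiting=  21361    sending=      2 microsec per event (nev=108)
TIMING_RE = re.compile(
    r"^(\w+):\s+waiting=\s*(\d+)\s+(processing|sending)=\s*(\d+)"
    r"\s+microsec per event\s+\(nev=(\d+)\)"
)


def parse_timing(line):
    """Return a dict for a timing line, or None for any other log line."""
    match = TIMING_RE.match(line.strip())
    if match is None:
        return None
    thread, waiting, kind, busy, nev = match.groups()
    waiting, busy, nev = int(waiting), int(busy), int(nev)
    total = waiting + busy
    return {
        "thread": thread,
        "waiting_us": waiting,
        "activity": kind,
        "busy_us": busy,
        "nev": nev,
        # Share of the per-event time spent working rather than waiting
        # (hypothetical derived field, not printed by the readout code).
        "busy_fraction": busy / total if total else 0.0,
    }


if __name__ == "__main__":
    sample = [
        "proc_thread: waiting= 31106 processing=     234 microsec per event (nev=59)",
        "net_thread:  waiting=  21361    sending=      2 microsec per event (nev=108)",
        "0x9478760 (ROLS_LOOP): [ 4] ERROR: tdc1190ReadEvent[Dma] returns -2",
    ]
    for entry in sample:
        print(parse_timing(entry))
```

Lines that do not match the timing layout, such as the tdc1190 error lines, return None, so the function can be run over the whole log.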

Sergey Kuleshov Aug 15 2011: send Ben's schematic for DC TDC, and mentioned young engineers from Chile for CLAS disassemble/CLAS12 assemble

timeout 2: 9599514 + 64 = 9599578

timeout: 9599579 9599578
SEND_BUFFER_ROC 4
timeout 2: 9599579 + 64 = 9599643
timeout: 9599644 9599643
SEND_BUFFER_ROC 4
interrupt: SEND_BUFFER_ROL1
timeout 2: 9599644 + 64 = 9599708
timeout: 9599709 9599708
SEND_BUFFER_ROC 4
interrupt: SEND_BUFFER_ROL1
attempt to send short buffer failed !!!
timeout 2: 9599709 + 64 = 9599773
timeout: 9599774 9599773
SEND_BUFFER_ROC 4
timeout 2: 9599774 + 64 = 9599838
ERROR1: LINK_sized_write() returns errno=851971 (cc=-1, sizeof(nbytes)=4(104040), netlong=104040)
ERROR: net_thread failed (in LINK_sized_write).
timeout: 9599839 9599838
SEND_BUFFER_ROC 4
ERROR: big0.failure=0, big1.failure=1
interrupt: SEND_BUFFER_ROL1
0x9247630 (coda_proc): timer: 2 microsec (min=0 max=601 rms**2=1)
0x9247630 (coda_proc): timer: 2 microsec (min=0 max=601 rms**2=6)
interrupt: SEND_BUFFER_ROL1



Quota check:

http://cc.jlab.org/cgi-bin/quotacheck.cgi

Nerses:

Home number is +374 10 425049

Cell phone +374 91 206 217


clon02:/etc> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
       inet 131.225.70.20 netmask ff000000 
age0: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
       inet 131.225.70.20 netmask fffffc00 broadcast 131.225.255.255
       ether 0:a0:80:0:52:e5 
eri0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
       inet 129.57.167.2 netmask ffffff00 broadcast 129.57.167.255
       ether 0:3:ba:1d:9b:c0 
clon02:/etc> 
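
Solaris ifconfig prints netmasks in hex. A small Python sketch, using only the standard ipaddress module, that converts the hex mask to a prefix length and computes the broadcast address implied by the mask. The function name describe_interface is hypothetical. For eri0 the computed broadcast matches the listing; for age0 the mask fffffc00 implies 131.225.71.255, not the 131.225.255.255 shown above:

```python
import ipaddress


def describe_interface(addr, hex_mask):
    """Return (prefix length, broadcast address) for an address and a hex netmask."""
    # Convert e.g. "ffffff00" to the dotted form 255.255.255.0.
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    iface = ipaddress.IPv4Interface(f"{addr}/{mask}")
    return iface.network.prefixlen, str(iface.network.broadcast_address)


if __name__ == "__main__":
    # Addresses and masks copied from the ifconfig listing above.
    for name, addr, hex_mask in [
        ("age0", "131.225.70.20", "fffffc00"),
        ("eri0", "129.57.167.2", "ffffff00"),
    ]:
        prefix, broadcast = describe_interface(addr, hex_mask)
        print(f"{name}: /{prefix} broadcast {broadcast}")
```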



SUNW-MSG-ID: ZFS-8000-HC, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: Thu Mar  5 17:41:00 EST 2009
PLATFORM: SUNW,Netra-240, CSN: -, HOSTNAME: clon10
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 2f3d5430-2bbb-c8aa-87d5-f79b44f89edf
DESC: The ZFS pool has experienced currently unrecoverable I/O failures.  Refer to http://sun.com/msg/ZFS-8000-HC for more information.
AUTO-RESPONSE: No automated response will be taken.
IMPACT: Read and write I/Os cannot be serviced.
REC-ACTION:

The pool has experienced I/O failures. Since the ZFS pool property 'failmode' is set to 'wait', all I/Os (reads and writes) are blocked. See the zpool(1M) manpage for more information on the 'failmode' property. Manual intervention is required for I/Os to be serviced. You can see which devices are affected by running 'zpool status -x':


# zpool status -x
  pool: test
 state: FAULTED
status: There are I/O failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        FAULTED      0    13     0  insufficient replicas
          c0t0d0    FAULTED      0     7     0  experienced I/O failures
          c0t1d0    ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list


After you have made sure the affected devices are connected, run 'zpool clear' to allow I/O to the pool again:


# zpool clear test


If I/O failures continue to happen, then applications and commands for the pool may hang. At this point, a reboot may be necessary to allow I/O to the pool again.



mgetty

Sergey --- does this help you ? Today marked the 3rd week since this case was opened....

Paul



-------- Original Message --------
Subject: CASE 66172087
Date: Wed, 28 Jan 2009 12:39:51 -0500
From: Roland 'butch' Morrissette - Sun Microsystems <Roland.Morrissette@sun.com>
To: letta@jlab.org


Paul

Received this from engineering. Hopefully this is of some help.

xdm is creating $HOME/.Xauthority file, xauth only reads it. In $HOME/.Xdefaults file try adding the following entry as an alternate location to see if it works.

DisplayManager.DISPLAY.userAuthDir

DISPLAY should be the actual display, like

DisplayManager.host1:0.0.userAuthDir: /path/to/alternate/file

(If this does not work and the $HOME/.Xauthority file still gets written, try changing the permissions on the $HOME/.Xauthority file to read-only.)

Regards,

--

Roland 'Butch' Morrissette
Sun Service OS Support
Sun Microsystems, Inc.

phone: (781) 442-7112
email: roland.morrissette@sun.com
(800)USA-4SUN (Reference your Case Id #)

My Working Hours : 8am-4pm ET, Monday thru Friday
My Manager's Email: dawn.ball@sun.com




5126 303-6644

CLAS12 DC (Mac 14-nov-2007): 2 stereo, +-6 degrees, good resolution (1% dp/p, 1 mrad angle), six 6-layer superlayers, 112 wires per layer; reconstruction improvements: use double hits, use segment angle in road dictionary, early l-r ambiguity resolution (now it is resolved locally and then corrected after the track is reconstructed ?), find tracks with no TOF hit and cut off accidental tracks (using residuals ?), derive off-diagonal terms in error matrix


SVT readout:

SVX4 - old chip
132ns clock, 40 pipeline cells -> 5.2us trigger latency; can select readout window like pipeline TDCs (position defined +-132us, window size 132ns fixed)
initial part stores up to 4 events, they are rotating internally
32 bits per hit, 128 channels per chip, ... -> 3.2us per chip to get data from the chip to the buffer of 512 full events
L2 pipeline 16us
L1 latency from Amrit is 3us, Amrit will try to make it 4us
Use the same clock as the entire CLAS12 trigger system (256MHz)
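
The 5.2us figure follows from the clock period times the pipeline depth: 132 ns x 40 cells = 5.28 us. A small Python sketch of that arithmetic, which also estimates how many pipeline cells an L1 latency of 3us or 4us would span. The helper names are hypothetical, and rounding the latency up to whole cells is an assumption, not something taken from the SVX4 documentation:

```python
def pipeline_depth_ns(clock_ns, cells):
    """Total time covered by the pipeline: one cell per clock period."""
    return clock_ns * cells


def cells_for_latency(latency_ns, clock_ns):
    """Whole number of cells spanned by a trigger latency (rounded up)."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-latency_ns // clock_ns)


if __name__ == "__main__":
    depth = pipeline_depth_ns(132, 40)
    print(f"SVX4 pipeline depth: {depth} ns")
    for latency_ns in (3000, 4000):
        cells = cells_for_latency(latency_ns, 132)
        print(f"L1 latency {latency_ns} ns spans {cells} of 40 cells")
```

Under these assumptions both the 3us and the 4us L1 latency fit inside the 40-cell pipeline.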

FSSR 125ns instead of 132ns (built for BTeV, never used)
self-triggering, do not need L1ACCEPT, L2 pipe 16us can be implemented

SVX4 goes to review !!!

Generic DAQ drawing will be sent to Amrit in April