EVIO

From CLONWiki
Revision as of 08:02, 3 February 2012 by 98.166.92.51 (talk)

EVIO is a data format and the corresponding software used at JLAB.

EVIO Manual version 2.0: pdf doc.

Event Building EVIO Scheme: pdf ppt pptx.

Not in the manual yet: the 8-word EVIO block (record) header:

1. block length
2. block number
3. header length=8
4. event count (number of banks, not counting the dictionary if one is inserted)
5. not used
6. bit info [31:8], version [7:0]; if a dictionary is present, bit 8 (counting from 0) is set
7. unused
8. magic int = 0xc0da0100
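
As a minimal sketch of how the eight words above might be read, the Python snippet below unpacks a 32-byte header and uses the magic word to pick the byte order. The function name parse_block_header and the dictionary keys are illustrative and not part of the EVIO library; detecting endianness from the magic word is an assumption of this sketch.

```python
import struct

EVIO_MAGIC = 0xc0da0100  # word 8 of the block header, as listed above


def parse_block_header(buf):
    """Decode the 8-word block header from the first 32 bytes of buf."""
    if len(buf) < 32:
        raise ValueError("need at least 32 bytes for the block header")
    # Try both byte orders; the magic word tells us which one applies.
    for order in ("<", ">"):
        words = struct.unpack(order + "8I", buf[:32])
        if words[7] == EVIO_MAGIC:
            break
    else:
        raise ValueError("magic word 0xc0da0100 not found")
    info = words[5]
    return {
        "byte_order": "little" if order == "<" else "big",
        "block_length": words[0],
        "block_number": words[1],
        "header_length": words[2],
        "event_count": words[3],
        "version": info & 0xff,                # bits [7:0]
        "bit_info": info >> 8,                 # bits [31:8]
        "has_dictionary": bool(info & (1 << 8)),
    }


# Example with a made-up header: version 4, dictionary bit set.
raw = struct.pack(">8I", 100, 1, 8, 3, 0, 0x104, 0, EVIO_MAGIC)
header = parse_block_header(raw)
```

Words 5 and 7 are skipped because the list above marks them as unused.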


CLAS12 guidelines:

  • 'EVIO BANK' (not 'EVIO SEGMENT' or 'EVIO TAGSEGMENT') will be used for data banks; the bank header has 2 words: the first is the exclusive bank length in words, the second contains 3 fields: tag[31:20], contentType[15:8] and num[7:0]
  • no mixed formats (for example 'int' and 'short') will be allowed in the same bank; the format is described by the 8-bit field contentType (or use type=0 and do a custom swap ?), except for formatted banks (see below)
  • the BOS bank name will be replaced by the 12-bit field tag, and the BOS bank number by the 8-bit field num
  • a bank dictionary (replacing the former DDL) will be provided in some form (ASCII ?) and will contain a description of every bank tag, most importantly the number of 'columns' and the meaning of every column (for example id,tdcl,adcl,tdcr,tdcl); it will help to decode data from the bank and can be used by evio viewers; it will be written at least at the beginning of every file
  • data banks will be in 'unsigned short' or 'unsigned int' format; the first element will be the ID ('(slot<<8)+channel' without translation, '(layer<<8)+wire' or similar with translation)
  • some bank examples:
tdc version 1: ID, TDC
tdc version 2: ID, N_HITS, TDC_1, ... , TDC_N
adc (raw data hits): ID, N_SAMPLES(=window_width), ADC_1, ... , ADC_N
adc (window integral - 32bit): ID, ADC_SUM(can be 22 bits+1 bit overflow)
adc (pulse data hits): ID, N_SAMPLES, FIRST_SAMPLE, ADC_1, ..., ADC_N
  • tags may be given to all existing CLAS banks, making sure CLAS12 bank tags do not overlap with them; this will allow old and new banks to be processed together
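
The field layout and the ID convention in the guidelines above can be sketched in Python as follows. All function names are hypothetical. The bit positions are taken from this page (tag[31:20], contentType[15:8], num[7:0]), and the ID is read as (slot<<8)+channel, with the shift applied before the addition.

```python
def decode_bank_header_word(word):
    """Split the second bank header word into the three fields listed above."""
    return {
        "tag": (word >> 20) & 0xfff,         # tag[31:20], 12 bits
        "content_type": (word >> 8) & 0xff,  # contentType[15:8]
        "num": word & 0xff,                  # num[7:0]
    }


def make_id(slot, channel):
    """Pack an untranslated ID; the parentheses make the shift happen first."""
    return (slot << 8) + channel


def split_id(ident):
    """Inverse of make_id: return (slot, channel)."""
    return ident >> 8, ident & 0xff


def decode_tdc_v2(words):
    """Decode 'tdc version 2' rows: ID, N_HITS, TDC_1, ..., TDC_N."""
    rows, i = [], 0
    while i < len(words):
        if i + 1 >= len(words):
            raise ValueError("row is missing its N_HITS word")
        ident, nhits = words[i], words[i + 1]
        hits = list(words[i + 2:i + 2 + nhits])
        if len(hits) != nhits:
            raise ValueError("row is shorter than N_HITS says")
        rows.append((split_id(ident), hits))
        i += 2 + nhits
    return rows


# Made-up data: two rows, (slot 3, channel 5) and (slot 4, channel 6).
fields = decode_bank_header_word((0x123 << 20) | (0x01 << 8) | 7)
rows = decode_tdc_v2([make_id(3, 5), 2, 100, 200, make_id(4, 6), 1, 50])
```

The same row-walking loop would apply to the ADC examples, with N_SAMPLES (and FIRST_SAMPLE for pulse data) in place of N_HITS.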


EVIO possible extensions:

  • file split - implemented in CLAS
  • 'dictionary-based' or 'format-based' bank swap (will allow mixed formats); can be enforced if contentType=0


EVIO format-based swap

Sergey B. proposed 2 functions to handle format-based banks in EVIO. The idea is to have a designated bank with contentType=0xf (or 0x11, 0x12 etc), which will contain 2 internal banks (or segments). The first bank (or segment) will contain the bank format description, and the second bank (or segment) will contain the data. The format can be stored as an ASCII string or in binary form. Two evio swap functions are provided to handle format-based banks: eviofmt() converts the format from character string form to binary form, and eviofmtswap() swaps data between big-endian and little-endian forms according to the format (using the binary form). The functions were tested using the included test program, but more detailed tests and checks are required, as well as integration into the EVIO package. Available formats generally follow FORTRAN rules, with some restrictions/additions.

Possible formats and corresponding binary codes:

      1    'i'   unsigned int
      2    'F'   floating point
      3    'a'   8-bit char (C++)
      4    'S'   short
      5    's'   unsigned short
      6    'C'   char
      7    'c'   unsigned char
      8    'D'   double (64-bit float)
      9    'L'   long long (64-bit int)
     10    'l'   unsigned long long (64-bit int)
     11    'I'   int
     12    'A'   hollerith (4-byte char with int*32 bytes order for backward compatibility with BOS)

Examples:

"3I,4S,5I"
"3I,4S,(F,I)"
"3I,4S,3(F,I),F,I,2S,L"
"3I,4S,2(2(F,I)),2S,L"
"3I,4S,3(F,I),F,NS,L"

Rules:

  • when format control reaches the last (outer) right parenthesis and there are data left, the format restarts at the group closed by the last preceding right parenthesis, including its group repeat count, if any, or, if no group specification exists, at the first left parenthesis of the format specification; if none of the above exist, it restarts from the beginning of the format control (FORTRAN rules)
  • the repeat count must be between 2 and 15; a missing repeat count is assumed to be 1
  • if the repeat count is 'N' instead of digits, it is taken from the data, assuming int*32 format; this allows variable-length rows (see the last example)
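
To illustrate the format syntax and the repeat-count rules, here is a small Python sketch that parses a format string into a nested list. It is not eviofmt() and does not produce its binary form; parse_format, fixed_size and ELEMENT_SIZE are hypothetical names. Element sizes not stated in the table (for example 4 bytes for 'I' and 'F') are assumed to be the usual C sizes, and the 2 to 15 limit is assumed to apply to group repeat counts as well.

```python
import re

# Type letters and element sizes in bytes. Sizes for D, L, l, a and A come
# from the table above; the others are assumed to be the usual C sizes.
ELEMENT_SIZE = {"i": 4, "F": 4, "a": 1, "S": 2, "s": 2, "C": 1,
                "c": 1, "D": 8, "L": 8, "l": 8, "I": 4, "A": 4}

# Optional repeat count (digits or 'N') followed by a type letter or '(',
# or a bare ')' or ','.
TOKEN = re.compile(r"(\d+|N)?([A-Za-z(])|([),])")


def parse_format(fmt):
    """Return a nested list of (repeat, item) pairs.

    repeat is an int or the string 'N'; item is a type letter or a nested list.
    """
    stack = [[]]   # stack[-1] is the group currently being filled
    repeats = []   # repeat counts of the groups that are still open
    pos = 0
    while pos < len(fmt):
        m = TOKEN.match(fmt, pos)
        if not m:
            raise ValueError(f"bad format at position {pos}")
        pos = m.end()
        count, item, punct = m.groups()
        if punct == ",":
            continue
        if punct == ")":
            if not repeats:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append((repeats.pop(), group))
            continue
        if count is None:
            repeat = 1                      # missing repeat count means 1
        elif count == "N":
            repeat = "N"                    # count is read from the data
        else:
            repeat = int(count)
            if not 2 <= repeat <= 15:
                raise ValueError("repeat count must be between 2 and 15")
        if item == "(":
            repeats.append(repeat)
            stack.append([])
        elif item in ELEMENT_SIZE:
            stack[-1].append((repeat, item))
        else:
            raise ValueError(f"unknown type letter {item!r}")
    if repeats:
        raise ValueError("unbalanced '('")
    return stack[0]


def fixed_size(items):
    """Bytes covered by one pass over the format; fails if it contains 'N'."""
    total = 0
    for repeat, item in items:
        if repeat == "N":
            raise ValueError("size depends on the data")
        size = ELEMENT_SIZE[item] if isinstance(item, str) else fixed_size(item)
        total += repeat * size
    return total


parsed = parse_format("3I,4S,3(F,I),F,NS,L")
```

The sketch does not implement the restart rule for leftover data; a swap routine built on top of it would have to add that.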


EVIO formatted banks - EVIO version 4.0 and later

EVIO banks with contentType=0xf are formatted banks. They contain two parts: a tagsegment with the format description as an ASCII string, and a bank with the data. All data banks produced by CLAS12 DAQ will use primitive data types whenever possible, or the formatted type otherwise. Formatted banks will be used if different data types are needed inside the same bank, and/or if the bank contains rows of variable length. The following formats will be used for basic data banks as they are inserted into the Event Transfer system from the Event Builder:

  • CAEN v1190/v1290 TDCs: "2c,Ns" - slot, channel, the number of hits, hits
  • JLAB FADC250 in WINDOW RAW DATA mode: "2c,Ns" - slot, channel, the number of samples, samples
  • JLAB FADC250 in PULSE RAW DATA mode: "2c,N(c,Ns)" - slot, channel, the number of pulses (first sample number, the number of samples, samples)
  • JLAB FADC250 in PULSE INTEGRAL mode: "2c,N(c,i)" - slot, channel, the number of pulses (quality factor, pulse integral)
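
As an illustration of the "2c,Ns" layout used for the TDC and WINDOW RAW DATA banks, the Python sketch below walks a data payload row by row. The function name decode_2c_ns is hypothetical. It assumes that 'N' is stored as an unsigned 32-bit integer, that fields are packed without alignment gaps, that the format restarts from its beginning for every row, and that fewer than 6 trailing bytes are padding.

```python
import struct


def decode_2c_ns(data, byte_order="<"):
    """Decode rows of "2c,Ns": slot, channel, N, then N unsigned shorts.

    byte_order is "<" for little-endian or ">" for big-endian data.
    """
    rows, pos = [], 0
    # A row needs at least 2 bytes (slot, channel) plus a 4-byte count.
    while len(data) - pos >= 6:
        slot, channel, n = struct.unpack_from(byte_order + "BBI", data, pos)
        pos += 6
        # Raises struct.error if the payload ends before N values are read.
        values = struct.unpack_from(f"{byte_order}{n}H", data, pos)
        pos += 2 * n
        rows.append((slot, channel, list(values)))
    return rows


# Made-up payload: slot 3 channel 5 with two hits, slot 4 channel 6 with one.
payload = (struct.pack("<BBI2H", 3, 5, 2, 100, 200)
           + struct.pack("<BBI1H", 4, 6, 1, 50))
tdc_rows = decode_2c_ns(payload)
```

The nested formats "2c,N(c,Ns)" and "2c,N(c,i)" would need one more loop over the pulses inside each row.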

For 'translated' banks, geographic identifiers (slot-channel) will be replaced by logical identifiers, for example layer-wire.

Every subsystem will be assigned unique 16-bit bank tags, while the 8-bit bank number can be used for data separation inside a subsystem, for example for different crates or sectors. The following scheme is proposed:

     31..........24 23..........16 15...........8 7............0
                        l e n g t h                                <- bank[0]
             system#                    0xf          number        <- bank[1]
          tag           0x6                 length                 <- segment
                    "3I,4S,3(F,I),F,NS,L"                          <- format string
                        l e n g t h                                <- bank[0]
            subsystem#                  0x0          number        <- bank[1]
                                                                   <- data