EVIO

From CLONWiki
Revision as of 12:07, 1 February 2012 by 129.57.81.115 (talk)

EVIO is a data format and the corresponding software used at JLAB.

EVIO Manual version 2.0: pdf doc.

Event Building EVIO Scheme: pdf ppt pptx.

Not in the manual yet: the 8-word EVIO block (record) header:

1. block length
2. block number
3. header length=8
4. event count (# of banks, without dictionary if it is inserted)
5. not used
6. bit info [31-8], version [7-0]; if a dictionary is present, bit 8 (counting from 0) is set
7. unused
8. magic int = 0xc0da0100
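
The header layout above can be sketched as a small decoder. This is only an illustration, not code from the EVIO package: the names BlockHeader and parse_block_header are hypothetical, and using the magic word to detect byte order is an assumption that the list above does not state.

```python
import struct
from typing import NamedTuple

EVIO_MAGIC = 0xC0DA0100   # word 8 of the header, as listed above
HEADER_WORDS = 8          # word 3 of the header is expected to equal this


class BlockHeader(NamedTuple):
    """Hypothetical container for the fields of the 8-word block header."""
    block_length: int
    block_number: int
    header_length: int
    event_count: int
    version: int
    has_dictionary: bool
    byte_order: str  # '>' for big-endian, '<' for little-endian


def parse_block_header(buf: bytes) -> BlockHeader:
    """Decode the first 8 32-bit words of a block.

    Assumption: the byte order is whichever one makes word 8 equal the magic int.
    """
    if len(buf) < 4 * HEADER_WORDS:
        raise ValueError("need at least 32 bytes for a block header")
    for order in (">", "<"):
        words = struct.unpack(order + "8I", buf[:32])
        if words[7] == EVIO_MAGIC:
            break
    else:
        raise ValueError("magic int 0xc0da0100 not found in word 8")
    info = words[5]  # word 6: bit info [31-8], version [7-0]
    return BlockHeader(
        block_length=words[0],
        block_number=words[1],
        header_length=words[2],
        event_count=words[3],
        version=info & 0xFF,
        has_dictionary=bool(info & (1 << 8)),  # bit 8, counting from 0
        byte_order=order,
    )
```

Words 5 and 7 are ignored here because the list marks them as unused.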


CLAS12 guidelines:

  • 'EVIO BANK' (not 'EVIO SEGMENT' or 'EVIO TAGSEGMENT') will be used for data banks; the bank header has 2 words: the first is the exclusive bank length in words, the second contains 3 fields: tag[31:20], contentType[15:8] and num[7:0]
  • no mixed formats (for example 'int' and 'short') will be allowed in the same bank, except for formatted banks (see below); the format is described in the 8-bit field contentType (or use type=0 and do a custom swap ?)
  • BOS bank name will be replaced with 12-bit field tag, BOS bank number will be replaced with 8-bit field num
  • a bank dictionary (replacing the former DDL) will be provided in some form (ASCII ?) and will contain a description of every bank tag; most important are the number of 'columns' and the meaning of every column (for example id,tdcl,adcl,tdcr,tdcl), which will help to decode data from the bank and can be used by evio viewers; it will be written at least at the beginning of every file
  • data banks will be in 'unsigned short' or 'unsigned int' format; the first element will be the ID ('(slot<<8)+channel' without translation, '(layer<<8)+wire' or similar with translation)
  • some bank examples:
tdc version 1: ID, TDC
tdc version 2: ID, N_HITS, TDC_1, ... , TDC_N
adc (raw data hits): ID, N_SAMPLES(=window_width), ADC_1, ... , ADC_N
adc (window integral - 32bit): ID, ADC_SUM(can be 22 bits+1 bit overflow)
adc (pulse data hits): ID, N_SAMPLES, FIRST_SAMPLE, ADC_1, ..., ADC_N
  • tags may be assigned to all existing CLAS banks, making sure CLAS12 bank tags do not overlap with them; this will allow old and new banks to be processed together
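
The bit layout and bank examples in the guidelines above can be sketched as follows. All function names are hypothetical. Limiting each half of the ID to 8 bits and leaving bits [19:16] of the header word at zero are assumptions, since the guidelines do not say how those cases are handled.

```python
def pack_bank_word(tag: int, content_type: int, num: int) -> int:
    """Build the second bank header word: tag[31:20], contentType[15:8], num[7:0].

    Bits [19:16] are not assigned in the guidelines; this sketch leaves them zero.
    """
    if not 0 <= tag < (1 << 12):
        raise ValueError("tag must fit in 12 bits")
    if not 0 <= content_type < (1 << 8):
        raise ValueError("contentType must fit in 8 bits")
    if not 0 <= num < (1 << 8):
        raise ValueError("num must fit in 8 bits")
    return (tag << 20) | (content_type << 8) | num


def unpack_bank_word(word: int) -> tuple:
    """Return (tag, contentType, num) from the second bank header word."""
    return (word >> 20) & 0xFFF, (word >> 8) & 0xFF, word & 0xFF


def make_id(high: int, low: int) -> int:
    """Combine slot/channel (or layer/wire) into one ID.

    Assumption: each part fits in 8 bits, so the ID fits an unsigned short.
    """
    if not (0 <= high < 256 and 0 <= low < 256):
        raise ValueError("both parts must fit in 8 bits")
    return (high << 8) + low


def decode_tdc_v2(words) -> list:
    """Decode a 'tdc version 2' payload: ID, N_HITS, TDC_1, ..., TDC_N, repeated."""
    hits = []
    pos = 0
    while pos < len(words):
        if pos + 2 > len(words):
            raise ValueError("truncated row header")
        ident, n_hits = words[pos], words[pos + 1]
        end = pos + 2 + n_hits
        if end > len(words):
            raise ValueError("truncated TDC list")
        hits.append((ident, list(words[pos + 2:end])))
        pos = end
    return hits
```

For example, pack_bank_word(0xABC, 5, 7) gives 0xABC00507, and make_id(3, 12) gives 780.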


EVIO possible extensions:

  • file split - implemented in CLAS
  • 'dictionary-based' or 'format-based' bank swap (will allow mixed formats); can be enforced if contentType=0


EVIO format-based swap

Sergey B. proposed 2 functions to handle format-based banks in EVIO. The idea is to have a designated bank with contentType=0x11 (or 0x12) that contains 2 internal banks (or segments). The first bank (or segment) contains the bank format description, and the second contains the data. The format can be stored as an ASCII string or in binary form. Two EVIO swap functions are provided to handle format-based banks: eviofmt() converts a format from character-string form to binary form, and eviofmtswap() swaps data between big-endian and little-endian according to the format (using the binary form). The functions were tested using the included test program, but more detailed tests and checks are required, as well as integration into the EVIO package. The available formats generally follow FORTRAN rules, with some restrictions.

Possible formats and corresponding binary codes:

      1    'i'   unsigned int
      2    'F'   floating point
      3    'a'   8-bit char (C++)
      4    'S'   short
      5    's'   unsigned short
      6    'C'   char
      7    'c'   unsigned char
      8    'D'   double (64-bit float)
      9    'L'   long long (64-bit int)
     10    'l'   unsigned long long (64-bit int)
     11    'I'   int
     12    'A'   Hollerith (4-byte char with 32-bit int byte order, for backward compatibility with BOS)
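
To illustrate what a format-driven swap involves, here is a minimal sketch for a flat list of (count, letter) pairs. It is not the eviofmtswap() implementation: the name swap_flat is hypothetical, and the element sizes for int (4 bytes), short (2), char (1) and 'F' (4) are assumptions. Only the 64-bit, 8-bit and 4-byte sizes are stated in the table. Parentheses and the 'N' repeat count are not handled.

```python
# letter -> (binary code from the table above, assumed element size in bytes)
FORMAT_CODES = {
    "i": (1, 4), "F": (2, 4), "a": (3, 1), "S": (4, 2),
    "s": (5, 2), "C": (6, 1), "c": (7, 1), "D": (8, 8),
    "L": (9, 8), "l": (10, 8), "I": (11, 4), "A": (12, 4),
}


def swap_flat(data: bytes, items) -> bytes:
    """Reverse the bytes of each element described by (count, letter) pairs.

    One-byte elements are left as they are. 'A' is swapped like a 32-bit int,
    following the table's note on Hollerith byte order.
    """
    out = bytearray()
    pos = 0
    for count, letter in items:
        _code, size = FORMAT_CODES[letter]
        for _ in range(count):
            chunk = data[pos:pos + size]
            if len(chunk) != size:
                raise ValueError("data shorter than the format describes")
            out += chunk[::-1]  # byte reversal converts between endiannesses
            pos += size
    if pos != len(data):
        # This sketch does not restart the format on leftover data.
        raise ValueError("data longer than the format describes")
    return bytes(out)
```

Because a byte reversal is its own inverse, applying swap_flat twice with the same items returns the original data.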

Examples:

"3I,4S,5I"
"3I,4S,(F,I)"
"3I,4S,3(F,I),F,I,2S,L"
"3I,4S,2(2(F,I)),2S,L"
"3I,4S,3(F,I),F,NS,L"

Rules:

  • when format control reaches the last (outer) right parenthesis and data remain, the format restarts at the group closed by the last preceding right parenthesis, including its group repeat count, if any; if no such group exists, it restarts at the first left parenthesis of the format specification; if neither exists, it restarts from the beginning of the format (FORTRAN rules)
  • the repeat count must be between 2 and 15; a missing repeat count is assumed to be 1
  • if the repeat count is 'N' instead of digits, it will be taken from the data, assuming 32-bit int format; this allows variable-length rows (see last example)
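
The syntax used in the examples and rules above can be sketched as a small recursive parser. The nested-list result is a hypothetical representation chosen for readability; it is not the binary form produced by eviofmt(), which is not described here. The restart rule for leftover data is not implemented.

```python
LETTERS = set("iFaSsCcDLlIA")  # format letters from the table of codes
DIGITS = "0123456789"


def parse_format(fmt: str) -> list:
    """Parse e.g. "3I,4S,3(F,I),F,NS,L" into a list of (count, item) pairs.

    count is an int (1 if omitted) or the string 'N'; item is a format
    letter or a nested list for a parenthesized group.
    """
    fmt = fmt.replace(" ", "")
    pos = 0

    def parse_group(closing: bool) -> list:
        nonlocal pos
        items = []
        while True:
            start = pos
            while pos < len(fmt) and fmt[pos] in DIGITS:
                pos += 1
            if pos > start:
                count = int(fmt[start:pos])
                if not 2 <= count <= 15:
                    raise ValueError("repeat count must be between 2 and 15")
            elif pos < len(fmt) and fmt[pos] == "N":
                count = "N"  # count is read from the data at swap time
                pos += 1
            else:
                count = 1    # missing repeat count
            if pos >= len(fmt):
                raise ValueError("unexpected end of format")
            ch = fmt[pos]
            pos += 1
            if ch == "(":
                items.append((count, parse_group(True)))
            elif ch in LETTERS:
                items.append((count, ch))
            else:
                raise ValueError(f"unexpected {ch!r} in format")
            if pos < len(fmt) and fmt[pos] == ",":
                pos += 1
                continue
            if closing:
                if pos < len(fmt) and fmt[pos] == ")":
                    pos += 1
                    return items
                raise ValueError("missing ')'")
            if pos == len(fmt):
                return items
            raise ValueError(f"unexpected {fmt[pos]!r} in format")

    return parse_group(False)
```

With this sketch, parse_format("3I,4S,(F,I)") returns [(3, 'I'), (4, 'S'), (1, [(1, 'F'), (1, 'I')])], and the "NS" item of the last example becomes ('N', 'S').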


EVIO format-based banks - 2011 implementation