ISDAT


Page maintained by
Reine Gill

The ISDAT Instrument Measured Database Management System

This section will probably be rewritten in the near future; I feel it does not give a developer a good start. (RG)

Background

Numerous types of instruments form the basis of experimental science. Typically, an instrument stores data in some format which is suitable for that instrument. We call such a format the raw instrument format. The raw instrument format is typically not self-describing to humans. To make use of the data stored by an instrument, it is necessary to transform the raw data into some self-evident representation such as a plot or a table. We call such a representation the human-interpretable format. The transformation from the raw instrument format to the human-interpretable format must be developed for each instrument. ISDAT autonomously provides a human-interpretable data interface to raw data but also to standard data formats.

Purpose

The purpose of ISDAT is to provide high-level access to stored raw data acquired from arbitrary instruments.

Main Motivation

The main motivation for ISDAT rests on the following facts:

1. There is no unique standard for storage of raw data.
2. Due to point 1, it is necessary to develop code to parse the raw data.
3. Data analysis benefits if different types of raw data can be accessed in a uniform way.

What ISDAT is not

a real-time system or a data acquisition system
a data format
merely a data analysis tool

Related Systems

MADRIGAL (for upper atmospheric data)
Papco
QSAS
SDDAS (space physics data)
SDT
SEDAT (space weather data)
Starlink (for astronomical data)

The Philosophy of Measurement and uniform data access

To acquire knowledge of physical reality it is necessary to partake in measurement acts. The results of natural science must ultimately depend on measurement acts consisting of data collected by (non-subjective) instruments. The philosophy of science has come to an understanding of such measurement acts. Within the modern model, a measurement act is associated with a certain region within the framework of space and time. Measurements must ultimately be attributable to what is known in the theory of relativity as a WorldLine, or to a collection of WorldLines known as a fiber bundle.

ISDAT uses the concept of the WorldLine to organize the reference to all data within an ISDAT database. This provides a uniform and powerful way of accessing any type of instrument measured data which is archived according to this philosophy.

A detailed analysis of the WorldLine viewpoint of data access reveals that it is not particularly useful for describing certain types of scientific data such as GIS (even though it is possible to encode some GIS information within this model) and, ironically, very amenable to economic data such as stock market indexes (as they are associated with a measurement at a certain time at a certain stock exchange). However, it is the intuitive power of this referencing method which is important, not a discriminating definition of its end usage.

Concepts

Instrument measured data (Im_data)

ISDAT is a Data Base Management System (DBMS) for Instrument Measured Data (Im_data). That is, it manages data which originates from an arbitrary measurement instrument. The instrument may measure some physical parameter, but it must at least react in some deterministic manner to the physical surroundings in which it is located.

WorldLineData

Data organized in such a way that it is indexed according to its coordinates in space-time.

Raw data

Refers to data which is encoded in an arbitrary format.

Data factory

A process, device or agent by which Im_data becomes accessible to an ISDAT FormatInterpreter.

FormatInterpreter

An ISDAT component which can parse files of data encoded in some specific format and extract a particular Im_data. A FormatInterpreter is reminiscent of a driver in ODBC parlance. It is the file format and the particular Im_data which distinguish individual FormatInterpreters from one another. To become a FormatInterpreter, a component must enroll in the ISDAT FormatInterpreter participation program. At present, ISDAT has the following FormatInterpreters:

Emma and Linda on Astrid-2
F1, F2 and F4 on Freja
V4 on Viking
Cassini
Eiscat
WEC on Cluster

ISDAT database logical schema (Relational Db model)

ISDAT can be viewed according to the entity-relationship (E-R) model, which is nowadays the dominant model for describing databases. However, the E-R model is not very amenable to the database model of ISDAT because of the latter's highly arrayed structure. It is therefore not useful in practice and is given here for illustrative purposes only.

Public project independent tables

Query Table
Project

Public project dependent tables

GetContents Table
Project	Duration

GetData Table
EpochMoment	Im_data

Position Table
EpochMoment	Location

Attitude Table
EpochMoment	Attitude

Internal Tables

Who Interprets table
LogicalInstrument	FormatInterpreter

FormatInterpreter Projects
FormatInterpreter	File system directory list

Index File
DataRequest	File
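
For illustration only, two of the public project dependent tables above can be written down as relational tables that share the EpochMoment column. The following is a minimal Python sketch using the standard sqlite3 module. The column types, the sample rows and the helper function world_line are assumptions made for this example and are not part of ISDAT.

```python
import sqlite3

# Hypothetical rendering of the GetData and Position tables. Column types
# and sample rows are assumptions made for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GetData  (EpochMoment REAL PRIMARY KEY, Im_data  REAL);
    CREATE TABLE Position (EpochMoment REAL PRIMARY KEY, Location TEXT);
""")
conn.executemany("INSERT INTO GetData VALUES (?, ?)",
                 [(0.0, 1.5), (1.0, 1.7), (2.0, 1.6)])
conn.executemany("INSERT INTO Position VALUES (?, ?)",
                 [(0.0, "x0"), (1.0, "x1"), (2.0, "x2")])


def world_line(conn, start, stop):
    """Return (EpochMoment, Location, Im_data) rows with start <= t < stop."""
    cursor = conn.execute(
        """SELECT d.EpochMoment, p.Location, d.Im_data
           FROM GetData AS d
           JOIN Position AS p ON d.EpochMoment = p.EpochMoment
           WHERE d.EpochMoment >= ? AND d.EpochMoment < ?
           ORDER BY d.EpochMoment""",
        (start, stop))
    return cursor.fetchall()


rows = world_line(conn, 0.0, 2.0)
```

Joining on EpochMoment attaches a location to every measured value, which is the sense in which the data is referenced along a WorldLine.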

Client/Server architecture

ISDAT v2 is built according to a client/server architecture with a typical 2-tier structure. In contrast, ISDAT v3 can be represented as a 3-tier system consisting of the FormatInterpreters, the TransactionManager and the DataConsumers. The TransactionManager and the FormatInterpreters together make up the server side of ISDAT, which is called the DataBaseHandler (DBH). DataConsumers are the client side of the ISDAT client/server dichotomy.

There are two kinds of clients: a dedicated client is one shipped with the ISDAT software, while the term ISDAT client refers to any software which can correctly use the ISDAT ClientServices interface.

The conceptual interfaces of ISDAT

To the outside world, ISDAT is defined through its logical interfaces. A developer or programmer wishing to employ ISDAT need only consider one of two levels of interface. They are:

FormatInterpreter::Interface
Service Name	Arguments	Returns
ListServices		Array of Methods
DumpData	File Path or File	Array of Im_data

This is the lowest level. It represents the minimum requirements to access raw data.

DumpData is a minimal data access method. It is essentially an interface to a one-pass parser for a particular data format. A one-pass parser takes a file or data source instance and returns a stream of arrayed data.
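
A minimal sketch of what such an interface might look like is given here in Python for brevity. The raw format (big-endian records of a 4-byte unsigned time tag followed by a 2-byte signed sample) is invented for the example, as are the class and method names. Only the service names ListServices and DumpData come from the table above.

```python
import io
import struct


class ToyFormatInterpreter:
    """Hypothetical FormatInterpreter for an invented fixed-record format."""

    # Each record: 4-byte unsigned time tag + 2-byte signed sample, big-endian.
    RECORD = struct.Struct(">Ih")

    def list_services(self):
        """ListServices: names of the services this interpreter offers."""
        return ["ListServices", "DumpData"]

    def dump_data(self, source):
        """DumpData: read the source once, front to back, yielding records."""
        # Accept either a file path or an already opened binary stream.
        stream = open(source, "rb") if isinstance(source, str) else source
        try:
            while True:
                chunk = stream.read(self.RECORD.size)
                if len(chunk) < self.RECORD.size:
                    break  # end of data; a trailing partial record is dropped
                yield self.RECORD.unpack(chunk)
        finally:
            if isinstance(source, str):
                stream.close()


# Two records of the invented format, held in memory instead of a file.
raw = struct.pack(">Ih", 10, -3) + struct.pack(">Ih", 11, 7)
records = list(ToyFormatInterpreter().dump_data(io.BytesIO(raw)))
```

The parser never seeks backwards and keeps no state beyond the current record, which is what makes it one-pass.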

To give a language which provides access to the raw data but which, conceptually speaking, is at the scientific or physical data level, ISDAT provides the entry-level interface:

ClientServices::Interface
Service Name	Arguments	Returns
ListServices		Array of Methods
GetContents	LogicalInstrument	Array of Interval
GetData	DataRequest	DataObject
Query	LogicalInstrument	Array of LogicalInstrument

It is naturally targeted towards clients.

One of its important features is that it enforces file or data source (an ODBC term) hiding. At the science level the actual storage of data is not important; only the WorldLine referencing is needed. Besides bringing the interface closer to scientific queries, this also facilitates systematic searches through whole databases.
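
The idea of file hiding can be sketched with a small in-memory stand-in for the ClientServices interface, written in Python. Everything here is hypothetical: the class name, the method signatures, the instrument names (borrowed from the form of the DataIdentifier example elsewhere on this page) and the interpretation of Query as a name-prefix search are assumptions for illustration, not the ISDAT API.

```python
class ToyClientServices:
    """Hypothetical in-memory stand-in for the ClientServices interface."""

    def __init__(self, index):
        # index maps a LogicalInstrument name to a list of
        # (start, stop, samples) entries. Where the samples came from
        # (file, directory, host) is never exposed to the client.
        self._index = index

    def list_services(self):
        """ListServices: names of the services offered."""
        return ["ListServices", "GetContents", "GetData", "Query"]

    def query(self, prefix):
        """Query: logical instruments whose name starts with prefix."""
        return sorted(name for name in self._index if name.startswith(prefix))

    def get_contents(self, instrument):
        """GetContents: intervals for which the instrument has data."""
        return [(start, stop) for start, stop, _ in self._index[instrument]]

    def get_data(self, instrument, start, stop):
        """GetData: (time, value) samples with start <= time < stop."""
        out = []
        for _, _, samples in self._index[instrument]:
            out.extend((t, v) for t, v in samples if start <= t < stop)
        return out


services = ToyClientServices({
    "Cluster/3/efw": [(0, 10, [(1, 0.5), (4, 0.7)]), (20, 30, [(21, 0.9)])],
    "Cluster/4/efw": [(0, 10, [(2, 0.1)])],
})
```

A client asks only for an instrument and a time range. The request for times 0 to 25 above spans two stored intervals, yet the caller never learns how many files or data sources were involved.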

Besides its minimum set of atomic methods, ClientServices has an expandable set of derived client services which may be named in the ListServices method. A server may thereby provide, for example, specially designed pattern recognition algorithms to be used by clients to find particular events. Other important derived client services will include transformation operators, e.g. ISDAT WorldLine data -> HDF, CDF, netCDF.

In practice the logical interfaces are implemented within some environment which enables distributed access. For the moment only two implementations exist:

C library using TCP/IP or streams
Java class using TCP/IP or streams

In the future, there could be interfaces implemented in

CORBA
RMI
ODBC

The FormatInterpreter ISDAT participation program

To facilitate the inclusion of new data formats into an ISDAT database, ISDAT specifies a program for how a piece of data retrieval software can qualify as an ISDAT FormatInterpreter. This program allows the integrators of the new data format the choice between numerous levels of interfacing sophistication. At the lowest or entry level, the component must provide a mapping to the DumpData query.
The approach is therefore to provide a minimal query language for all aspiring new formats combined with ensured, well known and uniform high-level services of the ClientServices interface with a syntax amenable to a scientist.

OOP objects of ISDAT interfaces

The ISDAT objects are the actual building blocks of the ISDAT source code and programming interface. They are rigorously defined concepts which make up the vocabulary of Im_data transactions.

LogicalInstrument

This specifies the source in which data is stored.

WorldLineData

Contains the Im_data encoded as the so-called "DataObject" along with associated attributes.

DataRequest

This object contains all the necessary information for making a meaningful request for a DataObject. It consists of a LogicalInstrument object and

DataIdentifier

Used to uniquely identify a piece of Im_data. The structure is reminiscent of a URL and is known as a URI, e.g. "isdat:://abisko.irfu.se:13/Cluster/3/efw#1999-07-23_13:01:07.0_10.0"
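
As a rough illustration, the example identifier can be split into its visible parts with a regular expression. The Python sketch below assumes a host:port/path#fragment layout read off from that single example. The field names are hypothetical, and the fragment is deliberately left uninterpreted because its meaning is not specified here.

```python
import re
from typing import NamedTuple


class DataIdentifier(NamedTuple):
    """Hypothetical decomposition of an ISDAT DataIdentifier string."""
    host: str
    port: int
    path: tuple    # e.g. ("Cluster", "3", "efw")
    fragment: str  # text after '#', left uninterpreted


# Accept one or more colons after the scheme, since the example on this
# page is written with two ("isdat:://").
_PATTERN = re.compile(
    r"^isdat:+//(?P<host>[^:/]+):(?P<port>\d+)/(?P<path>[^#]*)#(?P<frag>.*)$")


def parse_data_identifier(text):
    """Split an identifier into host, port, path components and fragment."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("not an ISDAT DataIdentifier: %r" % (text,))
    path = tuple(part for part in match.group("path").split("/") if part)
    return DataIdentifier(match.group("host"), int(match.group("port")),
                          path, match.group("frag"))


ident = parse_data_identifier(
    "isdat:://abisko.irfu.se:13/Cluster/3/efw#1999-07-23_13:01:07.0_10.0")
```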

User Interfaces

User interfaces are interactive clients and therefore utilize the ClientServices interface.

Graphical User Interfaces (GUI)

ISDAT will provide the following GUIs:

igr
GoodLook
OpenDX

Prompt-based User Interfaces

These will be provided in ISDAT v2.7 and include:

isGetData LogicalInstrument EpochPeriod
isGetContents LogicalInstrument
isQuery LogicalInstrument

Application Programming Interfaces (API)

To develop code which either implements or utilizes the ISDAT logical interfaces, the recommended procedure is to program using the ISDAT APIs, which exist for numerous programming environments/applications:

Matlab
IDL
ANSI C
Java
Perl

The APIs are very similar to one another with respect to structure and namespace, so a single UML diagram is suitable to describe them. Here, for example, is a UML description of the C API:

FormatInterpreters generation using Data Format Description Language (DFDL)

ISDAT will in the future implement parsers for DFDLs such as EAST. DFDLs describe the structure or format of raw data, possibly all the way down to the bit level. ISDAT will provide a prompt-based tool which takes a DFDL file and a project description file and generates C++ code which can act as a FormatInterpreter and can be compiled along with the rest of the ISDAT server code.

Naturally, the motivation for using DFDL and the ISDAT DFDL FormatInterpreter generator tool is that it

speeds up the development of FormatInterpreters
automatically enforces format documentation
replaces other

Main advantages/features of ISDAT

Minimum effort for ISDAT FormatInterpreter participation
Accesses raw data and does not specify any data factory
Enforces file and/or data source hiding
Easy to develop UIs through numerous provided APIs (does not require specific environment such as X or IDL etc.)
Client/server architecture: enables distributed data access and computing
Uniform (cross-project) scientific-level data access/processing