Page maintained by
Reine Gill
The ISDAT
Instrument Measured Database Management System
This section will probably be rewritten in the near future; I feel it
does not give a developer a good start. (RG)
Background
Numerous types of instruments form the basis of experimental
science. Typically, an instrument stores its data in a format
suited to that instrument. We call such a format the raw instrument
format. The raw instrument format is typically not self-describing to humans.
To make use of the stored data from an instrument, it is necessary to
transform the raw data into a self-evident representation such as a
plot or a table. We call such a format the human interpretable format. The
transformation from the raw instrument format to the human interpretable
format must be developed. ISDAT autonomously provides a human interpretable
data interface to raw data as well as to standard data formats.
Purpose
The purpose of ISDAT is to allow a high-level access to stored raw data
acquired from arbitrary instruments.
Main Motivation
The main motivation for ISDAT rests on the following facts:
-
There is no unique standard for the storage of raw data.
-
Because of the first point, code must be developed to parse the raw data.
-
Data analysis benefits if different types of raw data can be accessed in
a uniform way.
What ISDAT is not
-
a real-time system or a data acquisition system
-
a data format
-
merely a data analysis tool
Related Systems
The Philosophy of Measurement and uniform data access
To acquire knowledge of physical reality it is necessary to partake in
measurement acts. The results of natural science must ultimately depend on
measurement acts consisting of data collected by (non-subjective) instruments.
The philosophy of science has come to an understanding of such measurement
acts. Within the modern model, a measurement act is associated with a certain
region within the framework of space and time. Measurements must ultimately be
attributable to what is known in the theory of relativity as a WorldLine,
or to a collection of WorldLines known as a fiber bundle.
ISDAT uses the concept of the WorldLine to organize the reference to
all data within an ISDAT database. This provides a uniform and powerful way
of accessing any type of instrument measured data which is archived according
to this philosophy.
A detailed analysis of the WorldLine viewpoint of data access reveals that
it is not particularly useful for describing certain types of scientific
data such as GIS (even though it is possible to encode some GIS information
within this model) and, ironically, is very amenable to economic data such
as stock market indexes (as they are associated with a measurement at a certain
time at a certain stock market exchange). However, it is the intuitive power of
this referencing method which is important, not a discriminating definition
of its end usage.
Concepts
Instrument measured data (Im_data)
ISDAT is a Data Base Management System (DBMS) for Instrument Measured Data
(Im_data). That is, it manages data which originates from an arbitrary
measurement instrument. The instrument may measure some physical
parameter, but it must at least react in some deterministic manner to the
physical surroundings in which it is located.
WorldLineData
Data organized in such a way that it is indexed according to its coordinates
in space-time.
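As a rough illustration of this idea (a Python sketch, not ISDAT code), the class below keeps samples ordered by their time coordinate and answers interval queries by bisection. The names `WorldLineSample` and `WorldLineStore` and the field layout are hypothetical, chosen only for this example.

```python
from bisect import bisect_left, bisect_right, insort
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass(order=True)
class WorldLineSample:
    """One measurement tagged with its space-time coordinates (assumed layout)."""
    epoch: float                                            # time coordinate
    position: Tuple[float, float, float] = field(compare=False)
    value: Any = field(compare=False)                       # the measured data


class WorldLineStore:
    """Keeps samples ordered by epoch so they can be looked up by time interval."""

    def __init__(self) -> None:
        self._samples: List[WorldLineSample] = []

    def add(self, sample: WorldLineSample) -> None:
        # insort compares samples by epoch only (other fields are compare=False).
        insort(self._samples, sample)

    def between(self, start: float, stop: float) -> List[WorldLineSample]:
        """Return all samples with start <= epoch <= stop."""
        epochs = [s.epoch for s in self._samples]
        lo = bisect_left(epochs, start)
        hi = bisect_right(epochs, stop)
        return self._samples[lo:hi]
```

A caller would add samples in any order and then ask for a time window, without needing to know how or where the samples were originally stored.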
Raw data
Refers to data which is encoded in an arbitrary format.
Data factory
A process, device or agent by which Im_data becomes accessible to an ISDAT
FormatInterpreter.
FormatInterpreter
An ISDAT component which can parse files of data encoded in some specific
format and extract a particular Im_data. A FormatInterpreter is reminiscent
of a driver in ODBC parlance. It is the file format and the particular
Im_data which distinguish individual FormatInterpreters from one another. To
become a FormatInterpreter, a component must enroll in the ISDAT
FormatInterpreter participation program. At present, ISDAT has the
following FormatInterpreters:
-
Emma and Linda on Astrid-2
-
F1,F2 and F4 on Freja
-
V4 on Viking
-
Cassini
-
Eiscat
-
WEC on Cluster
ISDAT database logical schema (Relational Db model)
ISDAT can be viewed according to the entity-relationship (E-R) model, which is
nowadays the dominant model for describing databases. The E-R model is,
however, not very amenable to the database model of ISDAT because of the
latter's highly arrayed structure. It is therefore not useful in practice and
is given here for illustrative purposes only.
Public project independent tables
Public project dependent tables
GetContents Table | Project | Duration
GetData Table | EpochMoment | Im_data
Position Table | EpochMoment | Location
Attitude Table | EpochMoment | Attitude
Internal Tables
Who Interprets table | LogicalInstrument | FormatInterpreter
FormatInterpreter Projects | FormatInterpreter | File system directory list
Index File | DataRequest | File
Client/Server architecture
ISDAT v2 is built according to a client/server architecture with a typical
2-tier structure. In contrast, ISDAT v3 can be represented as a 3-tier
system consisting of the FormatInterpreters, the TransactionManager and the
DataConsumers. The TransactionManager and the FormatInterpreters together make
up the server side of ISDAT, which is called the DataBaseHandler (DBH).
DataConsumers are the client side of the ISDAT client/server dichotomy.
There are two kinds of clients: a dedicated client is one shipped
with the ISDAT software, whereas the term ISDAT client refers to any
software which can correctly use the ISDAT ClientServices interface.
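The three tiers can be pictured with the following Python sketch. It is an illustration under simplifying assumptions, not the ISDAT implementation: a FormatInterpreter is reduced to a plain callable, the `register` method and the instrument name are hypothetical, and no networking is involved.

```python
from typing import Callable, Dict, List

# Tier 1 (assumption): a FormatInterpreter is modelled as a callable that
# takes a file path and returns a list of measured values.
FormatInterpreter = Callable[[str], List[float]]


class TransactionManager:
    """Tier 2: routes client requests to the matching FormatInterpreter."""

    def __init__(self) -> None:
        self._who_interprets: Dict[str, FormatInterpreter] = {}
        self._files: Dict[str, str] = {}

    def register(self, instrument: str, interpreter: FormatInterpreter,
                 path: str) -> None:
        # Hypothetical helper: records which interpreter and file serve an instrument.
        self._who_interprets[instrument] = interpreter
        self._files[instrument] = path

    def get_data(self, instrument: str) -> List[float]:
        if instrument not in self._who_interprets:
            raise KeyError(f"no FormatInterpreter registered for {instrument!r}")
        interpreter = self._who_interprets[instrument]
        return interpreter(self._files[instrument])


def data_consumer(server: TransactionManager, instrument: str) -> float:
    """Tier 3: a client that names only the instrument and averages its data."""
    values = server.get_data(instrument)
    return sum(values) / len(values)
```

In this sketch the client never sees a file path; only the TransactionManager and the FormatInterpreter, which together play the role of the DBH, know where the data lives.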
The conceptual interfaces of ISDAT
To the outside world, ISDAT is defined through its logical interfaces.
A developer or programmer wishing to employ ISDAT need only consider one
of two levels of interface. The first is:
FormatInterpreter::Interface
Service Name | Arguments | Returns
ListServices | (none) | Array of Methods
DumpData | File Path or File | Array of Im_data
The FormatInterpreter interface is the lowest level. It represents the minimum
requirements for accessing raw data.
DumpData is a minimal data access method. It is essentially an interface to
a one-pass parser for that particular data format. A one-pass parser
takes a file or data source instance and returns a stream of arrayed data.
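A one-pass parser of this kind might look as follows in Python. This is a sketch under stated assumptions: the record layout (a little-endian float64 epoch followed by a float32 value) and the class name `DemoFormatInterpreter` are invented for the example and do not describe any real ISDAT format.

```python
import struct
from typing import BinaryIO, Iterator, List, Tuple

# Assumed raw format: fixed-size records, little-endian, no padding.
RECORD = struct.Struct("<df")   # float64 epoch, float32 value (12 bytes)


class DemoFormatInterpreter:
    """Hypothetical interpreter offering the two minimal services."""

    def list_services(self) -> List[str]:
        return ["ListServices", "DumpData"]

    def dump_data(self, source: BinaryIO) -> Iterator[Tuple[float, float]]:
        """Read the source front to back exactly once, one record at a time."""
        while True:
            chunk = source.read(RECORD.size)
            if len(chunk) < RECORD.size:
                return              # end of data; a trailing partial record is ignored
            yield RECORD.unpack(chunk)
```

Because `dump_data` is a generator, it never seeks backwards and never holds more than one record in memory, which is what makes it a single pass over the data.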
To offer a language which provides access to the raw data but which,
conceptually speaking, operates at the level of scientific or physical data,
ISDAT provides the entry-level interface:
ClientServices::Interface
Service Name | Arguments | Returns
ListServices | (none) | Array of Methods
GetContents | LogicalInstrument | Array of Interval
GetData | DataRequest | DataObject
Query | LogicalInstrument | Array of LogicalInstrument
It is naturally targeted towards clients.
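The four ClientServices methods can be sketched as an abstract Python interface with a toy in-memory server behind it. Everything beyond the service names is an assumption made for illustration: instruments are plain strings, an Interval is a (start, stop) pair, a DataObject is a list of values, and Query is interpreted as listing the instruments found under a given name.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[float, float]      # assumed representation: (start, stop) epochs


@dataclass(frozen=True)
class DataRequest:
    instrument: str                 # stands in for a LogicalInstrument
    start: float
    duration: float


class ClientServices(ABC):
    @abstractmethod
    def list_services(self) -> List[str]: ...

    @abstractmethod
    def get_contents(self, instrument: str) -> List[Interval]: ...

    @abstractmethod
    def get_data(self, request: DataRequest) -> List[float]: ...

    @abstractmethod
    def query(self, instrument: str) -> List[str]: ...


class InMemoryServices(ClientServices):
    """Toy server holding (epoch, value) samples per instrument name."""

    def __init__(self, samples: Dict[str, List[Tuple[float, float]]]) -> None:
        self._samples = samples

    def list_services(self) -> List[str]:
        return ["ListServices", "GetContents", "GetData", "Query"]

    def get_contents(self, instrument: str) -> List[Interval]:
        epochs = [t for t, _ in self._samples[instrument]]
        return [(min(epochs), max(epochs))] if epochs else []

    def get_data(self, request: DataRequest) -> List[float]:
        stop = request.start + request.duration
        return [v for t, v in self._samples[request.instrument]
                if request.start <= t < stop]

    def query(self, instrument: str) -> List[str]:
        # Assumed meaning: list the instruments whose name starts with the argument.
        return sorted(name for name in self._samples if name.startswith(instrument))
```

Note that no method takes or returns a file name: a client addresses data only by instrument and time.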
One of its important features is that it enforces file or data source (an
ODBC term) hiding. At the science level the actual storage of data is not
important; only the WorldLine referencing is needed. Besides bringing the
interface closer to scientific queries, this also facilitates systematic
searches through whole databases.
Besides its minimum set of atomic methods, ClientServices has an expandable
set of derived client services which may be named in the ListServices
method. A server may thereby provide, for example, specially designed pattern
recognition algorithms to be used by clients to find particular events.
Other important derived client service methods will include transformation
operators, e.g. ISDAT WorldLine data -> HDF, CDF, netCDF.
In practice the logical interfaces are implemented within some environment
which enables distributed access. For the moment only two implementations
exist:
-
C library using TCP/IP or streams
-
Java class using TCP/IP or streams
In the future, there could be interfaces implemented in
The FormatInterpreter ISDAT participation program
To facilitate the inclusion of new data formats into an ISDAT database,
ISDAT specifies a program for how a piece of data retrieval software can
qualify as an ISDAT FormatInterpreter. This program allows the integrators
of the new data format the choice between numerous levels of interfacing
sophistication. At the lowest or entry level, the component
must provide a mapping to the DumpData query.
The approach is therefore to provide a minimal query language for
all aspiring new formats combined with ensured, well known and uniform
high-level services of the ClientServices interface with a syntax amenable
to a scientist.
OOP objects of ISDAT interfaces
The ISDAT objects are the actual building blocks of the ISDAT source code and
programming interface. They are rigorously defined concepts which make
up the vocabulary of Im_data transactions.
LogicalInstrument
This specifies the source in which data is stored.
WorldLineData
Contains the Im_data encoded as the so-called "DataObject" along with associated
attributes.
DataRequest
This object contains all the necessary information for making a meaningful
request for a DataObject. It consists of a LogicalInstrument object and
DataIdentifier
Used to uniquely identify a piece of Im_data. The structure is reminiscent
of a URL and is known as a URI, e.g. "isdat:://abisko.irfu.se:13/Cluster/3/efw#1999-07-23_13:01:07.0_10.0"
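The example identifier can be taken apart with a short Python sketch. The interpretation of the parts is an assumption, since the text does not define them: the path after the port is treated as an instrument path, and the fragment as a start date, a start time, and a duration separated by underscores. The function and field names are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Tuple


class DataIdentifier(NamedTuple):
    host: str
    port: int
    instrument_path: Tuple[str, ...]   # assumed: e.g. project / member / instrument
    start: datetime                    # assumed: start of the requested interval
    duration: float                    # assumed: length of the interval


# Accepts both "isdat://" and the "isdat:://" spelling used in the example.
_PATTERN = re.compile(
    r"^isdat::?//(?P<host>[^:/]+):(?P<port>\d+)/(?P<path>[^#]+)"
    r"#(?P<date>[^_]+)_(?P<time>[^_]+)_(?P<duration>[^_]+)$"
)


def parse_data_identifier(text: str) -> DataIdentifier:
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an ISDAT-style identifier: {text!r}")
    start = datetime.strptime(
        f"{match['date']} {match['time']}", "%Y-%m-%d %H:%M:%S.%f"
    )
    return DataIdentifier(
        host=match["host"],
        port=int(match["port"]),
        instrument_path=tuple(match["path"].split("/")),
        start=start,
        duration=float(match["duration"]),
    )
```

Applied to the example above, this yields the host `abisko.irfu.se`, port 13, the path components `Cluster`, `3`, `efw`, and a start of 1999-07-23 13:01:07 with a duration value of 10.0.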
User Interfaces
User interfaces are interactive clients and therefore utilize the ClientServices
interface.
Graphical User Interfaces (GUI)
ISDAT will provide the following GUIs
Prompt-based User Interfaces
These will be provided in ISDAT v2.7 and include
-
isGetData LogicalInstrument EpochPeriod
-
isGetContents LogicalInstrument
-
isQuery LogicalInstrument
Application Programming Interfaces (API)
To develop code which either implements or utilizes the ISDAT logical
interfaces, the recommended procedure is to program using the ISDAT APIs, which
exist for numerous programming environments/applications:
-
Matlab
-
IDL
-
ANSI C
-
Java
-
Perl
The APIs are very similar to one another with respect to structure and
namespace. A UML diagram is therefore suitable for describing them.
Here, for example, is a UML description of the C API:
FormatInterpreters generation using Data Format Description Language (DFDL)
ISDAT will in the future implement parsers for DFDLs such as EAST. DFDLs
describe the structure or format of raw data, possibly all the way down
to the bit level. What ISDAT will provide is a prompt-based tool which takes
a DFDL file and a project description file and then generates C++ code which
can act as a FormatInterpreter and can be compiled along with the rest
of the ISDAT server code.
Naturally, the motivation for using DFDL and the ISDAT DFDL FormatInterpreter
generator tool is that it
-
speeds up the development of FormatInterpreters
-
automatically enforces format documentation
-
replaces other
Main advantages/features of ISDAT
-
Minimum effort for ISDAT FormatInterpreter participation
-
Accesses raw data and does not specify any data factory
-
Enforces file and/or data source hiding
-
Easy to develop UIs through numerous provided APIs (does not require specific
environment such as X or IDL etc.)
-
Client/server architecture: enables distributed data access and computing
-
Uniform (cross-project) scientific-level data access/processing