Generally speaking, you don't need to further post-process the data obtained in an OpenLC run. There are times, however, when you would like to analyze carefully the response times in iteration #N, or simply access data which is not printed by default after an OLCCommander execution.
By default, OpenLC saves all the run data (both raw and reduced) in a directory on the server side. In this chapter I will try to tell you where this internal data lives and how to access it.
As you probably already know (you have configured it), the run data is saved in the directory stated in OpenLC-config.xml (normally found in the /etc/OpenLC directory), under the rundbpath tag. The structure of this directory is outlined in figure 5.1.
As you can see, the OpenLC run database is a two-level hierarchical directory. At the first level, there exists a directory for each run number (named in the form "run%06d" % runNumber). At the second level (i.e. for each run), a number of files are created containing all the data for that run.
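The first-level naming convention above can be sketched in a few lines of Python. This is just an illustration of the "run%06d" % runNumber pattern, not code from OpenLC itself, and the rundbpath value used in the example is hypothetical:

```python
import os

def run_directory(rundbpath, run_number):
    """Build the first-level directory name for a given run number,
    following the "run%06d" % runNumber naming convention."""
    return os.path.join(rundbpath, "run%06d" % run_number)

# For example, run number 97 under a hypothetical rundbpath:
print(run_directory("/home/falted/OpenLC/rundata", 97))
# -> /home/falted/OpenLC/rundata/run000097
```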
These second-level files are grouped in three subsets:
If you want to have a look at the data collected by the experiment without further processing, you can start with RawData.xml. Here is an example of the information you can find there:
<!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
<PyObject class="RawData" module="OpenLC.server.RawData" id="138480564">
  <attr type="string" name="hdfFilename" value="/home/falted/OpenLC/rundata/run000097/raw.hdf"></attr>
  <attr type="numeric" name="runid" value="97"></attr>
  <attr type="numeric" name="nevents_saved" value="2505"></attr>
  <attr type="numeric" name="nevents" value="2505"></attr>
  <attr type="numeric" name="elapsedTime" value="5.0204830169677734"></attr>
  <attr type="dict" name="filenames" id="136977668">
    <entry>
      <key type="string" value="Local/test/random"></key>
      <val type="string" value="/home/falted/OpenLC/rundata/run000097/raw/Local/test/random.nc"></val>
    </entry>
    <entry>
      <key type="string" value="Local/test/constant"></key>
      <val type="string" value="/home/falted/OpenLC/rundata/run000097/raw/Local/test/constant.nc"></val>
    </entry>
    <entry>
      <key type="string" value="Local/test/linear"></key>
      <val type="string" value="/home/falted/OpenLC/rundata/run000097/raw/Local/test/linear.nc"></val>
    </entry>
  </attr>
  <attr type="dict" name="lastindex" id="138290436">
    <entry>
      <key type="string" value="Local/test/random"></key>
      <val type="numeric" value="835"></val>
    </entry>
    <entry>
      <key type="string" value="Local/test/constant"></key>
      <val type="numeric" value="835"></val>
    </entry>
    <entry>
      <key type="string" value="Local/test/linear"></key>
      <val type="numeric" value="835"></val>
    </entry>
  </attr>
  <attr type="string" name="zipfile" value="/home/falted/OpenLC/rundata/run000097/raw.zip"></attr>
  <attr type="string" name="raw_dir" value="/home/falted/OpenLC/rundata/run000097/raw"></attr>
  <attr type="list" name="cmdIDs" id="137022540">
    <item type="string" value="Local/test/constant"></item>
    <item type="string" value="Local/test/random"></item>
    <item type="string" value="Local/test/linear"></item>
  </attr>
  <attr type="string" name="currentDir" value="/home/falted/OpenLC/rundata/run000097"></attr>
  <attr type="string" name="datetime" value="Sun Jun 23 12:13:56 2002"></attr>
</PyObject>
As you can see, you can get some interesting information here. If you look carefully at the filenames attribute, you will discover that it's a mapping (a dictionary, in Python jargon) between the commands in the Scenario and the NetCDF filenames. The files referenced by these filenames are packaged inside the raw.zip file.
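Because RawData.xml is plain XML, you can extract this mapping yourself with the Python standard library. The sketch below parses an abbreviated sample (based on the example above) with xml.etree.ElementTree; the filenames_map helper is hypothetical, not part of OpenLC:

```python
import xml.etree.ElementTree as ET

# Abbreviated RawData.xml sample, following the structure shown above.
RAWDATA_XML = """\
<PyObject class="RawData" module="OpenLC.server.RawData" id="138480564">
  <attr type="dict" name="filenames" id="136977668">
    <entry>
      <key type="string" value="Local/test/random"></key>
      <val type="string" value="/home/falted/OpenLC/rundata/run000097/raw/Local/test/random.nc"></val>
    </entry>
    <entry>
      <key type="string" value="Local/test/constant"></key>
      <val type="string" value="/home/falted/OpenLC/rundata/run000097/raw/Local/test/constant.nc"></val>
    </entry>
  </attr>
</PyObject>
"""

def filenames_map(xml_text):
    """Return the command-ID -> NetCDF filename dictionary stored in the
    'filenames' dict attribute of a RawData.xml document."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for attr in root.findall("attr"):
        if attr.get("name") == "filenames":
            for entry in attr.findall("entry"):
                key = entry.find("key").get("value")
                val = entry.find("val").get("value")
                mapping[key] = val
    return mapping

print(filenames_map(RAWDATA_XML))
```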
In raw.hdf you can get all the raw data structured in a tree, which is well adapted to reflect the Scenario description for this run. The .hdf extension means that the file is in HDF5 format, and there is a variety of software and utilities to read it. I would recommend a couple: the HDF5 tools (http://hdf.ncsa.uiuc.edu/hdf5tools.html) and HDFView (http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/). If you don't have HDF5 support, I suggest you have a look at the ncdump and ncgen utilities, which come with the NetCDF library, or at the excellent plotting tool called grace (http://plasma-gate.weizmann.ac.il/Grace/).
Next, I'll briefly describe some of these utilities.
There are quite a few utilities to deal with HDF5 files in the HDF5 tools, but the most important ones are h5dump and h5ls. You can get usage instructions by passing them the -? flag. As an example of use, look at the following command and its output:
$ h5ls -r raw.hdf
/raw.hdf/Local                                  Group
/raw.hdf/Local/test                             Group
/raw.hdf/Local/test/constant                    Group
/raw.hdf/Local/test/constant/commandNumber      Dataset {835}
/raw.hdf/Local/test/constant/dataTransferred    Dataset {835}
/raw.hdf/Local/test/constant/threadNumber       Dataset {835}
/raw.hdf/Local/test/constant/timeSpent          Dataset {835}
/raw.hdf/Local/test/constant/wallClock          Dataset {835}
/raw.hdf/Local/test/linear                      Group
/raw.hdf/Local/test/linear/commandNumber        Dataset {835}
/raw.hdf/Local/test/linear/dataTransferred      Dataset {835}
/raw.hdf/Local/test/linear/threadNumber         Dataset {835}
/raw.hdf/Local/test/linear/timeSpent            Dataset {835}
/raw.hdf/Local/test/linear/wallClock            Dataset {835}
/raw.hdf/Local/test/random                      Group
/raw.hdf/Local/test/random/commandNumber        Dataset {835}
/raw.hdf/Local/test/random/dataTransferred      Dataset {835}
/raw.hdf/Local/test/random/threadNumber         Dataset {835}
/raw.hdf/Local/test/random/timeSpent            Dataset {835}
/raw.hdf/Local/test/random/wallClock            Dataset {835}
/raw.hdf/info                                   Group
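The listing shows a regular layout: each command ID owns a group containing the same five per-event datasets. A minimal pure-Python sketch (no HDF5 library required) of how you might enumerate the dataset paths for a list of command IDs, using the dataset names taken from the listing above; the dataset_paths helper is hypothetical:

```python
# Per-event datasets stored under each command-ID group in raw.hdf,
# as seen in the h5ls listing above.
DATASETS = ("commandNumber", "dataTransferred", "threadNumber",
            "timeSpent", "wallClock")

def dataset_paths(cmd_ids):
    """Build the in-file HDF5 path of every per-event dataset for a
    list of command IDs (e.g. the cmdIDs list found in RawData.xml)."""
    return ["/%s/%s" % (cmd_id, name)
            for cmd_id in cmd_ids
            for name in DATASETS]

for path in dataset_paths(["Local/test/constant"]):
    print(path)
# prints, among others, /Local/test/constant/timeSpent
```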
A nicer utility is HDFView. This is a Java application which lets you browse the contents of an HDF5 file, and even visualize dataset histograms. From the HDFView manual:
The HDFView is a Java-based tool for browsing and editing NCSA HDF4 and HDF5 files. HDFView allows users to browse through any HDF4 and HDF5 file; starting with a tree view of all top-level objects in an HDF file's hierarchy. HDFView allows a user to descend through the hierarchy and navigate among the file's data objects. The content of a data object is loaded only when the object is selected, providing interactive and efficient access to HDF4 and HDF5 files. HDFView editing features allow a user to create, delete, and modify the value of HDF objects and attributes.
In figure 5.2 you can see an example of an HDFView session.
The ncdump utility is very useful for browsing NetCDF files. From its manual:
The ncdump tool generates the CDL text representation of a netCDF file on standard output, optionally excluding some or all of the variable data in the output. The output from ncdump is intended to be acceptable as input to ncgen. Thus ncdump and ncgen can be used as inverses to transform data representation between binary and text representations. ncdump may also be used as a simple browser for netCDF data files, to display the dimension names and sizes; variable names, types, and shapes; attribute names and values; and optionally, the values of data for all variables or selected variables in a netCDF file.
When the run is finished, the OpenLC server automatically starts a process that reduces the raw data, extracts some run statistics (following the instructions in the scenario file), and saves the results to disk. You can find two files with the post-processed data information.