1. Introduction & Simple Example
Important note: CML-file stands for 'Cybule-like Multi Language'. Obviously CML-file does not define any language such as HTML; the name is a joke :o). It was first developed for my project Cybule (something like a neural network) and was later generalised into this.
So, what is CML-file? It is a C++ class that provides a (I guess very easy) way to acquire data from a text configuration or data file. It uses the very common "technology" of keywords: a keyword stands at the beginning of each line. If you want the rest of the line, you simply send the keyword to CML-file and CML-file returns the wanted data. That is the basic idea, but CML-file has some other capabilities as well; the following example file shows them:
#
-- this file contains information about train 'W.A.Mozart' that is going
from Berlin - Haupt Bahnhoff to Prague - Main Station --
#
-- definition of the train --
Name "Wolfgang Amadeus Mozart"
Locomotive type=Electric power=1750.5kW
Wagon Post
Wagon 1stClass 1
Wagon Restaurant
Wagon 2ndClass 2
Wagon 2ndClass 3
Wagon 2ndClass 4
#
-- crew --
Machinenfuhrer "Helmut Diesenhoffer"
Conductors 3
ConductorNames "Irene Ober-Hausendorfer" "Joachim Heinzmann" "Carl Wolf"
Cook "Jonathan Krueger-Stadler"
Waiter "Heinrich Zuckermann"
#
-- stations --
Station "Berlin - Haupt Bahnhoff"
Station "Dresden"
Station "Chemnitz"
Station "Decin"
Station "Usti nad Labem"
Station "Prague - Main Station"
#
-- arrivals and departs
Depart "Berlin-Haupt Bahnhoff" 10.15
In "Dresden" 11.29 11.35
In "Chemnitz" 13.02 13.15
In "Decin" 14.44 14.50
In "Usti nad Labem" 15.22 15.30
Arrival "Prague - Main Station" 16.54
#
-- reservations --
SECTION Reservations
"Armin Muehler-Stahl" 1 33 "Berlin - Haupt Bahnhoff" "Prague - Main Station"
"Jorgen Prochnow" 2 59 "Chemnitz" "Dresden"
"Ludwig von Spretti-Weilbach" 2 58 "Berlin - Haupt Bahnhoff" "Dresden"
END
#
-- end of file --
I guess nobody needs an explanation of this absolutely stupid and useless example - but look, it's only an example :o).
Several forms of data occur in this example, so:
Conductors 3
This is the simplest form - a keyword and a single value (an int in this case); the value can also be a double or a string, as in the following:
Name "Wolfgang Amadeus Mozart"
Strings need not be enclosed in quotes ("something") unless they contain spaces.
You may have more than one value on a single line (and the values need not be of the same type; one may be an int and another a double, etc.). An example is:
ConductorNames "Irene Ober-Hausendorfer" "Joachim Heinzmann" "Carl Wolf"
CML-file does not parse the contents of a line any further than that; if you need to parse something like this:
Locomotive type=Electric power=1750.5kW
you have to read the rest of the line after the keyword as a line (the whole string up to the end of the line) and parse it with other routines.
As you may notice, this example does not say how many wagons the train has - but that doesn't matter, CML-file counts them itself. So, when you want to read the wagon data, you start by asking "how many wagons do we have" and then (for example in a for loop) you read them one by one, in the order in which they are stored in the file, by asking for example "give me the type of wagon number 3". But look, wagon number 3 is the fourth Wagon line in the file:
Wagon 2ndClass 2
because all indexing starts from zero. Note: the number 2 on this line is the number of the passenger wagon, which I use in the reservations section - look there.
The same applies to the station data.
The lines that contain information about the stations between Berlin and Prague all have the same keyword, and you can read them using 'reading with key', where the key is a string, for example "Dresden".
Then you can use 'reading with order', as with the ConductorNames line; note that 11.29 (we are in Dresden) has order 0 and 11.35 has order 1:
In "Dresden" 11.29 11.35
The last thing is the reservations. Between the lines
SECTION Reservations
and
END
the data lines are stored:
"Armin Muehler-Stahl" 1 33 "Berlin - Haupt Bahnhoff" "Prague - Main Station"
"Jorgen Prochnow" 2 59 "Chemnitz" "Dresden"
"Ludwig von Spretti-Weilbach" 2 58 "Berlin - Haupt Bahnhoff" "Dresden"
Reading them is done in the following steps:
1. Tell CML-file which section you want to read (you may have more than one section in one file, but they must have different names).
2. Get these lines whole, one by one.
3. Tell CML-file that you are done with this section (note: while reading a section you may not start reading anywhere else in the file; first you must tell CML-file that you have finished the section).
2. Instantiating & configuring CML-file
Let's look at how to instantiate a CML-file.
CML-file is a C++ class, so you must declare a variable of it (an instance of the class):
cmlfile F ;
Or you may open a file directly like this:
cmlfile F( "train.cml" ) ;
I should note here that when you are done you may close it using:
F.close() ;
but this is not necessary, because the destructor of the class closes everything itself. Generally it is better to rely on the destructor, because CML-file allocates a relatively large amount of memory for itself (not megabytes, of course) and the destructor frees it all.
After instantiating CML-file you should configure it, which means telling CML-file which keywords and which sections you will use, and also the 'requirements'. For example: every train must have a locomotive, so a line with the keyword Locomotive is necessary - that is, required. CML-file can then answer the question whether all required data are present in the file.
You have 3 ways to configure a CML-file: using the C++ interface, using a so-called rc-file, or by 'inheritance' from another CML-file. No, I don't mean 'inheritance' in the C++ sense, but simply this: you pass one CML-file to another and the other copies the configuration to itself. It really copies it - this is slower than just linking to it, but I assume that the other file may need something special, so I wanted to leave the possibility of making additional changes.
There is a function called
void cmlfile::compile ( void )
that reads the file and creates a map. Normally it is called automatically; but you can call it by yourself if you want:
F.compile() ;
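The internal layout of this map is not described here. Purely as an illustration, the following self-contained sketch shows one way such a map could be built: it groups the rest of each line under its keyword, in file order. The names keyword_map and build_map are hypothetical and not part of CML-file, and comment lines and sections are not handled.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the map: keyword -> rest of every line that
// starts with this keyword, kept in file order.
typedef std::map<std::string, std::vector<std::string> > keyword_map;

keyword_map build_map(const std::string &text)
{
    keyword_map result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string keyword;
        if (!(fields >> keyword))
            continue;                           // skip empty lines
        std::string rest;
        std::getline(fields >> std::ws, rest);  // remainder without leading blanks
        result[keyword].push_back(rest);
    }
    return result;
}
```

With a structure like this, the number of lines for a keyword is the size of its vector, and "line number 3" is simply element 3 of that vector.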
2.1 Configuring using C++ interface
This is the most complicated way. There are a couple of functions:
void cmlfile::set_labels ( int olabels ) ;
void cmlfile::set_sections ( int osections ) ;
void cmlfile::set_label_string ( int olabel , char *olabel_string ) ;
void cmlfile::set_section_string ( int olabel , char *osection_string ) ;
void cmlfile::require ( int olabel ) ;
void cmlfile::minimal ( int olabel , int ominimal ) ;
void cmlfile::optionalize ( int olabel ) ;
void cmlfile::refuse ( int olabel ) ;
void cmlfile::set_minimal_rows ( int osection , int ominimal ) ;
So:
set_labels tells CML-file how many labels (keywords, for example Machinenfuhrer) will be used,
set_sections tells CML-file how many sections (such as Reservations) will be used,
set_label_string tells it, for example, that the keyword for label number 5 is "Machinenfuhrer"; this number has no other meaning, but it must be lower than the number of labels that was set using set_labels; so if I said that I will have 10 labels, their numbers must be between 0 and 9; 10 is too big,
set_section_string does the same for sections,
require tells CML-file that the given label must be in the file, which means it must occur at least once,
minimal sets the minimal number of lines with the given label that must be in the file,
optionalize tells it that the given label need not be in the file,
refuse tells it that the given label must not be in the file (but actually no error is generated if such a label is found),
set_minimal_rows tells it that the given section must have at least the given number of rows.
If none of the functions from require onwards is used, the default is that every label is optional and the minimal number of rows for all sections is 0.
Now I will show an example function that configures CML-file for reading the file that defines the train (but first I will make some #defines, because I don't want to remember the label numbers):
#include "cmlfile.h"
/* label numbers: must be lower than the count passed to set_labels */
#define Name 0
#define Locomotive 1
#define Wagon 2
#define Machinenfuhrer 3
#define Conductors 4
#define ConductorNames 5
#define Cook 6
#define Waiter 7
#define Station 8
#define Depart 9
#define In 10
#define Arrival 11
/* section numbers */
#define Reservations 0
void configure ( cmlfile &oF )
{
    oF.set_labels( 12 ) ;
    oF.set_sections( 1 ) ;
    oF.set_label_string( Name,"Name" ) ;
    oF.set_label_string( Locomotive,"Locomotive" ) ;
    oF.set_label_string( Wagon,"Wagon" ) ;
    oF.set_label_string( Machinenfuhrer,"Machinenfuhrer" ) ;
    oF.set_label_string( Conductors,"Conductors" ) ;
    oF.set_label_string( ConductorNames,"ConductorNames" ) ;
    oF.set_label_string( Cook,"Cook" ) ;
    oF.set_label_string( Waiter,"Waiter" ) ;
    oF.set_label_string( Station,"Station" ) ;
    oF.set_label_string( Depart,"Depart" ) ;
    oF.set_label_string( In,"In" ) ;
    oF.set_label_string( Arrival,"Arrival" ) ;
    oF.set_section_string( Reservations,"Reservations" ) ;
    /* requirements: everything not mentioned here stays optional */
    oF.require( Name ) ;
    oF.require( Locomotive ) ;
    oF.minimal( Wagon,2 ) ;
    oF.require( Machinenfuhrer ) ;
    oF.require( Conductors ) ;
    oF.minimal( Station,2 ) ;
    oF.require( Depart ) ;
    oF.require( Arrival ) ;
}
Everything clear? Hope so :o)
Note: you may use functions such as require at any time, so for example, when you realise that the train goes through 4 stations between Berlin and Prague, you may call:
F.minimal( In,4 ) ;
2.2 Configuring using an rc-file
This is somewhat simpler. You create an rc-file, which may look like this:
Name 1
Locomotive 1
Wagon 2
Machinenfuhrer 1
Conductors 1
ConductorNames 1
Cook 0
Waiter 0
Station 2
Depart 1
In 0
Arrival 1
SECTION Reservations 0
The number at the end of each line is the minimal number of lines with the given keyword in the file.
Assume that the name of the rc-file is example.rc. Then, instead of using the function configure from the previous subsection, we can configure CML-file by calling this:
F.load_labels( "example.rc" ) ;
It opens the rc-file and reads everything itself.
Obviously, if you want another rc-file format, you must write your own routine using the C++ interface.
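If you do write your own routine, it could start from something like the following sketch. It only parses the rc-file format shown above into two tables; feeding the result into the C++ interface from subsection 2.1 is left out so that the sketch stays self-contained. The names rc_config and parse_rc are hypothetical, and the assumption is that every entry is "keyword number" or "SECTION name number".

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical result type: minimal counts per keyword and per section.
struct rc_config {
    std::map<std::string, int> label_minimum;    // keyword -> minimal number of lines
    std::map<std::string, int> section_minimum;  // section -> minimal number of rows
};

rc_config parse_rc(const std::string &text)
{
    rc_config config;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        int minimum = 0;
        if (word == "SECTION") {
            std::string name;
            if (in >> name >> minimum)
                config.section_minimum[name] = minimum;
        } else if (in >> minimum) {
            config.label_minimum[word] = minimum;
        }
        // A malformed entry leaves the stream in a failed state,
        // so the loop simply stops there.
    }
    return config;
}
```

Because the sketch reads whitespace-separated words rather than lines, it does not care whether the number stands on the same line as the keyword or on the next one.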
2.3 Configuring using another CML-file
If you have, for example, a CML-file F already configured, you can pass its configuration on to another CML-file (for example F2) using this:
F2.inherit_labels( F ) ;
And that's all.
Several cases may occur when reading data on lines (that is, not in sections).
The simplest case (in our railway example) is this:
Name "Wolfgang Amadeus Mozart"
Machinenfuhrer "Helmut Diesenhoffer"
Conductors 3
Cook "Jonathan Krueger-Stadler"
Waiter "Heinrich Zuckermann"
Nothing more than a keyword (occurring once in the whole file) and a single data item. We have these functions:
void cmlfile::get_value ( int olabel , int &ovalue ) ;
void cmlfile::get_value ( int olabel , double &ovalue ) ;
void cmlfile::get_value ( int olabel , char *ovalue ) ;
So we can acquire, for example, the name of the train like this:
char train_name[64] ;
F.get_value( Name,train_name ) ;
I would like to note here that if you used an rc-file to configure CML-file, it is not very appropriate to rely on #defines as was done in subsection 2.1, because you cannot be sure that you wrote the items into the rc-file in the order that corresponds to the #defines. So you must either remember this order (which is not comfortable, I know) or, instead of what is written above, use:
F.get_value( LABEL( "Name" ),train_name ) ;
Another example, with several values on one line:
ConductorNames "Irene Ober-Hausendorfer" "Joachim Heinzmann" "Carl Wolf"
Irene Ober-Hausendorfer has order 0, Joachim Heinzmann order 1 and Carl Wolf order 2.
The functions are similar, only they have one additional parameter:
void cmlfile::get_value ( int olabel , int oorder , int &ovalue ) ;
void cmlfile::get_value ( int olabel , int oorder , double &ovalue ) ;
void cmlfile::get_value ( int olabel , int oorder , char *ovalue ) ;
Carl Wolf can be obtained like this:
char conductor_name3[64] ;
F.get_value( ConductorNames,2,conductor_name3 ) ;
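To make the idea of 'order' concrete, here is a small self-contained sketch of how the rest of a line could be split into values, so that order n is simply element n of the result. This is not CML-file's actual code; split_values is a hypothetical helper, and it assumes that a value is either a run of non-blank characters or a quoted string that may contain blanks.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits the rest of a line into its values.
// Quotes are removed from quoted values.
std::vector<std::string> split_values(const std::string &rest)
{
    std::vector<std::string> values;
    std::string::size_type i = 0;
    while (i < rest.size()) {
        if (rest[i] == ' ' || rest[i] == '\t') {  // skip blanks between values
            ++i;
            continue;
        }
        std::string value;
        if (rest[i] == '"') {
            ++i;                                  // skip the opening quote
            while (i < rest.size() && rest[i] != '"')
                value += rest[i++];
            if (i < rest.size())
                ++i;                              // skip the closing quote
        } else {
            while (i < rest.size() && rest[i] != ' ' && rest[i] != '\t')
                value += rest[i++];
        }
        values.push_back(value);
    }
    return values;
}
```

Applied to the ConductorNames line above, element 2 of the result would be Carl Wolf.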
3.3 Multiple data for one label
The wagons are an example:
Wagon Post
Wagon 1stClass 1
Wagon Restaurant
Wagon 2ndClass 2
Wagon 2ndClass 3
Wagon 2ndClass 4
First of all you should ask CML-file how many wagons we have:
int number_of_wagons ;
F.get_value_count( Wagon,number_of_wagons ) ;
Then you may read the information on the wagons one by one in a loop (strcmp comes from <string.h>):
int i,wagon_id ;
char wagon[64] ;
for ( i=0 ; i<number_of_wagons ; i++ )
{
    /* order 0 (the default) is the wagon type */
    F.get_value_with_index( Wagon,i,wagon ) ;
    if ( !strcmp( wagon,"1stClass" ) || !strcmp( wagon,"2ndClass" ) )
    {
        /* order 1 is the wagon number, present only for passenger wagons */
        F.get_value_with_index( Wagon,i,1,wagon_id ) ;
    }
}
Note - if the type of the wagon was 1stClass or 2ndClass, I read the number at the end of the line using the function that has the additional parameter order, as in the previous subsection.
Yes, I nearly forgot to list the function headers:
void cmlfile::get_value_with_index ( int olabel , int oindex , int &ovalue ) ;
void cmlfile::get_value_with_index ( int olabel , int oindex , double &ovalue ) ;
void cmlfile::get_value_with_index ( int olabel , int oindex , char *ovalue ) ;
void cmlfile::get_value_with_index ( int olabel , int oindex , int oorder , int &ovalue ) ;
void cmlfile::get_value_with_index ( int olabel , int oindex , int oorder , double &ovalue ) ;
void cmlfile::get_value_with_index ( int olabel , int oindex , int oorder , char *ovalue ) ;
Sometimes we have multiple lines with the same keyword and want to choose which one to read by some additional key. In our example, we already know which stations our train goes through, and now we need to know when it arrives at each of them and when it leaves. These lines contain that information:
In "Dresden" 11.29 11.35
In "Chemnitz" 13.02 13.15
In "Decin" 14.44 14.50
In "Usti nad Labem" 15.22 15.30
So we should use one of the get_value_with_key functions; here they are:
void cmlfile::get_value_with_key ( int olabel , char *okey , int &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , char *okey , double &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , char *okey , char *ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , char *okey , int oorder , int &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , char *okey , int oorder , double &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , char *okey , int oorder , char *ovalue ) ;
Say that we would like to acquire the data for Chemnitz. Just use this:
double chemnitz_arrival,chemnitz_leave ;
F.get_value_with_key( In,"Chemnitz",0,chemnitz_arrival ) ;
F.get_value_with_key( In,"Chemnitz",1,chemnitz_leave ) ;
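How the lookup by key works inside CML-file is not shown in this document. As a rough illustration only, the sketch below finds, among the lines of one keyword, the line whose first value equals the key. The helpers first_value and find_by_key are hypothetical, and the sketch assumes that the key is always the first value after the keyword, as in the In lines above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the first value of a line, without quotes.
std::string first_value(const std::string &rest)
{
    if (!rest.empty() && rest[0] == '"') {
        std::string::size_type close = rest.find('"', 1);
        // Without a closing quote, take everything after the opening one.
        return rest.substr(1, close == std::string::npos ? close : close - 1);
    }
    return rest.substr(0, rest.find_first_of(" \t"));
}

// Returns the index of the first line whose key matches, or -1 if none does.
int find_by_key(const std::vector<std::string> &lines, const std::string &key)
{
    for (std::size_t i = 0; i < lines.size(); ++i)
        if (first_value(lines[i]) == key)
            return static_cast<int>(i);
    return -1;
}
```

The values that follow the key on the found line are then the ones addressed by order 0, 1 and so on.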
The key is not necessarily a string; it may also simply be an int. Then you may use this set of functions:
void cmlfile::get_value_with_key ( int olabel , int okey , int &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , int okey , double &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , int okey , char *ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , int okey , int oorder , int &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , int okey , int oorder , double &ovalue ) ;
void cmlfile::get_value_with_key ( int olabel , int okey , int oorder , char *ovalue ) ;
It's absolutely simple. A double value may also be given as a percentage - instead of 0.05 there is 5%. If you use one of these functions:
void cmlfile::get_percentage ( int olabel , double &ovalue ) ;
void cmlfile::get_percentage ( int olabel , int oorder , double &ovalue ) ;
void cmlfile::get_percentage_with_key ( int olabel , int okey , double &ovalue ) ;
void cmlfile::get_percentage_with_key ( int olabel , int okey , int oorder , double &ovalue ) ;
void cmlfile::get_percentage_with_key ( int olabel , char *okey , double &ovalue ) ;
void cmlfile::get_percentage_with_key ( int olabel , char *okey , int oorder , double &ovalue ) ;
void cmlfile::get_percentage_with_index ( int olabel , int okey , double &ovalue ) ;
void cmlfile::get_percentage_with_index ( int olabel , int okey , int oorder , double &ovalue ) ;
they automatically detect whether the value is written in the first or the second form, and in both cases they return 0.05.
Per mille values are not supported :o).
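How CML-file recognises the percent sign internally is not shown here. As a minimal sketch of the idea, assuming the value is a plain number optionally followed by '%', the conversion could look like this; parse_fraction is a hypothetical name, not a CML-file function.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: turns "5%" as well as "0.05" into the fraction 0.05.
double parse_fraction(const std::string &text)
{
    char *end = 0;
    double value = std::strtod(text.c_str(), &end);  // end points past the number
    while (*end == ' ' || *end == '\t')              // allow blanks before the sign
        ++end;
    if (*end == '%')
        value /= 100.0;                              // percentage form
    return value;
}
```

Text that does not start with a number yields 0 here, because that is what strtod returns in that case; a real routine would probably report an error instead.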
3.6 Reading whole line for parsing using other routines
As was said before, CML-file does not provide any functionality for parsing lines such as:
Locomotive type=Electric power=1750.5kW
So you must parse it yourself. CML-file only gives you the line, whether there is only one such line or several. If there is more than one line, you can access them either by index, similarly to the get_value_with_index functions, or by an int or string key, similarly to the get_value_with_key functions. So there are these functions:
void cmlfile::get_line ( int olabel , char *oline ) ;
void cmlfile::get_line_with_key ( int olabel , int okey , char *oline ) ;
void cmlfile::get_line_with_key ( int olabel , char *okey , char *oline ) ;
void cmlfile::get_line_with_index ( int olabel , int oindex , char *oline ) ;
Using this:
char line[256] ;
F.get_line( Locomotive,line ) ;
you get the string 'type=Electric power=1750.5kW' in the variable line.
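What "parse it yourself" could mean for the Locomotive line is sketched below. This is only one possible approach and not part of CML-file: the helpers parse_pairs and leading_number are hypothetical, and the sketch assumes that the pairs are separated by blanks and contain no blanks themselves.

```cpp
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: splits "type=Electric power=1750.5kW" into
// name -> value pairs.
std::map<std::string, std::string> parse_pairs(const std::string &line)
{
    std::map<std::string, std::string> pairs;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {                         // tokens are separated by blanks
        std::string::size_type eq = token.find('=');
        if (eq == std::string::npos || eq == 0)
            continue;                             // no usable name: skip the token
        pairs[token.substr(0, eq)] = token.substr(eq + 1);
    }
    return pairs;
}

// Reads the number at the start of a value such as "1750.5kW";
// the unit after the number is ignored.
double leading_number(const std::string &value)
{
    return std::strtod(value.c_str(), 0);
}
```

The string obtained with get_line would be passed to parse_pairs, and the "power" entry of the result to leading_number.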
3.7 Reading data while they are spread into multiple files
Very briefly: files may be included one into another using the directive FILE. I decided on a limit for the maximum number of files opened this way, and it is now 16. If you disagree, just hack my cmlfile.c file: find the line
#define MaxFiles 16
change it and recompile.
So, to use it, simply write something like:
FILE otherfile.cml
into a file (such as train.cml) and everything else works (I hope :o)).
And if you place 6 stations in the file train.cml and 4 in the file otherfile.cml, the train will then go through 10 stations (get_value_count will return 10). Is it clear?
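The effect of the FILE directive can be pictured with the following sketch. It is not the code from cmlfile.c: the function expand is hypothetical, it works on an in-memory table of file contents instead of real files so that it stays self-contained, and it assumes that the lines of an included file are inserted at the place of the FILE line.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

const int MaxFiles = 16;  // the same limit as mentioned in the text

// files:  file name -> file content (stand-in for files on disk)
// out:    receives all lines except the FILE directives themselves
// opened: counts the files opened so far
// Returns false if a file is missing or more than MaxFiles files are opened.
bool expand(const std::map<std::string, std::string> &files,
            const std::string &name,
            std::vector<std::string> &out, int &opened)
{
    std::map<std::string, std::string>::const_iterator it = files.find(name);
    if (it == files.end() || ++opened > MaxFiles)
        return false;
    std::istringstream in(it->second);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string first, second;
        fields >> first >> second;
        if (first == "FILE") {
            if (!expand(files, second, out, opened))
                return false;
        } else {
            out.push_back(line);
        }
    }
    return true;
}
```

A limit like this also stops a file that includes itself, because the counter runs over MaxFiles after a few rounds.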
For some kinds of data, using a keyword on every line is not the right thing. If we have a very long list of data where each item has the same syntax, a keyword at the beginning of each line would make our file unusably long. So we enclose them all in a section, which looks like this:
SECTION Reservations
"Armin Muehler-Stahl" 1 33 "Berlin - Haupt Bahnhoff" "Prague - Main Station"
"Jorgen Prochnow" 2 59 "Chemnitz" "Dresden"
"Ludwig von Spretti-Weilbach" 2 58 "Berlin - Haupt Bahnhoff" "Dresden"
END
Reading these data takes 3 steps.
1. Start reading a section - you will get the number of rows.
int rows ;
F.find_section( Reservations,rows ) ;
2. Read all the lines one by one.
char line[256] ;
int i ;
for ( i=0 ; i<rows ; i++ )
{
    F.get_section_line( Reservations,line ) ;
}
3. Tell CML-file that you are done reading this section. As was mentioned before, CML-file will not allow you to read any other data until you finish reading the section.
F.end_reading_section() ;
Here are the headers of these functions:
void cmlfile::find_section ( int olabel , int &osection_rows ) ;
void cmlfile::get_section_line ( int olabel , char *oline ) ;
void cmlfile::end_reading_section ( void ) ;
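To make the SECTION ... END layout concrete, here is a small self-contained sketch that collects the rows of one named section from a text. It is an illustration only and not how CML-file is implemented; section_rows is a hypothetical function, and it assumes that END stands alone on its line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns all rows found between "SECTION <name>" and
// "END". Rows of several sections with the same name are appended in order.
std::vector<std::string> section_rows(const std::string &text,
                                      const std::string &name)
{
    std::vector<std::string> rows;
    std::istringstream in(text);
    std::string line;
    bool inside = false;
    while (std::getline(in, line)) {
        if (!inside) {
            std::istringstream fields(line);
            std::string first, second;
            fields >> first >> second;
            if (first == "SECTION" && second == name)
                inside = true;         // rows start on the next line
        } else if (line == "END") {
            inside = false;            // section finished
        } else {
            rows.push_back(line);      // a row is kept whole, unparsed
        }
    }
    return rows;
}
```

The size of the returned vector corresponds to the number of rows, and each element to one whole line of the section.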
How many sections can you have? As many as you want, but they must have different names. Also, you can spread one section over multiple files and CML-file will paste the parts into one.
If you used an rc-file to configure CML-file, it is not very useful to pass the label numbers directly to the functions above; instead you may write:
F.find_section( SECTION( "Reservations" ),rows ) ;
and so on.
CML-file is able to check whether the file contains enough data. Look at chapter 2, where I explained how to tell CML-file which data are required and how many times. After that you can use a couple of functions that tell you whether the file is OK or not:
void cmlfile::check_requirements ( void ) ;
void cmlfile::check_label ( int olabel ) ;
void cmlfile::check_section ( int osection ) ;
Must I explain the details? I guess not.
Instead I will say something about what these functions do if they find a problem. They:
1. Write a message to the console.
2. Set an internal flag.
You can check whether a problem has occurred using this function:
int cmlfile::error_in_requirements ( void ) ;
which returns 1 if problems have occurred and 0 if everything is OK.
Well, I think that you won't be satisfied with this way of handling problems, but if you know something better, just hack my cmlfile.c file. Find these functions - I think everybody can understand them very easily.
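If you would rather get a list of what is missing than a message on the console, a check along the following lines is one option. It is a sketch under assumptions, not the code of the functions above: missing_labels is a hypothetical function, and the two tables (required minimum per keyword, number of lines actually found per keyword) are assumed to have been filled elsewhere.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns every keyword that occurs less often
// than its required minimum.
std::vector<std::string> missing_labels(const std::map<std::string, int> &minimum,
                                        const std::map<std::string, int> &found)
{
    std::vector<std::string> missing;
    std::map<std::string, int>::const_iterator it;
    for (it = minimum.begin(); it != minimum.end(); ++it) {
        std::map<std::string, int>::const_iterator hit = found.find(it->first);
        int count = (hit == found.end()) ? 0 : hit->second;  // absent means 0 lines
        if (count < it->second)
            missing.push_back(it->first);
    }
    return missing;
}
```

An empty result means that all requirements are met; optional keywords simply have a minimum of 0.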
CML-file also provides the capability of writing to an output file. It uses the same labels, so it prints the correct strings as keywords into the output file. If you find this useless, you obviously need not use it :o).
I will present the functions and examples here, and for each example I will show its output. But first you must open a file for output:
cmlfile outF ;
outF.open_for_output( "out.cml" ) ;
Then just use these functions:
void cmlfile::out ( int olabel , int ovalue ) ;
void cmlfile::out ( int olabel , double ovalue ) ;
void cmlfile::out ( int olabel , char *ovalue ) ;
void cmlfile::out_label ( int olabel ) ;
void cmlfile::out_int ( int ovalue ) ;
void cmlfile::out_double ( double ovalue ) ;
void cmlfile::out_string ( char *ovalue ) ;
void cmlfile::out_eoln ( void ) ;
void cmlfile::out_section ( int olabel ) ;
void cmlfile::out_data ( char *oline ) ;
void cmlfile::out_section_end ( void ) ;
Now, one by one:
outF.out( Name,"Wolfgang Amadeus Mozart" ) ;
will write this:
Name "Wolfgang Amadeus Mozart"
Then:
outF.out_label( ConductorNames ) ;
outF.out_string( "Joseph Keitel" ) ;
outF.out_string( "Heinz Albert Petersen" ) ;
outF.out_eoln() ;
will write this:
ConductorNames "Joseph Keitel" "Heinz Albert Petersen"
And:
outF.out_section( Reservations ) ;
outF.out_data( "Helmut Grockenberger 4 70 Dresden Decin" ) ;
outF.out_section_end() ;
will write this:
SECTION Reservations
Helmut Grockenberger 4 70 Dresden Decin
END
Not exactly what you expected? Quotes missing? Well, section data are not my problem, you know. CML-file only reads and writes whole lines.
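Since section lines are written exactly as given, any quoting has to be done before the line is handed over. A minimal sketch of such a step, assuming the rule from the introduction that only strings containing spaces need quotes, could look like this; quote_if_needed and join_row are hypothetical helpers, not CML-file functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: wraps a value in quotes if it is empty or contains blanks.
std::string quote_if_needed(const std::string &value)
{
    if (value.empty() || value.find_first_of(" \t") != std::string::npos)
        return "\"" + value + "\"";
    return value;
}

// Joins the values of one row with single spaces, quoting where needed.
std::string join_row(const std::vector<std::string> &values)
{
    std::string row;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i > 0)
            row += ' ';
        row += quote_if_needed(values[i]);
    }
    return row;
}
```

For the values Helmut Grockenberger, 4, 70, Dresden and Decin this produces a row in which only the name is quoted; that row could then be written as the section data line.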
If an error occurs, CML-file sets an internal flag. These functions tell you whether an error has occurred:
int cmlfile::error_opening_file ( void ) ;
int cmlfile::error_reading_file ( void ) ;
int cmlfile::error_writing_file ( void ) ;
int cmlfile::error_in_requirements ( void ) ;
int cmlfile::any_error ( void ) ;
You should check them from time to time. Note that if an error occurs and you want to continue reading, you should use this:
void cmlfile::clear_errors ( void ) ;
CML-file allows you to read further data even if an error flag is set to 1. But you cannot recognise any new errors that occur afterwards, because the flags are already set to 1.
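Why clearing matters can be seen in a tiny stand-in that mimics only the sticky-flag behaviour described above. The class sticky_flag is hypothetical and is not the cmlfile class; it merely uses the same names any_error and clear_errors for the two operations.

```cpp
// Hypothetical stand-in: a flag that stays set until it is cleared.
class sticky_flag {
public:
    sticky_flag() : set_(false) {}
    void raise() { set_ = true; }            // called whenever an error occurs
    int any_error() const { return set_ ? 1 : 0; }
    void clear_errors() { set_ = false; }    // forget all errors seen so far
private:
    bool set_;
};
```

After one raise the flag reads 1, and a second raise changes nothing, so the second error cannot be told apart from the first. Only after clear_errors does a later raise become visible again.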
CML-file is now released as a shared library. You get the file cmlfile-0001.tgz; unpack it, then type 'make' or 'make lib' - this creates the library; then become root and type 'make install'. Before that, you can edit these lines:
INSTDIR_INCLUDE = /usr/include
INSTDIR_LIB = /usr/lib
These lines specify the destinations where the header file (cmlfile.h) and the library (libcmlfile.so) are going to be stored. (You can find these lines in the makefile.)
After having successfully installed cmlfile, you can use it in your own program. Just include the header:
#include <cmlfile.h>
When compiling and linking your program, pass the switch -lcmlfile to your linker, like this:
g++ -o something something.o -lcmlfile
Please do not forget that cmlfile is a C++ class, so you need g++ to compile the part of the program that uses it.
This is the first released version of cmlfile (cmlfile-0001).
I hope it does not contain fatal errors, but who knows...
I must mention that this comes under the GPL.