A.I. programming in Prolog and Assembler

September 21, 2007

Reading EXCEL CSV-files as Prolog Clauses (SWI-Prolog source-code)

stylized depiction of a csv text file
Image via Wikipedia

If you need to convert into Prolog terms “raw data” supplied in EXCEL csv-files, read on! The source code in this posting will read any CSV file, converting each semicolon-delimited line (or record) of the CSV file into a Prolog clause, asserted in RAM. It is also possible to use the same code to read data deliberately provided (e.g. by another application) as a CSV-file, but which is specifically intended for use as a set of Prolog clauses.

This code also uses a couple of specification predicates: time_field_type/1, field1_as_functor/1, and conv_csvhead/2. These predicates control the behaviour of the conversion process, as follows:

time_field_type/1 :

  • time_field_type(0). In this case, time-fields in the CSV file (of the form “HH:MM” or “HH:MM:SS…”) are translated into minutes, ignoring seconds or hundredths of a second.
  • time_field_type(1). In this case, time-fields in the CSV file (of the form “HH:MM” or “HH:MM:SS…”) are translated into seconds, ignoring hundredths of a second.
  • time_field_type(2). In this case, time-fields in the CSV file are kept as they are, as atoms (e.g. ’03:35′, ’12:45:20′, etc).

field1_as_functor/1:

  • field1_as_functor(0): Each line in the CSV-file is interpreted as a prolog clause, where the functor of the clause is the first field of the record, and the other fields are arguments.
  • field1_as_functor(foo) (where ‘foo’ can be any atom): Each line in the CSV file is interpreted as a prolog clause, where the functor of the clause is foo (or any atom supplied as 1st argument to field1_as_functor/1) and all the fields are arguments.

conv_csvhead/2:

  • This predicate is used to convert the contents of the first field (of the CSV-file) into a (user-defined) internal Prolog representation. It is used only if “time_field_type(0)” exists. For example, to convert records where the first field is a Prolog functor ‘job’ but the actual contents of this field are ‘j’ (for brevvity), using a definition “conv_csvhead(j,job)” will convert each ‘j’ into a functor ‘job’. (Use of conv_csvhead/2 is optional; in the default case, it does nothing!)

Finally, some notes:

  • The main predicate to call is “loaddb(CSVfile)“, where CSVfile can be e.g. “test.csv”.
  • Provision has been taken for special fields which contain Lists of items, comma-delimited. In EXCEL these fields will appear as longish strings, but this code was written to parse them as Prolog atom-lists. (Comment-out this section if you don’t need it).
  • The only type of field that is currently not converted into any meaningful internal representation is DATE. Dates are converted to atoms, just as they appear, without parsing their actual contents. (As an exercise, you can re-use parts of the same code to parse date-fields!) The honest reason for this omission is that… I didn’t need dates (in an application I am developing, for which this code was also written).

The source-code follows. There are useful comments inside this code. You can just copy and paste what follows from this point onwards, into a text file saved for compilation by SWI-Prolog, ending in “.pl”: (more…)

Advertisements

September 20, 2007

SWI-Prolog source code: Converting hours-and-minutes to integers (e.g. for use in CLP)

Filed under: CLP, Conversions, Prolog, source code, SWI-Prolog, time-predicates — Omadeon @ 5:31 pm

This short posting is about a useful piece of SWI-Prolog code I keep (re-)using, ever since I wrote it. It is a predicate that converts time (in an EXCEL-compatible format ‘HH:MM’ or ‘HH:MM:SS’) to integers expressing minutes only, e.g. integers suitable for use in CLP applications (Constraints Logic Programming over finite integer domains). I am developing a serious CLP application, during the last few weeks and I regard the following (bi-directional) conversion code as indispensable:

%%% Conversion of Hours-and-minutes to integers and vice-versa (e.g. for CLP problems)
%%% converts number of minutes to a valid time-string e.g. '03:05':
mins2hourmin(MINS,OUTX):- nonvar(MINS), MINS > 0,
    Hours is MINS // 60, Minsx is MINS mod 60,
    num2str2(Hours,S1x), num2str2(Minsx,S2x),
    swritef(OUTX,'%w:%w',[S1x,S2x]), !
    ;
    MINS = 0, OUTX = '00:00', !.
mins2hourmin(HMINx,HRi):- nonvar(HRi),
    sub_atom(HRi,0,2,_,A1x), atom_number(A1x,HOURx),
    sub_atom(HRi,3,2,_,A2x), atom_number(A2x,MINSx),
    HMINx is MINSx + 60*HOURx, !.
mins2hourmin(MINS1,OUTX):- nonvar(MINS1), MINS1 < 0,
    MINS is -MINS1,
    Hours is MINS // 60, Minsx is MINS mod 60,
    num2str2(Hours,S1x), num2str2(Minsx,S2x),
    swritef(OUTX,'-%w:%w',[S1x,S2x]), !.
%% the same predicate operating on Lists of time-entities: 
mins2hourmin_list([],[]):- !.
mins2hourmin_list([M|ML],[X|XL]):- mins2hourmin(M,X), !, mins2hourmin_list(ML,XL).


%%% an auxilliary predicate for mins2hourmin/2:
num2str2(N,Sx):- N >= 10, swritef(Sx,'%w',[N]), !
    ;
    swritef(Sx,'0%w',[N]), !.

	

Bridging gaps between Prologs (SWI-Prolog predicates implented in LPA Win-Prolog)

Filed under: Code Conversion, LPA Win-Prolog, Prolog, source code, SWI-Prolog — Omadeon @ 4:36 pm

Today I spent too much time trying to force a SWI-Prolog project (of timetable scheduling, custom-made for a specific company) to run in a different compiler: LPA Win-Prolog. I needed badly to use certain graphics routines and other goodies of LPA Prolog (a commercial compiler), entire volumes of them in fact. So, I ended up writing code in LPA Prolog that implements some quite common SWI-Prolog predicates. Here are some of them:

The SWI-Prolog predicate ‘between/3’ generates (non-deterministically) a number, ranging from a minimum value to a maximum Value (as ‘bound’ 1st and 2nd arguments). I.e., in the SWI-Prolog console:

?- between(1,3,X).
X = 1 ;
X = 2 ;
X = 3 ;

Well, here is an LPA Win-Prolog implementation of this predicate (also valid in any other ISO-compatible Prolog):

between(Min,_,Min).
between(Min,Max,Out):- M2 is Min+1, M2 =< Max, between(M2,Max,Out).

OK, This was an easy example, while most probably the same code already exists in elementary Prolog textbooks. Here are some other (not-so-obvious) examples:

In LPA Prolog there are some very special, very efficient unique commands, like ‘find/3, which operates on ‘input streams’ to locate (sub-)strings inside them. Now, the so-called ‘input stream’ can itself (effectively) be just another string (turned into a stream through the special LPA command ‘<~’). The use of this predicate, ‘find/3’ to write quickly efficient code for non-deterministic search (of substrings inside larger strings) is a natural happy consequence. E.g.

findsubs(SubSTR,STR,Px):- len(SubSTR,Len), findrep(SubSTR,Len,Px) <~ STR.

findrep(_,Len,Px):- inpos(EndP), EndP > 0, Px is EndP-Len.
findrep(SubSTR,Len,Px):- find(SubSTR,0,Sx), +Sx=``, findrep(SubSTR,Len,Px). 

I wrote this code after understanding the (well-documented) similar non-deterministic predicate ‘replace’ (not a built-in command but given as simple source-code in LPA-Prolog’s Reference Quide, page 226). The reason I wrote it is because I needed it as a sub-predicate to implement of SWI-Prolog‘s superb predicate ‘sub_atom‘. (This was already used -alas in several places- inside the SWI-Prolog project, which was to be converted into LPA WIn-Prolοg).So, here is the resulting LPA-Prolog implementation of ‘sub_atom‘, valid for (almost) all possible flow-patterns (at least those used in my project, plus a few more):

/*********************************************************************************************
(a description of sub_atom/5 from SWI-Prolog's Open Source documentation):
sub atom(+Atom, ?Before, ?Len, ?After, ?Sub)
    ISO predicate for breaking atoms. It maintains the following relation: Sub is a sub-atom of Atom
    that starts at Before, has Len characters and Atom contains After characters after the match.
?- sub_atom(abc, 1, 1, A, S).
 A = 1, S = b
   The implementation minimises non-determinism and creation of atoms. This is a very flexible
   predicate that can do search, prefix- and suffix-matching, etc.
**********************************************************************************************/
% HERE is my LPA Win-Prolog (exact) implementation of 'sub_atom/5':
%%%%%%%%%% (also valid in most other ISO-ish Prologs) %%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% (i,i,i,o,o)
sub_atom(Atom,Start,Len,LenAfterX,SubX):-
   nonvar(Atom), nonvar(Start), nonvar(Len), var(LenAfterX), var(SubX),
   cat(Lx,Atom,[Start,Len]), Lx = [_,SubX,End], len(End,LenAfterX).
% (i,o,i,i,o)
sub_atom(Atom,Start,Len,LenAfter,SubX):-
   nonvar(Atom), nonvar(Len), nonvar(LenAfter), var(Start), var(SubX),
   len(Atom,LenTotal), Start is LenTotal - LenAfter - Len,
   cat(Lx,Atom,[Start,Len]), Lx = [_,SubX,_].
% (i,o,o,o,i)  (effectively a non-deterministic search-function)
sub_atom(Atom,Startx,SubLenx,LenAfterX,Sub):-
   nonvar(Atom), nonvar(Sub), var(SubLenx), var(LenAfterx), var(Startx),
   len(Sub,SubLenx),  atom_string(Atom,STR), atom_string(Sub,SubSTR), len(STR,Slen),
   findrep(SubSTR,SubLenx,Startx) <~ STR, LenAfterX is Slen-Startx-SubLenx.
% where 'findrep/3' was written previously as a part of 'findsubs' (top of this post).
%(i,i,o,o,o)
sub_atom(Atom,Start,Len,LenAfterX,SubX):-
   nonvar(Atom), nonvar(Start), var(Len), var(LenAfterX), var(SubX),
  cat(Lx,Atom,[Start]), Lx = [_,End], len(End,Len2),
  between(0,Len2,Lenx), cat(Lxx,End,[Lenx]),
  Lxx = [SubX,End2], len(End2,LenAfterX), Len=Lenx.
%(i,o,o,o,o)
sub_atom(Atom,Start,Len,LenAfterX,SubX):-
   nonvar(Atom), var(Start), var(Len), var(LenAfterX), var(SubX),
   len(Atom,Len1), between(0,Len1,Pos),
  cat(Lx,Atom,[Pos]), Lx = [_,End], len(End,Len2),
  between(0,Len2,Lenx), cat(Lxx,End,[Lenx]),
  Lxx = [SubX,End2], len(End2,LenAfterX), Len=Lenx, Start=Pos.
%(i,o,i,o,o)
sub_atom(Atom,Start,Len,LenAfterX,SubX):-
   nonvar(Atom), var(Start), nonvar(Len), var(LenAfterX), var(SubX),
   len(Atom,Len1), LenPre is Len1-Len, between(0,LenPre,Pos),
  cat(Lx,Atom,[Pos,Len]), Lx = [_,Mid,End],
  SubX = Mid, LenAfterX is Len1-Pos-Len, Start=Pos.

%%%%%%%%%% (end of code) %%%%%%%%

Well, that’s it, for the moment, I’m afraid.

I must go back to my project now, but (rest assured) I will be coming back here, soon, again and again, and again…

Blog at WordPress.com.