A.I. programming in Prolog and Assembler

September 21, 2007

Reading EXCEL CSV-files as Prolog Clauses (SWI-Prolog source-code)

If you need to convert “raw data” supplied in EXCEL CSV files into Prolog terms, read on! The source code in this posting reads any semicolon-delimited CSV file, converting each line (or record) of the file into a Prolog clause that is asserted in RAM, i.e. added to the in-memory Prolog database. The same code can also read a CSV file that another application has written deliberately for use as a set of Prolog clauses.
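
To make the idea of “one record, one list of fields” concrete, here is a minimal sketch, not the code of this posting. It assumes a recent SWI-Prolog (version 7 or later, for split_string/4), and the names line_fields/2 and field_term/2 are hypothetical, chosen only for illustration.

```prolog
% Sketch only: split one semicolon-delimited record into typed fields.
% line_fields/2 and field_term/2 are hypothetical names, not from the post.
line_fields(Line, Fields) :-
    split_string(Line, ";", " ", Parts),   % split on ';' and trim spaces
    maplist(field_term, Parts, Fields).

% A field that reads as a number becomes a number;
% anything else becomes an atom.
field_term(String, Term) :-
    (   number_string(Number, String)
    ->  Term = Number
    ;   atom_string(Term, String)
    ).
```

For example, the query line_fields("job;42;3.5;hello", F) binds F to [job, 42, 3.5, hello]. The full source in this posting handles more cases than this sketch (time fields and list fields, for instance).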

This code also uses three specification predicates: time_field_type/1, field1_as_functor/1, and conv_csvhead/2. These predicates control the behaviour of the conversion process, as follows:

time_field_type/1 :

  • time_field_type(0). In this case, time-fields in the CSV file (of the form “HH:MM” or “HH:MM:SS…”) are translated into minutes, ignoring seconds or hundredths of a second.
  • time_field_type(1). In this case, time-fields in the CSV file (of the form “HH:MM” or “HH:MM:SS…”) are translated into seconds, ignoring hundredths of a second.
  • time_field_type(2). In this case, time-fields in the CSV file are kept as they are, as atoms (e.g. '03:35', '12:45:20', etc).
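
The three settings above can be sketched as follows. This is an illustration under stated assumptions, not the code of this posting: the helper names time_field/3, hms/4, seconds_part/2 and leading_digits/2 are hypothetical, the mode is passed as an explicit argument instead of being looked up through time_field_type/1, and anything after the two seconds digits is simply dropped.

```prolog
% Sketch only: all predicate names here are hypothetical.
% time_field(+Mode, +TimeAtom, -Value), Mode as in time_field_type/1.
time_field(2, Atom, Atom).                 % mode 2: keep the atom as it is
time_field(0, Atom, Minutes) :-            % mode 0: minutes, seconds ignored
    hms(Atom, H, M, _),
    Minutes is H*60 + M.
time_field(1, Atom, Seconds) :-            % mode 1: whole seconds
    hms(Atom, H, M, S),
    Seconds is H*3600 + M*60 + S.

% hms(+Atom, -H, -M, -S): S is 0 when the field is just "HH:MM".
hms(Atom, H, M, S) :-
    split_string(Atom, ":", " ", [HS, MS | Rest]),
    number_string(H, HS),
    number_string(M, MS),
    (   Rest = [SS | _]
    ->  seconds_part(SS, S)
    ;   S = 0
    ).

% Keep only the leading digits of the seconds part, so that any
% trailing fraction (hundredths) is ignored.
seconds_part(SS, S) :-
    string_codes(SS, Codes),
    leading_digits(Codes, Digits),
    Digits \== [],
    number_codes(S, Digits).

leading_digits([C | Cs], [C | Ds]) :-
    code_type(C, digit), !,
    leading_digits(Cs, Ds).
leading_digits(_, []).
```

With these definitions, time_field(0, '03:35', X) gives X = 215 and time_field(1, '12:45:20', X) gives X = 45920.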

field1_as_functor/1:

  • field1_as_functor(0): Each line in the CSV-file is interpreted as a Prolog clause, where the functor of the clause is the first field of the record, and the other fields are arguments.
  • field1_as_functor(foo) (where ‘foo’ can be any atom): Each line in the CSV file is interpreted as a Prolog clause, where the functor of the clause is foo (or whichever atom is supplied as the argument of field1_as_functor/1) and all the fields are arguments.
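
The two cases can be sketched with the standard “univ” operator (=..). This is only an illustration, not the code of this posting: record_clause/3 and assert_record/2 are hypothetical names, and the setting is passed as an explicit first argument instead of being read from field1_as_functor/1.

```prolog
% Sketch only: record_clause/3 and assert_record/2 are hypothetical names.
% record_clause(+Spec, +Fields, -Clause), Spec as in field1_as_functor/1.
record_clause(Spec, Fields, Clause) :-
    (   Spec == 0
    ->  Fields = [Functor | Args]          % first field names the clause
    ;   Functor = Spec,                    % fixed functor, e.g. foo
        Args = Fields                      % every field is an argument
    ),
    Clause =.. [Functor | Args].

% Build the clause and add it to the in-memory database.
assert_record(Spec, Fields) :-
    record_clause(Spec, Fields, Clause),
    assertz(Clause).
```

For example, record_clause(0, [job, 1, '03:35'], C) gives C = job(1, '03:35'), while record_clause(foo, [job, 1], C) gives C = foo(job, 1). Note that in the first case the first field must be an atom, otherwise =.. raises a type error.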

conv_csvhead/2:

  • This predicate is used to convert the contents of the first field (of the CSV-file) into a (user-defined) internal Prolog representation. It is used only if “field1_as_functor(0)” exists, since only then does the first field supply the functor. For example, if the records should become clauses with the functor ‘job’, but the first field actually contains just ‘j’ (for brevity), the definition “conv_csvhead(j,job)” will convert each ‘j’ into the functor ‘job’. (Use of conv_csvhead/2 is optional; in the default case, it does nothing!)
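
A minimal sketch of how such a mapping could be applied follows. The wrapper head_functor/2 is a hypothetical name, and the fall-through for unmapped values is an assumption about what “does nothing” means in the default case; it is not taken from the code of this posting.

```prolog
% Sketch only: head_functor/2 is a hypothetical wrapper.
:- dynamic conv_csvhead/2.

conv_csvhead(j, job).                      % user-supplied mapping, as above

% head_functor(+Raw, -Functor): apply the mapping when one exists,
% otherwise leave the first field unchanged (assumed default).
head_functor(Raw, Functor) :-
    (   conv_csvhead(Raw, Mapped)
    ->  Functor = Mapped
    ;   Functor = Raw
    ).
```

With this definition, head_functor(j, F) gives F = job, and a first field with no mapping, such as task, is returned as it is.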

Finally, some notes:

  • The main predicate to call is “loaddb(CSVfile)”, where CSVfile can be e.g. “test.csv”.
  • Provision has been made for special fields which contain lists of items, comma-delimited. In EXCEL these fields will appear as longish strings, but this code was written to parse them as Prolog lists of atoms. (Comment out this section if you don’t need it.)
  • The only type of field that is currently not converted into any meaningful internal representation is DATE. Dates are converted to atoms, just as they appear, without parsing their actual contents. (As an exercise, you can re-use parts of the same code to parse date-fields!) The honest reason for this omission is that… I didn’t need dates (in an application I am developing, for which this code was also written).
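
As an illustration of the list fields mentioned in the notes above, a comma-delimited field could be turned into a list of atoms roughly like this. It is a sketch assuming SWI-Prolog 7 or later; list_field/2 and string_atom/2 are hypothetical names, and the sketch assumes the caller already knows that the field holds a list.

```prolog
% Sketch only: list_field/2 and string_atom/2 are hypothetical names.
% list_field(+Field, -Atoms): split on commas, trim spaces, make atoms.
list_field(Field, Atoms) :-
    split_string(Field, ",", " ", Parts),
    maplist(string_atom, Parts, Atoms).

% atom_string/2 with its arguments swapped, for use with maplist/3.
string_atom(String, Atom) :-
    atom_string(Atom, String).
```

For example, list_field("red, green,blue", L) gives L = [red, green, blue].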

The source-code follows. There are useful comments inside this code. You can just copy and paste what follows from this point onwards, into a text file saved for compilation by SWI-Prolog, ending in “.pl”: (more…)
