Data Loaders

Introduction

Data loaders are Java objects that load data from a certain type of data source, such as a CSV file or a MySQL database, and expose the data as variables for the templates. For example, a data loader was invoked in the Quick Tour when you wrote csv(data/birds.csv).

Data loaders are typically invoked:

  • ... by TDD functions used in the data setting
  • ... by TDD functions used in the localData setting
  • ... by the pp.loadData method used in FreeMarker templates

Predefined data loaders

Data loaders that load data directly from files expect the path of the file as a parameter. This must be a real file-system path. If you give a relative path, it is interpreted relative to the dataRoot, which defaults to the sourceRoot if you didn't specify it. The data files can be outside the dataRoot directory; it is used only as a base directory.

Note: I have not written a database/SQL/JDBC data loader yet... Of course, you can write such a data loader yourself. Contributions are welcome!

csv

Parameters:
  1. path: string. The path of the CSV file.
  2. options: hash, optional. The list of valid options:
    • separator: The character that separates the columns. Defaults to ";". It can be any character. It also understands the string "tab", which means that the separator character will be the tab character.
    • encoding: The charset used for reading the CSV file. Defaults to the value of the sourceEncoding setting.
    • trimCells: Specifies if all cells will be trimmed (boolean). Trimming means the removal of all leading and trailing white-space. Defaults to no trimming (in which case only header cells are trimmed). For tables directly entered by humans (like in Excel) it is strongly recommended to turn this on.
    • decimalSeparator: Alternative character used for the decimal dot in the CSV files. The dot will always be assumed as decimal separator, except if groupingSeparator is set to dot. Note that this option has significance only if you use :n (or :number) in the headers, so the data loader has to interpret the text as numbers.
    • groupingSeparator: The character used as grouping symbol in the CSV file (like the comma in $20,000,000). By default, grouping is not allowed. Note that this option has significance only if you use :n (or :number) in the headers.
    • altTrue: Alternative word interpreted as boolean true. Note that this option has significance only if you use :b (or :boolean) in the headers, so the data loader has to parse the text as booleans.
    • altFalse: Alternative word interpreted as boolean false. Note that this option has significance only if you use :b (or :boolean) in the headers.
    • emptyValue: A string or a sequence of strings which define the values that are equal to an empty cell. For example, if this is ["-", "N/A"], then cells whose content is a minus sign or N/A will look like they were left empty. The comparison is case-sensitive. The header row is not affected by this option.
    • dateFormat: The pattern used for parsing date-only values. Note that this option has significance only if you use :d (or :date) in the headers.
    • timeFormat: The pattern used for parsing time-only values. Note that this option has significance only if you use :t (or :time) in the headers.
    • dateTimeFormat: The pattern used for parsing date-time values. Note that this option has significance only if you use :dt (or :dateTime) in the headers.
    • normalizeHeaders: Specifies if the header names coming from the file will be normalized or should be left as is (boolean). Normalization means:
      1. Remove the part between the first "(" and last ")", before the header is parsed for the column type identifier (like ":n"; column type identifiers are discussed later).
      2. After the type identifier was extracted and removed, the cell value is trimmed. (Note that this happens even if this option is off.)
      3. Then it's converted to lower case.
      4. Then the following characters are replaced with "_": space, comma, semicolon, colon.
      5. Then all "__" and "___" and so on are replaced with a single "_".
      For example, "Price, old (category: A, B, F): n" will be normalized to "price_old", and the column type identifier will be n.
    • headers: Use this if the CSV file has no header row, that is, if the first row doesn't store column names. The option value must be a sequence of strings. Each item in the sequence corresponds to a cell value of the imaginary (actually missing) header row, and will be parsed in the same way as a real header row would be.
    • replaceHeaders: Use this if the CSV file does have a header row, but you don't like/trust the content of it. The rules are the same as with the headers option, except of course that it replaces the existing header row.
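
The normalization steps listed for the normalizeHeaders option can be sketched in plain Java as follows. This is only an illustration, not FMPP's actual implementation: the class and method names are hypothetical, and the sketch assumes that the type identifier is whatever follows the last colon of the header cell.

```java
import java.util.Locale;

// Hypothetical sketch of the normalizeHeaders steps; not FMPP's real code.
final class HeaderNormalizationSketch {

    /** Returns {normalizedName, typeIdentifier}; the type is "" if none was given. */
    static String[] normalize(String header) {
        String h = header;

        // Step 1: remove the part between the first "(" and the last ")".
        int open = h.indexOf('(');
        int close = h.lastIndexOf(')');
        if (open != -1 && close > open) {
            h = h.substring(0, open) + h.substring(close + 1);
        }

        // Extract the type identifier (assumption: it follows the last colon).
        String type = "";
        int colon = h.lastIndexOf(':');
        if (colon != -1) {
            type = h.substring(colon + 1).trim();
            h = h.substring(0, colon);
        }

        // Step 2: trim. Step 3: convert to lower case.
        h = h.trim().toLowerCase(Locale.ROOT);
        // Step 4: replace space, comma, semicolon and colon with "_".
        h = h.replaceAll("[ ,;:]", "_");
        // Step 5: collapse runs of "_" into a single "_".
        h = h.replaceAll("_{2,}", "_");

        return new String[] {h, type};
    }

    public static void main(String[] args) {
        String[] r = normalize("Price, old (category: A, B, F): n");
        System.out.println(r[0] + " / " + r[1]); // prints "price_old / n"
    }
}
```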

Examples (with TDD syntax):

  • csv(data/foo.txt)
  • csv(data/foo.txt, {separator:tab, encoding:UTF-8})
  • csv(data/foo.txt, {separator:',', encoding:ISO-8859-2, headers:[name, size:n, lastModified:dt]})

The csv data loader parses a CSV (Column Separated Values) file, or a file of a similar format (such as tab-delimited text or comma separated values), and returns it as a sequence. The sequence is the list of the table rows, and each row is a hash, where you can access the cells with the column name. The column names (headers) are stored in the first row of the CSV file (unless you use the headers option), so this row is not part of the result sequence.

For example, if this is the CSV file:

name;color;price
rose;red;10
rose;yellow;15
tulip;white;6

and you load it into the variable flowers, then you can print it as:

<table border=1>
<#list flowers as flower>
  <tr><td>${flower.name}<td>${flower.color}<td>${flower.price}
</#list>
</table>

and the output will be:

<table border=1>
  <tr><td>rose<td>red<td>10
  <tr><td>rose<td>yellow<td>15
  <tr><td>tulip<td>white<td>6
</table>

The rows are not only hashes, but also sequences, so instead of the column name you can use the 0-based column index. Thus the above template could be written like this as well, although it is less readable:

<table border=1>
<#list flowers as flower>
  <tr><td>${flower[0]}<td>${flower[1]}<td>${flower[2]}
</#list>
</table>

and actually then it would be simpler to write:

<table border=1>
<#list flowers as flower>
  <tr><#list flower as cell><td>${cell}</#list>
</#list>
</table>

The values (cells) in the CSV file will always be exposed as string variables, unless you specify a different type in the header cells directly. To specify the type, type a colon after the column name, and a type identifier. The type identifier can be: n (or number) or b (or boolean) or d (or date) or t (or time) or dt (or dateTime) or s (or string). For example, if you want to expose the price column as number, and not as string (so you can do arithmetic with it, or use the number formatting facilities), then the CSV file would be:

name;color;price:n
rose;red;10
rose;yellow;15
tulip;white;6

Numerical values must use dot (.) as decimal separator, and no grouping, unless you change these with the decimalSeparator and/or groupingSeparator options.
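
How the decimalSeparator and groupingSeparator options could affect number parsing is sketched below. This is a rough illustration only, not the code FMPP uses; the class and method names are hypothetical, and null stands for an option that was not set.

```java
import java.math.BigDecimal;

// Hypothetical sketch of number cell parsing; not FMPP's real code.
final class CsvNumberSketch {

    /** Parses a number cell. Pass null for an option that is not set. */
    static BigDecimal parse(String cell, Character decimalSeparator, Character groupingSeparator) {
        StringBuilder normalized = new StringBuilder();
        for (char c : cell.trim().toCharArray()) {
            if (groupingSeparator != null && c == groupingSeparator.charValue()) {
                continue; // grouping symbols carry no value, so they are skipped
            }
            if (decimalSeparator != null && c == decimalSeparator.charValue()) {
                normalized.append('.'); // the alternative decimal separator becomes a dot
            } else {
                normalized.append(c);
            }
        }
        // Throws NumberFormatException for anything that is still not a plain number,
        // for example when grouping is used but groupingSeparator was not set.
        return new BigDecimal(normalized.toString());
    }

    public static void main(String[] args) {
        System.out.println(parse("10.5", null, null));      // prints 10.5
        System.out.println(parse("20,000,000", null, ',')); // prints 20000000
        System.out.println(parse("1.234,5", ',', '.'));     // prints 1234.5
    }
}
```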

Boolean values must be one of true, yes, y, 1, false, no, n, 0, or the words defined by the altTrue and altFalse options. Upper- and lower-case letters are not distinguished.
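
The boolean rule can be sketched in Java like this. The class and method names are hypothetical, and the sketch assumes that the altTrue and altFalse words are compared case-insensitively too, just like the built-in words; null stands for an option that was not set.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of boolean cell parsing; not FMPP's real code.
final class CsvBooleanSketch {

    private static final List<String> TRUE_WORDS = Arrays.asList("true", "yes", "y", "1");
    private static final List<String> FALSE_WORDS = Arrays.asList("false", "no", "n", "0");

    /** Parses a boolean cell. Pass null for altTrue/altFalse if the option is not set. */
    static boolean parse(String cell, String altTrue, String altFalse) {
        String s = cell.trim();
        if (matches(TRUE_WORDS, s) || (altTrue != null && altTrue.equalsIgnoreCase(s))) {
            return true;
        }
        if (matches(FALSE_WORDS, s) || (altFalse != null && altFalse.equalsIgnoreCase(s))) {
            return false;
        }
        throw new IllegalArgumentException("Not a boolean value: " + cell);
    }

    // Case-insensitive membership test.
    private static boolean matches(List<String> words, String s) {
        for (String word : words) {
            if (word.equalsIgnoreCase(s)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(parse("Yes", null, null));    // prints true
        System.out.println(parse("0", null, null));      // prints false
        System.out.println(parse("igen", "igen", "nem")); // prints true
    }
}
```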

Date, time and date-time values use common SQL format with optional time zone, as "2003-06-25", "22:30:08", "2003-06-25 10:30:08 PM", "2003-06-25 10:30:08 PM GMT+02:00". But if you use option dateFormat, timeFormat, or dateTimeFormat, then that format has to be used for dates, times, and date-times respectively. If the time zone is not given in a value, the value of the timeZone setting is used.

The variable returned by the csv data loader is not only a sequence, but also a hash at the same time, that contains one key: headers. This is a sequence that stores the column names.

<#list flowers.headers as h>
- ${h}
</#list>

will print:

- name
- color
- price

Note that only the name of the column is returned, not the type identifier (such as :n).

json

Parameters:
  1. path: string. The path of the JSON file.
  2. charset: string, optional. The charset used for reading the JSON file. Defaults to the sourceEncoding setting.

Examples (with TDD syntax):

  • json(data/foo.json)
  • json(data/foo.json, UTF-8)

This data loader parses a JSON file. The JSON file must contain a single top-level JSON value of any type, like a JSON array, or a JSON object, or a JSON string, etc. Example file content (this contains a JSON array on the top-level):

[
  {
    "name": "Jean Test",
    "maidenName": "Jean Test",
    "age": 20,
    "skills": [ "HTML", "CSS" ],
    "testResults": { "a": 10.5, "b": 20, "c": 30 },
    "decided": true
  },
  {
    "name": "José Test",
    "maidenName": null,
    "age": 30,
    "skills": [ "Ruby", "C++", "Cuda" ],
    "testResults": { "a": 20, "b": 30, "c": 40 },
    "decided": false
  }
]

Let's load the above JSON array into the applicants data-model variable:

data: {
  ...
  applicants: json(applicants.json)
  ...
}  

Note: Above, the applicants: label is required because this file defines a JSON array, so we need a name with which we can refer to it from the templates. If a JSON file defines a JSON object instead (like { "applicants": [ ... ], "otherStuff": ... }), then, if you omit the label, the top-level name-value pairs from the JSON object will be added to the data-model directly.

Now you could print the name and age of all applicants, and print their skills in nested listing like this:

<#list applicants as applicant>
  ${applicant.name} (age: ${applicant.age})
  <#list applicant.skills as skill>
    - ${skill}
  </#list>
</#list>

That is, JSON arrays will be FTL (FreeMarker Template Language) sequences, JSON objects will be FTL hashes, JSON strings will be FTL strings, JSON numbers will be FTL numbers, and JSON booleans will be FTL booleans. JSON nulls will be FTL undefined variables (Java nulls), so for applicants[1].maidenName FreeMarker would give an undefined variable error, and you need to write something like applicants[1].maidenName!'N/A' instead.

The loaded JSON will also act as an FTL node tree (an advanced FreeMarker feature), similarly to XML. This means, among other things, that you can get the parent of a value. The parent of a value is its containing array or object:

<#-- We just pick some skill for this example: -->
<#assign skill = applicants[0].skills[0]>
<#-- Now let's say we only have the skill variable, and we don't know
     how we got it. We can still find out whose skill it was: -->
${skill} is the skill of ${skill?parent?parent.name},
<#-- skill?parent is the array, the parent of that is the JSON object -->

More details about the node nature:

  • The ?node_type of a value will be the name of the JSON type, like "string", "array", "object", etc.
  • If a value is the right-side of a key-value pair in a JSON object, its ?node_name will be the value of the key (like "name", "age", etc.). Otherwise, it will be "unnamed" + capital node type, like "unnamedString".
  • When traversing the tree with <#visit ...>/<#recurse>, or using ?children, JSON null-s will appear as existing nodes with ?node_type "null" and no FTL type other than node (thus they are unprintable, uncomparable, unlistable, etc.).

See the FreeMarker Manual for more about nodes and the related directives and built-ins!

text

Parameters:
  1. path: string. The path of the text file.
  2. charset: string, optional. The charset used for reading the text file. Defaults to the sourceEncoding setting.

Examples (with TDD syntax):

  • text(C:/autoexec.bat)
  • text(C:/autoexec.bat, CP850)

This data loader loads a plain text file, and returns that as a string.

slicedText

Parameters:
  1. path: string. The path of the text file.
  2. options: hash, optional. The list of valid options:
    • separator: The string that slices the text file into items, that is, the string between two adjoining items. It defaults to "\n", so each line of the text will be an item. Examples: If you want the items to be separated with empty lines, then you should write "\n\n" (this is what is between two items if you imagine it). Of course, in this case the items can contain single line-breaks (but not multiple consecutive line-breaks), unlike in the default case. If you want to separate the items with a "---" that is considered as separator only if it's alone in a line (not in the middle of a line), then the separator should be "\n---\n". If you simply want to separate the items with semicolons, then the separator should be ";" (in this case you certainly want to use the trim option described below). There is some "magic" involved in the handling of line-breaks: 1. No matter if you are using UN*X, DOS/Windows or Mac line-breaks in the separator option and in the text file, they will be considered to be equal. 2. When the text file is searched for a multi-line separator string, those evil invisible extra spaces and tabs at the end of the lines will be tolerated.
    • trim: If it's true, then each item will be trimmed individually, that is, the leading and trailing white space will be removed from them. Defaults to false.
    • dropEmptyLastItem: If it's true then the last item will be removed from the result if it's a 0 length string (after trimming, if the trim option is true). Defaults to true, so if the text file ends with a separator or the text file is empty, you will not get a needless empty item.
    • encoding: The charset used for reading the text file. Defaults to the value of the sourceEncoding setting.
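
The slicing rules described by these options can be sketched in Java as follows. It is an approximation written for illustration, not FMPP's implementation; the class and method names are hypothetical. In particular, the sketch tolerates trailing spaces and tabs before every line-break of the separator, including a single-line "\n" separator, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of the slicedText rules; not FMPP's real code.
final class SlicedTextSketch {

    static List<String> slice(String text, String separator, boolean trim, boolean dropEmptyLastItem) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("The separator can't be empty");
        }
        // UN*X, DOS/Windows and Mac line-breaks are treated as equal.
        String normalizedText = normalizeLineBreaks(text);
        String[] sepLines = normalizeLineBreaks(separator).split("\n", -1);

        // Build a regular expression that matches the separator literally, but
        // tolerates spaces and tabs at the end of each line it spans.
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < sepLines.length; i++) {
            if (i > 0) {
                regex.append("[ \\t]*\\n");
            }
            if (!sepLines[i].isEmpty()) {
                regex.append(Pattern.quote(sepLines[i]));
            }
        }

        // Limit -1 keeps trailing empty items, so dropEmptyLastItem decides about them.
        List<String> items = new ArrayList<>(
                Arrays.asList(Pattern.compile(regex.toString()).split(normalizedText, -1)));
        if (trim) {
            for (int i = 0; i < items.size(); i++) {
                items.set(i, items.get(i).trim());
            }
        }
        if (dropEmptyLastItem && !items.isEmpty() && items.get(items.size() - 1).isEmpty()) {
            items.remove(items.size() - 1);
        }
        return items;
    }

    private static String normalizeLineBreaks(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        System.out.println(slice("a; b ;c;", ";", true, true)); // prints [a, b, c]
    }
}
```

With trim set to false and dropEmptyLastItem set to false, the same input would yield four items, the last one being an empty string.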

Examples (with TDD syntax):

  • slicedText(data/foo.txt)
  • slicedText(data/foo.txt, {separator:\n---\n, encoding:UTF-8})
  • slicedText(data/foo.txt, {separator:';', trim})

This data loader loads a text file and slices it into items (strings) by cutting out all occurrences of the separator string specified with the separator option. The result is a sequence of strings.

For example, if this is the data/stuff.txt file:

This is the first item.

Still the first item...

--8<--

The items are separated with this:
--8<--
just don't forget to surround it with
empty lines.

--8<--

This is the last item.

and it's loaded in the configuration file like:

data: {
    stuff: slicedText(data/stuff.txt, {separator:"\n\n--8<--\n\n"})
}

then the output of this template file

<#escape x as x?html>

<#list stuff as i>
<pre>
${i}
</pre>
</#list>

</#escape>

will be this:

<pre>
This is the first item.

Still the first item...
</pre>
<pre>
The items are separated with this:
--8&lt;--
just don't forget to surround it with
empty lines.
</pre>
<pre>
This is the last item.
</pre>

Note the double \n-s in the separator that cause the data loader to treat "--8<--" as separator only if it is surrounded with empty lines.

tdd

Parameters:
  1. path: string. The path of the TDD file.
  2. charset: string, optional. The charset used for reading the TDD file. Defaults to the sourceEncoding setting.

Examples (with TDD syntax):

  • tdd(data/foo.tdd)
  • tdd(data/foo.tdd, ISO-8859-5)

This data loader parses a TDD file. The loaded TDD is interpreted in hash mode, and TDD functions will invoke data loaders, exactly like with the data setting.

See <FMPP>/docs/examples/tdd for a concrete example.

tddSequence

Parameters:
  1. path: string. The path of the TDD file.
  2. charset: string, optional. The charset used for reading the TDD file. Defaults to the sourceEncoding setting.

This is like the tdd data loader, except that it interprets the file as a TDD sequence (TDD "sequence mode"), rather than as a TDD hash. So the result will be a sequence (a list), not a hash.

properties

Parameters:
  1. path: string. The path of the "properties" file.

Loads a "properties" file. The result is a hash.

xml

Parameters:
  1. path: string. The path of the XML file.
  2. options: hash, optional. See the list of options below.

Note: Sometimes XML files are source files (like the HTML and JPG files in the Quick Tour) rather than data files (like data/birds.tdd in the Quick Tour); you only need to invoke a template to render them to their final form (e.g. to HTML pages). If this is the case, you should use the renderXml processing mode, not the xml data loader. An example that uses this approach is <FMPP>/docs/examples/xml_rendering.

Loads an XML file. This uses the built-in XML wrapper of FreeMarker 2.3 and later; please read FreeMarker Manual/XML Processing Guide for more information about the usage of the returned variable.

Notes:

  • Comment nodes will be removed by default (see option removeComments below).
  • Processing instructions will be kept by default (see option removePIs below).
  • CDATA sections count as plain text nodes.
  • There are no adjacent text nodes in the node tree; adjacent text nodes will be joined.

The example <FMPP>/docs/examples/xml_try is a good tool to understand this data loader and its options.

Options:

  • removeComments: Optional, defaults to true: Specifies if XML comment nodes will be removed from the node tree.
  • removePIs: Optional, defaults to false: Specifies if XML processing instruction nodes will be removed from the node tree.
  • validate: Optional, defaults to the value of the validateXml setting: Specifies if the XML file should be validated (checked against the DTD). If it's true, and the XML is not valid, then the data loading fails.
  • namespaceAware: Optional, defaults to true: Specifies if the XML parsing should be XML name-space aware or not. If the parsing is not name-space aware, colons in element and attribute names will not be considered as name-space prefix delimiters, so the "local name" of the node will contain the colon and the name-space prefix before that.
  • xincludeAware: Optional, defaults to false: Basically, it specifies whether the XML parsing should replace the XInclude-s with the included content. For most practical purposes you should set it to true (but due to backward-compatibility issues it defaults to false). Note that for setting this successfully to true you need at least Java 5.
  • xmlns: Optional. This is a hash that maps prefixes to XML name-space URI-s. The prefixes can be used in the other options. This option has no impact outside the XML data loader invocation, so the prefixes will not be available in the templates. The XML name-space for prefixless elements (in other words, the default name-space) is "no name-space" initially, but it can be changed by assigning a name-space URI to the reserved prefix D (similar to using xmlns="..." in an XML document).
  • index: Adds attributes to element nodes, which can be helpful when you later process the XML with the templates. This is called indexing the elements. It is easier to show than to explain: look at the output of example <FMPP>/docs/examples/xml_try; the id attributes were added by this option, they are not present in the XML file. You can also easily try other aspects of indexing with that example. The value of this option is a hash that contains the indexing options, or a sequence of hashes if you want to do more indexing successively. The hash accepts these subvariables (sub-options):
    • element: Required. The name of the XML element or elements to index. If this is a string, then it selects a single element only. If this is a sequence, then it must be the sequence of element names that it selects. To use name-space prefixes, see option xmlns.
    • attribute: Optional, defaults to id. This is the name of the XML attribute that will be added to the indexed elements (the indexed elements are the elements selected by the sub-option element). If the element has this attribute already, then the original value of the attribute will be kept. To use name-space prefixes, see option xmlns.
    • numbering: Optional, defaults to sequential: The rule for the numbering of the indexed elements. The number is used in the attribute values; see option value. This must be one of the following strings:
      • sequential: The number for the first indexed element is 1, for the second indexed element is 2, etc. The element numbering is done in the "document order", that is, in the order in which the first character of the XML representation of each node occurs (after expansion of general entities).
      • hierarchical: Similar to sequential numbering, but, for example, if we have an indexed element that got number 2, then indexed elements nested into this element will get numbers 2_1, 2_2, 2_3, etc. The indexed elements that are nested into the element numbered as 2_1, will get numbers 2_1_1, 2_1_2, 2_1_3, etc. For example, look at the output file names in <FMPP>/docs/examples/xml_rendering; they are generated with the hierarchical indexing of the part and chapter elements. Or use <FMPP>/docs/examples/xml_try to try it.
    • value: Optional, defaults to 'ppi_%n': A string that describes the format of index values. All characters will go directly into the value, except that % is used for place holders. The valid place holders are:
      • '%n': This will be replaced with the value generated based on the value of option numbering.
      • '%e': This will be replaced with the name (local, prefixless name) of the element.
      • '%%': This will be replaced with a single %.
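
How the place holders of the value sub-option expand can be sketched in Java like this. The class and method names are hypothetical, and the real data loader may treat malformed patterns differently; here they are simply rejected.

```java
// Hypothetical sketch of index value formatting; not FMPP's real code.
final class IndexValueSketch {

    /**
     * Expands %n, %e and %% in the pattern.
     *
     * @param number the number produced by the numbering rule, like "3" or "2_1"
     * @param elementName the local (prefixless) name of the indexed element
     */
    static String format(String pattern, String number, String elementName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c != '%') {
                out.append(c); // ordinary characters go directly into the value
                continue;
            }
            if (i + 1 == pattern.length()) {
                throw new IllegalArgumentException("Dangling % at the end of: " + pattern);
            }
            char placeHolder = pattern.charAt(++i);
            switch (placeHolder) {
                case 'n': out.append(number); break;
                case 'e': out.append(elementName); break;
                case '%': out.append('%'); break;
                default:
                    throw new IllegalArgumentException("Unknown place holder: %" + placeHolder);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Default pattern with a hierarchical number:
        System.out.println(format("ppi_%n", "2_1", "chapter")); // prints ppi_2_1
        System.out.println(format("%e-%n", "3", "part"));       // prints part-3
    }
}
```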

eval

Parameters:
  1. expression: string. Expression in BeanShell language.
  2. variables: hash, optional. Values that will be visible for the expression as variables.

Evaluates a BeanShell expression, or runs a more complex BeanShell script and uses its returned value.

You can use the predefined variable engine to access the current FMPP engine instance.

Examples (with TDD syntax):

  • eval('System.getProperties()')
  • eval('fmpp.Engine.getVersionNumber()')
  • eval('
        f = engine.getSourceRoot().getParentFile();
        if (f == null) {
            return new String[]{};
        } else {
            return f.list();
        }
    ')
  • In a configuration file:
    width: 100
    height: 50
    area: eval('a * b', {a:get(width), b:get(height)})

htmlUtils

Parameters: none

The returned hash contains custom directives that help HTML template development.

Currently it contains only one directive: img. This is used like the HTML img element, but the width and height attributes are calculated automatically (unless these attributes are specified in the template). Also, the processing will stop with an error if the image file pointed to by the src attribute is missing.

Example: If you have loaded the hash into the html variable, then you can do this in a template:

<@html.img src="falcon.png" alt="Falcon" />

and the output will be something like this (depending on the concrete dimensions of the image):

<img src="falcon.png" alt="Falcon" width="80" height="120">

See also the example in <FMPP>/docs/examples/img_dims.

Directive img accepts any extra parameters (attributes). The extra parameters will be printed as the attributes of the HTML img element.

xhtmlUtils

Parameters: none

This is the same as htmlUtils, but it is for XHTML output.

now

Parameters:
  1. options: hash, optional. The supported options are:
    • locale: The locale (language) used for displaying the date. It defaults to the "locale" setting of the FMPP engine. The value of the option is a usual Java locale string, such as en_GB, fr, ar_SA, etc. To see the complete list of locale codes, call the command line FMPP tool:
      fmpp --print-locales
    • date: The format of the date (year, month, day) part of the date: short, medium, long, default. The exact meaning of these formats is locale dependent, and is defined by the Java platform implementation. You can't use this option together with the pattern option.
    • time: The format of the time (hour, minute, second, millisecond) part of the date: short, medium, long, default. The exact meaning of these formats is locale dependent, and is defined by the Java platform implementation. You can't use this option together with the pattern option.
    • pattern: A pattern that specifies the formatting of the date.
    • zone: The time zone you would like to use for displaying the date. Defaults to the "time zone" FMPP engine setting. The zone is given with a string, for example: GMT, GMT+02, GMT-02 or GMT+02:30.

Note: FreeMarker now has a .now variable, so you don't need this data loader any more. You can print the current date and time with ${.now} (or ${.now?date} or ${.now?time}), and the format and zone is specified by the global datetimeFormat (or dateFormat or timeFormat) and timeZone FMPP settings.

Examples (with TDD syntax):

  • now()
  • now({pattern:"EEEE, MMMM dd, yyyy, hh:mm:ss a '('zzz')'"})
    This will print something like: Tuesday, April 08, 2003, 09:24:44 PM (GMT+02)
  • now({date:short, time:short, zone:GMT})

This data loader loads the current date and time from the system clock, and produces a string from it. Since FreeMarker has introduced native date/time value support, it may be better to use pp.sessionStart or pp.now instead of this data loader.
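
The way the pattern, locale and zone options combine can be approximated with the standard java.text.SimpleDateFormat class, as in the sketch below. That the data loader formats dates this way internally is an assumption; the class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch of what the pattern, locale and zone options mean; not FMPP's real code.
final class NowSketch {

    static String format(Date when, String pattern, Locale locale, TimeZone zone) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, locale);
        formatter.setTimeZone(zone);
        return formatter.format(when);
    }

    public static void main(String[] args) {
        // Roughly what now({pattern:"...", zone:"GMT+02:00"}) asks for; the text
        // printed depends on the system clock at the time of running.
        System.out.println(format(
                new Date(),
                "EEEE, MMMM dd, yyyy, hh:mm:ss a '('zzz')'",
                Locale.US,
                TimeZone.getTimeZone("GMT+02:00")));
    }
}
```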

get

Parameters:
  1. varName: string. The name of the variable.
  2. subVarName: string, optional. The name of the sub-variable.
  3. subSubVarName: string, optional. ...etc. You can specify any number of parameters.

This data loader returns the value of a variable that was already put into the data hash. For example:

data: {
    a: 123
    b: get(a)
}

Here the value of b will be the same as the value of a, which is 123. You can retrieve the value of subvariables with additional parameters:

data: {
    a: {
        x: 123
        y: {
            q:234
        }
    }
    b: get(a, x)
    c: get(a, y, q)
}

Here b will be 123, c will be 234.
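
The sub-variable lookup can be pictured as walking nested hashes, one parameter per level. The Java sketch below models the data hash with java.util.Map; the class and method names are hypothetical, and the error handling is an assumption rather than FMPP's actual behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the lookup done by get; not FMPP's real code.
final class GetSketch {

    /** Follows the path through nested hashes: get(data, "a", "y", "q"). */
    static Object get(Map<String, Object> data, String... path) {
        Object current = data;
        for (String name : path) {
            if (!(current instanceof Map)) {
                throw new IllegalArgumentException("Not a hash, so it has no sub-variable: " + name);
            }
            Map<?, ?> hash = (Map<?, ?>) current;
            if (!hash.containsKey(name)) {
                throw new IllegalArgumentException("No such variable: " + name);
            }
            current = hash.get(name);
        }
        return current;
    }

    public static void main(String[] args) {
        // The same structure as in the TDD example: a: {x: 123, y: {q: 234}}
        Map<String, Object> y = new LinkedHashMap<>();
        y.put("q", 234);
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("x", 123);
        a.put("y", y);
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("a", a);

        System.out.println(get(data, "a", "x"));      // prints 123
        System.out.println(get(data, "a", "y", "q")); // prints 234
    }
}
```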

The get data loader was introduced so that data loaders can get the values loaded by other data loaders previously. For example:

data: {
    doc: xml(data/foo.xml)
    index: com.example.MyIndexBuilderDataLoader(get(doc))
}

The get data loader sees the variables that were already put into the hash in the same enclosing data/localData hash TDD expression where it is called, but it doesn't see values that are put into the data model elsewhere, as in other (inherited or inheriting) configuration files. However, if you use get in the localData setting, it will also see the session level data (see data).

antProperty

Parameters:
  1. propertyName: string. The name of the Ant property.
  2. defaultValue: any, optional. The value returned if the Ant property does not exist.

This data loader returns the value of an Ant property. If no Ant property with the given name exists, it will return nothing (i.e. will not add new variable to the shared data model), or it will return the value of the second parameter if you use that.

The values of Ant properties are strings. But sometimes you want to see a property as numerical variable, or as boolean variable, etc. If the property name you give as data loader parameter ends with ?n, then the string will be converted to number variable, and the ?n itself will not count as the part of the actual property name. The complete list of postfixes:

  • ?n: Convert to number. The string value must follow Java language number format, that is, the decimal separator is always dot (.).
  • ?b: Convert to boolean. Valid values are: true, false, yes, no, y, n, 0, 1. Upper- and lowercase letters are considered as equivalent.
  • ?d: Convert to date. Use common SQL format with optional time zone, as "2003-06-25 GMT". If the time zone is omitted, the value of the timeZone setting is used.
  • ?t: Convert to time. Use common SQL format with optional time zone, as "22:05:30 GMT-02:00". The usage of AM/PM is also supported, e.g. "10:05:30 PM". If the time zone is omitted, then the value of the timeZone setting is used.
  • ?dt: Convert to date-time. The date and time format is the same as with ?d and ?t, e.g. "2003-06-25 10:05:30 PM GMT".
  • ?s: Keep the property as string. This can be useful if the actual property name ends with a postfix, so you must protect it against misinterpretation: foo?t?s.
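
The postfix rule, including the foo?t?s escape, can be sketched in Java as follows. This is an illustration with hypothetical names, not the code of the data loader; it assumes that only the last ?-separated part is examined, and that a name without a recognized postfix is kept as a string.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of postfix handling; not FMPP's real code.
final class AntPropertyPostfixSketch {

    private static final List<String> POSTFIXES = Arrays.asList("n", "b", "d", "t", "dt", "s");

    /** Returns {actualPropertyName, typePostfix}; the type is "s" if no postfix was given. */
    static String[] split(String parameter) {
        int q = parameter.lastIndexOf('?');
        if (q != -1) {
            String postfix = parameter.substring(q + 1);
            if (POSTFIXES.contains(postfix)) {
                // Only the last postfix is consumed; the rest belongs to the name.
                return new String[] {parameter.substring(0, q), postfix};
            }
        }
        return new String[] {parameter, "s"};
    }

    public static void main(String[] args) {
        System.out.println(String.join(" / ", split("size?n")));  // prints size / n
        System.out.println(String.join(" / ", split("foo?t?s"))); // prints foo?t / s
        System.out.println(String.join(" / ", split("plain")));   // prints plain / s
    }
}
```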

To see a concrete example, look at <FMPP>/docs/examples/ant2.

This data loader will work only if you execute FMPP as Ant task.

antProperties

Parameters:
  1. propertyName1: string, optional. The name of an Ant property to expose.
  2. propertyName2: string, optional. The name of another Ant property to expose.
  3. propertyNameN: string, optional. ...etc. You can specify any number of Ant parameters.

Returns the set of all Ant properties, or the set of selected ant properties, as a hash.

If you use this data loader without parameters, then it returns the hash of all Ant properties. To see a concrete example, look at <FMPP>/docs/examples/ant.

You can also give the name of properties to expose as parameters. For example if you want to put only the Ant properties foo and bar into the shared data model, then you write: antProperties(foo, bar). Parameters that refer to non-existing Ant properties will be silently ignored. The same postfixes (as ?n) are supported as with antProperty.

To see a concrete example, look at <FMPP>/docs/examples/ant2.

This data loader will work only if you execute FMPP as Ant task.

antProject

Parameters: none

Returns the current Ant project object (org.apache.tools.ant.Project). See the example in <FMPP>/docs/examples/ant.

This data loader will work only if you execute FMPP as Ant task.

antTask

Parameters: none

Returns the current Ant task object (org.apache.tools.ant.Task). See the example in <FMPP>/docs/examples/ant.

This data loader will work only if you execute FMPP as Ant task.

Custom data loaders

If you want to write a custom data loader, you have to write a Java class that implements the fmpp.tdd.DataLoader interface (see the API documentation). Then, if the class is available in the class path, or if you drop its jar into the lib directory of the FMPP installation, you can call it similarly to a predefined data loader, except that you use the fully qualified class name as the data loader name. For example:
com.example.fmpp.SQLDataLoader("SELECT * FROM products").

If you have written a data loader that can be useful for other FMPP users, I will be happy to make it available on the FMPP home page for download. So if you want to contribute with your data loader, drop me a mail (ddekanyREMOVETHIS@freemail.hu (delete the "REMOVETHIS"!)) or write to the mailing list. Thank you!