Defining a Source
A source defines the structure of the data in a data source. Like a database table, a source has fields and primary keys. The source configuration is stored in PENROSE_HOME/conf/sources.xml.
    <sources>
      <source name="u">
        <connection-name>MySQL</connection-name>
        <field name="username" primaryKey="true" />
        <field name="firstName" />
        <field name="lastName" />
        <field name="password" />
        <parameter>
          <param-name>tableName</param-name>
          <param-value>users</param-value>
        </parameter>
      </source>
    </sources>
To define a source, specify the following:
Name
Specify the source name. This name is used when creating mappings.
Connection Name
Specify the connection name used by this source.
Fields
Specify the names of the fields that will be accessed through this source. For JDBC sources, these are the table columns. For JNDI sources, these are the attribute types.
Parameters
Specify the connection-specific parameters. See the parameter tables below.
JDBC Source Parameters
| Parameter | Description | Required | Example |
|---|---|---|---|
| tableName | Database table name (Penrose 1.0) | Yes | users |
| catalog | Database catalog name (Penrose 1.1) | No | example |
| schema | Database schema name (Penrose 1.1) | No | system |
| table | Database table name (Penrose 1.1) | Yes | users |
| filter | Search filter | No | lastName = 'Smith' |
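As an illustration of the table above, a JDBC source using the Penrose 1.1 parameters might look like the following. This is a minimal sketch that reuses the element layout and the names from the example at the top of this page. It assumes that several `<parameter>` elements may be listed inside one `<source>`, which the original example does not show explicitly, and the parameter values are taken from the Example column.

```xml
<sources>
  <!-- Sketch only: "u" and "MySQL" are the names used in the example above -->
  <source name="u">
    <connection-name>MySQL</connection-name>
    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <!-- Penrose 1.1: catalog and table replace the 1.0 tableName parameter -->
    <parameter>
      <param-name>catalog</param-name>
      <param-value>example</param-value>
    </parameter>
    <parameter>
      <param-name>table</param-name>
      <param-value>users</param-value>
    </parameter>
    <!-- Optional: restrict the rows visible through this source -->
    <parameter>
      <param-name>filter</param-name>
      <param-value>lastName = 'Smith'</param-value>
    </parameter>
  </source>
</sources>
```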
LDAP Source Parameters
| Parameter | Description | Example |
|---|---|---|
| baseDn | Search base DN | dc=penrose,dc=safehaus,dc=org |
| scope | Search scope | OBJECT, ONELEVEL, or SUBTREE |
| filter | Search filter | (objectClass=*) |
| objectClasses | Comma-separated list of object classes for newly added entries | person,organizationalPerson,inetOrgPerson |
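The LDAP parameters can be combined in the same way. The following is a sketch, not a verified configuration: the source name `people`, the connection name `LDAP`, and the field names `uid`, `cn`, and `sn` are hypothetical, while the parameter names and values come from the table above.

```xml
<sources>
  <!-- Hypothetical source and connection names -->
  <source name="people">
    <connection-name>LDAP</connection-name>
    <!-- For LDAP sources, fields correspond to attribute types (assumed names) -->
    <field name="uid" primaryKey="true" />
    <field name="cn" />
    <field name="sn" />
    <parameter>
      <param-name>baseDn</param-name>
      <param-value>dc=penrose,dc=safehaus,dc=org</param-value>
    </parameter>
    <parameter>
      <param-name>scope</param-name>
      <param-value>ONELEVEL</param-value>
    </parameter>
    <parameter>
      <param-name>filter</param-name>
      <param-value>(objectClass=*)</param-value>
    </parameter>
    <!-- Object classes assigned to entries added through this source -->
    <parameter>
      <param-name>objectClasses</param-name>
      <param-value>person,organizationalPerson,inetOrgPerson</param-value>
    </parameter>
  </source>
</sources>
```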
Data Loading
There are two data loading mechanisms:
- Load everything at once (default)
This is faster for small databases whose data can be loaded quickly into memory. The full data, including the primary keys, is loaded in one operation.
- Search the primary keys first, then load as needed
This is more scalable for larger databases. The data source is first queried for the primary keys; full data is then loaded only for entries that are not already in the cache.
Data loading can be configured by adding the following parameters:
| Parameter | Description | Valid Values | Default |
|---|---|---|---|
| sizeLimit | Size limit | integer | 100 |
| loadingMethod | Loading method | loadAll, searchAndLoad | loadAll |
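For example, switching a source to the search-then-load mechanism might look like the sketch below. It assumes the loading parameters are declared as `<parameter>` elements inside the `<source>`, alongside the connection-specific ones; the source, connection, and table names are those of the example at the top of this page.

```xml
<sources>
  <source name="u">
    <connection-name>MySQL</connection-name>
    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <parameter>
      <param-name>tableName</param-name>
      <param-value>users</param-value>
    </parameter>
    <!-- Query primary keys first, then load only entries missing from the cache -->
    <parameter>
      <param-name>loadingMethod</param-name>
      <param-value>searchAndLoad</param-value>
    </parameter>
    <!-- Shown at its documented default of 100 -->
    <parameter>
      <param-name>sizeLimit</param-name>
      <param-value>100</param-value>
    </parameter>
  </source>
</sources>
```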
Cache
Each source has two caches:
- Filter cache
It stores the primary keys resulting from search operations.
- Data cache
It stores the data resulting from load operations.
When Penrose is about to search the data source using a search filter, it first checks the filter cache. If the filter is not in the cache, Penrose performs the search operation and then stores the resulting primary keys in the cache.
When Penrose is about to load data from the data source using a set of primary keys, it first checks the data cache. If any of the requested data has not been loaded, Penrose performs the load operation for the missing data only and then stores the results in the cache.
To configure the cache, add the following parameters:
| Parameter | Description | Valid Values | Default |
|---|---|---|---|
| filterCacheSize | Filter cache size | integer > 0 | 100 |
| filterCacheExpiration | Filter cache expiration (in minutes) | integer >= 0 | 5 |
| dataCacheSize | Data cache size | integer > 0 | 100 |
| dataCacheExpiration | Data cache expiration (in minutes) | integer >= 0 | 5 |
You can set the cache expiration to 0 to disable the cache. In this case, every request is performed directly against the data source.
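A cache configuration for a single source might be sketched as follows. The sizes and expiration times are arbitrary illustrative values, and the sketch assumes the cache parameters are declared as `<parameter>` elements inside the `<source>`, like the other parameters on this page.

```xml
<sources>
  <source name="u">
    <connection-name>MySQL</connection-name>
    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <parameter>
      <param-name>tableName</param-name>
      <param-value>users</param-value>
    </parameter>
    <!-- Filter cache: illustrative size, entries expire after 10 minutes -->
    <parameter>
      <param-name>filterCacheSize</param-name>
      <param-value>200</param-value>
    </parameter>
    <parameter>
      <param-name>filterCacheExpiration</param-name>
      <param-value>10</param-value>
    </parameter>
    <!-- Data cache: illustrative size; an expiration of 0 disables this cache -->
    <parameter>
      <param-name>dataCacheSize</param-name>
      <param-value>1000</param-value>
    </parameter>
    <parameter>
      <param-name>dataCacheExpiration</param-name>
      <param-value>0</param-value>
    </parameter>
  </source>
</sources>
```

With these values, search results would be served from the filter cache for up to 10 minutes, while every load would go to the database because the data cache is disabled.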
