Stipa is an easy-to-use and highly adaptable data collection solution for mobile devices. Stipa's versatility comes from an XML document called the Stipa Data Collection Protocol, which specifies what information will be collected during a particular data collection effort. The document is human-readable and can be edited with a basic text editor, although more sophisticated graphical user interfaces are also available for that purpose.
A Data Collection Protocol is constructed from seven core XML elements. Figure 1 illustrates the hierarchical structure of these core elements and how they should be arranged in the XML document. All other elements used in a Data Collection Protocol (some are required; others are optional) have been omitted from Figure 1 for clarity, but a full description of them can be found in subsequent figures and tables.
Protocol. This is the root element of the Data Collection Protocol. It is identified by a universally unique identifier (UUID), specified in the ID child element, that should remain unchanged even if the protocol is modified. Subsequent modifications to the protocol should be identified using the Version child element, and the date of the most recent modification should be recorded in the LastModified element. Unique identifiers make it easier to match data to metadata and to ingest data into external databases.
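
As a rough illustration, the identifying children of a Protocol element might look like the sketch below. The element names come from the description above; the UUIDs, label, and timestamp are made-up placeholders, and the timestamp format is an assumption, since the text only says LastModified holds a date and time.

```xml
<!-- Hypothetical values; only the element names are taken from the text -->
<Protocol>
  <!-- Stays the same for the life of the protocol -->
  <ID>3f2b8c1e-9d4a-4e6f-8a57-0c1d2e3f4a5b</ID>
  <Label>Soil Survey</Label>
  <!-- Identifies this particular revision of the protocol -->
  <Version>7a1c5d90-2b3e-4f68-9c0d-1e2f3a4b5c6d</Version>
  <!-- Timestamp format shown here is an assumption -->
  <LastModified>2020-01-15T09:30:00</LastModified>
</Protocol>
```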
Form. Each protocol must contain one or more Form elements. Form elements are listed as child elements of Forms, which is a direct descendant of Protocol. Forms provide a way of logically organizing the observations that will be performed on subjects during a data collection effort.
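
The nesting described above can be sketched as follows. The form IDs and labels are hypothetical, and the other Protocol and Form children are left out to keep the fragment short.

```xml
<Protocol>
  <!-- ID, Label, and other Protocol children omitted -->
  <Forms>
    <Form>
      <ID>site</ID>                     <!-- hypothetical identifier -->
      <Label>Site Description</Label>
      <!-- Attributes omitted -->
    </Form>
    <Form>
      <ID>soil</ID>                     <!-- hypothetical identifier -->
      <Label>Soil Profile</Label>
      <!-- Attributes omitted -->
    </Form>
  </Forms>
</Protocol>
```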
Attribute. Each Form element must contain one or more Attribute elements. Attribute elements are listed as child elements of Attributes, which is a direct descendant of Form. Attributes specify which characteristics of a subject need to be measured or observed. They also define how observations should be recorded: for example, as free-form text, selections from a list, or numbers within a range.
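
A minimal sketch of a form with two attributes, one recorded as free-form text and one as a number within a range, might look like this. The IDs, labels, bounds, and unit are invented for illustration; the element names and the type values are the ones listed in the reference table.

```xml
<Form>
  <ID>soil</ID>                         <!-- hypothetical identifier -->
  <Label>Soil Profile</Label>
  <Attributes>
    <!-- Recorded as free-form text -->
    <Attribute>
      <ID>notes</ID>
      <Label>Notes</Label>
      <Type>text</Type>
    </Attribute>
    <!-- Recorded as a number within a range -->
    <Attribute>
      <ID>depth</ID>
      <Label>Depth</Label>
      <Type>number</Type>
      <Min>0</Min>
      <Max>200</Max>
      <Unit>cm</Unit>
    </Attribute>
  </Attributes>
</Form>
```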
ObservationSet. Optionally, each Form element can contain one or more ObservationSet elements. ObservationSet elements are listed as child elements of ObservationSets, which is a direct descendant of Form. ObservationSet elements specify how a form should be repeated for the same study subject. For example, the same attributes may need to be measured at multiple points along a line transect, or the same information may need to be collected for multiple horizons within a soil excavation. If multiple ObservationSets are listed for a single Form, each later ObservationSet is interpreted as being nested within the ones listed before it.
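
The nesting rule can be sketched with two observation sets on one form. The IDs and labels are hypothetical; under the rule above, the set listed second would be treated as nested within the set listed first.

```xml
<Form>
  <!-- ID, Label, and Attributes omitted -->
  <ObservationSets>
    <!-- Listed first: the outer level of repetition -->
    <ObservationSet>
      <ID>transect</ID>
      <Label>Transect</Label>
      <!-- Observations omitted -->
    </ObservationSet>
    <!-- Listed second: interpreted as nested within "transect" -->
    <ObservationSet>
      <ID>point</ID>
      <Label>Point</Label>
      <!-- Observations omitted -->
    </ObservationSet>
  </ObservationSets>
</Form>
```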
Observation. Each ObservationSet must contain one or more Observation elements. Observation elements are listed as child elements of Observations, which is a direct descendant of ObservationSet. If a form is completed multiple times for the same study subject, observations identify these replicate entries. In some data entry scenarios, it may be desirable to have a static, pre-defined set of observations. In other cases, it may not be possible to predict how many times a form will be completed during data collection (for example, how many horizons a particular soil profile will contain). If observations need to be added and removed during data entry, the Mutable child element of ObservationSet should be set to yes. Observation sets are immutable by default.
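
For the soil horizon case, a mutable observation set could be sketched as below. The IDs and labels are hypothetical. LabelMethod and its increment value are taken from the reference table; choosing it here is only an illustration.

```xml
<ObservationSet>
  <ID>horizon</ID>                      <!-- hypothetical identifier -->
  <Label>Horizon</Label>
  <!-- Lets observations be added or removed during data entry -->
  <Mutable>yes</Mutable>
  <!-- New observations are labeled in numeric order -->
  <LabelMethod>increment</LabelMethod>
  <Observations>
    <!-- A starting observation; more can be added in the field -->
    <Observation>
      <ID>h1</ID>
      <Label>1</Label>
    </Observation>
  </Observations>
</ObservationSet>
```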
SharedList. Optionally, each protocol can contain one or more SharedList elements. SharedList elements are listed as child elements of SharedLists, which is a direct descendant of Protocol. SharedList elements define categories that apply to more than one Attribute. Referencing a SharedList, rather than repeating the same Category list for different Attributes, helps to reduce the size and complexity of the data collection protocol.
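
The sketch below shows two attributes drawing on one shared list. All IDs and labels are invented. It also assumes that the SharedList child of Attribute holds the ID of the shared list; the reference table only calls it the "name" of the shared list, so treat that detail as a guess.

```xml
<Protocol>
  <!-- ID, Label, and other Protocol children omitted -->
  <Forms>
    <Form>
      <!-- ID and Label omitted -->
      <Attributes>
        <Attribute>
          <ID>overstory</ID>
          <Label>Overstory Species</Label>
          <Type>category</Type>
          <!-- Assumed to match the shared list's ID -->
          <SharedList>species</SharedList>
        </Attribute>
        <Attribute>
          <ID>understory</ID>
          <Label>Understory Species</Label>
          <Type>category</Type>
          <SharedList>species</SharedList>
        </Attribute>
      </Attributes>
    </Form>
  </Forms>
  <SharedLists>
    <SharedList>
      <ID>species</ID>
      <Label>Species</Label>
      <Categories>
        <Category>
          <ID>psme</ID>
          <Label>Douglas-fir</Label>
        </Category>
        <Category>
          <ID>pipo</ID>
          <Label>Ponderosa pine</Label>
        </Category>
      </Categories>
    </SharedList>
  </SharedLists>
</Protocol>
```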
Category. Optionally, each Attribute can contain one or more Category elements. Category elements are listed as child elements of Categories, which is a direct descendant of Attribute or SharedList. Category elements define the choices available for a particular Attribute. Categories should be listed only for attributes that are of the “category” type.
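
A category-type attribute with its own list of choices might be sketched as follows. The IDs and labels are hypothetical; MaxCount and Priority are optional elements described in the reference table and are included only to show where they would sit.

```xml
<Attribute>
  <ID>texture</ID>                      <!-- hypothetical identifier -->
  <Label>Texture</Label>
  <Type>category</Type>
  <!-- At most one category can be selected -->
  <MaxCount>1</MaxCount>
  <Categories>
    <Category>
      <ID>sand</ID>
      <Label>Sand</Label>
    </Category>
    <Category>
      <ID>loam</ID>
      <Label>Loam</Label>
      <!-- Moves this choice to the top of the category list -->
      <Priority>yes</Priority>
    </Category>
  </Categories>
</Attribute>
```
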
Element | Description | Type | Maximum Length | Required |
---|---|---|---|---|
Protocol | | | | Yes |
ID | Universally unique identifier | UUID | 36 | Yes |
Label | Protocol display name | String | 100 | Yes |
Description | Protocol description | String | 255 | No |
Origin | Name of the individual or organization that created the protocol | String | 100 | No |
SubjectLabel | Display name of data collection subjects | String | 100 | No |
Version | Universally unique identifier of the protocol version | UUID | 36 | No |
LastModified | Date and time of the last protocol edit | Datetime | NA | No |
ShowAttributeValues | Yes if attribute values will be displayed by default in the attribute list and no otherwise. Default is yes. | String in set {no, yes} | NA | No |
ShowCategoryDescription | Yes if category descriptions will be displayed by default in the category list and no otherwise. Default is yes. | String in set {no, yes} | NA | No |
Forms | List of Form elements | Element list | NA | No |
SharedLists | List of SharedList elements | Element list | NA | No |
Form | | | | Yes |
ID | Form identifier (must be unique within the scope of the protocol) | String | 40 | Yes |
Label | Data form display name | String | 100 | Yes |
Description | Data form description | String | 255 | No |
AttributeLabel | Display name of data attributes. Default is ‘Attribute’. | String | 255 | No |
ObservationSets | List of ObservationSet elements | Element list | NA | No |
Attributes | List of Attribute elements | Element list | NA | No |
Attribute | | | | Yes |
ID | Attribute identifier (must be unique within the scope of the data form) | String | 40 | Yes |
Label | Attribute display name | String | 100 | Yes |
Description | Attribute description | String | 255 | No |
ControlLabel | Label to display above the data entry control, if different than the attribute display name. Default is no label. | String | 255 | No |
Figure | Name of the figure to display with the attribute description. Default is no figure. | String | 100 | No |
Type | Attribute type | String in set {boolean, calendar, category, date, geometry, number, photo, text, time} | NA | Yes |
Optional | Yes if the attribute is optional and no otherwise. Default is yes. | String in set {no, yes} | NA | No |
MaxLength | Maximum number of characters that can be entered. Default is 1000 characters. | Integer in set {n>0} | 4 | No |
Format (text type only) | Character sequence used to format the entry (n = number, a = lowercase or uppercase letter, A = uppercase letter). Default is no format. | String | 100 | No |
Min (number type only) | Minimum allowable value. Default is no minimum bound. | Decimal | 40 | No |
Max (number type only) | Maximum allowable value. Default is no maximum bound. | Decimal | 40 | No |
Precision (number type only) | Maximum allowable decimal places. Default is unlimited decimal places. | Integer in set {0≤n≤10} | NA | No |
Increment (number type only) | Increment amount. Default is no increment amount. | Integer | 40 | No |
Unit (number type only) | Unit of measure. Default is no unit of measure. | String | 40 | No |
Tool | Measurement tool. Default is no measurement tool. | String in set {clinometer, gps} | NA | No |
StartYear (date type only) | First year to be displayed in the date or category control. Default is 2000. | Integer | 4 | No |
EndYear (date type only) | Last year to be displayed in the date or category control. Default is 2050. | Integer | 4 | No |
SharedList (category type only) | Name of the shared list used to populate the category list. Default is no shared list. | String | 40 | No |
MaxCount (category type only) | Maximum number of categories that can be selected. Default is unlimited selections. | Integer in set {n>0} | 4 | No |
AutoPrioritize (category type only) | Yes if most-used categories are moved to the top of the category list and no otherwise. Default is no. | String in set {no, yes} | NA | No |
LabelObservation (category type only) | Yes if selected categories will be used to label observations and no otherwise. Default is no. | String in set {no, yes} | NA | No |
CategoryModifier (category type only) | An optional modifier to be applied to selections. Only relevant if MaxCount is 1. | String | 100 | No |
Categories (category or text type only) | List of Category elements | Element list | NA | No |
Validations | List of Validation elements | Element list | NA | No |
ObservationSet | | | | No |
ID | Unique identifier within the scope of the data form | String | 40 | Yes |
Label | Observation set display name | String | 100 | Yes |
Description | Observation set description | String | 255 | No |
Mutable | Yes if observations can be added/removed during data entry and no otherwise. Default is no. | String in set {no, yes} | NA | No |
LabelMethod | Method used to label new observations. Observations can be labeled by numeric order (increment), current date (date), or user-specified label (manual). Default is manual. Only relevant if Mutable is yes. | String in set {date, increment, manual} | NA | No |
Observations | List of Observation elements | Element list | NA | No |
Observation | | | | No |
ID | Unique identifier within the scope of the observation set | String | 40 | Yes |
Label | Observation display name | String | 100 | Yes |
Description | Observation description | String | 255 | No |
SharedList | | | | No |
ID | Unique identifier within the scope of the protocol | String | 40 | Yes |
Label | Shared list display name | String | 100 | Yes |
Description | Shared list description | String | 255 | No |
Categories | List of Category elements | Element list | NA | No |
Category | | | | No |
ID | Unique identifier within the scope of the data attribute or shared list | String | 40 | Yes |
Label | Category display name | String | 100 | Yes |
Description | Category description | String | 255 | No |
Priority | Yes if the category will be moved to the top of the category list and no otherwise. Default is no. | String in set {no, yes} | NA | No |
Validation | | | | No |
Type | Validation type | String in set {distinct value, exclusion set, exclusion switch, exclusive interval, inclusion set, inclusion switch, value combination} | NA | Yes |
Attributes | List of Attribute elements containing a single string value and no child elements | Element list | 40 (Attribute value) | No |
Dependencies | List of Dependency elements containing a single string value and no child elements | Element list | 255 (Dependency value) | No |
Values | List of Value elements containing a single string value and no child elements | Element list | 255 (Value value) | No |
The code sample below illustrates the hierarchical structure of all required and optional XML elements in a Data Collection Protocol document. Stipa does not enforce the ordering of child elements, but following the order shown below makes the document easier to read.