Stipa Data Collection Protocol

Stipa is an easy-to-use and highly adaptable data collection solution for mobile devices. Its versatility comes from an XML document called the Stipa Data Collection Protocol, which specifies what information will be collected during a particular data collection effort. The document is human-readable and can be edited in a basic text editor, although more sophisticated graphical user interfaces are also available for that purpose.

A Data Collection Protocol is constructed from seven core XML elements. Figure 1 illustrates the hierarchical structure of these core elements and how they should be arranged in the XML document. All other elements used in a Data Collection Protocol (some are required; others are optional) have been omitted from Figure 1 for clarity, but a full description of them can be found in subsequent figures and tables.

<?xml version="1.0" encoding="UTF-8" ?>
<Protocol>
  <Forms>
    <Form>
      <ObservationSets>
        <ObservationSet>
          <Observations>
            <Observation></Observation>
          </Observations>
        </ObservationSet>
      </ObservationSets>
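      <Attributes>
        <Attribute>
          <Categories>
            <Category></Category>
          </Categories>
        </Attribute>
      </Attributes>
    </Form>
  </Forms>
  <SharedLists>
    <SharedList>
      <Categories>
        <Category></Category>
      </Categories>
    </SharedList>
  </SharedLists>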
</Protocol>

Figure 1. XML document structure – core elements.

Protocol. This is the root element of the Data Collection Protocol. It is identified by a universally unique identifier (UUID), specified in the ID child element, that should remain unchanged even if the protocol is modified. Subsequent modifications to the protocol should instead be identified using the Version child element, and the date of the most recent modification should be recorded in the LastModified element. Unique identifiers help match data to metadata correctly and ease ingestion of data into external databases.
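
As a minimal sketch, the identifying children of a Protocol element might be filled in as follows. The UUIDs, label, and timestamp are invented for illustration, and the timestamp format is an assumption: Table 1 gives the type of LastModified only as Datetime.

```xml
<Protocol>
  <!-- Hypothetical UUID; stays the same for the life of the protocol -->
  <ID>3f2c9a1e-6b7d-4e2a-9c41-8d5f0a7b1c23</ID>
  <Label>Grassland Vegetation Survey</Label>
  <!-- Hypothetical UUID; replaced each time the protocol is modified -->
  <Version>a81d4f06-2c3b-4b7e-8f19-5e6c7d8a9b10</Version>
  <!-- Assumed ISO 8601 style timestamp; the exact format Stipa expects is not stated here -->
  <LastModified>2024-05-14T09:30:00</LastModified>
</Protocol>
```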

Form. Each protocol must contain one or more Form elements. Form elements are listed as child elements of Forms, which is a direct child of Protocol. Forms provide a way of logically organizing the observations that will be made on subjects during a data collection effort.

Attribute. Each Form element must contain one or more Attribute elements. Attribute elements are listed as child elements of Attributes, which is a direct child of Form. Attributes specify which characteristics of a subject need to be measured or observed. They also define how observations should be recorded; for example, as free-form text, selections from a list, numbers within a range, and so on.
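
The fragment below sketches how a form with two attributes might look, using the child elements listed in Table 1. The IDs, labels, bounds, and unit are hypothetical values chosen for illustration.

```xml
<Form>
  <ID>canopy</ID>
  <Label>Canopy Measurements</Label>
  <Attributes>
    <!-- A bounded numeric measurement recorded to one decimal place -->
    <Attribute>
      <ID>height</ID>
      <Label>Plant height</Label>
      <Type>number</Type>
      <Min>0</Min>
      <Max>300</Max>
      <Precision>1</Precision>
      <Unit>cm</Unit>
    </Attribute>
    <!-- An optional free-form note limited to 255 characters -->
    <Attribute>
      <ID>notes</ID>
      <Label>Field notes</Label>
      <Type>text</Type>
      <Optional>yes</Optional>
      <MaxLength>255</MaxLength>
    </Attribute>
  </Attributes>
</Form>
```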

ObservationSet. Optionally, each Form element can contain one or more ObservationSet elements. ObservationSet elements are listed as child elements of ObservationSets, which is a direct child of Form. ObservationSet elements specify how a form should be repeated for the same study subject. For example, the same attributes may need to be measured at multiple points along a line transect, or the same information may need to be collected for multiple horizons within a soil excavation. If multiple ObservationSets are listed for a single Form, each later ObservationSet is interpreted as being nested within the earlier ones, in the order they are listed.
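
To illustrate the nesting rule, the sketch below lists two observation sets for one form. All IDs and labels are hypothetical. Because the point set is listed after the transect set, it would be treated as nested within it, so, as we read the rule, the form would be completed once for each point on each transect.

```xml
<Form>
  <ID>cover</ID>
  <Label>Ground Cover</Label>
  <ObservationSets>
    <!-- Outer set: listed first -->
    <ObservationSet>
      <ID>transect</ID>
      <Label>Transect</Label>
      <Observations>
        <Observation><ID>t1</ID><Label>Transect 1</Label></Observation>
        <Observation><ID>t2</ID><Label>Transect 2</Label></Observation>
      </Observations>
    </ObservationSet>
    <!-- Inner set: listed second, so it is nested within each transect -->
    <ObservationSet>
      <ID>point</ID>
      <Label>Sample Point</Label>
      <Observations>
        <Observation><ID>p1</ID><Label>0 m</Label></Observation>
        <Observation><ID>p2</ID><Label>10 m</Label></Observation>
        <Observation><ID>p3</ID><Label>20 m</Label></Observation>
      </Observations>
    </ObservationSet>
  </ObservationSets>
  <Attributes>
    <Attribute>
      <ID>bare</ID>
      <Label>Bare ground</Label>
      <Type>number</Type>
      <Unit>%</Unit>
    </Attribute>
  </Attributes>
</Form>
```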

Observation. Each ObservationSet must contain one or more Observation elements. Observation elements are listed as child elements of Observations, which is a direct child of ObservationSet. If a form is completed multiple times for the same study subject, observations identify these replicate entries. In some data entry scenarios, it may be desirable to have a static, predefined set of observations. In other cases, it may not be possible to predict how many times a form will be completed during data collection (for example, how many horizons a particular soil profile will contain). If the ability to add and remove observations is desirable, the Mutable child element of ObservationSet should be set to yes. Observation sets are considered immutable by default.
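
For the soil horizon case, a mutable observation set might be sketched as follows. The IDs and labels are hypothetical, and the LabelMethod element is used as it is described in Table 1.

```xml
<ObservationSet>
  <ID>horizon</ID>
  <Label>Soil Horizon</Label>
  <!-- Allow observations to be added or removed during data entry -->
  <Mutable>yes</Mutable>
  <!-- Label each newly added observation by numeric order -->
  <LabelMethod>increment</LabelMethod>
  <Observations>
    <!-- One starting observation, since at least one Observation is required -->
    <Observation>
      <ID>h1</ID>
      <Label>1</Label>
    </Observation>
  </Observations>
</ObservationSet>
```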

SharedList. Optionally, each protocol can contain one or more SharedList elements. SharedList elements are listed as child elements of SharedLists, which is a direct child of Protocol. SharedList elements define categories that apply to more than one Attribute. Referencing a SharedList, rather than repeating the same Category list for different Attributes, reduces the size and complexity of the data collection protocol.
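
The sketch below shows two attributes drawing on one shared list. All IDs and labels are hypothetical. It assumes that the SharedList child of an Attribute holds the ID of the shared list; Table 1 describes that value only as the "name" of the list.

```xml
<Protocol>
  <!-- Protocol ID and Label omitted for brevity -->
  <Forms>
    <Form>
      <ID>veg</ID>
      <Label>Vegetation</Label>
      <Attributes>
        <Attribute>
          <ID>dominant</ID>
          <Label>Dominant species</Label>
          <Type>category</Type>
          <!-- Assumed to reference the shared list by its ID -->
          <SharedList>species</SharedList>
        </Attribute>
        <Attribute>
          <ID>secondary</ID>
          <Label>Secondary species</Label>
          <Type>category</Type>
          <SharedList>species</SharedList>
        </Attribute>
      </Attributes>
    </Form>
  </Forms>
  <SharedLists>
    <SharedList>
      <ID>species</ID>
      <Label>Species</Label>
      <Categories>
        <Category><ID>stco</ID><Label>Stipa comata</Label></Category>
        <Category><ID>bogr</ID><Label>Bouteloua gracilis</Label></Category>
      </Categories>
    </SharedList>
  </SharedLists>
</Protocol>
```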

Category. Optionally, each Attribute can contain one or more Category elements. Category elements are listed as child elements of Categories, which is a direct child of Attribute or SharedList. Category elements define the choices available for a particular Attribute. Categories should be listed only for attributes that are of the “category” type.
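
When a list of choices is used by only one attribute, the categories can be listed inline. The following sketch uses hypothetical IDs and labels and limits the user to a single selection with MaxCount.

```xml
<Attribute>
  <ID>aspect</ID>
  <Label>Slope aspect</Label>
  <Type>category</Type>
  <!-- Only one category may be selected -->
  <MaxCount>1</MaxCount>
  <Categories>
    <Category><ID>n</ID><Label>North</Label></Category>
    <Category><ID>e</ID><Label>East</Label></Category>
    <Category><ID>s</ID><Label>South</Label></Category>
    <Category><ID>w</ID><Label>West</Label></Category>
  </Categories>
</Attribute>
```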

Table 1. XML Elements.

Element | Description | Type | Maximum Length | Required
Protocol | | | | Yes
ID | Universally unique identifier | UUID | NA | Yes
Label | Protocol display name | Alphanumeric | 100 | Yes
Description | Protocol description | Alphanumeric | 255 | No
Origin | Name of the individual or institution that created the protocol | Alphanumeric | 100 | No
SubjectLabel | Display name of sampling subjects | Alphanumeric | 100 | No
Version | Universally unique identifier of the protocol version | UUID | NA | No
LastModified | Date and time of the last protocol edit | Datetime | NA | No
ShowAttributeValues | Yes if attribute values will be displayed by default in the attribute list and no otherwise. Default is no. | {yes, no} | NA | No
ShowCategoryDescription | Yes if category descriptions will be displayed by default in the category list and no otherwise. Default is no. | {yes, no} | NA | No
Forms | List of Form elements | Element list | NA | No
SharedLists | List of SharedList elements | Element list | NA | No
Form | | | | Yes
ID | Form identifier (must be unique within the scope of the protocol) | Alphanumeric | 40 | Yes
Label | Data form display name | Alphanumeric | 100 | Yes
Description | Data form description | Alphanumeric | 255 | No
AttributeLabel | Display name of data attributes. Default is ‘Attribute’. | Alphanumeric | 255 | No
ObservationSets | List of ObservationSet elements | Element list | NA | No
Attributes | List of Attribute elements | Element list | NA | No
Attribute | | | | Yes
ID | Attribute identifier (must be unique within the scope of the data form) | Alphanumeric | 40 | Yes
Label | Attribute display name | Alphanumeric | 100 | Yes
Description | Attribute description | Alphanumeric | 255 | No
ControlLabel | Label to display above the data entry control, if different from the attribute display name. Default is no label. | Alphanumeric | 255 | No
Figure | Name of the figure to display with the attribute description. Default is no figure. | Alphanumeric | 100 | No
Type | Attribute type | {text, number, boolean, date, time, calendar, category, photo, geometry} | NA | Yes
Optional | Yes if the attribute is optional and no otherwise. Default is no. | {yes, no} | NA | No
MaxLength | Maximum number of characters that can be entered. Default is 1000 characters. | Number in set {n>0} | 4 | No
Format (text type only) | Character sequence used to format the entry (n = number, a = lowercase or uppercase letter, A = uppercase letter). Default is no format. | Alphanumeric | 100 | No
Min (number type only) | Minimum allowable value. Default is no minimum bound. | Number | 40 | No
Max (number type only) | Maximum allowable value. Default is no maximum bound. | Number | 40 | No
Precision (number type only) | Maximum allowable decimal places. Default is unlimited decimal places. | Number in set {0≤n≤10} | NA | No
Increment (number type only) | Increment amount. Default is no increment amount. | Number | 40 | No
Unit (number type only) | Unit of measure. Default is no unit of measure. | Alphanumeric | 40 | No
Tool | Measurement tool. Default is no measurement tool. | {clinometer, gps} | NA | No
StartYear (date type only) | First year to be displayed in the date control. Default is 2000. | Number | 4 | No
EndYear (date type only) | Last year to be displayed in the date control. Default is 2050. | Number | 4 | No
SharedList (category type only) | Name of the shared list used to populate the category list. Default is no shared list. | Alphanumeric | 40 | No
MaxCount (category type only) | Maximum number of categories that can be selected. Default is unlimited selections. | Number in set {n>0} | 4 | No
AutoPrioritize (category type only) | Yes if most-used categories are moved to the top of the category list and no otherwise. Default is no. | {yes, no} | NA | No
LabelObservation (category type only) | Yes if selected categories will be used to label observations and no otherwise. Default is no. | {yes, no} | NA | No
CategoryModifier (category type only) | An option modifier to be applied to selections. Only relevant if MaxCount is 1. | Alphanumeric | 100 | No
Categories (category type only) | List of Category elements | Element list | NA | No
ObservationSet | | | | No
ID | Unique identifier within the scope of the data form | Alphanumeric | 40 | Yes
Label | Observation set display name | Alphanumeric | 100 | Yes
Description | Observation set description | Alphanumeric | 255 | No
Mutable | Yes if observations can be added/removed during data entry and no otherwise. Default is no. | {yes, no} | NA | No
LabelMethod | Method used to label new observations. Observations can be labeled by numeric order (increment), current date (date), or user-specified label (manual). Default is manual. Only relevant if Mutable is yes. | {manual, increment, date} | NA | No
Observations | List of Observation elements | Element list | NA | No
Observation | | | | No
ID | Unique identifier within the scope of the observation set | Alphanumeric | 40 | Yes
Label | Observation display name | Alphanumeric | 100 | Yes
Description | Observation description | Alphanumeric | 255 | No
SharedList | | | | No
ID | Unique identifier within the scope of the protocol | Alphanumeric | 40 | Yes
Label | Shared list display name | Alphanumeric | 100 | Yes
Description | Shared list description | Alphanumeric | 255 | No
Categories | List of Category elements | Element list | NA | No
Category | | | | No
ID | Unique identifier within the scope of the data attribute or shared list | Alphanumeric | 40 | Yes
Label | Category display name | Alphanumeric | 100 | Yes
Description | Category description | Alphanumeric | 255 | No

The code sample below illustrates the hierarchical structure of all required and optional XML elements in a Data Collection Protocol document. Stipa does not enforce the ordering of child elements, but following the order shown below makes the document easier to read.

<?xml version="1.0" encoding="UTF-8" ?>
<Protocol>
  <ID></ID>
  <Label></Label>
  <Description />
  <Origin />
  <SubjectLabel></SubjectLabel>
  <Version />
  <LastModified />
  <ShowAttributeValues />
  <ShowCategoryDescription />
  <Forms>
    <Form>
      <ID></ID>
      <Label></Label>
      <Description />
      <AttributeLabel></AttributeLabel>
      <ObservationSets>
        <ObservationSet>
          <ID></ID>
          <Label></Label>
          <Description />
          <Mutable />
          <LabelMethod />
          <Observations>
            <Observation>
              <ID></ID>
              <Label></Label>
              <Description />
            </Observation>
          </Observations>
        </ObservationSet>
      </ObservationSets>
      <Attributes>
        <Attribute>
          <ID></ID>
          <Label></Label>
          <Description />
          <ControlLabel />
          <Figure />
          <Type></Type>
          <Optional />
          <MaxLength />
          <Format />
          <Min />
          <Max />
          <Precision />
          <Increment />
          <Unit />
          <Tool />
          <StartYear />
          <EndYear />
          <SharedList />
          <MaxCount />
          <AutoPrioritize />
          <LabelObservation />
          <CategoryModifier />
          <Categories>
            <Category>
              <ID></ID>
              <Label></Label>
              <Description />
              <Priority />
            </Category>
          </Categories>
          <Validations>
            <Validation>
              <Type></Type>
              <Attributes>
                <Attribute></Attribute>
              </Attributes>
              <Dependencies>
                <Dependency></Dependency>
              </Dependencies>
              <Values>
                <Value></Value>
              </Values>
            </Validation>
          </Validations>
        </Attribute>
      </Attributes>
    </Form>
  </Forms>
  <SharedLists>
    <SharedList>
      <ID></ID>
      <Label></Label>
      <Description />
      <Categories>
        <Category>
          <ID></ID>
          <Label></Label>
          <Description />
          <Priority />
        </Category>
      </Categories>
    </SharedList>
  </SharedLists>
</Protocol>

Figure 2. XML document structure – all elements.
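
By contrast, a protocol that contains only the elements marked as required in Table 1 can be quite short. The following is a sketch under that reading of the table; the UUID, IDs, and labels are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Protocol>
  <!-- Hypothetical UUID -->
  <ID>7c1e5b90-4a2d-4f6b-8e3c-2d9a0b1c4e5f</ID>
  <Label>Site Visit Log</Label>
  <Forms>
    <Form>
      <ID>visit</ID>
      <Label>Site Visit</Label>
      <Attributes>
        <Attribute>
          <ID>date</ID>
          <Label>Visit date</Label>
          <Type>date</Type>
        </Attribute>
      </Attributes>
    </Form>
  </Forms>
</Protocol>
```

All optional elements are omitted here, so the defaults listed in Table 1 would apply.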
