STEMMA History

Development of the STEMMA® (“Source Text for Event and Ménage MApping”) data model and source format began around 2011. This page charts its chronological history.

Draft [2012-01-02]


First draft specification uploaded to new STEMMA Web site.

Draft [2012-04-27]

The STEMMA research notes were collected together and made (almost) readable. The 70+ pages were uploaded to the STEMMA site as a resource for any similar work on family history data to utilise.

V1.0 [2012-07-16]

STEMMA passed from being a draft specification to the first fully working version.

A number of its features were streamlined or revised after the specification was applied to my own data, and following further research. The copious associated Research Notes were updated in keeping with the new specification, and supplemented by a Data Model section that shows the model being applied to a number of case studies.

New or improved features include:


  • Rationalised way of extending partially controlled vocabularies in order to support custom types, subtypes, roles, styles, and other tag values.
  • Unified approach to defining core and custom properties for Persons and Places.
  • Streamlined handling of multi-valued properties (and of Citation/Resource parameters) such as 'Roles'.
  • Support for local-events (i.e. events that affect only one person).
  • Support for Events with multiple sources of information, i.e. multiple sets of properties for each associated Person.
  • Support for Dublin Core semantic tags for both Person/Place properties and Resource/Citation parameters. Support for their machine-readable OpenURLs.
  • Streamlined approach to Person and Place names that retains their unified handling but accommodates name types, name styles and sorting for different cultures.
  • Support for Dual Dates (aka Double Years).
  • Support for URL hyperlinks in narrative.
  • Support for general reference notes in narrative.
  • Copyright and other permissions/prohibitions.
  • Identification of physical artefacts as Resources.
  • Extended inheritance mechanism to Resources (e.g. attachments) and Citations (e.g. sources) so that the same details may be shared between multiple entities.


V2.0 [2013-05-28]


STEMMA underwent a considerable number of refinements to both strengthen and streamline its specification. Features include:


  • Better support for recording transcriptions, including uncertain characters, marginalia, original emphasis, alternative spellings/meanings.
  • Better separation of evidence from conclusion for marked-up references and for Property values.
  • Generic Group concept that can be used to model time-dependent sets of Persons, e.g. family units.
  • Support for attribution of individuals, whether represented within the family history or external to it. Support for contact details, including address, phone, email addresses, Web sites, and messaging systems.
  • Revised date-string representation for world calendars.
  • Downloads section added to Web site.


V2.1 [2013-10-16]


Changes include:


  • Changes to NoteRef. Added new Anom element for transcription anomalies.
  • Allow <br> in orig-text data.
  • Added Hamlet place-type.


V2.2 [2014-04-17]


Changes include:


  • Changed Group to also represent real-life entities rather than just Person Sets. Groups now include hierarchy, events, alternative names, subtype, and resources (docs & photos).
  • Added GroupRef mark-up.
  • Added GroupRef data-type.
  • Moved Person GroupLnk inside EventLnk/Eventlet.
  • Split BirthEvent/DeathEvent to allow for Eventlet.
  • Added equivalents of Birth/Death (Creation & Demise) for Group and Place. These replace Void in Places.
  • Handled "related entities" with JoinFrom (in Creation), SplitTo (in Demise), and RelatedTo elements in Place/Group.
  • Simplified Eventlet by removing hierarchy support.
  • Persisted Counters in Dataset header for assisted key generation.
  • Added GroupProperties to ExtendedProperties.
  • Event Place optional in Eventlet.
  • Resource entity distinguishes physical artefacts and images thereof.
  • Improved digital data-types for Resources.
  • Sensitivity levels accepted on Resources (e.g. photographs and documents).
  • External IDs accepted on Person, Place, Group, and Event entities.



V3.0 [2014-10-20]


Major change to trim excess flexibility, and to address certain known failings:


  • Could not represent the Properties for unidentified or incidental people in a given source.
  • Overloading of Role Property with relationships.
  • Problems representing a “directed Property”, i.e. one directed at another entity reference, as opposed to, say, Head.Wife.
  • Could not inherit from an Event when it had Detail elements in it.
  • Representation of top-level research reports.


Changes include:


  • Reversal of Person-to-Event (etc.) links to place Properties in the Event, alongside the respective source details.
  • Added References element to Event for representing subject references in the sources, and their respective Properties. This element supersedes the previous Detail element.
  • Introduction of “abstract” entities for the sole purpose of inheritance.
  • Made Event hierarchies bottom-up rather than top-down for consistency and ease of validation.
  • Deprecation of parameter substitution into Citation URIs; both named parameter markers and the ‘=?’ form.
  • Inclusion of NARRATIVE as a top-level Dataset entity for research reports and authored works.
  • Changed semantic types on Properties and Parameters to use “DC:” namespace prefix rather than simply “DC.”. DCType attribute changed to SemType.
  • Reinstatement of Event-specific Property values to represent named items of information for an event.
  • Explicit control over entity-Key imports for multi-Dataset Documents and multi-Document collections.
  • PersonEL, PlaceEL, and GroupEL data-types added for Properties that describe a relationship between two evidential subjects, such as person-to-person.
  • Addition of ‘Header’ TEXT_TYPE for details of authorship, title, etc., in narrative works.
  • Adjustments to NAME_VARIANTS to move the Type attribute, add an Initial='boolean' option for using initials, an indication of cultural style, and an optional override for character sorting.
  • Added optional PersonalName (within Person entity) to complement PlaceName (in Place) and GroupName (in Group).
  • Revised syntax of <Constraints> element to associate narrative with a specific constraint, e.g. to express causal relationships.
  • Added optional coordinates to a Place in order to represent a point, an enclosed area (i.e. polygon), or an open line (e.g. for a street).
  • Separation of Relationship from Role.
  • Several new event-types and event-subtypes.



V4.0 [2015-11-22]


Major change to finally accommodate sources, information, evidence, and conclusions in a single model that supports the major approaches to research and representation.


Changes include:


  • Introduction of a new Source entity that embraces both Citations and Resources for a particular information source. Citations and Resource entities are now connected to Source entity rather than to each other.
  • Support for source assimilation & analysis, source mining, and the ability to drill-down on conclusions, all provided via the Source entity.
  • The <References> element, within Events, is now superseded by <SourceLnk> which links to the new Source entity. Enclosed *Ref elements (e.g. <PersonRef>) changed to *Lnk elements for consistency. Removal of the ID attribute introduced in V3.0.
  • Support for cross-source analysis and correlation via a new Matrix entity.
  • Support for a generalised approach to multi-tier personae.
  • Addition of an Animal entity, strongly modelled on the Person entity, including related mark-up and namespaces.
  • <CitationLnk>/<ResourceLnk> from Person, Place, Group, and Event entities, changed to <SourceLnk>.
  • Reviewed the goal of sticking to XHTML tags for presentation: replaced the <Hi> element with HTML-like ones, and added support for <sup>/<sub> elements, columnar text, simple tables, and indentation.
  • Removal of ‘Unreadable’ mode from the <Anom> element.
  • Support for distinguishing manuscript and typescript transcriptions in the <Text> element. Support for numbering lines and pages in transcriptions. Positional control over annotations such as marginalia.
  • <FromText> element added to <Narrative> in order to share re-usable sections of text. As a result, the NoteKey attribute in the semantic mark-up was no longer required, and so it was removed.
  • Categorisation of the layers in a Citation chain.
  • The optional <DisplayFormat> element of the Citation entity has been re-interpreted as a set of pre-formatted language-specific strings. This set may exist in addition to the mandatory set of named parameter values, and the two together can also be used as a simple citation-template.
  • The Intrinsic Functions, mentioned at the end of Semantic Mark-up, have been changed to Intrinsic Methods in preparation for defining a run-time object model. The set is also supplemented by ones for accessing subject-entity names.
  • Small changes to subject-entity *-name-mode vocabularies to factor out a generic name-mode (missing from the previous specification).
  • Place coordinates (including bounding shapes) are now time-dependent, the same as any parent-Place link.
  • Added Canton and Colony to place-type vocabulary. The place-type of House is now replaced by Number and Apartment for flexibility.
  • <Quality>, <Reliability>, and <Credibility> elements moved from the Citation entity to the new Source entity.


Although refinements will continue, I anticipate that this will be the last major change to the STEMMA specification. I will, therefore, concentrate subsequent efforts on describing its advantages and philosophy, and on providing more worked examples.