If you want to include transcriptions in your data, they should appear in the Resource entity that describes the associated digital or physical artefact, not in any Citation entity that references the data source. A transcription could instead appear in a top-level Narrative entity, but this is not recommended because it dissociates the transcription from any digital image of the text and from other descriptive information about it. Note that the Source entity is designed to work with transcriptions identified by a <ResourceLnk> element.
In principle, the Source entity can work directly from an image, especially if the image contains a transcription within its meta-data (possibly linked to the image using SVG), but there are currently no examples of this.
A multi-page source may have separate Resource entities, but these can be grouped using STEMMA’s inheritance mechanism; see Hierarchical Sources. To make it convenient to cite individual parts of a source, the Source entity’s <SourceLet> elements also accept an abbreviated Parameter list in their own <ResourceLnk> elements, sharing the remaining Parameters with the main <Frame> element.
In summary, the transcription would appear in a <Narrative> element within the Resource entity, using the mark-up described under Narrative Structure to capture original presentational attributes, anomalies such as marginalia and corrections, and the semantics of subject references, dates, etc. Some of the semantic mark-up takes a DetLnk='key' attribute that allows that field to be labelled for analysis and correlation within the Source and Matrix entities. The Resource entity can even indicate that you hold the physical original as well as an image of it. See Handling Transcriptions, and also Structured Narrative for a worked example.
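As a rough illustration of how DetLnk labels make transcribed fields available for analysis, the following Python sketch pulls every labelled field out of a transcription held in a Resource entity. It is not taken from the STEMMA specification: only Resource, Narrative and DetLnk come from the text above, while the Key value, the PersonRef and DateRef element names, the label names and the sample transcription are all assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only. Resource, Narrative and DetLnk are named in the
# text above; the Key value, the PersonRef/DateRef element names, the label
# names and the sample transcription are assumptions made for this sketch.
FRAGMENT = """
<Resource Key='rBaptismPage'>
  <Narrative>
    Baptised <DateRef DetLnk='Bapt.Date'>12 May 1851</DateRef>,
    <PersonRef DetLnk='Child.Name'>Mary Ann</PersonRef>, daughter of
    <PersonRef DetLnk='Father.Name'>John Smith</PersonRef>.
  </Narrative>
</Resource>
"""


def labelled_fields(resource_xml: str) -> dict:
    """Map each DetLnk label in the transcription to the text it marks."""
    root = ET.fromstring(resource_xml)
    fields = {}
    for elem in root.iter():
        label = elem.get("DetLnk")
        if label is not None:
            # itertext() also collects text from any nested mark-up.
            fields[label] = "".join(elem.itertext()).strip()
    return fields


fields = labelled_fields(FRAGMENT)
for label, text in fields.items():
    print(f"{label}: {text}")
```

Because the labels travel with the transcription inside the Resource entity, a lookup of this kind needs nothing beyond the Resource itself; the real correlation in STEMMA is performed by the Source and Matrix entities, which this sketch does not attempt to model.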