The base content model for token-level elements,
including PCDATA, possibly intermixed with <abbr> and
<num> elements.
Elements that can appear at the paragraph level, i.e.,
in between paragraphs, at the same level as <p>. This includes the
elements in class M.INTER plus <p> and <sp>.
An abbreviation of any sort.
A loosely structured bibliographic citation appearing
within a corpus text.
The body of the text, excluding any front or back
matter.
Contains the primary statement of responsibility given
for a work on its title page or at the head or end of the work, most often
applicable to newspapers. Can contain any phrase-level element plus the tag
<docAuthor> for the author's name.
Zero or more phrase-level elements
(xces:phrase.seq).
A single document, either forming part of or derived
from a corpus, containing a <cesHeader> element, followed by either a
<body> element or a <group> element.
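As an illustration of the document shape just described, the following is a minimal sketch in Python using the standard library. The root element name `cesDoc`, the sample content, and the helper `has_expected_shape` are assumptions made for this example; the description above only states that a `<cesHeader>` is followed by either a `<body>` or a `<group>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The root name "cesDoc" is an assumption;
# only <cesHeader>, <body> and <group> are named in the description.
SAMPLE = """<cesDoc>
  <cesHeader/>
  <body><p>Some text.</p></body>
</cesDoc>"""


def has_expected_shape(doc):
    """Return True if the first child is <cesHeader> and the second is
    either <body> or <group>."""
    tags = [child.tag for child in doc]
    return (
        len(tags) >= 2
        and tags[0] == "cesHeader"
        and tags[1] in ("body", "group")
    )


root = ET.fromstring(SAMPLE)
print(has_expected_shape(root))
```

This check only looks at element order; it is not a substitute for validating against the schema itself.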
Used to group together material appearing at the end of
a division, including in particular <dateline> and
<keywords>.
Contains the correct form of a passage apparently
erroneous in the copy text.
A date in any format.
Can contain untagged prose intermixed with markup for
dates, times, names, addresses, abbreviations, and numbers.
Identifies a word or phrase regarded as linguistically
distinct (e.g., archaic, technical, or dialect).
Any subdivision of a written text, e.g., a chapter,
section, sub-section, or article.
The location of a graphic, illustration, or
figure.
A point where material has been omitted in a
transcription, whether as a matter of editorial sampling practice or because the
material is illegible.
Groups together a sequence of distinct texts that are
regarded as a unit, such as a sequence of prose essays, poems, etc.
Any heading, for example, the title of a section. This
element can also appear inside the <list> and <poem>
elements to mark the title of a list or poem. It can contain any phrase-level
element.
A word or phrase graphically distinct from the
surrounding text, for reasons concerning which no claim is made. The rend attribute
should provide the original rendition information when its function has not yet been
determined.
An item within a list.
Terms and lists of terms that may appear at the
beginning or end of a text as identifying material.
A line of verse.
Groups verse lines (marked by <line>,
lowercase el), most often into stanzas. Use the type attribute to identify the
reason for the grouping.
A collection of distinct items flagged as such by
special layout in written texts, often functioning as a single syntactic unit. Note
that <list> is the only phrase-level element which is also a
paragraph-level element; its content model is exactly the same in both instances.
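To make the list structure concrete, here is a small Python sketch that reads a list with an optional title. The child element names `head` and `item`, the sample content, and the helper `read_list` are assumptions for illustration; the descriptions here name only `<list>` explicitly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The names "head" and "item" are assumed, not taken
# from the descriptions, which only name <list> itself.
SAMPLE = """<list>
  <head>Ingredients</head>
  <item>flour</item>
  <item>water</item>
</list>"""


def read_list(elem):
    """Return (title, items) for a list element; title is None if absent."""
    title = elem.findtext("head")
    items = [item.text for item in elem.findall("item")]
    return title, items


print(read_list(ET.fromstring(SAMPLE)))
```

Because the content model is stated to be the same at phrase level and paragraph level, the same reader should apply wherever the list occurs.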
A number, word, or phrase indicating a
quantity.
A proper noun or noun phrase.
Any form of note, usually a footnote. This tag marks
only notes that are part of the original text, not notes added later by the
encoder or others.
A number, written in any form.
Groups together any opening material that is not a
heading at the start of a division, including in particular <dateline>
and <keywords>.
A paragraph in a written text.
A poem, or an extract from one, embedded or quoted
within a text.
A pointer to another location in the current document in
terms of one or more identifiable elements.
Quoted dialogue or other quoted material appearing
inside a paragraph.
A quotation from some author other than that of the
surrounding text, usually either embedded or displayed.
Author of a quote or poem in the
text.
A reference to another location in the current document,
in terms of one or more identifiable elements, possibly modified by additional text
or comment.
Text which has been regularized or normalized in some
sense.
Identifies an s-unit within a document, typically an
orthographic sentence.
Material marked as "written to be spoken" or "written
as spoken", usually by the presence of a speaker prefix, for example in a play
script or printed interview.
Any kind of stage direction within a dramatic
text.
Text displayed in tabular form.
A single-word, multi-word or symbolic designation which
is regarded as a technical term.
An individual text.
A phrase defining a time of day in any
format.
Extends the globalAtts group to include type and wsd
attributes.
Contains a series of sentences (marked with
<s> tags), a series of tokens (marked with <tok> tags), a
series of paragraph-like elements (marked with <par> tags), or "plain
text" data (PCDATA), which is marked with <data> tags.
Contains one or more "chunks" of
annotation.
Contains a corpus tag. When this tag appears within the
<lex> element, it gives the corpus tag associated with the
accompanying morphosyntactic information.
Groups one or more disambiguated corpus tags and/or full
morphosyntactic descriptions associated with the token.
Groups one or more alternative sets of morphosyntactic
information associated with the token.
An XLink simple link that marks paragraph boundaries.
Contains a series of <tok> elements, a series of <s>
elements, or a series of <data> elements, or any mixture of
these elements.
An XLink simple link that contains a token, consisting
of its orthographic form in the original document, optionally followed by a
disambiguated corpus tag and/or one or more alternative sets of morphosyntactic
information associated with the token.
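The token structure described above could be read roughly as in the following Python sketch. Only `<tok>` and `<lex>` are named in these descriptions; the element names `chunk`, `orth`, `disamb`, and `ctag`, the sample tags, and the helper `read_tokens` are assumptions for illustration, and XLink attributes are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation fragment. Element names other than <tok> and
# <lex> are assumed; the corpus tags are invented sample values.
SAMPLE = """<chunk>
  <tok>
    <orth>runs</orth>
    <disamb><ctag>VBZ</ctag></disamb>
    <lex><ctag>VBZ</ctag></lex>
    <lex><ctag>NNS</ctag></lex>
  </tok>
  <tok>
    <orth>.</orth>
  </tok>
</chunk>"""


def read_tokens(chunk):
    """Collect each token's orthographic form, its disambiguated corpus
    tag (None if absent), and the corpus tags of its alternatives."""
    tokens = []
    for tok in chunk.iter("tok"):
        tokens.append({
            "orth": tok.findtext("orth"),
            "disamb": tok.findtext("disamb/ctag"),
            "alternatives": [lex.findtext("ctag") for lex in tok.findall("lex")],
        })
    return tokens


for token in read_tokens(ET.fromstring(SAMPLE)):
    print(token)
```

The second token shows the optional parts omitted: it carries an orthographic form only, so the disambiguated tag is `None` and the list of alternatives is empty.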