Schema
This article assumes that you have already read the “Schema” section of the introduction to the editing engine architecture.
# Quick recap
The editor’s schema is available in the `editor.model.schema` property. It defines allowed model structures (how model elements can be nested), allowed attributes (of both elements and text nodes), and other characteristics (inline vs. block, atomicity with regard to external actions). This information is later used by editing features and the editing engine to decide how to process the model, where to enable features, etc.
Schema rules can be defined by using the `Schema#register()` or the `Schema#extend()` methods. The former can be used only once for a given item name, which ensures that only a single editing feature can introduce this item. Similarly, `extend()` can only be used for defined items.
Elements and attributes are checked by features separately by using the `Schema#checkChild()` and the `Schema#checkAttribute()` methods.
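Because `register()` works only once per item and `extend()` works only for defined items, a feature that adds rules to an item it does not own may need to check first whether the item exists. The following is a minimal sketch of one way to do this, assuming `Schema#isRegistered()` is available on the schema instance. The helper name `allowAttributeIfRegistered` and the `highlightColor` attribute in the usage note are hypothetical.

```javascript
// Hypothetical helper: extends an item with an extra attribute, but only if
// some other feature has already registered that item.
function allowAttributeIfRegistered( schema, itemName, attributeName ) {
    if ( !schema.isRegistered( itemName ) ) {
        // extend() can only be used for defined items, so skip the call.
        return false;
    }

    schema.extend( itemName, { allowAttributes: attributeName } );

    return true;
}
```

It could be called as `allowAttributeIfRegistered( editor.model.schema, 'paragraph', 'highlightColor' )` from a plugin's initialization code.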
# Defining allowed structures
When a feature introduces a model element, it should register it in the schema. Besides defining that such an element may exist in the model, the feature also needs to define where this element can be placed. This information is provided by the `allowIn` property of the `SchemaItemDefinition`:
schema.register( 'myElement', {
    allowIn: '$root'
} );
This lets the schema know that `<myElement>` can be a child of `<$root>`. The `$root` element is one of the generic nodes defined by the editing framework. By default, the editor names the main root element `<$root>`, so the above definition allows `<myElement>` in the main editor element.
In other words, this would be correct:
<$root>
    <myElement></myElement>
</$root>
While this would be incorrect:
<$root>
    <foo>
        <myElement></myElement>
    </foo>
</$root>
To declare which nodes are allowed inside the registered element, use the `allowChildren` property:
schema.register( 'myElement', {
    allowIn: '$root',
    allowChildren: '$text'
} );
This allows the following structure:
<$root>
    <myElement>
        foobar
    </myElement>
</$root>
Both the `allowIn` and `allowChildren` properties can also be inherited from other `SchemaItemDefinition` items.
You can read more about the format of the item definition in the `SchemaItemDefinition` API guide.
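The rules described in this section can be verified with `Schema#checkChild()`. The sketch below assumes that the context may be passed as an array of item names (root first) and the child as an item name. The function name `describeMyElementRules` is hypothetical.

```javascript
// Hypothetical helper: asks the schema about the three structures discussed above.
function describeMyElementRules( schema ) {
    return {
        // <myElement> placed directly in <$root>.
        allowedInRoot: schema.checkChild( [ '$root' ], 'myElement' ),
        // <myElement> nested in <foo>, which the definition does not list in allowIn.
        allowedInFoo: schema.checkChild( [ '$root', 'foo' ], 'myElement' ),
        // Text placed directly inside <myElement>.
        textAllowedInside: schema.checkChild( [ '$root', 'myElement' ], '$text' )
    };
}
```

With `myElement` registered with both `allowIn: '$root'` and `allowChildren: '$text'`, the first and third checks are expected to pass and the second one to fail.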
# Defining additional semantics
In addition to setting allowed structures, the schema can also define additional traits of model elements. By using the `is*` properties, a feature author may declare how a certain element should be treated by other features and by the engine.
Here is a table listing various model elements and their properties registered in the schema:
Schema entry | isBlock | isLimit | isObject | isInline | isSelectable | isContent
---|---|---|---|---|---|---
`$block` | true | false | false | false | false | false
`$container` | false | false | false | false | false | false
`$blockObject` | true | true [1] | true | false | true [2] | true [3]
`$inlineObject` | false | true [1] | true | true | true [2] | true [3]
`$clipboardHolder` | false | true | false | false | false | false
`$documentFragment` | false | true | false | false | false | false
`$marker` | false | false | false | false | false | false
`$root` | false | true | false | false | false | false
`$text` | false | false | false | true | false | true
`blockQuote` | false | false | false | false | false | false
`caption` | false | true | false | false | false | false
`codeBlock` | true | false | false | false | false | false
`heading1` | true | false | false | false | false | false
`heading2` | true | false | false | false | false | false
`heading3` | true | false | false | false | false | false
`horizontalLine` | true | true [1] | true | false | true [2] | true [3]
`imageBlock` | true | true [1] | true | false | true [2] | true [3]
`imageInline` | false | true [1] | true | true | true [2] | true [3]
`listItem` | true | false | false | false | false | false
`media` | true | true [1] | true | false | true [2] | true [3]
`pageBreak` | true | true [1] | true | false | true [2] | true [3]
`paragraph` | true | false | false | false | false | false
`softBreak` | false | false | false | true | false | false
`table` | true | true [1] | true | false | true [2] | true [3]
`tableRow` | false | true | false | false | false | false
`tableCell` | false | true | false | false | true | false
- [1] The value of `isLimit` is `true` for this element because all objects are automatically limit elements.
- [2] The value of `isSelectable` is `true` for this element because all objects are automatically selectable elements.
- [3] The value of `isContent` is `true` for this element because all objects are automatically content elements.
# Limit elements
Consider a feature like an image caption. The caption text area should act as a boundary for some internal actions:
- A selection that starts inside should not end outside.
- Pressing Backspace or Delete should not delete the area. Pressing Enter should not split the area.
It should also act as a boundary for external actions. This is mostly enforced by a selection post-fixer which ensures that a selection that starts outside does not end inside. It means that most actions will either apply to the “outside” of such an element or to the content inside it.
Given these characteristics, the image caption should be defined as a limit element by using the `isLimit` property.
schema.register( 'myCaption', {
    isLimit: true
} );
The engine and various features then check it via `Schema#isLimit()` and can act accordingly.
“Limit element” does not mean “editable element”. The concept of “editability” is reserved for the view and expressed by the `EditableElement` class.
# Object elements
For an image caption like the one in the example above, it does not make much sense to select the caption box and then copy or drag it somewhere else.
A caption without the image it describes makes little sense. The image, however, is more self-sufficient. Most likely users should be able to select the entire image (with all its internals), then copy or move it around. The `isObject` property should be used to mark such behavior.
schema.register( 'myImage', {
    isObject: true
} );
The `Schema#isObject()` method can later be used to check this property.
Every object is automatically also:
- A limit element – For every element with `isObject` set to `true`, `Schema#isLimit( element )` will always return `true`.
- A selectable element – For every element with `isObject` set to `true`, `Schema#isSelectable( element )` will always return `true`.
- A content element – For every element with `isObject` set to `true`, `Schema#isContent( element )` will always return `true`.
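These implications can be inspected at runtime. The following minimal sketch uses only the `is*()` methods named in this guide and assumes they accept either a model element or a registered item name. The helper name `getSchemaTraits` is hypothetical.

```javascript
// Hypothetical helper: collects the object-related traits of a schema item.
function getSchemaTraits( schema, item ) {
    return {
        isObject: schema.isObject( item ),
        isLimit: schema.isLimit( item ),
        isSelectable: schema.isSelectable( item ),
        isContent: schema.isContent( item )
    };
}
```

For an item registered with only `isObject: true`, such as `myImage` above, all four values should be `true`.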
# Block elements
Generally speaking, content is usually made out of blocks like paragraphs, list items, images, headings, etc. All these elements should be marked as blocks by using `isBlock`.
It is important to remember that a block should not allow another block inside. Container elements like `<blockQuote>`, which can contain other block elements, should not be marked as blocks.
There is also the `$block` generic item which has `isBlock` set to `true`. Most block type items will inherit from `$block` (through `inheritAllFrom`).
# Inline elements
In the editor, all HTML formatting elements such as `<strong>` or `<code>` are represented by text attributes. Therefore, inline model elements are not supposed to be used for these scenarios.
Currently, the `isInline` property is used for the `$text` token (that is, text nodes), for elements such as `<softBreak>`, and for placeholder elements such as those described in the “Implementing an inline widget” tutorial.
The support for inline elements in CKEditor 5 is so far limited to self-contained elements. Because of this, all elements marked with `isInline` should also be marked with `isObject`.
# Selectable elements
Elements that users can select as a whole (with all their internals) and then, for instance, copy or apply formatting to, are marked with the `isSelectable` property in the schema:
schema.register( 'mySelectable', {
    isSelectable: true
} );
The `Schema#isSelectable()` method can later be used to check this property.
All object elements are selectable by default. There are other selectable elements registered in the editor, though. For instance, there is also the `tableCell` model element (rendered as a `<td>` in the editing view) that is selectable while not registered as an object. The table selection plugin takes advantage of this fact and allows users to create rectangular selections made of multiple table cells.
# Content elements
You can tell content model elements from other elements by looking at their representation in the editor data (you can use `editor.getData()` or `Model#hasContent()` to check this out).
Elements such as images or media will always find their way into the editor data and this is what makes them content elements. They are marked with the `isContent` property in the schema:
schema.register( 'myImage', {
    isContent: true
} );
The `Schema#isContent()` method can later be used to check this property.
At the same time, elements like paragraphs, list items, or headings are not content elements because they are skipped in the editor output when they are empty. From the data perspective they are transparent unless they contain other content elements (an empty paragraph is as good as no paragraph).
Object elements and `$text` are content by default.
# Generic items
There are several generic items (classes of elements) available: `$root`, `$container`, `$block`, `$blockObject`, `$inlineObject`, and `$text`. They are defined as follows:
schema.register( '$root', {
    isLimit: true
} );
schema.register( '$container', {
    allowIn: [ '$root', '$container' ]
} );
schema.register( '$block', {
    allowIn: [ '$root', '$container' ],
    isBlock: true
} );
schema.register( '$blockObject', {
    allowWhere: '$block',
    isBlock: true,
    isObject: true
} );
schema.register( '$inlineObject', {
    allowWhere: '$text',
    allowAttributesOf: '$text',
    isInline: true,
    isObject: true
} );
schema.register( '$text', {
    allowIn: '$block',
    isInline: true,
    isContent: true
} );
These definitions can then be reused by features to create their own definitions in a more extensible way. For example, the `Paragraph` feature will define its item as:
schema.register( 'paragraph', {
    inheritAllFrom: '$block'
} );
Which translates to:
schema.register( 'paragraph', {
    allowWhere: '$block',
    allowContentOf: '$block',
    allowAttributesOf: '$block',
    inheritTypesFrom: '$block'
} );
And this can be read as:
- The `<paragraph>` element will be allowed in elements in which `<$block>` is allowed (e.g. in `<$root>`).
- The `<paragraph>` element will allow all nodes that are allowed in `<$block>` (e.g. `$text`).
- The `<paragraph>` element will allow all attributes allowed in `<$block>`.
- The `<paragraph>` element will inherit all `is*` properties of `<$block>` (e.g. `isBlock`).
Thanks to the fact that the `<paragraph>` definition is inherited from `<$block>`, other features can use the `<$block>` type to indirectly extend the `<paragraph>` definition. For example, the `BlockQuote` feature does this:
schema.register( 'blockQuote', {
    inheritAllFrom: '$container'
} );
Because `<$block>` is allowed in `<$container>` (see `schema.register( '$block' ... )`), paragraphs will be allowed in block quotes even though the block quote and paragraph features know nothing about each other: the schema rules allow chaining.
Taking this even further, if anyone registers a `<section>` element (with the `allowContentOf: '$root'` rule), then because `<$container>` is also allowed in `<$root>` (see `schema.register( '$container' ... )`), the `<section>` elements will allow block quotes out of the box.
You can read more about the format of the item definition in `SchemaItemDefinition`.
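The same inheritance chain works for attributes. In the minimal sketch below, extending `$block` makes an attribute available on every item that inherits its attributes from `$block`, including `<paragraph>`, without naming those items. The helper name `allowAttributeOnBlocks` is hypothetical, and the context is assumed to be accepted as an array of item names.

```javascript
// Hypothetical helper: allows an attribute on all $block-based items at once.
function allowAttributeOnBlocks( schema, attributeName ) {
    // Items using inheritAllFrom: '$block' (or allowAttributesOf: '$block')
    // pick this attribute up through the $block definition.
    schema.extend( '$block', { allowAttributes: attributeName } );

    // Check the result on a paragraph placed directly in the root.
    return schema.checkAttribute( [ '$root', 'paragraph' ], attributeName );
}
```

A call such as `allowAttributeOnBlocks( editor.model.schema, 'alignment' )` is expected to return `true` when the paragraph feature is loaded. The attribute name here is only an example.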
# Defining advanced rules in checkChild() callbacks
The `Schema#checkChild()` method, which is the base method used to check whether some element is allowed in a given structure, is a decorated method. It means that you can add listeners to implement your specific rules which are not limited by the declarative `SchemaItemDefinition` API.
These listeners can be added either by listening directly to the `event:checkChild` event or by using the handy `Schema#addChildCheck()` method.
For instance, to disallow nested `<blockQuote>` structures, you can define such a listener:
schema.addChildCheck( ( context, childDefinition ) => {
    // Note that the context is automatically normalized to a SchemaContext instance and
    // the child to its definition (SchemaCompiledItemDefinition).

    // If checkChild() is called with a context that ends with blockQuote and blockQuote as a child
    // to check, make the checkChild() method return false.
    if ( context.endsWith( 'blockQuote' ) && childDefinition.name == 'blockQuote' ) {
        return false;
    }
} );
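Attribute checks can be customized in a similar way. The sketch below assumes that `Schema#addAttributeCheck()` accepts a callback that receives the normalized context and the attribute name, mirroring `addChildCheck()`. The rule itself (no `bold` on text inside `<heading1>`) and the wrapper name `disallowBoldInHeading1` are only illustrations.

```javascript
// Hypothetical wrapper: registers a custom attribute rule on the given schema.
function disallowBoldInHeading1( schema ) {
    schema.addAttributeCheck( ( context, attributeName ) => {
        // Returning false rejects the attribute. Returning nothing leaves the
        // decision to other listeners and to the declarative rules.
        if ( context.endsWith( 'heading1 $text' ) && attributeName == 'bold' ) {
            return false;
        }
    } );
}
```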
# Implementing additional constraints
Schema’s capabilities are limited to simple (and atomic) `Schema#checkChild()` and `Schema#checkAttribute()` checks on purpose. One may imagine that the schema should support defining more complex rules such as "element `<x>` must always be followed by `<y>`". While it is feasible to create an API that would enable feeding the schema with such definitions, it is unfortunately unrealistic to then expect that every editing feature will consider these rules when processing the model. It is also unrealistic to expect that it will be done automatically by the schema and the editing engine themselves.
For instance, let’s get back to the "element `<x>` must always be followed by `<y>`" rule and this initial content:
<$root>
    <x>foo</x>
    <y>bar[bom</y>
    <z>bom]bar</z>
</$root>
Now imagine that the user presses the “Block quote” button. Normally it would wrap the two selected blocks (`<y>` and `<z>`) with a `<blockQuote>` element:
<$root>
    <x>foo</x>
    <blockQuote>
        <y>bar[bom</y>
        <z>bom]bar</z>
    </blockQuote>
</$root>
But it turns out that this creates an incorrect structure: `<x>` is not followed by `<y>` anymore.
What should happen instead? There are at least 4 possible solutions: the block quote feature should not be applicable in such a context, someone should create a new `<y>` right after `<x>`, `<x>` should be moved inside `<blockQuote>` together with `<y>`, or vice versa.
While this is a relatively simple scenario (unlike most real-time collaborative editing scenarios), it turns out that it is already hard to say what should happen and who should react to fix this content.
Therefore, if your editor needs to implement such rules, you should do that through the model’s post-fixers that fix incorrect content, or by actively preventing such situations (e.g. by disabling certain features). It means that these constraints will be defined specifically for your scenario by your code, which makes their implementation much easier.
To sum up, the answer to who should implement additional constraints, and how, is: your features or your editor, through the CKEditor 5 API.
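A post-fixer for the `<x>` followed by `<y>` rule could look like the sketch below. It picks one of the options listed earlier (creating a new `<y>` right after `<x>`), which is only one possible policy. It assumes that `Document#registerPostFixer()`, `Model#createRangeIn()`, `Range#getItems()` and `Writer#insertElement()` behave as in the current engine API, and it inspects the main root only. The function name is hypothetical.

```javascript
// Hypothetical feature code: keeps every <x> followed by a <y>.
function registerXFollowedByYFixer( model ) {
    model.document.registerPostFixer( writer => {
        const root = model.document.getRoot();

        // Collect the offenders first so the tree is not modified while walking it.
        const lonelyXs = Array.from( model.createRangeIn( root ).getItems() )
            .filter( item => item.is( 'element', 'x' ) )
            .filter( x => !x.nextSibling || !x.nextSibling.is( 'element', 'y' ) );

        for ( const x of lonelyXs ) {
            writer.insertElement( 'y', x, 'after' );
        }

        // Returning true tells the engine that the model was changed,
        // so the post-fixers are run again.
        return lonelyXs.length > 0;
    } );
}
```

Walking the whole root on every change keeps the sketch short. A production post-fixer would more likely look only at the changes reported by the document's differ.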
# Who checks the schema?
The CKEditor 5 API exposes many ways to work on (change) the model. It can be done through the writer, via methods like `Model#insertContent()`, via commands and so on.
# Low-level APIs
The lowest-level API is the writer (to be precise, there are also raw operations below, but they are used for very special cases only). It allows applying atomic changes to the content like inserting, removing, moving or splitting nodes, setting and removing an attribute, etc. It is important to know that the writer does not prevent you from applying changes that violate rules defined in the schema.
The reason for this is that when you implement a command or any other feature you may need to perform multiple operations to do all the necessary changes. The state in the meantime (between these atomic operations) may be incorrect. The writer must allow that.
For instance, you need to move `<foo>` from `<$root>` to `<bar>` and (at the same time) rename it to `<oof>`. But the schema defines that `<oof>` is not allowed in `<$root>` and `<foo>` is disallowed in `<bar>`. If the writer checked the schema, it would complain regardless of the order of the `rename` and `move` operations.
You can argue that the engine could handle this by checking the schema at the end of a `Model#change()` block (it works like a transaction: the state needs to be correct at the end of it). In fact, we plan to strip disallowed attributes at the end of these blocks.
There are problems, though:
- How to fix the content after a transaction is committed? It is impossible to implement a reasonable heuristic that would not break the content from the user perspective.
- The model can become invalid during real-time collaborative changes. Operational Transformation, while implemented by us in a very rich form (with 11 types of operations instead of the base 3), ensures conflict resolution and eventual consistency, but not the model’s validity.
Therefore, we chose to handle such situations on a case-by-case basis, using the model’s more expressive and flexible post-fixers. Additionally, we moved the responsibility for checking the schema to features. They can make much better decisions a priori, before making changes. You can read more about this in the “Implementing additional constraints” section above.
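In practice, a feature-level check often amounts to a `checkChild()` call made before the writer is used. The following minimal sketch assumes that the selection's first position is the intended insertion point and that `checkChild()` accepts a model position as the context. The function name is hypothetical.

```javascript
// Hypothetical feature code: inserts an element only where the schema allows it.
function insertElementIfAllowed( model, elementName ) {
    const position = model.document.selection.getFirstPosition();

    // The writer would insert the element regardless, so the feature asks the schema first.
    if ( !model.schema.checkChild( position, elementName ) ) {
        return false;
    }

    model.change( writer => {
        writer.insertElement( elementName, position );
    } );

    return true;
}
```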
# High-level APIs
What about other, higher-level methods? We recommend that all APIs built on top of the writer should check the schema.
For instance, the `Model#insertContent()` method will make sure that inserted nodes are allowed in the place of their insertion. It may also attempt to split the insertion container (if allowed by the schema) if that makes the inserted element allowed, and so on.
Similarly, commands — if implemented correctly — get disabled if they should not be executed in the current place.
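The enabling logic of such a command can be sketched as a plain function. It assumes that `Schema#findAllowedParent()` returns `null` when no ancestor of the given position accepts the element. The name `canInsertAtSelection` is hypothetical.

```javascript
// Hypothetical helper: tells whether an element could be inserted at the selection.
function canInsertAtSelection( model, elementName ) {
    const position = model.document.selection.getFirstPosition();

    // Walks up from the position looking for the closest ancestor that allows the element.
    return model.schema.findAllowedParent( position, elementName ) !== null;
}
```

Inside a command, the result would typically be assigned to `this.isEnabled` in the `refresh()` method.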
Finally, the schema plays a crucial role during the conversion from the view to the model (also called “upcasting”). During this process, converters decide whether they can convert specific view elements or attributes to the given positions in the model. Thanks to that, when you load incorrect data into the editor or paste content copied from another website, the structure and attributes of the data get adjusted to the current schema rules.
Some features may miss schema checks. If you happen to find such a scenario, do not hesitate to report it to us.