
Requirements for Converting HTML to DOM Objects

To convert HTML content to an mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile object, the HTML content must be XML parsable. HTML content is XML parsable when it complies with the rules for properly formed XML, such as:

  • Include a closing tag for all elements.

  • Use lower case for the opening and closing (start and end) tags of an element. For example, use <p> and </p> for a paragraph element, not <P> and </P>.

  • Nest elements properly. If you open an element inside another element, close the inner element before you close the containing element.

  • Enclose attribute values with quotation marks. For example, use <p align="center"></p>.

For details, see the W3Schools summary of XML rules at www.w3schools.com/xml/xml_syntax.asp.
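As a rough pre-check, the rules above can be sketched with a generic XML parser. This example uses Python's standard xml.etree.ElementTree module, which is an assumption for illustration and not the parser that the DOM API uses. The helper names is_xml_parsable and has_uppercase_tags are hypothetical. A generic XML parser accepts uppercase tags as long as they match, so the lowercase rule is checked separately.

```python
import xml.etree.ElementTree as ET


def is_xml_parsable(html_text):
    """Return True if html_text parses as well-formed XML."""
    try:
        ET.fromstring(html_text)
        return True
    except ET.ParseError:
        return False


def has_uppercase_tags(html_text):
    """Return True if any element name contains an uppercase letter.

    Assumes html_text is already well-formed XML.
    """
    root = ET.fromstring(html_text)
    return any(el.tag != el.tag.lower() for el in root.iter())


# One sample per rule in the list above.
samples = {
    "well formed": '<p align="center">Hello <b>world</b></p>',
    "missing closing tag": "<p>Hello<br></p>",
    "improper nesting": "<p><b>Hello</p></b>",
    "unquoted attribute": "<p align=center>Hello</p>",
}
results = {name: is_xml_parsable(text) for name, text in samples.items()}
```

Only the first sample passes the check. Content such as <P>Hi</P> is well-formed XML, so it passes is_xml_parsable, but has_uppercase_tags flags it.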

Supported HTML Elements and Attributes

This table shows the HTML elements and attributes that are supported when you convert HTML to a DOM object. Unsupported elements and attributes are ignored.

HTML Element              Attributes
a                         class, style, href, name
address                   class, style
b                         class, style
big                       class, style
blockquote                class, style
body                      class, style
br                        n/a
center                    class, style
cite                      class, style
code                      class, style
dd                        class, style
del                       class, style
dfn                       class, style
div                       class, style
dl                        class, style
dt                        class, style
em                        class, style
font                      class, style, color, face, size
h1, h2, h3, h4, h5, h6    class, style, align
hr                        class, style, align
i                         class, style
ins                       class, style
img                       class, style, src, height, width
kbd                       class, style
li                        class, style
mark                      class, style
nobr                      class, style
ol                        class, style
p                         class, style, align
pre                       class, style
s                         class, style
samp                      class, style
small                     class, style
span                      class, style
strike                    class, style
strong                    class, style
sub                       class, style
sup                       class, style
table                     class, style, align, bgcolor, border, cellspacing, cellpadding, frame, rules, width
tbody                     class, style, align, valign
tfoot                     class, style, align, valign
thead                     class, style, align, valign
td                        class, style, bgcolor, height, width, colspan, rowspan, align, valign, nowrap
th                        class, style, bgcolor, height, width, colspan, rowspan, align, valign, nowrap
tr                        class, style, align, bgcolor, valign
tt                        class, style
u                         class, style
ul                        class, style
var                       class, style

For information about these elements, see https://developer.mozilla.org/en-US/docs/Web/HTML/Element.
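Because unsupported elements and attributes are ignored without an error, it can help to list them before converting the content. The following is a minimal sketch in Python, not part of the DOM API. The SUPPORTED dictionary is an abbreviated copy of the table above, and the find_unsupported helper is a hypothetical name. The sketch assumes that the content is already XML parsable.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the table above; add the remaining rows as needed.
SUPPORTED = {
    "a": {"class", "style", "href", "name"},
    "b": {"class", "style"},
    "br": set(),
    "img": {"class", "style", "src", "height", "width"},
    "p": {"class", "style", "align"},
    "span": {"class", "style"},
}


def find_unsupported(html_text, supported=SUPPORTED):
    """Return sorted unsupported element names and (element, attribute) pairs."""
    root = ET.fromstring(html_text)
    bad_elements, bad_attributes = set(), set()
    for el in root.iter():
        if el.tag not in supported:
            bad_elements.add(el.tag)
            continue
        for attr in el.attrib:
            if attr not in supported[el.tag]:
                bad_attributes.add((el.tag, attr))
    return sorted(bad_elements), sorted(bad_attributes)


elements, attributes = find_unsupported(
    '<p id="intro">See <a href="https://example.com" target="_blank">this</a>'
    "<video/></p>"
)
```

Here, elements contains the video element, and attributes contains the target attribute of the a element and the id attribute of the p element.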

Supported HTML CSS Style Attributes for All Elements

You can use HTML style attributes to format HTML content that you append to a DOM report. The value of a style attribute is a string of Cascading Style Sheets (CSS) formats.

These CSS formats are supported:

  • background-color

  • border

  • border-bottom

  • border-bottom-color

  • border-bottom-style

  • border-bottom-width

  • border-color

  • border-left

  • border-left-color

  • border-left-style

  • border-left-width

  • border-right

  • border-right-color

  • border-right-style

  • border-right-width

  • border-style

  • border-top

  • border-top-color

  • border-top-style

  • border-top-width

  • border-width

  • color

  • counter-increment

  • counter-reset

  • display

  • font-family

  • font-size

  • font-style

  • font-weight

  • height

  • line-height

  • list-style-type

  • margin

  • margin-bottom

  • margin-left

  • margin-right

  • margin-top

  • padding

  • padding-bottom

  • padding-left

  • padding-right

  • padding-top

  • text-align

  • text-decoration

  • text-indent

  • vertical-align

  • white-space

  • width

For information about these formats, see https://developer.mozilla.org/en-US/docs/Web/CSS/Reference.
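To see which formats in a style attribute fall outside this list, you can split the attribute value into declarations and compare the property names. The following Python sketch is an illustration only. The unsupported_css helper is a hypothetical name, and the split on semicolons is naive: it assumes that no value contains a semicolon, for example inside a quoted string.

```python
SIDES = ("bottom", "left", "right", "top")

# The supported CSS formats listed above.
SUPPORTED_CSS = {
    "background-color", "color", "counter-increment", "counter-reset",
    "display", "font-family", "font-size", "font-style", "font-weight",
    "height", "line-height", "list-style-type", "text-align",
    "text-decoration", "text-indent", "vertical-align", "white-space",
    "width", "border", "border-color", "border-style", "border-width",
    "margin", "padding",
}
for side in SIDES:
    SUPPORTED_CSS.update({
        f"border-{side}", f"border-{side}-color",
        f"border-{side}-style", f"border-{side}-width",
        f"margin-{side}", f"padding-{side}",
    })


def unsupported_css(style):
    """Return the property names in a style string that are not in the list."""
    names = []
    for declaration in style.split(";"):
        name, separator, _value = declaration.partition(":")
        name = name.strip().lower()
        if separator and name and name not in SUPPORTED_CSS:
            names.append(name)
    return names


flagged = unsupported_css("color: red; float: left; border-left-width: 2px")
```

In this example, flagged contains only float.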

Support for HTML Character Entities

You can append HTML content that includes special characters, such as the British pound sign, the U.S. dollar sign, or reserved XML markup characters. The XML markup special characters are >, <, &, ", and '. To include special characters, use HTML named or numeric character references. For example, to include the left angle bracket (<) in HTML content that you want to append, use one of these character entity references:

  • The named character entity reference &lt;

  • The numeric character entity reference &#x003c;

For more information, see https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references.
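The named and numeric forms of a character reference decode to the same character. The following sketch uses Python's standard html module to show this for the left angle bracket and to escape the reserved markup characters in plain text. The module is used for illustration only and is not part of the DOM API.

```python
import html

# Named, hexadecimal, and decimal references for the left angle bracket.
named = html.unescape("&lt;")
hex_ref = html.unescape("&#x3c;")
dec_ref = html.unescape("&#60;")

# The British pound sign as a named reference.
pound = html.unescape("&pound;")

# Replace reserved markup characters in plain text with references.
escaped = html.escape('5 < 6 & "x"')
```

All three references for the left angle bracket decode to <, and escaped is the string 5 &lt; 6 &amp; &quot;x&quot;.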

DOCTYPE Declaration

The HTML content that you append to a DOM report does not need to include a document type declaration (see https://en.wikipedia.org/wiki/Document_type_declaration). If the content includes a document type declaration, it must meet the following conditions:

  • If the content includes character entity references (special characters), the document type declaration must reference a document type definition (DTD) that defines the referenced entities. For example, the following declaration specifies a DTD file that defines all HTML character entities:

    <!DOCTYPE html SYSTEM "html.dtd">

    The html.dtd file is included in the MATLAB® Report Generator™ software.

  • If the document type declaration references a DTD file, a valid DTD file must exist at the path specified by the declaration. Otherwise, appending the content causes a DTD parse error. For example, the following declaration causes a parse error if no file named foo.dtd exists at the specified path:

    <!DOCTYPE html SYSTEM "foo.dtd">

  • If the content to be appended does not include character entity references, the document type declaration does not need to reference a DTD file. For example, the following declaration works for content that does not use special characters:

    <!DOCTYPE html>

Tip

To avoid document type declaration issues, remove declarations from existing HTML content that you intend to append to DOM reports. If the content does not include a declaration, the DOM prepends a valid declaration that defines the entire HTML character entity set.
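One way to follow this tip is to strip the declaration from the content before you append it. The following Python sketch is a minimal illustration. The strip_doctype helper is a hypothetical name, and the regular expression assumes a simple declaration without an internal subset, that is, without a bracketed section that could contain a right angle bracket.

```python
import re

# Matches one DOCTYPE declaration and any whitespace that follows it.
# Assumes the declaration has no internal subset ([...]).
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^>]*>\s*", re.IGNORECASE)


def strip_doctype(html_text):
    """Return html_text with the first DOCTYPE declaration removed."""
    return DOCTYPE_RE.sub("", html_text, count=1)


content = '<!DOCTYPE html SYSTEM "foo.dtd">\n<html><body><p>Hi</p></body></html>'
cleaned = strip_doctype(content)
```

Content that has no declaration is returned unchanged.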
