HTML : HTML parser and encoder

Updated: 2024-05-13

This module parses and serializes HTML, in the same spirit as the JavaScript built-in JSON object. It can parse an HTML string into an abstract syntax tree (AST) and stringify that AST back into an HTML string.

Use the following code to import the HTML module.

var HTML = require('html');

Support

The following table shows which HTML module APIs are available in each permission mode.

 User Mode    Privilege Mode
HTML.parse
HTML.stringify

HTML Module

HTML.parse(htmlString[, options])

  • htmlString {String} String of HTML.
  • options {Object} Parse options. Default: undefined.
  • Returns: {Array} HTML abstract syntax tree.

options contains the following members:

  • components {Array} | {Object} Components whose children will be ignored when generating the AST.

Takes a string of HTML and turns it into an AST. The only option you can currently pass is components: the registered components whose children will be ignored when generating the AST.

Example

// this html:
var html = '<div class="oh"><p>hi</p></div>';

// becomes this AST:
var ast = HTML.parse(html);

console.log(ast);
/*
[{
  // can be `tag`, `text` or `component`
  type: 'tag',

  // name of tag if relevant
  name: 'div',

  // parsed attribute object
  attrs: {
    class: 'oh'
  },

  // whether this is a self-closing tag
  // such as <img/>
  voidElement: false,

  // an array of child nodes
  // we see the same structure
  // repeated in each of these
  children: [
    {
      type: 'tag',
      name: 'p',
      attrs: {},
      voidElement: false,
      children: [
        // this is a text node
        // it also has a `type`
        // but nothing other than
        // a `content` containing
        // its text.
        {
          type: 'text',
          content: 'hi'
        }
      ]
    }
  ]
}]
*/

HTML.stringify(ast)

  • ast {Array} HTML abstract syntax tree.
  • Returns: {String} String of HTML.

Takes an AST and turns it back into a string of HTML.
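
To make the serialization step concrete, here is a minimal sketch of how an AST of this shape could be turned back into a string. The function name stringifySketch is hypothetical and this is not the module's implementation; it only assumes the node properties documented on this page (type, name, attrs, voidElement, children, content) and does no attribute escaping.

```javascript
// Hypothetical helper: NOT the module's implementation, only a sketch of
// the kind of serialization HTML.stringify() is documented to perform.
function stringifySketch(nodes) {
  return nodes.map(function (node) {
    // Text nodes contribute their content as-is.
    if (node.type === 'text') {
      return node.content;
    }

    // Rebuild the attribute list from the `attrs` object.
    var attrs = Object.keys(node.attrs).map(function (key) {
      return ' ' + key + '="' + node.attrs[key] + '"';
    }).join('');

    // Void elements have no children and no closing tag.
    if (node.voidElement) {
      return '<' + node.name + attrs + '/>';
    }

    return '<' + node.name + attrs + '>' +
      stringifySketch(node.children) +
      '</' + node.name + '>';
  }).join('');
}

// Hand-written AST in the shape shown in the HTML.parse() example.
var sketchAst = [{
  type: 'tag', name: 'div', attrs: { class: 'oh' }, voidElement: false,
  children: [{
    type: 'tag', name: 'p', attrs: {}, voidElement: false,
    children: [{ type: 'text', content: 'hi' }]
  }]
}];

console.log(stringifySketch(sketchAst)); // <div class="oh"><p>hi</p></div>
```

In application code you would call HTML.stringify(ast) directly; the sketch is only meant to show how the node properties map back onto markup.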

AST node types

'tag'

properties:

  • type {String} Will always be tag for this type of node.
  • name {String} Tag name, such as div.
  • attrs {Object} An object of key/value pairs. If an attribute has multiple space-separated items such as classes, they'll still be in a single string, for example: class: "class1 class2".
  • voidElement {Boolean} Whether this tag is a known void element as defined by the HTML specification.
  • children {Array} Array of child nodes. Note that any continuous string of text is a text node child, see below.

'text'

properties:

  • type {String} Will always be text for this type of node.
  • content {String} Text content of the node.
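
Because tag nodes nest through children and text nodes are the leaves, an AST can be walked with a short recursive function. The sketch below collects the content of every text node; collectText is a hypothetical helper, and the AST is written by hand in the documented shape rather than produced by the parser.

```javascript
// Hypothetical helper: gathers the `content` of every text node, depth first.
function collectText(nodes) {
  var parts = [];
  nodes.forEach(function (node) {
    if (node.type === 'text') {
      parts.push(node.content);
    } else {
      // `tag` and `component` nodes both carry a `children` array.
      parts = parts.concat(collectText(node.children));
    }
  });
  return parts;
}

// Hand-written AST corresponding to <p>Hello <b>world</b></p>.
var textAst = [{
  type: 'tag', name: 'p', attrs: {}, voidElement: false,
  children: [
    { type: 'text', content: 'Hello ' },
    {
      type: 'tag', name: 'b', attrs: {}, voidElement: false,
      children: [{ type: 'text', content: 'world' }]
    }
  ]
}];

console.log(collectText(textAst).join('')); // Hello world
```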

'component'

If you pass an object of components as part of the options object passed as the second argument to HTML.parse(), the parser stops descending into a branch of the DOM tree when it encounters one of those registered components.

This makes it possible to ignore sections of the tree that you may want to handle with another "subview" in your application that does its own DOM diffing.

properties:

  • type {String} Will always be component for this type of node.
  • name {String} Tag name, such as div.
  • attrs {Object} An object of key/value pairs. If an attribute has multiple space-separated items such as classes, they'll still be in a single string, for example: class: "class1 class2".
  • voidElement {Boolean} Whether this tag is a known void element as defined by the HTML specification.
  • children {Array} Still present, but always empty.
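
As an illustration of the "subview" use case, the sketch below locates component nodes in an AST so that each one can be handed to separate code. Everything here is an assumption for illustration: findComponents is a hypothetical helper, my-widget is a made-up component name, and the AST is written by hand in the documented shape rather than returned by HTML.parse().

```javascript
// Hypothetical helper: returns every `component` node found in the tree.
function findComponents(nodes) {
  var found = [];
  nodes.forEach(function (node) {
    if (node.type === 'component') {
      // A component's `children` array is always empty, so do not descend.
      found.push(node);
    } else if (node.type === 'tag') {
      found = found.concat(findComponents(node.children));
    }
  });
  return found;
}

// Hand-written AST; `my-widget` stands in for a registered component.
var componentAst = [{
  type: 'tag', name: 'div', attrs: {}, voidElement: false,
  children: [
    { type: 'text', content: 'intro' },
    {
      type: 'component', name: 'my-widget', attrs: { id: 'w1' },
      voidElement: false, children: []
    }
  ]
}];

var names = findComponents(componentAst).map(function (node) {
  return node.name;
});
console.log(names); // [ 'my-widget' ]
```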