com.openexchange.mail.text
Class HTMLProcessing

java.lang.Object
  extended by com.openexchange.mail.text.HTMLProcessing

public final class HTMLProcessing
extends java.lang.Object

HTMLProcessing - Various methods for HTML processing.

Author:
Thorben Betten

Method Summary
static org.w3c.dom.Document createDOMDocument(java.lang.String string)
          Creates a DOM document from specified XML/HTML string.
static java.lang.String formatContentForDisplay(java.lang.String content, java.lang.String charset, boolean isHtml, com.openexchange.session.Session session, MailPath mailPath, UserSettingMail usm, boolean[] modified, DisplayMode mode)
          Performs all the formatting for both text and HTML content for a proper display according to specified user's mail settings.
static java.lang.String formatHrefLinks(java.lang.String content)
          Searches for non-HTML links and converts them to valid HTML links.
static java.lang.String formatHTMLForDisplay(java.lang.String content, java.lang.String charset, com.openexchange.session.Session session, MailPath mailPath, UserSettingMail usm, boolean[] modified, DisplayMode mode)
          Performs all the formatting for HTML content for a proper display according to specified user's mail settings.
static java.lang.String formatTextForDisplay(java.lang.String content, UserSettingMail usm, DisplayMode mode)
          Performs all the formatting for text content for a proper display according to specified user's mail settings.
static java.lang.String getConformHTML(java.lang.String htmlContent, ContentType contentType)
          Creates valid HTML that conforms to W3C standards from specified HTML content.
static java.lang.String getConformHTML(java.lang.String htmlContent, java.lang.String charset)
          Creates valid HTML that conforms to W3C standards from specified HTML content.
static java.lang.Character getHTMLEntity(java.lang.String entity)
          Maps specified HTML entity - e.g. &uuml; - to the corresponding character.
static java.lang.String html2text(java.lang.String htmlContent, boolean appendHref)
          Converts specified HTML content to plain text.
static java.lang.String htmlFormat(java.lang.String plainText)
          Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;".
static java.lang.String htmlFormat(java.lang.String plainText, boolean withQuote)
          Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;".
static java.lang.String prettyPrint(java.lang.String htmlContent)
          Pretty prints specified HTML content.
static java.lang.String prettyPrintXML(org.w3c.dom.Node node)
          Pretty-prints specified XML/HTML node.
static java.lang.String prettyPrintXML(java.lang.String string)
          Pretty-prints specified XML/HTML string.
static java.lang.String replaceHTMLEntities(java.lang.String content)
          Replaces all HTML entities occurring in specified HTML content.
static java.lang.String urlEncodeSafe(java.lang.String text, java.lang.String charset)
          Translates specified string into application/x-www-form-urlencoded format using a specific encoding scheme.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

formatTextForDisplay

public static java.lang.String formatTextForDisplay(java.lang.String content,
                                                    UserSettingMail usm,
                                                    DisplayMode mode)
Performs all the formatting for text content for a proper display according to specified user's mail settings.

Parameters:
content - The plain text content
usm - The settings used for formatting content
mode - The display mode
Returns:
The formatted content
See Also:
formatContentForDisplay(String, String, boolean, Session, MailPath, UserSettingMail, boolean[], DisplayMode)

formatHTMLForDisplay

public static java.lang.String formatHTMLForDisplay(java.lang.String content,
                                                    java.lang.String charset,
                                                    com.openexchange.session.Session session,
                                                    MailPath mailPath,
                                                    UserSettingMail usm,
                                                    boolean[] modified,
                                                    DisplayMode mode)
Performs all the formatting for HTML content for a proper display according to specified user's mail settings.

Parameters:
content - The HTML content
charset - The character encoding
session - The session
mailPath - The message's unique path in mailbox
usm - The settings used for formatting content
modified - A boolean array of length 1 that receives the modified status of the external images filter
mode - The display mode
Returns:
The formatted content
See Also:
formatContentForDisplay(String, String, boolean, Session, MailPath, UserSettingMail, boolean[], DisplayMode)

formatContentForDisplay

public static java.lang.String formatContentForDisplay(java.lang.String content,
                                                       java.lang.String charset,
                                                       boolean isHtml,
                                                       com.openexchange.session.Session session,
                                                       MailPath mailPath,
                                                       UserSettingMail usm,
                                                       boolean[] modified,
                                                       DisplayMode mode)
Performs all the formatting for both text and HTML content for a proper display according to specified user's mail settings.

If content is plain text:

  1. Plain text content is converted to valid HTML if at least DisplayMode.MODIFYABLE is given
  2. If enabled by settings, simple quotes are turned into colored block quotes if DisplayMode.DISPLAY is given
  3. HTML links and URLs found in content are prepared for proper display if DisplayMode.DISPLAY is given

If content is HTML:

  1. Both inline and non-inline images found in HTML content are prepared according to settings if DisplayMode.DISPLAY is given

Parameters:
content - The content
charset - The character encoding (only needed by HTML content; may be null on plain text)
isHtml - true if content is of type text/html; otherwise false
session - The session
mailPath - The message's unique path in mailbox
usm - The settings used for formatting content
modified - A boolean array of length 1 that receives the modified status of the external images filter (only needed by HTML content; may be null on plain text)
mode - The display mode
Returns:
The formatted content
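
The mode-dependent steps for plain text can be pictured with a small stand-alone sketch. It is not the code behind formatContentForDisplay: the Mode enum is a hypothetical stand-in for DisplayMode (only MODIFYABLE and DISPLAY are named on this page), the escaping and link handling are simplified assumptions, and the quote coloring step is left out.

```java
import java.util.regex.Pattern;

/** Illustrative only: a simplified stand-in for the plain-text branch described above. */
final class DisplayPipelineSketch {

    /** Hypothetical stand-in for DisplayMode; the real enum may define other constants. */
    enum Mode { RAW, MODIFYABLE, DISPLAY }

    // Bare http/https URLs that are not already part of an attribute value or link text.
    private static final Pattern URL =
        Pattern.compile("(?<![\"'>=])\\bhttps?://[^\\s<>\"']+");

    static String formatText(String content, Mode mode) {
        String result = content;
        if (mode.compareTo(Mode.MODIFYABLE) >= 0) {
            // Step 1: make the plain text safe to embed in HTML.
            result = result.replace("&", "&amp;")
                           .replace("<", "&lt;")
                           .replace(">", "&gt;");
        }
        if (mode == Mode.DISPLAY) {
            // Step 3: turn bare URLs into links (step 2, quote coloring, is omitted).
            result = URL.matcher(result).replaceAll("<a href=\"$0\">$0</a>");
        }
        return result;
    }
}
```

With Mode.RAW the sketch returns the text untouched, with Mode.MODIFYABLE it only escapes, and with Mode.DISPLAY it escapes and links.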

html2text

public static java.lang.String html2text(java.lang.String htmlContent,
                                         boolean appendHref)
Converts specified HTML content to plain text.

Parameters:
htmlContent - The validated HTML content
appendHref - true to append URLs contained in hrefs and srcs; otherwise false.
Example: <a href="www.somewhere.com">Link</a> would become Link [www.somewhere.com]
Returns:
The plain text representation of specified HTML content
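
The effect of appendHref can be illustrated with a regex-based sketch. This is an assumption-laden approximation, not the library's converter: the class and method names are hypothetical, only double-quoted href attributes on anchors are handled (src attributes are ignored), and entities are not decoded.

```java
import java.util.regex.Pattern;

/** Illustrative only: approximates the appendHref behavior for simple anchors. */
final class Html2TextSketch {

    // An anchor with a double-quoted href; group 1 is the URL, group 2 the link text.
    private static final Pattern ANCHOR = Pattern.compile(
        "<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static String toText(String html, boolean appendHref) {
        // Keep the link text and, if requested, append the URL in square brackets.
        String replacement = appendHref ? "$2 [$1]" : "$2";
        String withoutAnchors = ANCHOR.matcher(html).replaceAll(replacement);
        // Drop all other markup.
        return TAG.matcher(withoutAnchors).replaceAll("");
    }
}
```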

formatHrefLinks

public static java.lang.String formatHrefLinks(java.lang.String content)
Searches for non-HTML links and converts them to valid HTML links.

Example: http://www.somewhere.com is converted to <a href="http://www.somewhere.com">http://www.somewhere.com</a>.

Parameters:
content - The content to search in
Returns:
The given content with all non-HTML links converted to valid HTML links
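
A minimal sketch of this kind of conversion, assuming a simple regular expression is good enough. The pattern and the names are hypothetical and not taken from the library; URLs that already sit inside an href attribute or inside link text are skipped through a lookbehind, and trailing punctuation is not trimmed.

```java
import java.util.regex.Pattern;

/** Illustrative only: wraps bare http/https URLs in anchor elements. */
final class LinkSketch {

    // A URL that is not directly preceded by a quote, '=' or '>'.
    private static final Pattern BARE_URL =
        Pattern.compile("(?<![\"'>=])\\bhttps?://[^\\s<>\"']+");

    static String linkify(String content) {
        // $0 is the whole matched URL; it is used as both target and link text.
        return BARE_URL.matcher(content).replaceAll("<a href=\"$0\">$0</a>");
    }
}
```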

getConformHTML

public static java.lang.String getConformHTML(java.lang.String htmlContent,
                                              ContentType contentType)
Creates valid HTML that conforms to W3C standards from specified HTML content.

Parameters:
htmlContent - The HTML content
contentType - The corresponding content type (including charset parameter)
Returns:
The HTML content conforming to W3C standards

getConformHTML

public static java.lang.String getConformHTML(java.lang.String htmlContent,
                                              java.lang.String charset)
Creates valid HTML that conforms to W3C standards from specified HTML content.

Parameters:
htmlContent - The HTML content
charset - The charset parameter
Returns:
The HTML content conforming to W3C standards

createDOMDocument

public static org.w3c.dom.Document createDOMDocument(java.lang.String string)
Creates a DOM document from specified XML/HTML string.

Parameters:
string - The XML/HTML string
Returns:
A newly created DOM document or null if given string cannot be transformed to a DOM document
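
The null-on-failure contract can be sketched with the standard JAXP parser. This is an assumption about the general approach only: the names are hypothetical, and the sketch accepts only well-formed XML (or XHTML), whereas the method above also accepts HTML.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Illustrative only: parses well-formed markup and returns null on any failure. */
final class DomSketch {

    static Document parseOrNull(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Limit entity expansion and similar risks for untrusted mail content.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            // Malformed input, null input, or parser misconfiguration.
            return null;
        }
    }
}
```

Callers of such a method need a null check before touching the returned document.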

prettyPrintXML

public static java.lang.String prettyPrintXML(java.lang.String string)
Pretty-prints specified XML/HTML string.

Parameters:
string - The XML/HTML string to pretty-print
Returns:
The pretty-printed XML/HTML string

prettyPrintXML

public static java.lang.String prettyPrintXML(org.w3c.dom.Node node)
Pretty-prints specified XML/HTML node.

Parameters:
node - The XML/HTML node to pretty-print
Returns:
The pretty-printed XML/HTML node
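
One common way to pretty-print a DOM node is an identity transform with indentation switched on. The sketch below assumes that approach; it is not necessarily what prettyPrintXML does internally, the names are hypothetical, and the indent-amount property is specific to the JDK's built-in (Xalan-derived) transformer. The exact whitespace produced differs between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative only: pretty-prints via an indenting identity transform. */
final class PrettyPrintSketch {

    static String prettyPrint(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Implementation-specific property; other transformers may ignore it.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Pretty-printing failed", e);
        }
    }

    static String prettyPrint(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Node document = factory.newDocumentBuilder()
                                   .parse(new InputSource(new StringReader(xml)));
            return prettyPrint(document);
        } catch (IllegalStateException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Input is not well-formed", e);
        }
    }
}
```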

prettyPrint

public static java.lang.String prettyPrint(java.lang.String htmlContent)
Pretty prints specified HTML content.

Parameters:
htmlContent - The HTML content
Returns:
Pretty printed HTML content

replaceHTMLEntities

public static java.lang.String replaceHTMLEntities(java.lang.String content)
Replaces all HTML entities occurring in specified HTML content.

Parameters:
content - The content
Returns:
The content with HTML entities replaced

getHTMLEntity

public static java.lang.Character getHTMLEntity(java.lang.String entity)
Maps specified HTML entity - e.g. &uuml; - to the corresponding character.

Parameters:
entity - The HTML entity
Returns:
The corresponding character or null if the entity is unknown
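
Entity lookup and replacement can be sketched with a small table. Everything here is hypothetical: the table holds only a handful of entities, the names are invented for illustration, and whether the real method expects the entity with or without the surrounding "&" and ";" is not stated above, so the sketch accepts both.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a tiny entity table with lookup and replacement. */
final class EntitySketch {

    private static final Map<String, Character> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", '&');
        ENTITIES.put("lt", '<');
        ENTITIES.put("gt", '>');
        ENTITIES.put("quot", '"');
        ENTITIES.put("uuml", '\u00FC'); // u with diaeresis
    }

    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Returns the character for an entity such as "&uuml;" or "uuml", or null if unknown. */
    static Character lookup(String entity) {
        String name = entity;
        if (name.startsWith("&")) {
            name = name.substring(1);
        }
        if (name.endsWith(";")) {
            name = name.substring(0, name.length() - 1);
        }
        return ENTITIES.get(name);
    }

    /** Replaces every known named entity; unknown entities are left as they are. */
    static String replaceEntities(String content) {
        Matcher m = ENTITY.matcher(content);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            Character c = lookup(m.group(1));
            String replacement = (c == null) ? m.group() : String.valueOf(c);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```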

htmlFormat

public static java.lang.String htmlFormat(java.lang.String plainText,
                                          boolean withQuote)
Formats plain text to HTML by escaping HTML special characters e.g. "<" is converted to "&lt;".

Parameters:
plainText - The plain text
withQuote - Whether to escape quotes (") or not
Returns:
properly escaped HTML content

htmlFormat

public static java.lang.String htmlFormat(java.lang.String plainText)
Formats plain text to HTML by escaping HTML special characters e.g. "<" is converted to "&lt;".

This is a convenience method which invokes htmlFormat(String, boolean) with the latter parameter set to true.

Parameters:
plainText - The plain text
Returns:
properly escaped HTML content
See Also:
htmlFormat(String, boolean)
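
The relationship between the two overloads can be sketched as follows. The escaping rules shown (ampersand, angle brackets, and optionally the double quote) are an assumption about the minimum such a method would do; the real methods may escape more characters or handle line breaks, and the class name is hypothetical.

```java
/** Illustrative only: minimal HTML escaping with an optional quote rule. */
final class EscapeSketch {

    static String htmlFormat(String plainText, boolean withQuote) {
        StringBuilder sb = new StringBuilder(plainText.length() + 16);
        for (int i = 0; i < plainText.length(); i++) {
            char c = plainText.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                case '"':
                    // Quotes are escaped only on request.
                    sb.append(withQuote ? "&quot;" : "\"");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Convenience overload: quotes are escaped as well. */
    static String htmlFormat(String plainText) {
        return htmlFormat(plainText, true);
    }
}
```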

urlEncodeSafe

public static java.lang.String urlEncodeSafe(java.lang.String text,
                                             java.lang.String charset)
Translates specified string into application/x-www-form-urlencoded format using a specific encoding scheme. This method uses the supplied encoding scheme to obtain the bytes for unsafe characters.

Parameters:
text - The string to be translated.
charset - The character encoding to use; should be UTF-8 according to W3C
Returns:
The translated string, or the string itself if an error occurs
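
The "safe" fallback described here can be sketched on top of java.net.URLEncoder. The wrapper below is a guess at the general shape, not the library's code: it returns the input unchanged when the charset is unknown or null.

```java
import java.net.URLEncoder;

/** Illustrative only: URL-encodes and falls back to the input on failure. */
final class UrlEncodeSketch {

    static String urlEncodeSafe(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (Exception e) {
            // Unsupported or null charset (or null text): hand the input back unchanged.
            return text;
        }
    }
}
```

Under these assumptions, urlEncodeSafe("a b&c", "UTF-8") yields a+b%26c, while an unknown charset name returns the original string.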