We are going to discuss generated HTML content and how to validate it in an automated fashion. The results will be appropriate to work into your continuous integration (CI) pipeline.
In my experience, the pendulum has swung a few times between dynamic HTML content and statically generated content. Setting aside a full discussion of the positives and negatives of each approach, there are cases where a generated site is the useful choice.
There are many solutions for generating HTML content, but we’ll focus on one tool since we’re primarily interested in how to validate the HTML output. That tool is Grain, a Groovy-based site generator that uses Gradle for its build system.
## htmlSanityCheck Gradle Plugin
The `org.aim42.htmlSanityCheck` Gradle plugin is an easy way to validate HTML in your Gradle build. It has a number of rules, is configurable, and will output JUnit style reports that can be consumed by most CI tools.
## Step 1: Create Grain Project
1. Clone the Octopress Grain theme from https://github.com/double16/grain-theme-octopress
2. Run `./grainw`
3. Open http://localhost:4000 in a web browser
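The first two steps can be scripted. The following is a minimal sketch, assuming `git` is installed. The function name `preview_grain_theme` and its optional directory argument are hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: fetch the theme (if not already present) and start Grain.
# Usage: preview_grain_theme [target_dir]
preview_grain_theme() {
    repo_url="https://github.com/double16/grain-theme-octopress"
    target_dir="${1:-grain-theme-octopress}"

    # Clone only when the directory does not exist yet.
    [ -d "$target_dir" ] || git clone "$repo_url" "$target_dir" || return 1

    # Run the Grain wrapper in a subshell so the caller's working directory is unchanged.
    ( cd "$target_dir" && ./grainw )
}
```

Once the wrapper is running, open http://localhost:4000 in a web browser as described in step 3.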
## Step 2: The htmlSanityCheck Plugin
The `htmlSanityCheck` plugin is easy to add to this project. It already exists in the repo referenced above. The parts of interest that you need to add to your own `build.gradle` follow:
```groovy
htmlSanityCheck {
    // generate is provided by grain, we want the HTML generated before we check it
    dependsOn generate

    // target is the default folder for grain content
    sourceDir = file('target')

    // set the location of the results, this can be used for any project
    checkingResultsDir = file("${buildDir}/reports/htmlchecks")

    // validate external links, it will take longer, adjust as desired
    checkExternalLinks = true
}

// validate HTML as part of the standard Gradle check process
check.dependsOn htmlSanityCheck
```
## Step 3: Generate and Validate HTML
Validation can be run by itself or as part of the Gradle `check` task.
```shell
./gradlew check
```

OR

```shell
./gradlew htmlSanityCheck
```
If the build fails, open the `build/reports/htmlchecks/index.html` file to inspect the failures. More details can be seen during the build by using the `--info` option.
```shell
./gradlew check --info
...
==================================================
Summary for file index.html

page path : grain-theme-octopress/target/blog/2014/01/08/pullquote-tag/index.html
page title : Pullquote tag - Octopress theme for Grain
page size : 16062 bytes

--------------------------------------------------
Results for Duplicate Definition of id Check
4 id checked, 0 duplicate id found.

--------------------------------------------------
...
```
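For a CI pipeline, it can help to wrap the task so that a failed build points directly at the report. This is a rough sketch, not part of the plugin. The function name `run_html_checks` and the `GRADLE_CMD` and `REPORT` variables are assumptions made for illustration, and the default report path matches the `checkingResultsDir` configured in Step 2.

```shell
# Hypothetical CI helper: run the HTML checks and point at the report on failure.
# GRADLE_CMD and REPORT are illustrative overrides, not plugin settings.
run_html_checks() {
    gradle_cmd="${GRADLE_CMD:-./gradlew}"
    report="${REPORT:-build/reports/htmlchecks/index.html}"

    if "$gradle_cmd" htmlSanityCheck; then
        echo "HTML checks passed"
    else
        # Inside the else branch, $? still holds the exit status of the Gradle run.
        status=$?
        echo "HTML checks failed, see $report" >&2
        return "$status"
    fi
}
```

Because the function returns the Gradle exit status, calling it as the last command of a CI step fails the step whenever the checks fail.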
## Sample Output
* XML output is JUnit compatible so that tools that ingest JUnit reports will work with `htmlSanityCheck`. Output is in the `build/test-results/htmlchecks/` folder in the root of the project.
* The HTML report looks clean and similar to JUnit HTML reports.
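
If your CI tool does not ingest JUnit reports itself, the XML files can be summarized by hand. The sketch below assumes each file holds one `<testsuite>` element whose attributes, including the conventional JUnit `failures="N"` attribute, sit on a single line. That layout is an assumption about the report format, so compare it with the files your build actually produces. The function name `count_html_failures` is hypothetical.

```shell
# Hypothetical helper: sum the failures="N" attributes found in a results folder.
# Assumes one <testsuite> element per file, with its attributes on one line.
count_html_failures() {
    results_dir="${1:-build/test-results/htmlchecks}"
    total=0
    for report in "$results_dir"/*.xml; do
        # Skip the unexpanded glob pattern when the folder has no XML files.
        [ -f "$report" ] || continue
        # Extract the failures attribute from the first matching <testsuite> tag.
        n=$(sed -n 's/.*<testsuite[^>]*[[:space:]]failures="\([0-9][0-9]*\)".*/\1/p' "$report" | head -n 1)
        total=$((total + ${n:-0}))
    done
    echo "$total"
}
```

A pipeline step could then gate on the total, for example `[ "$(count_html_failures)" -eq 0 ] || exit 1`.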

## Summary
“Test everything!” applies to your generated HTML. Not only can the HTML be checked for correctness and valid links, but standards such as accessibility can also be validated before publishing the content.
The `htmlSanityCheck` tool is a great way to test your generated HTML from Gradle or other build tools. See https://github.com/aim42/htmlSanityCheck for full documentation.