Validating FHIR resources for fun and nonprofit

March 1, 2021

Christian Burtchen

Christian Burtchen is a senior frontend developer specializing in all things JavaScript at Data4Life. Combining a humanities background with software engineering, he thinks all technology should be used to (Tron voice) fight for users. He also likes nerdy references.

We’ve been talking a lot about the FHIR acronym – that it concerns resources (R) relating to all things healthcare (H) and that their implementation is intended to be fast (F).

But it’s time to address the Ilephant in the room: The resources must also be interoperable, meaning all parties need to be able to trust their format and content.

That’s why the good people working on FHIR have been painstakingly defining base resources and others that build on, interweave with, or otherwise extend them. And all those definitions come with examples and a schema in JSON and XML.

But: For all of us to benefit from these definitions, we need to be sure that everybody actually obeys their rules. Therefore, our JavaScript software development kit has had FHIR resource validation since its very beginning, for every record created or updated. And as we at Data4Life are an end-to-end-encrypted platform, all of this needs to happen on the client side.

So, given that we have a schema to validate against and the data to validate, what’s the problem?

Just in time: valid validation issues

For one thing, even the mightiest regular expressions can’t capture all of the expectations for FHIR resources. As a simple example, DocumentReference objects may contain multiple attachments – however, these are not supposed to be different attachments, but rather different representations of the same content, such as a JPEG image of an X-ray and its DICOM representation (Digital Imaging and Communications in Medicine).

Problems of this kind that require human inspection or still-not-quite-there artificial intelligence are simply beyond the scope of our validation. Another inherent limitation specific to JSON schema validation is the inability to process sliced resources.

Pesky content issues such as these aside, how do we actually validate the resources? There is an ecosystem of JSON schema-based validators available that work with JavaScript. We decided to use Another JSON Schema Validator (AJV) because of its very robust support for the JSON schema contents. And the implementation is a valid breeze:

// This code is loosely adapted from our original 2018 validation implementation
const ajv = new Ajv();
// fhirSchemaFile is a JSON schema of ALL FHIR resources in one document
ajv.addSchema(fhirSchemaFile, 'fhir.schema.json');

this.validator = ajv; // used as a reference

// getting the matching reference within the resource to look up the right place in the schema
const ref = this.getRefName(resource.resourceType);
...
const valid = this.validator.validate({ $ref: ref }, resource);

At the end of this, valid will be true, or it will be false and this.validator.errors will contain the list of errors that AJV found.

Recap: We download the JSON schema and, at runtime, instantiate AJV, add the schema, and validate against it. Sounds smooth? Almost. There is only one problem with this, but it’s a huge problem:

In order to do runtime validation, AJV compiles the schema into JavaScript functions on the fly, which requires the Content Security Policy to allow 'unsafe-eval'. And even if you don’t subscribe to the “eval is evil” doctrine: as an organization set out to be responsible, safe, secure, and trustworthy, you probably don’t want to enable anything that has “unsafe” in its name. So what else could we do?
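
To see why the policy matters: validators of this kind turn a schema into JavaScript source at runtime and hand that string to the Function constructor, which browsers block unless the Content Security Policy allows 'unsafe-eval'. The toy compiler below is only a sketch of that general technique, not AJV’s actual code; compileRequired and the checks it generates are hypothetical names for illustration.

```javascript
// Toy illustration of runtime code generation; NOT AJV's real implementation.
// compileRequired is a hypothetical helper that only checks that the given
// properties exist and are strings.
function compileRequired(requiredKeys) {
  // Build the validator's source code as a string, one check per required key.
  const checks = requiredKeys
    .map(key => `if (typeof data[${JSON.stringify(key)}] !== 'string') return false;`)
    .join('\n');
  const source = `${checks}\nreturn true;`;

  // The Function constructor evaluates a string as code. In a browser whose
  // Content Security Policy does not allow 'unsafe-eval', this call is blocked
  // and throws (typically an EvalError).
  return new Function('data', source);
}

const validatePatientLike = compileRequired(['resourceType', 'id']);
console.log(validatePatientLike({ resourceType: 'Patient', id: 'abc' })); // true
console.log(validatePatientLike({ resourceType: 'Patient' })); // false
```

Generating the same function once at build time and shipping it as an ordinary script file avoids the runtime evaluation entirely.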

Building validation ahead of time

AJV has a command line tool that allows the pre-generation of actual JavaScript function files based on a schema. And it’s very straightforward to use:

# requires global ajv-cli installation
npx ajv compile -s fhir-careplan.schema.json -o careplan.js

Great! There are only a few issues with this. The resulting JavaScript file needs some spot-cleaning to be a readable ES6 module. More problematically, even the smallest FHIR schemas generate files well north of 20 000 lines of code, closer to Jules Verne’s nautical fiction than to ideal file size. And putting the aforementioned all-in-one FHIR schema in this process results in comically gargantuan output.

If you now hope for a “but we found a magic flip that saves 70% of this” paragraph like a Dr. House epiphany 6 minutes before the show ends, I am sorry to disappoint: it’s not lupus. Of course we employ best practices such as compression during build time and gzipping on delivery. Of course we import the resulting function dynamically only when it’s needed – so only during upload or creation and only for the specific resource in question.

if (version === FHIR_VERSION_R4) { // we support both STU3 and R4
  switch (resourceType) {
    case 'Encounter':
      returnPromise = import('@d4l/js-fhir-validator/r4/js/Encounter').then(bundle => {
        this.validator[version][resourceType] = bundle.default;
        return this.validator[version][resourceType];
      });
      break;
...
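
The lazy-loading idea can be sketched as a small cache keyed by FHIR version and resource type. This is a minimal sketch under assumptions, not the SDK’s actual code: createValidatorCache and the injected loadModule function are hypothetical, and loadModule stands in for the dynamic import() call so that the example runs without the package installed.

```javascript
// Minimal sketch of per-resource lazy loading with a promise cache.
// createValidatorCache and loadModule are hypothetical names; in a real app
// loadModule would wrap a dynamic import() of the generated validator file.
function createValidatorCache(loadModule) {
  const cache = new Map();

  return function getValidator(version, resourceType) {
    const key = `${version}/${resourceType}`;
    if (!cache.has(key)) {
      // Cache the promise itself so concurrent callers share one download.
      const pending = loadModule(version, resourceType).then(bundle => bundle.default);
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Stand-in loader that counts how often a "download" would happen.
let loadCount = 0;
const fakeLoadModule = (version, resourceType) => {
  loadCount += 1;
  // Mimics the shape of an imported module: the validator is the default export.
  return Promise.resolve({ default: resource => resource.resourceType === resourceType });
};

const getValidator = createValidatorCache(fakeLoadModule);
const first = getValidator('r4', 'Encounter');
const second = getValidator('r4', 'Encounter');
console.log(first === second, loadCount); // true 1

first.then(validate => {
  console.log(validate({ resourceType: 'Encounter' })); // true
});
```

Note that a failed load would stay in the cache in this sketch; a production version would probably evict the entry when the promise rejects.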

But the fundamental trade-off, in this case between security and performance, still remains. For us, it was an obvious choice, and we want to encourage others to follow in our footsteps.

Open-sourcing the validation

We believe that others might profit from the work we put into this, so we’ve decided to open-source our client-based JavaScript FHIR validation.

It’s available as an npm module:

npm install @d4l/js-fhir-validator

Within your respective project, you can use it as below:

import('@d4l/js-fhir-validator/r4/js/diagnosticreport').then(bundle => {
  const validationFunction = bundle.default; // the validator is the module's default export
  const validationResult = validationFunction(diagnosticReport);
  // validationResult is either true or false
  if (validationFunction.errors) { // errors are reset after every call
    ...
  }
});
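
Acting on the errors usually means turning them into readable messages. The sketch below assumes the generated function follows AJV’s convention of returning a boolean and exposing an errors array whose entries carry a message plus either instancePath (AJV 8) or dataPath (earlier versions). describeErrors and the stand-in validator are hypothetical, written only so that the example runs without the package.

```javascript
// Hypothetical helper: turn AJV-style error objects into readable lines.
function describeErrors(errors) {
  return (errors || []).map(error => {
    // AJV 8 reports the location as instancePath, older versions as dataPath.
    const path = error.instancePath || error.dataPath || '(root)';
    return `${path}: ${error.message}`;
  });
}

// Stand-in for a generated validator, mimicking the interface described above:
// it returns true or false and stores the errors of the last call on itself.
function fakeValidate(resource) {
  fakeValidate.errors = null;
  if (typeof resource.status !== 'string') {
    fakeValidate.errors = [{ instancePath: '/status', message: 'must be string' }];
    return false;
  }
  return true;
}

const isValid = fakeValidate({ resourceType: 'DiagnosticReport', status: 42 });
if (!isValid) {
  console.log(describeErrors(fakeValidate.errors)); // [ '/status: must be string' ]
}
```

Because the errors are reset on every call, they need to be read (or copied) before the same function validates the next resource.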

There are a few constraints with our approach – limits in the translatability of the FHIR specifications as well as limits we had to impose on their ever-expanding interweaving. For instance, a Bundle can contain just about any other resource there is, so it has to be split up by the client and every resource ought to be validated individually. The same is true for contained resources – otherwise a client would always have to download all possible definitions.
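
Splitting a Bundle on the client can be sketched as follows. The function name collectResources is hypothetical and not part of the published package; the sketch relies only on the standard FHIR layout, in which a Bundle lists its resources under entry[].resource and a resource may embed others under contained.

```javascript
// Hypothetical helper: flatten a Bundle into the individual resources that
// each need their own validator.
function collectResources(resource) {
  if (resource.resourceType === 'Bundle') {
    // Each entry may carry one resource; recurse so nested Bundles are split too.
    return (resource.entry || [])
      .filter(entry => entry.resource)
      .flatMap(entry => collectResources(entry.resource));
  }
  // Contained resources are validated on their own, alongside their container.
  return [resource, ...(resource.contained || [])];
}

const bundle = {
  resourceType: 'Bundle',
  type: 'collection',
  entry: [
    { resource: { resourceType: 'Patient', id: 'p1' } },
    {
      resource: {
        resourceType: 'DiagnosticReport',
        id: 'r1',
        contained: [{ resourceType: 'Observation', id: 'o1' }],
      },
    },
  ],
};

console.log(collectResources(bundle).map(r => r.resourceType));
// [ 'Patient', 'DiagnosticReport', 'Observation' ]
```

Each returned resource can then be passed to the validator loaded for its own resourceType, so a client only downloads the definitions it actually encounters.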

Right now, our tests validate thousands of resources already – both official FHIR examples and resources we generate in our projects. We are excited to bring this project to the public and welcome your feedback! So please send any comments to we@data4life.care.
