What is Airlock?

Airlock is an AI policy layer to prevent the disclosure of sensitive information, such as PII and PHI, in your AI applications. By passing the outputs of your AI-powered applications through Airlock, you can prevent sensitive information from being exposed and monitor the sentiment and offensiveness of AI-generated text.

Airlock FAQ

Frequently asked questions about Airlock. For questions not answered here, please contact Support.

How do I run Airlock?
Airlock is available on the cloud marketplaces as a virtual machine product:

  • Airlock on the AWS Marketplace
  • Airlock on the Google Cloud Marketplace
  • Airlock on the Microsoft Azure Marketplace

Does Airlock use ChatGPT or other third-party APIs?
No. Airlock never transmits your text or documents to any third-party service.

Airlock can run in a firewalled or air-gapped environment. For example, if you are using AWS, you can deploy Airlock to a private subnet and use security groups and network ACLs to prevent any outbound traffic from the Airlock instance and its subnet. In fact, we recommend doing so to increase your overall security posture.
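One way to confirm that a security group really blocks outbound traffic is to list its egress rules. The sketch below is only an illustration: it assumes you already have a security group description shaped like the entries that the AWS EC2 DescribeSecurityGroups API returns (with an IpPermissionsEgress list), and the function name and sample group IDs are hypothetical. Security groups are stateful, so replies to allowed inbound requests still flow even with no egress rules; network ACLs are stateless and have to be reviewed separately.

```python
def outbound_rules(security_group):
    """Summarize every egress rule on one security group description.

    `security_group` is assumed to be a dict shaped like an entry of
    "SecurityGroups" in an EC2 DescribeSecurityGroups response.
    An empty result means the group allows no outbound traffic.
    """
    summaries = []
    for rule in security_group.get("IpPermissionsEgress", []):
        # "-1" is how EC2 represents "all protocols".
        protocol = "all" if rule.get("IpProtocol") == "-1" else rule.get("IpProtocol")
        targets = [r["CidrIp"] for r in rule.get("IpRanges", [])]
        targets += [r["CidrIpv6"] for r in rule.get("Ipv6Ranges", [])]
        targets += [p["GroupId"] for p in rule.get("UserIdGroupPairs", [])]
        targets += [p["PrefixListId"] for p in rule.get("PrefixListIds", [])]
        summaries.append(f"{protocol} -> {', '.join(targets) or 'unspecified'}")
    return summaries


# Hypothetical sample data: a group with the usual allow-all egress rule,
# and a locked-down group with no egress rules at all.
open_group = {
    "GroupId": "sg-example-open",
    "IpPermissionsEgress": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []}
    ],
}
locked_group = {"GroupId": "sg-example-locked", "IpPermissionsEgress": []}

for group in (open_group, locked_group):
    rules = outbound_rules(group)
    status = "allows outbound traffic" if rules else "blocks all outbound traffic"
    print(group["GroupId"], status, rules)
```

Running a check like this against the group attached to the Airlock instance gives a quick view of whether any outbound path remains open.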

Is Airlock open source?
Airlock is built upon Phileas, an open source project for finding and redacting PII and PHI in text and documents. Everyone is welcome to review the Phileas code to learn how it works, to submit an issue when they find one, and to contribute via pull requests. Phileas is licensed under the Apache License, Version 2.0.

What types of PII, PHI, and other sensitive information can Airlock find?
Airlock can redact many types of PII, PHI, and other sensitive information. We are constantly adding new types of information and new ways of recognizing each type. For example, a person’s age may be written in many ways, and we add support for new forms as we discover them. If you wish to discuss these types of information in depth, please contact us.

Some of the types of PII, PHI, and sensitive information identified by Airlock are listed below:

  • Ages
  • Bitcoin Addresses
  • US Cities
  • US Counties
  • Credit Card Numbers
  • Custom Dictionaries (define your own information)
  • Custom Identifiers (can be used to define custom medical record numbers, financial transaction numbers)
  • Dates
  • US Driver’s License Numbers
  • Email Addresses
  • Hospital Names
  • IBAN Codes
  • IP Addresses (IPv4 and IPv6)
  • MAC Addresses
  • Passport Numbers
  • Persons’ Names (supports fuzzy matching, first name, last name, and whole name)
  • Phone/Fax Numbers
  • Physician Names
  • SSNs and TINs
  • Shipping Tracking Numbers
  • US States
  • URLs
  • VINs
  • US Zip Codes

How does Airlock know what types of PII and PHI to redact?
You create policies that tell Airlock what types of PII and PHI to find. A policy lists the types of sensitive information (phone numbers, names, etc.), when to remove them, and how to remove them. You can have as many policies as you need and you can select which policy to apply when redacting text.
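To make the idea of a policy concrete, here is a minimal sketch of the three parts described above: what to find, when to remove it, and how to remove it. This is not Airlock’s policy format or API. The Rule and Policy classes, the strategy names "redact" and "mask", and the simple regular expressions that stand in for real detection are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Rule:
    info_type: str                  # what to find, e.g. "email-address"
    pattern: re.Pattern             # stand-in detector (hypothetical, regex only)
    strategy: str                   # how to remove it: "redact" or "mask"
    condition: Callable[[str], bool] = lambda value: True  # when to remove it


@dataclass(frozen=True)
class Policy:
    name: str
    rules: Tuple[Rule, ...]


def remove(rule, value):
    """Return the replacement for one matched value under one rule."""
    if not rule.condition(value):
        return value                      # condition not met: leave the text alone
    if rule.strategy == "redact":
        return f"[{rule.info_type}]"      # replace with a type label
    if rule.strategy == "mask":
        return "*" * len(value)           # keep the length, hide the characters
    raise ValueError(f"unknown strategy: {rule.strategy}")


def apply_policy(policy, text):
    """Apply every rule in the policy to the text, in order."""
    for rule in policy.rules:
        text = rule.pattern.sub(lambda m, r=rule: remove(r, m.group()), text)
    return text


EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

# Two policies; the caller selects one by name for each piece of text.
policies = {
    "support": Policy("support", (
        Rule("email-address", EMAIL, "redact"),
        # Mask phone numbers, except ones that start with 800.
        Rule("phone-number", PHONE, "mask", lambda v: not v.startswith("800-")),
    )),
    "email-only": Policy("email-only", (Rule("email-address", EMAIL, "redact"),)),
}

message = "Reach Ana at ana@example.com or 555-010-1234."
print(apply_policy(policies["support"], message))
print(apply_policy(policies["email-only"], message))
```

Keeping several named policies side by side mirrors the point above: the same text can be handled differently depending on which policy is selected.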

How is Airlock deployed?
Airlock can be deployed to your cloud via the cloud’s marketplace. See Airlock’s home page for links to the cloud marketplaces.

Is Airlock guaranteed to find 100% of all sensitive information in my text?
No. Airlock uses state-of-the-art natural language processing (NLP) technology to identify sensitive information in text. These NLP methods use models trained on a large corpus of text. The process of applying the model to text is non-deterministic. Many factors can affect the identification of sensitive information in your text, such as how similar your text is to the corpus used to train the model, how the text is formatted, and the length of the text. For these reasons, it is important that you assess Airlock’s performance on your own data before using it in a production system.

The confidence value in the filter strategy condition can be used to tune the NLP engine’s detection. Each identified entity has an associated confidence score between 0 and 100 indicating the model’s estimate that the text is actually an entity, with 0 being the lowest confidence and 100 the highest. A condition on this score lets you ignore entities the model is less sure about. For example, the condition confidence > 75 means that entities with a confidence value of 75 or lower will be ignored and entities with a confidence value greater than 75 will be filtered from the text.
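The effect of a condition such as confidence > 75 can be sketched in a few lines. The example below is an illustration only and does not use Airlock’s API: the Entity class, the function names, the replacement token, and the sample scores are all hypothetical. It shows that the comparison is strict, so an entity scored exactly 75 is left in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str          # the span the model flagged
    start: int         # character offset where the span begins
    end: int           # character offset just past the span
    confidence: float  # model confidence, 0 (lowest) to 100 (highest)


def meets_condition(entities, threshold=75.0):
    """Keep only entities whose confidence is strictly greater than threshold."""
    return [e for e in entities if e.confidence > threshold]


def redact(text, entities, replacement="{{REDACTED}}"):
    """Replace each entity span, working right to left so offsets stay valid."""
    for e in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:e.start] + replacement + text[e.end:]
    return text


text = "George Washington met Mr. Pine in Boston."
detected = [
    Entity("George Washington", 0, 17, 96.0),  # above the threshold: filtered
    Entity("Pine", 26, 30, 75.0),              # exactly 75: ignored by "> 75"
    Entity("Boston", 34, 40, 41.5),            # below the threshold: ignored
]

kept = meets_condition(detected, threshold=75.0)
print(redact(text, kept))  # {{REDACTED}} met Mr. Pine in Boston.
```

Raising the threshold reduces false positives at the cost of missing more real entities, and lowering it does the opposite, which is why testing against your own data matters.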

What platforms are supported by Airlock?
Airlock runs on several cloud platforms, and the platform you use may depend on your choice of cloud provider. See Airlock’s home page for links to the cloud marketplaces.

What is Airlock’s license agreement?
You can view the Airlock License Agreement.