Remove PII and PHI from OpenAI ChatGPT API Requests

It seems like OpenAI’s ChatGPT is everywhere, doesn’t it? It’s finding its way into a lot of industries and helping out in a lot of places. But as adoption grows, so do the risks. As ChatGPT gets built into applications that handle sensitive information like PII and PHI, we have to be careful to keep that information from being transmitted to OpenAI’s API. We want to prevent our users and applications from sending sensitive information to third parties.

ChatGPT does have some safeguards. If you ask it a question like “Whose social security number is 123-45-6789?”, you will get a response similar to “the information you are looking for can’t be provided.” (Does it even know to start with? Probably not.) But! We still transmitted the social security number to a third party and that’s not good!

Philter OpenAI Proxy

Introducing the Philter OpenAI Proxy project. This open source project is a proxy server that uses Philter to remove PII and PHI from requests bound for OpenAI’s API endpoint. Instead of sending your API requests to https://api.openai.com, you direct the API requests to the proxy.

If you are running the proxy on your computer, you would use https://localhost:8080 as the endpoint. When the proxy receives a chat completion API request, it extracts the content of each message in the request and sends it to Philter, which identifies and redacts the PII and PHI. Philter returns the redacted messages to the proxy, which then passes them on to OpenAI’s API without the PII and PHI.

Redacting an OpenAI ChatGPT API Request

Here’s an example. Let’s say you send the following request. Note that the host in the request below is localhost:8080 so the message is being sent to the proxy and not to OpenAI.

curl -k https://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "Whose social security number is 123-45-6789?"}]
}'

The proxy will extract the content of each message in the request. Each content string will be analyzed by Philter for PII and PHI. In this case, Philter will see the social security number 123-45-6789. The response from Philter will be “Whose social security number is REDACTED?” This redacted text is then sent to OpenAI’s API, free of the social security number.
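To make the redaction step concrete, here is a minimal Python sketch of what happens to the request body. It is not the proxy's actual code (the proxy is written in Go and calls Philter). The `redact_text` and `redact_chat_request` functions are hypothetical names, and a single regular expression for social security numbers stands in for Philter, which handles many more kinds of PII and PHI.

```python
import copy
import re

# Stand-in for Philter (an assumption for illustration only): one regex that
# matches US social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_text(text):
    """Replace anything that looks like an SSN with the token REDACTED."""
    return SSN_PATTERN.sub("REDACTED", text)


def redact_chat_request(request_body):
    """Return a copy of a chat completion request with each message's content redacted."""
    redacted = copy.deepcopy(request_body)  # leave the caller's request untouched
    for message in redacted.get("messages", []):
        if isinstance(message.get("content"), str):
            message["content"] = redact_text(message["content"])
    return redacted


request_body = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Whose social security number is 123-45-6789?"}
    ],
}

print(redact_chat_request(request_body)["messages"][0]["content"])
# Whose social security number is REDACTED?
```

Only the message content changes. The model name and the rest of the request are passed through as they were.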

Using the Proxy

To use the proxy, clone its repository. It’s a small Go application so you can build it using the Makefile with make build. Next, generate a self-signed certificate using the command make cert. Now, run the built executable file:

PHILTER_ENDPOINT=https://your-philter-ip:8080 ./philter-openai-proxy

That’s it! The proxy is up and running and listening for requests. In your client applications that use OpenAI’s API, just send your requests to https://localhost:8080 instead of https://api.openai.com.
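For a client written in Python, the change can be as small as swapping the base URL. The sketch below uses only the standard library and assumes the proxy is listening on localhost port 8080 over HTTPS with the self-signed certificate from make cert (change the scheme if your setup differs). `build_chat_request` and `send` are hypothetical helper names, and disabling certificate verification mirrors curl's -k flag, so it is only appropriate for local testing.

```python
import json
import ssl
import urllib.request

# Assumption: the proxy runs locally on port 8080 and serves HTTPS using the
# self-signed certificate generated by `make cert`.
PROXY_BASE_URL = "https://localhost:8080"


def build_chat_request(base_url, api_key, content, model="gpt-3.5-turbo"):
    """Build a chat completion request aimed at base_url instead of api.openai.com."""
    body = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def send(request):
    """Send the request, skipping certificate verification like `curl -k` does."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode is relaxed
    context.verify_mode = ssl.CERT_NONE  # accept the self-signed certificate
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the proxy running, a call would look like send(build_chat_request(PROXY_BASE_URL, api_key, "Whose social security number is 123-45-6789?")), where api_key holds your OpenAI API key.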

You can also use environment variables to change the filenames of the certificate and key file, as well as the port on which the proxy listens.

The Philter OpenAI Proxy is licensed under the Apache License, Version 2.0.
