Zero-Shot Named Entity Recognition using OpenAI ChatGPT API

sourajit roy chowdhury
4 min readMar 11, 2023

--

This is a hands-on practical of Zero-Shot Named Entity Recognition system using OpenAI ChatGPT API, Python and Streamlit in Google Colab.

Disclaimers

  • This disclaimer pertains to the code and output presented in this post. For general-purpose entities, this code might work quite properly, but I highly recommend if you are using NER for deep domain-related tasks, then fine-tuning GPT-3 models would be the preferred choice.

Content

What is Named Entity Recognition?

Named Entity Recognition (NER) is a Natural Language Processing (NLP) technique used to identify and extract important entities from unstructured text data. The entities can be anything from the names of people, places, organizations, dates, and more.

NER involves developing models and algorithms that can analyze large volumes of textual data (often called text corpus) and automatically identify the entities within it. This is typically achieved through the use of deep learning techniques such as neural networks, which are trained on labeled data to recognize patterns and extract relevant information from text.

NER is an important component of Information Extraction (IE) because it provides a way to automatically identify and extract important entities from unstructured text data. These entities can then be used as the basis for further information extraction tasks, such as identifying relationships between entities or extracting attributes of entities.

Here in this article, we will neither train a model using a large text corpus nor use a NER-specific pre-trained model. Rather, we will try to extract named entities using ChatGPT itself and show it as a streamlit application.

Zero-Shot NER using OpenAI ChatGPT API

Setting up prompts

We will use different prompts to set up our NER system using ChatGPT.
For this article we will use three entities: PERSON, LOC, DATE.

As you can see SYSTEM_PROMPT is to define the ChatGPT about the role that it needs to follow.

USER_PROMPT_1 is the prompt that the user (myself) provides to clarify ChatGPT’s role.

ASSISTANT_PROMPT_1 this prompt is the acknowledgment by ChatGPT but designed by the user (myself) only.

GUIDELINES_PROMPT defines, how ChatGPT should perform the NER task. In the guidelines, I provided the entity definitions so that ChatGPT can understand more about the entities it needs to extract. Also, I have provided the output format to guide ChatGPT’s response for further processing. And, some examples so that ChatGPT can understand how to extract the entities.

Using the OpenAI ChatGPT API

The above prompts can be fed to the OpenAI ChatGPT completion API in the below format.

Note the message variable in line number 8. This is how ChatGPT expects its prompt inputs. This is also known as Chat Markup Language (ChatML).

Provide your own text to test the NER system

Now you can provide your own text and check for the prediction by the above API as below.

The output you will get is below,

{'PERSON': ['Hanyu Pinyin', 'Christopher'], 'DATE': ['24-06-2018', '2 years'], 'LOC': ['None']}

Setup Streamlit on Google Colab

Setup ngrok in Google Colab

Ngrok will help you in tunneling the streamlit app to a temporary URL, which is quite easy to use and specifically for quick testing purposes from google colab.

  1. Create a free account in ngrok. Get the authentication token as shown below.

2. Now, download the ngrok Linux version as shown below.

!wget https://bin.equinox.io/c/bNyj1mQVY4c/ngrok-v3-stable-linux-386.tgz

3. Extract the downloaded file

!tar -xvf ngrok-v3-stable-linux-386.tgz

4. Authenticate your ngrok agent. You only have to do this once. The Authtoken is saved in the default configuration file.

!./ngrok config add-authtoken <your_ngrok_authetication_token>

5. Install the ngrok python package

!pip install pyngrok

6. Create a tunnel from Google Colab local host to ngrok URL

from pyngrok import ngrok 
public_url = ngrok.connect(port='8501')
public_url

# Output
# <NgrokTunnel: "http://365f-34-125-228-241.ngrok.io" -> "http://localhost:80">

Creating a simple streamlit app

  1. The streamlit app will take a text input.
  2. The input text will be passed to the ChatGPT API to get the extracted entities.
  3. The entities will be formatted in different colors to show back to the streamlit UI.
  4. The code looks similar to the below one.

5. Once you run the application, it will create a temporary URL to interact with your streamlit application

!streamlit run ./ner_streamlit_app.py & npx localtunnel --port 8501

The final output will be very similar to the below one.

Full Example Notebook in GitHub

You can get the full example notebook in my GitHub. You can easily copy the steps in your google colab to run your own version of the NER system.

Thank you for reading this article. Consider subscribing to this post if you found its contents valuable and enlightening. Your support will be a source of inspiration for crafting future posts that are equally informative and engaging. Follow me on medium and subscribe to my newsletter.

--

--