Fine-Tuning a TensorFlow BERT Model for Sentiment Analysis

Kai Jun Eer
3 min read · Aug 14, 2020

Bidirectional Encoder Representations from Transformers (BERT) is one of the major advancements in Natural Language Processing (NLP) in recent years. BERT achieves strong performance on many NLP tasks, such as text classification, text summarisation and question answering.

In this article, I will walk through how to fine-tune a BERT model on your own dataset for text classification (sentiment analysis in my case). When browsing the web for guides, I came across mostly PyTorch implementations or fine-tuning on pre-existing datasets such as GLUE. Therefore, I would like to provide a guide on the TensorFlow implementation using my own customised dataset.

The Hugging Face library provides convenient pre-trained transformer models, including BERT. We will be using TFBertForSequenceClassification, the TensorFlow class for fine-tuning a BERT model on sequence classification. The underlying BERT model is pretrained on English Wikipedia and the BookCorpus. The following code installs the library and loads the pretrained model.
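A minimal sketch of this step, assuming the transformers package and the bert-base-uncased checkpoint:

```python
# Install the Hugging Face library (run once, e.g. in a notebook cell):
# pip install transformers

from transformers import BertTokenizer, TFBertForSequenceClassification

# Load the WordPiece tokenizer and the pretrained model.
# num_labels=2 because we classify sentences as negative (0) or positive (1).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
```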

Loading pretrained model

After loading the pretrained model, it is time to load our dataset. In my project, my dataset consists of two columns — sentence and polarity. A polarity of 0 means negative sentiment for the corresponding sentence, while a polarity of 1 means positive.

Sample dataset format
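For concreteness, here is a sketch of loading such a dataset with pandas; the file name train.csv is a placeholder, and the column names follow the format above:

```python
import pandas as pd

# Two columns: sentence (text) and polarity (0 = negative, 1 = positive)
df = pd.read_csv('train.csv')

sentences = df['sentence'].tolist()
labels = df['polarity'].tolist()
```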

Next, we need to format the data so that it is recognised by the TFBertForSequenceClassification model. The pretrained BERT model takes three input features: input ids, token type ids and attention masks. Input ids are the id numbers assigned to each token based on the BERT vocabulary. Since the BERT tokenizer pads every sentence with zeros so that all sentences are the same length, attention masks are needed to differentiate actual tokens from padding. Token type ids indicate which sentence (segment) each token belongs to, which matters for sentence-pair tasks. The BERT tokenizer has a function, encode_plus, which converts your raw sentences into these three input features. The following code organises your dataset into tensors so that it is compatible with the TensorFlow BERT implementation.
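A sketch of this conversion, reusing the tokenizer loaded earlier; the maximum length of 128 and batch size of 32 are assumed values rather than prescribed ones:

```python
import tensorflow as tf

def encode_examples(sentences, labels, max_length=128):
    input_ids, token_type_ids, attention_masks = [], [], []
    for sentence in sentences:
        # encode_plus produces the three input features for one sentence
        encoded = tokenizer.encode_plus(
            sentence,
            add_special_tokens=True,    # add [CLS] and [SEP]
            max_length=max_length,
            padding='max_length',       # pad shorter sentences with zeros
            truncation=True,
            return_token_type_ids=True,
            return_attention_mask=True,
        )
        input_ids.append(encoded['input_ids'])
        token_type_ids.append(encoded['token_type_ids'])
        attention_masks.append(encoded['attention_mask'])

    # Pack the features and labels into a tf.data.Dataset
    return tf.data.Dataset.from_tensor_slices((
        {
            'input_ids': input_ids,
            'token_type_ids': token_type_ids,
            'attention_mask': attention_masks,
        },
        labels,
    ))

train_dataset = encode_examples(sentences, labels).shuffle(1000).batch(32)
```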

Converting raw dataset to recognised data format
Fine-tune the BERT model

With the dataset and the pretrained BERT model in place, we can fine-tune the model to suit our purposes. The TensorFlow implementation makes training straightforward, as shown in the code below.
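A sketch of the training step using the standard Keras workflow; the learning rate, loss and number of epochs here are typical choices rather than tuned values:

```python
# Compile with a small learning rate (BERT fine-tuning is sensitive to large ones)
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')

model.compile(optimizer=optimizer, loss=loss, metrics=[metric])

# Fine-tune on the encoded dataset; a couple of epochs is usually enough
model.fit(train_dataset, epochs=2)
```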

And now you’re done! If you would like to make sentiment predictions on your test dataset, simply follow the code below, where pred_sentences is a list containing your test sentences.
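A sketch of the prediction step; the example sentences and label names are made up for illustration:

```python
pred_sentences = ['The service at this restaurant was excellent!',
                  'I would not recommend this product to anyone.']

# Tokenize the test sentences as a single padded batch of TensorFlow tensors
inputs = tokenizer(pred_sentences, padding=True, truncation=True,
                   max_length=128, return_tensors='tf')

# The model outputs logits; index [0] retrieves them across transformers versions
logits = model(inputs)[0]
predictions = tf.argmax(logits, axis=-1).numpy()

label_names = ['Negative', 'Positive']
for sentence, prediction in zip(pred_sentences, predictions):
    print(f'{sentence} -> {label_names[prediction]}')
```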

Print out the prediction results of the sentiment analysis model

Of course, there are models other than BERT, for example XLNet, RoBERTa and GPT. However, different models take different input data formats, so you might need to spend some time converting your raw dataset to their accepted format. One thing to note is that if you only need sentiment analysis on very general sentences, most of the time you can already achieve good results without fine-tuning the model. Fine-tuning is usually only needed if you are planning to do sentiment analysis on a very specific subject matter, for example the outlook of Bitcoin prices. Rising prices could read as positive to some people and negative to others, so it matters to fine-tune your sentiment classifier on that domain.

Happy exploring!

