OpenAI’s GPT-2 (Generative Pre-Trained Transformer-2) : "AI that is too Dangerous to Handle."

Tanesh Balodi
Oct 06, 2019

Anyone who has studied Natural Language Processing (NLP) in depth, and is familiar with techniques such as bag of words, tokenization, stemming, and lemmatization, probably knows how an NLP model can predict the next word from the rest of a sentence, or how it can be used for sentiment analysis of movie reviews. But what if I told you there is an NLP model that was released only in part, because its creators were so struck by its results that they feared the full model might be misused? That model is our topic for today: Generative Pre-trained Transformer-2 (GPT-2), released by researchers at OpenAI in early 2019.

Let us look at this model in more detail.

Introduction to Generative Pre-Trained Transformer

An OpenAI research team built a model trained on about 40GB of internet text, and its performance was remarkable. Earlier NLP models could already predict the next piece of text, but this one can continue a whole article or story from just a few words or sentences, and the results are often fluent enough that it is hard to tell they were generated by a machine. Here is a sample from OpenAI's research:

For today’s homework assignment, please describe the reasons for the US Civil War.

Above is a human-written prompt given to the Generative Pre-trained Transformer. The machine-generated answer was startling for an NLP model. Here is the result:

It is easy to identify why the Civil War happened because so many people and so many books and so much television and films tell us that it was the cause, that it has something to do with race or economics or religion. There is much agreement that it was essentially a war of slavery on behalf of capitalism, about a century of slavery.

But that’s not what most people think of when they think of the Civil War. Many people think of the war as about states’ rights. You’re not wrong about that, but it’s kind of misleading to say that the Civil War was a conflict between states’ rights and federalism. So let’s try again. What’s the idea that has inspired so many people to go onto the battlefields of Gettysburg and Antietam and Gettysburg and Petersburg and Fredericksburg? The American ideal of the republic--a notion of limited government--is a great part of the history.

This is not even the whole answer generated by GPT-2, just an excerpt. The model comes remarkably close to the goal behind natural language processing, which is to have a machine interpret and understand raw text much as a human would.
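To make the idea of continuing a prompt concrete, here is a minimal sketch of generating text one word at a time. It assumes a toy word-pair (bigram) table in place of GPT-2's large Transformer, and the function names `train_bigrams` and `generate` are hypothetical; only the loop of "predict the next word, append it, repeat" mirrors how such a model writes.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most likely next word."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # no known continuation for the last word
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Toy corpus, chosen only for illustration.
counts = train_bigrams("the war was long and the war was long")
print(generate(counts, "the", max_new_words=3))
```

A real model conditions on the whole preceding text rather than only the last word, which is what lets it stay on topic over several paragraphs.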

Different Versions of GPT-2

The full GPT-2 model was trained on a 40GB text dataset and has about 1.5 billion parameters across 48 layers. As a precaution, the version first released was much smaller, with about 117 million parameters and 12 layers, which lowers its performance and accuracy. A second released version has about 345 million parameters and therefore performs better.

GPT-2 also beat previous records by a substantial margin. For example, the previous best model achieved 85.7% accuracy on the “Children’s Book Test Common Nouns” dataset, whereas GPT-2 reached 93.30% on the same dataset, leaving a gap of less than 3 percentage points to human-level accuracy.
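As a rough check on these sizes, a Transformer's parameter count can be estimated from its depth and width. The sketch below assumes the hidden widths, vocabulary size, and context length of the released GPT-2 configurations (none of which appear in this article), and uses the common approximation of about 12·d² weights per layer, ignoring biases and normalization parameters. The function name `estimate_parameters` is hypothetical.

```python
def estimate_parameters(n_layers, d_model, vocab_size=50257, context_len=1024):
    """Approximate parameter count of a GPT-style Transformer.

    Each layer has about 4*d^2 attention weights and 8*d^2 feed-forward
    weights (assuming a 4x wider hidden layer), so roughly 12*d^2 in total.
    Biases and layer-norm parameters are ignored.
    """
    per_layer = 12 * d_model ** 2
    embeddings = (vocab_size + context_len) * d_model
    return n_layers * per_layer + embeddings

# (layers, width) pairs are assumptions based on the released GPT-2 configurations.
for name, layers, width in [("small", 12, 768), ("medium", 24, 1024), ("full", 48, 1600)]:
    print(f"{name}: ~{estimate_parameters(layers, width) / 1e6:.0f}M parameters")
```

The estimates land close to the 117 million, 345 million, and 1.5 billion figures quoted for the three model sizes, which shows that most of the growth comes from making the model both deeper and wider.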

Architecture on which GPT-2 is based

Let’s start with the name, Generative Pre-Trained Transformer. ‘Generative’ describes what the model does: it generates text, continuing a prompt with new text that reads as coherent and meaningful. ‘Pre-Trained’ indicates that the model has already been trained on a huge text corpus before it is applied to any particular task. ‘Transformer’ is the most important part of the name, as it identifies the architecture, which we discuss next.

Architecture of the Transformer

The figure above shows the Transformer architecture used by the model. Its layers serve different purposes, and its outputs feed both text prediction and text classification.

The huge dataset is fed to this Transformer and training runs over a very large number of iterations, which is the reason behind its success in language modeling, machine translation, and automatic text generation. The Transformer can be called the foundation stone of this very efficient model. It was originally introduced as an architecture for machine translation, and here it serves as the core that delivers strong results in natural language processing.

Implementing the Transformer involves four main steps:

Inserting Input: Every word of the text is fed to the Transformer. Embedding words is common practice in neural machine translation; in this step, each word is mapped to a vector known as its embedding vector.

Positional Encoding: Positional encoding adds information about each word's position in the sequence to the embedding vector produced in the previous step.

Creating Masks: Masks serve a purpose in both the encoder and the decoder. In the decoder, the mask hides later positions, so that each next word is predicted only from the words that come before it.

Feed Forward Layer: The feed-forward layer in the Transformer applies linear transformations together with two important operations, the ReLU activation and dropout. Normalization is applied after these operations, which is important for keeping the results uniform.
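The four steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not GPT-2's implementation: the embedding table is random, the positions use the sinusoidal formula from the original Transformer paper (GPT-2 learns its position vectors instead), dropout is left out because it only matters during training, and every function name is hypothetical.

```python
import math
import random

def embed(tokens, table):
    """Step 1: look up an embedding vector for every token."""
    return [table[t] for t in tokens]

def positional_encoding(seq_len, d_model):
    """Sinusoidal position vectors: sine on even indices, cosine on odd ones."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

def add_positions(vectors):
    """Step 2: add position information to each embedding vector."""
    pe = positional_encoding(len(vectors), len(vectors[0]))
    return [[v + p for v, p in zip(vec, pos)] for vec, pos in zip(vectors, pe)]

def causal_mask(seq_len):
    """Step 3: mask[i][j] is True when position i may look at position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def feed_forward(x, w1, b1, w2, b2):
    """Step 4: linear -> ReLU -> linear, applied to one position's vector.

    w1 and w2 hold one row of input weights per output unit.
    """
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row)) + b)
              for row, b in zip(w1, b1)]
    return [sum(h * w for h, w in zip(hidden, row)) + b
            for row, b in zip(w2, b2)]

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (roughly) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

random.seed(0)
d_model = 4
tokens = ["the", "civil", "war"]
table = {t: [random.uniform(-1, 1) for _ in range(d_model)] for t in tokens}

x = add_positions(embed(tokens, table))   # steps 1 and 2
mask = causal_mask(len(tokens))           # step 3
for row in mask:
    print(row)
```

Each row of the printed mask allows one more position than the row before it, which is what keeps the model from looking at words it has not yet predicted.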

Why Unsupervised Learning is Preferred?

As we know, most widely used models rely on supervised learning and have achieved major success with it, and most algorithms are best suited to supervised learning. So why is unsupervised learning preferred in GPT-2, OpenAI’s most advanced natural language processing model? The reason is very practical. In their notes they wrote: “Since unsupervised learning removes the bottleneck of explicit human labeling, it also scales well with current trends of increasing computation, and availability of raw data. Unsupervised learning is a very active area of research but practical uses of it are often still limited”.

That is why they prefer unsupervised learning. One more reason is that labeled and cleaned data is expensive, so choosing unsupervised learning is a clever choice.
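A small sketch shows why no human labeling is needed for this kind of training: the raw text supplies its own targets, because every word can serve as the answer for the words that precede it. The function name `next_word_examples` and the context size are assumptions made for illustration.

```python
def next_word_examples(text, context_size=3):
    """Turn raw text into (context, next word) training pairs."""
    words = text.split()
    examples = []
    for i in range(1, len(words)):
        context = words[max(0, i - context_size):i]
        examples.append((context, words[i]))
    return examples

for context, target in next_word_examples("the civil war ended in 1865"):
    print(context, "->", target)
```

Every sentence on the web yields examples like these for free, which is why this approach scales with the amount of raw data available.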

What are the Drawbacks of GPT-2?

False Information:- GPT-2 is trained on millions of web pages, but the correctness of the content on those pages cannot be guaranteed. Because the model is trained on such a dataset, it can pick up and exploit biases in the data distribution, and it can reproduce false information.

Heavy Computation:- GPT-2 requires a far heavier computational setup than earlier language models, many of which were trained on a single GPU. For scale, OpenAI reported that pre-training the original GPT took about one month on 8 GPUs with a 37-layer (12-block) Transformer, and GPT-2 is considerably larger than that, which shows how much computation these models involve.

Unpredictable Generalization:- According to the OpenAI research team, the model performs really well on almost every dataset, but they have observed counterintuitive behavior when it is evaluated out of distribution.

GPT-2 is, according to its creators, the most advanced text generator built so far for language modeling and next-token prediction, but it also came to be known as “The AI that is too dangerous to release”. That description speaks to the potential of this NLP model: one possible application of GPT-2 is generating fake text or information, and it may soon be nearly impossible to tell whether a piece of text was written by a machine or by a human.

That may be why OpenAI did not release the full pre-trained model with 1.5 billion parameters, given the possible threat of misuse. GPT-2 can be considered one of the most capable text generation models created so far. Further advances are still needed, but seeing its potential, we can assume that we are getting close to an ideal predictive and generative text model.

Why GPT-2?

Why did we feel the need for GPT-2 when there was already GPT? Soon after GPT, Google released its own natural language processing model, BERT, which performed better than OpenAI’s GPT model. BERT was able to fill in words that had been blanked out in the middle of a sentence, which was a big achievement. Later, OpenAI responded with GPT-2, which kept essentially the same model as before and scaled it up with more compute, far more parameters, and about 40 gigabytes of internet text.

As a result, it performed phenomenally, going beyond BERT in text generation by producing a whole document from as little as a sentence, and sometimes even a single word. Another model from this period was ELMo, released by the Allen Institute for AI rather than Google, which, like the earlier work on “semi-supervised sequence learning”, learns from unlabeled text and achieved good accuracy. BERT, for its part, stands for Bidirectional Encoder Representations from Transformers; it achieved about 86.7% accuracy on the MultiNLI dataset, a 4.6 percentage point improvement over previous results. The success of Google’s model led the OpenAI team to rethink how natural language processing could be done.
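The difference between the two approaches can be sketched with a pair of toy functions. A BERT-style example hides one word and keeps the context on both sides, while a GPT-style example sees only the words before the target. The function names and the `[MASK]` placeholder string are assumptions for illustration, not either model's actual preprocessing.

```python
def masked_example(words, index, mask_token="[MASK]"):
    """BERT-style: hide one word and keep the context on both sides."""
    visible = words[:index] + [mask_token] + words[index + 1:]
    return visible, words[index]

def left_to_right_example(words, index):
    """GPT-style: predict a word from the words before it only."""
    return words[:index], words[index]

words = "the civil war ended in 1865".split()
print(masked_example(words, 3))
print(left_to_right_example(words, 3))
```

Seeing only the left context is what lets a GPT-style model keep writing past the end of a prompt, whereas filling a blank requires text on both sides.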

Conclusion

Undoubtedly, the Generative Pre-Trained Transformer is one of the most significant pieces of research in Natural Language Processing, and there is a strong chance of further substantial advances arriving sooner than predicted. With this research by the OpenAI team, fine-tuning has improved and models generalize better than before.

We hope to see new marvels in Natural Language Processing in the future. Without wanting to predict too much, in my opinion we are very close to the ideal results we hope for. For more blogs on analytics and new technologies, do read Analytics Steps.


