NLP in Action: Natural Language Generation
An overview of the general process, importance, and popular models of NLG.
This is the third issue of our series, NLP in Action. This series highlights the good and the bad of common methods of natural language processing. In doing so, we hope to spark conversation and curiosity in the world of NLP.
“If the architecture is any good, a person who looks and listens will feel its good effects without noticing.” - Carlo Scarpa
Although the twentieth-century Italian architect said this of physical architecture, the notion extends well beyond that domain. How often, when composing an email or writing a paper, does one half-consciously hit a key to accept a blurb of uncannily correct suggested text? At a basic level, in typing “How are you,” one rarely finishes typing ‘are’ before the suggestion ‘you’ appears. This often-unnoticed feature is a perfect example of a good architecture whose effects are felt without being noticed.
Natural Language Generation (NLG), a subfield of NLP, is the technology behind this suggested text.
What is Natural Language Generation?
Natural language generation, or NLG, is the software process of producing meaningful phrases in natural language. NLG uses content determination, planning, and realization to produce text based on the text that precedes and/or surrounds it. In the case of an email client that offers autocomplete suggestions, the NLG model works to ‘predict’ the most likely next word given what you have typed so far (though how many words back a particular model looks when making a prediction varies from model to model). As one might expect, one of the most essential components of NLG is interpreting, or understanding, what the previous text is about. This task is driven by NLG’s cousin, Natural Language Understanding (NLU).
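To make the ‘predict the next word’ idea concrete, here is a minimal sketch using a toy bigram model: for each word, it simply counts which word most often follows it in a small corpus. This is an illustration of the prediction idea only, not the architecture behind any real autocomplete system (which uses far more context and far larger models); the function names are our own.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, how often each other word follows it."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following

def suggest(model, prev_word):
    """Return the word most frequently observed after prev_word."""
    candidates = model[prev_word.lower()]
    return candidates.most_common(1)[0][0] if candidates else None

corpus = [
    "how are you",
    "how are things",
    "how are you doing",
    "where are you",
]
model = train_bigram(corpus)
print(suggest(model, "are"))  # 'you' follows 'are' most often in this corpus
```

Real systems replace these raw counts with learned probabilities over much longer contexts, but the core question is the same: given what came before, what word comes next?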
What is Natural Language Understanding?
Natural language understanding is just what it sounds like: the interpretation of text. Because NLG cannot simply produce words at random and expect to produce coherent text, NLU serves to translate the linguistic characteristics and properties of previous words into a context that NLG can work with. These properties can take many forms, including semantic tags, logical properties (e.g., if-then reasoning, ‘not’ operators), urgency, and topic type, among others. For more on NLU, refer to this article by BMC (for a brief technical version, refer to the Research Gate article).
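As a rough illustration of the kinds of properties mentioned above, the sketch below extracts a few of them with simple hand-written rules. Real NLU systems learn these signals statistically rather than matching keywords; the rules and function name here are illustrative only.

```python
import re

def analyze(text):
    """Toy NLU: extract a few linguistic properties that a generator
    could condition on. The rules are illustrative, not exhaustive."""
    lowered = text.lower()
    return {
        # 'not' operators and other negation words
        "negated": bool(re.search(r"\b(not|never|no)\b", lowered)),
        # is the text asking something?
        "question": text.strip().endswith("?"),
        # a crude urgency signal
        "urgent": any(w in lowered for w in ("asap", "urgent", "immediately")),
    }

print(analyze("Please do not deploy until the tests pass, ASAP!"))
```

A generator conditioned on these properties could, for instance, avoid suggesting a cheerful sign-off in an urgent, negated message.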
How does NLG technically work?
Although the specific architecture varies from model to model, NLG consists of several crucial stages that define any general design: text planning, sentence planning, and realization.
Text Planning - This is essentially where the aforementioned NLU takes place: the general context is formulated and the ordering of the generated text is determined. This stage can be broken into Content Determination and Document Structuring.
Sentence Planning - Using what has been tokenized in the previous stage, sentences, text flow, and punctuation are all determined. Additional lexical decisions are made for clarity (such as substituting pronouns and adding conjunctions).
Realization - Almost serving as a final edit, grammatical accuracy is considered and regulated, such as proper tense (e.g., “ran” instead of “runned”).
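The three stages above can be sketched as a toy pipeline. The stage boundaries and rules here are deliberate simplifications of what real systems do, and every function name is our own invention; the realization stage repairs exactly the kind of tense error mentioned above (“runned” to “ran”).

```python
def text_planning(facts):
    """Decide what to say and in what order (content determination
    and structuring), here by a simple priority sort."""
    ordered = sorted(facts, key=lambda f: f["priority"])
    return [f["content"] for f in ordered]

def sentence_planning(messages):
    """Group messages into a sentence and add a conjunction."""
    return " and ".join(messages)

def realization(draft):
    """Final grammatical edit, e.g. repairing irregular past tenses."""
    irregular = {"runned": "ran", "goed": "went"}
    words = [irregular.get(w, w) for w in draft.split()]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."

facts = [
    {"content": "she runned a marathon", "priority": 1},
    {"content": "she goed home", "priority": 2},
]
print(realization(sentence_planning(text_planning(facts))))
# She ran a marathon and she went home.
```

In a real system each stage is far richer (aggregation, referring-expression generation, full morphology), but the division of labor is the same: decide what to say, shape it into sentences, then polish the grammar.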
Beyond this stage, the technical details of NLG vary from model to model and require extensive explanation. As such, this is the end of our ‘under the hood’ evaluation of NLG processing. To highlight the overall process a little more, though, this Python notebook provides coded examples of NLG in action, demonstrating both the input text accounted for in Text Planning and the final produced text.
As an aside, the two pre-trained systems selected as demonstrations in our notebook, BERT and GPT-3, are highly popular models in the research community (BERT was originally developed by Google for Google Search). Both BERT and GPT-3 are built on the Transformer architecture, which processes all positions in a sequence in parallel, making it practical to train on much larger datasets; it is currently the most widely used architecture in NLG.
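The core operation shared by BERT and GPT-3 is scaled dot-product attention, in which every position attends to every other position at once. A minimal NumPy sketch of that single operation (omitting the learned projections, multiple heads, and everything else that makes a full Transformer):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    Each row of the output mixes all value vectors, weighted by
    how strongly that position attends to every other position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # 4 token positions, 8-dim embeddings
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8): one contextualized vector per position
```

Because every pairwise score is computed in one matrix multiplication, there is no sequential bottleneck as in recurrent networks, which is what enables training on the enormous corpora behind models like GPT-3.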
What makes NLG so important?
A fair question… after all, as convenient as it is to save time that could have been spent typing “See you then!” in response to an email invitation, personal autocomplete bots hardly warrant in-depth research, much less revolutionary applications in society. However, these autocompletions merely scratch the surface of what NLG has the potential to do.
In the business world alone, NLG offers a radically different approach to analytic interpretation. Compared to traditional presentations of data (e.g., bar graphs and pie charts), NLG enhances how humans can interact with the plethora of data available. Instead of relying on graphics to display trends and present themes, textual generation offers the option of written reports conveying the same information, often more clearly and coherently than complex visualizations.
Additionally, with the current capabilities of NLG, we are beginning to see the first semi-capable chatbots, ones that might soon hold conversations indistinguishable from those with a human. Applications range from industry to academia, from customer service agents to teaching aides. Not only does this bring Hollywood levels of sci-fi to the real world, it expands our understanding of what linguistic communication truly is.
Too good to be true? A word of caution on hype and limitations.
As with any technology, though, the excitement around initial findings and breakthroughs can be overwhelming. NLG is in its infancy in both technical capability and adoption. Although it saw its first commercial use in the 1990s, only in recent years have companies like OpenAI and Arria devoted themselves to accelerating its growth. OpenAI, creator of the front-running GPT-3 model, provides its services to over 300 applications. Arria, a pioneer in the field, holds over 40 NLG patents and serves clients across the retail, energy, financial, media, and weather sectors. NLG has grabbed the attention of firms across industries, and many see its potential.
Despite its recent advancements, NLG is undoubtedly an incremental innovation rather than a disruptive one. Gartner, a credible IT research firm, has placed NLG at the peak of inflated expectations on its hype cycle. NLG is slowly being adopted where it works best: generating text from structured data for financial reports, weather forecasts, and sports recaps. Surprisingly, NLG is also making strides in areas previously thought to require human authorship: op-ed articles and movie scripts.
Although NLG is capable of generating text, it is hard to imagine a machine producing the emotion and originality that live between the lines. Models also struggle to generate long sequences of text without direction, as exemplified in the Google Colab notebook. And if biases are present in the training data, NLG models will propagate them. These are a few of the many challenges NLG must overcome to become more widely adopted.
If past emerging technologies have taught us two things, they are how quickly such technologies advance and how quickly they are adopted. NLG undoubtedly has the technological power to change how anything is written. At least for now, NLG could not have written this article. But it may soon do more than just finish common phrases and generate text for structured articles.