NLG drives consistent narratives

By Ehud Reiter | August 4, 2021

One of the advantages of using NLG to automate the production of narratives is that the language is consistent. When people write narratives, they write them in different ways depending on their style and preferences, which can confuse readers; this doesn’t happen with NLG.

To give a concrete example, many years ago I was working on an NLG system to generate weather forecasts for offshore oil rigs. We did some exploratory work with rig workers who routinely read and used weather forecasts. At the time, forecasts were written by human forecasters, on a rota system (e.g., Jim wrote the forecasts on Monday, Tom wrote them on Tuesday, and Sarah wrote them on Wednesday). The rig workers we talked to complained that each forecaster wrote forecasts differently. For example:

• Sarah was more pessimistic than Tom. If a day had a mixture of good and bad weather, Sarah would focus on the bad weather and Tom would focus on the good weather.

• Tom used “by evening” to mean midnight, while Jim used “by evening” to mean 6 PM.

• Jim’s forecasts were shorter and higher-level than Sarah’s.

There is nothing wrong with any of the above styles! But the rig workers told us that the inconsistency was really annoying. If bad weather was briefly mentioned, they didn’t know how serious the bad weather would be without knowing whether Sarah or Tom wrote the report. If the report said something would happen “by evening,” they had to check whether Tom or Jim wrote the report.

When we introduced our NLG weather report generator, all of these problems went away. The reports were all mildly pessimistic, used words in a consistent way, and so forth. The guys on the rigs who read the forecasts no longer had the hassle of interpreting reports in different ways depending on who wrote them, which they really appreciated.

I’ve seen similar scenarios in many other areas, including medical, engineering, and financial narratives. People are very different, which means that they write narratives in different ways, and this can cause problems for readers. One of my colleagues once investigated descriptions of brain x-rays, and found that the same brain tumor would be described as “small” by one doctor and “large” by another, which was very confusing to the people who read the descriptions!

Of course, it is important that the language used in narratives is varied. Narratives become boring if they are too similar! But the language must not vary in a way that confuses readers. While it’s hard to avoid unsafe or confusing variations in manually written narratives when these are written by more than one person, we can program NLG systems to vary language only in “safe” ways in their narratives.
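To make the idea of “safe” variation concrete, here is a minimal Python sketch. It is not the weather system described above; the phrase table, the synonym lists, and the function names (`deadline_phrase`, `describe_wind`) are all hypothetical and chosen only for illustration. The sketch assumes two rules: each time phrase is reserved for exactly one clock hour, so “by evening” can never mean both 6 PM and midnight, and wording is varied only by choosing among verbs that are treated as interchangeable.

```python
import random

# Assumed convention (hypothetical): each phrase always denotes one clock hour.
DEADLINE_PHRASES = {
    6: "by early morning",
    12: "by midday",
    18: "by evening",
    24: "by midnight",
}

# Assumed synonym sets (hypothetical): verbs within a list are treated as
# interchangeable, so picking any of them does not change the meaning.
SAFE_VERBS = {
    "increase": ["increasing", "rising", "strengthening"],
    "decrease": ["decreasing", "easing", "falling"],
}


def deadline_phrase(hour):
    """Return the single phrase reserved for this hour (24-hour clock)."""
    if hour not in DEADLINE_PHRASES:
        raise ValueError(f"no phrase is defined for hour {hour}")
    return DEADLINE_PHRASES[hour]


def describe_wind(start_knots, end_knots, hour, rng):
    """Describe a wind change; only the verb varies between calls."""
    if end_knots == start_knots:
        return f"Wind steady at {end_knots} knots."
    direction = "increase" if end_knots > start_knots else "decrease"
    verb = rng.choice(SAFE_VERBS[direction])  # safe variation: synonyms only
    return f"Wind {verb} to {end_knots} knots {deadline_phrase(hour)}."


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(describe_wind(10, 25, 18, rng))
```

Repeated calls may pick a different verb, which keeps the text from feeling repetitive, but the numbers and the meaning of the time phrase stay fixed no matter which “writer” (here, which random choice) produced the sentence.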
