Author: Dr. Robert Dale
Language is not the only way to deliver information. It just so happens that it's a particularly good way to deliver information — it's a tool we've been honing for thousands of years, after all — but there are other ways we can deliver information, most obviously via graphical presentations of one sort or another. There are times when a well-chosen picture is indeed worth a thousand words. For example, a simple line graph may well be the best way to communicate the increase in CO2 in the atmosphere over the centuries. The key requirement is to use what works best: choose the best information delivery device that we have available for the task at hand.
But how do we determine which way of delivering information is best? To answer this question we need to turn to science for advice. In particular, we have to add the psychology of perception to our shopping basket of sciences, along with whatever the nascent field of visualisation might have to offer. There's still significant research to be done here: in a recent survey, we unearthed nearly 100 academic papers that had something to say about choosing between text and graphics for different kinds of information, but almost all of that work is so tied to specific contexts of use that you'd have to say we are a long way from robust scientific support for specific media choices. So when we build a multimodal NLG application (one that uses both text and graphics), we think very carefully about what information to convey in each modality.
But let's suppose, based on the science, you have established a principled way for your technology to make a decision about how best to present a piece of information. Are we done yet? Far from it. As already implied above, the science of visualisation is in its very early stages, so there's a lot still to be discovered there about what works, what doesn't, and why. But even if you've decided on using language to deliver that information, there are still other considerations you need to bring to bear in determining how best to do that. Linguistics, it turns out, only gets you so far: it provides some reassurance that you'll be communicating the information in a way that is consistent with how people understand language. And findings from psycholinguistics will help you avoid building sentences that are technically correct but hard to understand, or narrative flows that are incoherent, although the scientific community's understanding here remains limited — readability measures, for example, mostly remain in the dark ages of counting words and syllables, proxies for the rather harder-to-get-at notions of syntactic and semantic complexity. More importantly, linguistics and psycholinguistics so far don't have a great deal to tell us about the shady grey area where text and visual representation merge, which we might refer to as being about 'visible language': that covers, for example, what point size you use in your text for maximum effectiveness, how long your lines are for maximum ease of reading, and whether or not, just this once, you can use Comic Sans to get the point across without your audience smirking.
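To make the point about readability measures concrete, here is a minimal sketch of one classic word-and-syllable formula, Flesch Reading Ease. The function names and the vowel-group syllable counter are illustrative assumptions, and the counter is deliberately crude; this is not something Arria's technology is claimed to use.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption): count runs of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores are meant to indicate easier text."""
    # Naive sentence and word splitting, good enough for a sketch.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# A notoriously hard-to-parse sentence and a scrambled version of the same words.
garden_path = "The horse raced past the barn fell."
scrambled = "Fell barn the past raced horse the."

print(flesch_reading_ease(garden_path))
print(flesch_reading_ease(scrambled))
```

Both calls print the same score, because the formula sees only counts of sentences, words, and syllables. Word order, and with it syntactic and semantic complexity, is invisible to a measure of this kind.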
Whatever you call this new science we are trying to create, the key point is that it is all about delivering information effectively. You can't just pick and choose what parts of the scientific basis you pay attention to: all of these elements need to be brought together. Some are well developed, others just beginning, and there are dimensions that have yet to be properly identified. Together, they form what we call the new science of information delivery.
What's the relevance of all of this to us here at Arria NLG? Clearly, we want our technology to generate texts that are as good as they possibly can be. They should be fluent, articulate, concise, coherent, and clear. With decades of R&D baked into our solutions, we believe we already do pretty well on these fronts, but we see a future where we will have scientifically well-founded measures of these characteristics. At Arria NLG, we're pushing forward to make this new science a reality, resulting in evidence-based language generation techniques that really do provide the right information to the right people at the right time, and in the right way.
Author: Dr. Robert Dale, Chief Technology Officer and Chief Strategy Scientist at Arria NLG. Dr. Dale is recognized as one of the world’s foremost experts in Natural Language Generation (NLG) research and development, having authored or edited seven books and 160 papers on computational linguistics. He was a Professor in the Department of Computing and Director of the Centre for Language Technology at Macquarie University. He co-authored the seminal textbook “Building Natural Language Generation Systems” with Arria NLG Chief Scientist and co-founder Prof. Ehud Reiter.