In this topic, we will discuss everyday applications of NLP. We will see how to use online translators, grammar checkers, Text2Image generators, and text generators. If you are already familiar with all of these, feel free to skip this topic.
The topic will introduce you to advanced search on the Internet and show you how to translate text into another language, use a grammar checker, and generate an image from text.
Searching on the web
Web search is partially an NLP task because the engine has to extract keywords from the query to find the best-fitting results, which involves word sense disambiguation, morphological parsing, and query understanding. It also includes other engineering tasks such as search engine indexing, relevance estimation, and ranking.
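To make the idea of keywords, indexing, and ranking more concrete, here is a minimal sketch in Python. The stop-word list, the three document titles, and the scoring rule (count the matching keywords) are assumptions made for illustration; real search engines add morphological analysis, disambiguation, and far more elaborate ranking.

```python
import re
from collections import defaultdict

# A tiny, hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "to", "is"}

def keywords(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]

def build_index(documents):
    """Inverted index: map each keyword to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in keywords(text):
            index[word].add(doc_id)
    return index

def search(query, index):
    """Rank documents by how many distinct query keywords they contain."""
    scores = defaultdict(int)
    for word in set(keywords(query)):
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken alphabetically by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Made-up document titles used only for this example.
documents = {
    "doc1": "Text simplification for language learners",
    "doc2": "A survey of neural machine translation",
    "doc3": "Neural text simplification with transformers",
}
index = build_index(documents)
results = search("neural text simplification", index)
print(results)  # ['doc3', 'doc1', 'doc2']
```

The document that contains all three query keywords comes first, and the one that shares only a single keyword comes last.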
Google Search and Bing Search are nothing new to you. But there are many good search applications beyond the basic Google search engine, and we don't mean Bing or Yahoo!
Let's start with Google Scholar, a search engine for scientific articles and books. The screenshot below shows the articles found on text simplification since 2021 (you can set up a date range in the left column):
Scirus was a similar service that focused on scientific literature; it was retired in 2014. It let you search by keywords, author name, or title and browse a list of articles and books in a specific field.
The Wayback Machine lets you search through deleted web pages. The saying "What happens on the Internet stays on the Internet" holds true thanks to the Wayback Machine. For example, if you want to read a deleted article on LiveJournal (a social network that was popular in the 2000s), you may find it in the Wayback Machine.
The Internet Archive, the site that hosts the Wayback Machine, is also an excellent library: it offers books in many languages. Its main specialization is so-called vintage books, though there are many 21st-century books too. Here is an example of a 17th-century book that you can find on this website:
Apart from web pages and vintage books, you can also find old movies, TV programs, and videos in the Internet Archive's Moving Image Archive section. Below is an example for the search input "The Great Train Robbery". We have successfully found a 1903 American silent film.
Wolfram|Alpha is another good search engine that can answer rather complicated questions. For example, we can ask which languages are spoken in Nepal, and it will quickly reply…
Mind that the percentages do not add up to 100%. If two languages, A and B, are spoken in one country, some people will know both of them, so A may be spoken by 51% of the population and B by 51% as well.
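The arithmetic behind this can be sketched with the inclusion-exclusion rule. The 51% figures are the illustrative ones from the paragraph above, and the assumption that everyone speaks at least one of the two languages is ours, not a real statistic.

```python
def share_speaking_both(share_a, share_b, share_either=100.0):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B| (all in percent)."""
    return share_a + share_b - share_either

# Assumption: 100% of the population speaks at least one of the two languages.
both = share_speaking_both(51.0, 51.0)
print(f"At least {both:.0f}% speak both languages")  # At least 2% speak both languages
```

If fewer than 100% speak either language, the overlap is even larger, which is why the shares of individual languages can sum to more than 100%.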
Suppose we query the total number of native speakers of Portuguese:
The result section shows the number of native speakers, but the table and the map below it show the total number of speakers. To change this, switch the upper option from "Total speakers" to "Native speakers". If you doubt whether this information is correct, click on the "Sources" icon just above the map.
In Wolfram|Alpha, we can compare different languages too. To do this, enter the languages separated by commas, for example: Czech, Italian, Sinhala.
The comparison goes all the way down and includes a language classification:
For more details on the Wolfram Language, check the official documentation.
Online translators
Modern online translators are based on neural-network models, typically transformers with an encoder and a decoder. The OPUS-MT models published by Helsinki-NLP are a good open example of such transformer models.
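If you want to try such a model yourself, the sketch below shows one possible way to do it with the Hugging Face `transformers` library. It assumes that `transformers`, `sentencepiece`, and a backend such as PyTorch are installed and that the `Helsinki-NLP/opus-mt-en-fr` checkpoint can be downloaded; the function name `translate_en_to_fr` is made up for this example.

```python
# Assumption: the English-to-French OPUS-MT checkpoint from the Hugging Face Hub.
MODEL_NAME = "Helsinki-NLP/opus-mt-en-fr"

def translate_en_to_fr(text):
    """Translate English text into French with a pre-trained OPUS-MT model."""
    # Imported lazily so that defining the function does not require the library.
    from transformers import pipeline

    translator = pipeline("translation", model=MODEL_NAME)
    # The pipeline returns a list with one dictionary per input text.
    return translator(text)[0]["translation_text"]

# Example call (downloads the model on first use):
# print(translate_en_to_fr("How often do you play tennis?"))
```

The commercial services discussed in this section run their own, much larger systems, so their output may differ from what this open model produces.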
There are plenty of online translators available now: Google Translate, PROMT, DeepL Translator, and others.
The most popular one is Google Translate. It offers translation of entire documents and websites, images, and audio. The same is available in DeepL Translator. PROMT offers a good translation application, but it's not free.
Let's see how different programs manage to translate a simple English sentence into a French one.
This is how Google translates:
PROMT:
DeepL:
Checking grammar
Grammar-checking applications are based on language models such as fast.ai's ULMFiT and Google's BERT, which are trained on a large corpus. A good corpus matters a lot because the model compares the words you write with the ones it has seen in the corpus.
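The idea of comparing what you write with a corpus can be illustrated with a much simpler technique than ULMFiT or BERT: a plain vocabulary lookup with fuzzy matching from Python's standard `difflib` module. The one-line corpus below is made up for this example, and the approach only catches misspelled words, not grammar.

```python
import difflib
import re

# A toy corpus; a real checker would rely on millions of sentences.
corpus = "how often do you play tennis she plays tennis every weekend they often play together"
vocabulary = sorted(set(corpus.split()))

def check_spelling(sentence):
    """Map each word missing from the vocabulary to its closest known word."""
    suggestions = {}
    for word in re.findall(r"[a-z']+", sentence.lower()):
        if word not in vocabulary:
            # get_close_matches returns at most n similar words, best first.
            suggestions[word] = difflib.get_close_matches(word, vocabulary, n=1)
    return suggestions

report = check_spelling("How ofen do you play tenis")
print(report)  # {'ofen': ['often'], 'tenis': ['tennis']}
```

A word that is correct but absent from the corpus would also be flagged, which is one reason the quality and size of the corpus matter.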
There are many open-source grammar, spelling, and style checkers.
The LanguageTool web service can be used via a web interface in a web browser, or via specialized client-side plug-ins for Microsoft Office, LibreOffice, Apache OpenOffice, Vim, Emacs, Firefox, Thunderbird, and Google Chrome.
LanguageTool does not check whether a sentence as a whole is grammatically correct; it only looks for some typical errors. Therefore, it's easy to invent ungrammatical sentences that LanguageTool will still accept.
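A toy example helps to see why a checker that only looks for typical errors lets some ungrammatical sentences through. The three rules below are hypothetical regular-expression patterns written for this sketch; they are not LanguageTool's actual rules, which are far more numerous and detailed.

```python
import re

# Hypothetical error patterns for illustration only.
RULES = [
    (re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE), "repeated word"),
    (re.compile(r"\ba\s+[aeiou]\w*", re.IGNORECASE), "a before vowel"),
    (re.compile(r"\b(he|she|it)\s+don't\b", re.IGNORECASE), "subject-verb agreement"),
]

def check(sentence):
    """Return the labels of all rules whose pattern occurs in the sentence."""
    return [label for pattern, label in RULES if pattern.search(sentence)]

flagged = check("She don't like the the movie")
missed = check("Tennis often you play how do")
print(flagged)  # ['repeated word', 'subject-verb agreement']
print(missed)   # []
```

The second sentence is clearly ungrammatical, yet no rule matches it, so this kind of checker accepts it.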
The following text was typed in the web browser, and LanguageTool correctly identified most of the errors:
Other options for grammar checkers include Grammar and Spelling checker by Ginger, Spell checker, and Grammar checker by Scribens.
Google also offers a grammar checker if you use Google Docs, Google Sheets, and so on. It works the same way as the Microsoft Office grammar checker in Word or Excel. Here it underlines the errors and shows which form is the correct one:
Text2Image generation
Many Text2Image models are based on encoder-decoder neural networks. First, the important information has to be extracted from the text; in a simple pipeline, keywords can be extracted with TF-IDF. The other important process here is image generation, but that is a whole other technique.
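Here is a minimal sketch of TF-IDF keyword extraction for a short prompt, written with the standard library only. The three background sentences are made up to give the inverse document frequency something to compare against, and the formula used is the basic one (term frequency times the logarithm of N divided by document frequency); libraries often apply smoothed variants.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf(prompt, background):
    """Score each prompt word by its term frequency in the prompt times
    log(N / document frequency) over the background texts plus the prompt."""
    docs = [tokenize(d) for d in background] + [tokenize(prompt)]
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    counts = Counter(docs[-1])
    total = len(docs[-1])
    return {
        word: (count / total) * math.log(n_docs / doc_freq[word])
        for word, count in counts.items()
    }

# Hypothetical background sentences; a real system would use a large corpus.
background = [
    "how do you do",
    "how often do you go to the cinema",
    "do you often play the piano",
]
scores = tf_idf("how often do you play tennis?", background)
top_keyword = max(scores, key=scores.get)
print(top_keyword)  # tennis
```

Words that occur in every text, such as "do" and "you", get a score of zero, while the rare word "tennis" stands out as the keyword.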
There are numerous models that draw a picture based on written text. One of them is Craiyon. Its main advantage is that it's free, unlike many similar models.
Let's see what Craiyon draws for the prompt "how often do you play tennis?"
You can see nine pictures that are related to playing tennis.
Similar models include PixelLab (available on Google Play and App Store), starryai (available on Google Play and App Store), DALL-E (available on Google Play and App Store), AI Library, and Text2Art.
Some models generate images without any text input; AI Artist is one of them.
Elai.io generates video from text input. A 14-day trial period is available; after that, you need to pay for the service.
Automatically generated text
On 8 September 2020, The Guardian published an article written by the neural network GPT-3, although a human editor manually picked the published fragments.
Nowadays, not only news but also scientific articles, legal texts, and novels can be generated with a neural network. Manual editing is still needed, though.
So, let's start with words. There are many models that generate words that do not exist in real life, for example, Word Generator. Go to the website and click the Generate Another New Word button to get a result.
The same website also includes:
Quote Generator, which generates "famous quotes" with a picture in the background (basically, an image with text inside);
Poetry Generator for poetry generation;
Anime Story Generator, which generates short anime stories rather than manga or anime films.
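To get a feeling for how made-up words can be produced, here is a minimal sketch of a character-level Markov chain. It is a deliberately simple stand-in, not the method used by the website above, and the training words are an arbitrary list chosen for this example.

```python
import random
from collections import defaultdict

def train(words, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"  # ^ marks the start, $ the end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, rng, order=2, max_len=12):
    """Sample one character at a time until the end marker or max_len."""
    context, out = "^" * order, ""
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        out += nxt
        context = context[1:] + nxt
    return out

# Arbitrary training words; more words give more varied results.
training_words = ["translate", "translator", "grammar", "generator",
                  "language", "sentence", "tennis"]
model = train(training_words)
rng = random.Random(0)  # fixed seed so that runs are repeatable
new_words = [generate(model, rng) for _ in range(5)]
print(new_words)
```

With such a small training list, the chain sometimes reproduces a real word; neural text generators follow the same basic idea of predicting the next symbol from the preceding context, only with far richer models.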
Other applications for text generation:
Text generator for short articles;
Free AI Writer and Text Generator composes interesting stories in more than 50 languages, including less common ones such as Malayalam and Latvian. You can change various settings to control the output. All you need to do is enter a title, and the article will be ready in a second;
Premium Text Generator is an application available on Google Play and App Store. It generates text for short messages in social networks, so if you are too lazy to reply to messages yourself, this is for you;
OpenAI's GPT-2 Text Generation is a pre-trained language model available for your smartphone. You can install this application from Google Play and App Store.
Conclusion
In this topic, we have discussed where NLP can be found in real life: web search, online translators, grammar checkers, Text2Image generators, and text generators. In the future, we hope you will learn how to train your own models for machine translation, Text2Image, and text generation.