A vocabulary of thirty thousand words is enough for a program to transcribe an English voice recording, while for transcribing Czech we have half a million words in the database, says Petr Herian, head and owner of Newton Media, in an interview for Reportér magazine.
Our interview today will be published on the website and as a podcast. Which technologies will “bite” into it to monitor it?
First, there will be tracking technologies that detect when new content appears on the web. Speech recognition technologies will also be involved: what we talk about will be transcribed into text so that the transcript can be searched.
How long is the content of such a conversation kept? And where exactly is it stored?
It is in our data centre and is stored indefinitely. The oldest materials we have are twenty-five years old. We store data in all its forms, so for an interview, it is an audio recording and a text transcript.
Where is the data physically located?
It’s in Prague, in a rented data centre, where all our hardware is located.
How many sources does Newton monitor today? To start with, say, in the Czech Republic…
In the Czech Republic, we archive around forty thousand articles a day, from approximately ten thousand sources.
How many are there in total, including sources worldwide?
We focus primarily on monitoring and archiving articles from the countries where we operate as a group, that is, Central and Eastern Europe. In these countries, we aim to have the full text of the articles. Elsewhere, we work with similar organizations around the world, so in total you could say there are hundreds of thousands of sources a day.
How does monitoring work in practice? I imagine algorithms going through all those sources at a certain rhythm.
The basis is the same as twenty-five years ago; even then we decided to work with publishers. The core of monitoring is therefore that every day we store everything the media houses have published. On the Internet, we use technology that checks websites for updates as information is added to them. For social networks, we cooperate with external partners, and we have also developed our own technology for this purpose.
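Editor's sketch: the interview does not say how Newton Media detects updated websites. One common way to notice that a page has changed is to store a fingerprint (hash) of its text and compare it on the next visit. The minimal Python sketch below illustrates that general idea only; the function names `content_fingerprint` and `detect_updates` and the example URLs are hypothetical, and fetching the pages is left out.

```python
import hashlib


def content_fingerprint(page_text: str) -> str:
    """Return a SHA-256 fingerprint of the page text, ignoring whitespace differences."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_updates(seen: dict[str, str], current_pages: dict[str, str]) -> list[str]:
    """Return URLs whose text is new or changed, and remember the new fingerprints.

    `seen` maps URL -> last known fingerprint and is updated in place.
    `current_pages` maps URL -> text fetched on this pass (fetching is not shown).
    """
    changed = []
    for url, text in current_pages.items():
        fingerprint = content_fingerprint(text)
        if seen.get(url) != fingerprint:
            changed.append(url)
            seen[url] = fingerprint
    return changed


# Hypothetical usage: two passes over the same made-up site.
seen_fingerprints: dict[str, str] = {}
first_pass = detect_updates(seen_fingerprints, {"example.cz/news": "Old headline"})
second_pass = detect_updates(seen_fingerprints, {"example.cz/news": "New headline"})
```

In this sketch a page is reported on the first pass (it is new) and again on the second pass (its text changed); an unchanged page would not be reported.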
We record everything
Do you monitor only keywords or areas of interest according to client requirements, or do you also do some general monitoring, say of all Bohemica (Czech-related items) in world sources?
We monitor everything, even if no one asks. Our databases hold the full text of every article from every newspaper that has been published. The same goes for news sites and other websites. We can then search for keywords or specific people, or, for example, search analytically for the information the client is interested in.
This means that if, for example, I want to look at what was written about Slavia Prague eight years ago, I should find it there…
Yes. I looked up the oldest article in our archive: it is from Informační servis, later Respekt magazine, dated November 20, 1989. Since 1990, we have articles from Rudé právo and Hornopočernický zpravodaj. That is because Horní Počernice once commissioned the processing of its local newsletter's archive. Since 1995, when our company was established, we have had a complete archive of all newspapers as well as transcripts of radio and television news programs. Anyone can see for themselves: a few weeks ago we launched access to the archive that is free for seven days, so anyone can browse twenty-five years of articles.
You talked about transcripts of TV and radio shows. Do you also save their video or audio versions?
Yes, in full length. That was not the case in the past, when disk capacity was smaller.
The media field is constantly evolving. Even on the small Czech market alone, various media appear and disappear. How do you track which new sites appear, and how do you decide whether to include them in your archive?
In the case of print, we try to negotiate with the publisher. For websites, we try to monitor what is new, and if there is some news content, we want to include it in our monitoring.
Does viewership, for example, play a role in inclusion in monitoring? Do you also deal with the reliability of the information provided?
We monitor everything where possible. For clients, we then rate sources according to how reliable their information is likely to be.
So far, we have talked about the media, but even more confusing is the ocean of information on social networks. What does monitoring of Facebook, Twitter, Instagram, and similar look like?
We are not able to monitor the complete content of social networks. We have some technology of our own for basic monitoring, we work with companies that specialize in local markets, and we have a system designed for global markets. It is a relatively complicated set of steps and procedures.
Let us say I want to know how Reportér stands on social networks at the moment. What can you find out?
We can find mentions on social networks, what impact they had, and how many likes they received. We can also follow the discussions under articles. It depends on how broad you want the analysis to be.
What are your clients most interested in on social networks?
They are interested in potential criticism, the sentiment around the brand, and how successful their activities are.
Slavic languages are difficult
As a layman, I can still imagine how millions of texts are scanned for certain words. But how does it work with images and sound? What are you looking for?
In the case of images, we most often look for specific people, products, or logos. Monitoring static images is still fairly easy, but video is complicated. It consists of a huge number of individual frames, so it is more for clients who can afford the cost of processing such a large amount of data.
How does sound monitoring work? Do any of your machines play radio shows or podcasts? Do they work directly with sound or transcripts when searching?
The machine first automatically transcribes the audio into text, and then editors check and complete the transcript. The second option, which we use for news monitoring, for example, is purely automatic transcription, where our technology can reach 98 to 99 percent accuracy. It learns new terms every day, so it keeps getting better.
What system do machines use to learn new words?
We have two models. An acoustic model, which learns to recognize words across all nineteen languages we work with, regardless of which language they come from. And a language model, which always works for one specific language. By drawing on words from the huge text databases in our archives, it can achieve incredibly good results even in difficult Slavic languages.
How many words in Czech does your system know?
We have about four million words in the database, but for speech recognition we use a dictionary of about half a million words so that the computer systems can cope with it at all. In English, for example, some thirty thousand words are enough for a word recognition rate of 98 percent. In Czech, with all our prefixes, suffixes, endings, and the like, about three hundred to four hundred thousand words are needed for the same rate. We have long been cooperating with the team of Professor Jan Nouza from the Technical University of Liberec on the development of voice recognition technologies, and it is a great cooperation.
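Editor's sketch: the comparison between English and Czech comes down to vocabulary coverage, meaning the share of running words in a text that fall within the N most frequent word forms. Because Czech words appear in many inflected forms, more distinct forms are needed to reach the same coverage. The Python sketch below shows how such a figure could be computed on any tokenized text; it is an illustration under simple assumptions (whitespace-separated tokens, coverage measured on the same text the counts come from), not Newton Media's method. The function names and the toy sentence are hypothetical.

```python
from collections import Counter


def coverage(tokens: list[str], vocab_size: int) -> float:
    """Share of tokens covered by the `vocab_size` most frequent word forms."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(vocab_size))
    return covered / len(tokens)


def min_vocab_for(tokens: list[str], target: float) -> int:
    """Smallest number of most-frequent word forms whose coverage reaches `target`."""
    if not tokens:
        return 0
    covered = 0
    for size, (_, count) in enumerate(Counter(tokens).most_common(), start=1):
        covered += count
        if covered / len(tokens) >= target:
            return size
    return len(set(tokens))


# Hypothetical toy text; a real measurement would use a large corpus.
toy_tokens = "the dog saw the cat and the cat saw the dog".split()
needed = min_vocab_for(toy_tokens, 0.98)
```

Run on large English and Czech corpora, `min_vocab_for(tokens, 0.98)` would be expected to return a much larger number for Czech, which is the effect described in the answer above.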
Where is your voice transcription technology used today? Could it be used, for example, to record court proceedings?
We have recently started a pilot project in courtrooms, where we can distinguish and record the speech of judges, the defense, the prosecution, witnesses, and experts. We can thus do a lot to make the work of court reporters more efficient. This is a great help, especially for judges, who get a sufficiently solid, albeit not exact, record of what was said in the courtroom within a few hours of the end of the hearing.
But doesn’t it seem that court reporters will lose their jobs and soon be replaced by boxes with voice transcription technology?
Definitely not. It helps court reporters a lot, but when someone becomes very emotional in court, or when the audio signal is weak, the technology cannot handle it one hundred percent.
If you are installing transcription technology in a court, do you need any security vetting? I assume such things must fall under some special regime…
In this case, the system is managed directly by the Ministry of Justice, which oversees everything. Our technologies are also used for transcripts of meetings of the Czech National Bank (CNB) Bank Board, the content of which is secret by law for seven years. Naturally, the setup reflects this: no one from our company can access the content of those recordings.
Machines cannot replace people
Let’s get to your business. You offer a number of services based on the data obtained. What are customers most interested in today? What is the “best deal”?
The best deal, if you call it that, is still classic monitoring. But we see a huge demand for quick analyses: what appeared where, why it happened, what the impact is… And all this across both traditional media and social networks. In addition, we can offer the work of our experts, who can analyse and interpret in detail what lies behind those media outputs and responses.
I assume machines are not enough and you need real people to interpret the information obtained through monitoring. How many such people do you employ?
In the Czech Republic it is about a hundred people; across the whole group there are about three hundred.
The results for 2019 showed that Newton Media earned about 160 million and achieved a profit of seven million. How did the numbers develop in the covid year 2020? Has the pandemic threatened your business?
The effects of the coronavirus were, of course, evident in the demand for our services from clients in the most affected fields, such as tourism and gastronomy. But there is also a long-term trend in which the value of unique information decreases over time, because it has become technologically easier to obtain. That is why companies like ours must always think about how to bring customers the best possible added value.
When you started in the field a quarter of a century ago, Newton was a kind of “smarter clipping service.” Today you also analyse images and speech. Where will the development go next?
There will probably be no revolutionary change in the near future. We will strive to improve the technology so that we can make the best use of the machine analysis of the information we have at our disposal. And add qualified analytical work to that.
Isn’t there a danger that most activities in your field will move to machine intelligence and then no people will be needed anymore?
I think not. If we want to provide maximum benefit to clients, technology cannot, at least in the near future, replace human labour.
Let’s try to look at the matter from the other side. In a time of information overload, does it still make sense for your potential customers to track what was written or broadcast where?
Today, clients look at social networks and websites through summaries presented in graphical form. If they had to examine every mention that appears anywhere, they would not be able to keep up, and it would be of no use to them. Thanks to the experience and technology we have at our disposal, we can pick out what is essential. Ten years ago, our clients wanted us to find everything that was said about them. Today, they want only the most important things: to capture a certain trend and get a summary.
There will be elections this year. Are there political entities among your clients?
Yes, they include political parties and government organizations, as well as media and PR agencies. In general, it can be said that most large organizations in the Czech Republic are among our customers.
Do you monitor the political sentiment in the country for yourself?
I don’t do it personally, nor does Newton create such analyses for itself. But if it is interesting for some media or some of our customers, then we can provide such an analysis.