WEBVTT
0:00:00.000 --> 0:00:10.115
It is not that easy to say this is a good translation
and this is a bad translation.
0:00:10.115 --> 0:00:12.947
How can we evaluate that?
0:00:13.413 --> 0:00:26.083
We will put an emphasis on neural machine translation,
because that is currently the state of the
0:00:26.083 --> 0:00:26.787
art.
0:00:28.028 --> 0:00:35.120
But we are not focused on the details of neural
networks; we are describing the basic
0:00:35.120 --> 0:00:39.095
ideas and how to use them for machine translation.
0:00:39.095 --> 0:00:41.979
This is not a neural network course.
0:00:42.242 --> 0:00:49.574
If you have some background in neural networks,
that is of course an advantage, but it should
0:00:49.574 --> 0:00:51.134
not be a problem if you don't.
0:00:51.134 --> 0:00:58.076
If you have not seen the details, we'll briefly
cover the background and the main ideas.
0:00:58.076 --> 0:01:00.338
How can we use them for
0:01:00.280 --> 0:01:06.880
machine translation? We will start the first
two or three lectures with some more traditional
0:01:06.880 --> 0:01:12.740
approaches and how they work, because they still
give some good intuition, some good ideas.
0:01:12.872 --> 0:01:17.141
And they help us to understand where our systems
might be better.
0:01:17.657 --> 0:01:22.942
And we will put an emphasis on what we really
need to do to build a strong system.
0:01:23.343 --> 0:01:35.534 | |
And then we have a part on practical experience, where
it's about how to build the systems and how
0:01:35.534 --> 0:01:37.335
to apply them.
0:01:39.799 --> 0:01:47.774 | |
As additional reading material, we have
the slides on the website.
0:01:47.774 --> 0:01:55.305
There are also links to papers which cover
the topic of the lecture,
0:01:55.235 --> 0:01:58.436
if you'd like to study additional books.
0:01:59.559 --> 0:02:07.158 | |
I think the most relevant is the "Machine Translation"
book by Philipp Koehn, which gives an introduction
0:02:07.158 --> 0:02:09.210
to machine translation.
0:02:09.210 --> 0:02:15.897
But this lecture is, of course, not a one-to-one
match; we don't go through the book, but
0:02:15.897 --> 0:02:17.873
it covers related topics.
0:02:18.678 --> 0:02:25.094 | |
There is a previous version of it, "Statistical
Machine Translation", focusing on that part,
0:02:25.094 --> 0:02:28.717
and we cover some of that part rather than
all.
0:02:28.717 --> 0:02:35.510
If you want more basics about natural
language processing, this might be helpful.
0:02:39.099 --> 0:02:53.738 | |
In addition, there is an online course on
machine translation, which we also developed here
0:02:53.738 --> 0:02:57.521
and which is available
0:02:57.377 --> 0:03:04.894
online. You're, of course, free to use
that; it might give you some other type of presentation
0:03:04.894 --> 0:03:07.141
of the lecture. What is important:
0:03:07.141 --> 0:03:14.193
it's, of course, a lot shorter and doesn't
cover all the topics which we're covering
0:03:14.193 --> 0:03:15.432
in the lecture.
0:03:15.655 --> 0:03:19.407 | |
So, of course, for the exam everything which
was in the lecture is important.
0:03:19.679 --> 0:03:25.012
The course covers roughly the first half, I don't
know exactly, the first few lectures.
0:03:26.026 --> 0:03:28.554 | |
Feel free to have a look at that.
0:03:28.554 --> 0:03:29.596
It's shorter.
0:03:29.596 --> 0:03:36.438 | |
Maybe some of you are interested in having
very short videos, or after the lecture there is a single
0:03:36.438 --> 0:03:39.934
topic you didn't understand and want to repeat.
0:03:40.260 --> 0:03:50.504
Then this might be helpful, but it's important
to know that there is more content in the lecture.
0:03:53.753 --> 0:04:02.859 | |
The exam will be an oral exam;
just make an appointment with us.
0:04:05.305 --> 0:04:09.735 | |
If you think this is a really cool topic and
want to hear more:
0:04:09.735 --> 0:04:14.747
there are two seminars, one on advanced topics
in machine translation,
0:04:15.855 --> 0:04:24.347
which is every Thursday, and there is one which
was already on Monday.
0:04:24.347 --> 0:04:34.295
But if you're interested in speech translation,
feel free to contact us.
0:04:34.734 --> 0:04:47.066 | |
Then there are other lectures: one on machine learning
by Professor Waibel, and of course some of you
0:04:47.066 --> 0:04:48.942
have already attended the
0:04:48.888 --> 0:04:55.496
lecture which is related but covers
more general natural language processing; it
0:04:55.496 --> 0:04:57.530
will be available again in the
0:04:57.597 --> 0:05:07.108
winter semester, while we concentrate
on the task of machine translation.
0:05:11.191 --> 0:05:14.630 | |
Yeah, and also there's an automatic speech
recognition lecture.
0:05:16.616 --> 0:05:27.150 | |
And this is a bit of what we are planning to
talk about this semester.
0:05:27.150 --> 0:05:30.859
Today we have a general overview.
0:05:31.371 --> 0:05:37.362
Then on Thursday we are doing a bit of a different
lecture, and that's about linguistic backgrounds.
0:05:37.717 --> 0:05:42.475 | |
It may be quite different from what you, as
mostly computer scientists, have done so far,
0:05:42.475 --> 0:05:43.354
but don't worry.
0:05:43.763 --> 0:05:49.051
We're covering it in a very basic way, and I
think it's important if you're dealing with
0:05:49.051 --> 0:05:53.663
natural language to have a bit of an understanding
of what language is.
0:05:53.663 --> 0:05:59.320
Maybe you've learned about that in high school,
but for you this was, I guess, some years ago.
0:05:59.619 --> 0:06:07.381 | |
And so it helps to better understand
what the challenges are.
0:06:07.307 --> 0:06:16.866
And especially since we are mostly dealing with
our mother tongue or with English, there
0:06:16.866 --> 0:06:25.270
are a lot of interesting phenomena which do
not occur in these two languages.
0:06:25.625 --> 0:06:30.663 | |
And therefore we'll also look a bit into
things which might happen in other languages.
0:06:30.930 --> 0:06:35.907
If we want to build machine translation, of
course we want to build machine translation
0:06:35.907 --> 0:06:36.472
for many languages.
0:06:38.178 --> 0:06:46.989 | |
Then we will look at these machine learning
based approaches, how to get the data and process the data,
0:06:46.989 --> 0:06:47.999
next week.
0:06:48.208 --> 0:07:03.500 | |
And then we'll have one lecture about statistical
machine translation, which was the approach
0:07:03.500 --> 0:07:06.428
for twenty years.
0:07:07.487 --> 0:07:17.308
And then, maybe surprisingly early, we'll
talk about evaluation, and this is because evaluation
0:07:17.308 --> 0:07:24.424
is really essential for machine translation
and it's very challenging.
0:07:24.804 --> 0:07:28.840 | |
To decide if machine translation output is
good or bad is really challenging.
0:07:29.349 --> 0:07:38.563 | |
For a human, to see a translation and decide
whether it is good is not as difficult; but even for
0:07:38.563 --> 0:07:48.387
humans, if you give three people a machine translation
output and ask them to rate it, you'll get three different answers.
0:07:48.387 --> 0:07:55.158
And so it's worth investigating, and of course
it's also important to have that at the beginning,
0:07:55.158 --> 0:08:01.928
because if we're later talking about some techniques,
we will always be saying this technique is
0:08:01.928 --> 0:08:03.813
better by x percent or so.
0:08:04.284 --> 0:08:06.283
And we'll also have a practical course on
this.
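To illustrate the kind of automatic comparison such metrics rely on, here is a toy clipped n-gram precision, the core ingredient of metrics like BLEU. This is only a sketch for intuition, not the exact metric used later in the lecture:

```python
from collections import Counter

def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision: fraction of hypothesis n-grams
    that also occur in the reference (counts clipped to the reference)."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    total = sum(hyp_ngrams.values())
    return overlap / total if total else 0.0

hyp = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(ngram_precision(hyp, ref, 1))  # 5/6: only "sat" is unmatched
print(ngram_precision(hyp, ref, 2))  # 3/5: bigram overlap is lower
```

Note how a single wrong word already lowers the bigram score twice as much as the unigram score; real metrics combine several n-gram orders for exactly this reason.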
0:08:06.746 --> 0:08:16.553 | |
Then we're going to build language models,
which are an important building block for translation models.
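As a minimal sketch of what a language model does — assigning a probability to the next word given its history — here is a maximum-likelihood bigram model over a toy corpus (real models are trained on millions of sentences and use smoothing or neural networks):

```python
from collections import Counter

# Toy corpus; "." acts as a sentence boundary token.
corpus = "the cat sat . the cat ran . the dog sat .".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])  # count only tokens that have a successor

def p(word, prev):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

print(p("cat", "the"))  # 2/3: "the" is followed by "cat" twice, "dog" once
```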
0:08:16.736 --> 0:08:28.729 | |
After the first half you should have a basic understanding
of how basic machine translation works.
0:08:29.029 --> 0:08:39.065
And then in the second part of the lecture
we will cover more advanced topics:
0:08:39.065 --> 0:08:42.369
what are the challenges?
0:08:43.463 --> 0:08:48.035 | |
One challenge is, of course, about additional
resources, about data.
0:08:48.208 --> 0:08:53.807
So the question is how we can get more data
or better data, and there are different ways of
0:08:53.807 --> 0:08:54.258
doing that.
0:08:54.214 --> 0:09:00.230 | |
Besides crawling data, we will look into building
systems which do not translate between one language pair
0:09:00.230 --> 0:09:06.122
but which translate between fifteen languages,
and use knowledge and share knowledge between
0:09:06.122 --> 0:09:09.632
the languages, so that for each pair they need
less data.
0:09:11.751 --> 0:09:19.194 | |
And then we'll have something about efficiency.
0:09:19.194 --> 0:09:27.722 | |
That is, of course, important with more and more complex
models,
0:09:27.647 --> 0:09:33.053
because then nobody can afford to run them,
so how can you build really efficient models?
0:09:33.393 --> 0:09:38.513
Also, energy is getting more expensive,
so it's even more important to build efficient systems.
0:09:39.419 --> 0:09:43.447 | |
We're also looking into biases.
0:09:43.423 --> 0:09:50.364
That is quite interesting for machine translation,
because some information is represented differently
0:09:50.364 --> 0:09:51.345
in different languages.
0:09:51.345 --> 0:09:55.552
So if you think about German, it is not always
clear,
0:09:55.552 --> 0:10:00.950
but in a lot of situations it is clear, if
you talk about a teacher or about
0:10:01.321 --> 0:10:03.807 | |
another person, whether they are male or female.
0:10:04.204 --> 0:10:13.832 | |
When translating from English to German you don't have this
information, so how do you generate it? And
0:10:13.832 --> 0:10:15.364
what do systems do?
0:10:15.515 --> 0:10:24.126
They will just assume things, and we'll see that
exactly this is happening; so we want to address
0:10:24.126 --> 0:10:27.459
these challenges and try to reduce these biases.
0:10:28.368 --> 0:10:35.186 | |
Then domain adaptation: as I said at the beginning,
systems are good at the task they are trained on.
0:10:35.186 --> 0:10:37.928
But how can we adapt them to a new task?
0:10:38.959 --> 0:10:51.561 | |
Document-level translation is about using more context, and we
have two lectures about speech translation:
0:10:51.561 --> 0:10:56.859
where mostly before we were translating text, we
0:10:57.117 --> 0:11:00.040
are now translating audio input.
0:11:00.040 --> 0:11:05.371
There we have additional challenges, and these
we will address.
0:11:10.450 --> 0:11:22.165 | |
So, to the motivation: why should you work
on machine translation and why should you
0:11:22.165 --> 0:11:23.799
put effort into it?
0:11:24.224 --> 0:11:30.998
We are living in a more global
society,
0:11:30.998 --> 0:11:37.522
and you now have the chance to communicate with
people around the world.
0:11:37.897 --> 0:11:44.997 | |
And the danger of course is that languages
are dying, and more and more languages are
0:11:44.997 --> 0:11:45.988
going away.
0:11:46.006 --> 0:11:53.669
I think at least one opportunity to
keep more languages alive is that we have
0:11:53.669 --> 0:12:01.509
technology solutions which help you to speak
in your language and still communicate with
0:12:01.509 --> 0:12:04.592
people who speak another language.
0:12:04.864 --> 0:12:16.776 | |
And on the one hand there is the need, and
more and more people want to speak in
0:12:16.776 --> 0:12:19.159
their own languages.
0:12:19.759 --> 0:12:27.980
For example, Iceland was really keen on getting
Icelandic into commercial systems, and they
0:12:27.980 --> 0:12:36.471
even provided data and so on, because they wanted
their language to be spoken longer and not
0:12:36.471 --> 0:12:38.548
just have people switching.
0:12:38.959 --> 0:12:47.177 | |
So they were even spending money
on promoting this language in order to have
0:12:47.177 --> 0:12:55.125
all these digital tools available for languages
which are not spoken by so many people.
0:12:56.156 --> 0:13:07.409
So it's questionable, and it's not completely
clear that technology always provides that.
0:13:10.430 --> 0:13:25.622 | |
If we think about machine translation, there
are different use cases in which you can use
0:13:25.622 --> 0:13:26.635
it.
0:13:27.207 --> 0:13:36.978
And these have some characteristics. So, typically,
the first case is where machine translation
0:13:36.978 --> 0:13:40.068
was used first. Anybody an idea why?
0:13:40.780 --> 0:13:50.780 | |
Because most news outlets around the world
report at least some of the same events, an event
0:13:50.780 --> 0:13:58.669
was probably covered around the world in a
lot of different languages.
0:13:59.279 --> 0:14:08.539
That is one point, yes: the training data
is there.
0:14:08.539 --> 0:14:16.284
That's definitely a good point here, and then.
0:14:17.717 --> 0:14:19.425 | |
Yes, that was my original idea.
0:14:19.425 --> 0:14:23.256
My motivation was a bit different
from yours, but it's a good point.
0:14:23.256 --> 0:14:26.517
So on the one hand, you maybe don't understand
English perfectly.
0:14:26.517 --> 0:14:30.762
Also, it's for personal use, so you're
using machine translation for your own use.
0:14:31.311 --> 0:14:37.367 | |
It's not as important that this is really
perfect written text; you're more interested
0:14:37.367 --> 0:14:38.564
in understanding it.
0:14:38.858 --> 0:14:45.570
Maybe it's clearer if you think about
the other situation, where it's about dissemination,
0:14:45.570 --> 0:14:48.926
that means producing text in another language.
0:14:48.926 --> 0:14:55.138
So just imagine you have a website, or you
have a restaurant and you want to offer your
0:14:55.138 --> 0:14:55.566
menu.
0:14:56.476 --> 0:15:01.948 | |
And in this case maybe you want to have
higher quality, because
0:15:01.901 --> 0:15:06.396
you're presenting something of yourself and
you want to have good quality.
0:15:06.396 --> 0:15:11.490
Just remember you're writing a letter: if
you're translating your letter, then you
0:15:11.490 --> 0:15:17.123
don't want to have it full of mistakes, because
it somehow makes a bad impression; but if it's
0:15:17.123 --> 0:15:20.300
assimilation, it's about you getting the information.
0:15:20.660 --> 0:15:25.564 | |
So here, with dissemination, you're
producing texts for another language.
0:15:26.006 --> 0:15:31.560
And then you have the requirement that you
maybe want to have higher quality.
0:15:31.831 --> 0:15:43.432
Therefore, there is typically less volume;
normally you're getting more information
0:15:43.432 --> 0:15:46.499
than you're producing.
0:15:49.109 --> 0:15:57.817 | |
Then of course there is a dynamic scenario
where there is some type of interaction, and
0:15:57.817 --> 0:16:07.099
the one thing which is interesting about the
dialogue scenario is that it is live: if you're
0:16:07.099 --> 0:16:18.045
translating a website you have all the data
available, but in a dialogue scenario you don't.
0:16:18.378 --> 0:16:23.655
And we'll see that in speech recognition this
is a big challenge.
0:16:23.655 --> 0:16:30.930
Just to mention German, where the
verb is often more at the end of the sentence.
0:16:32.052 --> 0:16:36.343 | |
Imagine that you want to generate the English
sentence.
0:16:36.343 --> 0:16:42.740
Now you need to know the verb at the end of the German sentence
to produce the second word.
0:16:42.740 --> 0:16:49.785
So you have to either guess or do something else
in order to provide the translation before
0:16:49.785 --> 0:16:52.052
the full input is available.
0:16:57.817 --> 0:17:00.530 | |
The question, of course, is: in the new world,
0:17:00.530 --> 0:17:05.659
I mean, of course, we can on the one hand
say we just all use English, but the
0:17:05.659 --> 0:17:10.789
question is do we really need that many languages,
and how many are there at the moment?
0:17:11.291 --> 0:17:20.248
Does anybody have an idea how many languages
are spoken in the world?
0:17:23.043 --> 0:17:26.510 | |
This is already the first big challenge:
0:17:26.510 --> 0:17:34.120
what is a language and what is not a language
is already difficult, and then maybe, as one point,
0:17:34.120 --> 0:17:40.124
people first have to argue about written languages
or spoken languages.
0:17:40.400 --> 0:17:47.765
For written languages I think that number
is still too low, but for spoken languages
0:17:47.765 --> 0:17:53.879
people normally think of more. So you see that there
are really a lot of languages, which will be difficult
0:17:53.879 --> 0:17:54.688
to all handle.
0:17:55.035 --> 0:18:00.662 | |
And this is just Europe, where
there are relatively few languages;
0:18:00.662 --> 0:18:05.576
you already have quite a lot of languages,
even more than countries.
0:18:06.126 --> 0:18:13.706
Of course sometimes countries share a language,
but then you have Breton or Galician,
0:18:13.706 --> 0:18:17.104
where you have additional languages within a country.
0:18:18.478 --> 0:18:24.902 | |
And yeah, of course, there's the question:
when does it start to be a language,
0:18:24.902 --> 0:18:27.793
and when is it more like a dialect?
0:18:27.793 --> 0:18:28.997
So is Catalan a language?
0:18:28.997 --> 0:18:31.727
Is Swiss German its own language?
0:18:31.727 --> 0:18:33.253
Or is it the same?
0:18:33.293 --> 0:18:36.887 | |
Then, of course, there are cases like Czech and
Slovak.
0:18:36.887 --> 0:18:42.704
I have heard that people can understand each
other, so they can just continue talking, each
0:18:42.704 --> 0:18:45.711
speaking their own language, and understand each other.
0:18:46.026 --> 0:18:56.498
Of course, it's partly also about your
own nationality; so I think some people would say
0:18:56.498 --> 0:18:57.675
Croatian and Serbian are one language,
0:18:58.018 --> 0:19:04.957
but I think for a lot of people you shouldn't
say that they are speaking the same language.
0:19:05.165 --> 0:19:10.876
But you see from this that it is not completely
clear; there is no hard border between what is a dialect
0:19:10.876 --> 0:19:13.974
of one language and what is a different
one.
0:19:14.094 --> 0:19:19.403 | |
And of course it's getting more fluid when
you talk about scientific things.
0:19:19.403 --> 0:19:25.189
I guess sometimes it's no longer clear if
it's German or English, because we start to
0:19:25.189 --> 0:19:27.707
use a lot of English terms in there.
0:19:27.707 --> 0:19:31.519
So of course there are interesting mixes, which
we will talk about.
0:19:33.193 --> 0:19:38.537 | |
So should everybody just speak English? These
numbers are a bit older, I have to admit;
0:19:38.938 --> 0:19:47.124
however, I don't think they're completely different
now. They show how many people in
0:19:47.124 --> 0:19:54.718
Europe can speak English, for countries where
English is not the mother tongue.
0:19:54.995 --> 0:20:06.740
In some countries, like smaller ones,
you have quite high numbers.
0:20:07.087 --> 0:20:13.979 | |
However, there are many countries where you
have only twenty to thirty percent of the population
0:20:13.979 --> 0:20:16.370
being able to speak English.
0:20:16.370 --> 0:20:22.559
So if we would do everything only in
English, we would exclude half the population
0:20:22.559 --> 0:20:23.333
of Europe.
0:20:23.563 --> 0:20:30.475 | |
And therefore providing translations is very
important, and therefore, for example, the European
0:20:30.475 --> 0:20:35.587
Parliament puts a really large amount of money
into doing translation.
0:20:35.695 --> 0:20:40.621
That's why you can speak in your mother
tongue in the European Parliament.
0:20:40.621 --> 0:20:46.204
Everyone elected there can
speak in their language, and they are translated into
0:20:46.204 --> 0:20:52.247
all the other languages; it's a huge effort,
and so the question is, can we do better with
0:20:52.247 --> 0:20:52.838
machine translation?
0:20:53.493 --> 0:20:58.362 | |
And for other regions things are even more
so.
0:20:58.362 --> 0:21:05.771
They may be not worse or more difficult, but they
are even more challenging.
0:21:06.946 --> 0:21:13.764
So there's even more diversity of languages,
and it might be even more important to do machine translation.
0:21:16.576 --> 0:21:31.034 | |
If you see how many people speak French, Portuguese,
or English, it's relatively few compared to
0:21:31.034 --> 0:21:33.443
the population.
0:21:33.813 --> 0:21:46.882
So I think only some millions
would understand you, but all the others wouldn't.
0:21:49.289 --> 0:21:54.877 | |
So it seems to be very important to provide
some type of translation.
0:21:54.877 --> 0:21:58.740
It's quite a big industry; as the European Union,
0:21:58.740 --> 0:22:05.643
this is already quite long ago, but it
won't have gotten less, they spent in that year
0:22:05.643 --> 0:22:08.931
one point three billion on translation.
0:22:09.289 --> 0:22:21.315
So it might be very helpful to have tools
in order to provide translations, and as said, not
0:22:21.315 --> 0:22:26.267
all directions might be equally important.
0:22:26.426 --> 0:22:35.059 | |
That is not even possible with human translators; in the
European Parliament they don't have all combinations
0:22:35.059 --> 0:22:36.644
of the different
0:22:36.977 --> 0:22:42.210
languages. So if they want to translate
from Maltese to Estonian or so,
0:22:42.402 --> 0:22:47.361
maybe they have a translator for that,
but there are some directions where they don't have
0:22:47.361 --> 0:22:47.692
that.
0:22:47.692 --> 0:22:52.706
Then they don't translate directly; they would
translate first to French, German, or English,
0:22:52.706 --> 0:22:57.721
and then there would be a second translator
taking that translation and translating
0:22:57.721 --> 0:22:59.154
it into Estonian.
0:22:59.299 --> 0:23:06.351 | |
And it's not always English; they are really
selecting what is most helpful.
0:23:06.351 --> 0:23:13.931
But you see that even in this small setup,
with this large amount of effort,
0:23:13.931 --> 0:23:17.545
there's not enough capacity to translate everything.
0:23:19.819 --> 0:23:21.443 | |
And of course this was text.
0:23:21.443 --> 0:23:26.538
Then you have a lot of other settings where
you want to, for example, do speech translation.
0:23:26.538 --> 0:23:31.744
There are a lot of conferences which currently
are all held in English, which of course might
0:23:31.744 --> 0:23:35.831
also not be the best solution. If you've gone
to some of the conferences,
0:23:36.176 --> 0:23:45.964
you might have heard some accented speech,
where people speak a language that is very
0:23:45.964 --> 0:23:49.304
different from their mother tongue.
0:23:49.749 --> 0:23:52.059
It might be difficult to understand.
0:23:52.212 --> 0:23:59.123 | |
We're currently having an effort, for example
by ACL, which is the conference organization in
0:23:59.123 --> 0:24:06.112
this field, to provide these translations into
ten other languages, so that also students who
0:24:06.112 --> 0:24:06.803
are not
0:24:06.746 --> 0:24:12.446
that familiar with English are able to read the
papers and watch the presentations.
0:24:16.416 --> 0:24:25.243 | |
So the question is what you can do here, and
machine translation is one interesting solution, which we'll cover
0:24:25.243 --> 0:24:26.968
in this lecture.
0:24:27.087 --> 0:24:38.112
This always comes with the question: will
it replace the human translator?
0:24:38.112 --> 0:24:40.382
And yes, that is the
0:24:40.300 --> 0:24:49.300
idea, but it doesn't really happen,
and I'm skeptical about that.
0:24:49.300 --> 0:24:52.946
So currently we are not seeing it.
0:24:53.713 --> 0:24:55.807
So much more effort is needed.
0:24:55.807 --> 0:25:00.294
Of course, machine translation is now used
as some type of support.
0:25:01.901 --> 0:25:11.785 | |
If you think about the European Parliament,
they will have some humans doing the translation,
0:25:11.785 --> 0:25:18.060
because, if you think about the chancellor of
Germany traveling somewhere, I am quite sure
0:25:18.060 --> 0:25:18.784
you want a human translator there.
0:25:19.179 --> 0:25:31.805
And so it's more like we are augmenting the
possibilities: we have more possibilities to
0:25:31.805 --> 0:25:37.400
provide translation, for example when traveling around.
0:25:39.499 --> 0:25:53.650 | |
How can this technology help? Machine translation
is one way of dealing with this.
0:25:54.474 --> 0:26:01.144
Of course, there are other tasks which go even
beyond machine translation.
0:26:01.144 --> 0:26:04.613
Just think about "summarize my lecture".
0:26:04.965 --> 0:26:08.019
There are approaches doing that in what is called
an end-to-end fashion:
0:26:08.019 --> 0:26:11.635
you just put in an English text and get a
German summary.
0:26:11.635 --> 0:26:17.058 | |
However, a good baseline and an important
thing is to either first translate the lecture into German
0:26:17.058 --> 0:26:22.544
and then do the summary, or first do a summary
in English and then translate it into the other language.
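The two cascade baselines just described can be sketched as follows; `translate()` and `summarize()` are hypothetical stand-ins for real MT and summarization systems, not actual APIs:

```python
def translate(text: str, src: str, tgt: str) -> str:
    """Placeholder for a machine translation system."""
    return f"[{src}->{tgt}] {text}"

def summarize(text: str) -> str:
    """Placeholder for a summarizer: keep only the first sentence."""
    return text.split(".")[0].strip() + "."

def translate_then_summarize(lecture_en: str) -> str:
    # Cascade 1: translate the whole lecture, then summarize in the target language.
    return summarize(translate(lecture_en, "en", "de"))

def summarize_then_translate(lecture_en: str) -> str:
    # Cascade 2: summarize in the source language, then translate the summary.
    return translate(summarize(lecture_en), "en", "de")

print(summarize_then_translate("First point. Second point."))  # [en->de] First point.
```

The second cascade translates far less text, but any summarization error is passed on untranslated; which order works better is an empirical question.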
0:26:23.223 --> 0:26:28.764 | |
Translation is very important for
different application scenarios.
0:26:28.764 --> 0:26:33.861
We have dissemination and dialogue, but also
information extraction.
0:26:33.861 --> 0:26:39.993
So if you want to get information
not only from English websites but from
0:26:40.300 --> 0:26:42.427
very different websites,
0:26:42.427 --> 0:26:46.171
it's helpful to have this type of solution.
0:26:50.550 --> 0:26:52.772 | |
Yeah, what can you translate?
0:26:52.772 --> 0:26:59.660
Of course, we will focus on text, as I said,
for most of the lecture, because it's about translation,
0:26:59.660 --> 0:27:06.178
and anything else is first converted to text, and
then we can do text
0:27:06.178 --> 0:27:07.141
translation.
0:27:09.189 --> 0:27:19.599 | |
And text is not equal to text; so let us look at
one of the most common
0:27:19.499 --> 0:27:27.559
use cases, software translation: just imagine
you are developing your new app.
0:27:27.947 --> 0:27:34.628
Nowadays you don't want it to only be
available in English or German, but in as
0:27:34.628 --> 0:27:40.998
many languages as possible, and if you use
the standard tools it's not that easy.
0:27:41.141 --> 0:27:50.666 | |
This is a different type of domain, and there
again we have very little context.
0:27:50.666 --> 0:27:56.823
Normally we translate full sentences; in an app you
have the menu and there's an item like "save".
0:27:57.577 --> 0:28:02.535
And then you only have "save".
0:28:02.535 --> 0:28:14.845
How should you translate "save"? Should it be
saving a file, or should it be saving money?
0:28:16.856 --> 0:28:24.407 | |
Then, of course, if you have files, it
might be that you have metadata to handle.
0:28:26.466 --> 0:28:27.137
Novels:
0:28:27.137 --> 0:28:32.501
there is some work on that, but yeah, there's always
the typical criticism:
0:28:32.501 --> 0:28:36.440
"You'll never be able to translate Shakespeare."
0:28:36.656 --> 0:28:43.684 | |
I think this is somehow the last use case for
machine translation;
0:28:43.684 --> 0:28:47.637
for the translation of books there is less need for it.
0:28:47.847 --> 0:28:57.047 | |
But the nice thing about machine translation
is that it can translate the things which are
0:28:57.047 --> 0:29:05.327
boring; so think about translating some bureaucratic
forms or some regulations.
0:29:05.565 --> 0:29:11.302
This is normally not very interesting and it's
very repetitive, so there automation works
0:29:11.302 --> 0:29:11.697
well.
0:29:11.931 --> 0:29:17.519
Of course, there is also translation combined with
images;
0:29:17.519 --> 0:29:24.604
I guess you have pointed your camera at an object
so that it translates the text.
0:29:25.005 --> 0:29:43.178 | |
And we'll cover, at the end, as said,
speech translation.
0:29:43.663 --> 0:29:46.795
There you can't provide the translation of the
lecture with a long delay:
0:29:46.795 --> 0:29:50.518
if I'm already five slides further when you
see the translation,
0:29:50.518 --> 0:29:52.291
it might not be very helpful.
0:29:54.794 --> 0:29:57.062 | |
We are not speaking the way we write.
0:29:57.062 --> 0:29:59.097
It's again like a domain mismatch.
0:29:59.359 --> 0:30:10.161
So typically the sentences are not full sentences,
and what I'm saying is not phrased in the ideal
0:30:10.161 --> 0:30:19.354
way, and if you just read what was said,
it might be hard to understand.
0:30:23.803 --> 0:30:36.590 | |
We are focusing on the first application scenario,
that is, fully automatic machine translation.
0:30:37.177 --> 0:30:46.373 | |
Of course, there are quite interesting application
scenarios for other settings, where the output should
0:30:46.373 --> 0:30:47.645
be reviewed,
0:30:47.867 --> 0:30:49.695
where it's no longer fully automatic.
0:30:49.695 --> 0:30:52.436
It's not just that we have this tool and it works; it's a
mixture:
0:30:52.436 --> 0:30:57.381
we have the machine translation system and
the human translator, and they somehow cooperate
0:30:57.381 --> 0:30:59.853
and try to be as fast as possible in doing
the translation.
0:31:00.380 --> 0:31:12.844 | |
The easiest idea there would be, as a first
step, that you take the machine translation and correct it.
0:31:13.553 --> 0:31:17.297
That sometimes might not be the best
way of using it.
0:31:17.357 --> 0:31:25.308
Any ideas what else you could do? Then
maybe the machine could aid the human and say:
0:31:25.308 --> 0:31:27.838
I'm not sure about this part.
0:31:28.368 --> 0:31:32.319 | |
Yeah, very interesting, very good.
0:31:32.319 --> 0:31:42.252
Of course, the dangerous thing there is that you are
asking something from a machine translation
0:31:42.252 --> 0:31:45.638
system exactly where it's really bad.
0:31:45.845 --> 0:31:50.947 | |
There is quality estimation; maybe we
will cover that with evaluation. So, in evaluation
0:31:50.947 --> 0:31:55.992 | |
you know the correct translation, and you | |
have another output, and you try to estimate | |
0:31:55.992 --> 0:31:57.409 | |
how good the quality is. | |
0:31:57.409 --> 0:32:02.511 | |
In quality estimation you don't have that; you only | |
have a source and a hypothesis, and the question is | |
0:32:02.511 --> 0:32:03.531 | |
exactly this one. | |
0:32:03.531 --> 0:32:05.401 | |
Is it a good translation or not? | |
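To make the contrast just described concrete, here is a minimal sketch; the function names and scoring heuristics are invented for illustration and are not an actual evaluation or quality estimation system:

```python
# Toy contrast between evaluation (reference available) and quality
# estimation (no reference). The scoring heuristics are illustrative only.

def evaluate(hypothesis: str, reference: str) -> float:
    """With a reference, score by word overlap (a crude precision)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    return sum(1 for w in hyp if w in ref) / len(hyp)

def estimate_quality(source: str, hypothesis: str) -> float:
    """Without a reference, fall back to weak signals such as the
    source/hypothesis length ratio (real QE models learn such signals)."""
    ratio = len(hypothesis.split()) / max(len(source.split()), 1)
    return max(0.0, 1.0 - abs(1.0 - ratio))

print(evaluate("the house is small", "the house is small"))  # 1.0
print(estimate_quality("das Haus ist klein", "the house is small"))
```

The point of the sketch is only that `evaluate` gets to see the correct answer while `estimate_quality` must judge from the source and the output alone, which is why quality estimation is the harder problem.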
0:32:05.665 --> 0:32:12.806 | |
This might seem easier, but the system doesn't | |
know what the correct translation is. | |
0:32:13.053 --> 0:32:23.445 | |
A human is very good at that; for machines it | |
is difficult, but of course that's an interesting | |
0:32:23.445 --> 0:32:24.853 | |
application. | |
0:32:25.065 --> 0:32:32.483 | |
Or be more interactive: while you are translating, | |
if the human changes the fifth word, | |
0:32:32.483 --> 0:32:36.361 | |
What does it mean for the remaining sentence? | |
0:32:36.361 --> 0:32:38.131 | |
Do I need to change? | |
0:32:38.131 --> 0:32:43.948 | |
There are also things like you don't have | |
to repeat the same errors. | |
0:32:47.767 --> 0:32:57.651 | |
Like with automated assistance, you only want | |
to correct an error once and not at all positions. | |
0:33:00.000 --> 0:33:21.784 | |
And then they ask, for example, so before | |
the translation is done they ask: I'm not directly | |
0:33:21.784 --> 0:33:23.324 | |
aware of that. | |
0:33:23.324 --> 0:33:33.280 | |
I think it's a good way of ending and I think | |
it's where, especially with more advanced dialogue | |
0:33:33.280 --> 0:33:34.717 | |
strategy and. | |
0:33:35.275 --> 0:33:38.831 | |
Currently I think most of the focus is, like, | |
at least determining that something is missing. | |
0:33:39.299 --> 0:33:45.646 | |
Knowing you don't have this information is already | |
challenging, so there is quite some work on | |
0:33:45.646 --> 0:33:49.541 | |
quality estimation detecting that information is missing. | |
0:33:49.789 --> 0:33:53.126 | |
But is there something missing? | |
0:33:53.126 --> 0:33:59.904 | |
It's really quite challenging, and I think that | |
is where we currently are. | |
0:34:00.260 --> 0:34:05.790 | |
What is there: there are models to which you | |
can directly provide | |
0:34:05.790 --> 0:34:06.527 | |
additional input. | |
0:34:06.786 --> 0:34:13.701 | |
You can give them anything you have and provide | |
them with it. | |
0:34:13.701 --> 0:34:21.129 | |
It's a similar situation if you're translating | |
to German. | |
0:34:21.641 --> 0:34:31.401 | |
And it would just guess, or do some | |
random guessing, which always means it's using some | |
0:34:31.401 --> 0:34:36.445 | |
information which should not really be there. | |
0:34:36.776 --> 0:34:46.449 | |
So then you can provide it with an additional | |
input: whether it should use formal or informal address. | |
0:34:47.747 --> 0:35:04.687 | |
To know that this information is missing. | |
0:35:04.544 --> 0:35:19.504 | |
Since you're not specifically modeling this, | |
it's likely that there is a gender difference | |
0:35:19.504 --> 0:35:21.805 | |
in languages. | |
0:35:26.046 --> 0:35:39.966 | |
Why are we doing research on machine translation? | |
It's a very important task in natural | |
0:35:39.966 --> 0:35:42.860 | |
language processing. | |
0:35:43.283 --> 0:35:49.234 | |
So of course you have a lot of computer science | |
in there, and that's the backbone of it. | |
0:35:49.569 --> 0:36:01.848 | |
However, understanding of the task you can also | |
get from fields like computational linguistics, | |
0:36:01.848 --> 0:36:08.613 | |
which tell you about language; that's good | |
to know. | |
0:36:08.989 --> 0:36:15.425 | |
That doesn't mean that in a computer we have to | |
model it exactly the same, but for example | |
0:36:15.425 --> 0:36:22.453 | |
to know that there is something like morphology, | |
which means how words are built, and that for | |
0:36:22.453 --> 0:36:24.746 | |
some languages it's very easy. | |
0:36:24.746 --> 0:36:28.001 | |
In English there is nearly no morphology. | |
0:36:28.688 --> 0:36:35.557 | |
Well, in German you already have, for example, | |
different noun forms and so on. | |
0:36:36.316 --> 0:36:41.991 | |
And for other languages, like Finnish or Basque, it's | |
even more complicated; | |
0:36:41.991 --> 0:36:44.498 | |
I think some words have many more forms. | |
0:36:45.045 --> 0:36:52.098 | |
So knowing this, of course, gives you some | |
advice. | |
0:36:52.098 --> 0:37:04.682 | |
How do I look at the data? Because, as we'll see, | |
the basic models treat each word as an individual unit. | |
0:37:06.106 --> 0:37:09.259 | |
Of course there is a lot of interest also | |
coming from industry. | |
0:37:09.259 --> 0:37:10.860 | |
There is a lot of applications. | |
0:37:11.191 --> 0:37:17.068 | |
There's research groups at Google, Facebook, | |
and Amazon. | |
0:37:17.068 --> 0:37:26.349 | |
So there's quite a lot of interest in providing | |
that. It's not that for German and English it is solved. | |
0:37:26.546 --> 0:37:27.569 | |
Annoucing it's hard. | |
0:37:27.569 --> 0:37:31.660 | |
We're saying it's not that hard, but of course | |
we have already acquired high quality for them. | |
0:37:32.212 --> 0:37:39.296 | |
But there's currently really a large trend | |
in building systems for low-resource | |
0:37:39.296 --> 0:37:40.202 | |
languages. | |
0:37:40.480 --> 0:37:53.302 | |
So there are tasks, like last year's task on | |
translating from Native American languages: | |
0:37:53.193 --> 0:37:58.503 | |
I don't know them exactly, but five such languages, | |
so how can you translate from them? | |
0:37:58.538 --> 0:38:05.074 | |
Then you don't have like millions of sentences, | |
but you might have only the Bible or some more | |
0:38:05.074 --> 0:38:05.486 | |
data. | |
0:38:05.486 --> 0:38:08.169 | |
Then the question is, what can you do? | |
0:38:08.169 --> 0:38:09.958 | |
And how good can you get? | |
0:38:14.794 --> 0:38:17.296 | |
One thing is very important. | |
0:38:17.296 --> 0:38:25.751 | |
Of course, in a lot of AI it is important to measure | |
quality, and what you can measure is quite important. | |
0:38:25.986 --> 0:38:37.213 | |
So that's why for many years there have regularly | |
been different evaluation campaigns where people | |
0:38:37.213 --> 0:38:38.178 | |
submit their systems. | |
0:38:39.419 --> 0:38:45.426 | |
We're often part of WMT, originally the Workshop on Statistical | |
Machine Translation; yet now I think it's | |
0:38:45.426 --> 0:38:51.019 | |
the Conference on Machine Translation, where it's mostly about | |
European languages and news texts. | |
0:38:51.051 --> 0:38:57.910 | |
There is the International Workshop on Spoken Language | |
Translation, which is about translating lectures, | |
0:38:57.910 --> 0:39:04.263 | |
which we are co-organizing, and there it | |
is, as I said, about building strong systems this | |
0:39:04.263 --> 0:39:04.696 | |
time. | |
0:39:04.664 --> 0:39:11.295 | |
This year it is about translating conference | |
presentations from English into ten different | |
0:39:11.295 --> 0:39:17.080 | |
languages. And then, of course, you have to | |
deal with things like special vocabulary. | |
0:39:17.037 --> 0:39:23.984 | |
Think about recurrent neural networks; there are | |
terms like recurrent networks, convolutional | |
0:39:23.984 --> 0:39:24.740 | |
networks. | |
0:39:25.545 --> 0:39:29.917 | |
That might be more difficult to translate, | |
and you also have to decide: do I need to translate it | |
0:39:29.917 --> 0:39:33.359 | |
or should I keep it in English, and that's | |
not the same in each language. | |
0:39:33.873 --> 0:39:37.045 | |
In German maybe mostly you keep it. | |
0:39:37.045 --> 0:39:44.622 | |
I think in French people are typically like | |
wanting to translate as much as possible. | |
0:39:44.622 --> 0:39:52.200 | |
These are then challenges and then, of course, | |
in Poland where it's also challenging. | |
0:39:53.153 --> 0:39:59.369 | |
I think all of the speakers in the test set | |
are not native English speakers, so you need | |
0:39:59.369 --> 0:40:05.655 | |
to translate people with a German accent or | |
with a French accent or with a Japanese accent | |
0:40:05.655 --> 0:40:09.178 | |
or an English accent, which poses additional challenges. | |
0:40:12.272 --> 0:40:21.279 | |
Yes, so there is always criticism with new | |
technologies, because people say it will never | |
0:40:21.279 --> 0:40:23.688 | |
translate Shakespeare. | |
0:40:24.204 --> 0:40:26.845 | |
I partly agree with the second point. | |
0:40:26.845 --> 0:40:34.682 | |
Maybe it's not good at translating Shakespeare, | |
but there's many people working on that. | |
0:40:35.255 --> 0:40:38.039 | |
Of course, the poison cookie is a challenge. | |
0:40:38.858 --> 0:40:44.946 | |
The thing here, as the chart suggests, is that | |
you can never be sure that the machine translation | |
0:40:44.946 --> 0:40:47.546 | |
system doesn't make a mistake somewhere. | |
0:40:47.546 --> 0:40:53.316 | |
So if you can't be sure that there's no error | |
in there, how can you trust the translation? | |
0:40:55.275 --> 0:41:01.892 | |
That is partly true; on the other hand, otherwise | |
you have to rely on a human translator, | |
0:41:01.892 --> 0:41:06.116 | |
and we are sometimes overestimating human | |
performance. | |
0:41:06.746 --> 0:41:15.111 | |
They are very good translators, but under a | |
lot of pressure there are errors in human translations too. | |
0:41:15.715 --> 0:41:22.855 | |
The question is: When can you trust it enough | |
anyway? | |
0:41:22.855 --> 0:41:28.540 | |
You should be careful about trusting them. | |
0:41:31.011 --> 0:41:38.023 | |
And I think some of these arguments are too old now, because | |
it has been shown that it is helpful to have | |
0:41:38.023 --> 0:41:41.082 | |
some type of machine translation system. | |
0:41:41.082 --> 0:41:47.722 | |
Of course, it is not like buying a car; typically | |
a system is still not working for everything. | |
0:41:48.048 --> 0:41:56.147 | |
If you want a dedicated system, which is | |
good for the task you have, they are typically | |
0:41:56.147 --> 0:41:57.947 | |
not as generalized. | |
0:41:58.278 --> 0:42:07.414 | |
as one that can translate news and chats and I don't | |
know what else. | |
0:42:07.414 --> 0:42:12.770 | |
So typically, if you apply it to something | |
0:42:12.772 --> 0:42:18.796 | |
it's not made for, it has not seen that, | |
and then you see bad quality. | |
0:42:19.179 --> 0:42:27.139 | |
But that's also like, yeah, not what you | |
built it for. | |
0:42:27.139 --> 0:42:42.187 | |
If you have a sports car and you are driving | |
off road, you shouldn't complain. Yeah, you can also say | |
0:42:42.187 --> 0:42:49.180 | |
it the other way around: that machine translation | |
is already solved, and especially now more | |
0:42:49.180 --> 0:42:50.487 | |
people think so. | |
0:42:50.750 --> 0:43:04.275 | |
However, there is an impressive performance | |
of machine translation, but it's not the state | |
0:43:04.275 --> 0:43:06.119 | |
of the art everywhere. | |
0:43:06.586 --> 0:43:11.811 | |
And yeah, they're good for some domains and | |
some languages that are, like, already quite strong. | |
0:43:12.572 --> 0:43:27.359 | |
Microsoft has claimed super-human performance | |
for their machine translation system. | |
0:43:27.467 --> 0:43:38.319 | |
However, that was one domain, news, and one | |
language, Spanish, where there is a huge amount | |
0:43:38.319 --> 0:43:45.042 | |
of training data and you can build a very strong | |
system. | |
0:43:45.505 --> 0:43:48.605 | |
And you even don't have to go to these extreme | |
cases. | |
0:43:48.688 --> 0:43:54.328 | |
We have worked on Kannada, which is a language | |
spoken in India, | |
0:43:54.328 --> 0:44:01.669 | |
I think, by around eighty million people, | |
so similar to German in that respect. | |
0:44:01.669 --> 0:44:07.757 | |
The quality is significantly worse; it has | |
significantly less data. | |
0:44:08.108 --> 0:44:15.132 | |
There are still quite a lot of languages where | |
the quality is not where you want it to be. | |
0:44:15.295 --> 0:44:17.971 | |
Scaling this is not as easy as it seems. | |
0:44:17.971 --> 0:44:23.759 | |
That's why we're also interested in multilingual | |
systems with the hope that we don't have to | |
0:44:23.759 --> 0:44:29.548 | |
build a system for each possible combination, | |
but we can build a system which can cover many | |
0:44:29.548 --> 0:44:33.655 | |
tasks, many languages, and then also need less | |
data for each of them. | |
0:44:39.639 --> 0:44:51.067 | |
Maybe a short presentation of the history; | |
it is brief, but it can cover the most important points. | |
0:44:51.331 --> 0:45:09.053 | |
So machine translation started coming from | |
information theory; there was this idea of | |
0:45:09.053 --> 0:45:13.286 | |
treating machine translation as encryption | |
or decryption. | |
0:45:13.533 --> 0:45:21.088 | |
Don't understand it, want to have it in English, | |
treat it as if it's like encrypted English, | |
0:45:21.088 --> 0:45:28.724 | |
and then apply my decryption algorithm, which | |
they were working a lot during the Second World | |
0:45:28.724 --> 0:45:29.130 | |
War. | |
0:45:29.209 --> 0:45:34.194 | |
And so if I can do this decryption, then | |
I have the translation. | |
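This decryption view later became the noisy-channel formulation of statistical translation; a tiny sketch with made-up toy probabilities (the sentences and numbers are invented, not from a real model):

```python
# Noisy-channel view of translation: treat the foreign sentence f as
# "encrypted" English and decode by picking the English sentence e that
# maximizes P(e) * P(f | e). All probabilities below are toy numbers.

p_e = {"the house": 0.6, "house the": 0.1}          # language model P(e)
p_f_given_e = {                                     # translation model P(f|e)
    ("das Haus", "the house"): 0.5,
    ("das Haus", "house the"): 0.5,
}

def decode(f: str) -> str:
    # argmax over candidate English sentences.
    return max(p_e, key=lambda e: p_e[e] * p_f_given_e.get((f, e), 0.0))

print(decode("das Haus"))  # the house
```

Even though both candidates explain the foreign sentence equally well here, the language model prefers the fluent word order, which is exactly the division of labor in the noisy-channel setup.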
0:45:34.934 --> 0:45:42.430 | |
And based on that they had rules and | |
so on. | |
0:45:42.430 --> 0:45:50.843 | |
So they had the Georgetown experiment, | |
where they translated Russian | |
0:45:51.691 --> 0:45:57.419 | |
into English, and then they were like: wow. | |
0:45:57.419 --> 0:46:01.511 | |
This is solved in some years. | |
0:46:01.511 --> 0:46:04.921 | |
Now we can do sentences. | |
0:46:06.546 --> 0:46:18.657 | |
As you can imagine this didn't really work | |
out that way, so it's not really happening. | |
0:46:18.657 --> 0:46:24.503 | |
The spirit is willing, but flesh is weak. | |
0:46:24.444 --> 0:46:30.779 | |
Translated it to Russian and then back to English, | |
and then: the vodka is good but the meat is rotten. | |
0:46:31.271 --> 0:46:39.694 | |
Think it never really happened this way, but | |
you can see you can imagine that something | |
0:46:39.694 --> 0:46:49.533 | |
like that could happen, and then in the sixties | |
there was this report saying: It's more challenging | |
0:46:49.533 --> 0:46:56.877 | |
than expected and the problem is that we have | |
to invest more. | |
0:46:56.877 --> 0:47:02.801 | |
There's no benefit for doing machine translation. | |
0:47:04.044 --> 0:47:09.255 | |
At least in some other countries there was | |
a bit, but then for some time there wasn't | |
0:47:09.255 --> 0:47:10.831 | |
that much progress. | |
0:47:12.152 --> 0:47:26.554 | |
Then, in the 70s, there were some rule-based | |
systems that would cover some linguistic | |
0:47:26.554 --> 0:47:28.336 | |
background. | |
0:47:28.728 --> 0:47:34.013 | |
They were doing quite good machine translation, | |
but they had a really huge rule base. | |
0:47:34.314 --> 0:47:43.538 | |
So they really had, like, handwritten rules for | |
how to parse sentences, how to transfer parsed | |
0:47:43.538 --> 0:47:45.587 | |
sentences into the target language. | |
0:47:46.306 --> 0:47:55.868 | |
And when which word should be translated how; these | |
rule-based systems were quite strong for a | |
0:47:55.868 --> 0:47:57.627 | |
very long time. | |
0:47:57.917 --> 0:48:03.947 | |
So even quite recently, for some language pairs and | |
some domains, it was better than machine | |
0:48:03.947 --> 0:48:04.633 | |
learning. | |
0:48:05.505 --> 0:48:09.576 | |
Well, of course, there was a lot of effort in | |
there, and a lot of experts were building these. | |
0:48:11.791 --> 0:48:13.170 | |
And then. | |
0:48:13.053 --> 0:48:18.782 | |
The first statistical machine translations | |
were coming in the early nineties. | |
0:48:18.782 --> 0:48:25.761 | |
There were the systems by IBM; we will refer to them | |
as the IBM models, which are quite famous, | |
0:48:25.761 --> 0:48:32.886 | |
and they were used for machine translation | |
from the nineties to two thousand | |
0:48:32.912 --> 0:48:35.891 | |
fifteen or so; people were working on the IBM | |
models all that time. | |
0:48:36.496 --> 0:48:44.608 | |
And that was the first way of doing machine | |
translation with statistics or machine learning. | |
0:48:44.924 --> 0:48:52.143 | |
And it was possible through the French-English | |
Hansard corpus from the Canadian Parliament: | |
0:48:52.143 --> 0:48:59.516 | |
they had proceedings in French and English, | |
and people tried to use that to learn to translate. | |
0:49:01.681 --> 0:49:06.919 | |
And yes, so that was than the start of statistical | |
machine translation. | |
0:49:07.227 --> 0:49:17.797 | |
Then what is called phrase-based machine translation | |
was introduced, where you could add more information | |
0:49:17.797 --> 0:49:26.055 | |
and use longer chunks to translate, and phrase-based | |
translation was somehow the standard. | |
0:49:26.326 --> 0:49:27.603 | |
Until around fourteen. | |
0:49:27.767 --> 0:49:37.721 | |
With this phrase-based machine translation | |
we saw the first commercial systems. | |
0:49:38.178 --> 0:49:45.301 | |
And yeah, that was the first big advance, | |
where really you could use the machine translation. | |
0:49:47.287 --> 0:49:55.511 | |
And neural machine translation was mainly | |
introduced. | |
0:49:55.511 --> 0:50:07.239 | |
That means there was a shift from traditional | |
statistical modeling to using neural networks. | |
0:50:07.507 --> 0:50:09.496 | |
And that was quite impressive. | |
0:50:09.496 --> 0:50:11.999 | |
It was really within one or two years. | |
0:50:11.999 --> 0:50:17.453 | |
The whole research community shifted from | |
what they had been working on for twenty | |
0:50:17.453 --> 0:50:17.902 | |
years. | |
0:50:17.902 --> 0:50:23.485 | |
And everybody was using these neural | |
networks, because just the performances | |
0:50:23.485 --> 0:50:25.089 | |
were really, really much better. | |
0:50:25.425 --> 0:50:35.048 | |
Especially, they are what we also see now with | |
chatbots: the impressive fluency. | |
0:50:35.135 --> 0:50:45.261 | |
That was very, very challenging if you look at | |
machine translation before that, especially | |
0:50:45.261 --> 0:50:47.123 | |
if the English. | |
0:50:47.547 --> 0:50:53.352 | |
But if you were translating into German you | |
would see that the agreement, so that it's | |
0:50:53.352 --> 0:50:58.966 | |
"der schöne Baum" and "die schönen Bäume", | |
didn't always really work perfectly, maybe for | |
0:50:58.966 --> 0:51:04.835 | |
the short range, but when it has to | |
be accusative and it's, like, far away, then things | |
0:51:04.835 --> 0:51:06.430 | |
didn't really work well. | |
0:51:06.866 --> 0:51:13.323 | |
Now with neural machine translation we have a | |
bit of a different problem: the sentences | |
0:51:13.323 --> 0:51:16.901 | |
are typically really nice. | |
0:51:16.901 --> 0:51:24.056 | |
They are perfectly written not always but | |
very often. | |
0:51:24.224 --> 0:51:36.587 | |
So adequacy, that source and translation should | |
have the same meaning, is typically the bigger problem. | |
0:51:42.002 --> 0:51:46.039 | |
So how can we do this? | |
0:51:46.039 --> 0:51:54.889 | |
What are the approaches, and how can we do machine | |
translation? | |
0:51:55.235 --> 0:52:01.297 | |
So first we had rule-based systems, and as | |
said, for these systems we manually created | |
0:52:01.297 --> 0:52:01.769 | |
rules. | |
0:52:01.861 --> 0:52:07.421 | |
And there were rules for how to resolve | |
ambiguities. | |
0:52:07.421 --> 0:52:16.417 | |
For example, for the word bank we look at | |
the context and have rules to decide which meaning applies. | |
0:52:17.197 --> 0:52:28.418 | |
Or how to translate the structure; you know | |
how to transfer the structure, that the verb | |
0:52:28.418 --> 0:52:33.839 | |
has to split in German and move to the end. | |
0:52:35.295 --> 0:52:36.675 | |
Here's a difficult thing. | |
0:52:36.675 --> 0:52:39.118 | |
My thing is you don't need any training data. | |
0:52:39.118 --> 0:52:41.295 | |
It's not like now with machine learning. | |
0:52:41.295 --> 0:52:46.073 | |
If you build a machine translation system, | |
the first question you should ask is do I have | |
0:52:46.073 --> 0:52:46.976 | |
data to do that? | |
0:52:46.976 --> 0:52:48.781 | |
Do I have parallel data to train? | |
0:52:49.169 --> 0:52:50.885 | |
Here there's no data. | |
0:52:50.885 --> 0:52:57.829 | |
You basically only need pencil and paper, but | |
the problem is the people creating the rules, and | |
0:52:57.829 --> 0:52:59.857 | |
these need to be experts. | |
0:52:59.799 --> 0:53:06.614 | |
They need to understand at least the grammar in one language, | |
basically the grammar in both languages. | |
0:53:06.614 --> 0:53:09.264 | |
They need to be real linguists, too. | |
0:53:10.090 --> 0:53:17.308 | |
Then we have the two corpus-based machine | |
translation approaches, where we use machine | |
0:53:17.308 --> 0:53:22.682 | |
learning to learn how to translate from one | |
language to the other. | |
0:53:22.882 --> 0:53:29.205 | |
We should find out ourselves what is the meaning | |
of individual words, which words translate | |
0:53:29.205 --> 0:53:30.236 | |
to each other. | |
0:53:30.236 --> 0:53:36.215 | |
The only information we give is the German | |
sentence, the English sentence, and then we | |
0:53:36.215 --> 0:53:37.245 | |
look at many such pairs. | |
0:53:37.697 --> 0:53:42.373 | |
So maybe you think there's a Bible for each | |
language. | |
0:53:42.373 --> 0:53:44.971 | |
There shouldn't be a problem. | |
0:53:45.605 --> 0:53:52.752 | |
But this is not the scale we're talking | |
about. | |
0:53:52.752 --> 0:54:05.122 | |
Small systems have maybe one hundred thousand | |
sentences when we're building large models. | |
0:54:05.745 --> 0:54:19.909 | |
The statistical models do statistics about | |
how often words co-occur and how often one | |
0:54:19.909 --> 0:54:21.886 | |
word translates to another. | |
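The co-occurrence counting just mentioned can be sketched in a few lines; the two sentence pairs are invented toy data, not a real corpus:

```python
# Sketch of the co-occurrence statistics behind early statistical models:
# count how often a source word and a target word appear together in
# aligned sentence pairs, as a first hint of which words translate to which.
from collections import Counter

pairs = [
    ("das Haus ist klein", "the house is small"),
    ("das Haus ist alt", "the house is old"),
]

cooc = Counter()
for src, tgt in pairs:
    for s in src.split():
        for t in tgt.split():
            cooc[(s, t)] += 1

# "Haus" co-occurs with "house" in every pair, so it gets the top count.
print(cooc[("Haus", "house")])  # 2
```

Real models like the IBM models turn such counts into translation probabilities iteratively, but the raw signal is exactly this kind of co-occurrence.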
0:54:22.382 --> 0:54:29.523 | |
What we will focus on is what is currently in | |
most cases referred to as neural machine translation. | |
0:54:30.050 --> 0:54:44.792 | |
So in this case the idea is that you have | |
a neural model which is a big neural network. | |
0:54:45.345 --> 0:54:55.964 | |
And for these models, machine translation is | |
quite a challenging task. | |
0:54:55.964 --> 0:55:03.883 | |
For example, this Transformer architecture. | |
0:55:03.903 --> 0:55:07.399 | |
It was first proposed by Google in two thousand seventeen. | |
0:55:08.028 --> 0:55:19.287 | |
Here I want to ask about rule-based machine translation: | |
who still uses that? | |
0:55:22.862 --> 0:55:33.201 | |
I would say it's mostly gone, because | |
purely rule-based systems maybe exist only for | |
0:55:33.201 --> 0:55:36.348 | |
some very exotic languages. | |
0:55:36.776 --> 0:55:43.947 | |
Of course, the idea of investigating whether we | |
have this type of rules might be still | |
0:55:43.947 --> 0:55:45.006 | |
interesting. | |
0:55:45.105 --> 0:55:52.090 | |
Maybe you can try to somehow enforce the | |
rules in there. | |
0:55:52.090 --> 0:55:57.655 | |
You might use rules to create artificial data. | |
0:55:57.557 --> 0:56:03.577 | |
It might be helpful to have some concepts | |
which were developed by linguistic researchers and to | |
0:56:03.577 --> 0:56:09.464 | |
somehow integrate them; whether that is sometimes | |
helpful is still an open question, and of course | |
0:56:09.464 --> 0:56:13.235 | |
it is also interesting from more of an analysis | |
perspective. | |
0:56:13.235 --> 0:56:13.499 | |
So. | |
0:56:13.793 --> 0:56:20.755 | |
Do the neural networks have these types of concepts, | |
of gender or anything? | |
0:56:20.755 --> 0:56:23.560 | |
And can we test that? | |
0:56:30.330 --> 0:56:34.255 | |
Yes, and then the other way of describing | |
how this can be done. | |
0:56:34.574 --> 0:56:52.021 | |
This was originally mainly for rule-based | |
systems, but it can be used to describe a lot of scenarios. | |
0:56:52.352 --> 0:57:04.135 | |
The first were really direct | |
translation systems that work for related languages. | |
0:57:04.135 --> 0:57:11.367 | |
You mainly look at each word and replace the | |
word by its translation. | |
0:57:11.631 --> 0:57:22.642 | |
Another idea is that you first do some type | |
of analysis on the source side, so for example | |
0:57:22.642 --> 0:57:28.952 | |
you can create what is referred to as a parse | |
tree. | |
0:57:30.150 --> 0:57:36.290 | |
Or you can instead, and that is what is called | |
the interlingua approach, | |
0:57:36.290 --> 0:57:44.027 | |
You take the source sentence and parse it into | |
a semantic representation, which hopefully | |
0:57:44.027 --> 0:57:44.448 | |
consists | |
0:57:44.384 --> 0:57:50.100 | |
only of the meaning of what is said, and then | |
you can generate into any other language, because | |
0:57:50.100 --> 0:57:55.335 | |
it is the meaning, and then you need a | |
generation component which can generate all other languages. | |
0:57:57.077 --> 0:58:09.248 | |
The idea is somewhat nice to have this type | |
of interlingua, general representation of all | |
0:58:09.248 --> 0:58:17.092 | |
meanings, and they always translate into the | |
interlingua. | |
0:58:17.177 --> 0:58:19.189 | |
A Little World and It's Been Somewhere. | |
0:58:20.580 --> 0:58:26.684 | |
It shouldn't be a natural language because | |
it shouldn't have ambiguities so that's a big | |
0:58:26.684 --> 0:58:32.995 | |
difference: the source and the target language | |
have ambiguities, so the idea is they do some | |
0:58:32.995 --> 0:58:39.648 | |
semantic representation or what does it mean | |
and so on and therefore it's very easy to generate. | |
0:58:41.962 --> 0:58:45.176 | |
However, the challenge is whether this really | |
exists. | |
0:58:45.176 --> 0:58:48.628 | |
You cannot define such a language for everything | |
in the world. | |
0:58:49.249 --> 0:58:56.867 | |
And that's why the interlingua-based approach typically | |
worked for small domains, like hotel reservation; | |
0:58:56.867 --> 0:59:00.676 | |
but you cannot define the interlingua for everything. | |
0:59:01.061 --> 0:59:07.961 | |
There have been approaches in semantics, | |
but it's, yeah, not really possible. | |
0:59:07.961 --> 0:59:15.905 | |
So aren't there approaches to this? Because sentences | |
live in a vector space, and if everything | |
0:59:15.905 --> 0:59:20.961 | |
is optimized, they all could end up | |
in the same space. | |
0:59:21.821 --> 0:59:24.936 | |
That is exactly the question. | |
0:59:24.936 --> 0:59:35.957 | |
If you talk about neural networks, it's direct | |
translation in that you're putting in the | |
0:59:35.957 --> 0:59:36.796 | |
input. | |
0:59:36.957 --> 0:59:44.061 | |
And you can argue for both that we have been | |
making this representation language agnostic | |
0:59:44.061 --> 0:59:45.324 | |
or independent. | |
0:59:47.227 --> 0:59:52.912 | |
Until now we were able to make it less language | |
dependent but it's very hard to make it completely | |
0:59:52.912 --> 0:59:54.175 | |
language independent. | |
0:59:54.175 --> 0:59:59.286 | |
Maybe it's also not necessary, and of course | |
there's again the problem that not all | |
0:59:59.286 --> 1:00:04.798 | |
information is in the source and the target; there | |
are different types of information, and if you remove | |
1:00:04.798 --> 1:00:05.602 | |
all language. | |
1:00:05.585 --> 1:00:09.408 | |
information, it might be that you have removed | |
too much information. | |
1:00:10.290 --> 1:00:15.280 | |
We'll talk about this; there's a very interesting | |
research direction in which we are working, | |
1:00:15.280 --> 1:00:20.325 | |
on the multilingual part, because it is | |
especially the case if we have several source | |
1:00:20.325 --> 1:00:25.205 | |
languages, several types of languages, that we try | |
to generate a representation in the middle | |
1:00:25.205 --> 1:00:27.422 | |
which has as little language dependence as possible. | |
1:00:32.752 --> 1:00:46.173 | |
Yes, so for the direct approach: as | |
said, the first one is the dictionary-based approach. | |
1:00:46.806 --> 1:00:48.805 | |
Replace some words with other words. | |
1:00:48.805 --> 1:00:51.345 | |
Then you have exactly the same structure. | |
1:00:51.771 --> 1:00:55.334 | |
Another problem is the assumed one-to-one correspondence. | |
1:00:55.334 --> 1:01:01.686 | |
Some phrases are expressed with several words | |
in English, but one word in German. | |
1:01:01.686 --> 1:01:03.777 | |
That's extremely the case. | |
1:01:03.777 --> 1:01:07.805 | |
Just think about all our compounds like the | |
Donau. | |
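The dictionary-based direct approach described above fits in a few lines; the tiny German-English lexicon here is invented for the sketch:

```python
# A minimal dictionary-based direct translation: replace each word by a
# dictionary entry and keep the sentence structure unchanged.

lexicon = {"das": "the", "haus": "house", "ist": "is", "klein": "small"}

def direct_translate(sentence: str) -> str:
    # Unknown words are copied through unchanged.
    return " ".join(lexicon.get(w.lower(), w) for w in sentence.split())

print(direct_translate("das Haus ist klein"))  # the house is small
```

The sketch also makes the one-to-one problem visible: a German compound would need several English words, which a per-word lookup like this cannot produce.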
1:01:08.608 --> 1:01:18.787 | |
What is used very often is what is referred | |
to as a translation memory. | |
1:01:18.787 --> 1:01:25.074 | |
It might seem very simple, but it's used a lot. | |
1:01:26.406 --> 1:01:33.570 | |
That means you might think of this as not helpful | |
at all, but think about translating: | |
1:01:33.513 --> 1:01:38.701 | |
The law text is more like the interactive | |
scenario for the human translator. | |
1:01:38.701 --> 1:01:44.091 | |
In law text there is a lot of repetition and | |
a lot of phrases occur very often. | |
1:01:44.424 --> 1:01:55.412 | |
The translator just has, in the background, a translation | |
memory and retrieves all these translations. | |
1:01:55.895 --> 1:02:07.147 | |
There is even another benefit in addition | |
to less work: That is also precise in the way | |
1:02:07.147 --> 1:02:19.842 | |
you know, otherwise this can create a small mistake here | |
and there. | |
1:02:20.300 --> 1:02:22.584 | |
Especially, it's like consistency. | |
1:02:23.243 --> 1:02:32.954 | |
If you once translated the sentence this way, | |
you again translate it the same way, and especially in some | |
1:02:32.954 --> 1:02:36.903 | |
situations, like in a company, they want that. | |
1:02:37.217 --> 1:02:47.695 | |
With this one, of course, you get more consistent | |
translations. | |
1:02:47.695 --> 1:02:56.700 | |
Each company has a style from which phrases maybe are | |
retrieved. | |
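A translation memory as just described is essentially a lookup with fuzzy matching; a small sketch (the stored sentences, the `tm_lookup` name, and the 0.8 threshold are all illustrative assumptions):

```python
# Sketch of a translation memory: store previously translated sentences
# and, when the same or a very similar sentence comes up again, retrieve
# the stored translation, which also keeps translations consistent.
import difflib

memory = {
    "the contract ends today": "der Vertrag endet heute",
    "the contract is valid": "der Vertrag ist gültig",
}

def tm_lookup(sentence: str, threshold: float = 0.8):
    # Fuzzy matching: return the stored translation of the closest
    # source sentence if it is similar enough, otherwise None.
    match = difflib.get_close_matches(sentence, list(memory), n=1,
                                      cutoff=threshold)
    return memory[match[0]] if match else None

print(tm_lookup("the contract ends today"))  # der Vertrag endet heute
```

For repetitive text such as law or company documentation, a large fraction of segments hits the memory, which is exactly why translators use this in practice.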
1:03:01.861 --> 1:03:15.502 | |
Then we have these transfer based approaches | |
where we have three steps: analysis means | |
1:03:15.502 --> 1:03:25.975 | |
that you determine the syntactic structure, so | |
for example for morphology the base forms. | |
1:03:26.286 --> 1:03:37.277 | |
Then you will do a parse tree or dependency structure: | |
that this is the adjective of the noun. | |
1:03:37.917 --> 1:03:42.117 | |
Then you can do the transfer where you transfer | |
the structure to the other. | |
1:03:42.382 --> 1:03:46.633 | |
There you have to do, for example, reordering, | |
because the sentence structure is different. | |
1:03:46.987 --> 1:03:50.088 | |
In German, the adjective is before the noun. | |
1:03:50.088 --> 1:03:52.777 | |
In Spanish, it's the other way around. | |
1:03:52.777 --> 1:03:59.256 | |
You have first the noun and then the adjective, | |
and these types of reordering can be done there. | |
1:03:59.256 --> 1:04:04.633 | |
You might have to do other things like passive | |
voice to active voice and so on. | |
1:04:05.145 --> 1:04:14.074 | |
And you do some type of lexical transfer, | |
word by word. And then you are doing the | |
1:04:14.074 --> 1:04:16.014 | |
generation. | |
1:04:16.014 --> 1:04:25.551 | |
Of course, you would do the agreement if it | |
is accusative. | |
1:04:25.551 --> 1:04:29.430 | |
what form of the adjective do you need? | |
1:04:30.090 --> 1:04:32.048 | |
There is some kind of saving here. | |
1:04:32.048 --> 1:04:39.720 | |
Of course, here, because the analysis only has | |
to be done in the source language; the transfer | |
1:04:39.720 --> 1:04:41.679 | |
has to be done per language pair. | |
1:04:41.679 --> 1:04:48.289 | |
But if you look at German, English and French | |
in all directions, you only need the analysis once per language. | |
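The three transfer steps for the adjective-noun example can be sketched as a tiny pipeline; the mini-grammar and the toy dictionary are invented, not a real transfer system:

```python
# Sketch of the three transfer steps (analysis, transfer, generation) for
# the adjective-noun reordering example discussed above.

def analyze(words):
    # Analysis: tag a (noun, adjective) pair in the source-language order,
    # here a Spanish-like "noun adjective", e.g. ["casa", "bonita"].
    return {"noun": words[0], "adj": words[1]}

def transfer(struct):
    # Transfer: reorder the structure to a German-like "adjective noun"
    # and swap the lexical items via a toy dictionary.
    lex = {"casa": "Haus", "bonita": "schönes"}
    return [lex[struct["adj"]], lex[struct["noun"]]]

def generate(words):
    # Generation: produce the surface string (agreement rules, e.g. case
    # endings, would be applied at this step in a real system).
    return " ".join(words)

print(generate(transfer(analyze(["casa", "bonita"]))))  # schönes Haus
```

Note how only `transfer` depends on the language pair, while `analyze` and `generate` are per-language, which is the saving just mentioned.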
1:04:53.273 --> 1:04:59.340 | |
Then there is the interlingua approach, which is | |
really about the pure meaning, so you have | |
1:04:59.340 --> 1:05:00.751 | |
a semantic grammar. | |
1:05:01.061 --> 1:05:07.930 | |
to represent everything, and one nice implication | |
is even more extreme than before: | |
1:05:07.930 --> 1:05:15.032 | |
You don't have the transfer anymore, so if | |
you add one language, you can reuse everything you already have. | |
1:05:15.515 --> 1:05:26.188 | |
If you add the one parsing and the one generation | |
phase, you can now translate from and into all languages: So you need | |
1:05:26.188 --> 1:05:40.172 | |
components which do the analysis and components which | |
do the generation, and then you can translate between all pairs. | |
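This saving can be made concrete with a quick count. A minimal sketch in Python, assuming one analysis and one generation module per language and (for the transfer architecture) one transfer module per ordered language pair; the bookkeeping is illustrative, not from the lecture:

```python
# Rough count of modules needed to cover all translation directions
# among n languages (hypothetical bookkeeping, for illustration only).
def transfer_modules(n: int) -> int:
    # transfer: one analysis + one generation module per language,
    # plus one transfer module per ordered language pair
    return 2 * n + n * (n - 1)

def interlingua_modules(n: int) -> int:
    # interlingua: only one analysis (into the interlingua) and
    # one generation (out of it) per language, no pairwise transfer
    return 2 * n

print(transfer_modules(3), interlingua_modules(3))    # 12 6
print(transfer_modules(10), interlingua_modules(10))  # 110 20
```

The quadratic term in the transfer count is why adding a language gets cheaper with an interlingua: only two new modules instead of a transfer module for every existing language.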
1:05:41.001 --> 1:05:45.994 | |
You can also do other things like paraphrasing. | |
1:05:45.994 --> 1:05:52.236 | |
You can translate back to the source language | |
and hopefully get a paraphrase. | |
1:05:53.533 --> 1:06:05.013 | |
If you're struggling to analyze it, | |
it was also done a lot for ungrammatical speech | |
1:06:05.013 --> 1:06:11.518 | |
because the idea is you're in this clean representation. | |
1:06:12.552 --> 1:06:18.679 | |
Of course, it's a lot of work and it's only | |
realistic for limited domains. | |
1:06:20.000 --> 1:06:25.454 | |
Then we have the corpus-based approaches. | |
1:06:25.745 --> 1:06:32.486 | |
So we'll talk a lot about parallel data, | |
and what parallel data really is, is what you know | |
1:06:32.486 --> 1:06:34.634 | |
from the Rosetta Stone. | |
1:06:34.634 --> 1:06:41.227 | |
That is, you have a source sentence and you | |
have a target sentence and you know they need | |
1:06:41.227 --> 1:06:42.856 | |
to match as translations. | |
1:06:43.343 --> 1:06:46.651 | |
And that's important, so the alignment is | |
typically at a sentence level. | |
1:06:46.987 --> 1:06:50.252 | |
So you know, for each sentence what is a translation? | |
1:06:50.252 --> 1:06:55.756 | |
Not always perfect because maybe there's two | |
German sentences and one English, but at that | |
1:06:55.756 --> 1:06:57.570 | |
level it's normally possible. | |
1:06:57.570 --> 1:07:03.194 | |
At word level you can't do that because it's | |
a very complicated thing, but at sentence level that's | |
1:07:03.194 --> 1:07:04.464 | |
normally reliable. | |
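To make the sentence-level alignment idea concrete, here is a minimal sketch of a classic length-based signal (in the spirit of Gale and Church): sentence pairs with similar lengths are more likely to be mutual translations. The example sentences are made up:

```python
# Crude alignment signal: character-length ratio of a candidate
# sentence pair; close to 1.0 suggests a plausible translation pair.
def length_ratio(src: str, tgt: str) -> float:
    return min(len(src), len(tgt)) / max(len(src), len(tgt))

# Toy sentence-aligned parallel corpus (made-up sentences).
corpus = [
    ("Das Haus ist klein.", "The house is small."),
    ("Ich sehe den Hund.", "I see the dog."),
]
for de, en in corpus:
    print(round(length_ratio(de, en), 2))
```

Real sentence aligners combine a signal like this with dynamic programming over the whole document, which is how the two-to-one cases mentioned above get handled.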
1:07:05.986 --> 1:07:12.693 | |
Some type of machine learning which tries | |
to learn this mapping between sentences on the | |
1:07:12.693 --> 1:07:14.851 | |
English side and sentences on the German side. | |
1:07:15.355 --> 1:07:22.088 | |
Of course this doesn't look like a good mapping, | |
too complex, but you try to find something like | |
1:07:22.088 --> 1:07:28.894 | |
that where it's a very nice mapping, so that | |
matching things are mapped to each other, | |
1:07:28.894 --> 1:07:32.224 | |
and then if you have the English you can try. | |
1:07:32.172 --> 1:07:36.900 | |
In another English sentence you can apply | |
the same mapping and hopefully arrive at | |
1:07:36.900 --> 1:07:38.514 | |
the right sentence in German. | |
1:07:38.918 --> 1:07:41.438 | |
The big problem here. | |
1:07:41.438 --> 1:07:44.646 | |
How can we find this model? | |
1:07:44.646 --> 1:07:50.144 | |
How to map English sentences into German sentences? | |
1:07:54.374 --> 1:08:08.492 | |
How we do that is that we are trying to maximize | |
the probability, so we have all the parallel data. | |
1:08:09.109 --> 1:08:15.230 | |
Then we're having some type of model here | |
which takes the source language and translates | |
1:08:15.230 --> 1:08:16.426 | |
it into the target language. | |
1:08:16.896 --> 1:08:34.008 | |
And then we look at our translation, and we | |
are adjusting our model in a way that the probability of the correct translation is maximized. | |
1:08:34.554 --> 1:08:48.619 | |
So that is the idea behind it; how we now | |
implement that is part of the model. | |
1:08:51.131 --> 1:09:01.809 | |
And then if we want to do translation, what | |
we are doing is we are trying to find the most probable translation. | |
1:09:01.962 --> 1:09:06.297 | |
So we are scoring many possible translations. | |
1:09:06.297 --> 1:09:12.046 | |
There is an infinite number of sentences, so | |
we cannot try them all. | |
1:09:12.552 --> 1:09:18.191 | |
That may be a bit of a problem when we talk | |
about confidence because we are always trying | |
1:09:18.191 --> 1:09:19.882 | |
to find the most probable one. | |
1:09:20.440 --> 1:09:28.241 | |
And then, of course, we are not really having | |
intrinsically the possibility to say, oh, I | |
1:09:28.241 --> 1:09:31.015 | |
have no idea in this situation. | |
1:09:31.015 --> 1:09:35.782 | |
But our general model is always about how | |
can we find the most probable translation? | |
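The search idea described above can be sketched as an argmax over candidate translations. The `score` function below is a made-up stand-in for a trained model p(target | source); real systems search an enormous space with heuristics rather than enumerating candidates:

```python
# Translation as search: pick the candidate the model scores highest.
def score(source: str, candidate: str) -> float:
    # hypothetical toy model: prefer outputs with a similar word count
    return -abs(len(source.split()) - len(candidate.split()))

def translate(source: str, candidates: list[str]) -> str:
    # argmax over a finite, enumerated candidate set
    return max(candidates, key=lambda c: score(source, c))

print(translate("Das Haus ist klein",
                ["The house", "The house is small",
                 "A small house is standing here now"]))
```

Note that `max` always returns some candidate even when every score is poor, which mirrors the point above: the model has no built-in way to say "I have no idea here."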
1:09:40.440 --> 1:09:41.816 | |
Think It's. | |
1:09:42.963 --> 1:09:44.242 | |
We've got four more slides. | |
1:09:46.686 --> 1:09:52.025 | |
So just high level: this one | |
we won't cover again. | |
1:09:52.352 --> 1:10:00.808 | |
Example-based machine translation was | |
at the beginning of SMT. | |
1:10:00.808 --> 1:10:08.254 | |
The idea is that you take subparts and combine | |
them again. | |
1:10:08.568 --> 1:10:11.569 | |
So this will not be really covered here. | |
1:10:11.569 --> 1:10:15.228 | |
Then the statistical machine translation we | |
will. | |
1:10:17.077 --> 1:10:18.773 | |
Yeah, we will cover next week. | |
1:10:19.079 --> 1:10:27.594 | |
The idea there is that, | |
if we have the sentence alignment, we automatically | |
1:10:27.527 --> 1:10:34.207 | |
align the words in the sentences, and then we can learn statistical | |
models of how probable words are translated | |
1:10:34.207 --> 1:10:39.356 | |
to each other, and then the search is that we | |
create different hypotheses. | |
1:10:39.356 --> 1:10:45.200 | |
This could be a translation of this part, | |
this could be a translation of that part. | |
1:10:45.200 --> 1:10:47.496 | |
We give a score to each of them. | |
1:10:47.727 --> 1:10:51.584 | |
Statistical machine translation is where a | |
lot of work is done. | |
1:10:51.584 --> 1:10:54.155 | |
How can we score how good translation is? | |
1:10:54.494 --> 1:11:04.764 | |
How probable are the words, what type of structure occurs, | |
how is it reordered, and then based on that | |
1:11:04.764 --> 1:11:08.965 | |
we search for the best translation. | |
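A sketch of this scoring step: phrase-based SMT typically combines several feature scores (translation probability, reordering cost, language-model fluency) as a weighted sum. All feature values and weights below are made up for illustration:

```python
# Score one translation hypothesis as a weighted sum of log-scale
# feature scores (illustrative numbers only).
def hypothesis_score(features: dict[str, float],
                     weights: dict[str, float]) -> float:
    return sum(weights[name] * value for name, value in features.items())

features = {
    "translation_logprob": -2.0,  # how well the phrases match
    "reordering": -0.5,           # penalty for moving words around
    "language_model": -1.0,       # fluency of the target sentence
}
weights = {"translation_logprob": 1.0, "reordering": 0.8, "language_model": 1.2}
print(hypothesis_score(features, weights))  # -3.6
```

The search then keeps the hypotheses with the highest combined score, which is the "score each partial translation" step mentioned above.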
1:11:12.252 --> 1:11:19.127 | |
Then yeah, the one we'll cover most | |
of the time is a neural model, where we | |
1:11:19.127 --> 1:11:21.102 | |
can use neural networks. | |
1:11:21.102 --> 1:11:27.187 | |
The nice thing is everything is trained together; | |
before, we had separate components. | |
1:11:27.187 --> 1:11:30.269 | |
Each of them is trained independently. | |
1:11:30.210 --> 1:11:34.349 | |
Which of course has the disadvantage that they | |
might not work best together. | |
1:11:34.694 --> 1:11:36.601 | |
Here everything is trained together. | |
1:11:36.601 --> 1:11:39.230 | |
The continuous representation will look into | |
that. | |
1:11:39.339 --> 1:11:41.846 | |
That's very helpful. | |
1:11:41.846 --> 1:11:50.426 | |
Neural networks are able to learn somehow | |
the relation between words and that's very | |
1:11:50.426 --> 1:11:57.753 | |
helpful because then we can more easily deal | |
with words which didn't occur in training. | |
1:12:00.000 --> 1:12:05.240 | |
One thing, just to relate that to the interlingua-based | |
approach. | |
1:12:05.345 --> 1:12:07.646 | |
So we have this as an artificial language. | |
1:12:07.627 --> 1:12:11.705 | |
And if you do an interlingua-based approach | |
but don't take an artificial language | |
1:12:11.731 --> 1:12:17.814 | |
With no ambiguities, but with a natural language | |
that's referred to as pivot-based MT and | |
1:12:17.814 --> 1:12:20.208 | |
can be done with all the approaches. | |
1:12:20.208 --> 1:12:25.902 | |
So the idea is instead of directly translating | |
from German to French, you first translate | |
1:12:25.902 --> 1:12:29.073 | |
from German to English and then from English | |
to. | |
1:12:29.409 --> 1:12:40.954 | |
French where the big advantage is that you | |
might have a lot more data for these two directions | |
1:12:40.954 --> 1:12:43.384 | |
than you have here. | |
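The pivot idea is simply function composition: translate into the pivot language, then out of it. The two `translate_*` functions below are hypothetical stand-ins (toy lookup tables) for two trained MT systems:

```python
# Pivot-based translation German -> French via English.
def translate_de_en(text: str) -> str:
    return {"Hallo Welt": "Hello world"}[text]        # toy system 1

def translate_en_fr(text: str) -> str:
    return {"Hello world": "Bonjour le monde"}[text]  # toy system 2

def translate_de_fr(text: str) -> str:
    # no direct German-French system is ever trained
    return translate_en_fr(translate_de_en(text))

print(translate_de_fr("Hallo Welt"))  # Bonjour le monde
```

The known downside of this composition is that errors from the first step propagate into the second, which is the price paid for using the better-resourced pair of directions.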
1:12:44.864 --> 1:12:54.666 | |
With this, thank you; are there more questions? | |
We're a bit late, I'm sorry, and then I'll see | |
1:12:54.666 --> 1:12:55.864 | |
you again. | |