WEBVTT
0:00:01.561 --> 0:00:05.186
Okay, so, um.
0:00:08.268 --> 0:00:17.655
Welcome to today's presentation, the second
class on machine translation, where we'll today
0:00:17.655 --> 0:00:25.044
do a bit of a specific topic and talk
about linguistic background.
0:00:26.226 --> 0:00:34.851
We'll cover three different parts in
the lecture.
0:00:35.615 --> 0:00:42.538
We'll first do a very, very brief introduction
to the linguistic background: what
0:00:42.538 --> 0:00:49.608
is language, what are ways of describing language,
what are the theories behind it, very, very
0:00:49.608 --> 0:00:50.123
short.
0:00:50.410 --> 0:00:57.669
I don't know, some of you have listened, I think,
to NLP in the last semester or so.
0:00:58.598 --> 0:01:02.553
There we did a much longer explanation.
0:01:02.553 --> 0:01:08.862
Here it is short, just because we are now talking
about machine translation.
0:01:09.109 --> 0:01:15.461
So it's really focused on the parts which
are important when we talk about machine translation.
0:01:15.755 --> 0:01:19.377
So for everybody who has listened to that
already, it's a bit of a repetition.
0:01:19.377 --> 0:01:19.683
Maybe.
0:01:19.980 --> 0:01:23.415
But it's really trying to look at this:
0:01:23.415 --> 0:01:31.358
these are properties of languages, and how
can they influence translation?
0:01:31.671 --> 0:01:38.928
We'll use that in the second part to discuss
why machine translation is hard, given what we
0:01:38.928 --> 0:01:40.621
know about language.
0:01:40.940 --> 0:01:47.044
We will see that there are two main things: one
is that languages might express ideas and
0:01:47.044 --> 0:01:53.279
information differently, and if they are expressed
differently in different languages we have to
0:01:53.279 --> 0:01:54.920
somehow do the transfer.
0:01:55.135 --> 0:02:02.771
And it's not purely that we know which words are
used for it; it's not that simple, and very
0:02:02.771 --> 0:02:03.664
different.
0:02:04.084 --> 0:02:10.088
And the other problem, which we mentioned last time
when talking about biases, is that there's not always the
0:02:10.088 --> 0:02:12.179
same amount of information in both languages.
0:02:12.592 --> 0:02:18.206
So it can be that there's some more information
in the one, or you can't express that little information
0:02:18.206 --> 0:02:19.039
on the target side.
0:02:19.039 --> 0:02:24.264
We had that also, for example, with the example
of the rice plant: in German we would just
0:02:24.264 --> 0:02:24.820
say rice,
0:02:24.904 --> 0:02:33.178
or in English, while in other languages you
have to distinguish between rice as a plant and rice
0:02:33.178 --> 0:02:33.724
as a dish.
0:02:34.194 --> 0:02:40.446
And then it's not always possible to directly
infer this from the surface.
0:02:41.781 --> 0:02:48.501
And if we make it to the last point (otherwise
we'll do that next Tuesday, or we'll partly
0:02:48.501 --> 0:02:55.447
do it only there), we'll describe briefly
the three main approaches to rule-based, so
0:02:55.447 --> 0:02:59.675
linguistically motivated, ways of doing machine
translation.
0:02:59.779 --> 0:03:03.680
We mentioned them last time: the direct
translation,
0:03:03.680 --> 0:03:10.318
the translation by transfer, and the interlingua-based
translation. We'll do that a bit more in detail today.
0:03:10.590 --> 0:03:27.400
But very briefly, because this is not a focus
of this class, and then next week we'll continue.
0:03:29.569 --> 0:03:31.757
Why do we think this is important?
0:03:31.757 --> 0:03:37.259
On the one hand, of course, we are dealing
with natural language, so therefore it might
0:03:37.259 --> 0:03:43.074
be good to spend a bit of time on understanding
what we are really dealing with, because this
0:03:43.074 --> 0:03:45.387
is where the challenges and problems come from.
0:03:45.785 --> 0:03:50.890
And on the other hand, this was the first
way of doing machine translation.
0:03:51.271 --> 0:04:01.520
Therefore, it's interesting to understand
what the idea behind that was, and also to later
0:04:01.520 --> 0:04:08.922
see what is done differently, and to understand
when some models fail.
0:04:13.453 --> 0:04:20.213
When we're talking about linguistics, we can
of course do that on different levels, and there are
0:04:20.213 --> 0:04:21.352
different ways.
0:04:21.521 --> 0:04:26.841
On the right side here you are seeing the
basic levels of linguistics.
0:04:27.007 --> 0:04:31.431
So we have at the bottom phonetics and
phonology.
0:04:31.431 --> 0:04:38.477
Phones we will not cover this year, because we
are mainly focusing on text input, where we
0:04:38.477 --> 0:04:42.163
directly have characters and then words.
0:04:42.642 --> 0:04:52.646
Then, what we touch on today, at least mentioning
what it is, is morphology, which is the first
0:04:52.646 --> 0:04:53.424
level.
0:04:53.833 --> 0:04:59.654
I already mentioned it a bit on Tuesday: of course
there are some languages where this
0:04:59.654 --> 0:05:05.343
is very, very basic and there are not really
a lot of rules for how you can build words.
0:05:05.343 --> 0:05:11.099
But since I assume you all have some basic
knowledge of German, there are a lot more
0:05:11.099 --> 0:05:12.537
challenges than that.
0:05:13.473 --> 0:05:20.030
You know, maybe if you're a native speaker
that's quite easy and everything is clear,
0:05:20.030 --> 0:05:26.969
but if you have to learn it, like the endings
of a word; we are famous for doing composita,
0:05:26.969 --> 0:05:29.103
putting words together.
0:05:29.103 --> 0:05:31.467
So this is like the first level.
0:05:32.332 --> 0:05:40.268
Then we have syntax, which is both on
the word and on the sentence level, and that's
0:05:40.268 --> 0:05:43.567
about the structure of the sentence.
0:05:43.567 --> 0:05:46.955
What are the functions of certain words?
0:05:47.127 --> 0:05:51.757
You might remember part-of-speech tags
from your high school time.
0:05:51.757 --> 0:05:57.481
There are nouns and adjectives and things
like that, and this is something helpful.
0:05:57.737 --> 0:06:03.933
Just imagine, in the beginning it was
used not only for rule-based but also for statistical
0:06:03.933 --> 0:06:10.538
machine translation; for example, the reordering
between languages was quite a challenging task.
0:06:10.770 --> 0:06:16.330
Especially if you have long-range reorderings,
part-of-speech information is very
0:06:16.330 --> 0:06:16.880
helpful.
0:06:16.880 --> 0:06:20.301
You know, in German you have to move
the verb
0:06:20.260 --> 0:06:26.599
to the second position; if you have Spanish,
you have to swap the noun and the adjective,
0:06:26.599 --> 0:06:30.120
so information from part of speech can be
very helpful.
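The noun-adjective reordering just mentioned can be sketched as a tiny part-of-speech-driven rule. This is a simplified illustration, not from the lecture slides; the tag names and the rule are assumptions for the sake of the example.

```python
# Sketch: swap English ADJ-NOUN order into the Spanish NOUN-ADJ order,
# using only the part-of-speech tags, not the words themselves.

def reorder_adj_noun(tagged):
    """Swap every adjacent (ADJ, NOUN) pair into (NOUN, ADJ) order."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return out

tagged = [("the", "DET"), ("red", "ADJ"), ("house", "NOUN")]
print(reorder_adj_noun(tagged))
# [('the', 'DET'), ('house', 'NOUN'), ('red', 'ADJ')]
```

The point is generalization: the rule never inspects the words, so it applies to any adjective-noun pair the tagger recognizes.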
0:06:30.410 --> 0:06:38.621
Then you have syntax-based structures, where
you have a full syntax tree; this started in rule-based systems
0:06:38.621 --> 0:06:43.695
and then it came into statistical machine translation.
0:06:44.224 --> 0:06:50.930
And it got more and more important for statistical
machine translation that you are really trying
0:06:50.930 --> 0:06:53.461
to model the whole syntax tree of a
0:06:53.413 --> 0:06:57.574
sentence in order to better match how to do
that
0:06:57.574 --> 0:07:04.335
in the target language. Yeah, syntax-based
statistical machine translation had a
0:07:04.335 --> 0:07:05.896
bit of a problem:
0:07:05.896 --> 0:07:08.422
it got better and better, and was
0:07:08.368 --> 0:07:13.349
just on the way to getting better than traditional statistical models
for some languages,
0:07:13.349 --> 0:07:18.219
but then the neural models came up, and they
were just so much better at modelling all that
0:07:18.219 --> 0:07:19.115
implicitly,
0:07:19.339 --> 0:07:23.847
so that it was never used in practice
that much.
0:07:24.304 --> 0:07:34.262
And then we'll talk about semantics, so:
what is the meaning of the words?
0:07:34.262 --> 0:07:40.007
As we saw last time, words can have different meanings.
0:07:40.260 --> 0:07:46.033
And yeah, how you represent meaning, of course,
is very challenging.
0:07:45.966 --> 0:07:53.043
And normally, formalizing this is
typically done in quite limited domains, because
0:07:53.043 --> 0:08:00.043
doing that for all possible words
has not really been achieved yet; it is very
0:08:00.043 --> 0:08:00.898
challenging.
0:08:02.882 --> 0:08:09.436
Then pragmatics: pragmatics is what
the meaning is in the context of the current situation.
0:08:09.789 --> 0:08:16.202
So one famous example there is, for example,
if you say "the light is red".
0:08:16.716 --> 0:08:21.795
The traffic light is red; typically
you don't want to tell the other person,
0:08:21.795 --> 0:08:27.458
if you're sitting in a car, that it's surprising,
"oh, the light is red"; typically you mean,
0:08:27.458 --> 0:08:30.668
"okay, you should stop, and you shouldn't pass
the light."
0:08:30.850 --> 0:08:40.994
So that is the meaning of the sentence "the light
is red" in the context of sitting in the car.
0:08:42.762 --> 0:08:51.080
So let's start with morphology; that is
the thing we are starting with, and one
0:08:51.080 --> 0:08:53.977
easy and first thing is this:
0:08:53.977 --> 0:09:02.575
of course we have to split the sentence into
words, or join characters, so that we have words.
0:09:02.942 --> 0:09:09.017
Because in most of our work on machine translation
we'll deal with some type of words.
0:09:09.449 --> 0:09:15.970
In neural machine translation, people are also working
on character-based models and subwords, but
0:09:15.970 --> 0:09:20.772
getting the basic units, the words, out of the sentence is a very
important first step.
0:09:21.421 --> 0:09:32.379
And for many languages that is quite simple;
in German, it's not that hard to determine
0:09:32.379 --> 0:09:33.639
the words.
0:09:34.234 --> 0:09:46.265
In tokenization, the main challenge, if
we are doing corpus-based methods, is that we are
0:09:46.265 --> 0:09:50.366
also dealing with things that are not normal words.
0:09:50.770 --> 0:10:06.115
And there, of course, it's getting a bit more
challenging.
0:10:13.173 --> 0:10:17.426
So that is maybe the main thing where, for
example, if you think of German
0:10:17.426 --> 0:10:19.528
tokenization, it's easy to get every word:
0:10:19.779 --> 0:10:26.159
you split at a space. But then you would
have the dot at the end joined to the last word,
0:10:26.159 --> 0:10:30.666
and of course you don't want that, because it's
a different word.
0:10:30.666 --> 0:10:37.046
The last word would not be "go", but "go" with a dot;
so what you can do is always split off the dot.
0:10:37.677 --> 0:10:45.390
But can you really do that always, or might
it sometimes be better to keep the dot attached?
0:10:47.807 --> 0:10:51.001
For example, email addresses or abbreviations.
0:10:51.001 --> 0:10:56.284
For example, "Dr.": maybe it doesn't make
sense to split off the dot, because then you
0:10:56.284 --> 0:11:01.382
would assume a new sentence starts here, but it's just the "Dr." of "doctor".
0:11:01.721 --> 0:11:08.797
Or if you have numbers, like "he's the seventh
person", "der siebte", then you don't want
0:11:08.797 --> 0:11:09.610
to split.
0:11:09.669 --> 0:11:15.333
So there are some things where it could be
a bit more difficult, but it's not really challenging.
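The dot-splitting discussion above can be sketched as a toy tokenizer. The abbreviation list and the ordinal pattern are invented placeholders for this example, not a real tool's resources.

```python
import re

# Sketch: split off a sentence-final dot, but keep it attached for known
# abbreviations ("Dr.") and for ordinals written with a dot ("2.", as in
# German "der 2. Platz").

ABBREVIATIONS = {"Dr.", "etc.", "z.B."}  # made-up, tiny placeholder list

def tokenize(sentence):
    tokens = []
    for chunk in sentence.split():
        if chunk in ABBREVIATIONS or re.fullmatch(r"\d+\.", chunk):
            tokens.append(chunk)          # keep the dot attached
        elif chunk.endswith("."):
            tokens.append(chunk[:-1])     # split the dot off as its own token
            tokens.append(".")
        else:
            tokens.append(chunk)
    return tokens

print(tokenize("Dr. Smith came 2. in the race."))
# ['Dr.', 'Smith', 'came', '2.', 'in', 'the', 'race', '.']
```

Real tokenizers use much larger abbreviation lists and language-specific rules, but the shape of the decision is the same.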
0:11:16.796 --> 0:11:23.318
In other languages it's getting a lot more
challenging, especially in Asian languages,
0:11:23.318 --> 0:11:26.882
where often there are no spaces between words.
0:11:27.147 --> 0:11:32.775
So you just have the sequence of characters,
0:11:32.775 --> 0:11:38.403
like "The quick brown fox jumps over the lazy dog" written without spaces.
0:11:38.999 --> 0:11:44.569
And then it still might be helpful to work
on something like words,
0:11:44.569 --> 0:11:48.009
but then you need something a bit more complex.
0:11:48.328 --> 0:11:55.782
And here you see we are again having our typical
problem:
0:11:55.782 --> 0:12:00.408
that means there is ambiguity.
0:12:00.600 --> 0:12:02.104
So you're seeing here:
0:12:02.104 --> 0:12:08.056
we have exactly the same sequence of characters
here and here, but depending on how we split it,
0:12:08.056 --> 0:12:12.437
it means "he is your servant" or "he is the one
who used your things",
0:12:12.437 --> 0:12:15.380
or here we have "round eyes" versus "take the air".
0:12:15.895 --> 0:12:22.953
So then, of course, this type of tokenization
gets more important, because you could already introduce
0:12:22.953 --> 0:12:27.756
errors, and you can imagine what happens if you're
doing it wrong here:
0:12:27.756 --> 0:12:34.086
if you once make a wrong decision, it's quite
difficult to recover from that wrong decision.
0:12:34.634 --> 0:12:47.088
And so in these cases, thinking about how we're
doing tokenization is an important issue.
0:12:47.127 --> 0:12:54.424
And then it might be helpful to do things
like character-based models, where we treat each
0:12:54.424 --> 0:12:56.228
character as a symbol,
0:12:56.228 --> 0:13:01.803
and, for example, make this decision later,
or never really make it.
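One classic approach to segmenting text without spaces is greedy longest-match ("maximum matching") against a dictionary. The sketch below uses English letters as a stand-in for a script without spaces; the dictionary and the example strings are invented, and real Chinese or Japanese segmenters are far more sophisticated. It also shows how a greedy decision, once wrong, cannot be recovered.

```python
# Sketch: greedy longest-match segmentation. At each position, take the
# longest dictionary word; fall back to a single character if nothing matches.

def max_match(text, dictionary):
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest candidates first
            if text[i:j] in dictionary or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens

vocab = {"table", "tab", "let", "down"}
print(max_match("tabledown", vocab))   # ['table', 'down']
print(max_match("tabletdown", vocab))  # greedy takes 'table', stranding 't'
```

The second call illustrates the ambiguity problem: a segmenter that considered "tab" + "let" would have found a cleaner split, but the greedy choice of "table" cannot be undone.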
0:13:06.306 --> 0:13:12.033
The other thing is that if we have words, they
might not be the optimal unit to
0:13:12.033 --> 0:13:18.155
work with, because it can be that we should
look into the internal structure of words; because
0:13:18.155 --> 0:13:20.986
if we have a morphologically rich language,
0:13:21.141 --> 0:13:27.100
that means we have a lot of different types
of words, and if you have many different
0:13:27.100 --> 0:13:32.552
types of words, that on the other hand means,
of course, that we have seen each of these words
0:13:32.552 --> 0:13:33.757
very infrequently.
0:13:33.793 --> 0:13:39.681
So if you only have ten words and you have
a large corpus, each word occurs more often.
0:13:39.681 --> 0:13:45.301
If you have three million different words,
then each of them will occur less often.
0:13:45.301 --> 0:13:51.055
Hopefully you know from machine learning that
it's helpful if you have seen each example
0:13:51.055 --> 0:13:51.858
very often.
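The type/token trade-off above can be made concrete in a few lines. The two tiny "corpora" here are artificial: one repeats few word types, the other (mimicking a morphologically rich language) has a distinct form almost every time.

```python
from collections import Counter

# Sketch: with a fixed number of tokens, more distinct types means fewer
# observations per type, which is bad news for a learner.

simple = "the cat saw the cat and the dog".split()
rich   = "katze kätzchen katzen hund hunde hündchen sah sahen".split()

for corpus in (simple, rich):
    counts = Counter(corpus)
    avg = len(corpus) / len(counts)
    print(f"{len(counts)} types over {len(corpus)} tokens, "
          f"{avg:.1f} occurrences per type on average")
```

With the same eight tokens, the first corpus gives 1.6 observations per type, the second only 1.0, so each form of the rich corpus is seen less often.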
0:13:52.552 --> 0:13:54.524
And so why does this happen?
0:13:54.524 --> 0:13:56.495
Why do we get so many word forms?
0:13:56.495 --> 0:14:02.410
Yeah, in some languages we have quite complex
information inside a word.
0:14:02.410 --> 0:14:09.271
So here's a word from Finnish, "talossanikinko"
or something like that, and it means "in my
0:14:09.271 --> 0:14:10.769
house, too?" as a question.
0:14:11.491 --> 0:14:15.690
So you have all this information attached
to the word.
0:14:16.036 --> 0:14:20.326
And that is, of course, an extreme case; that's
why, typically, for example, Finnish is a
0:14:20.326 --> 0:14:20.831
language
0:14:20.820 --> 0:14:26.725
where machine translation quality is less
good, because generating all these different
0:14:26.725 --> 0:14:33.110
morphological variants is a challenge. And
the additional challenge is: Finnish is typically
0:14:33.110 --> 0:14:39.564
not really low-resource, but in low-resource
languages you quite often have more difficult
0:14:39.564 --> 0:14:40.388
morphology.
0:14:40.440 --> 0:14:43.949
I mean, English is an example of a relatively
easy one.
0:14:46.066 --> 0:14:54.230
And so in general we can say that words are
composed of morphemes, and morphemes are
0:14:54.230 --> 0:15:03.069
the smallest meaning-carrying units, so normally
it means: all morphemes should have some type
0:15:03.069 --> 0:15:04.218
of meaning.
0:15:04.218 --> 0:15:09.004
For example, a single letter here does not really have a meaning.
0:15:09.289 --> 0:15:12.005
The "un" has some type of meaning:
0:15:12.005 --> 0:15:14.371
it's changing the meaning.
0:15:14.371 --> 0:15:21.468
The "ness" has the meaning that it's making
a noun out of an adjective; and then there is "happy".
0:15:21.701 --> 0:15:31.215
So each of these parts conveys some meaning,
but you cannot split them further up and still have
0:15:31.215 --> 0:15:32.156
somehow a meaning.
0:15:32.312 --> 0:15:36.589
You see that, of course, a little bit more is
happening:
0:15:36.589 --> 0:15:43.511
typically the "y" is turning into an "i", so there
can be some variation, but these are typical
0:15:43.511 --> 0:15:46.544
examples of what we have as morphemes.
0:16:02.963 --> 0:16:08.804
That is, of course, a problem, and that's the
question of how you do your splitting.
0:16:08.804 --> 0:16:15.057
But we have that problem always anyway, because
even full words can have different meanings
0:16:15.057 --> 0:16:17.806
depending on the context they're used in.
0:16:18.038 --> 0:16:24.328
So we always have to somehow have a model
which can infer or represent the meaning of
0:16:24.328 --> 0:16:25.557
the word in its context.
0:16:25.825 --> 0:16:30.917
But you are right that this problem might
get even more severe if you're splitting up.
0:16:30.917 --> 0:16:36.126
Therefore, it might not be best to go
to the very extreme, represent each letter,
0:16:36.126 --> 0:16:41.920
and have a model which works only on letters, because,
of course, a letter can have a lot of different
0:16:41.920 --> 0:16:44.202
meanings depending on where it's used.
0:16:44.524 --> 0:16:50.061
And yeah, there is no right solution, like
what the right splitting is.
0:16:50.061 --> 0:16:56.613
It depends on the language, the application,
and the amount of data you're having.
0:16:56.613 --> 0:17:01.058
For example, typically it means: the less
data you have,
0:17:01.301 --> 0:17:12.351
the more splitting you should do; if you have
more data, then you can better distinguish the different word forms.
0:17:13.653 --> 0:17:19.065
Then there are different types of morphemes.
So we typically have one stem morpheme; it's
0:17:19.065 --> 0:17:21.746
like "house" or "Tisch", so the main meaning.
0:17:21.941 --> 0:17:29.131
And then you can have functional or bound
morphemes, which can be a prefix,
0:17:29.131 --> 0:17:34.115
suffix, infix or circumfix, so it can be before,
can be after,
0:17:34.114 --> 0:17:39.416
can be inside, or can be around it, something
like "gekauft" there.
0:17:39.416 --> 0:17:45.736
Typically you would say that it's not like
two morphemes, "ge" and "t", because they both
0:17:45.736 --> 0:17:50.603
describe the function; together the "ge" and "t"
are marking the participle of "kaufen".
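The circumfix idea can be sketched in two lines. This is a deliberately naive rule that assumes only regular weak German verbs; irregular verbs and phonological adjustments (like the "-et" in "gearbeitet") are ignored.

```python
# Sketch: the German participle circumfix "ge...t" wraps around the stem,
# acting as one discontinuous morpheme (regular weak verbs only).

def participle(stem):
    return "ge" + stem + "t"

for stem in ("kauf", "mach", "spiel"):
    print(stem, "->", participle(stem))
# kauf -> gekauft, mach -> gemacht, spiel -> gespielt
```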
0:17:53.733 --> 0:18:01.209
What are people using them for? You can use
them for inflection, to describe something like
0:18:01.209 --> 0:18:03.286
tense, count, person, case.
0:18:04.604 --> 0:18:09.238
That is, yeah, if you know German, commonly
used in German.
0:18:10.991 --> 0:18:16.749
But of course there are more complicated
things; I think in some languages it also goes further.
0:18:16.749 --> 0:18:21.431
I mean, in German it only depends on the count and
person of the subject
0:18:21.431 --> 0:18:27.650
for the verb, for example; in other languages
it can also be determined by the first or the
0:18:27.650 --> 0:18:28.698
second object.
0:18:28.908 --> 0:18:35.776
So if you buy an apple or a
house, it's not only that the
0:18:35.776 --> 0:18:43.435
"kauft" depends on me, like in German, but
it can also depend on whether it's an apple
0:18:43.435 --> 0:18:44.492
or a house.
0:18:44.724 --> 0:18:48.305
And then of course you have an exploding number
of word forms.
0:18:49.409 --> 0:19:04.731
Furthermore, morphemes can be used for derivation,
so you can make other types of words from a word.
0:19:05.165 --> 0:19:06.254
And then, yeah,
0:19:06.254 --> 0:19:12.645
there is compounding, creating new words by joining
them, like "rainbow" or "waterproof", but for example
0:19:12.645 --> 0:19:19.254
in German, like "Einkaufswagen", "eiskalt" and
so on, where you can do that
0:19:19.254 --> 0:19:22.014
with nouns, adjectives and more in German.
0:19:22.282 --> 0:19:29.077
Then of course you might have additional challenges,
like the Fugen-s, where you have to add this extra "s".
0:19:32.452 --> 0:19:39.021
Yeah, then there are, of course, additional
special things.
0:19:39.639 --> 0:19:48.537
You sometimes have to put in extra stuff because
of phonology, so it's "dishes" for the plural, not "dishs".
0:19:48.537 --> 0:19:56.508
The third person singular in English
is normally an "s", but with "goes", for example, it is
0:19:56.508 --> 0:19:57.249
an "es".
0:19:57.277 --> 0:20:04.321
In German you can also have other things: "aus
Mutter wird Mütter", so you're changing
0:20:04.321 --> 0:20:11.758
the umlaut in order to express the plural; and in
other languages there is, for example, vowel harmony,
0:20:11.758 --> 0:20:17.315
where the vowels inside are changing depending
on which form you have.
0:20:17.657 --> 0:20:23.793
Which makes things more difficult, since splitting
a word into its parts doesn't really work anymore.
0:20:23.793 --> 0:20:28.070
So for "Mutter" and "Mütter", for example, that
is not really possible.
0:20:28.348 --> 0:20:36.520
The nice thing is (this is of course more of a
general observation) that often irregular things
0:20:36.520 --> 0:20:39.492
happen with words which occur frequently,
0:20:39.839 --> 0:20:52.177
so that you can have enough examples, while
the regular things you can handle by some type
0:20:52.177 --> 0:20:53.595
of rules.
0:20:55.655 --> 0:20:57.326
Yeah, this can be done.
0:20:57.557 --> 0:21:02.849
So there are tasks on this: how to do automatic
inflection, how to analyze word forms.
0:21:02.849 --> 0:21:04.548
So you give a word to a tool, and
0:21:04.548 --> 0:21:10.427
it's telling you what the possible forms
of that word are, like how they are built, and so on.
0:21:10.427 --> 0:21:15.654
And at least for the high-resource languages,
there are a lot of tools for that.
0:21:15.654 --> 0:21:18.463
Of course, if you now want to do that for
0:21:18.558 --> 0:21:24.281
some language which is very low-resourced,
it might be very difficult, and there might be
0:21:24.281 --> 0:21:25.492
no tool for it.
0:21:28.368 --> 0:21:37.652
Good. Before we are going to the next part,
about part of speech, are there any questions
0:21:37.652 --> 0:21:38.382
about this?
0:22:01.781 --> 0:22:03.187
Yeah, we'll come to that in a bit.
0:22:03.483 --> 0:22:09.108
So it's a very good question, and difficult;
and especially, we'll see that later: if you
0:22:09.108 --> 0:22:14.994
just put in words it would be very bad, because
words are put into neural networks just as
0:22:14.994 --> 0:22:15.844
some digits.
0:22:15.844 --> 0:22:21.534
Each word is mapped to an integer, and you
put that in, so the network doesn't really know anything more
0:22:21.534 --> 0:22:22.908
about the structure.
0:22:23.543 --> 0:22:29.898
What we will see, therefore, is that the most successful
approach, which is mostly used, is subword
0:22:29.898 --> 0:22:34.730
units, where we split the words. But we will do this later;
0:22:34.730 --> 0:22:40.154
I don't know if you have seen it in an advanced lecture already.
0:22:40.154 --> 0:22:44.256
We'll cover this on Tuesday.
0:22:44.364 --> 0:22:52.316
So there is an algorithm called byte pair
encoding, which is about splitting words into
0:22:52.316 --> 0:22:52.942
parts.
0:22:53.293 --> 0:23:00.078
So it's doing the splitting of words, but not
morphologically motivated, more based on
0:23:00.078 --> 0:23:00.916
frequency.
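The byte pair encoding idea just mentioned can be sketched compactly: count adjacent symbol pairs over the vocabulary and repeatedly merge the most frequent one. This is a simplified version under stated assumptions; real NMT implementations add an end-of-word marker and other details, and the word list below is a toy example.

```python
from collections import Counter

# Sketch of BPE merge learning: each word starts as a sequence of
# characters; the most frequent adjacent pair is merged into one symbol,
# and the process repeats for a fixed number of merges.

def learn_bpe(words, num_merges):
    vocab = Counter(tuple(w) for w in words)  # word as symbol tuple, with frequency
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

words = ["low", "low", "lower", "lowest", "newer", "newest"]
print(learn_bpe(words, 3))
# [('l', 'o'), ('lo', 'w'), ('low', 'e')]
```

Note how the frequent stem "low" emerges as a unit purely from pair counts, with no morphological knowledge involved.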
0:23:00.940 --> 0:23:11.312
However, it performs very well, and that's
why it's used; and there is a bit of correlation:
0:23:11.312 --> 0:23:15.529
sometimes the morphological and the count-based splits agree.
0:23:15.695 --> 0:23:20.709
So we're splitting words, and we're especially splitting
words which are infrequent, and that's
0:23:20.709 --> 0:23:23.962
maybe a good motivation for why that's good for
neural networks.
0:23:23.962 --> 0:23:28.709
That means if you have seen a word very often,
you don't need to split it, and it's easier
0:23:28.709 --> 0:23:30.043
to just process it fast.
0:23:30.690 --> 0:23:39.218
While if you have seen a word infrequently,
it is good to split it into parts so the model can
0:23:39.218 --> 0:23:39.593
generalize.
0:23:39.779 --> 0:23:47.729
So there is some way of doing it, but linguists
would say this is not a morphological analysis.
0:23:47.729 --> 0:23:53.837
That is true, but we are splitting words into
parts if they are not seen often.
0:23:59.699 --> 0:24:06.324
Yes, so another important thing about words
are the part-of-speech tags.
0:24:06.324 --> 0:24:14.881
These are the common ones: noun, verb, adjective,
adverb, determiner, pronoun, preposition, and
0:24:14.881 --> 0:24:16.077
conjunction.
0:24:16.077 --> 0:24:26.880
There are some more. They are not the same
in all languages, but, for example, there is this
0:24:26.880 --> 0:24:38.104
universal grammar which tries to define this type
of part-of-speech tag set for many languages.
0:24:38.258 --> 0:24:42.018
And then, of course, it's helping you with
generalization.
0:24:42.018 --> 0:24:48.373
There are some language rules dealing with verbs and
nouns, especially if you look at sentence structure.
0:24:48.688 --> 0:24:55.332
And so if you know the part-of-speech tag,
you can easily generalize and get these
0:24:55.332 --> 0:24:58.459
rules, or apply these rules, as you know:
0:24:58.459 --> 0:25:02.680
the verb in German is always in the second
position.
0:25:03.043 --> 0:25:10.084
So you know how to deal with verbs independently
of which word you are actually looking at.
0:25:12.272 --> 0:25:18.551
And that again, as can be seen, is ambiguous.
0:25:18.598 --> 0:25:27.171
So there are some words which can have several
part-of-speech tags.
0:25:27.171 --> 0:25:38.686
An example is the word "can",
which can be the can of beans, or "can do" something.
0:25:38.959 --> 0:25:46.021
Often this is also the case in English for related words:
0:25:46.021 --> 0:25:55.256
"access" can be the noun, the access to something, or the verb, to access something.
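A toy lexicon makes the ambiguity point concrete. The entries and tag names below are invented for illustration; a real tagger resolves such ambiguity from context rather than just listing it.

```python
# Sketch: a hand-made lexicon where some words carry more than one
# possible part-of-speech tag.

LEXICON = {
    "can":    {"NOUN", "AUX"},    # the can of beans / can do something
    "access": {"NOUN", "VERB"},   # the access / to access
    "beans":  {"NOUN"},
    "the":    {"DET"},
}

def ambiguous(words):
    """Return the words that have more than one possible tag."""
    return [w for w in words if len(LEXICON.get(w, set())) > 1]

print(ambiguous(["the", "can", "of", "beans"]))  # ['can']
```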
0:25:56.836 --> 0:26:02.877
Most words have only a single part-of-speech
tag, but there are some where it's a bit more
0:26:02.877 --> 0:26:03.731
challenging.
0:26:03.731 --> 0:26:09.640
The nice thing is, the ambiguous ones
are often words which occur more often,
0:26:09.640 --> 0:26:12.858
while for really rare words it's not that often the case.
0:26:13.473 --> 0:26:23.159
If you look at these classes, you can distinguish
open classes, where new words can appear, so
0:26:23.159 --> 0:26:25.790
we can invent new nouns.
0:26:26.926 --> 0:26:31.461
But then there are the closed classes, which
are things like determiners or pronouns.
0:26:31.461 --> 0:26:35.414
For example, it's not that you can easily
invent a new pronoun.
0:26:35.414 --> 0:26:38.901
So there is a fixed list of pronouns, and we
are using that.
0:26:38.901 --> 0:26:44.075
So it's not like tomorrow there is
something happening and then people are using
0:26:44.075 --> 0:26:44.482
a new
0:26:45.085 --> 0:26:52.426
pronoun, or new conjunctions, like "and"; it's
not that you normally invent a
0:26:52.426 --> 0:26:52.834
new one.
0:27:00.120 --> 0:27:03.391
And in addition to the part-of-speech tags,
0:27:03.391 --> 0:27:09.012
some of these parts of speech have
different properties.
0:27:09.389 --> 0:27:21.813
So, for example, for nouns and adjectives
we can have singular and plural. In other languages
0:27:21.813 --> 0:27:29.351
there is a dual, so that a word is not only
singular or plural, but also
0:27:29.351 --> 0:27:31.257
dual if it refers to exactly two.
0:27:31.631 --> 0:27:36.246
You have the gender: masculine, feminine,
neuter we know.
0:27:36.246 --> 0:27:43.912
In other languages there is animate and inanimate;
and you have the cases, like in German you have
0:27:43.912 --> 0:27:46.884
nominative, genitive, accusative, dative.
0:27:47.467 --> 0:27:57.201
And then in other languages you have more, like
Latin with the ablative.
0:27:57.497 --> 0:28:03.729
So there is more; and, yeah, there
you have no one-to-one correspondence,
0:28:03.729 --> 0:28:09.961
so it can be that there are some cases which
exist only in the one language and do not exist
0:28:09.961 --> 0:28:11.519
in the other language.
0:28:13.473 --> 0:28:20.373
For verbs we have tenses, of course, like "walk",
"is walking", "walked", "have walked", "had walked", "will
0:28:20.373 --> 0:28:21.560
walk", and so on.
0:28:21.560 --> 0:28:28.015
Interestingly, for example in Japanese, this
can also happen for adjectives, so there
0:28:28.015 --> 0:28:32.987
is a difference between "something is white"
and "something was white".
0:28:35.635 --> 0:28:41.496
There is this continuous aspect, which we do
not really have that commonly in German, and
0:28:41.496 --> 0:28:47.423
I guess, if you're German and learning
English, that's something like "she sings" and
0:28:47.423 --> 0:28:53.350
"she is singing"; of course we can express
that, but it's not commonly used, and normally
0:28:53.350 --> 0:28:55.281
we're not marking this aspect.
0:28:55.455 --> 0:28:57.240
Also about tenses:
0:28:57.240 --> 0:29:05.505
if you use the past tense in English you will also
use past tenses in German, so we have similar
0:29:05.505 --> 0:29:09.263
tenses, but the use might be different.
0:29:14.214 --> 0:29:20.710 | |
There is uncertainty like the mood in there | |
indicative. | |
0:29:20.710 --> 0:29:26.742 | |
If he were here, there's voices active and | |
passive. | |
0:29:27.607 --> 0:29:34.024 | |
Both of these exist in German | |
and English, but there is, for example, something in | |
0:29:34.024 --> 0:29:35.628 | |
Greek: the middle voice. | |
0:29:35.628 --> 0:29:42.555 | |
'I get myself taught'; so there are other phenomena | |
which might only happen in one language. | |
0:29:42.762 --> 0:29:50.101 | |
These are the different syntactic | |
structures that you can have in a language, | |
0:29:50.101 --> 0:29:57.361 | |
and there are two issues: it might | |
be that some exist only in one language and others | |
0:29:57.361 --> 0:29:58.376 | |
don't exist. | |
0:29:58.358 --> 0:30:05.219 | |
And on the other hand there is also matching, | |
so it might be that in some situations you | |
0:30:05.219 --> 0:30:07.224 | |
use different structures. | |
0:30:10.730 --> 0:30:13.759 | |
The next would be then about semantics. | |
0:30:13.759 --> 0:30:16.712 | |
Do you have any questions before that? | |
0:30:19.819 --> 0:30:31.326 | |
I'll just continue, but ask if something is unclear. | |
Besides the structure, we typically have more | |
0:30:31.326 --> 0:30:39.863 | |
ambiguities, so it can be that words themselves | |
have different meanings. | |
0:30:40.200 --> 0:30:48.115 | |
And we are typically talking about polysemy | |
and homonymy, where polysemy means that a word | |
0:30:48.115 --> 0:30:50.637 | |
can have different meanings. | |
0:30:50.690 --> 0:30:58.464 | |
So if you have the English word interest, | |
it can be that you are interested in something. | |
0:30:58.598 --> 0:31:07.051 | |
Or it can be the financial interest rate, | |
but the meanings are somehow related, because if you are | |
0:31:07.051 --> 0:31:11.002 | |
getting interest, there is some connection. | |
0:31:11.531 --> 0:31:18.158 | |
But there is also homonymy, where the meanings | |
really are not related. | |
0:31:18.458 --> 0:31:24.086 | |
So 'can' (to be able to) and 'can' (the container) don't really have anything | |
in common, so they're really very different. | |
0:31:24.324 --> 0:31:29.527 | |
And of course the boundary is not completely clear, | |
so there is no clear definition; for example, | |
0:31:29.527 --> 0:31:34.730 | |
for 'bank' you can argue that the meanings are related, | |
but others can argue that they are not, so | |
0:31:34.730 --> 0:31:39.876 | |
there are some clear cases, like 'interest', | |
there are some which are vague, and then there | |
0:31:39.876 --> 0:31:43.439 | |
are some where it's very clear again that the meanings | |
are different. | |
0:31:45.065 --> 0:31:49.994 | |
And in order to translate them, of course, | |
we might need the context to disambiguate. | |
0:31:49.994 --> 0:31:54.981 | |
That's typically where we can disambiguate, | |
and that's not only for lexical semantics, | |
0:31:54.981 --> 0:32:00.198 | |
that's generally very often that if you want | |
to disambiguate, context can be very helpful. | |
0:32:00.198 --> 0:32:03.981 | |
So: in which sentence does it occur, what general | |
knowledge do we have, who is speaking? | |
0:32:04.944 --> 0:32:09.867 | |
You can do that externally by some disambiguation | |
task. | |
0:32:09.867 --> 0:32:14.702 | |
A machine translation system will also do it | |
internally. | |
0:32:16.156 --> 0:32:21.485 | |
And sometimes you're lucky and you don't need | |
to do it because you just have the same ambiguity | |
0:32:21.485 --> 0:32:23.651 | |
in the source and the target language. | |
0:32:23.651 --> 0:32:26.815 | |
And then it doesn't matter if you think about | |
the mouse. | |
0:32:26.815 --> 0:32:31.812 | |
As I said, you don't really need to know if | |
it's a computer mouse or the living mouse you | |
0:32:31.812 --> 0:32:36.031 | |
translate from German to English because it | |
has exactly the same ambiguity. | |
0:32:40.400 --> 0:32:46.764 | |
There are also relations between words, like | |
synonyms, antonyms, hyponyms, like the is- | |
0:32:46.764 --> 0:32:50.019 | |
a relation, and meronyms, the part-of relation, like door and house. | |
0:32:50.019 --> 0:32:55.569 | |
Big and small are antonyms, and synonyms | |
mean something similar. | |
0:32:56.396 --> 0:33:03.252 | |
There are resources which try to express all | |
this linguistic information, like WordNet | |
0:33:03.252 --> 0:33:10.107 | |
or GermaNet, where you have a graph with words | |
and how they are related to each other. | |
0:33:11.131 --> 0:33:12.602 | |
Which can be helpful. | |
0:33:12.602 --> 0:33:18.690 | |
Typically these resources were more used in tasks | |
where there is less data; there are a lot | |
0:33:18.690 --> 0:33:24.510 | |
of tasks in NLP where you have very limited | |
data, because you really need to hand-annotate | |
0:33:24.510 --> 0:33:24.911 | |
that. | |
0:33:25.125 --> 0:33:28.024 | |
Machine translation has a big advantage. | |
0:33:28.024 --> 0:33:31.842 | |
There's naturally a lot of text translated | |
out there. | |
0:33:32.212 --> 0:33:39.519 | |
Typically in machine translation we have, compared | |
to other tasks, a significant amount of data. | |
0:33:39.519 --> 0:33:46.212 | |
People have looked into integrating wordnet | |
or things like that, but it is rarely used | |
0:33:46.212 --> 0:33:49.366 | |
in like commercial systems or something. | |
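As a toy illustration of what such a resource encodes, relations like the ones above can be stored as a small graph; the entries here are hand-made for illustration, not taken from WordNet or GermaNet:

```python
# Hand-built mini-graph of lexical relations (illustrative entries only).
RELATIONS = {
    ("dog", "animal"): "hypernym",   # the is-a relation
    ("door", "house"): "meronym",    # the part-of relation
    ("big", "small"): "antonym",
    ("big", "large"): "synonym",
}

def related(w1, w2):
    """Look up the relation between two words, in either direction."""
    return RELATIONS.get((w1, w2)) or RELATIONS.get((w2, w1))

print(related("dog", "animal"))  # hypernym
print(related("house", "door"))  # meronym
```

Real resources store tens of thousands of such nodes and edges, which is exactly the hand-annotation effort mentioned above.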
0:33:52.692 --> 0:33:55.626 | |
So this was based on the words. | |
0:33:55.626 --> 0:34:03.877 | |
We have morphology, syntax, and semantics, | |
and then of course it makes sense to also look | |
0:34:03.877 --> 0:34:06.169 | |
at the bigger structure. | |
0:34:06.169 --> 0:34:08.920 | |
That means information about the sentence as a whole. | |
0:34:08.948 --> 0:34:17.822 | |
Of course, we don't really have morphology | |
there, because morphology is about the structure | |
0:34:17.822 --> 0:34:26.104 | |
of words, but we have syntax on the sentence | |
level and the semantic representation. | |
0:34:28.548 --> 0:34:35.637 | |
When we are thinking about the sentence structure, | |
then the sentence is, of course, first a sequence | |
0:34:35.637 --> 0:34:37.742 | |
of words terminated by a dot. | |
0:34:37.742 --> 0:34:42.515 | |
Jane bought the house and we can say something | |
about the structure. | |
0:34:42.515 --> 0:34:47.077 | |
It's typically subject, verb, and then one | |
or several objects. | |
0:34:47.367 --> 0:34:51.996 | |
And the number of objects, for example, is | |
then determined by the word. | |
0:34:52.232 --> 0:34:54.317 | |
It's called the valency. | |
0:34:54.354 --> 0:35:01.410 | |
So you have intransitive verbs which don't | |
take any object; it's just 'to sleep'. | |
0:35:02.622 --> 0:35:05.912 | |
For example, there is no object: 'sleeps beds'. | |
0:35:05.912 --> 0:35:14.857 | |
You cannot say that. And there are transitive | |
verbs where you have to put one or more objects, | |
0:35:14.857 --> 0:35:16.221 | |
and you always need them. | |
0:35:16.636 --> 0:35:19.248 | |
The sentence is not correct if you don't put the | |
object. | |
0:35:19.599 --> 0:35:33.909 | |
So if you use 'to buy' you have to | |
say you bought something, or you give someone something. | |
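The valency idea can be sketched as a simple lookup; the lexicon entries below are illustrative assumptions, not taken from a real valency dictionary:

```python
# Toy valency lexicon: how many objects each verb requires.
VALENCY = {"sleep": 0, "buy": 1, "give": 2}

def check_valency(verb, objects):
    """True if the number of objects matches the verb's valency."""
    return VALENCY.get(verb) == len(objects)

print(check_valency("sleep", []))                # True:  'Jane sleeps'
print(check_valency("sleep", ["beds"]))          # False: 'Jane sleeps beds'
print(check_valency("give", ["him", "a book"]))  # True:  'Jane gives him a book'
```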
0:35:34.194 --> 0:35:40.683 | |
Here you see the maybe interesting | |
relation between word order and morphology. | |
0:35:40.683 --> 0:35:47.243 | |
Of course it's not that strong, but for example | |
in English you always have to first say who | |
0:35:47.243 --> 0:35:49.453 | |
you gave it and what you gave. | |
0:35:49.453 --> 0:35:53.304 | |
So the structure is very clear and cannot | |
be changed. | |
0:35:54.154 --> 0:36:00.801 | |
German, for example, has the possibility of | |
marking what you gave and to whom you gave | |
0:36:00.801 --> 0:36:07.913 | |
it, because there is morphology and you can | |
give what you gave a different case form than to whom | |
0:36:07.913 --> 0:36:08.685 | |
you gave. | |
0:36:11.691 --> 0:36:18.477 | |
And that is a general tendency that if you | |
have morphology then typically the word order | |
0:36:18.477 --> 0:36:25.262 | |
is more free, while in English | |
you cannot express this information through | |
0:36:25.262 --> 0:36:26.482 | |
the morphology. | |
0:36:26.706 --> 0:36:30.238 | |
You typically have to express them through | |
the word order. | |
0:36:30.238 --> 0:36:32.872 | |
It's not as free, but it's more restricted. | |
0:36:35.015 --> 0:36:40.060 | |
Yeah, the first part is typically the noun | |
phrase, the subject, and that can not only | |
0:36:40.060 --> 0:36:43.521 | |
be a single noun, but of course it can be a | |
longer phrase. | |
0:36:43.521 --> 0:36:48.860 | |
So if you have Jane the woman, it can be Jane, | |
it can be the woman, it can be a woman, it can | |
0:36:48.860 --> 0:36:52.791 | |
be the young woman or the young woman who lives | |
across the street. | |
0:36:53.073 --> 0:36:56.890 | |
All of these are the subjects, so this can | |
be already very, very long. | |
0:36:57.257 --> 0:36:58.921 | |
And they also put this. | |
0:36:58.921 --> 0:37:05.092 | |
The verb is on the second position in a bit | |
more complicated way because if you have now | |
0:37:05.092 --> 0:37:11.262 | |
the young woman who lives across the street | |
runs to somewhere or so then yeah runs is at | |
0:37:11.262 --> 0:37:16.185 | |
the second position in this tree, but the first | |
constituent is quite long. | |
0:37:16.476 --> 0:37:19.277 | |
So it's not just counting words. | |
0:37:19.277 --> 0:37:22.700 | |
The second word is not always the verb. | |
0:37:26.306 --> 0:37:32.681 | |
Additional to these simple things, there's | |
more complex stuff. | |
0:37:32.681 --> 0:37:43.104 | |
Jane bought the house from Jim without hesitation, | |
or Jane bought the house in the posh neighborhood | |
0:37:43.104 --> 0:37:44.925 | |
across the river. | |
0:37:45.145 --> 0:37:51.694 | |
And these often lead to additional ambiguities | |
because it's not always completely clear to | |
0:37:51.694 --> 0:37:53.565 | |
which part this prepositional phrase attaches. | |
0:37:54.054 --> 0:37:59.076 | |
So that we'll see and you have, of course, | |
sub-clauses and so on. | |
0:38:01.061 --> 0:38:09.926 | |
And then there is a theory behind it which | |
was very important for rule based machine translation | |
0:38:09.926 --> 0:38:14.314 | |
because that's exactly what you're doing there. | |
0:38:14.314 --> 0:38:18.609 | |
You would take the sentence and do the syntactic analysis. | |
0:38:18.979 --> 0:38:28.432 | |
So we have these constituents, which | |
describe the basic parts of the language. | |
0:38:28.468 --> 0:38:35.268 | |
And we can create the sentence structure as | |
a context free grammar, which you hopefully | |
0:38:35.268 --> 0:38:42.223 | |
remember from basic computer science, which | |
is a tuple of non-terminals, terminal symbols, | |
0:38:42.223 --> 0:38:44.001 | |
production rules, | |
0:38:43.943 --> 0:38:50.218 | |
and the start symbol; you can then describe | |
a sentence by this phrase structure grammar: | |
0:38:51.751 --> 0:38:59.628 | |
So a simple example would be something like | |
that: you have a lexicon, Jane is a noun, house | |
0:38:59.628 --> 0:39:02.367 | |
is a noun, Telescope is a noun. | |
0:39:02.782 --> 0:39:10.318 | |
And then you have these production rules: a sentence is | |
a noun phrase plus a verb phrase. | |
0:39:10.318 --> 0:39:18.918 | |
The noun phrase can either be a determiner and a | |
noun, or it can be a noun phrase and a prepositional | |
0:39:18.918 --> 0:39:19.628 | |
phrase. | |
0:39:19.919 --> 0:39:25.569 | |
And a prepositional | |
phrase is a preposition and a noun phrase. | |
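The toy grammar just described can be run with a small CYK-style chart parser; the rules and lexicon below are a hand-written approximation of the grammar from the lecture, written in the binary form CYK needs:

```python
# Minimal CYK chart parser for the toy phrase-structure grammar.
LEXICON = {
    "Jane": {"NP"}, "buys": {"V"}, "a": {"Det"}, "the": {"Det"},
    "house": {"N"}, "telescope": {"N"}, "with": {"P"},
}
BINARY_RULES = {            # parent: list of (left child, right child)
    "S":  [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("P", "NP")],
}

def cyk(words):
    """Return the set of non-terminals spanning the whole sentence."""
    n = len(words)
    # chart[i][j] holds the non-terminals that can span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, rhss in BINARY_RULES.items():
                    for left, right in rhss:
                        if left in chart[i][k] and right in chart[k][j]:
                            chart[i][j].add(parent)
    return chart[0][n]

print(cyk("Jane buys a house".split()))  # {'S'}
```

Note the grammar is already in binary (Chomsky normal) form; a general CFG would first have to be binarized.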
0:39:26.426 --> 0:39:27.622 | |
We're looking at this. | |
0:39:27.622 --> 0:39:30.482 | |
What is the valency of the verb we're describing | |
here? | |
0:39:33.513 --> 0:39:36.330 | |
How many objects would the verb have in this | |
case? | |
0:39:46.706 --> 0:39:48.810 | |
We're looking at the verb phrase. | |
0:39:48.810 --> 0:39:54.358 | |
The verb phrase is a verb and a noun phrase, | |
so one object here, so this would be for a | |
0:39:54.358 --> 0:39:55.378 | |
valency of one. | |
0:39:55.378 --> 0:40:00.925 | |
If you have intransitive verbs, the verb phrase would be | |
just a verb, and if you have | |
0:40:00.925 --> 0:40:03.667 | |
two, it would be verb, noun phrase, noun phrase. | |
0:40:08.088 --> 0:40:15.348 | |
And yeah, the challenge, or | |
what you have to do, is this: given a natural | |
0:40:15.348 --> 0:40:23.657 | |
language sentence, you want to parse it to | |
get this type of parse tree; you know this from programming languages, | |
0:40:23.657 --> 0:40:30.198 | |
where you also need to parse the code in order | |
to get the representation. | |
0:40:30.330 --> 0:40:39.356 | |
However, there is one challenge if you parse | |
natural language compared to computer language. | |
0:40:43.823 --> 0:40:56.209 | |
So there are different ways of how you can | |
express things, and there can be different parse trees | |
0:40:56.209 --> 0:41:00.156 | |
belonging to the same input. | |
0:41:00.740 --> 0:41:05.241 | |
So if you have 'Jane buys a house', that's | |
an easy example. | |
0:41:05.241 --> 0:41:07.491 | |
So you do the lexicon look up. | |
0:41:07.491 --> 0:41:13.806 | |
Jane can be a noun phrase, buys is a verb, | |
a is a determiner, and house is a noun. | |
0:41:15.215 --> 0:41:18.098 | |
And then you can apply the grammar rules | |
here. | |
0:41:18.098 --> 0:41:19.594 | |
There is no rule for that. | |
0:41:20.080 --> 0:41:23.564 | |
Here we have no rules, but here we have a | |
rule. | |
0:41:23.564 --> 0:41:27.920 | |
A noun is a noun phrase, so we have mapped | |
the noun to a noun phrase. | |
0:41:28.268 --> 0:41:34.012 | |
Then we can map this to the verb phrase. | |
0:41:34.012 --> 0:41:47.510 | |
We have a rule verb plus noun phrase goes to verb phrase, and | |
then we can map this to a sentence representation: | |
0:41:49.069 --> 0:41:53.042 | |
We can have that even more complex. | |
0:41:53.042 --> 0:42:01.431 | |
The woman who won the lottery yesterday bought | |
the house across the street. | |
0:42:01.431 --> 0:42:05.515 | |
The structure gets more complicated. | |
0:42:05.685 --> 0:42:12.103 | |
You now see that the verb phrase is at the | |
second position, but the noun phrase is | |
0:42:12.052 --> 0:42:18.655 | |
quite big here. And the PP phrases: it's | |
sometimes difficult where to put them, because | |
0:42:18.655 --> 0:42:25.038 | |
they can be put to the noun phrase, but in | |
other sentences they can also be put to the | |
0:42:25.038 --> 0:42:25.919 | |
verb phrase. | |
0:42:36.496 --> 0:42:38.250 | |
Yeah. | |
0:42:43.883 --> 0:42:50.321 | |
Yes, so then either it can have two tags, | |
noun or noun phrase, or you can have the extra | |
0:42:50.321 --> 0:42:50.755 | |
rule. | |
0:42:50.755 --> 0:42:57.409 | |
The noun phrase can not only be a determiner | |
and a noun; a noun alone can also be a noun phrase. | |
0:42:57.717 --> 0:43:04.360 | |
Then of course either you introduce additional | |
rules for what is possible, or you have the problem | |
0:43:04.360 --> 0:43:11.446 | |
that you produce parse trees which are not correct, | |
and then you have to add some type of probability | |
0:43:11.446 --> 0:43:13.587 | |
saying which parse is more probable. | |
0:43:16.876 --> 0:43:23.280 | |
But of course some things can't really be | |
modeled easily with this type of trees. | |
0:43:23.923 --> 0:43:32.095 | |
For example, agreement is not straightforward | |
to do, so that for subject and verb you can check | |
0:43:32.095 --> 0:43:38.866 | |
that the agreement in person and number | |
is correct; | |
0:43:38.866 --> 0:43:41.279 | |
so if it's a singular subject, | |
0:43:41.561 --> 0:43:44.191 | |
it's also a singular verb, | |
0:43:44.604 --> 0:43:49.242 | |
and if it's a plural subject, | |
it's a plural verb. | |
0:43:49.489 --> 0:43:56.519 | |
Similar is the agreement between | |
determiner, adjective, and noun, so they also | |
0:43:56.519 --> 0:43:57.717 | |
have to agree. | |
0:43:57.877 --> 0:44:05.549 | |
Things like that cannot be easily done with | |
this type of grammar or this subcategorization | |
0:44:05.549 --> 0:44:13.221 | |
that you check whether the verb is transitive | |
or intransitive, and that Jane sleeps is OK, | |
0:44:13.221 --> 0:44:16.340 | |
but Jane sleeps the house is not OK. | |
0:44:16.436 --> 0:44:21.073 | |
And 'Jane bought the house' is okay, but 'Jane | |
bought' is not okay. | |
0:44:23.183 --> 0:44:29.285 | |
Furthermore, these long-range dependencies might | |
be difficult, and which word orders are allowed | |
0:44:29.285 --> 0:44:31.056 | |
and which are not allowed. | |
0:44:31.571 --> 0:44:40.011 | |
This is also not direct: you can say 'Maria | |
gibt dem Mann das Buch', 'Dem Mann gibt Maria das | |
0:44:40.011 --> 0:44:47.258 | |
Buch', 'Das Buch gibt Maria dem Mann', but 'Maria | |
dem Mann gibt das Buch' is questionable. | |
0:44:47.227 --> 0:44:55.191 | |
Yeah, which one of these is possible | |
and which not is sometimes not possible to model, | |
0:44:55.191 --> 0:44:56.164 | |
it is not simple. | |
0:44:56.876 --> 0:45:05.842 | |
Therefore, people have done more complex things, | |
like unification grammars, and tried to | |
0:45:05.842 --> 0:45:09.328 | |
model both the categories and the features of the verb. | |
0:45:09.529 --> 0:45:13.367 | |
The agreement has to be that it's, say, third person | |
singular. | |
0:45:13.367 --> 0:45:20.028 | |
You're annotating these | |
things with more information, and then you have | |
0:45:20.028 --> 0:45:25.097 | |
more complex syntactic structures in order | |
to model also these types. | |
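A minimal sketch of the unification idea with flat feature structures; the feature names ('person', 'number') are the usual textbook ones, and the entries are invented for illustration:

```python
# Unify two flat feature structures; fail (None) on conflicting values.
def unify(f1, f2):
    result = dict(f1)
    for key, value in f2.items():
        if key in result and result[key] != value:
            return None  # agreement failure
        result[key] = value
    return result

subject = {"person": 3, "number": "sg"}               # 'Jane'
print(unify(subject, {"person": 3, "number": "sg"}))  # ok:   'Jane sleeps'
print(unify(subject, {"number": "pl"}))               # None: *'Jane sleep'
```

Real unification grammars use nested feature structures and variables, but the agreement check is the same in spirit.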
0:45:28.948 --> 0:45:33.137 | |
Yeah, why is this difficult? | |
0:45:33.873 --> 0:45:39.783 | |
We have different ambiguities, and that makes | |
it difficult: words have different parts | |
0:45:39.783 --> 0:45:43.610 | |
of speech, and take 'time flies like | |
an arrow'. | |
0:45:43.583 --> 0:45:53.554 | |
It can mean that the animals, the flies, like | |
an arrow, or it can mean that the time | |
0:45:53.554 --> 0:45:59.948 | |
is flying, is passing very fast, | |
like an arrow. | |
0:46:00.220 --> 0:46:10.473 | |
And if you want to build a parse tree, these two | |
meanings have different part-of-speech tags, | |
0:46:10.473 --> 0:46:13.008 | |
so in one reading 'flies' is the verb. | |
0:46:13.373 --> 0:46:17.999 | |
And of course that is a different semantic, | |
and so that is very different. | |
0:46:19.499 --> 0:46:23.361 | |
And then there is structural | |
0:46:23.243 --> 0:46:32.419 | |
ambiguity, so that some part of the sentence | |
can have different roles; the famous thing | |
0:46:32.419 --> 0:46:34.350 | |
is the PP attachment. | |
0:46:34.514 --> 0:46:39.724 | |
So: 'the cop saw the burglar with the binoculars'. | |
0:46:39.724 --> 0:46:48.038 | |
Then 'with the binoculars' can be attached to 'saw', | |
or it can be attached to 'the burglar'. | |
0:46:48.448 --> 0:46:59.897 | |
And here it's more probable | |
that he saw the thief with them, and not that the thief | |
0:46:59.897 --> 0:47:01.570 | |
has the binoculars. | |
0:47:01.982 --> 0:47:13.356 | |
And this, of course, makes parsing difficult, | |
because by choosing the structure you are implicitly | |
0:47:13.356 --> 0:47:16.424 | |
defining the semantics. | |
0:47:20.120 --> 0:47:29.736 | |
Therefore, we would then go directly to semantics, | |
but maybe there are some questions about syntax and | |
0:47:29.736 --> 0:47:31.373 | |
how that works. | |
0:47:33.113 --> 0:47:46.647 | |
Then we'll do a bit more about semantics, | |
so far we have only described the structure of the | |
0:47:46.647 --> 0:47:48.203 | |
sentence. | |
0:47:48.408 --> 0:47:55.584 | |
And for the meaning of the sentence we typically | |
have the compositionality of meaning. | |
0:47:55.584 --> 0:48:03.091 | |
The meaning of the full sentence is determined | |
by the meaning of the individual words, and | |
0:48:03.091 --> 0:48:06.308 | |
they together form the meaning of the sentence. | |
0:48:06.686 --> 0:48:17.936 | |
For words that is partly true, but not always; | |
for things like 'rainbow' you can join 'rain' | |
0:48:17.936 --> 0:48:19.086 | |
and bow. | |
0:48:19.319 --> 0:48:26.020 | |
But this is not always a case, while for sentences | |
typically that is happening because you can't | |
0:48:26.020 --> 0:48:30.579 | |
directly determine the full meaning, but you | |
split it into parts. | |
0:48:30.590 --> 0:48:36.164 | |
Sometimes it holds only in parts, like the | |
expression 'kick the bucket'. | |
0:48:36.164 --> 0:48:43.596 | |
Of course you cannot get the meaning of 'kick | |
the bucket' by looking at the individual words, or | |
0:48:43.596 --> 0:48:46.130 | |
in German 'ins Gras beißen'. | |
0:48:47.207 --> 0:48:53.763 | |
You cannot get that he died by looking at | |
the individual words of 'ins Gras beißen', but | |
0:48:53.763 --> 0:48:54.611 | |
together they have this meaning. | |
0:48:55.195 --> 0:49:10.264 | |
And there are different ways of describing | |
that; some have been tried and are more commonly | |
0:49:10.264 --> 0:49:13.781 | |
used for some tasks. | |
0:49:14.654 --> 0:49:20.073 | |
We will come to that; the first thing would be something | |
like first-order logic. | |
0:49:20.073 --> 0:49:27.297 | |
If you have Peter loves Jane then you have | |
this meaning, and you have the representation | |
0:49:27.297 --> 0:49:33.005 | |
that you have a 'loves' predicate between Peter | |
and Jane, and you try to construct that. | |
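A toy sketch of composing such a first-order-logic style term from per-word meanings; the lexicon and the string format are assumptions made up for illustration:

```python
# Compose a predicate term for a simple subject-verb-object sentence.
lexicon = {
    "Peter": "peter",
    "Jane": "jane",
    "loves": lambda subj, obj: f"loves({subj}, {obj})",
}

def interpret(sentence):
    """Very naive composition: assumes exactly 'Subject Verb Object'."""
    subj, verb, obj = sentence.split()
    return lexicon[verb](lexicon[subj], lexicon[obj])

print(interpret("Peter loves Jane"))  # loves(peter, jane)
```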
0:49:32.953 --> 0:49:40.606 | |
That you will see this a lot more complex | |
than directly than only doing syntax but also | |
0:49:40.606 --> 0:49:43.650 | |
doing this type of representation. | |
0:49:44.164 --> 0:49:47.761 | |
The other thing is to try to do frame semantics. | |
0:49:47.867 --> 0:49:55.094 | |
That means that you try to represent the knowledge | |
about the world, and you have these frames. | |
0:49:55.094 --> 0:49:58.372 | |
For example, you might have a frame to buy. | |
0:49:58.418 --> 0:50:05.030 | |
And the meaning is that you have a commercial | |
transaction. | |
0:50:05.030 --> 0:50:08.840 | |
You have a person who is selling. | |
0:50:08.969 --> 0:50:10.725 | |
You have a person who's buying. | |
0:50:11.411 --> 0:50:16.123 | |
You have something that is priced, you might | |
have a price, and so on. | |
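Such a frame can be sketched as a plain data structure; the role names follow the description above (buyer, seller, goods, price) but this is a toy, not the real FrameNet inventory:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CommercialTransactionFrame:
    """Evoked by both 'buy' and 'sell', just filled from different positions."""
    buyer: Optional[str] = None
    seller: Optional[str] = None
    goods: Optional[str] = None
    price: Optional[str] = None

# "Jane bought the house from Jim" -> roles aligned to the frame:
frame = CommercialTransactionFrame(buyer="Jane", seller="Jim", goods="the house")
print(frame.buyer, frame.goods)  # Jane the house
```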
0:50:17.237 --> 0:50:22.698 | |
And then what you are doing in semantic parsing | |
with frame semantics is: you first try to determine | |
0:50:22.902 --> 0:50:30.494 | |
which frames occur in the sentence. | |
So if it's something about buying, you | |
would first try to identify that. | |
would try to first identify. | |
0:50:33.025 --> 0:50:40.704 | |
Oh, here we have to try Brain B, which does | |
not always have to be indicated by the verb | |
0:50:40.704 --> 0:50:42.449 | |
cell or other ways. | |
0:50:42.582 --> 0:50:52.515 | |
And then you try to find out which elements | |
of this frame are in the sentence, and try | |
0:50:52.515 --> 0:50:54.228 | |
to align them. | |
0:50:56.856 --> 0:51:01.121 | |
Yeah, you have, for example, to buy and sell. | |
0:51:01.121 --> 0:51:07.239 | |
If you have a model that has frames, they | |
have the same elements. | |
0:51:09.829 --> 0:51:15.018 | |
In addition, beyond the single sentence, there | |
are also phenomena beyond the sentence level. | |
0:51:15.018 --> 0:51:20.088 | |
We're coming to this later because it's a | |
special challenge for machine translation. | |
0:51:20.088 --> 0:51:22.295 | |
There is, for example, co reference. | |
0:51:22.295 --> 0:51:27.186 | |
That means if you first mention it, it's like | |
the President of the United States. | |
0:51:27.467 --> 0:51:30.107 | |
And later you would refer to him maybe as | |
he. | |
0:51:30.510 --> 0:51:36.966 | |
And that is especially challenging in machine | |
translation because you're not always using | |
0:51:36.966 --> 0:51:38.114 | |
the same thing. | |
0:51:38.114 --> 0:51:44.355 | |
Of course, for the president, it's 'he' and | |
'er' in German, but for other things it might | |
0:51:44.355 --> 0:51:49.521 | |
be different depending on the gender in languages | |
that you refer to it. | |
0:51:55.435 --> 0:52:03.866 | |
So much for the background and the next, we | |
want to look based on the knowledge we have | |
0:52:03.866 --> 0:52:04.345 | |
now. | |
0:52:04.345 --> 0:52:10.285 | |
why is machine translation difficult? But before, | |
are there any more questions? | |
0:52:16.316 --> 0:52:22.471 | |
The first type of problem is what we refer | |
to as translation divergences. | |
0:52:22.471 --> 0:52:30.588 | |
That means that we have the same information | |
in source and target, but the problem is that | |
0:52:30.588 --> 0:52:33.442 | |
they are expressed differently. | |
0:52:33.713 --> 0:52:42.222 | |
So it is not expressed the same way, and we cannot | |
translate these things easily just by | |
0:52:42.222 --> 0:52:44.924 | |
word-by-word translation; we need something a bit more complex. | |
0:52:45.325 --> 0:52:51.324 | |
So an example is a purely structural one in | |
English: 'the delicious …'. | |
0:52:51.324 --> 0:52:59.141 | |
The adjective is before the noun, while in | |
Spanish you have to put it after the noun, | |
0:52:59.141 --> 0:53:02.413 | |
and so you have to change the word order. | |
0:53:02.983 --> 0:53:10.281 | |
So there are different ways of divergence, | |
so there can be structural divergence, which | |
0:53:10.281 --> 0:53:10.613 | |
is. | |
0:53:10.550 --> 0:53:16.121 | |
the word order, so that the order is different: | |
in German we have that especially in | |
0:53:16.121 --> 0:53:19.451 | |
the subordinate clause, while in English in the | |
subordinate clause. | |
0:53:19.451 --> 0:53:24.718 | |
The verb is also at the second position, while in | |
German it's at the end, and so you have to | |
0:53:24.718 --> 0:53:25.506 | |
move it all. | |
0:53:25.465 --> 0:53:27.222 | |
Um All Over. | |
0:53:27.487 --> 0:53:32.978 | |
It can be that it's a completely different | |
grammatical role. | |
0:53:33.253 --> 0:53:35.080 | |
So,. | |
0:53:35.595 --> 0:53:37.458 | |
You have 'you like her'. | |
0:53:38.238 --> 0:53:41.472 | |
And eh in in. | |
0:53:41.261 --> 0:53:47.708 | |
English. In Spanish it's 'a ti te gusta', which | |
means 'she' is now no longer the object | |
0:53:47.708 --> 0:53:54.509 | |
but the subject here, and 'you' is now an object, | |
and then she 'pleases' you; so you really | |
0:53:54.509 --> 0:53:58.689 | |
use a different sentence structure and you | |
have to change the structure. | |
0:53:59.139 --> 0:54:03.624 | |
It can also be a head switch. | |
0:54:03.624 --> 0:54:09.501 | |
In English you say the baby just ate. | |
0:54:09.501 --> 0:54:16.771 | |
In Spanish, literally, you say the baby 'finishes eating'. | |
0:54:16.997 --> 0:54:20.803 | |
So 'eat' is no longer the main verb; 'finish' | |
is the verb. | |
0:54:21.241 --> 0:54:30.859 | |
So you have to learn that you cannot always | |
have the same structures in your input and | |
0:54:30.859 --> 0:54:31.764 | |
output. | |
0:54:36.856 --> 0:54:42.318 | |
There are lexical divergences, like 'to swim across' versus | |
'to cross swimming'. | |
0:54:43.243 --> 0:54:57.397 | |
You have categorial divergences, where a word gets | |
turned into a noun; for example, 'to decide' becomes | |
0:54:57.397 --> 0:55:00.162 | |
'to make a decision'. | |
0:55:00.480 --> 0:55:15.427 | |
That is the one challenge, and the even bigger | |
challenge is what is referred to as translation mismatches. | |
0:55:17.017 --> 0:55:19.301 | |
That can be their lexical mismatch. | |
0:55:19.301 --> 0:55:21.395 | |
That's the fish we talked about. | |
0:55:21.395 --> 0:55:27.169 | |
If it's like the, the fish you eat or the | |
fish which is living is the two different worlds | |
0:55:27.169 --> 0:55:27.931 | |
in Spanish. | |
0:55:28.108 --> 0:55:34.334 | |
And that is sometimes not even | |
known, so even a human might not be able | |
0:55:34.334 --> 0:55:34.627 | |
to. | |
0:55:34.774 --> 0:55:40.242 | |
infer it: you maybe need to see the context, | |
you maybe need to have the sentences around, | |
0:55:40.242 --> 0:55:45.770 | |
so one problem is that at least traditional | |
machine translation works on a sentence level, | |
0:55:45.770 --> 0:55:51.663 | |
so we take each sentence and translate it independently | |
of everything else, but that's, of course, | |
0:55:51.663 --> 0:55:52.453 | |
not correct. | |
0:55:52.532 --> 0:55:59.901 | |
We will look into some ways of | |
doing document-based machine translation later. | |
0:56:00.380 --> 0:56:06.793 | |
Gender information might be a problem: | |
in English it's 'player' and you don't know | |
0:56:06.793 --> 0:56:10.139 | |
if it's 'Spieler' or 'Spielerin', or if it's not known. | |
0:56:10.330 --> 0:56:15.770 | |
But from the English, if you now generate German, | |
you should know about the speaker: | |
0:56:15.770 --> 0:56:21.830 | |
Does he know the gender or does he not know | |
the gender and then generate the right one? | |
0:56:22.082 --> 0:56:38.333 | |
So just imagine a commentator if he's talking | |
about the player and you can see if it's male | |
0:56:38.333 --> 0:56:40.276 | |
or female. | |
0:56:40.540 --> 0:56:47.801 | |
So in general the problem is that if you | |
have less information in the source and need more information | |
0:56:47.801 --> 0:56:51.928 | |
in your target, this translation doesn't really | |
work. | |
0:56:55.175 --> 0:56:59.180 | |
Another problem is what we just talked | |
about: | |
0:56:59.119 --> 0:57:01.429 | |
the coreference. | |
0:57:01.641 --> 0:57:08.818 | |
So if you refer to an object and that can | |
be across sentence boundaries then you have | |
0:57:08.818 --> 0:57:14.492 | |
to use the right pronoun and you cannot just | |
translate the pronoun. | |
0:57:14.492 --> 0:57:18.581 | |
If the baby does not thrive on raw milk boil | |
it. | |
0:57:19.079 --> 0:57:28.279 | |
And if for 'it' you now just take | |
the typical translation, it will be 'es': and that | |
0:57:28.279 --> 0:57:31.065 | |
will be wrong. | |
0:57:31.291 --> 0:57:35.784 | |
No, that will even be right, because it is | |
'das Baby'. | |
0:57:35.784 --> 0:57:42.650 | |
Yes, but I mean, you have to determine that | |
and it might be wrong at some point. | |
0:57:42.650 --> 0:57:48.753 | |
So getting this right is hard; it can be wrong, | |
yes, that is right, yeah. | |
0:57:48.908 --> 0:57:55.469 | |
Because in English both, baby and milk, | |
are referred to as 'it', so if you | |
0:57:55.469 --> 0:58:02.180 | |
use 'es' it will refer to the first one, | |
so it's correct, but in German it will be | |
0:58:02.180 --> 0:58:06.101 | |
'es', and so if you translate 'it' as 'es' it will | |
refer to the baby. | |
0:58:06.546 --> 0:58:13.808 | |
But you have to use 'sie' because milk is feminine, | |
although that is really very uncommon, because | |
0:58:13.808 --> 0:58:18.037 | |
the milk is an object, and so you would | |
rather expect 'es'. | |
0:58:18.358 --> 0:58:25.176 | |
Of course, I agree this might be a situation | |
which is a bit contrived and not a common thing, | |
0:58:25.176 --> 0:58:29.062 | |
but you can see that these things are not that | |
easy. | |
0:58:29.069 --> 0:58:31.779 | |
Another example is this: Dr. | |
0:58:31.779 --> 0:58:37.855 | |
McLean often brings his dog Champion to visit | |
with his patients. | |
0:58:37.855 --> 0:58:41.594 | |
He loves to give big wet sloppy kisses. | |
0:58:42.122 --> 0:58:58.371 | |
And there, of course, it's also important | |
if he refers to the dog or to the doctor. | |
0:58:59.779 --> 0:59:11.260 | |
Another challenging example is that we | |
don't have a fixed vocabulary; this relates | |
0:59:11.260 --> 0:59:16.501 | |
to morphology, and we can build new words. | |
0:59:16.496 --> 0:59:23.787 | |
So we can in all languages build new words | |
by just concatenating parts, like 'Brexit', | |
0:59:23.787 --> 0:59:30.570 | |
and things like that. And then, of course, also | |
words in languages don't exist | |
0:59:30.570 --> 0:59:31.578 | |
in isolation. | |
0:59:32.012 --> 0:59:41.591 | |
In German you can now use the word 'download' | |
somewhere and you can also use a morphological | |
0:59:41.591 --> 0:59:43.570 | |
operation on that. | |
0:59:43.570 --> 0:59:48.152 | |
I guess there is not even one agreed correct form. | |
0:59:48.508 --> 0:59:55.575 | |
But so you have to deal with these things, | |
and yeah, also in social media. | |
0:59:55.996 --> 1:00:00.215 | |
Most of you have maybe forgotten | |
this word already. | |
1:00:00.215 --> 1:00:02.517 | |
This was ten years ago or so. | |
1:00:02.517 --> 1:00:08.885 | |
I don't know, there was a volcano in Iceland | |
which stopped Europeans from flying. | |
1:00:09.929 --> 1:00:14.706 | |
So there are always new words coming up, and | |
you have to deal with them. | |
1:00:18.278 --> 1:00:24.041 | |
Yeah, one last thing, so some of these examples | |
we have seen are a bit artificial. | |
1:00:24.041 --> 1:00:30.429 | |
So one common example of where machine | |
translation doesn't really work is this box | |
1:00:30.429 --> 1:00:31.540 | |
was in the pen. | |
1:00:32.192 --> 1:00:36.887 | |
And maybe you would be surprised, at least | |
when you read it. | |
1:00:36.887 --> 1:00:39.441 | |
How can a box be inside a pen? | |
1:00:40.320 --> 1:00:44.175 | |
Does anybody have an explanation for how | |
the sentence can still be correct? | |
1:00:47.367 --> 1:00:51.692 | |
Maybe it's directly clear for you if your | |
English is good, yeah. | |
1:00:54.654 --> 1:01:07.377 | |
Yes, like an enclosure at a farm or a playpen | |
for small children, and that is also called | |
1:01:07.377 --> 1:01:08.254 | |
a pen. | |
1:01:08.368 --> 1:01:12.056 | |
And so you see, okay, to infer which of | |
1:01:12.056 --> 1:01:16.079 | |
these two meanings is meant is quite difficult. | |
1:01:16.436 --> 1:01:23.620 | |
But at least when I saw it, I wasn't completely | |
convinced because it's maybe not the sentence | |
1:01:23.620 --> 1:01:29.505 | |
you're using in your daily life, and some of | |
these constructions seem to be. | |
1:01:29.509 --> 1:01:35.155 | |
They are very good in showing where the problem | |
is, but the question is, does it really apply | |
1:01:35.155 --> 1:01:35.995 | |
in real life? | |
1:01:35.996 --> 1:01:42.349 | |
And therefore here some examples also that | |
we had here with a lecture translator that | |
1:01:42.349 --> 1:01:43.605 | |
really occurred. | |
1:01:43.605 --> 1:01:49.663 | |
They maybe looked simple, but you will see | |
that some of them still are happening. | |
1:01:50.050 --> 1:01:53.948 | |
And they are partly about splitting words, | |
and that is when the errors happen. | |
1:01:54.294 --> 1:01:56.816 | |
So Um. | |
1:01:56.596 --> 1:02:03.087 | |
We had a text about the numeral system in | |
German, the Zahlensystem, which got split | |
1:02:03.087 --> 1:02:07.041 | |
into subparts because otherwise we can't translate it. | |
1:02:07.367 --> 1:02:14.927 | |
And then it did only an approximate match and | |
was talking about the binary payment system | |
1:02:14.927 --> 1:02:23.270 | |
because the payment system was a lot more common | |
in the training data than the Zahlensystem. | |
1:02:23.823 --> 1:02:29.900 | |
And so there you see like rare words, which | |
don't occur that often. | |
1:02:29.900 --> 1:02:38.211 | |
They are very challenging to deal with because | |
we are good at inferring them sometimes, but | |
1:02:38.211 --> 1:02:41.250 | |
for others that's very difficult. | |
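The subword-splitting failure described above can be sketched in a few lines. This is a toy illustration, not the system actually used in the lecture translator: a naive greedy splitter with a tiny, made-up vocabulary, showing how an unseen compound like Zahlensystem gets decomposed into known pieces that may then be confused with more frequent words such as Bezahlsystem.

```python
# Toy sketch of greedy compound splitting for unknown German words.
# The vocabulary is illustrative; real systems learn subword units from data.
VOCAB = {"be", "zahlen", "system", "zahl"}

def split_compound(word: str):
    """Split a word left-to-right into the longest known subwords."""
    parts, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest match first
            if word[i:j] in VOCAB:
                parts.append(word[i:j]); i = j; break
        else:
            parts.append(word[i]); i += 1   # unknown character falls through
    return parts

print(split_compound("zahlensystem"))  # ['zahlen', 'system']
```

Once the rare word is in pieces, the translation model only sees the subwords, which is how "Zahlensystem" can drift toward the much more frequent "Bezahlsystem" in the training data.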
1:02:44.344 --> 1:02:49.605 | |
Another challenge is that, of course, handling | |
the context is very difficult. | |
1:02:50.010 --> 1:02:56.448 | |
This is also an example a bit older from also | |
the lecture translators we were translating | |
1:02:56.448 --> 1:03:01.813 | |
in a math lecture, and it was always talking | |
about the omens of the numbers. | |
1:03:02.322 --> 1:03:11.063 | |
Which doesn't make any sense at all, but the | |
German word Vorzeichen can of course mean the | |
1:03:11.063 --> 1:03:12.408 | |
sign and the omen. | |
1:03:12.732 --> 1:03:22.703 | |
And if you do not have the right domain knowledge | |
encoded in there, it might use the wrong domain | |
1:03:22.703 --> 1:03:23.869 | |
knowledge. | |
1:03:25.705 --> 1:03:31.205 | |
A more recent version of that is like here | |
from a paper where it's about translating. | |
1:03:31.205 --> 1:03:36.833 | |
We had this pivot based translation where | |
you translate maybe through English into another | |
1:03:36.833 --> 1:03:39.583 | |
because you do not have enough training data. | |
1:03:40.880 --> 1:03:48.051 | |
And we did that from Dutch to German; I guess | |
you can partly understand Dutch if you speak | |
1:03:48.051 --> 1:03:48.710 | |
German. | |
1:03:48.908 --> 1:03:56.939 | |
So we have this Dutch phrase voorbeeld geven, | |
which means to give in English. | |
1:03:56.939 --> 1:04:05.417 | |
It's correctly translated as setting an example. | |
However, when we then translate to German, it didn't | |
1:04:05.417 --> 1:04:11.524 | |
get the full context, and in German you normally | |
don't set an example, but you give an example, | |
1:04:11.524 --> 1:04:16.740 | |
and so yes, going through another language | |
you introduce additional errors. | |
1:04:19.919 --> 1:04:27.568 | |
Good, so much for this; are there more questions | |
about why this is difficult? | |
1:04:30.730 --> 1:04:35.606 | |
Then we'll start with this one. | |
1:04:35.606 --> 1:04:44.596 | |
I have to leave a bit early today in a quarter | |
of an hour. | |
1:04:44.904 --> 1:04:58.403 | |
If you look at linguistic approaches to | |
machine translation, they are typically described | |
1:04:58.403 --> 1:05:03.599 | |
by this triangle: So we can do a direct translation, | |
where you take the source language. | |
1:05:03.599 --> 1:05:09.452 | |
You do not apply a lot of the analysis we were | |
discussing today about syntax representation, | |
1:05:09.452 --> 1:05:11.096 | |
semantic representation. | |
1:05:11.551 --> 1:05:14.678 | |
But you directly translate to your target | |
text. | |
1:05:14.678 --> 1:05:16.241 | |
That's here the direct. | |
1:05:16.516 --> 1:05:19.285 | |
Then there is a transfer based approach. | |
1:05:19.285 --> 1:05:23.811 | |
Then you transfer everything over and you | |
do the text translation. | |
1:05:24.064 --> 1:05:28.354 | |
And you can do that at two levels, more at | |
the syntax level. | |
1:05:28.354 --> 1:05:34.683 | |
That means you only do syntactic analysis, | |
like you run a parser or so, or at the semantic | |
1:05:34.683 --> 1:05:37.848 | |
level where you do semantic parsing. | |
1:05:38.638 --> 1:05:51.489 | |
Then there is an interlingua based approach | |
where you don't do any transfer anymore, but | |
1:05:51.489 --> 1:05:55.099 | |
you only do an analysis. | |
1:05:57.437 --> 1:06:02.790 | |
So how does the direct transfer, the direct | |
translation, now look? | |
1:06:03.043 --> 1:06:07.031 | |
It's one of the earliest approaches. | |
1:06:07.327 --> 1:06:18.485 | |
So you do maybe some morphological analysis, | |
but not a lot, and then you do this bilingual | |
1:06:18.485 --> 1:06:20.202 | |
word mapping. | |
1:06:20.540 --> 1:06:25.067 | |
You might do some reordering or generation here. | |
1:06:25.067 --> 1:06:32.148 | |
These two steps are not really big, but you | |
are working on them. | |
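A minimal sketch of this direct word-mapping approach might look like the following. The tiny English-German dictionary and the example sentence are made up for illustration; the point is that each word is mapped in place, with no reordering, agreement, or idiom handling.

```python
# Toy sketch of direct (word-for-word) translation, the earliest MT approach.
# Unknown words simply pass through unchanged.
BILINGUAL_LEXICON = {
    "the": "die", "soup": "Suppe", "is": "ist", "delicious": "lecker",
}

def direct_translate(sentence: str) -> str:
    """Map each source word to a target word, with no reordering or agreement."""
    return " ".join(BILINGUAL_LEXICON.get(tok, tok)
                    for tok in sentence.lower().split())

print(direct_translate("the soup is delicious"))  # die Suppe ist lecker
```

This works only when source and target happen to line up word by word, which is exactly why the challenges below (word order, structural shifts, idioms) break it.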
1:06:32.672 --> 1:06:39.237 | |
And of course this might seem a first easy solution, | |
but all the challenges we have seen, that | |
1:06:39.237 --> 1:06:41.214 | |
the structure is different, | |
1:06:41.214 --> 1:06:45.449 | |
that you have to reorder and look at the agreement, | |
then don't work. | |
1:06:45.449 --> 1:06:47.638 | |
That's why this first approach fails. | |
1:06:47.827 --> 1:06:54.618 | |
So if we have different word order, structural | |
shifts or idiomatic expressions, that doesn't | |
1:06:54.618 --> 1:06:55.208 | |
really work. | |
1:06:57.797 --> 1:07:05.034 | |
Then there are these rule based approaches | |
which were more commonly used. | |
1:07:05.034 --> 1:07:15.249 | |
They might still be used somewhere: I mean, most | |
commonly neural networks are used now, but I wouldn't | |
1:07:15.249 --> 1:07:19.254 | |
be sure there is no such system out there. | |
1:07:19.719 --> 1:07:25.936 | |
And in this transfer based approach we have | |
these steps there nicely visualized in the. | |
1:07:26.406 --> 1:07:32.397 | |
Triangle: so we have the analysis of the source | |
sentence, where we then get some type of abstract | |
1:07:32.397 --> 1:07:33.416 | |
representation. | |
1:07:33.693 --> 1:07:40.010 | |
Then we are doing the transfer of the representation | |
of the source sentence into the representation | |
1:07:40.010 --> 1:07:40.263 | |
of the target sentence. | |
1:07:40.580 --> 1:07:46.754 | |
And then we have the generation where we take | |
this abstract representation and then generate the | |
1:07:46.754 --> 1:07:47.772 | |
surface forms. | |
1:07:47.772 --> 1:07:54.217 | |
For example, it might be that there are no | |
morphological variants in the abstract representation, | |
1:07:54.217 --> 1:07:56.524 | |
and we have to do this agreement. | |
1:07:56.656 --> 1:08:00.077 | |
Which components do you need? | |
1:08:01.061 --> 1:08:08.854 | |
You need monolingual source and target lexicon | |
and the corresponding grammars in order to | |
1:08:08.854 --> 1:08:12.318 | |
do both the analysis and the generation. | |
1:08:12.412 --> 1:08:18.584 | |
Then you need the bilingual dictionary in | |
order to do the lexical translation and the | |
1:08:18.584 --> 1:08:25.116 | |
bilingual transfer rules in order to transfer | |
the grammar, for example in German, into the | |
1:08:25.116 --> 1:08:28.920 | |
grammar in English, and that enables you to | |
do that. | |
1:08:29.269 --> 1:08:32.579 | |
So an example is is something like this here. | |
1:08:32.579 --> 1:08:38.193 | |
So if you're doing a syntactic transfer it | |
means you're starting with John isst einen | |
1:08:38.193 --> 1:08:38.408 | |
Apfel. | |
1:08:38.408 --> 1:08:43.014 | |
You do the analysis, then you have this | |
type of graph here. | |
1:08:43.014 --> 1:08:48.340 | |
Therefore you need your monolingual lexicon | |
and your monolingual grammar. | |
1:08:48.748 --> 1:08:59.113 | |
Then you're doing the transfer where you're | |
transferring this representation into this | |
1:08:59.113 --> 1:09:01.020 | |
representation. | |
1:09:01.681 --> 1:09:05.965 | |
So what could this type of translation then | |
look like? | |
1:09:07.607 --> 1:09:08.276 | |
Style. | |
1:09:08.276 --> 1:09:14.389 | |
We have the example of a delicious soup and | |
una sopa deliciosa. | |
1:09:14.894 --> 1:09:22.173 | |
This is your source language tree and this | |
is your target language tree and then the rules | |
1:09:22.173 --> 1:09:26.092 | |
that you need are these ones to do the transfer. | |
1:09:26.092 --> 1:09:31.211 | |
So if you have a noun phrase that also goes | |
to the noun phrase. | |
1:09:31.691 --> 1:09:44.609 | |
You see here that the switch is happening, | |
so the second position is here at the first | |
1:09:44.609 --> 1:09:46.094 | |
position. | |
1:09:46.146 --> 1:09:52.669 | |
Then you have the translation of the determiner | |
and of the words, so the dictionary entries. | |
1:09:53.053 --> 1:10:07.752 | |
And with these types of rules you can then | |
do these mappings and do the transfer between | |
1:10:07.752 --> 1:10:11.056 | |
the representation. | |
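The transfer rule described above can be sketched as follows. This is a deliberately minimal illustration, not a real transfer grammar: trees are nested tuples, there is a single reordering rule NP(DT ADJ N) -> NP(DT N ADJ), and the lexicon is the three dictionary entries from the example.

```python
# Toy sketch of syntactic transfer for "a delicious soup" -> "una sopa deliciosa".
# Trees are nested tuples: (label, children...); leaves hold the words.
LEXICON = {"a": "una", "delicious": "deliciosa", "soup": "sopa"}

def transfer(node):
    """Recursively apply the transfer rules and the bilingual dictionary."""
    label, *children = node
    if label == "NP" and [c[0] for c in children] == ["DT", "ADJ", "N"]:
        dt, adj, n = children
        children = [dt, n, adj]          # rule: NP(DT ADJ N) -> NP(DT N ADJ)
    return (label, *[transfer(c) if isinstance(c, tuple) else LEXICON[c]
                     for c in children])

src = ("NP", ("DT", "a"), ("ADJ", "delicious"), ("N", "soup"))
print(transfer(src))
# ('NP', ('DT', 'una'), ('N', 'sopa'), ('ADJ', 'deliciosa'))
```

The adjective-noun switch from the slide corresponds exactly to the reordering in the rule, while the leaf substitutions come from the dictionary entries.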
1:10:25.705 --> 1:10:32.505 | |
I think it more depends on the amount of expertise | |
you have in representing them. | |
1:10:32.505 --> 1:10:35.480 | |
The rules will get more difficult. | |
1:10:36.136 --> 1:10:42.445 | |
For example, these rule based were, so I think | |
it more depends on how difficult the structure | |
1:10:42.445 --> 1:10:42.713 | |
is. | |
1:10:42.713 --> 1:10:48.619 | |
So for generating German they were for quite | |
a long time quite successful, because modeling | |
1:10:48.619 --> 1:10:52.579 | |
all the German phenomena which are in there | |
was difficult. | |
1:10:52.953 --> 1:10:56.786 | |
And that can be done there, and it wasn't | |
easy to learn that just from data. | |
1:10:59.019 --> 1:11:07.716 | |
Think even if you think about Chinese and | |
English or so, if you have the trees there | |
1:11:07.716 --> 1:11:10.172 | |
are quite some rules involved. | |
1:11:15.775 --> 1:11:23.370 | |
Another thing is you can also try to do something | |
like that on the semantic level, which means this | |
1:11:23.370 --> 1:11:24.905 | |
gets more complex. | |
1:11:25.645 --> 1:11:31.047 | |
The transfer gets maybe a bit easier because the | |
representation, the semantic representation | |
1:11:31.047 --> 1:11:36.198 | |
between languages, is more similar, and therefore | |
the analysis gets more difficult again. | |
1:11:36.496 --> 1:11:45.869 | |
So typically if you go higher in your triangle | |
this is more work while this is less work. | |
1:11:49.729 --> 1:11:56.023 | |
So it can be then, for example, like with gustar, | |
that we have again that the order changes. | |
1:11:56.023 --> 1:12:02.182 | |
So you see the transfer rule for like is that | |
the first argument is here and the second is | |
1:12:02.182 --> 1:12:06.514 | |
there, while on the gustar side here | |
the second argument. | |
1:12:06.466 --> 1:12:11.232 | |
It is in the first position and the first | |
argument is in the second position. | |
1:12:11.511 --> 1:12:14.061 | |
So that you do, yeah, also there the | |
reordering. | |
1:12:14.354 --> 1:12:20.767 | |
In principle it is more like you have | |
a different type of formalism of representing | |
1:12:20.767 --> 1:12:27.038 | |
your sentence and therefore you need to do | |
more on one side and less on the other side. | |
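The like/gustar argument swap above can be sketched as a semantic transfer rule. This is a toy illustration under simplifying assumptions: predicates are plain tuples, and the two tiny mapping tables are invented for the example.

```python
# Toy sketch of semantic transfer: English like(experiencer, theme) maps to
# Spanish gustar(theme, experiencer) with the arguments swapped.
PRED_MAP = {"like": "gustar"}
WORD_MAP = {"I": "me", "soup": "la sopa"}

def semantic_transfer(pred, experiencer, theme):
    """Translate the predicate and swap the argument order for gustar."""
    tgt_pred = PRED_MAP[pred]
    return (tgt_pred, WORD_MAP[theme], WORD_MAP[experiencer])  # theme comes first

print(semantic_transfer("like", "I", "soup"))  # ('gustar', 'la sopa', 'me')
```

So "I like soup" comes out with the theme in first position, which is the swap shown on the slide.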
1:12:32.852 --> 1:12:42.365 | |
Then, so in general for transfer based approaches, | |
you have to first select how to represent | |
1:12:42.365 --> 1:12:44.769 | |
the syntactic structure. | |
1:12:45.165 --> 1:12:55.147 | |
There are these various abstraction levels, | |
and then you have the three components: The | |
1:12:55.147 --> 1:13:04.652 | |
disadvantage is that on the one hand you normally | |
need a lot of experts, monolingual experts, | |
1:13:04.652 --> 1:13:08.371 | |
who define how to do the transfer. | |
1:13:08.868 --> 1:13:18.860 | |
And if you're doing a new language, you have | |
to do the analysis and generation anew, and the | |
1:13:18.860 --> 1:13:19.970 | |
transfer. | |
1:13:20.400 --> 1:13:27.074 | |
So if you add one language | |
to an existing system, of course you have to | |
1:13:27.074 --> 1:13:29.624 | |
do the transfer to all the other languages. | |
1:13:32.752 --> 1:13:39.297 | |
Therefore, the other idea which people were | |
interested in is the interlingua based machine | |
1:13:39.297 --> 1:13:40.232 | |
translation. | |
1:13:40.560 --> 1:13:47.321 | |
Where the idea is that we have this intermediate | |
language with this abstract language independent | |
1:13:47.321 --> 1:13:53.530 | |
representation and so the important thing is | |
it's language independent so it's really the | |
1:13:53.530 --> 1:13:59.188 | |
same for all languages, and it's a pure meaning | |
representation, and there is no ambiguity in there. | |
1:14:00.100 --> 1:14:05.833 | |
That allows this nice translation without | |
transfer, so you just do an analysis into your | |
1:14:05.833 --> 1:14:11.695 | |
representation, and there afterwards you do | |
the generation into the other target language. | |
1:14:13.293 --> 1:14:16.953 | |
And that of course makes especially multilingual translation attractive. | |
1:14:16.953 --> 1:14:19.150 | |
It's somehow like a dream. | |
1:14:19.150 --> 1:14:25.519 | |
If you want to add a language you just need | |
to add one analysis tool and one generation | |
1:14:25.519 --> 1:14:25.959 | |
tool. | |
1:14:29.249 --> 1:14:32.279 | |
Which is not the case in the other scenario. | |
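The scaling argument behind this dream can be made concrete with a quick count, sketched below; the function names are just illustrative labels for the two architectures.

```python
# Why interlingua scales: with analysis into a shared representation and
# generation out of it, adding a language adds one analyzer and one generator,
# instead of a transfer component for every ordered language pair.

def n_transfer_components(n_languages: int) -> int:
    return n_languages * (n_languages - 1)   # one transfer per ordered pair

def n_interlingua_components(n_languages: int) -> int:
    return 2 * n_languages                   # one analyzer + one generator each

for n in (2, 5, 10):
    print(n, n_transfer_components(n), n_interlingua_components(n))
# With 10 languages: 90 transfer components vs. 20 interlingua components.
```

The quadratic-versus-linear gap is exactly why adding a language to a transfer-based system is so much more work than adding one to an interlingua-based system.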
1:14:33.193 --> 1:14:40.547 | |
However, the big challenge is in this case | |
the interlingua based representation because | |
1:14:40.547 --> 1:14:47.651 | |
you need to represent all different types of | |
knowledge in there in order to do that. | |
1:14:47.807 --> 1:14:54.371 | |
And also like world knowledge, so something | |
like that an apple is a fruit and that fruits | |
1:14:54.371 --> 1:14:57.993 | |
are edible and stuff like that. | |
1:14:58.578 --> 1:15:06.286 | |
So that is why this is typically always only | |
done for small amounts of data. | |
1:15:06.326 --> 1:15:13.106 | |
So what people have done for special applications | |
like hotel reservation people have looked into | |
1:15:13.106 --> 1:15:18.348 | |
that, but they have typically not done it for | |
the general case. | |
1:15:18.718 --> 1:15:31.640 | |
So the disadvantage is you need to represent | |
all the world knowledge in your interlingua. | |
1:15:32.092 --> 1:15:40.198 | |
And that is not possible at the moment or | |
never was possible so far. | |
1:15:40.198 --> 1:15:47.364 | |
Typically these systems were for small domains | |
like hotel reservation. | |
1:15:51.431 --> 1:15:57.926 | |
But of course this idea remains, and that's why | |
some people are interested in the question: if | |
1:15:57.926 --> 1:16:04.950 | |
you now build a neural system where you learn the | |
representation in your neural network, | |
1:16:04.950 --> 1:16:07.442 | |
is that some type of artificial | |
1:16:08.848 --> 1:16:09.620 | |
Interlingua. | |
1:16:09.620 --> 1:16:15.025 | |
However, what we at least found out until | |
now is that there's often very language specific | |
1:16:15.025 --> 1:16:15.975 | |
information in it. | |
1:16:16.196 --> 1:16:19.648 | |
And they might be important and essential. | |
1:16:19.648 --> 1:16:26.552 | |
You don't have all the information in your | |
input, so you typically can't resolve | |
1:16:26.552 --> 1:16:32.412 | |
all ambiguities inside there because you might | |
not have all information. | |
1:16:32.652 --> 1:16:37.870 | |
So in English you don't know if it's a living | |
fish or the fish which you're eating, and if | |
1:16:37.870 --> 1:16:43.087 | |
you're translating to German you also don't | |
have to resolve this problem because you have | |
1:16:43.087 --> 1:16:45.610 | |
the same ambiguity in your target language. | |
1:16:45.610 --> 1:16:50.828 | |
So why would you put in the effort of finding | |
out if it's the dish or the living fish if it's | |
1:16:50.828 --> 1:16:52.089 | |
not necessary at all? | |
1:16:54.774 --> 1:16:59.509 | |
Yeah Yeah. | |
1:17:05.585 --> 1:17:15.019 | |
The semantic transfer is not the same for | |
both languages, so you still represent the | |
1:17:15.019 --> 1:17:17.127 | |
semantics per language. | |
1:17:17.377 --> 1:17:23.685 | |
So you have the like semantic representation | |
in the gustar example, but that's not the same semantic | |
1:17:23.685 --> 1:17:28.134 | |
representation for both languages, and that's | |
the main difference. | |
1:17:35.515 --> 1:17:44.707 | |
Okay, then these are the most important things | |
for today: what is language and how rule | |
1:17:44.707 --> 1:17:46.205 | |
based systems work. | |
1:17:46.926 --> 1:17:59.337 | |
And if there are no more questions, thank you | |
for joining; we have today a bit of a shorter | |
1:17:59.337 --> 1:18:00.578 | |
lecture. | |