Start on Feb 12, 2025 | Due on Feb 26, 2025
Out on Feb 12, 2025
If you have already cloned my homework repository nlp-class-hw for previous homeworks, then go into that directory and update it:
git pull origin master
cd nlp-class-hw/neuralmt
If you don’t have that directory anymore then simply clone the repository again:
git clone https://github.com/angelxuanchang/nlp-class-hw.git
Clone your own repository from Github if you haven’t done it already:
git clone git@github.sfu.ca:USER/nlpclass-1251-g-GROUP.git
Note that the USER above is the SFU username of the person in your group that set up the Github repository.
Then copy over the contents of the neuralmt directory into your hw3 directory in your repository.
Set up the virtual environment:
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
Note that if you do not change the requirements, then after you have set up the virtual environment venv you can simply run the following command to get started with your development for the homework:
source venv/bin/activate
The data files provided are:
data/input – input files dev.txt and test.txt
data/reference/dev.out – the reference output for the dev.txt input file
The default solution is provided in default.py. To use the default as your solution:
cp default.py answer/neuralmt.py
cp default.ipynb answer/neuralmt.ipynb
python3 zipout.py # Warning: can take >10mins to translate dev and test input files
python3 check.py
The default solution will look for the pre-trained model file seq2seq_E049.pt in the data directory. You do not need to train a model for this homework.
Download the trained pipelines for English and German:
python3 -m spacy download en_core_web_sm
python3 -m spacy download de_core_news_sm
You can either download the seq2seq_E049.pt model file from:
https://drive.google.com/drive/folders/1d-cyNMrHcrxwb60EKDw8TR0_NPsWhe3l?usp=sharing
Or you can use the same file directly on CSIL from the following directory:
/home/anoop/nlp-class/neuralmt/seq2seq_E049.pt
After you implement the baseline approach, if you wish to tackle ensemble decoding, you will need additional model files to create the ensemble. The extra model files are as follows:
seq2seq_E048.pt
seq2seq_E047.pt
seq2seq_E046.pt
seq2seq_E045.pt
These model files are available from:
https://drive.google.com/drive/folders/1d-cyNMrHcrxwb60EKDw8TR0_NPsWhe3l?usp=sharing
and on CSIL in the following directory:
/home/anoop/nlp-class/neuralmt/*.pt
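The five checkpoint names follow one pattern, so their CSIL paths can be generated rather than typed out. The sketch below also shows one common form of ensemble decoding, averaging the next-word probability distributions of the models at each time step. This averaging is an assumption made for illustration, not necessarily the method expected for this homework, and the names model_paths and average_distributions as well as the toy distributions are hypothetical.

```python
# Directory on CSIL that holds the pre-trained checkpoints (from the text above).
MODEL_DIR = "/home/anoop/nlp-class/neuralmt"

# Builds the paths for seq2seq_E045.pt through seq2seq_E049.pt.
model_paths = [f"{MODEL_DIR}/seq2seq_E{epoch:03d}.pt" for epoch in range(45, 50)]


def average_distributions(distributions):
    """Average several next-word probability distributions element-wise."""
    n_models = len(distributions)
    return [sum(probs) / n_models for probs in zip(*distributions)]


# Toy example (made-up numbers): three models scoring a three-word
# vocabulary at a single decoder time step.
per_model = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
ensemble = average_distributions(per_model)

# Pick the word with the highest averaged probability.
best_word = max(range(len(ensemble)), key=ensemble.__getitem__)

print(model_paths)
print(ensemble, best_word)
```

In the toy example the first model on its own prefers word 0, while the averaged distribution prefers word 1, which is the kind of disagreement an ensemble is meant to smooth out.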
Please do not copy the file into your CSIL directory, as it is moderately large and you could go over your disk quota. Instead, modify default.py to use the full path to the file on CSIL, which is accessible from the CSIL machines, or use the -m command line option of default.py:
python3 default.py -m /home/anoop/nlp-class/neuralmt/seq2seq_E049.pt > output.txt
If you have a copy or soft link to seq2seq_E049.pt in the data directory then you can simply run:
python3 default.py > dev.out
Note that this will take 5-10 minutes depending on your machine.
You can then check the score on the dev output file dev.out by running:
zip output.zip dev.out
python3 check.py
which produces the following evaluation:
dev.out score: 1.8637
For this homework we will be scoring your solution based on the BLEU score which is described in detail in the Accuracy section below.
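As a rough illustration of what a BLEU score measures, the following sketch computes an unsmoothed corpus-level BLEU from clipped n-gram precisions and a brevity penalty. It assumes whitespace tokenisation and a single reference per sentence, and the names ngrams and corpus_bleu are hypothetical. check.py may tokenise or smooth differently, so its numbers need not match this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_toks, ref_toks = hyp.split(), ref.split()
        hyp_len += len(hyp_toks)
        ref_len += len(ref_toks)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_toks, n)
            ref_counts = ngrams(ref_toks, n)
            # Clipping: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a zero match count at any order gives BLEU = 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


perfect = corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"])
partial = corpus_bleu(["the cat sat on the mat"], ["the cat sat on a mat"])
print(perfect, partial)
```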
Make sure that the command line options are kept as they are in default.py. You can add to them but you must not delete any command line options that exist in default.py.
Submitting the default solution without modification will get you zero marks.
You are given a pre-trained sequence to sequence (seq2seq) model for neural machine translation (NMT). Also provided to you is a basic NMT implementation that loads the encoder and decoder parameters from the pre-trained model and produces a translation for input documents. Your task is to augment the NMT implementation with the correct attention module to improve the translation performance.
Attention for this homework and for the trained model(s) provided to you is defined as follows:
\[\mathrm{score}_i = W_{enc}( h^{enc}_i ) + W_{dec}( h^{dec} )\]
Define the $\alpha$ vector as follows:
\[\alpha = \mathrm{softmax}(V_{att} \mathrm{tanh} (\mathrm{score}))\]
Then we define the context vector using the $\alpha$ weights for each source side index $i$:
\[c = \sum_i \alpha_i \times h^{enc}_i\]
The context vector $c$ is combined with the current decoder hidden state $h^{dec}$ and this representation is used to compute the softmax over the target language vocabulary at the current decoder time step. We then move to the next time step and repeat this process until we produce an end of sentence marker.
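A minimal sketch of the three attention equations for a single decoder time step, written in plain Python on lists so that each term is visible. The function and helper names are hypothetical, $W_{enc}$ and $W_{dec}$ are assumed to be plain matrices and $V_{att}$ a vector, and the softmax is assumed to run over the source positions $i$. In answer/neuralmt.py the same steps would be batched tensor operations using the parameters loaded from the pre-trained model, so treat this as an illustration rather than the expected implementation.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(enc_states, dec_state, W_enc, W_dec, v_att):
    """Return (alpha, context) for one decoder time step."""
    dec_proj = matvec(W_dec, dec_state)  # W_dec(h_dec), shared by all positions
    energies = []
    for h_enc in enc_states:
        enc_proj = matvec(W_enc, h_enc)  # W_enc(h_enc_i)
        score = [e + d for e, d in zip(enc_proj, dec_proj)]
        # V_att tanh(score_i) reduces the score vector to one scalar per position.
        energies.append(sum(v * math.tanh(s) for v, s in zip(v_att, score)))
    alpha = softmax(energies)  # one weight per source position
    dim = len(enc_states[0])
    # c = sum_i alpha_i * h_enc_i
    context = [sum(a * h[k] for a, h in zip(alpha, enc_states)) for k in range(dim)]
    return alpha, context


# Toy values (made up): three source positions, hidden size 2.
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec_state = [0.5, -0.5]
W_enc = [[0.5, -0.2], [0.1, 0.3]]
W_dec = [[0.4, 0.0], [0.0, 0.4]]
v_att = [1.0, -1.0]

alpha, context = attention(enc_states, dec_state, W_enc, W_dec, v_att)
print(alpha, context)
```

The weights in alpha are positive and sum to one, so the context vector is a weighted average of the encoder hidden states.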
Implementing the attention model described above will improve your output translations as can be seen by the BLEU score:
$ python3 zipout.py # using baseline implementation
$ python3 check.py
dev.out score: 14.2427
We fixed the interface in a specific way that allows you to implement at least:
Original training data is also provided (tokenised). You may use it whichever way you want to augment the provided Seq2Seq model.
For visualisation, you can use the functions included in utils.py:
from utils import alphaPlot
# Since alpha is batched, alpha[0] refers to the first item in the batch
alpha_plot = alphaPlot(alpha[0], output, source)
This converts the alpha values into a nice attention graph.
Example code in combination with tensorboard is provided in validator.py.
This can help you visualise an entire test_iter.
In addition, default.py has an additional parameter -n.
If your inference is taking too long and you’d like to test your implementation with a subset of dev (say the first 100 samples), you can use it to do that.
You must create the following files:
answer/neuralmt.py – this is your solution to the homework. Start by copying default.py as explained above.
answer/neuralmt.ipynb – this is the iPython notebook that will be your write-up for the homework.
To create the output.zip file for upload to Coursys do:
python3 zipout.py
For more options:
python3 zipout.py -h
To check your performance on the dev set:
python3 check.py
The output score is the BLEU score of your dev output, computed against the reference output data/reference/dev.out. For more options:
python3 check.py -h
In particular use the log file to check your output evaluation:
python3 check.py -l log
The performance on data/input/test.txt will not be shown. We will evaluate your output on the test input after the submission deadline.
You should prepare a short (1-2 pages) report on what you did in this assignment. Your report should be organized into clear sections, with grammatical English (full sentences). Use figures, graphs, tables to compare results of different experiments.
The report should include the following:
Your report should be submitted as report.pdf to Crowdmark.
Using LaTeX for preparing your reports is recommended (see Overleaf for online editing of LaTeX documents), but not required.
Once you are done with your homework submit all the relevant materials to Coursys for evaluation.
Once you have a working solution in answer/neuralmt.py, create the output.zip for upload to Coursys using:
python3 zipout.py
To create the source.zip file for upload to Coursys do:
python3 zipsrc.py
You must have the following files or zipsrc.py will complain about it:
answer/neuralmt.py – this is your solution to the homework. Start by copying default.py as explained above.
answer/neuralmt.ipynb – this is the iPython notebook that will be your write-up for the homework.
In addition, each group member should write down a short description of what they did for this homework in answer/README.username.
Go to Programming Homework 3 on Coursys and do a group submission:
Upload output.zip and source.zip to Coursys
Please upload your report.pdf to Crowdmark HW3-P Report.
Only one person needs to submit for the group, but please add your group members so that they can see the submission, and specify the name of your group in the report.
source.zip matches your Github repository.
answer/neuralmt.ipynb.
answer/README.username where username is your CSIL/Github username.
The grading is split up into the following components:
answer/README.username.
Your BLEU score should be equal to or greater than the score listed for the corresponding marks.
BLEU(dev) | BLEU(test) | Marks | Grade |
2.5 | 2.0 | 0 | F |
3.0 | 2.5 | 55 | D |
4.0 | 3.0 | 60 | C- |
5.0 | 4.0 | 65 | C |
6.0 | 5.0 | 70 | C+ |
8.0 | 7.0 | 75 | B- |
9.0 | 8.0 | 80 | B |
10.0 | 9.0 | 85 | B+ |
12.0 | 11.0 | 90 | A- |
14.0 | 13.5 | 95 | A |
16.0 | 15.5 | 100 | A+ |
The score will be normalized to the marks on Coursys for the dev and test scores.