Fine-tuning Borzoi to create a Decima model¶
import glob
import anndata
import scanpy as sc
import pandas as pd
import bioframe as bf
import os
inputdir = "./data"
outdir = "./example"
ad_file_path = os.path.join(inputdir, "data.h5ad")
h5_file_path = os.path.join(outdir, "data.h5")
1. Load input anndata file¶
The input anndata file must have shape (pseudobulks × genes).
ad = sc.read(ad_file_path)
ad
AnnData object with n_obs × n_vars = 50 × 921
obs: 'cell_type', 'tissue', 'disease', 'study'
var: 'chrom', 'start', 'end', 'strand', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'dataset'
uns: 'log1p'
.obs should be a dataframe with a unique index per pseudobulk. You can also include other columns with metadata about the pseudobulks, e.g. cell type, tissue, disease, study, number of cells, total counts.
Note that the original Decima model does NOT separate pseudobulks by sample, i.e. different samples from the same cell type, tissue, disease and study were merged. We also recommend filtering out pseudobulks with few cells or low read count.
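Such a filter might look like the sketch below. Here `n_cells` is a hypothetical metadata column, the toy `obs`/`X` objects stand in for the real `ad.obs`/`ad.X`, and the thresholds are arbitrary:

```python
import numpy as np
import pandas as pd

# Toy stand-ins for ad.obs and ad.X; `n_cells` is a hypothetical metadata column.
obs = pd.DataFrame({"n_cells": [10, 200, 75]}, index=["pb_0", "pb_1", "pb_2"])
X = np.array([[5, 0], [400, 300], [50, 20]])

# Keep pseudobulks with enough cells and enough total counts (arbitrary cutoffs).
min_cells, min_counts = 50, 100
keep = (obs["n_cells"] >= min_cells) & (X.sum(axis=1) >= min_counts)
```

With a real AnnData object, the equivalent step would be subsetting with the boolean mask, e.g. `ad = ad[keep].copy()`.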
ad.obs.head()
| | cell_type | tissue | disease | study |
|---|---|---|---|---|
| pseudobulk_0 | ct_0 | t_0 | d_0 | st_0 |
| pseudobulk_1 | ct_0 | t_0 | d_1 | st_0 |
| pseudobulk_2 | ct_0 | t_0 | d_2 | st_1 |
| pseudobulk_3 | ct_0 | t_0 | d_0 | st_1 |
| pseudobulk_4 | ct_0 | t_0 | d_1 | st_2 |
.var should be a dataframe with a unique index per gene. The index can be the gene name or Ensembl ID, as long as it is unique. Other essential columns are: chrom, start, end and strand (the gene coordinates).
You can also include other columns with metadata about the genes, e.g. Ensembl ID, type of gene.
ad.var.head()
| | chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset |
|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26354840 | 26879128 | + | 26518680 | 27042968 | 524288 | 163840 | 524288 | train |
| gene_1 | chr19 | 41111417 | 41635705 | - | 40947577 | 41471865 | 524288 | 163840 | 524288 | train |
| gene_2 | chr1 | 79774026 | 80298314 | - | 79610186 | 80134474 | 524288 | 163840 | 524288 | train |
| gene_4 | chr16 | 3741368 | 4265656 | - | 3577528 | 4101816 | 524288 | 163840 | 524288 | train |
| gene_5 | chr10 | 22659481 | 23183769 | + | 22823321 | 23347609 | 524288 | 163840 | 524288 | train |
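Before proceeding, it can help to confirm that `.var` carries the essential columns and a unique index. A minimal check, using a toy DataFrame in place of `ad.var`:

```python
import pandas as pd

# Toy stand-in for ad.var with the essential columns.
var = pd.DataFrame(
    {"chrom": ["chr1"], "start": [100], "end": [200], "strand": ["+"]},
    index=["gene_0"],
)

# The essential columns named above, plus a unique index.
required = {"chrom", "start", "end", "strand"}
missing = required - set(var.columns)
index_unique = var.index.is_unique
```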
.X should contain the total counts per gene and pseudobulk. These should be non-negative integers.
ad.X[:5, :5]
array([[0. , 7.2926292, 7.2926292, 7.2926292, 7.2926292],
[7.3133874, 7.3133874, 0. , 7.3133874, 7.3133874],
[7.299993 , 7.299993 , 7.299993 , 7.299993 , 0. ],
[7.299993 , 0. , 7.299993 , 7.299993 , 0. ],
[7.3376517, 7.3376517, 0. , 7.3376517, 7.3376517]],
dtype=float32)
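Since `.X` is expected to hold raw counts, a quick check can catch matrices that are already normalized or log-transformed (as is the case for this synthetic example). A sketch on a toy array of valid counts:

```python
import numpy as np

# Toy stand-in for ad.X; with real raw counts both checks should pass.
X = np.array([[0, 7, 3], [2, 0, 5]], dtype=np.float32)
non_negative = bool((X >= 0).all())
integer_valued = bool(np.allclose(X, np.round(X)))
```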
2. Normalize and log transform data¶
We first transform the counts to log(CPM + 1) values, where CPM stands for counts per million.
sc.pp.normalize_total(ad, target_sum=1e6)
sc.pp.log1p(ad)
WARNING: adata.X seems to be already log-transformed.
Note: this warning appears because the synthetic example matrix does not contain raw integer counts; with real count data it should not be triggered.
ad.X[:5, :5]
array([[0. , 7.295568 , 7.295568 , 7.295568 , 7.295568 ],
[7.316388 , 7.316388 , 0. , 7.316388 , 7.316388 ],
[7.3014727, 7.3014727, 7.3014727, 7.3014727, 0. ],
[7.3014727, 0. , 7.3014727, 7.3014727, 0. ],
[7.3407264, 7.3407264, 0. , 7.3407264, 7.3407264]],
dtype=float32)
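For dense data, these two scanpy calls are essentially equivalent to the following NumPy computation (shown on a toy matrix):

```python
import numpy as np

# Toy counts matrix (2 pseudobulks x 2 genes).
X = np.array([[10.0, 90.0], [50.0, 50.0]])

# Scale each pseudobulk to one million counts, then take log(1 + x).
cpm = X / X.sum(axis=1, keepdims=True) * 1e6
logcpm = np.log1p(cpm)
```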
3. Create intervals surrounding genes¶
Decima is trained on 524,288 bp sequences surrounding genes. Therefore, we take the given gene coordinates and extend them to create intervals of this length.
from decima.data.preprocess import var_to_intervals
ad.var.head()
| | chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset |
|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26354840 | 26879128 | + | 26518680 | 27042968 | 524288 | 163840 | 524288 | train |
| gene_1 | chr19 | 41111417 | 41635705 | - | 40947577 | 41471865 | 524288 | 163840 | 524288 | train |
| gene_2 | chr1 | 79774026 | 80298314 | - | 79610186 | 80134474 | 524288 | 163840 | 524288 | train |
| gene_4 | chr16 | 3741368 | 4265656 | - | 3577528 | 4101816 | 524288 | 163840 | 524288 | train |
| gene_5 | chr10 | 22659481 | 23183769 | + | 22823321 | 23347609 | 524288 | 163840 | 524288 | train |
First, we copy the start and end columns to gene_start and gene_end. We also create a new column gene_length.
ad.var["gene_start"] = ad.var.start.tolist()
ad.var["gene_end"] = ad.var.end.tolist()
ad.var["gene_length"] = ad.var["gene_end"] - ad.var["gene_start"]
ad.var.head()
| | chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset |
|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26354840 | 26879128 | + | 26354840 | 26879128 | 524288 | 163840 | 524288 | train |
| gene_1 | chr19 | 41111417 | 41635705 | - | 41111417 | 41635705 | 524288 | 163840 | 524288 | train |
| gene_2 | chr1 | 79774026 | 80298314 | - | 79774026 | 80298314 | 524288 | 163840 | 524288 | train |
| gene_4 | chr16 | 3741368 | 4265656 | - | 3741368 | 4265656 | 524288 | 163840 | 524288 | train |
| gene_5 | chr10 | 22659481 | 23183769 | + | 22659481 | 23183769 | 524288 | 163840 | 524288 | train |
Now, we extend the gene coordinates to create enclosing intervals:
ad = var_to_intervals(ad, chr_end_pad=10000, genome="hg38")
# Replace genome name if necessary
The interval size is 524288 bases. Of these, 163840 will be upstream of the gene start and 360448 will be downstream of the gene start.
0 intervals extended beyond the chromosome start and have been shifted
1 intervals extended beyond the chromosome end and have been shifted
1 intervals did not extend far enough upstream of the TSS and have been dropped
ad.var.head()
| | chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset |
|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26191000 | 26715288 | + | 26354840 | 26879128 | 524288 | 163840 | 524288 | train |
| gene_1 | chr19 | 41275257 | 41799545 | - | 41111417 | 41635705 | 524288 | 163840 | 524288 | train |
| gene_2 | chr1 | 79937866 | 80462154 | - | 79774026 | 80298314 | 524288 | 163840 | 524288 | train |
| gene_4 | chr16 | 3905208 | 4429496 | - | 3741368 | 4265656 | 524288 | 163840 | 524288 | train |
| gene_5 | chr10 | 22495641 | 23019929 | + | 22659481 | 23183769 | 524288 | 163840 | 524288 | train |
The start and end columns now contain the start and end coordinates of the 524,288 bp intervals.
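Using the values printed above, we can verify the layout reported by `var_to_intervals` (524,288 bp per interval, with 163,840 bp upstream of the strand-aware gene start):

```python
# Interval length and upstream context reported by var_to_intervals.
INTERVAL, UPSTREAM = 524_288, 163_840

rows = [
    # (strand, start, end, gene_start, gene_end) copied from ad.var above
    ("+", 26_191_000, 26_715_288, 26_354_840, 26_879_128),  # gene_0
    ("-", 41_275_257, 41_799_545, 41_111_417, 41_635_705),  # gene_1
]
for strand, start, end, gene_start, gene_end in rows:
    assert end - start == INTERVAL
    if strand == "+":
        # Upstream of a + strand gene lies before gene_start.
        assert gene_start - start == UPSTREAM
    else:
        # Upstream of a - strand gene lies after gene_end.
        assert end - gene_end == UPSTREAM
```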
4. Split genes into training, validation and test sets¶
We load the coordinates of the genomic regions used to train Borzoi:
splits_file = "https://raw.githubusercontent.com/calico/borzoi/main/data/sequences_human.bed.gz"
# replace human with mouse for mm10 splits
splits = pd.read_table(splits_file, header=None, names=["chrom", "start", "end", "fold"])
splits.head()
| | chrom | start | end | fold |
|---|---|---|---|---|
| 0 | chr4 | 82524421 | 82721029 | fold0 |
| 1 | chr13 | 18604798 | 18801406 | fold0 |
| 2 | chr2 | 189923408 | 190120016 | fold0 |
| 3 | chr10 | 59875743 | 60072351 | fold0 |
| 4 | chr1 | 117109467 | 117306075 | fold0 |
Now, we overlap our gene intervals with these regions:
overlaps = bf.overlap(ad.var.reset_index(names="gene"), splits, how="left")
overlaps = overlaps[["gene", "fold_"]].drop_duplicates().astype(str)
overlaps.head()
| | gene | fold_ |
|---|---|---|
| 0 | gene_0 | fold5 |
| 15 | gene_1 | fold0 |
| 30 | gene_2 | fold0 |
| 44 | gene_4 | fold2 |
| 59 | gene_5 | fold2 |
Based on the overlap, we divide our gene intervals into training, validation and test sets.
test_genes = overlaps.gene[overlaps.fold_ == "fold3"].tolist()
val_genes = overlaps.gene[overlaps.fold_ == "fold4"].tolist()
train_genes = set(overlaps.gene).difference(set(test_genes).union(val_genes))
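As a quick sanity check, the three gene sets should be pairwise disjoint and together cover every gene; sketched here on toy sets:

```python
# Toy gene sets standing in for test_genes / val_genes / train_genes above.
test_genes = {"gene_1"}
val_genes = {"gene_2"}
train_genes = {"gene_0", "gene_3"}
all_genes = {"gene_0", "gene_1", "gene_2", "gene_3"}

# No gene may belong to more than one set, and every gene must be assigned.
disjoint = not (test_genes & val_genes) and not (train_genes & (test_genes | val_genes))
covers_all = (test_genes | val_genes | train_genes) == all_genes
```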
And add this information back to ad.var.
ad.var["dataset"] = "test"
ad.var.loc[ad.var.index.isin(val_genes), "dataset"] = "val"
ad.var.loc[ad.var.index.isin(train_genes), "dataset"] = "train"
ad.var.head()
| | chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset |
|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26191000 | 26715288 | + | 26354840 | 26879128 | 524288 | 163840 | 524288 | train |
| gene_1 | chr19 | 41275257 | 41799545 | - | 41111417 | 41635705 | 524288 | 163840 | 524288 | train |
| gene_2 | chr1 | 79937866 | 80462154 | - | 79774026 | 80298314 | 524288 | 163840 | 524288 | train |
| gene_4 | chr16 | 3905208 | 4429496 | - | 3741368 | 4265656 | 524288 | 163840 | 524288 | train |
| gene_5 | chr10 | 22495641 | 23019929 | + | 22659481 | 23183769 | 524288 | 163840 | 524288 | train |
ad.var.dataset.value_counts()
dataset
train 766
test 83
val 71
Name: count, dtype: int64
We have now divided the 920 remaining genes in our dataset into separate sets to be used for training, validation and testing.
5. Save processed anndata¶
We will save the processed anndata file containing these intervals and data splits.
ad.write_h5ad(ad_file_path)
6. Create an HDF5 file¶
To train Decima, we need to extract the genomic sequence for each interval, one-hot encode it, and save the encoded inputs to an HDF5 file.
from decima.data.write_hdf5 import write_hdf5
! mkdir -p example
write_hdf5(file=h5_file_path, ad=ad, pad=5000, genome="hg38")
# Change genome name if necessary
Writing metadata
Writing task indices
Writing genes array of shape: (920, 2)
Writing labels array of shape: (920, 50, 1)
Making gene masks
Writing mask array of shape: (920, 534288)
Encoding sequences
Writing sequence array of shape: (920, 534288)
Done!
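Note the array width of 534,288: it equals the 524,288 bp interval plus the `pad` of 5,000 bp added on each side, matching the `max-seq-shift` of 5,000 used later during training (presumably to leave room for shifted sequences):

```python
# Width of the stored arrays: interval plus padding on both sides.
INTERVAL = 524_288  # interval length used by Decima
PAD = 5_000         # `pad` argument passed to write_hdf5
width = INTERVAL + 2 * PAD
```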
7. Set training parameters¶
# Learning rate (default: 0.001)
lr = 5e-5
# Total weight parameter for the loss function
total_weight = 1e-4
# Gradient accumulation steps
grad = 5
# Batch size (default: 4)
bs = 4
# Maximum sequence shift (default: 5000)
shift = 5000
# Number of epochs (default: 1)
epochs = 15
# Logger (change to "csv" to save logs locally)
logger = "wandb"
# Number of dataloader workers (default: 16)
workers = 16
8. Generate training commands¶
cmds = []
for model in range(4):
name = f"finetune_test_{model}"
device = model
cmd = (
f"decima finetune --name {name} "
+ f"--model {model} --device {device} "
+ f"--matrix-file {ad_file_path} --h5-file {h5_file_path} "
+ f"--outdir {outdir} --learning-rate {lr} "
+ f"--loss-total-weight {total_weight} --gradient-accumulation {grad} "
+ f"--batch-size {bs} --max-seq-shift {shift} "
+ f"--epochs {epochs} --logger {logger} --num-workers {workers}"
)
cmds.append(cmd)
for cmd in cmds:
print(cmd)
decima finetune --name finetune_test_0 --model 0 --device 0 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16
decima finetune --name finetune_test_1 --model 1 --device 1 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16
decima finetune --name finetune_test_2 --model 2 --device 2 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16
decima finetune --name finetune_test_3 --model 3 --device 3 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16
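Instead of copying these commands by hand, one option is to write them to a shell script for later submission. A sketch, using abbreviated stand-ins for the full commands above:

```python
# Abbreviated stand-ins for the full `decima finetune` commands generated above.
cmds = [
    f"decima finetune --name finetune_test_{i} --model {i} --device {i}"
    for i in range(4)
]

# One command per line, behind a shebang, ready to run or submit.
script = "\n".join(["#!/bin/bash"] + cmds) + "\n"
with open("run_finetune.sh", "w") as f:
    f.write(script)
```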
Here, we train the model for a single epoch with batch size 1 so the tutorial completes quickly. For your own data, run the training with the full settings above (e.g. more epochs).
! CUDA_VISIBLE_DEVICES=0 decima finetune \
--name finetune_test_0 \
--model 0 \
--device 0 \
--matrix-file {ad_file_path} \
--h5-file {h5_file_path} \
--outdir {outdir} \
--learning-rate {lr} \
--loss-total-weight {total_weight} \
--gradient-accumulation {grad} \
--batch-size 1 \
--max-seq-shift {shift} \
--epochs 1 \
--logger {logger} \
--num-workers {workers}
decima - INFO - Data paths: matrix_file=./data/data.h5ad, h5_file=./example/data.h5
decima - INFO - Reading anndata
decima - INFO - Making dataset objects
decima - INFO - train_params: {'batch_size': 1, 'num_workers': 16, 'devices': 0, 'logger': 'wandb', 'save_dir': './example', 'max_epochs': 1, 'lr': 5e-05, 'total_weight': 0.0001, 'accumulate_grad_batches': 5, 'loss': 'poisson_multinomial', 'clip': 0.0, 'save_top_k': 1, 'pin_memory': True}
decima - INFO - model_params: {'n_tasks': 50, 'init_borzoi': True, 'replicate': '0'}
decima - INFO - Initializing model
decima - INFO - Initializing weights from Borzoi model using wandb for replicate: 0
wandb: Currently logged in as: mhcelik (mhcw) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Downloading large artifact 'human_state_dict_fold0:latest', 709.30MB. 1 files...
wandb: 1 of 1 files downloaded.
Done. 00:00:01.7 (406.1MB/s)
decima - INFO - Connecting to wandb.
wandb: Currently logged in as: mhcelik (mhcw) to https://genentech.wandb.io. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.22.2
wandb: Run data is saved locally in finetune_test_0/wandb/run-20251121_143055-g20ya0al
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run finetune_test_0
wandb: ⭐️ View project at https://genentech.wandb.io/grelu/decima
wandb: 🚀 View run at https://genentech.wandb.io/grelu/decima/runs/g20ya0al
decima - INFO - Training
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/utils/data/dataloader.py:627: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py:397: UserWarning: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
SLURM auto-requeueing enabled. Setting signal handlers.
Validation DataLoader 0:   0%|          | 0/71 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|██████████████████| 71/71 [00:11<00:00, 6.41it/s]
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ val_gene_pearson │ 0.0249176025390625 │
│ val_loss │ 20.776832580566406 │
│ val_mse │ 28.61081886291504 │
│ val_task_pearson │ 0.019344473257660866 │
└───────────────────────────┴───────────────────────────┘
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pytorch_lightning/utilities/model_summary/model_summary.py:231: UserWarning: Precision 16-mixed is not supported by the model summary. Estimated model size in MB will not be accurate. Using 32 bits instead.
| Name | Type | Params | Mode
---------------------------------------------------------------------------
0 | model | DecimaModel | 171 M | train
1 | loss | TaskWisePoissonMultinomialLoss | 0 | train
2 | val_metrics | MetricCollection | 0 | train
3 | test_metrics | MetricCollection | 0 | train
4 | warning_counter | WarningCounter | 0 | train
5 | transform | Identity | 0 | train
---------------------------------------------------------------------------
171 M Trainable params
0 Non-trainable params
171 M Total params
685.503 Total estimated model params size (MB)
401 Modules in train mode
0 Modules in eval mode
SLURM auto-requeueing enabled. Setting signal handlers.
Sanity Checking DataLoader 0: 100%|███████████████| 2/2 [00:00<00:00,  5.90it/s]
Epoch 0:   0%|          | 0/766 [00:00<?, ?it/s]
Epoch 0:   2%|          | 19/766 [00:07<04:45,  2.61it/s, v_num=a0al, train_loss_step=20.
Multinomial: 17.821165084838867, Poisson: -0.08399660140275955
Epoch 0: 3%| | 20/766 [00:07<04:40, 2.66it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 3%| | 20/766 [00:07<04:40, 2.66it/s, v_num=a0al, train_loss_step=17.
Multinomial: 16.62529945373535, Poisson: -0.07828851789236069
Epoch 0: 3%| | 21/766 [00:07<04:30, 2.75it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 3%| | 21/766 [00:07<04:35, 2.71it/s, v_num=a0al, train_loss_step=16.
Multinomial: 21.265472412109375, Poisson: -0.10090325772762299
Epoch 0: 3%| | 22/766 [00:07<04:25, 2.80it/s, v_num=a0al, train_loss_step=16.
Epoch 0: 3%| | 22/766 [00:07<04:30, 2.75it/s, v_num=a0al, train_loss_step=21.
Multinomial: 20.13052749633789, Poisson: -0.09563688188791275
Epoch 0: 3%| | 23/766 [00:08<04:21, 2.84it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 3%| | 23/766 [00:08<04:26, 2.79it/s, v_num=a0al, train_loss_step=20.
Multinomial: 20.649946212768555, Poisson: -0.09828009456396103
Epoch 0: 3%| | 24/766 [00:08<04:17, 2.88it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 3%| | 24/766 [00:08<04:22, 2.83it/s, v_num=a0al, train_loss_step=20.
Multinomial: 25.186647415161133, Poisson: -0.12111172825098038
Epoch 0: 3%| | 25/766 [00:08<04:18, 2.87it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 3%| | 25/766 [00:08<04:18, 2.87it/s, v_num=a0al, train_loss_step=25.
Multinomial: 18.360719680786133, Poisson: -0.08687090128660202
Epoch 0: 3%| | 26/766 [00:08<04:11, 2.94it/s, v_num=a0al, train_loss_step=25.
Epoch 0: 3%| | 26/766 [00:08<04:15, 2.90it/s, v_num=a0al, train_loss_step=18.
Multinomial: 20.07268524169922, Poisson: -0.0955045148730278
Epoch 0: 4%| | 27/766 [00:09<04:08, 2.98it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 4%| | 27/766 [00:09<04:11, 2.93it/s, v_num=a0al, train_loss_step=20.
Multinomial: 23.431581497192383, Poisson: -0.11249936372041702
Epoch 0: 4%| | 28/766 [00:09<04:05, 3.01it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 4%| | 28/766 [00:09<04:08, 2.97it/s, v_num=a0al, train_loss_step=23.
Multinomial: 21.752777099609375, Poisson: -0.10413458943367004
Epoch 0: 4%| | 29/766 [00:09<04:02, 3.04it/s, v_num=a0al, train_loss_step=23.
Epoch 0: 4%| | 29/766 [00:09<04:06, 3.00it/s, v_num=a0al, train_loss_step=21.
Multinomial: 18.950761795043945, Poisson: -0.08966774493455887
Epoch 0: 4%| | 30/766 [00:09<04:03, 3.02it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 4%| | 30/766 [00:09<04:03, 3.02it/s, v_num=a0al, train_loss_step=18.
Multinomial: 24.61734962463379, Poisson: -0.11915773898363113
Epoch 0: 4%| | 31/766 [00:10<03:57, 3.09it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 4%| | 31/766 [00:10<04:01, 3.05it/s, v_num=a0al, train_loss_step=24.
Multinomial: 19.473047256469727, Poisson: -0.09275110810995102
Epoch 0: 4%| | 32/766 [00:10<03:55, 3.11it/s, v_num=a0al, train_loss_step=24.
Epoch 0: 4%| | 32/766 [00:10<03:58, 3.07it/s, v_num=a0al, train_loss_step=19.
Multinomial: 21.206684112548828, Poisson: -0.10135509073734283
Epoch 0: 4%| | 33/766 [00:10<03:53, 3.14it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 4%| | 33/766 [00:10<03:56, 3.10it/s, v_num=a0al, train_loss_step=21.
Multinomial: 19.45479965209961, Poisson: -0.09279609471559525
Epoch 0: 4%| | 34/766 [00:10<03:51, 3.16it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 4%| | 34/766 [00:10<03:54, 3.12it/s, v_num=a0al, train_loss_step=19.
Multinomial: 21.72089385986328, Poisson: -0.10419032722711563
Epoch 0: 5%| | 35/766 [00:11<03:52, 3.14it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 5%| | 35/766 [00:11<03:52, 3.14it/s, v_num=a0al, train_loss_step=21.
Multinomial: 22.369564056396484, Poisson: -0.10727277398109436
Epoch 0: 5%| | 36/766 [00:11<03:47, 3.20it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 5%| | 36/766 [00:11<03:50, 3.17it/s, v_num=a0al, train_loss_step=22.
Multinomial: 21.176250457763672, Poisson: -0.1012745052576065
Epoch 0: 5%| | 37/766 [00:11<03:46, 3.22it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 5%| | 37/766 [00:11<03:48, 3.19it/s, v_num=a0al, train_loss_step=21.
Multinomial: 18.37683868408203, Poisson: -0.08704456686973572
Epoch 0: 5%| | 38/766 [00:11<03:44, 3.24it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 5%| | 38/766 [00:11<03:47, 3.21it/s, v_num=a0al, train_loss_step=18.
Multinomial: 22.340030670166016, Poisson: -0.10708311200141907
Epoch 0: 5%| | 39/766 [00:11<03:42, 3.26it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 5%| | 39/766 [00:12<03:45, 3.22it/s, v_num=a0al, train_loss_step=22.
Multinomial: 21.2115478515625, Poisson: -0.1010795533657074
Epoch 0: 5%| | 40/766 [00:12<03:43, 3.24it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 5%| | 40/766 [00:12<03:44, 3.24it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.17957878112793, Poisson: -0.10130106657743454
Epoch 0: 5%| | 41/766 [00:12<03:40, 3.29it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 5%| | 41/766 [00:12<03:42, 3.26it/s, v_num=a0al, train_loss_step=21.
Multinomial: 17.70407485961914, Poisson: -0.08396982401609421
Epoch 0: 5%| | 42/766 [00:12<03:38, 3.31it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 5%| | 42/766 [00:12<03:41, 3.28it/s, v_num=a0al, train_loss_step=17.
Multinomial: 19.499862670898438, Poisson: -0.09266598522663116
Epoch 0: 6%| | 43/766 [00:12<03:37, 3.33it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 6%| | 43/766 [00:13<03:39, 3.29it/s, v_num=a0al, train_loss_step=19.
Multinomial: 20.606935501098633, Poisson: -0.09828896075487137
Epoch 0: 6%| | 44/766 [00:13<03:36, 3.34it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 6%| | 44/766 [00:13<03:38, 3.31it/s, v_num=a0al, train_loss_step=20.
Multinomial: 22.871383666992188, Poisson: -0.11045264452695847
Epoch 0: 6%| | 45/766 [00:13<03:37, 3.32it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 6%| | 45/766 [00:13<03:37, 3.32it/s, v_num=a0al, train_loss_step=22.
Multinomial: 24.033437728881836, Poisson: -0.11557681858539581
Epoch 0: 6%| | 46/766 [00:13<03:33, 3.37it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 6%| | 46/766 [00:13<03:35, 3.34it/s, v_num=a0al, train_loss_step=23.
Multinomial: 18.879009246826172, Poisson: -0.09021121263504028
Epoch 0: 6%| | 47/766 [00:13<03:32, 3.38it/s, v_num=a0al, train_loss_step=23.
Epoch 0: 6%| | 47/766 [00:14<03:34, 3.35it/s, v_num=a0al, train_loss_step=18.
Multinomial: 18.95680809020996, Poisson: -0.08978604525327682
Epoch 0: 6%| | 48/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 6%| | 48/766 [00:14<03:33, 3.36it/s, v_num=a0al, train_loss_step=18.
Multinomial: 17.858497619628906, Poisson: -0.08382056653499603
Epoch 0: 6%| | 49/766 [00:14<03:30, 3.41it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 6%| | 49/766 [00:14<03:32, 3.38it/s, v_num=a0al, train_loss_step=17.
Multinomial: 21.18074607849121, Poisson: -0.10173416137695312
Epoch 0: 7%| | 50/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 7%| | 50/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=21.
Multinomial: 19.48756980895996, Poisson: -0.09269016981124878
Epoch 0: 7%| | 51/766 [00:14<03:28, 3.43it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 7%| | 51/766 [00:15<03:30, 3.40it/s, v_num=a0al, train_loss_step=19.
Multinomial: 20.046504974365234, Poisson: -0.09527648985385895
Epoch 0: 7%| | 52/766 [00:15<03:27, 3.44it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 7%| | 52/766 [00:15<03:29, 3.41it/s, v_num=a0al, train_loss_step=20.
Multinomial: 21.808837890625, Poisson: -0.10404733568429947
Epoch 0: 7%| | 53/766 [00:15<03:26, 3.45it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 7%| | 53/766 [00:15<03:28, 3.42it/s, v_num=a0al, train_loss_step=21.
Multinomial: 18.97328758239746, Poisson: -0.08934041857719421
Epoch 0: 7%| | 54/766 [00:15<03:25, 3.46it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 7%| | 54/766 [00:15<03:27, 3.43it/s, v_num=a0al, train_loss_step=18.
Multinomial: 21.240169525146484, Poisson: -0.10145271569490433
Epoch 0: 7%| | 55/766 [00:15<03:26, 3.45it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 7%| | 55/766 [00:15<03:26, 3.44it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.870256423950195, Poisson: -0.1042548418045044
Epoch 0: 7%| | 56/766 [00:16<03:23, 3.48it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 7%| | 56/766 [00:16<03:25, 3.45it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.184789657592773, Poisson: -0.10091016441583633
Epoch 0: 7%| | 57/766 [00:16<03:22, 3.49it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 7%| | 57/766 [00:16<03:24, 3.46it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.16849136352539, Poisson: -0.10139136761426926
Epoch 0: 8%| | 58/766 [00:16<03:22, 3.50it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 8%| | 58/766 [00:16<03:23, 3.47it/s, v_num=a0al, train_loss_step=21.
Multinomial: 24.697975158691406, Poisson: -0.11847725510597229
Epoch 0: 8%| | 59/766 [00:16<03:21, 3.51it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 8%| | 59/766 [00:16<03:22, 3.48it/s, v_num=a0al, train_loss_step=24.
Multinomial: 21.23836898803711, Poisson: -0.10157377272844315
Epoch 0: 8%| | 60/766 [00:17<03:22, 3.49it/s, v_num=a0al, train_loss_step=24.
Epoch 0: 8%| | 60/766 [00:17<03:22, 3.49it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.151546478271484, Poisson: -0.10148127377033234
Epoch 0: 8%| | 61/766 [00:17<03:19, 3.53it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 8%| | 61/766 [00:17<03:21, 3.50it/s, v_num=a0al, train_loss_step=21.
Multinomial: 22.36435890197754, Poisson: -0.10736225545406342
Epoch 0: 8%| | 62/766 [00:17<03:19, 3.54it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 8%| | 62/766 [00:17<03:20, 3.51it/s, v_num=a0al, train_loss_step=22.
Multinomial: 18.459867477416992, Poisson: -0.08722960948944092
Epoch 0: 8%| | 63/766 [00:17<03:18, 3.55it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 8%| | 63/766 [00:17<03:19, 3.52it/s, v_num=a0al, train_loss_step=18.
Multinomial: 20.105688095092773, Poisson: -0.09594902396202087
Epoch 0: 8%| | 64/766 [00:18<03:17, 3.55it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 8%| | 64/766 [00:18<03:18, 3.53it/s, v_num=a0al, train_loss_step=20.
Multinomial: 21.239574432373047, Poisson: -0.10175144672393799
Epoch 0: 8%| | 65/766 [00:18<03:18, 3.54it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 8%| | 65/766 [00:18<03:18, 3.53it/s, v_num=a0al, train_loss_step=21.
Multinomial: 18.30000877380371, Poisson: -0.08693327754735947
Epoch 0: 9%| | 66/766 [00:18<03:16, 3.57it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 9%| | 66/766 [00:18<03:17, 3.54it/s, v_num=a0al, train_loss_step=18.
Multinomial: 19.48712921142578, Poisson: -0.09302585572004318
Epoch 0: 9%| | 67/766 [00:18<03:15, 3.58it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 9%| | 67/766 [00:18<03:16, 3.55it/s, v_num=a0al, train_loss_step=19.
Multinomial: 23.451393127441406, Poisson: -0.11258357018232346
Epoch 0: 9%| | 68/766 [00:18<03:14, 3.58it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 9%| | 68/766 [00:19<03:16, 3.56it/s, v_num=a0al, train_loss_step=23.
Multinomial: 20.058046340942383, Poisson: -0.09530574083328247
Epoch 0: 9%| | 69/766 [00:19<03:14, 3.59it/s, v_num=a0al, train_loss_step=23.
Epoch 0: 9%| | 69/766 [00:19<03:15, 3.57it/s, v_num=a0al, train_loss_step=20.
Multinomial: 18.952863693237305, Poisson: -0.08996559679508209
Epoch 0: 9%| | 70/766 [00:19<03:14, 3.57it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 9%| | 70/766 [00:19<03:14, 3.57it/s, v_num=a0al, train_loss_step=18.
Multinomial: 17.13503646850586, Poisson: -0.08106084913015366
Epoch 0: 9%| | 71/766 [00:19<03:12, 3.60it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 9%| | 71/766 [00:19<03:14, 3.58it/s, v_num=a0al, train_loss_step=17.
Multinomial: 20.00311279296875, Poisson: -0.09557122737169266
Epoch 0: 9%| | 72/766 [00:19<03:12, 3.61it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 9%| | 72/766 [00:20<03:13, 3.59it/s, v_num=a0al, train_loss_step=19.
Multinomial: 19.45271110534668, Poisson: -0.09286917746067047
Epoch 0: 10%| | 73/766 [00:20<03:11, 3.62it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 10%| | 73/766 [00:20<03:12, 3.59it/s, v_num=a0al, train_loss_step=19.
Multinomial: 19.410478591918945, Poisson: -0.09247767925262451
Epoch 0: 10%| | 74/766 [00:20<03:10, 3.62it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 10%| | 74/766 [00:20<03:12, 3.60it/s, v_num=a0al, train_loss_step=19.
Multinomial: 16.642332077026367, Poisson: -0.0786345899105072
Epoch 0: 10%| | 75/766 [00:20<03:11, 3.61it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 10%| | 75/766 [00:20<03:11, 3.61it/s, v_num=a0al, train_loss_step=16.
Multinomial: 22.897323608398438, Poisson: -0.11011520773172379
Epoch 0: 10%| | 76/766 [00:20<03:09, 3.64it/s, v_num=a0al, train_loss_step=16.
Epoch 0: 10%| | 76/766 [00:21<03:11, 3.61it/s, v_num=a0al, train_loss_step=22.
Multinomial: 17.767396926879883, Poisson: -0.08402471244335175
Epoch 0: 10%| | 77/766 [00:21<03:09, 3.64it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 10%| | 77/766 [00:21<03:10, 3.62it/s, v_num=a0al, train_loss_step=17.
Multinomial: 20.062463760375977, Poisson: -0.0955692008137703
Epoch 0: 10%| | 78/766 [00:21<03:08, 3.65it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 10%| | 78/766 [00:21<03:09, 3.62it/s, v_num=a0al, train_loss_step=20.
Multinomial: 18.32487678527832, Poisson: -0.0871487483382225
Epoch 0: 10%| | 79/766 [00:21<03:08, 3.65it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 10%| | 79/766 [00:21<03:09, 3.63it/s, v_num=a0al, train_loss_step=18.
Multinomial: 21.206655502319336, Poisson: -0.10164565593004227
Epoch 0: 10%| | 80/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 10%| | 80/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=21.
Multinomial: 22.280014038085938, Poisson: -0.10702759772539139
Epoch 0: 11%| | 81/766 [00:22<03:07, 3.66it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 11%| | 81/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=22.
Multinomial: 21.242645263671875, Poisson: -0.10192742943763733
Epoch 0: 11%| | 82/766 [00:22<03:06, 3.67it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 11%| | 82/766 [00:22<03:07, 3.65it/s, v_num=a0al, train_loss_step=21.
Multinomial: 21.142255783081055, Poisson: -0.10121983289718628
Epoch 0: 11%| | 83/766 [00:22<03:05, 3.67it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 11%| | 83/766 [00:22<03:07, 3.65it/s, v_num=a0al, train_loss_step=21.
Multinomial: 22.358478546142578, Poisson: -0.1070261299610138
Epoch 0: 11%| | 84/766 [00:22<03:05, 3.68it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 11%| | 84/766 [00:22<03:06, 3.66it/s, v_num=a0al, train_loss_step=22.
Multinomial: 21.18360137939453, Poisson: -0.10107354819774628
Epoch 0: 11%| | 85/766 [00:23<03:05, 3.66it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 11%| | 85/766 [00:23<03:05, 3.66it/s, v_num=a0al, train_loss_step=21.
Multinomial: 20.60392951965332, Poisson: -0.09856819361448288
Epoch 0: 11%| | 86/766 [00:23<03:04, 3.69it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 11%| | 86/766 [00:23<03:05, 3.67it/s, v_num=a0al, train_loss_step=20.
Multinomial: 19.474123001098633, Poisson: -0.09277226030826569
Epoch 0: 11%| | 87/766 [00:23<03:03, 3.69it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 11%| | 87/766 [00:23<03:04, 3.67it/s, v_num=a0al, train_loss_step=19.
Multinomial: 21.81633949279785, Poisson: -0.10398300737142563
Epoch 0: 11%| | 88/766 [00:23<03:03, 3.70it/s, v_num=a0al, train_loss_step=19.
Epoch 0: 11%| | 88/766 [00:23<03:04, 3.68it/s, v_num=a0al, train_loss_step=21.
Multinomial: 22.952714920043945, Poisson: -0.1099216490983963
Epoch 0: 12%| | 89/766 [00:24<03:02, 3.70it/s, v_num=a0al, train_loss_step=21.
Epoch 0: 12%| | 89/766 [00:24<03:03, 3.68it/s, v_num=a0al, train_loss_step=22.
Multinomial: 20.675338745117188, Poisson: -0.09857542812824249
Epoch 0: 12%| | 90/766 [00:24<03:03, 3.69it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 12%| | 90/766 [00:24<03:03, 3.68it/s, v_num=a0al, train_loss_step=20.
Multinomial: 20.54332733154297, Poisson: -0.09850569814443588
Epoch 0: 12%| | 91/766 [00:24<03:01, 3.71it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 12%| | 91/766 [00:24<03:02, 3.69it/s, v_num=a0al, train_loss_step=20.
Multinomial: 17.736204147338867, Poisson: -0.083879254758358
Epoch 0: 12%| | 92/766 [00:24<03:01, 3.71it/s, v_num=a0al, train_loss_step=20.
Epoch 0: 12%| | 92/766 [00:24<03:02, 3.69it/s, v_num=a0al, train_loss_step=17.
Multinomial: 18.93655014038086, Poisson: -0.09000393003225327
Epoch 0: 12%| | 93/766 [00:25<03:00, 3.72it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 12%| | 93/766 [00:25<03:01, 3.70it/s, v_num=a0al, train_loss_step=18.
Multinomial: 23.51058006286621, Poisson: -0.11284295469522476
Epoch 0: 12%| | 94/766 [00:25<03:00, 3.72it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 12%| | 94/766 [00:25<03:01, 3.70it/s, v_num=a0al, train_loss_step=23.
Multinomial: 22.92452621459961, Poisson: -0.10993973165750504
Epoch 0: 12%| | 95/766 [00:25<03:01, 3.71it/s, v_num=a0al, train_loss_step=23.
Epoch 0: 12%| | 95/766 [00:25<03:01, 3.71it/s, v_num=a0al, train_loss_step=22.
Multinomial: 18.880817413330078, Poisson: -0.08968962728977203
Epoch 0: 13%|▏| 96/766 [00:25<02:59, 3.73it/s, v_num=a0al, train_loss_step=22.
Epoch 0: 13%|▏| 96/766 [00:25<03:00, 3.71it/s, v_num=a0al, train_loss_step=18.
Multinomial: 18.968830108642578, Poisson: -0.08979591727256775
Epoch 0: 13%|▏| 97/766 [00:25<02:59, 3.73it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 13%|▏| 97/766 [00:26<03:00, 3.71it/s, v_num=a0al, train_loss_step=18.
Multinomial: 17.777538299560547, Poisson: -0.08393401652574539
Epoch 0: 13%|▏| 98/766 [00:26<02:58, 3.74it/s, v_num=a0al, train_loss_step=18.
Epoch 0: 13%|▏| 98/766 [00:26<02:59, 3.72it/s, v_num=a0al, train_loss_step=17.
Multinomial: 22.880767822265625, Poisson: -0.10982605814933777
Epoch 0: 13%|▏| 99/766 [00:26<02:58, 3.74it/s, v_num=a0al, train_loss_step=17.
Epoch 0: 13%|▏| 99/766 [00:26<02:59, 3.72it/s, v_num=a0al, train_loss_step=22.
Multinomial: 19.429824829101562, Poisson: -0.09266551584005356
Epoch 0: 13%|▏| 100/766 [00:26<02:58, 3.73it/s, v_num=a0al, train_loss_step=22
Epoch 0: 13%|▏| 100/766 [00:26<02:58, 3.73it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.114593505859375, Poisson: -0.10141497850418091
Epoch 0: 13%|▏| 101/766 [00:26<02:57, 3.75it/s, v_num=a0al, train_loss_step=19
Epoch 0: 13%|▏| 101/766 [00:27<02:58, 3.73it/s, v_num=a0al, train_loss_step=21
Multinomial: 24.00738525390625, Poisson: -0.11572451889514923
Epoch 0: 13%|▏| 102/766 [00:27<02:56, 3.75it/s, v_num=a0al, train_loss_step=21
Epoch 0: 13%|▏| 102/766 [00:27<02:57, 3.73it/s, v_num=a0al, train_loss_step=23
Multinomial: 17.775775909423828, Poisson: -0.0842253789305687
Epoch 0: 13%|▏| 103/766 [00:27<02:56, 3.76it/s, v_num=a0al, train_loss_step=23
Epoch 0: 13%|▏| 103/766 [00:27<02:57, 3.74it/s, v_num=a0al, train_loss_step=17
Multinomial: 22.294315338134766, Poisson: -0.10698876529932022
Epoch 0: 14%|▏| 104/766 [00:27<02:56, 3.76it/s, v_num=a0al, train_loss_step=17
Epoch 0: 14%|▏| 104/766 [00:27<02:56, 3.74it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.36329460144043, Poisson: -0.10711447149515152
Epoch 0: 14%|▏| 105/766 [00:28<02:56, 3.74it/s, v_num=a0al, train_loss_step=22
Epoch 0: 14%|▏| 105/766 [00:28<02:56, 3.74it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.062660217285156, Poisson: -0.09556890279054642
Epoch 0: 14%|▏| 106/766 [00:28<02:55, 3.77it/s, v_num=a0al, train_loss_step=22
Epoch 0: 14%|▏| 106/766 [00:28<02:56, 3.75it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.25997543334961, Poisson: -0.10673705488443375
Epoch 0: 14%|▏| 107/766 [00:28<02:54, 3.77it/s, v_num=a0al, train_loss_step=20
Epoch 0: 14%|▏| 107/766 [00:28<02:55, 3.75it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.74195098876953, Poisson: -0.10431475937366486
Epoch 0: 14%|▏| 108/766 [00:28<02:54, 3.77it/s, v_num=a0al, train_loss_step=22
Epoch 0: 14%|▏| 108/766 [00:28<02:55, 3.75it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.445833206176758, Poisson: -0.09263034164905548
Epoch 0: 14%|▏| 109/766 [00:28<02:54, 3.78it/s, v_num=a0al, train_loss_step=21
Epoch 0: 14%|▏| 109/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.96492576599121, Poisson: -0.10998839139938354
Epoch 0: 14%|▏| 110/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=19
Epoch 0: 14%|▏| 110/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.060997009277344, Poisson: -0.09550387412309647
Epoch 0: 14%|▏| 111/766 [00:29<02:53, 3.78it/s, v_num=a0al, train_loss_step=22
Epoch 0: 14%|▏| 111/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.398094177246094, Poisson: -0.09251043945550919
Epoch 0: 15%|▏| 112/766 [00:29<02:52, 3.78it/s, v_num=a0al, train_loss_step=20
Epoch 0: 15%|▏| 112/766 [00:29<02:53, 3.77it/s, v_num=a0al, train_loss_step=19
Multinomial: 17.765329360961914, Poisson: -0.08439560234546661
Epoch 0: 15%|▏| 113/766 [00:29<02:52, 3.79it/s, v_num=a0al, train_loss_step=19
Epoch 0: 15%|▏| 113/766 [00:29<02:53, 3.77it/s, v_num=a0al, train_loss_step=17
Multinomial: 22.94915008544922, Poisson: -0.11012542247772217
Epoch 0: 15%|▏| 114/766 [00:30<02:52, 3.79it/s, v_num=a0al, train_loss_step=17
Epoch 0: 15%|▏| 114/766 [00:30<02:52, 3.77it/s, v_num=a0al, train_loss_step=22
Multinomial: 16.100540161132812, Poisson: -0.07545100152492523
Epoch 0: 15%|▏| 115/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=22
Epoch 0: 15%|▏| 115/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=16
Multinomial: 22.88507843017578, Poisson: -0.11016397178173065
Epoch 0: 15%|▏| 116/766 [00:30<02:51, 3.80it/s, v_num=a0al, train_loss_step=16
Epoch 0: 15%|▏| 116/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.70903968811035, Poisson: -0.10451330244541168
Epoch 0: 15%|▏| 117/766 [00:30<02:50, 3.80it/s, v_num=a0al, train_loss_step=22
Epoch 0: 15%|▏| 117/766 [00:30<02:51, 3.78it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.082307815551758, Poisson: -0.0955430418252945
Epoch 0: 15%|▏| 118/766 [00:31<02:50, 3.80it/s, v_num=a0al, train_loss_step=21
Epoch 0: 15%|▏| 118/766 [00:31<02:51, 3.78it/s, v_num=a0al, train_loss_step=20
Multinomial: 25.241302490234375, Poisson: -0.12178383767604828
Epoch 0: 16%|▏| 119/766 [00:31<02:50, 3.80it/s, v_num=a0al, train_loss_step=20
Epoch 0: 16%|▏| 119/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=25
Multinomial: 20.645946502685547, Poisson: -0.09858675301074982
Epoch 0: 16%|▏| 120/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=25
Epoch 0: 16%|▏| 120/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=20
Multinomial: 18.908796310424805, Poisson: -0.08985879272222519
Epoch 0: 16%|▏| 121/766 [00:31<02:49, 3.81it/s, v_num=a0al, train_loss_step=20
Epoch 0: 16%|▏| 121/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=18
Multinomial: 22.289188385009766, Poisson: -0.10742945224046707
Epoch 0: 16%|▏| 122/766 [00:32<02:49, 3.81it/s, v_num=a0al, train_loss_step=18
Epoch 0: 16%|▏| 122/766 [00:32<02:49, 3.79it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.056671142578125, Poisson: -0.09564211219549179
Epoch 0: 16%|▏| 123/766 [00:32<02:48, 3.81it/s, v_num=a0al, train_loss_step=22
Epoch 0: 16%|▏| 123/766 [00:32<02:49, 3.80it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.75560760498047, Poisson: -0.10460510104894638
Epoch 0: 16%|▏| 124/766 [00:32<02:48, 3.82it/s, v_num=a0al, train_loss_step=20
Epoch 0: 16%|▏| 124/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.484085083007812, Poisson: -0.09247615933418274
Epoch 0: 16%|▏| 125/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=21
Epoch 0: 16%|▏| 125/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.635929107666016, Poisson: -0.09875985234975815
Epoch 0: 16%|▏| 126/766 [00:32<02:47, 3.82it/s, v_num=a0al, train_loss_step=19
Epoch 0: 16%|▏| 126/766 [00:33<02:48, 3.80it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.100229263305664, Poisson: -0.10157999396324158
Epoch 0: 17%|▏| 127/766 [00:33<02:47, 3.82it/s, v_num=a0al, train_loss_step=20
Epoch 0: 17%|▏| 127/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.291488647460938, Poisson: -0.10710974782705307
Epoch 0: 17%|▏| 128/766 [00:33<02:46, 3.82it/s, v_num=a0al, train_loss_step=21
Epoch 0: 17%|▏| 128/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.73076057434082, Poisson: -0.1042914167046547
Epoch 0: 17%|▏| 129/766 [00:33<02:46, 3.83it/s, v_num=a0al, train_loss_step=22
Epoch 0: 17%|▏| 129/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.011499404907227, Poisson: -0.08988802134990692
Epoch 0: 17%|▏| 130/766 [00:34<02:46, 3.81it/s, v_num=a0al, train_loss_step=21
Epoch 0: 17%|▏| 130/766 [00:34<02:46, 3.81it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.779977798461914, Poisson: -0.10456342250108719
Epoch 0: 17%|▏| 131/766 [00:34<02:45, 3.83it/s, v_num=a0al, train_loss_step=18
Epoch 0: 17%|▏| 131/766 [00:34<02:46, 3.82it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.7364444732666, Poisson: -0.10426057130098343
Epoch 0: 17%|▏| 132/766 [00:34<02:45, 3.83it/s, v_num=a0al, train_loss_step=21
Epoch 0: 17%|▏| 132/766 [00:34<02:46, 3.82it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.27402114868164, Poisson: -0.08728134632110596
Epoch 0: 17%|▏| 133/766 [00:34<02:45, 3.84it/s, v_num=a0al, train_loss_step=21
Epoch 0: 17%|▏| 133/766 [00:34<02:45, 3.82it/s, v_num=a0al, train_loss_step=18
Multinomial: 19.547163009643555, Poisson: -0.09258746355772018
Epoch 0: 17%|▏| 134/766 [00:34<02:44, 3.84it/s, v_num=a0al, train_loss_step=18
Epoch 0: 17%|▏| 134/766 [00:35<02:45, 3.82it/s, v_num=a0al, train_loss_step=19
Multinomial: 23.47853660583496, Poisson: -0.1130625307559967
Epoch 0: 18%|▏| 135/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=19
Epoch 0: 18%|▏| 135/766 [00:35<02:44, 3.82it/s, v_num=a0al, train_loss_step=23
Multinomial: 23.483755111694336, Poisson: -0.11301064491271973
Epoch 0: 18%|▏| 136/766 [00:35<02:44, 3.84it/s, v_num=a0al, train_loss_step=23
Epoch 0: 18%|▏| 136/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=23
Multinomial: 22.89188003540039, Poisson: -0.11027445644140244
Epoch 0: 18%|▏| 137/766 [00:35<02:43, 3.84it/s, v_num=a0al, train_loss_step=23
Epoch 0: 18%|▏| 137/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.216276168823242, Poisson: -0.10149399191141129
Epoch 0: 18%|▏| 138/766 [00:35<02:43, 3.85it/s, v_num=a0al, train_loss_step=22
Epoch 0: 18%|▏| 138/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.98031234741211, Poisson: -0.09576455503702164
Epoch 0: 18%|▏| 139/766 [00:36<02:42, 3.85it/s, v_num=a0al, train_loss_step=21
Epoch 0: 18%|▏| 139/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.223608016967773, Poisson: -0.1015629693865776
Epoch 0: 18%|▏| 140/766 [00:36<02:43, 3.84it/s, v_num=a0al, train_loss_step=19
Epoch 0: 18%|▏| 140/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.562938690185547, Poisson: -0.09860718995332718
Epoch 0: 18%|▏| 141/766 [00:36<02:42, 3.85it/s, v_num=a0al, train_loss_step=21
Epoch 0: 18%|▏| 141/766 [00:36<02:42, 3.84it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.765254974365234, Poisson: -0.10466967523097992
Epoch 0: 19%|▏| 142/766 [00:36<02:41, 3.85it/s, v_num=a0al, train_loss_step=20
Epoch 0: 19%|▏| 142/766 [00:36<02:42, 3.84it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.77707290649414, Poisson: -0.10455359518527985
Epoch 0: 19%|▏| 143/766 [00:37<02:41, 3.86it/s, v_num=a0al, train_loss_step=21
Epoch 0: 19%|▏| 143/766 [00:37<02:42, 3.84it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.388973236083984, Poisson: -0.10733388364315033
Epoch 0: 19%|▏| 144/766 [00:37<02:41, 3.86it/s, v_num=a0al, train_loss_step=21
Epoch 0: 19%|▏| 144/766 [00:37<02:41, 3.84it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.004207611083984, Poisson: -0.09571236371994019
Epoch 0: 19%|▏| 145/766 [00:37<02:41, 3.85it/s, v_num=a0al, train_loss_step=22
Epoch 0: 19%|▏| 145/766 [00:37<02:41, 3.84it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.35780906677246, Poisson: -0.10723188519477844
Epoch 0: 19%|▏| 146/766 [00:37<02:40, 3.86it/s, v_num=a0al, train_loss_step=19
Epoch 0: 19%|▏| 146/766 [00:37<02:41, 3.85it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.21773338317871, Poisson: -0.10146009176969528
Epoch 0:  19%|▏| 147/766 [00:38<02:40, 3.86it/s, v_num=a0al, train_loss_step=22]
Multinomial: 20.026350021362305, Poisson: -0.09577830880880356
...
Epoch 0:  40%|▍| 310/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=20]
Multinomial: 19.973127365112305, Poisson: -0.09630625694990158
Epoch 0: 41%|▍| 311/766 [01:17<01:53, 4.00it/s, v_num=a0al, train_loss_step=20
Epoch 0: 41%|▍| 311/766 [01:17<01:53, 4.00it/s, v_num=a0al, train_loss_step=19
Multinomial: 19.46533203125, Poisson: -0.09327004104852676
Epoch 0: 41%|▍| 312/766 [01:17<01:53, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 41%|▍| 312/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=19
Multinomial: 18.34235954284668, Poisson: -0.08725226670503616
Epoch 0: 41%|▍| 313/766 [01:18<01:53, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 41%|▍| 313/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.199413299560547, Poisson: -0.10223302245140076
Epoch 0: 41%|▍| 314/766 [01:18<01:52, 4.01it/s, v_num=a0al, train_loss_step=18
Epoch 0: 41%|▍| 314/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.443342208862305, Poisson: -0.0930081233382225
Epoch 0: 41%|▍| 315/766 [01:18<01:52, 4.00it/s, v_num=a0al, train_loss_step=21
Epoch 0: 41%|▍| 315/766 [01:18<01:52, 4.00it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.190698623657227, Poisson: -0.10157360881567001
Epoch 0: 41%|▍| 316/766 [01:18<01:52, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 41%|▍| 316/766 [01:19<01:52, 4.00it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.79853057861328, Poisson: -0.1045677438378334
Epoch 0: 41%|▍| 317/766 [01:19<01:52, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 41%|▍| 317/766 [01:19<01:52, 4.00it/s, v_num=a0al, train_loss_step=21
Multinomial: 17.742656707763672, Poisson: -0.08421915024518967
Epoch 0: 42%|▍| 318/766 [01:19<01:51, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 42%|▍| 318/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=17
Multinomial: 18.330219268798828, Poisson: -0.08722923696041107
Epoch 0: 42%|▍| 319/766 [01:19<01:51, 4.01it/s, v_num=a0al, train_loss_step=17
Epoch 0: 42%|▍| 319/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=18
Multinomial: 23.47544288635254, Poisson: -0.11358413100242615
Epoch 0: 42%|▍| 320/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=18
Epoch 0: 42%|▍| 320/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=23
Multinomial: 19.467269897460938, Poisson: -0.09299182146787643
Epoch 0: 42%|▍| 321/766 [01:20<01:51, 4.01it/s, v_num=a0al, train_loss_step=23
Epoch 0: 42%|▍| 321/766 [01:20<01:51, 4.00it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.6672306060791, Poisson: -0.09876550734043121
Epoch 0: 42%|▍| 322/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 42%|▍| 322/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.132488250732422, Poisson: -0.1016790047287941
Epoch 0: 42%|▍| 323/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=20
Epoch 0: 42%|▍| 323/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.278751373291016, Poisson: -0.08713781833648682
Epoch 0: 42%|▍| 324/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 42%|▍| 324/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.759872436523438, Poisson: -0.1045098751783371
Epoch 0: 42%|▍| 325/766 [01:21<01:50, 4.00it/s, v_num=a0al, train_loss_step=18
Epoch 0: 42%|▍| 325/766 [01:21<01:50, 4.00it/s, v_num=a0al, train_loss_step=21
Multinomial: 24.067596435546875, Poisson: -0.11585790663957596
Epoch 0: 43%|▍| 326/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 43%|▍| 326/766 [01:21<01:49, 4.00it/s, v_num=a0al, train_loss_step=24
Multinomial: 18.932355880737305, Poisson: -0.08997628837823868
Epoch 0: 43%|▍| 327/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=24
Epoch 0: 43%|▍| 327/766 [01:21<01:49, 4.00it/s, v_num=a0al, train_loss_step=18
Multinomial: 18.26723861694336, Poisson: -0.08742334693670273
Epoch 0: 43%|▍| 328/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=18
Epoch 0: 43%|▍| 328/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.192481994628906, Poisson: -0.10159247368574142
Epoch 0: 43%|▍| 329/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=18
Epoch 0: 43%|▍| 329/766 [01:22<01:49, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.774616241455078, Poisson: -0.1045759841799736
Epoch 0: 43%|▍| 330/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 43%|▍| 330/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.918109893798828, Poisson: -0.11037542670965195
Epoch 0: 43%|▍| 331/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 43%|▍| 331/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 17.097972869873047, Poisson: -0.08164189010858536
Epoch 0: 43%|▍| 332/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=22
Epoch 0: 43%|▍| 332/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=17
Multinomial: 19.439931869506836, Poisson: -0.09298811107873917
Epoch 0: 43%|▍| 333/766 [01:22<01:47, 4.01it/s, v_num=a0al, train_loss_step=17
Epoch 0: 43%|▍| 333/766 [01:23<01:48, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.610124588012695, Poisson: -0.09887861460447311
Epoch 0: 44%|▍| 334/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 44%|▍| 334/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.422632217407227, Poisson: -0.09310529381036758
Epoch 0: 44%|▍| 335/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20
Epoch 0: 44%|▍| 335/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.64097785949707, Poisson: -0.09903261810541153
Epoch 0: 44%|▍| 336/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 44%|▍| 336/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.74676513671875, Poisson: -0.10450173169374466
Epoch 0: 44%|▍| 337/766 [01:23<01:46, 4.01it/s, v_num=a0al, train_loss_step=20
Epoch 0: 44%|▍| 337/766 [01:24<01:47, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.961679458618164, Poisson: -0.11031122505664825
Epoch 0: 44%|▍| 338/766 [01:24<01:46, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 44%|▍| 338/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.743120193481445, Poisson: -0.10468468070030212
Epoch 0: 44%|▍| 339/766 [01:24<01:46, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 44%|▍| 339/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.76246452331543, Poisson: -0.10444938391447067
Epoch 0: 44%|▍| 340/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 44%|▍| 340/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.034584045410156, Poisson: -0.0958065316081047
Epoch 0: 45%|▍| 341/766 [01:24<01:45, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 45%|▍| 341/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.187660217285156, Poisson: -0.1017942950129509
Epoch 0: 45%|▍| 342/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 45%|▍| 342/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.44845962524414, Poisson: -0.0929780825972557
Epoch 0: 45%|▍| 343/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 45%|▍| 343/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 19.508325576782227, Poisson: -0.09298436343669891
Epoch 0: 45%|▍| 344/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 45%|▍| 344/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.59770393371582, Poisson: -0.09871768206357956
Epoch 0: 45%|▍| 345/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=19
Epoch 0: 45%|▍| 345/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=20
Multinomial: 18.810609817504883, Poisson: -0.09023444354534149
Epoch 0: 45%|▍| 346/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 45%|▍| 346/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=18
Multinomial: 19.55392837524414, Poisson: -0.09318080544471741
Epoch 0: 45%|▍| 347/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=18
Epoch 0: 45%|▍| 347/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.338863372802734, Poisson: -0.10776587575674057
Epoch 0: 45%|▍| 348/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 45%|▍| 348/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.75655746459961, Poisson: -0.10445886850357056
Epoch 0: 46%|▍| 349/766 [01:26<01:43, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 46%|▍| 349/766 [01:26<01:43, 4.01it/s, v_num=a0al, train_loss_step=21
Multinomial: 16.552867889404297, Poisson: -0.07851012051105499
Epoch 0: 46%|▍| 350/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=21
Epoch 0: 46%|▍| 350/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=16
Multinomial: 22.313129425048828, Poisson: -0.10756148397922516
Epoch 0: 46%|▍| 351/766 [01:27<01:43, 4.02it/s, v_num=a0al, train_loss_step=16
Epoch 0: 46%|▍| 351/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.432830810546875, Poisson: -0.09281440079212189
Epoch 0: 46%|▍| 352/766 [01:27<01:42, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 46%|▍| 352/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.061098098754883, Poisson: -0.0959169864654541
Epoch 0: 46%|▍| 353/766 [01:27<01:42, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 46%|▍| 353/766 [01:27<01:42, 4.01it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.313295364379883, Poisson: -0.10738380998373032
Epoch 0: 46%|▍| 354/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 46%|▍| 354/766 [01:28<01:42, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.32628631591797, Poisson: -0.10728715360164642
Epoch 0: 46%|▍| 355/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 46%|▍| 355/766 [01:28<01:42, 4.01it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.63421630859375, Poisson: -0.09855731576681137
Epoch 0: 46%|▍| 356/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 46%|▍| 356/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.052242279052734, Poisson: -0.09578649699687958
Epoch 0: 47%|▍| 357/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 47%|▍| 357/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.189252853393555, Poisson: -0.10169314593076706
Epoch 0: 47%|▍| 358/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 47%|▍| 358/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.797149658203125, Poisson: -0.10486872494220734
Epoch 0: 47%|▍| 359/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 47%|▍| 359/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.569082260131836, Poisson: -0.09853038191795349
Epoch 0: 47%|▍| 360/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 47%|▍| 360/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.498085021972656, Poisson: -0.0929093137383461
Epoch 0: 47%|▍| 361/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 47%|▍| 361/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 18.900672912597656, Poisson: -0.0900539830327034
Epoch 0: 47%|▍| 362/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 47%|▍| 362/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.629474639892578, Poisson: -0.09880076348781586
Epoch 0: 47%|▍| 363/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=18
Epoch 0: 47%|▍| 363/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.05840492248535, Poisson: -0.09586732089519501
Epoch 0: 48%|▍| 364/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 48%|▍| 364/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.877946853637695, Poisson: -0.11037742346525192
Epoch 0: 48%|▍| 365/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=20
Epoch 0: 48%|▍| 365/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.996482849121094, Poisson: -0.0957464724779129
Epoch 0: 48%|▍| 366/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 48%|▍| 366/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 24.613861083984375, Poisson: -0.11876354366540909
Epoch 0: 48%|▍| 367/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 48%|▍| 367/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=24
Multinomial: 22.33321189880371, Poisson: -0.10739743709564209
Epoch 0: 48%|▍| 368/766 [01:31<01:38, 4.03it/s, v_num=a0al, train_loss_step=24
Epoch 0: 48%|▍| 368/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.935331344604492, Poisson: -0.11003967374563217
Epoch 0: 48%|▍| 369/766 [01:31<01:38, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 48%|▍| 369/766 [01:31<01:38, 4.02it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.00196075439453, Poisson: -0.09576454013586044
Epoch 0: 48%|▍| 370/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=22
Epoch 0: 48%|▍| 370/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.221132278442383, Poisson: -0.10173946619033813
Epoch 0: 48%|▍| 371/766 [01:32<01:38, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 48%|▍| 371/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.060192108154297, Poisson: -0.09579496085643768
Epoch 0: 49%|▍| 372/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 49%|▍| 372/766 [01:32<01:37, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.589139938354492, Poisson: -0.09876739978790283
Epoch 0: 49%|▍| 373/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=20
Epoch 0: 49%|▍| 373/766 [01:32<01:37, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.740642547607422, Poisson: -0.10431542992591858
Epoch 0: 49%|▍| 374/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=20
Epoch 0: 49%|▍| 374/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.341379165649414, Poisson: -0.08717776834964752
Epoch 0: 49%|▍| 375/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=21
Epoch 0: 49%|▍| 375/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.19436264038086, Poisson: -0.10196632891893387
Epoch 0: 49%|▍| 376/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=18
Epoch 0: 49%|▍| 376/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 23.506832122802734, Poisson: -0.11307775229215622
Epoch 0: 49%|▍| 377/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 49%|▍| 377/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=23
Multinomial: 20.549787521362305, Poisson: -0.09868539124727249
Epoch 0: 49%|▍| 378/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=23
Epoch 0: 49%|▍| 378/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.48164176940918, Poisson: -0.09314829111099243
Epoch 0: 49%|▍| 379/766 [01:34<01:36, 4.03it/s, v_num=a0al, train_loss_step=20
Epoch 0: 49%|▍| 379/766 [01:34<01:36, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.741195678710938, Poisson: -0.1045418530702591
Epoch 0: 50%|▍| 380/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 50%|▍| 380/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.425317764282227, Poisson: -0.09279096126556396
Epoch 0: 50%|▍| 381/766 [01:34<01:35, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 50%|▍| 381/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.347368240356445, Poisson: -0.10702072829008102
Epoch 0: 50%|▍| 382/766 [01:34<01:35, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 50%|▍| 382/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=22
Multinomial: 23.530473709106445, Poisson: -0.11321685463190079
Epoch 0: 50%|▌| 383/766 [01:35<01:35, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 50%|▌| 383/766 [01:35<01:35, 4.02it/s, v_num=a0al, train_loss_step=23
Multinomial: 19.423980712890625, Poisson: -0.0931314155459404
Epoch 0: 50%|▌| 384/766 [01:35<01:34, 4.03it/s, v_num=a0al, train_loss_step=23
Epoch 0: 50%|▌| 384/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.317577362060547, Poisson: -0.10730933398008347
Epoch 0: 50%|▌| 385/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=19
Epoch 0: 50%|▌| 385/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=22
Multinomial: 16.54821014404297, Poisson: -0.07840119302272797
Epoch 0: 50%|▌| 386/766 [01:35<01:34, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 50%|▌| 386/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=16
Multinomial: 19.455156326293945, Poisson: -0.09276847541332245
Epoch 0: 51%|▌| 387/766 [01:36<01:34, 4.03it/s, v_num=a0al, train_loss_step=16
Epoch 0: 51%|▌| 387/766 [01:36<01:34, 4.02it/s, v_num=a0al, train_loss_step=19
Multinomial: 23.45780372619629, Poisson: -0.1133873239159584
Epoch 0: 51%|▌| 388/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 51%|▌| 388/766 [01:36<01:33, 4.02it/s, v_num=a0al, train_loss_step=23
Multinomial: 17.704776763916016, Poisson: -0.08420784771442413
Epoch 0: 51%|▌| 389/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=23
Epoch 0: 51%|▌| 389/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=17
Multinomial: 21.756298065185547, Poisson: -0.10439086705446243
Epoch 0: 51%|▌| 390/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=17
Epoch 0: 51%|▌| 390/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 14.827723503112793, Poisson: -0.06981196999549866
Epoch 0: 51%|▌| 391/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 51%|▌| 391/766 [01:37<01:33, 4.03it/s, v_num=a0al, train_loss_step=14
Multinomial: 21.138330459594727, Poisson: -0.10137390345335007
Epoch 0: 51%|▌| 392/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=14
Epoch 0: 51%|▌| 392/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.355375289916992, Poisson: -0.1073036715388298
Epoch 0: 51%|▌| 393/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 51%|▌| 393/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.599882125854492, Poisson: -0.09854143857955933
Epoch 0: 51%|▌| 394/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 51%|▌| 394/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.186933517456055, Poisson: -0.10148259997367859
Epoch 0: 52%|▌| 395/766 [01:38<01:32, 4.03it/s, v_num=a0al, train_loss_step=20
Epoch 0: 52%|▌| 395/766 [01:38<01:32, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.297380447387695, Poisson: -0.08710375428199768
Epoch 0: 52%|▌| 396/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 52%|▌| 396/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 22.292890548706055, Poisson: -0.10710746794939041
Epoch 0: 52%|▌| 397/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18
Epoch 0: 52%|▌| 397/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 18.311077117919922, Poisson: -0.08705893158912659
Epoch 0: 52%|▌| 398/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 52%|▌| 398/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 19.50176429748535, Poisson: -0.09276818484067917
Epoch 0: 52%|▌| 399/766 [01:38<01:30, 4.03it/s, v_num=a0al, train_loss_step=18
Epoch 0: 52%|▌| 399/766 [01:39<01:31, 4.03it/s, v_num=a0al, train_loss_step=19
Multinomial: 18.905942916870117, Poisson: -0.08994127064943314
Epoch 0: 52%|▌| 400/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 52%|▌| 400/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 25.189510345458984, Poisson: -0.12159111350774765
Epoch 0: 52%|▌| 401/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=18
Epoch 0: 52%|▌| 401/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=25
Multinomial: 22.315311431884766, Poisson: -0.10756457597017288
Epoch 0: 52%|▌| 402/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=25
Epoch 0: 52%|▌| 402/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.505334854125977, Poisson: -0.09276691824197769
Epoch 0: 53%|▌| 403/766 [01:39<01:29, 4.03it/s, v_num=a0al, train_loss_step=22
Epoch 0: 53%|▌| 403/766 [01:40<01:30, 4.03it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.212017059326172, Poisson: -0.10146214812994003
Epoch 0: 53%|▌| 404/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 53%|▌| 404/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.59143829345703, Poisson: -0.09869526326656342
Epoch 0: 53%|▌| 405/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21
Epoch 0: 53%|▌| 405/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.556276321411133, Poisson: -0.09864883124828339
Epoch 0: 53%|▌| 406/766 [01:40<01:29, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 53%|▌| 406/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.144487380981445, Poisson: -0.10163812339305878
Epoch 0: 53%|▌| 407/766 [01:40<01:28, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 53%|▌| 407/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.869873046875, Poisson: -0.11037392169237137
Epoch 0: 53%|▌| 408/766 [01:41<01:28, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 53%|▌| 408/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 18.9237060546875, Poisson: -0.08994999527931213
Epoch 0: 53%|▌| 409/766 [01:41<01:28, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 53%|▌| 409/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.603927612304688, Poisson: -0.09861285984516144
Epoch 0: 54%|▌| 410/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=18
Epoch 0: 54%|▌| 410/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.39595603942871, Poisson: -0.0930929183959961
Epoch 0: 54%|▌| 411/766 [01:41<01:27, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 54%|▌| 411/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=19
Multinomial: 18.3190860748291, Poisson: -0.08724711090326309
Epoch 0: 54%|▌| 412/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=19
Epoch 0: 54%|▌| 412/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.606151580810547, Poisson: -0.09849604219198227
Epoch 0: 54%|▌| 413/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=18
Epoch 0: 54%|▌| 413/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.42496109008789, Poisson: -0.09279781579971313
Epoch 0: 54%|▌| 414/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 54%|▌| 414/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=19
Multinomial: 22.367816925048828, Poisson: -0.1075163185596466
Epoch 0: 54%|▌| 415/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=19
Epoch 0: 54%|▌| 415/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.783443450927734, Poisson: -0.10439547151327133
Epoch 0: 54%|▌| 416/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 54%|▌| 416/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.364540100097656, Poisson: -0.1071932464838028
Epoch 0: 54%|▌| 417/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 54%|▌| 417/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 18.86842155456543, Poisson: -0.09013441950082779
Epoch 0: 55%|▌| 418/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 55%|▌| 418/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 24.042606353759766, Poisson: -0.11599228531122208
Epoch 0: 55%|▌| 419/766 [01:43<01:25, 4.04it/s, v_num=a0al, train_loss_step=18
Epoch 0: 55%|▌| 419/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=23
Multinomial: 17.2088680267334, Poisson: -0.08152053505182266
Epoch 0: 55%|▌| 420/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=23
Epoch 0: 55%|▌| 420/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=17
Multinomial: 20.612220764160156, Poisson: -0.09851660579442978
Epoch 0: 55%|▌| 421/766 [01:44<01:25, 4.04it/s, v_num=a0al, train_loss_step=17
Epoch 0: 55%|▌| 421/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.886459350585938, Poisson: -0.11061673611402512
Epoch 0: 55%|▌| 422/766 [01:44<01:25, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 55%|▌| 422/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.761281967163086, Poisson: -0.10456900298595428
Epoch 0: 55%|▌| 423/766 [01:44<01:24, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 55%|▌| 423/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.932226181030273, Poisson: -0.11030212044715881
Epoch 0: 55%|▌| 424/766 [01:44<01:24, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 55%|▌| 424/766 [01:45<01:24, 4.03it/s, v_num=a0al, train_loss_step=22
Multinomial: 18.940080642700195, Poisson: -0.09015578031539917
Epoch 0: 55%|▌| 425/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 55%|▌| 425/766 [01:45<01:24, 4.03it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.045801162719727, Poisson: -0.09577429294586182
Epoch 0: 56%|▌| 426/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=18
Epoch 0: 56%|▌| 426/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=20
Multinomial: 18.947856903076172, Poisson: -0.09007912874221802
Epoch 0: 56%|▌| 427/766 [01:45<01:23, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 56%|▌| 427/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.19378662109375, Poisson: -0.10146069526672363
Epoch 0: 56%|▌| 428/766 [01:45<01:23, 4.04it/s, v_num=a0al, train_loss_step=18
Epoch 0: 56%|▌| 428/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.0323429107666, Poisson: -0.09605273604393005
Epoch 0: 56%|▌| 429/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 56%|▌| 429/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.192474365234375, Poisson: -0.10159146785736084
Epoch 0: 56%|▌| 430/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=19
Epoch 0: 56%|▌| 430/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.593473434448242, Poisson: -0.09863156080245972
Epoch 0: 56%|▌| 431/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 56%|▌| 431/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=20
Multinomial: 23.461124420166016, Poisson: -0.11302480101585388
Epoch 0: 56%|▌| 432/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=20
Epoch 0: 56%|▌| 432/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23
Multinomial: 18.86369514465332, Poisson: -0.08997032791376114
Epoch 0: 57%|▌| 433/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23
Epoch 0: 57%|▌| 433/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=18
Multinomial: 23.45587158203125, Poisson: -0.11291443556547165
Epoch 0: 57%|▌| 434/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=18
Epoch 0: 57%|▌| 434/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23
Multinomial: 21.766098022460938, Poisson: -0.10475729405879974
Epoch 0: 57%|▌| 435/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=23
Epoch 0: 57%|▌| 435/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.309078216552734, Poisson: -0.10760536789894104
Epoch 0: 57%|▌| 436/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=21
Epoch 0: 57%|▌| 436/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.89189338684082, Poisson: -0.11007551848888397
Epoch 0: 57%|▌| 437/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=22
Epoch 0: 57%|▌| 437/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=22
Multinomial: 23.434974670410156, Poisson: -0.11298343539237976
Epoch 0: 57%|▌| 438/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=23

[... training progress output truncated: epoch 0, steps 438–601 of 766 at ~4.05 it/s; multinomial loss fluctuating in the ~17–25 range, Poisson loss around −0.08 to −0.12 ...]

Multinomial: 20.630062103271484, Poisson: -0.09916112571954727
Epoch 0: 78%|▊| 601/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 24.03969955444336, Poisson: -0.11638204008340836
Epoch 0: 79%|▊| 602/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 79%|▊| 602/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 20.574071884155273, Poisson: -0.09899208694696426
Epoch 0: 79%|▊| 603/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 79%|▊| 603/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.062055587768555, Poisson: -0.09617772698402405
Epoch 0: 79%|▊| 604/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 79%|▊| 604/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.900306701660156, Poisson: -0.11064215004444122
Epoch 0: 79%|▊| 605/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 79%|▊| 605/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.45872688293457, Poisson: -0.0931672751903534
Epoch 0: 79%|▊| 606/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 79%|▊| 606/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.61547088623047, Poisson: -0.09892795234918594
Epoch 0: 79%|▊| 607/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 79%|▊| 607/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.748783111572266, Poisson: -0.1048136055469513
Epoch 0: 79%|▊| 608/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 79%|▊| 608/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.059423446655273, Poisson: -0.0961209386587143
Epoch 0: 80%|▊| 609/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 80%|▊| 609/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.010229110717773, Poisson: -0.09607797861099243
Epoch 0: 80%|▊| 610/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 80%|▊| 610/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 23.505292892456055, Poisson: -0.11350621283054352
Epoch 0: 80%|▊| 611/766 [02:30<00:38, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 80%|▊| 611/766 [02:30<00:38, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 20.647974014282227, Poisson: -0.09891486167907715
Epoch 0: 80%|▊| 612/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 80%|▊| 612/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.9503173828125, Poisson: -0.11074764281511307
Epoch 0: 80%|▊| 613/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 80%|▊| 613/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.505077362060547, Poisson: -0.09333376586437225
Epoch 0: 80%|▊| 614/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 80%|▊| 614/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.72488021850586, Poisson: -0.10470889508724213
Epoch 0: 80%|▊| 615/766 [02:31<00:37, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 80%|▊| 615/766 [02:31<00:37, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 25.20842170715332, Poisson: -0.12234365195035934
Epoch 0: 80%|▊| 616/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 80%|▊| 616/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=25
Multinomial: 19.443464279174805, Poisson: -0.09326735883951187
Epoch 0: 81%|▊| 617/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=25
Epoch 0: 81%|▊| 617/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.06236457824707, Poisson: -0.09638020396232605
Epoch 0: 81%|▊| 618/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 81%|▊| 618/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.628524780273438, Poisson: -0.0990341380238533
Epoch 0: 81%|▊| 619/766 [02:32<00:36, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 81%|▊| 619/766 [02:32<00:36, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.54349136352539, Poisson: -0.09893681108951569
Epoch 0: 81%|▊| 620/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 81%|▊| 620/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.07408332824707, Poisson: -0.09616923332214355
Epoch 0: 81%|▊| 621/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 81%|▊| 621/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 23.465482711791992, Poisson: -0.11332377791404724
Epoch 0: 81%|▊| 622/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 81%|▊| 622/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 22.919734954833984, Poisson: -0.11065004765987396
Epoch 0: 81%|▊| 623/766 [02:33<00:35, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 81%|▊| 623/766 [02:33<00:35, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 20.591352462768555, Poisson: -0.09888836741447449
Epoch 0: 81%|▊| 624/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 81%|▊| 624/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.751014709472656, Poisson: -0.10491835325956345
Epoch 0: 82%|▊| 625/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 82%|▊| 625/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.052703857421875, Poisson: -0.09618022292852402
Epoch 0: 82%|▊| 626/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 82%|▊| 626/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 24.039308547973633, Poisson: -0.11643020063638687
Epoch 0: 82%|▊| 627/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 82%|▊| 627/766 [02:34<00:34, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 19.47958755493164, Poisson: -0.09322299808263779
Epoch 0: 82%|▊| 628/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 82%|▊| 628/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.07942771911621, Poisson: -0.09612985700368881
Epoch 0: 82%|▊| 629/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 82%|▊| 629/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.62293815612793, Poisson: -0.09899833798408508
Epoch 0: 82%|▊| 630/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 82%|▊| 630/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.180599212646484, Poisson: -0.10211139917373657
Epoch 0: 82%|▊| 631/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 82%|▊| 631/766 [02:35<00:33, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 19.43415069580078, Poisson: -0.09314945340156555
Epoch 0: 83%|▊| 632/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 83%|▊| 632/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.04065704345703, Poisson: -0.0961098000407219
Epoch 0: 83%|▊| 633/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 83%|▊| 633/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.605737686157227, Poisson: -0.09911376237869263
Epoch 0: 83%|▊| 634/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 83%|▊| 634/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.07790756225586, Poisson: -0.09614162147045135
Epoch 0: 83%|▊| 635/766 [02:36<00:32, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 83%|▊| 635/766 [02:36<00:32, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.729204177856445, Poisson: -0.10484756529331207
Epoch 0: 83%|▊| 636/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 83%|▊| 636/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.351215362548828, Poisson: -0.08738947659730911
Epoch 0: 83%|▊| 637/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 83%|▊| 637/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.743696212768555, Poisson: -0.1048312559723854
Epoch 0: 83%|▊| 638/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=18
Epoch 0: 83%|▊| 638/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.89417266845703, Poisson: -0.11054882407188416
Epoch 0: 83%|▊| 639/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 83%|▊| 639/766 [02:37<00:31, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 19.442575454711914, Poisson: -0.09320535510778427
Epoch 0: 84%|▊| 640/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 84%|▊| 640/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 18.3289852142334, Poisson: -0.08738420903682709
Epoch 0: 84%|▊| 641/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 84%|▊| 641/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 19.37784194946289, Poisson: -0.09329848736524582
Epoch 0: 84%|▊| 642/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=18
Epoch 0: 84%|▊| 642/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.74251365661621, Poisson: -0.10472816228866577
Epoch 0: 84%|▊| 643/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19
Epoch 0: 84%|▊| 643/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.053268432617188, Poisson: -0.09602104127407074
Epoch 0: 84%|▊| 644/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 84%|▊| 644/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.585439682006836, Poisson: -0.0989774540066719
Epoch 0: 84%|▊| 645/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 84%|▊| 645/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.32609748840332, Poisson: -0.10777314007282257
Epoch 0: 84%|▊| 646/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 84%|▊| 646/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.32042694091797, Poisson: -0.10790190100669861
Epoch 0: 84%|▊| 647/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 84%|▊| 647/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 18.8642578125, Poisson: -0.09032086282968521
Epoch 0: 85%|▊| 648/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 85%|▊| 648/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 18.9396915435791, Poisson: -0.09038145840167999
Epoch 0: 85%|▊| 649/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18
Epoch 0: 85%|▊| 649/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 24.03581428527832, Poisson: -0.11642525345087051
Epoch 0: 85%|▊| 650/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18
Epoch 0: 85%|▊| 650/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 24.04706573486328, Poisson: -0.11630402505397797
Epoch 0: 85%|▊| 651/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 85%|▊| 651/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 17.720582962036133, Poisson: -0.08459357917308807
Epoch 0: 85%|▊| 652/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 85%|▊| 652/766 [02:40<00:28, 4.07it/s, v_num=a0al, train_loss_step=17
Multinomial: 17.181501388549805, Poisson: -0.0816231518983841
Epoch 0: 85%|▊| 653/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=17
Epoch 0: 85%|▊| 653/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=17
Multinomial: 23.45564842224121, Poisson: -0.11357060074806213
Epoch 0: 85%|▊| 654/766 [02:40<00:27, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 85%|▊| 654/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 21.209218978881836, Poisson: -0.10190658271312714
Epoch 0: 86%|▊| 655/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23
Epoch 0: 86%|▊| 655/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.60980224609375, Poisson: -0.09892372041940689
Epoch 0: 86%|▊| 656/766 [02:40<00:26, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 86%|▊| 656/766 [02:41<00:27, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.1964054107666, Poisson: -0.10195201635360718
Epoch 0: 86%|▊| 657/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 86%|▊| 657/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.74530792236328, Poisson: -0.10479382425546646
Epoch 0: 86%|▊| 658/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 86%|▊| 658/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.213241577148438, Poisson: -0.10197531431913376
Epoch 0: 86%|▊| 659/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 86%|▊| 659/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.750818252563477, Poisson: -0.10478349030017853
Epoch 0: 86%|▊| 660/766 [02:42<00:26, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 86%|▊| 660/766 [02:42<00:26, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.8967227935791, Poisson: -0.11057905852794647
Epoch 0: 86%|▊| 661/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 86%|▊| 661/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.184062957763672, Poisson: -0.10184145718812943
Epoch 0: 86%|▊| 662/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=22
Epoch 0: 86%|▊| 662/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 17.182111740112305, Poisson: -0.08155201375484467
Epoch 0: 87%|▊| 663/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 87%|▊| 663/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=17
Multinomial: 22.304018020629883, Poisson: -0.10763479024171829
Epoch 0: 87%|▊| 664/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 87%|▊| 664/766 [02:43<00:25, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.771034240722656, Poisson: -0.1047348603606224
Epoch 0: 87%|▊| 665/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 87%|▊| 665/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.2939395904541, Poisson: -0.0873831957578659
Epoch 0: 87%|▊| 666/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 87%|▊| 666/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.618026733398438, Poisson: -0.09891299903392792
Epoch 0: 87%|▊| 667/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 87%|▊| 667/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.608810424804688, Poisson: -0.09906212240457535
Epoch 0: 87%|▊| 668/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 87%|▊| 668/766 [02:44<00:24, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 22.322099685668945, Poisson: -0.10757517069578171
Epoch 0: 87%|▊| 669/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 87%|▊| 669/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.787355422973633, Poisson: -0.1049061119556427
Epoch 0: 87%|▊| 670/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=22
Epoch 0: 87%|▊| 670/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.766332626342773, Poisson: -0.10484164953231812
Epoch 0: 88%|▉| 671/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 88%|▉| 671/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.71084976196289, Poisson: -0.10478072613477707
Epoch 0: 88%|▉| 672/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 88%|▉| 672/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 24.059356689453125, Poisson: -0.11647970974445343
Epoch 0: 88%|▉| 673/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 88%|▉| 673/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 21.204574584960938, Poisson: -0.10201311111450195
Epoch 0: 88%|▉| 674/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=23
Epoch 0: 88%|▉| 674/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 24.584774017333984, Poisson: -0.1193760558962822
Epoch 0: 88%|▉| 675/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 88%|▉| 675/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=24
Multinomial: 20.663442611694336, Poisson: -0.09896506369113922
Epoch 0: 88%|▉| 676/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=24
Epoch 0: 88%|▉| 676/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.19459342956543, Poisson: -0.1019076555967331
Epoch 0: 88%|▉| 677/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 88%|▉| 677/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 17.178815841674805, Poisson: -0.08161477744579315
Epoch 0: 89%|▉| 678/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 89%|▉| 678/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=17
Multinomial: 21.783733367919922, Poisson: -0.10473403334617615
Epoch 0: 89%|▉| 679/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 89%|▉| 679/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 23.467281341552734, Poisson: -0.11346116662025452
Epoch 0: 89%|▉| 680/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21
Epoch 0: 89%|▉| 680/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=23
Multinomial: 20.04427146911621, Poisson: -0.09604737907648087
Epoch 0: 89%|▉| 681/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=23
Epoch 0: 89%|▉| 681/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.142990112304688, Poisson: -0.101886086165905
Epoch 0: 89%|▉| 682/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 89%|▉| 682/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.709651947021484, Poisson: -0.1048181876540184
Epoch 0: 89%|▉| 683/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 89%|▉| 683/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.052165985107422, Poisson: -0.09610036015510559
Epoch 0: 89%|▉| 684/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 89%|▉| 684/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 15.995346069335938, Poisson: -0.07582859694957733
Epoch 0: 89%|▉| 685/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=20
Epoch 0: 89%|▉| 685/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=15
Multinomial: 18.30655288696289, Poisson: -0.08745232969522476
Epoch 0: 90%|▉| 686/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=15
Epoch 0: 90%|▉| 686/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=18
Multinomial: 20.636791229248047, Poisson: -0.09912166744470596
Epoch 0: 90%|▉| 687/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 90%|▉| 687/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=20
Multinomial: 21.205366134643555, Poisson: -0.10186842828989029
Epoch 0: 90%|▉| 688/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 90%|▉| 688/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.31546401977539, Poisson: -0.10762669146060944
Epoch 0: 90%|▉| 689/766 [02:48<00:18, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 90%|▉| 689/766 [02:49<00:18, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 22.30607795715332, Poisson: -0.10776594281196594
Epoch 0: 90%|▉| 690/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=22
Epoch 0: 90%|▉| 690/766 [02:49<00:18, 4.07it/s, v_num=a0al, train_loss_step=22
Multinomial: 17.707111358642578, Poisson: -0.08455497026443481
Epoch 0: 90%|▉| 691/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=22
Epoch 0: 90%|▉| 691/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=17
Multinomial: 20.056251525878906, Poisson: -0.09613558650016785
Epoch 0: 90%|▉| 692/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 90%|▉| 692/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=20
Multinomial: 17.71844482421875, Poisson: -0.0844632163643837
Epoch 0: 90%|▉| 693/766 [02:49<00:17, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 90%|▉| 693/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=17
Multinomial: 20.011394500732422, Poisson: -0.09618280827999115
Epoch 0: 91%|▉| 694/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 91%|▉| 694/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.220170974731445, Poisson: -0.1019996628165245
Epoch 0: 91%|▉| 695/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 91%|▉| 695/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.046058654785156, Poisson: -0.09611000120639801
Epoch 0: 91%|▉| 696/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 91%|▉| 696/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 17.776384353637695, Poisson: -0.08445380628108978
Epoch 0: 91%|▉| 697/766 [02:50<00:16, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 91%|▉| 697/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=17
Multinomial: 20.02248191833496, Poisson: -0.0961388349533081
Epoch 0: 91%|▉| 698/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 91%|▉| 698/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.788034439086914, Poisson: -0.10468914359807968
Epoch 0: 91%|▉| 699/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 91%|▉| 699/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.033443450927734, Poisson: -0.09614822268486023
Epoch 0: 91%|▉| 700/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 91%|▉| 700/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 19.43999671936035, Poisson: -0.0931275263428688
Epoch 0: 92%|▉| 701/766 [02:51<00:15, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 92%|▉| 701/766 [02:51<00:15, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 20.02468490600586, Poisson: -0.09612933546304703
Epoch 0: 92%|▉| 702/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 92%|▉| 702/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 17.720537185668945, Poisson: -0.08454470336437225
Epoch 0: 92%|▉| 703/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 92%|▉| 703/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=17
Multinomial: 22.856117248535156, Poisson: -0.11041641235351562
Epoch 0: 92%|▉| 704/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 92%|▉| 704/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=22
Multinomial: 21.749130249023438, Poisson: -0.10481631755828857
Epoch 0: 92%|▉| 705/766 [02:52<00:14, 4.08it/s, v_num=a0al, train_loss_step=22
Epoch 0: 92%|▉| 705/766 [02:52<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.150577545166016, Poisson: -0.10191968083381653
Epoch 0: 92%|▉| 706/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 92%|▉| 706/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.73512077331543, Poisson: -0.10484417527914047
Epoch 0: 92%|▉| 707/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 92%|▉| 707/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 17.735193252563477, Poisson: -0.08444802463054657
Epoch 0: 92%|▉| 708/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 92%|▉| 708/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=17
Multinomial: 21.171815872192383, Poisson: -0.10175672918558121
Epoch 0: 93%|▉| 709/766 [02:53<00:13, 4.08it/s, v_num=a0al, train_loss_step=17
Epoch 0: 93%|▉| 709/766 [02:53<00:13, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.140235900878906, Poisson: -0.10202343016862869
Epoch 0: 93%|▉| 710/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 93%|▉| 710/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.266754150390625, Poisson: -0.08740904927253723
Epoch 0: 93%|▉| 711/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 93%|▉| 711/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.7435245513916, Poisson: -0.1047716736793518
Epoch 0: 93%|▉| 712/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 93%|▉| 712/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 22.33485221862793, Poisson: -0.10773944854736328
Epoch 0: 93%|▉| 713/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 93%|▉| 713/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=22
Multinomial: 23.449464797973633, Poisson: -0.11353325098752975
Epoch 0: 93%|▉| 714/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=22
Epoch 0: 93%|▉| 714/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=23
Multinomial: 18.8591365814209, Poisson: -0.09025004506111145
Epoch 0: 93%|▉| 715/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=23
Epoch 0: 93%|▉| 715/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18
Multinomial: 18.2924861907959, Poisson: -0.08760888129472733
Epoch 0: 93%|▉| 716/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 93%|▉| 716/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18
Multinomial: 19.450326919555664, Poisson: -0.0932532325387001
Epoch 0: 94%|▉| 717/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 94%|▉| 717/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.754545211791992, Poisson: -0.10489057004451752
Epoch 0: 94%|▉| 718/766 [02:55<00:11, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 94%|▉| 718/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.621562957763672, Poisson: -0.09915497153997421
Epoch 0: 94%|▉| 719/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 94%|▉| 719/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.644670486450195, Poisson: -0.09900769591331482
Epoch 0: 94%|▉| 720/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 94%|▉| 720/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20
Multinomial: 19.482215881347656, Poisson: -0.09339278936386108
Epoch 0: 94%|▉| 721/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 94%|▉| 721/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.218652725219727, Poisson: -0.10194966197013855
Epoch 0: 94%|▉| 722/766 [02:56<00:10, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 94%|▉| 722/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 18.85138511657715, Poisson: -0.09043137729167938
Epoch 0: 94%|▉| 723/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 94%|▉| 723/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=18
Multinomial: 21.170902252197266, Poisson: -0.10202258080244064
Epoch 0: 95%|▉| 724/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=18
Epoch 0: 95%|▉| 724/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 21.220476150512695, Poisson: -0.10178679972887039
Epoch 0: 95%|▉| 725/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 95%|▉| 725/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.098186492919922, Poisson: -0.09609003365039825
Epoch 0: 95%|▉| 726/766 [02:57<00:09, 4.08it/s, v_num=a0al, train_loss_step=21
Epoch 0: 95%|▉| 726/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=20
Multinomial: 20.026958465576172, Poisson: -0.09608585387468338
Epoch 0: 95%|▉| 727/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=20
Epoch 0: 95%|▉| 727/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=19
Multinomial: 21.762968063354492, Poisson: -0.10483455657958984
Epoch 0: 95%|▉| 728/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=19
Epoch 0: 95%|▉| 728/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=21
Multinomial: 20.0172061920166, Poisson: -0.0962311178445816
...
Multinomial: 19.509836196899414, Poisson: -0.09323473274707794
Epoch 0: 100%|█| 766/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=19
Validation: |          | 0/? [00:00<?, ?it/s]
Validation DataLoader 0:   0%|          | 0/71 [00:00<?, ?it/s]
Multinomial: 17.715787887573242, Poisson: -0.08436138182878494
...
Multinomial: 24.04494857788086, Poisson: -0.11639108508825302
Validation DataLoader 0: 100%|██████████████████| 71/71 [00:06<00:00, 11.45it/s]
Epoch 0: 100%|█| 766/766 [03:14<00:00, 3.93it/s, v_num=a0al, train_loss_step=19
`Trainer.fit` stopped: `max_epochs=1` reached.
Epoch 0: 100%|█| 766/766 [03:20<00:00, 3.83it/s, v_num=a0al, train_loss_step=19
wandb:
wandb: 🚀 View run finetune_test_0 at:
# Uncomment if necessary
# import wandb
# wandb.login(host="https://genentech.wandb.io", anonymous="never", relogin=True)
8. Make and evaluate predictions using trained models¶
Using the training command above, we trained a single model. Decima is normally run with multiple trained replicates; for demonstration purposes, we reuse the same checkpoint twice here. We can now use these models to predict gene expression:
checkpoint = glob.glob(os.path.join(outdir, "lightning_logs/*/checkpoints/*.ckpt"))[0]
print(checkpoint)
./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt
# comma-separated list of model checkpoints
checkpoint_list = ",".join([checkpoint, checkpoint])
checkpoint_list
'./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt,./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt'
! CUDA_VISIBLE_DEVICES=0 decima predict-genes \
--output example/test_preds.h5ad \
--model {checkpoint_list} \
--metadata {ad_file_path} \
--device 0 \
--batch-size 8 \
--num-workers 32 \
--max_seq_shift 0 \
--genome hg38 \
--save-replicates
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
warnings.warn(
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
warnings.warn(
decima - INFO - Using device: 0 and genome: hg38 for prediction.
decima - INFO - Loading model ['./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt', './example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt']...
decima - INFO - Making predictions
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/__init__.py:1617: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)
💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/utils/data/dataloader.py:627: UserWarning: This DataLoader will create 32 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
SLURM auto-requeueing enabled. Setting signal handlers.
Predicting: |          | 0/? [00:00<?, ?it/s]
Predicting DataLoader 0:   0%|          | 0/115 [00:00<?, ?it/s]
...
Predicting DataLoader 0: 100%|████████████████| 115/115 [03:18<00:00, 0.58it/s]
/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: The ``compute`` method of metric WarningCounter was called before the ``update`` method which may lead to errors, as metric states have not yet been updated.
decima - INFO - Creating anndata
decima - INFO - Evaluating performance
Performance on genes in the train dataset.
Mean Pearson Correlation per gene: Mean: 0.01.
Mean Pearson Correlation per gene using size factor (baseline): 0.03.
Mean Pearson Correlation per pseudobulk: 0.00
Performance on genes in the val dataset.
Mean Pearson Correlation per gene: Mean: -0.01.
Mean Pearson Correlation per gene using size factor (baseline): 0.06.
Mean Pearson Correlation per pseudobulk: -0.01
Performance on genes in the test dataset.
Mean Pearson Correlation per gene: Mean: -0.02.
Mean Pearson Correlation per gene using size factor (baseline): -0.00.
Mean Pearson Correlation per pseudobulk: -0.02
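The per-gene metric above correlates each gene's observed and predicted expression across pseudobulks, while the per-pseudobulk metric correlates across genes within each pseudobulk. As an illustration of the per-gene computation (not Decima's internal implementation), assuming `obs` and `pred` are (pseudobulks × genes) matrices:

```python
import numpy as np

def per_gene_pearson(obs: np.ndarray, pred: np.ndarray) -> np.ndarray:
    """Pearson correlation between observed and predicted expression,
    computed independently for each gene (column) across pseudobulks (rows)."""
    obs_c = obs - obs.mean(axis=0)
    pred_c = pred - pred.mean(axis=0)
    num = (obs_c * pred_c).sum(axis=0)
    denom = np.sqrt((obs_c ** 2).sum(axis=0) * (pred_c ** 2).sum(axis=0))
    return num / denom

# toy example: 5 pseudobulks x 3 genes, noisy "predictions"
rng = np.random.default_rng(0)
obs = rng.poisson(10, size=(5, 3)).astype(float)
pred = obs + rng.normal(0, 1, size=(5, 3))
r = per_gene_pearson(obs, pred)
print(r.shape)  # one correlation value per gene
```

Averaging `r` over the genes belonging to each dataset split gives the summary numbers reported above.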
We can open the output h5ad file to see the individual predictions and metrics.
ad_out = anndata.read_h5ad("example/test_preds.h5ad")
ad_out
AnnData object with n_obs × n_vars = 50 × 920
obs: 'cell_type', 'tissue', 'disease', 'study', 'size_factor', 'train_pearson', 'val_pearson', 'test_pearson'
var: 'chrom', 'start', 'end', 'strand', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'dataset', 'pearson', 'size_factor_pearson'
layers: 'preds', 'preds_finetune_test_0'
.layers['preds'] contains the predictions averaged across the supplied model checkpoints, while the additional replicate layers (here 'preds_finetune_test_0') contain the predictions made by the individual models. You will also see that performance metrics have been added to both .obs and .var.
ad_out.obs.head()
| cell_type | tissue | disease | study | size_factor | train_pearson | val_pearson | test_pearson | |
|---|---|---|---|---|---|---|---|---|
| pseudobulk_0 | ct_0 | t_0 | d_0 | st_0 | 4946.397461 | 0.010020 | 0.171944 | 0.122095 |
| pseudobulk_1 | ct_0 | t_0 | d_1 | st_0 | 4858.091797 | -0.024151 | 0.061900 | -0.169406 |
| pseudobulk_2 | ct_0 | t_0 | d_2 | st_1 | 4921.185547 | 0.007005 | -0.079252 | -0.094602 |
| pseudobulk_3 | ct_0 | t_0 | d_0 | st_1 | 4928.486816 | 0.016869 | -0.023038 | 0.007967 |
| pseudobulk_4 | ct_0 | t_0 | d_1 | st_2 | 4756.819336 | 0.050297 | 0.160398 | -0.101163 |
ad_out.var.head()
| chrom | start | end | strand | gene_start | gene_end | gene_length | gene_mask_start | gene_mask_end | dataset | pearson | size_factor_pearson | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| gene_0 | chr1 | 26191000 | 26715288 | + | 26354840 | 26879128 | 524288 | 163840 | 524288 | train | 0.177304 | -0.062494 |
| gene_1 | chr19 | 41275257 | 41799545 | - | 41111417 | 41635705 | 524288 | 163840 | 524288 | train | 0.049450 | -0.037428 |
| gene_2 | chr1 | 79937866 | 80462154 | - | 79774026 | 80298314 | 524288 | 163840 | 524288 | train | -0.095439 | 0.240203 |
| gene_4 | chr16 | 3905208 | 4429496 | - | 3741368 | 4265656 | 524288 | 163840 | 524288 | train | -0.092946 | -0.042283 |
| gene_5 | chr10 | 22495641 | 23019929 | + | 22659481 | 23183769 | 524288 | 163840 | 524288 | train | -0.310151 | -0.069181 |
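Because the `dataset` column in `.var` marks the train/val/test split, the per-gene metrics can be summarized per split directly from the output anndata. A minimal sketch using a toy dataframe that mirrors the output columns above (with `ad_out`, you would use `ad_out.var` instead):

```python
import pandas as pd

# illustrative stand-in for ad_out.var, with the columns shown above
var = pd.DataFrame({
    "dataset": ["train", "train", "val", "test"],
    "pearson": [0.18, 0.05, -0.10, 0.02],
    "size_factor_pearson": [-0.06, -0.04, 0.24, -0.07],
})

# mean per-gene correlation within each split: model vs. size-factor baseline
summary = var.groupby("dataset")[["pearson", "size_factor_pearson"]].mean()
print(summary)
```

Comparing the `pearson` column against `size_factor_pearson` per split shows whether the fine-tuned model beats the size-factor baseline on held-out genes.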