{
"cells": [
{
"cell_type": "markdown",
"id": "10fdb752-2248-4e3a-9678-2e0bf2288790",
"metadata": {},
"source": [
"# Fine-tuning Borzoi to create a Decima model"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "c6dbf5fc-85ca-42a8-b076-ba0313604e91",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:18.270014Z",
"iopub.status.busy": "2025-11-21T22:29:18.269887Z",
"iopub.status.idle": "2025-11-21T22:29:26.533418Z",
"shell.execute_reply": "2025-11-21T22:29:26.532666Z"
}
},
"outputs": [],
"source": [
"import glob\n",
"import anndata\n",
"import scanpy as sc\n",
"import pandas as pd\n",
"import bioframe as bf\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a8563bf1-0305-437b-81fa-0584753c5793",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.535200Z",
"iopub.status.busy": "2025-11-21T22:29:26.534846Z",
"iopub.status.idle": "2025-11-21T22:29:26.537436Z",
"shell.execute_reply": "2025-11-21T22:29:26.537029Z"
}
},
"outputs": [],
"source": [
"inputdir = \"./data\"\n",
"outdir = \"./example\"\n",
"ad_file_path = os.path.join(inputdir, \"data.h5ad\")\n",
"h5_file_path = os.path.join(outdir, \"data.h5\")"
]
},
{
"cell_type": "markdown",
"id": "4215ffc0-6a14-44b4-b522-7d4322a7cafe",
"metadata": {},
"source": [
"## 1. Load input anndata file"
]
},
{
"cell_type": "markdown",
"id": "f36c3e31-e447-42b8-b785-5d75b1a1007f",
"metadata": {},
"source": [
"The input AnnData file must have the shape (pseudobulks × genes): one row per pseudobulk and one column per gene."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "83273dca-0622-42e2-a606-b645e0a31f19",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.538717Z",
"iopub.status.busy": "2025-11-21T22:29:26.538588Z",
"iopub.status.idle": "2025-11-21T22:29:26.580706Z",
"shell.execute_reply": "2025-11-21T22:29:26.580293Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AnnData object with n_obs × n_vars = 50 × 921\n",
" obs: 'cell_type', 'tissue', 'disease', 'study'\n",
" var: 'chrom', 'start', 'end', 'strand', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'dataset'\n",
" uns: 'log1p'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad = sc.read(ad_file_path)\n",
"ad"
]
},
{
"cell_type": "markdown",
"id": "dcb6c9a7-5e97-46fc-a2d3-fe029821c375",
"metadata": {},
"source": [
"`.obs` should be a dataframe with a unique index per pseudobulk. You can also include other columns with metadata about the pseudobulks, e.g. cell type, tissue, disease, study, number of cells, total counts.\n",
"\n",
"Note that the original Decima model does NOT separate pseudobulks by sample: different samples from the same cell type, tissue, disease and study were merged. We also recommend filtering out pseudobulks with few cells or low read counts."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e29ca0c0-5f61-4146-b187-d11cc57373d0",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.581807Z",
"iopub.status.busy": "2025-11-21T22:29:26.581680Z",
"iopub.status.idle": "2025-11-21T22:29:26.602279Z",
"shell.execute_reply": "2025-11-21T22:29:26.601917Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>cell_type</th><th>tissue</th><th>disease</th><th>study</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>pseudobulk_0</th><td>ct_0</td><td>t_0</td><td>d_0</td><td>st_0</td></tr>\n",
"    <tr><th>pseudobulk_1</th><td>ct_0</td><td>t_0</td><td>d_1</td><td>st_0</td></tr>\n",
"    <tr><th>pseudobulk_2</th><td>ct_0</td><td>t_0</td><td>d_2</td><td>st_1</td></tr>\n",
"    <tr><th>pseudobulk_3</th><td>ct_0</td><td>t_0</td><td>d_0</td><td>st_1</td></tr>\n",
"    <tr><th>pseudobulk_4</th><td>ct_0</td><td>t_0</td><td>d_1</td><td>st_2</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" cell_type tissue disease study\n",
"pseudobulk_0 ct_0 t_0 d_0 st_0\n",
"pseudobulk_1 ct_0 t_0 d_1 st_0\n",
"pseudobulk_2 ct_0 t_0 d_2 st_1\n",
"pseudobulk_3 ct_0 t_0 d_0 st_1\n",
"pseudobulk_4 ct_0 t_0 d_1 st_2"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.obs.head()"
]
},
{
"cell_type": "markdown",
"id": "ab69e185-0d58-41e1-8d04-c60a4ed24ef5",
"metadata": {},
"source": [
"`.var` should be a dataframe with a unique index per gene. The index can be the gene name or Ensembl ID, as long as it is unique. Other essential columns are: chrom, start, end and strand (the gene coordinates).\n",
"\n",
"You can also include other columns with metadata about the genes, e.g. Ensembl ID, type of gene."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a79a70c0-5a33-46dc-b363-4e9df6ab2b8a",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.603573Z",
"iopub.status.busy": "2025-11-21T22:29:26.603444Z",
"iopub.status.idle": "2025-11-21T22:29:26.609192Z",
"shell.execute_reply": "2025-11-21T22:29:26.608798Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>gene_0</th><td>chr1</td><td>26354840</td><td>26879128</td><td>+</td><td>26518680</td><td>27042968</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_1</th><td>chr19</td><td>41111417</td><td>41635705</td><td>-</td><td>40947577</td><td>41471865</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_2</th><td>chr1</td><td>79774026</td><td>80298314</td><td>-</td><td>79610186</td><td>80134474</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_4</th><td>chr16</td><td>3741368</td><td>4265656</td><td>-</td><td>3577528</td><td>4101816</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_5</th><td>chr10</td><td>22659481</td><td>23183769</td><td>+</td><td>22823321</td><td>23347609</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26354840 26879128 + 26518680 27042968 524288 \n",
"gene_1 chr19 41111417 41635705 - 40947577 41471865 524288 \n",
"gene_2 chr1 79774026 80298314 - 79610186 80134474 524288 \n",
"gene_4 chr16 3741368 4265656 - 3577528 4101816 524288 \n",
"gene_5 chr10 22659481 23183769 + 22823321 23347609 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset \n",
"gene_0 163840 524288 train \n",
"gene_1 163840 524288 train \n",
"gene_2 163840 524288 train \n",
"gene_4 163840 524288 train \n",
"gene_5 163840 524288 train "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.head()"
]
},
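{
"cell_type": "markdown",
"id": "sketch-layout-check",
"metadata": {},
"source": [
"The requirements on `.obs` and `.var` described above can be checked programmatically. The following is a minimal sketch, not part of Decima: the helper name `check_anndata_layout` and the constant `REQUIRED_VAR_COLUMNS` are hypothetical and only encode the rules stated in this notebook.\n",
"\n",
"```python\n",
"# Hypothetical list of required columns, taken from the text above\n",
"REQUIRED_VAR_COLUMNS = [\"chrom\", \"start\", \"end\", \"strand\"]\n",
"\n",
"\n",
"def check_anndata_layout(ad):\n",
"    \"\"\"Return a list of problems found in ad.obs and ad.var (empty if none).\"\"\"\n",
"    problems = []\n",
"    if not ad.obs.index.is_unique:\n",
"        problems.append(\"obs index is not unique\")\n",
"    if not ad.var.index.is_unique:\n",
"        problems.append(\"var index is not unique\")\n",
"    missing = [c for c in REQUIRED_VAR_COLUMNS if c not in ad.var.columns]\n",
"    if missing:\n",
"        problems.append(f\"var is missing columns: {missing}\")\n",
"    return problems\n",
"```\n",
"\n",
"Calling `check_anndata_layout(ad)` on a correctly formatted object returns an empty list."
]
},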
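{
"cell_type": "markdown",
"id": "55cd3f29-8bf7-47f8-8942-bd906a856ab7",
"metadata": {},
"source": [
"`.X` should contain the total counts per gene and pseudobulk. These should be non-negative integers. Note that the example file used here appears to be already normalized and log-transformed (the values shown below are not integers), so it only illustrates the expected layout; start from raw counts when preparing your own data."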
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3fd9bd2a-e728-4dc5-9c90-a9558fab0e27",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.610372Z",
"iopub.status.busy": "2025-11-21T22:29:26.610248Z",
"iopub.status.idle": "2025-11-21T22:29:26.613043Z",
"shell.execute_reply": "2025-11-21T22:29:26.612649Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0. , 7.2926292, 7.2926292, 7.2926292, 7.2926292],\n",
" [7.3133874, 7.3133874, 0. , 7.3133874, 7.3133874],\n",
" [7.299993 , 7.299993 , 7.299993 , 7.299993 , 0. ],\n",
" [7.299993 , 0. , 7.299993 , 7.299993 , 0. ],\n",
" [7.3376517, 7.3376517, 0. , 7.3376517, 7.3376517]],\n",
" dtype=float32)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.X[:5, :5]"
]
},
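{
"cell_type": "markdown",
"id": "sketch-raw-count-check",
"metadata": {},
"source": [
"Whether a matrix holds raw counts can be tested directly. The sketch below is one possible check, assuming `.X` is either a dense NumPy array or a SciPy sparse matrix; the helper name `looks_like_raw_counts` is hypothetical and not part of Decima or scanpy.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import sparse\n",
"\n",
"\n",
"def looks_like_raw_counts(X):\n",
"    \"\"\"Return True if every entry of X is a non-negative integer value.\"\"\"\n",
"    # For sparse matrices only the stored values need checking;\n",
"    # implicit zeros are valid counts.\n",
"    values = X.data if sparse.issparse(X) else np.asarray(X)\n",
"    return bool(np.all(values >= 0) and np.all(values == np.round(values)))\n",
"```\n",
"\n",
"For the example values printed above, `looks_like_raw_counts(ad.X)` returns `False`, because entries such as 7.2926292 are not integers."
]
},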
{
"cell_type": "markdown",
"id": "9514b0b9-9cc3-48f0-9e70-897c1cb55962",
"metadata": {},
"source": [
"## 2. Normalize and log transform data"
]
},
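{
"cell_type": "markdown",
"id": "cbca4273-1752-47dd-9b3a-9b29266787e3",
"metadata": {},
"source": [
"We first transform the counts to log(CPM + 1) values, where CPM stands for counts per million: the counts of each pseudobulk are scaled to sum to one million before the log transform is applied."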
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "34115f7a-aaf8-4ca3-abbb-a4fc552bf5a7",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.614156Z",
"iopub.status.busy": "2025-11-21T22:29:26.614039Z",
"iopub.status.idle": "2025-11-21T22:29:26.616992Z",
"shell.execute_reply": "2025-11-21T22:29:26.616589Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING: adata.X seems to be already log-transformed.\n"
]
}
],
"source": [
"sc.pp.normalize_total(ad, target_sum=1e6)\n",
"sc.pp.log1p(ad)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e42a91c7-ac01-45b3-8d3b-6c99baf7adff",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.618117Z",
"iopub.status.busy": "2025-11-21T22:29:26.617994Z",
"iopub.status.idle": "2025-11-21T22:29:26.620685Z",
"shell.execute_reply": "2025-11-21T22:29:26.620286Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0. , 7.295568 , 7.295568 , 7.295568 , 7.295568 ],\n",
" [7.316388 , 7.316388 , 0. , 7.316388 , 7.316388 ],\n",
" [7.3014727, 7.3014727, 7.3014727, 7.3014727, 0. ],\n",
" [7.3014727, 0. , 7.3014727, 7.3014727, 0. ],\n",
" [7.3407264, 7.3407264, 0. , 7.3407264, 7.3407264]],\n",
" dtype=float32)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.X[:5, :5]"
]
},
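{
"cell_type": "markdown",
"id": "sketch-log-cpm",
"metadata": {},
"source": [
"For intuition, the two scanpy calls above amount to the following arithmetic. This is a NumPy sketch on a hypothetical toy matrix, not the scanpy implementation; the helper name `log_cpm` is an assumption, and rows whose counts sum to zero are not handled.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def log_cpm(counts):\n",
"    \"\"\"Convert a (pseudobulks x genes) count matrix to log(CPM + 1).\"\"\"\n",
"    counts = np.asarray(counts, dtype=np.float64)\n",
"    # Total counts per pseudobulk (row)\n",
"    totals = counts.sum(axis=1, keepdims=True)\n",
"    # Scale every row so that it sums to one million\n",
"    cpm = counts / totals * 1e6\n",
"    # Natural log of (CPM + 1); zeros stay zero\n",
"    return np.log1p(cpm)\n",
"\n",
"\n",
"# Toy example: 2 pseudobulks x 3 genes (made-up numbers)\n",
"toy_counts = np.array([[10, 0, 90], [5, 5, 0]])\n",
"print(log_cpm(toy_counts))\n",
"```\n",
"\n",
"Because every row is rescaled to the same total, values become comparable across pseudobulks with different sequencing depth."
]
},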
{
"cell_type": "markdown",
"id": "1556f595-d4c7-4944-905e-060e4ae1c4f6",
"metadata": {},
"source": [
"## 3. Create intervals surrounding genes"
]
},
{
"cell_type": "markdown",
"id": "ff51db62-9c1d-4af7-b188-fed7b038e3fa",
"metadata": {},
"source": [
"Decima is trained on 524,288 bp sequences surrounding each gene. Therefore, we have to take the given gene coordinates and extend them to create intervals of this length."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "86905140-4b30-424b-91ce-090a0a56ebab",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:26.621914Z",
"iopub.status.busy": "2025-11-21T22:29:26.621802Z",
"iopub.status.idle": "2025-11-21T22:29:40.486339Z",
"shell.execute_reply": "2025-11-21T22:29:40.485817Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\n",
" warnings.warn(\n",
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\n",
" warnings.warn(\n"
]
}
],
"source": [
"from decima.data.preprocess import var_to_intervals"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d027eb23-48d6-40d9-9e30-b5c7691a7c53",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.487901Z",
"iopub.status.busy": "2025-11-21T22:29:40.487652Z",
"iopub.status.idle": "2025-11-21T22:29:40.494114Z",
"shell.execute_reply": "2025-11-21T22:29:40.493712Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>gene_0</th><td>chr1</td><td>26354840</td><td>26879128</td><td>+</td><td>26518680</td><td>27042968</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_1</th><td>chr19</td><td>41111417</td><td>41635705</td><td>-</td><td>40947577</td><td>41471865</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_2</th><td>chr1</td><td>79774026</td><td>80298314</td><td>-</td><td>79610186</td><td>80134474</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_4</th><td>chr16</td><td>3741368</td><td>4265656</td><td>-</td><td>3577528</td><td>4101816</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_5</th><td>chr10</td><td>22659481</td><td>23183769</td><td>+</td><td>22823321</td><td>23347609</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26354840 26879128 + 26518680 27042968 524288 \n",
"gene_1 chr19 41111417 41635705 - 40947577 41471865 524288 \n",
"gene_2 chr1 79774026 80298314 - 79610186 80134474 524288 \n",
"gene_4 chr16 3741368 4265656 - 3577528 4101816 524288 \n",
"gene_5 chr10 22659481 23183769 + 22823321 23347609 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset \n",
"gene_0 163840 524288 train \n",
"gene_1 163840 524288 train \n",
"gene_2 163840 524288 train \n",
"gene_4 163840 524288 train \n",
"gene_5 163840 524288 train "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.head()"
]
},
{
"cell_type": "markdown",
"id": "0f067d6e-f0d7-48f9-b973-032fac069159",
"metadata": {},
"source": [
"First, we copy the start and end columns to `gene_start` and `gene_end`. We also create a new column `gene_length`. "
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "566977ab-041f-4a3d-b10e-9b6fa717c98e",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.495318Z",
"iopub.status.busy": "2025-11-21T22:29:40.495198Z",
"iopub.status.idle": "2025-11-21T22:29:40.498415Z",
"shell.execute_reply": "2025-11-21T22:29:40.498008Z"
}
},
"outputs": [],
"source": [
"ad.var[\"gene_start\"] = ad.var.start.tolist()\n",
"ad.var[\"gene_end\"] = ad.var.end.tolist()\n",
"ad.var[\"gene_length\"] = ad.var[\"gene_end\"] - ad.var[\"gene_start\"]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "e23e95dd-6616-4f79-8d1b-3a0fe22816d0",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.499458Z",
"iopub.status.busy": "2025-11-21T22:29:40.499339Z",
"iopub.status.idle": "2025-11-21T22:29:40.504838Z",
"shell.execute_reply": "2025-11-21T22:29:40.504440Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>gene_0</th><td>chr1</td><td>26354840</td><td>26879128</td><td>+</td><td>26354840</td><td>26879128</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_1</th><td>chr19</td><td>41111417</td><td>41635705</td><td>-</td><td>41111417</td><td>41635705</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_2</th><td>chr1</td><td>79774026</td><td>80298314</td><td>-</td><td>79774026</td><td>80298314</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_4</th><td>chr16</td><td>3741368</td><td>4265656</td><td>-</td><td>3741368</td><td>4265656</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_5</th><td>chr10</td><td>22659481</td><td>23183769</td><td>+</td><td>22659481</td><td>23183769</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26354840 26879128 + 26354840 26879128 524288 \n",
"gene_1 chr19 41111417 41635705 - 41111417 41635705 524288 \n",
"gene_2 chr1 79774026 80298314 - 79774026 80298314 524288 \n",
"gene_4 chr16 3741368 4265656 - 3741368 4265656 524288 \n",
"gene_5 chr10 22659481 23183769 + 22659481 23183769 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset \n",
"gene_0 163840 524288 train \n",
"gene_1 163840 524288 train \n",
"gene_2 163840 524288 train \n",
"gene_4 163840 524288 train \n",
"gene_5 163840 524288 train "
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.head()"
]
},
{
"cell_type": "markdown",
"id": "33edbd2f-66f2-48db-ae30-f9aef55c78c3",
"metadata": {},
"source": [
"Now, we extend the gene coordinates to create enclosing intervals:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "c40ebdb5-2d8c-4cce-9685-61d5db0123f3",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.505916Z",
"iopub.status.busy": "2025-11-21T22:29:40.505798Z",
"iopub.status.idle": "2025-11-21T22:29:40.664023Z",
"shell.execute_reply": "2025-11-21T22:29:40.663629Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The interval size is 524288 bases. Of these, 163840 will be upstream of the gene start and 360448 will be downstream of the gene start.\n",
"0 intervals extended beyond the chromosome start and have been shifted\n",
"1 intervals extended beyond the chromosome end and have been shifted\n",
"1 intervals did not extend far enough upstream of the TSS and have been dropped\n"
]
}
],
"source": [
"ad = var_to_intervals(ad, chr_end_pad=10000, genome=\"hg38\")\n",
"# Replace genome name if necessary"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "191ed1e0-d34f-4aa4-a8e7-bc5919642528",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.665110Z",
"iopub.status.busy": "2025-11-21T22:29:40.664988Z",
"iopub.status.idle": "2025-11-21T22:29:40.670440Z",
"shell.execute_reply": "2025-11-21T22:29:40.670046Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>gene_0</th><td>chr1</td><td>26191000</td><td>26715288</td><td>+</td><td>26354840</td><td>26879128</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_1</th><td>chr19</td><td>41275257</td><td>41799545</td><td>-</td><td>41111417</td><td>41635705</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_2</th><td>chr1</td><td>79937866</td><td>80462154</td><td>-</td><td>79774026</td><td>80298314</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_4</th><td>chr16</td><td>3905208</td><td>4429496</td><td>-</td><td>3741368</td><td>4265656</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_5</th><td>chr10</td><td>22495641</td><td>23019929</td><td>+</td><td>22659481</td><td>23183769</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26191000 26715288 + 26354840 26879128 524288 \n",
"gene_1 chr19 41275257 41799545 - 41111417 41635705 524288 \n",
"gene_2 chr1 79937866 80462154 - 79774026 80298314 524288 \n",
"gene_4 chr16 3905208 4429496 - 3741368 4265656 524288 \n",
"gene_5 chr10 22495641 23019929 + 22659481 23183769 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset \n",
"gene_0 163840 524288 train \n",
"gene_1 163840 524288 train \n",
"gene_2 163840 524288 train \n",
"gene_4 163840 524288 train \n",
"gene_5 163840 524288 train "
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.head()"
]
},
{
"cell_type": "markdown",
"id": "5733a394-28f5-487e-b967-18aad52cf423",
"metadata": {},
"source": [
"The columns `start` and `end` now contain the start and end coordinates of the 524,288 bp intervals, while `gene_start` and `gene_end` keep the original gene coordinates."
]
},
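{
"cell_type": "markdown",
"id": "sketch-interval-arithmetic",
"metadata": {},
"source": [
"The coordinates above are consistent with a simple strand-aware rule: place 163,840 bp upstream of the gene start and let the interval run for 524,288 bp in the direction of transcription. The sketch below reproduces that arithmetic for two genes from the table. It is an illustration only, not the implementation of `var_to_intervals`, which additionally shifts intervals at chromosome ends and drops some genes (see the messages printed above); the function name `gene_to_interval` is hypothetical.\n",
"\n",
"```python\n",
"INTERVAL = 524288  # total interval length in bp\n",
"UPSTREAM = 163840  # bp placed upstream of the gene start\n",
"\n",
"\n",
"def gene_to_interval(gene_start, gene_end, strand):\n",
"    \"\"\"Return (start, end) of the interval around a gene.\"\"\"\n",
"    if strand == \"+\":\n",
"        # On the plus strand, upstream means smaller coordinates\n",
"        start = gene_start - UPSTREAM\n",
"        end = start + INTERVAL\n",
"    else:\n",
"        # On the minus strand, transcription begins at gene_end\n",
"        end = gene_end + UPSTREAM\n",
"        start = end - INTERVAL\n",
"    return start, end\n",
"\n",
"\n",
"print(gene_to_interval(26354840, 26879128, \"+\"))  # gene_0: (26191000, 26715288)\n",
"print(gene_to_interval(41111417, 41635705, \"-\"))  # gene_1: (41275257, 41799545)\n",
"```"
]
},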
{
"cell_type": "markdown",
"id": "1a8df107-f38e-428b-8bda-47101708ebc7",
"metadata": {},
"source": [
"## 3. Split genes into training, validation and test sets"
]
},
{
"cell_type": "markdown",
"id": "747d3b4b-784f-4735-af81-ab5db4002a9d",
"metadata": {},
"source": [
"We load the coordinates of the genomic regions used to train Borzoi:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "8f4db77e-71f1-4457-b5cf-8c5f8f98ad06",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.671608Z",
"iopub.status.busy": "2025-11-21T22:29:40.671488Z",
"iopub.status.idle": "2025-11-21T22:29:40.876755Z",
"shell.execute_reply": "2025-11-21T22:29:40.876312Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>fold</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>chr4</td><td>82524421</td><td>82721029</td><td>fold0</td></tr>\n",
"    <tr><th>1</th><td>chr13</td><td>18604798</td><td>18801406</td><td>fold0</td></tr>\n",
"    <tr><th>2</th><td>chr2</td><td>189923408</td><td>190120016</td><td>fold0</td></tr>\n",
"    <tr><th>3</th><td>chr10</td><td>59875743</td><td>60072351</td><td>fold0</td></tr>\n",
"    <tr><th>4</th><td>chr1</td><td>117109467</td><td>117306075</td><td>fold0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end fold\n",
"0 chr4 82524421 82721029 fold0\n",
"1 chr13 18604798 18801406 fold0\n",
"2 chr2 189923408 190120016 fold0\n",
"3 chr10 59875743 60072351 fold0\n",
"4 chr1 117109467 117306075 fold0"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"splits_file = \"https://raw.githubusercontent.com/calico/borzoi/main/data/sequences_human.bed.gz\"\n",
"# replace human with mouse for mm10 splits\n",
"splits = pd.read_table(splits_file, header=None, names=[\"chrom\", \"start\", \"end\", \"fold\"])\n",
"splits.head()"
]
},
{
"cell_type": "markdown",
"id": "e7229a5f-d27e-48b3-8590-a23474635542",
"metadata": {},
"source": [
"Now, we overlap our gene intervals with these regions:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "99d1a382-1384-41ea-b59a-bb4aaa26caba",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.877970Z",
"iopub.status.busy": "2025-11-21T22:29:40.877840Z",
"iopub.status.idle": "2025-11-21T22:29:40.907310Z",
"shell.execute_reply": "2025-11-21T22:29:40.906870Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>gene</th><th>fold_</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>gene_0</td><td>fold5</td></tr>\n",
"    <tr><th>15</th><td>gene_1</td><td>fold0</td></tr>\n",
"    <tr><th>30</th><td>gene_2</td><td>fold0</td></tr>\n",
"    <tr><th>44</th><td>gene_4</td><td>fold2</td></tr>\n",
"    <tr><th>59</th><td>gene_5</td><td>fold2</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" gene fold_\n",
"0 gene_0 fold5\n",
"15 gene_1 fold0\n",
"30 gene_2 fold0\n",
"44 gene_4 fold2\n",
"59 gene_5 fold2"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"overlaps = bf.overlap(ad.var.reset_index(names=\"gene\"), splits, how=\"left\")\n",
"overlaps = overlaps[[\"gene\", \"fold_\"]].drop_duplicates().astype(str)\n",
"overlaps.head()"
]
},
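{
"cell_type": "markdown",
"id": "26a1a415-5299-48d6-94e0-7ee008d2bcc3",
"metadata": {},
"source": [
"Based on the overlap, we divide our gene intervals into training, validation and test sets. Genes whose interval overlaps a `fold3` region go to the test set, genes overlapping `fold4` go to the validation set, and all remaining genes are used for training, so no training gene overlaps either held-out fold."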
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "14180e40-c296-4f6f-b353-a07f383a7aae",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.908528Z",
"iopub.status.busy": "2025-11-21T22:29:40.908396Z",
"iopub.status.idle": "2025-11-21T22:29:40.911444Z",
"shell.execute_reply": "2025-11-21T22:29:40.910959Z"
}
},
"outputs": [],
"source": [
"test_genes = overlaps.gene[overlaps.fold_ == \"fold3\"].tolist()\n",
"val_genes = overlaps.gene[overlaps.fold_ == \"fold4\"].tolist()\n",
"train_genes = set(overlaps.gene).difference(set(test_genes).union(val_genes))"
]
},
{
"cell_type": "markdown",
"id": "d6dc73a9-42d2-46c9-8176-34edbd70125d",
"metadata": {},
"source": [
"We then add this information back to `ad.var` as the column `dataset`."
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ed8411d8-aecf-4196-bcc3-7635c1b4f34a",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.912530Z",
"iopub.status.busy": "2025-11-21T22:29:40.912405Z",
"iopub.status.idle": "2025-11-21T22:29:40.918001Z",
"shell.execute_reply": "2025-11-21T22:29:40.917575Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/slurmjob.14477843/ipykernel_3516462/3109841685.py:1: ImplicitModificationWarning: Trying to modify attribute `.var` of view, initializing view as actual.\n"
]
}
],
"source": [
"ad.var[\"dataset\"] = \"test\"\n",
"ad.var.loc[ad.var.index.isin(val_genes), \"dataset\"] = \"val\"\n",
"ad.var.loc[ad.var.index.isin(train_genes), \"dataset\"] = \"train\""
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "6e71f3f9-61d5-496d-8b04-661d5e97c2b8",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.919025Z",
"iopub.status.busy": "2025-11-21T22:29:40.918905Z",
"iopub.status.idle": "2025-11-21T22:29:40.924425Z",
"shell.execute_reply": "2025-11-21T22:29:40.924026Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>gene_0</th><td>chr1</td><td>26191000</td><td>26715288</td><td>+</td><td>26354840</td><td>26879128</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_1</th><td>chr19</td><td>41275257</td><td>41799545</td><td>-</td><td>41111417</td><td>41635705</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_2</th><td>chr1</td><td>79937866</td><td>80462154</td><td>-</td><td>79774026</td><td>80298314</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_4</th><td>chr16</td><td>3905208</td><td>4429496</td><td>-</td><td>3741368</td><td>4265656</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"    <tr><th>gene_5</th><td>chr10</td><td>22495641</td><td>23019929</td><td>+</td><td>22659481</td><td>23183769</td><td>524288</td><td>163840</td><td>524288</td><td>train</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26191000 26715288 + 26354840 26879128 524288 \n",
"gene_1 chr19 41275257 41799545 - 41111417 41635705 524288 \n",
"gene_2 chr1 79937866 80462154 - 79774026 80298314 524288 \n",
"gene_4 chr16 3905208 4429496 - 3741368 4265656 524288 \n",
"gene_5 chr10 22495641 23019929 + 22659481 23183769 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset \n",
"gene_0 163840 524288 train \n",
"gene_1 163840 524288 train \n",
"gene_2 163840 524288 train \n",
"gene_4 163840 524288 train \n",
"gene_5 163840 524288 train "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.head()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "894e5f6f-1d0b-4c71-8f11-fcea38bea97d",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.925651Z",
"iopub.status.busy": "2025-11-21T22:29:40.925524Z",
"iopub.status.idle": "2025-11-21T22:29:40.928822Z",
"shell.execute_reply": "2025-11-21T22:29:40.928340Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dataset\n",
"train 766\n",
"test 83\n",
"val 71\n",
"Name: count, dtype: int64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad.var.dataset.value_counts()"
]
},
{
"cell_type": "markdown",
"id": "0a99d7e9-6b69-42ed-ae2b-fda6c0229035",
"metadata": {},
"source": [
"We have now divided the 920 genes that remain in our dataset into separate sets to be used for training (766 genes), validation (71 genes) and testing (83 genes)."
]
},
{
"cell_type": "markdown",
"id": "c3ce727d-ae1a-4e9b-bf57-c6856e4e21e7",
"metadata": {},
"source": [
"## 4. Save processed anndata"
]
},
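{
"cell_type": "markdown",
"id": "f950a054-a749-4a6b-a0c5-11b162e1babe",
"metadata": {},
"source": [
"We save the processed anndata file containing these intervals and data splits. Note that the cell below writes to `ad_file_path`, the same path the input was read from, so the input file is overwritten; use a different path if you want to keep the original."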
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "e55d7251-5372-4744-a1f3-411223d4eb35",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.929937Z",
"iopub.status.busy": "2025-11-21T22:29:40.929820Z",
"iopub.status.idle": "2025-11-21T22:29:40.998651Z",
"shell.execute_reply": "2025-11-21T22:29:40.998240Z"
}
},
"outputs": [],
"source": [
"ad.write_h5ad(ad_file_path)"
]
},
{
"cell_type": "markdown",
"id": "a348fd0c-746f-4f3f-9c22-1cef3220460c",
"metadata": {},
"source": [
"## 5. Create an hdf5 file"
]
},
{
"cell_type": "markdown",
"id": "f5b813a4-ad04-421c-9e13-a5cbb4c4885b",
"metadata": {},
"source": [
"To train Decima, we need to extract the genomic sequences for all the intervals and convert them to one-hot encoded format. We save these one-hot encoded inputs to an hdf5 file."
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "3b1473fd-fd72-41e7-a673-92fb474440ec",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:40.999863Z",
"iopub.status.busy": "2025-11-21T22:29:40.999743Z",
"iopub.status.idle": "2025-11-21T22:29:41.024192Z",
"shell.execute_reply": "2025-11-21T22:29:41.023764Z"
}
},
"outputs": [],
"source": [
"from decima.data.write_hdf5 import write_hdf5"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "925184a3",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:41.025478Z",
"iopub.status.busy": "2025-11-21T22:29:41.025360Z",
"iopub.status.idle": "2025-11-21T22:29:41.191312Z",
"shell.execute_reply": "2025-11-21T22:29:41.190672Z"
}
},
"outputs": [],
"source": [
"os.makedirs(outdir, exist_ok=True)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "81e3d301-db22-4a00-9cdc-f6ec7c641030",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:29:41.192756Z",
"iopub.status.busy": "2025-11-21T22:29:41.192597Z",
"iopub.status.idle": "2025-11-21T22:30:41.442605Z",
"shell.execute_reply": "2025-11-21T22:30:41.442010Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing metadata\n",
"Writing task indices\n",
"Writing genes array of shape: (920, 2)\n",
"Writing labels array of shape: (920, 50, 1)\n",
"Making gene masks\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing mask array of shape: (920, 534288)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Encoding sequences\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing sequence array of shape: (920, 534288)\n",
"Done!\n"
]
}
],
"source": [
"write_hdf5(file=h5_file_path, ad=ad, pad=5000, genome=\"hg38\")\n",
"# Change genome name if necessary"
]
},
{
"cell_type": "markdown",
"id": "2e7ebe9c-a523-4aff-8bcd-3faf46917caf",
"metadata": {},
"source": [
"## 6. Set training parameters"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "d9ca9ee3-c90a-4848-b9ed-899f21fbf39e",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:30:41.444116Z",
"iopub.status.busy": "2025-11-21T22:30:41.443952Z",
"iopub.status.idle": "2025-11-21T22:30:41.446732Z",
"shell.execute_reply": "2025-11-21T22:30:41.446334Z"
}
},
"outputs": [],
"source": [
"# Learning rate (default: 0.001)\n",
"lr = 5e-5\n",
"# Total weight parameter for the loss function\n",
"total_weight = 1e-4\n",
"# Number of gradient accumulation steps\n",
"grad = 5\n",
"# Batch size (default: 4)\n",
"bs = 4\n",
"# Maximum sequence shift (default: 5000)\n",
"shift = 5000\n",
"# Number of epochs (default: 1)\n",
"epochs = 15\n",
"\n",
"# Logger\n",
"logger = \"wandb\"  # Change to \"csv\" to save logs locally\n",
"\n",
"# Number of data-loading workers (default: 16)\n",
"workers = 16"
]
},
{
"cell_type": "markdown",
"id": "f74beb10-0045-4c3c-9bc3-b5037053241b",
"metadata": {},
"source": [
"## 7. Generate training commands"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "676bba10-feaf-4bcf-ae76-91cf163ac26a",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:30:41.447823Z",
"iopub.status.busy": "2025-11-21T22:30:41.447700Z",
"iopub.status.idle": "2025-11-21T22:30:41.450388Z",
"shell.execute_reply": "2025-11-21T22:30:41.450000Z"
}
},
"outputs": [],
"source": [
"cmds = []\n",
"\n",
"# Build one command per Borzoi replicate (0-3); each run is assigned the GPU with the same index\n",
"for model in range(4):\n",
"    name = f\"finetune_test_{model}\"\n",
"    device = model\n",
"\n",
"    cmd = (\n",
"        f\"decima finetune --name {name} \"\n",
"        f\"--model {model} --device {device} \"\n",
"        f\"--matrix-file {ad_file_path} --h5-file {h5_file_path} \"\n",
"        f\"--outdir {outdir} --learning-rate {lr} \"\n",
"        f\"--loss-total-weight {total_weight} --gradient-accumulation {grad} \"\n",
"        f\"--batch-size {bs} --max-seq-shift {shift} \"\n",
"        f\"--epochs {epochs} --logger {logger} --num-workers {workers}\"\n",
"    )\n",
"    cmds.append(cmd)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "6aee97bd-7b06-4af5-834e-c0f08127e75c",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:30:41.451403Z",
"iopub.status.busy": "2025-11-21T22:30:41.451285Z",
"iopub.status.idle": "2025-11-21T22:30:41.453392Z",
"shell.execute_reply": "2025-11-21T22:30:41.452977Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima finetune --name finetune_test_0 --model 0 --device 0 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16\n",
"decima finetune --name finetune_test_1 --model 1 --device 1 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16\n",
"decima finetune --name finetune_test_2 --model 2 --device 2 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16\n",
"decima finetune --name finetune_test_3 --model 3 --device 3 --matrix-file ./data/data.h5ad --h5-file ./example/data.h5 --outdir ./example --learning-rate 5e-05 --loss-total-weight 0.0001 --gradient-accumulation 5 --batch-size 4 --max-seq-shift 5000 --epochs 15 --logger wandb --num-workers 16\n"
]
}
],
"source": [
"for cmd in cmds:\n",
"    print(cmd)"
]
},
{
"cell_type": "markdown",
"id": "4133e741",
"metadata": {},
"source": [
"Here, we train the model for only 1 epoch so that the tutorial runs quickly. For a real fine-tuning run, train for more epochs (the commands generated above use 15)."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "d0fdaa9d",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:30:41.454479Z",
"iopub.status.busy": "2025-11-21T22:30:41.454355Z",
"iopub.status.idle": "2025-11-21T22:35:16.456721Z",
"shell.execute_reply": "2025-11-21T22:35:16.455981Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n",
" warnings.warn(\r\n",
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n",
" warnings.warn(\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Data paths: matrix_file=./data/data.h5ad, h5_file=./example/data.h5\r\n",
"decima - INFO - Reading anndata\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Making dataset objects\r\n",
"decima - INFO - train_params: {'batch_size': 1, 'num_workers': 16, 'devices': 0, 'logger': 'wandb', 'save_dir': './example', 'max_epochs': 1, 'lr': 5e-05, 'total_weight': 0.0001, 'accumulate_grad_batches': 5, 'loss': 'poisson_multinomial', 'clip': 0.0, 'save_top_k': 1, 'pin_memory': True}\r\n",
"decima - INFO - model_params: {'n_tasks': 50, 'init_borzoi': True, 'replicate': '0'}\r\n",
"decima - INFO - Initializing model\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Initializing weights from Borzoi model using wandb for replicate: 0\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34m\u001b[1mwandb\u001b[0m: Downloading large artifact 'human_state_dict_fold0:latest', 709.30MB. 1 files...\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34m\u001b[1mwandb\u001b[0m: 1 of 1 files downloaded. \r\n",
"Done. 00:00:01.7 (406.1MB/s)\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Connecting to wandb.\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mmhcelik\u001b[0m (\u001b[33mmhcw\u001b[0m) to \u001b[32mhttps://genentech.wandb.io\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[38;5;178m⢿\u001b[0m Waiting for wandb.init()...\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[Am\u001b[2K\r",
"\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[38;5;178m⣻\u001b[0m setting up run g20ya0al (0.2s)\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[Am\u001b[2K\r",
"\u001b[34m\u001b[1mwandb\u001b[0m: Tracking run with wandb version 0.22.2\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: Run data is saved locally in \u001b[35m\u001b[1mfinetune_test_0/wandb/run-20251121_143055-g20ya0al\u001b[0m\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: Run \u001b[1m`wandb offline`\u001b[0m to turn off syncing.\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: Syncing run \u001b[33mfinetune_test_0\u001b[0m\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: ⭐️ View project at \u001b[34m\u001b[4mhttps://genentech.wandb.io/grelu/decima\u001b[0m\r\n",
"\u001b[34m\u001b[1mwandb\u001b[0m: 🚀 View run at \u001b[34m\u001b[4mhttps://genentech.wandb.io/grelu/decima/runs/g20ya0al\u001b[0m\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Training\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/__init__.py:1617: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using 16bit Automatic Mixed Precision (AMP)\r\n",
"GPU available: True (cuda), used: True\r\n",
"TPU available: False, using: 0 TPU cores\r\n",
"HPU available: False, using: 0 HPUs\r\n",
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/utils/data/dataloader.py:627: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py:397: UserWarning: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.\r\n",
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"SLURM auto-requeueing enabled. Setting signal handlers.\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation: | | 0/? [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation: | | 0/? [00:00, ?it/s]\r",
"Validation DataLoader 0: 0%| | 0/71 [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.704072952270508, Poisson: -0.08451984077692032\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation DataLoader 0: 1%|▎ | 1/71 [00:04<05:11, 0.22it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.50640296936035, Poisson: -0.081619992852211\r\n",
"\r",
"Validation DataLoader 0: 3%|▌ | 2/71 [00:04<02:37, 0.44it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.562318801879883, Poisson: -0.11422758549451828\r\n",
"\r",
"Validation DataLoader 0: 4%|▊ | 3/71 [00:04<01:45, 0.65it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.25124168395996, Poisson: -0.08771996945142746\r\n",
"\r",
"Validation DataLoader 0: 6%|█ | 4/71 [00:04<01:19, 0.84it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.056081771850586, Poisson: -0.10517071187496185\r\n",
"\r",
"Validation DataLoader 0: 7%|█▎ | 5/71 [00:04<01:03, 1.04it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.16107940673828, Poisson: -0.08438651263713837\r\n",
"\r",
"Validation DataLoader 0: 8%|█▌ | 6/71 [00:04<00:53, 1.22it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.731626510620117, Poisson: -0.084895059466362\r\n",
"\r",
"Validation DataLoader 0: 10%|█▊ | 7/71 [00:04<00:45, 1.40it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.352930068969727, Poisson: -0.10843678563833237\r\n",
"\r",
"Validation DataLoader 0: 11%|██▏ | 8/71 [00:05<00:40, 1.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.15188217163086, Poisson: -0.1109374463558197\r\n",
"\r",
"Validation DataLoader 0: 13%|██▍ | 9/71 [00:05<00:35, 1.74it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.722728729248047, Poisson: -0.09945794194936752\r\n",
"\r",
"Validation DataLoader 0: 14%|██▌ | 10/71 [00:05<00:32, 1.90it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.389652252197266, Poisson: -0.0819193497300148\r\n",
"\r",
"Validation DataLoader 0: 15%|██▊ | 11/71 [00:05<00:29, 2.06it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.801376342773438, Poisson: -0.10459177196025848\r\n",
"\r",
"Validation DataLoader 0: 17%|███ | 12/71 [00:05<00:26, 2.21it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.208261489868164, Poisson: -0.09648436307907104\r\n",
"\r",
"Validation DataLoader 0: 18%|███▎ | 13/71 [00:05<00:24, 2.36it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.97159194946289, Poisson: -0.10451767593622208\r\n",
"\r",
"Validation DataLoader 0: 20%|███▌ | 14/71 [00:05<00:22, 2.50it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.06745719909668, Poisson: -0.11637863516807556\r\n",
"\r",
"Validation DataLoader 0: 21%|███▊ | 15/71 [00:05<00:21, 2.64it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.875829696655273, Poisson: -0.09984245151281357\r\n",
"\r",
"Validation DataLoader 0: 23%|████ | 16/71 [00:05<00:19, 2.77it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.660158157348633, Poisson: -0.07892918586730957\r\n",
"\r",
"Validation DataLoader 0: 24%|████▎ | 17/71 [00:05<00:18, 2.90it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.05860137939453, Poisson: -0.11131791770458221\r\n",
"\r",
"Validation DataLoader 0: 25%|████▌ | 18/71 [00:05<00:17, 3.02it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.243940353393555, Poisson: -0.09636309742927551\r\n",
"\r",
"Validation DataLoader 0: 27%|████▊ | 19/71 [00:06<00:16, 3.15it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.765872955322266, Poisson: -0.11982230842113495\r\n",
"\r",
"Validation DataLoader 0: 28%|█████ | 20/71 [00:06<00:15, 3.27it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.902503967285156, Poisson: -0.09954728931188583\r\n",
"\r",
"Validation DataLoader 0: 30%|█████▎ | 21/71 [00:06<00:14, 3.38it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.31157875061035, Poisson: -0.11637787520885468\r\n",
"\r",
"Validation DataLoader 0: 31%|█████▌ | 22/71 [00:06<00:14, 3.49it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.60178565979004, Poisson: -0.09363999217748642\r\n",
"\r",
"Validation DataLoader 0: 32%|█████▊ | 23/71 [00:06<00:13, 3.60it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.007802963256836, Poisson: -0.09034327417612076\r\n",
"\r",
"Validation DataLoader 0: 34%|██████ | 24/71 [00:06<00:12, 3.71it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.0886173248291, Poisson: -0.09662938863039017\r\n",
"\r",
"Validation DataLoader 0: 35%|██████▎ | 25/71 [00:06<00:12, 3.81it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.589750289916992, Poisson: -0.09355800598859787\r\n",
"\r",
"Validation DataLoader 0: 37%|██████▌ | 26/71 [00:06<00:11, 3.91it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.216665267944336, Poisson: -0.10239789634943008\r\n",
"\r",
"Validation DataLoader 0: 38%|██████▊ | 27/71 [00:06<00:10, 4.01it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.116254806518555, Poisson: -0.08170726150274277\r\n",
"\r",
"Validation DataLoader 0: 39%|███████ | 28/71 [00:06<00:10, 4.10it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.821630477905273, Poisson: -0.09940409660339355\r\n",
"\r",
"Validation DataLoader 0: 41%|███████▎ | 29/71 [00:06<00:10, 4.20it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.517545700073242, Poisson: -0.113502636551857\r\n",
"\r",
"Validation DataLoader 0: 42%|███████▌ | 30/71 [00:06<00:09, 4.29it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.03712272644043, Poisson: -0.10526878386735916\r\n",
"\r",
"Validation DataLoader 0: 44%|███████▊ | 31/71 [00:07<00:09, 4.38it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.719432830810547, Poisson: -0.09343377500772476\r\n",
"\r",
"Validation DataLoader 0: 45%|████████ | 32/71 [00:07<00:08, 4.46it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.70380401611328, Poisson: -0.10528752207756042\r\n",
"\r",
"Validation DataLoader 0: 46%|████████▎ | 33/71 [00:07<00:08, 4.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.4345760345459, Poisson: -0.0937180295586586\r\n",
"\r",
"Validation DataLoader 0: 48%|████████▌ | 34/71 [00:07<00:07, 4.63it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.56287384033203, Poisson: -0.0933140367269516\r\n",
"\r",
"Validation DataLoader 0: 49%|████████▊ | 35/71 [00:07<00:07, 4.71it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.247249603271484, Poisson: -0.11643800884485245\r\n",
"\r",
"Validation DataLoader 0: 51%|█████████▏ | 36/71 [00:07<00:07, 4.79it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.163373947143555, Poisson: -0.07618056982755661\r\n",
"\r",
"Validation DataLoader 0: 52%|█████████▍ | 37/71 [00:07<00:06, 4.87it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.00337028503418, Poisson: -0.11091884225606918\r\n",
"\r",
"Validation DataLoader 0: 54%|█████████▋ | 38/71 [00:07<00:06, 4.94it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.206920623779297, Poisson: -0.09684920310974121\r\n",
"\r",
"Validation DataLoader 0: 55%|█████████▉ | 39/71 [00:07<00:06, 5.01it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.414379119873047, Poisson: -0.11682412773370743\r\n",
"\r",
"Validation DataLoader 0: 56%|██████████▏ | 40/71 [00:07<00:06, 5.09it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.447242736816406, Poisson: -0.08776233345270157\r\n",
"\r",
"Validation DataLoader 0: 58%|██████████▍ | 41/71 [00:07<00:05, 5.16it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 15.436498641967773, Poisson: -0.07314550876617432\r\n",
"\r",
"Validation DataLoader 0: 59%|██████████▋ | 42/71 [00:08<00:05, 5.23it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.46692657470703, Poisson: -0.10728458315134048\r\n",
"\r",
"Validation DataLoader 0: 61%|██████████▉ | 43/71 [00:08<00:05, 5.29it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.299116134643555, Poisson: -0.11672092229127884\r\n",
"\r",
"Validation DataLoader 0: 62%|███████████▏ | 44/71 [00:08<00:05, 5.36it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.967220306396484, Poisson: -0.11124279350042343\r\n",
"\r",
"Validation DataLoader 0: 63%|███████████▍ | 45/71 [00:08<00:04, 5.43it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.898027420043945, Poisson: -0.10534106194972992\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation DataLoader 0: 65%|███████████▋ | 46/71 [00:08<00:04, 5.49it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.183794021606445, Poisson: -0.10242842882871628\r\n",
"\r",
"Validation DataLoader 0: 66%|███████████▉ | 47/71 [00:08<00:04, 5.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.188167572021484, Poisson: -0.11698035895824432\r\n",
"\r",
"Validation DataLoader 0: 68%|████████████▏ | 48/71 [00:08<00:04, 5.61it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.33690071105957, Poisson: -0.09073223918676376\r\n",
"\r",
"Validation DataLoader 0: 69%|████████████▍ | 49/71 [00:08<00:03, 5.67it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.042579650878906, Poisson: -0.09028337150812149\r\n",
"\r",
"Validation DataLoader 0: 70%|████████████▋ | 50/71 [00:08<00:03, 5.73it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.37116241455078, Poisson: -0.09634491056203842\r\n",
"\r",
"Validation DataLoader 0: 72%|████████████▉ | 51/71 [00:08<00:03, 5.79it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.943159103393555, Poisson: -0.10504303872585297\r\n",
"\r",
"Validation DataLoader 0: 73%|█████████████▏ | 52/71 [00:08<00:03, 5.85it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.517465591430664, Poisson: -0.11390405148267746\r\n",
"\r",
"Validation DataLoader 0: 75%|█████████████▍ | 53/71 [00:08<00:03, 5.90it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.979843139648438, Poisson: -0.10537681728601456\r\n",
"\r",
"Validation DataLoader 0: 76%|█████████████▋ | 54/71 [00:09<00:02, 5.96it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.226818084716797, Poisson: -0.10194623470306396\r\n",
"\r",
"Validation DataLoader 0: 77%|█████████████▉ | 55/71 [00:09<00:02, 6.01it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.263525009155273, Poisson: -0.08776170760393143\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation DataLoader 0: 79%|██████████████▏ | 56/71 [00:09<00:02, 6.06it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.565263748168945, Poisson: -0.09930194169282913\r\n",
"\r",
"Validation DataLoader 0: 80%|██████████████▍ | 57/71 [00:09<00:02, 6.11it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.983007431030273, Poisson: -0.09955137223005295\r\n",
"\r",
"Validation DataLoader 0: 82%|██████████████▋ | 58/71 [00:09<00:02, 6.16it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.88779640197754, Poisson: -0.1202164888381958\r\n",
"\r",
"Validation DataLoader 0: 83%|██████████████▉ | 59/71 [00:09<00:01, 6.21it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.5961856842041, Poisson: -0.11393663287162781\r\n",
"\r",
"Validation DataLoader 0: 85%|███████████████▏ | 60/71 [00:09<00:01, 6.26it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.301002502441406, Poisson: -0.10222788155078888\r\n",
"\r",
"Validation DataLoader 0: 86%|███████████████▍ | 61/71 [00:09<00:01, 6.31it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.259353637695312, Poisson: -0.07613859325647354\r\n",
"\r",
"Validation DataLoader 0: 87%|███████████████▋ | 62/71 [00:09<00:01, 6.36it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.09466552734375, Poisson: -0.09604513645172119\r\n",
"\r",
"Validation DataLoader 0: 89%|███████████████▉ | 63/71 [00:09<00:01, 6.40it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.736059188842773, Poisson: -0.09930893778800964\r\n",
"\r",
"Validation DataLoader 0: 90%|████████████████▏ | 64/71 [00:09<00:01, 6.45it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.481731414794922, Poisson: -0.10254371166229248\r\n",
"\r",
"Validation DataLoader 0: 92%|████████████████▍ | 65/71 [00:10<00:00, 6.49it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.471792221069336, Poisson: -0.10787025094032288\r\n",
"\r",
"Validation DataLoader 0: 93%|████████████████▋ | 66/71 [00:10<00:00, 6.54it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.083730697631836, Poisson: -0.09609249979257584\r\n",
"\r",
"Validation DataLoader 0: 94%|████████████████▉ | 67/71 [00:10<00:00, 6.58it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.917104721069336, Poisson: -0.08452223241329193\r\n",
"\r",
"Validation DataLoader 0: 96%|█████████████████▏| 68/71 [00:10<00:00, 6.62it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.31960678100586, Poisson: -0.10256922245025635\r\n",
"\r",
"Validation DataLoader 0: 97%|█████████████████▍| 69/71 [00:10<00:00, 6.66it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.782426834106445, Poisson: -0.09927929937839508\r\n",
"\r",
"Validation DataLoader 0: 99%|█████████████████▋| 70/71 [00:10<00:00, 6.70it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.292436599731445, Poisson: -0.11702897399663925\r\n",
"\r",
"Validation DataLoader 0: 100%|██████████████████| 71/71 [00:10<00:00, 6.75it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Validation DataLoader 0: 100%|██████████████████| 71/71 [00:11<00:00, 6.41it/s]\r\n",
"┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\r\n",
"┃\u001b[1m \u001b[0m\u001b[1m Validate metric \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m DataLoader 0 \u001b[0m\u001b[1m \u001b[0m┃\r\n",
"┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\r\n",
"│\u001b[36m \u001b[0m\u001b[36m val_gene_pearson \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.0249176025390625 \u001b[0m\u001b[35m \u001b[0m│\r\n",
"│\u001b[36m \u001b[0m\u001b[36m val_loss \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 20.776832580566406 \u001b[0m\u001b[35m \u001b[0m│\r\n",
"│\u001b[36m \u001b[0m\u001b[36m val_mse \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 28.61081886291504 \u001b[0m\u001b[35m \u001b[0m│\r\n",
"│\u001b[36m \u001b[0m\u001b[36m val_task_pearson \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m 0.019344473257660866 \u001b[0m\u001b[35m \u001b[0m│\r\n",
"└───────────────────────────┴───────────────────────────┘\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pytorch_lightning/utilities/model_summary/model_summary.py:231: UserWarning: Precision 16-mixed is not supported by the model summary. Estimated model size in MB will not be accurate. Using 32 bits instead.\r\n",
"\r\n",
"  | Name            | Type                           | Params | Mode \r\n",
"---------------------------------------------------------------------------\r\n",
"0 | model           | DecimaModel                    | 171 M  | train\r\n",
"1 | loss            | TaskWisePoissonMultinomialLoss | 0      | train\r\n",
"2 | val_metrics     | MetricCollection               | 0      | train\r\n",
"3 | test_metrics    | MetricCollection               | 0      | train\r\n",
"4 | warning_counter | WarningCounter                 | 0      | train\r\n",
"5 | transform       | Identity                       | 0      | train\r\n",
"---------------------------------------------------------------------------\r\n",
"171 M Trainable params\r\n",
"0 Non-trainable params\r\n",
"171 M Total params\r\n",
"685.503 Total estimated model params size (MB)\r\n",
"401 Modules in train mode\r\n",
"0 Modules in eval mode\r\n",
"SLURM auto-requeueing enabled. Setting signal handlers.\r\n",
"\r",
"Sanity Checking: | | 0/? [00:00, ?it/s]/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/utils/data/dataloader.py:627: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Sanity Checking: | | 0/? [00:00, ?it/s]\r",
"Sanity Checking DataLoader 0: 0%| | 0/2 [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.704072952270508, Poisson: -0.08451984077692032\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Sanity Checking DataLoader 0: 50%|███████▌ | 1/2 [00:00<00:00, 3.99it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.50640296936035, Poisson: -0.081619992852211\r\n",
"\r",
"Sanity Checking DataLoader 0: 100%|███████████████| 2/2 [00:00<00:00, 5.90it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
" \r"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Training: | | 0/? [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Training: | | 0/? [00:00, ?it/s]\r",
"Epoch 0: 0%| | 0/766 [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.381006240844727, Poisson: -0.09250339865684509\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 1/766 [00:02<34:37, 0.37it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 1/766 [00:02<34:44, 0.37it/s, v_num=a0al, train_loss_step=19.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.657838821411133, Poisson: -0.09813400357961655\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 2/766 [00:02<18:03, 0.71it/s, v_num=a0al, train_loss_step=19.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 2/766 [00:02<18:52, 0.67it/s, v_num=a0al, train_loss_step=20.6"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.885753631591797, Poisson: -0.08482938259840012\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 3/766 [00:03<13:01, 0.98it/s, v_num=a0al, train_loss_step=20.6"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 0%| | 3/766 [00:03<13:35, 0.94it/s, v_num=a0al, train_loss_step=17.8"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.289833068847656, Poisson: -0.10162309557199478\r\n",
"\r",
"Epoch 0: 1%| | 4/766 [00:03<10:30, 1.21it/s, v_num=a0al, train_loss_step=17.8"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 4/766 [00:03<10:56, 1.16it/s, v_num=a0al, train_loss_step=21.2"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.540626525878906, Poisson: -0.0922895073890686\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 5/766 [00:03<09:53, 1.28it/s, v_num=a0al, train_loss_step=21.2\r",
"Epoch 0: 1%| | 5/766 [00:03<09:53, 1.28it/s, v_num=a0al, train_loss_step=19.4"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.395416259765625, Poisson: -0.09235671162605286\r\n",
"\r",
"Epoch 0: 1%| | 6/766 [00:04<08:27, 1.50it/s, v_num=a0al, train_loss_step=19.4"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 6/766 [00:04<08:44, 1.45it/s, v_num=a0al, train_loss_step=19.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.51851463317871, Poisson: -0.11232350766658783\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 7/766 [00:04<07:40, 1.65it/s, v_num=a0al, train_loss_step=19.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 7/766 [00:04<07:54, 1.60it/s, v_num=a0al, train_loss_step=23.4"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.36002540588379, Poisson: -0.10656161606311798\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 8/766 [00:04<07:04, 1.78it/s, v_num=a0al, train_loss_step=23.4"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 8/766 [00:04<07:17, 1.73it/s, v_num=a0al, train_loss_step=22.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.08615493774414, Poisson: -0.11497366428375244\r\n",
"\r",
"Epoch 0: 1%| | 9/766 [00:04<06:37, 1.91it/s, v_num=a0al, train_loss_step=22.3"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 9/766 [00:04<06:48, 1.85it/s, v_num=a0al, train_loss_step=24.0"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.516836166381836, Poisson: -0.08695206046104431\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 10/766 [00:05<06:25, 1.96it/s, v_num=a0al, train_loss_step=24.\r",
"Epoch 0: 1%| | 10/766 [00:05<06:26, 1.96it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.552202224731445, Poisson: -0.10729885846376419\r\n",
"\r",
"Epoch 0: 1%| | 11/766 [00:05<05:57, 2.11it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 1%| | 11/766 [00:05<06:06, 2.06it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.964632034301758, Poisson: -0.0897383913397789\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 12/766 [00:05<05:42, 2.20it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 12/766 [00:05<05:50, 2.15it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.290241241455078, Poisson: -0.08695797622203827\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 13/766 [00:05<05:29, 2.28it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 13/766 [00:05<05:37, 2.23it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.30130958557129, Poisson: -0.10726857930421829\r\n",
"\r",
"Epoch 0: 2%| | 14/766 [00:05<05:18, 2.36it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 14/766 [00:06<05:25, 2.31it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.5349063873291, Poisson: -0.09275592118501663\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 15/766 [00:06<05:15, 2.38it/s, v_num=a0al, train_loss_step=22.\r",
"Epoch 0: 2%| | 15/766 [00:06<05:16, 2.38it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.408832550048828, Poisson: -0.09224316477775574\r\n",
"\r",
"Epoch 0: 2%| | 16/766 [00:06<05:00, 2.49it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 16/766 [00:06<05:07, 2.44it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.880659103393555, Poisson: -0.08932992070913315\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 17/766 [00:06<04:53, 2.55it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 17/766 [00:06<04:59, 2.50it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.011474609375, Poisson: -0.08990071713924408\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 18/766 [00:06<04:46, 2.61it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 18/766 [00:07<04:52, 2.56it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.70027732849121, Poisson: -0.09867019951343536\r\n",
"\r",
"Epoch 0: 2%| | 19/766 [00:07<04:40, 2.66it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 2%| | 19/766 [00:07<04:45, 2.61it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.821165084838867, Poisson: -0.08399660140275955\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 20/766 [00:07<04:40, 2.66it/s, v_num=a0al, train_loss_step=20.\r",
"Epoch 0: 3%| | 20/766 [00:07<04:40, 2.66it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.62529945373535, Poisson: -0.07828851789236069\r\n",
"\r",
"Epoch 0: 3%| | 21/766 [00:07<04:30, 2.75it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 21/766 [00:07<04:35, 2.71it/s, v_num=a0al, train_loss_step=16."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.265472412109375, Poisson: -0.10090325772762299\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 22/766 [00:07<04:25, 2.80it/s, v_num=a0al, train_loss_step=16."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 22/766 [00:07<04:30, 2.75it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.13052749633789, Poisson: -0.09563688188791275\r\n",
"\r",
"Epoch 0: 3%| | 23/766 [00:08<04:21, 2.84it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 23/766 [00:08<04:26, 2.79it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.649946212768555, Poisson: -0.09828009456396103\r\n",
"\r",
"Epoch 0: 3%| | 24/766 [00:08<04:17, 2.88it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 24/766 [00:08<04:22, 2.83it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.186647415161133, Poisson: -0.12111172825098038\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 25/766 [00:08<04:18, 2.87it/s, v_num=a0al, train_loss_step=20.\r",
"Epoch 0: 3%| | 25/766 [00:08<04:18, 2.87it/s, v_num=a0al, train_loss_step=25."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.360719680786133, Poisson: -0.08687090128660202\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 26/766 [00:08<04:11, 2.94it/s, v_num=a0al, train_loss_step=25."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 3%| | 26/766 [00:08<04:15, 2.90it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.07268524169922, Poisson: -0.0955045148730278\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 27/766 [00:09<04:08, 2.98it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 27/766 [00:09<04:11, 2.93it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.431581497192383, Poisson: -0.11249936372041702\r\n",
"\r",
"Epoch 0: 4%| | 28/766 [00:09<04:05, 3.01it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 28/766 [00:09<04:08, 2.97it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.752777099609375, Poisson: -0.10413458943367004\r\n",
"\r",
"Epoch 0: 4%| | 29/766 [00:09<04:02, 3.04it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 29/766 [00:09<04:06, 3.00it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.950761795043945, Poisson: -0.08966774493455887\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 30/766 [00:09<04:03, 3.02it/s, v_num=a0al, train_loss_step=21.\r",
"Epoch 0: 4%| | 30/766 [00:09<04:03, 3.02it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.61734962463379, Poisson: -0.11915773898363113\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 31/766 [00:10<03:57, 3.09it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 31/766 [00:10<04:01, 3.05it/s, v_num=a0al, train_loss_step=24."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.473047256469727, Poisson: -0.09275110810995102\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 32/766 [00:10<03:55, 3.11it/s, v_num=a0al, train_loss_step=24."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 32/766 [00:10<03:58, 3.07it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.206684112548828, Poisson: -0.10135509073734283\r\n",
"\r",
"Epoch 0: 4%| | 33/766 [00:10<03:53, 3.14it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 33/766 [00:10<03:56, 3.10it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.45479965209961, Poisson: -0.09279609471559525\r\n",
"\r",
"Epoch 0: 4%| | 34/766 [00:10<03:51, 3.16it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 4%| | 34/766 [00:10<03:54, 3.12it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.72089385986328, Poisson: -0.10419032722711563\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 35/766 [00:11<03:52, 3.14it/s, v_num=a0al, train_loss_step=19.\r",
"Epoch 0: 5%| | 35/766 [00:11<03:52, 3.14it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.369564056396484, Poisson: -0.10727277398109436\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 36/766 [00:11<03:47, 3.20it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 36/766 [00:11<03:50, 3.17it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.176250457763672, Poisson: -0.1012745052576065\r\n",
"\r",
"Epoch 0: 5%| | 37/766 [00:11<03:46, 3.22it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 37/766 [00:11<03:48, 3.19it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.37683868408203, Poisson: -0.08704456686973572\r\n",
"\r",
"Epoch 0: 5%| | 38/766 [00:11<03:44, 3.24it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 38/766 [00:11<03:47, 3.21it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.340030670166016, Poisson: -0.10708311200141907\r\n",
"\r",
"Epoch 0: 5%| | 39/766 [00:11<03:42, 3.26it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 39/766 [00:12<03:45, 3.22it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.2115478515625, Poisson: -0.1010795533657074\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 40/766 [00:12<03:43, 3.24it/s, v_num=a0al, train_loss_step=22.\r",
"Epoch 0: 5%| | 40/766 [00:12<03:44, 3.24it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.17957878112793, Poisson: -0.10130106657743454\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 41/766 [00:12<03:40, 3.29it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 41/766 [00:12<03:42, 3.26it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.70407485961914, Poisson: -0.08396982401609421\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 42/766 [00:12<03:38, 3.31it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 5%| | 42/766 [00:12<03:41, 3.28it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.499862670898438, Poisson: -0.09266598522663116\r\n",
"\r",
"Epoch 0: 6%| | 43/766 [00:12<03:37, 3.33it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 43/766 [00:13<03:39, 3.29it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.606935501098633, Poisson: -0.09828896075487137\r\n",
"\r",
"Epoch 0: 6%| | 44/766 [00:13<03:36, 3.34it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 44/766 [00:13<03:38, 3.31it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.871383666992188, Poisson: -0.11045264452695847\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 45/766 [00:13<03:37, 3.32it/s, v_num=a0al, train_loss_step=20.\r",
"Epoch 0: 6%| | 45/766 [00:13<03:37, 3.32it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.033437728881836, Poisson: -0.11557681858539581\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 46/766 [00:13<03:33, 3.37it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 46/766 [00:13<03:35, 3.34it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.879009246826172, Poisson: -0.09021121263504028\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 47/766 [00:13<03:32, 3.38it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 47/766 [00:14<03:34, 3.35it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.95680809020996, Poisson: -0.08978604525327682\r\n",
"\r",
"Epoch 0: 6%| | 48/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 48/766 [00:14<03:33, 3.36it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.858497619628906, Poisson: -0.08382056653499603\r\n",
"\r",
"Epoch 0: 6%| | 49/766 [00:14<03:30, 3.41it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 6%| | 49/766 [00:14<03:32, 3.38it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18074607849121, Poisson: -0.10173416137695312\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 50/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=17.\r",
"Epoch 0: 7%| | 50/766 [00:14<03:31, 3.39it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.48756980895996, Poisson: -0.09269016981124878\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 51/766 [00:14<03:28, 3.43it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 51/766 [00:15<03:30, 3.40it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.046504974365234, Poisson: -0.09527648985385895\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 52/766 [00:15<03:27, 3.44it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 52/766 [00:15<03:29, 3.41it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.808837890625, Poisson: -0.10404733568429947\r\n",
"\r",
"Epoch 0: 7%| | 53/766 [00:15<03:26, 3.45it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 53/766 [00:15<03:28, 3.42it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.97328758239746, Poisson: -0.08934041857719421\r\n",
"\r",
"Epoch 0: 7%| | 54/766 [00:15<03:25, 3.46it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 54/766 [00:15<03:27, 3.43it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.240169525146484, Poisson: -0.10145271569490433\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 55/766 [00:15<03:26, 3.45it/s, v_num=a0al, train_loss_step=18.\r",
"Epoch 0: 7%| | 55/766 [00:15<03:26, 3.44it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.870256423950195, Poisson: -0.1042548418045044\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 56/766 [00:16<03:23, 3.48it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 56/766 [00:16<03:25, 3.45it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.184789657592773, Poisson: -0.10091016441583633\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 57/766 [00:16<03:22, 3.49it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 7%| | 57/766 [00:16<03:24, 3.46it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.16849136352539, Poisson: -0.10139136761426926\r\n",
"\r",
"Epoch 0: 8%| | 58/766 [00:16<03:22, 3.50it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 58/766 [00:16<03:23, 3.47it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.697975158691406, Poisson: -0.11847725510597229\r\n",
"\r",
"Epoch 0: 8%| | 59/766 [00:16<03:21, 3.51it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 59/766 [00:16<03:22, 3.48it/s, v_num=a0al, train_loss_step=24."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.23836898803711, Poisson: -0.10157377272844315\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 60/766 [00:17<03:22, 3.49it/s, v_num=a0al, train_loss_step=24.\r",
"Epoch 0: 8%| | 60/766 [00:17<03:22, 3.49it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.151546478271484, Poisson: -0.10148127377033234\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 61/766 [00:17<03:19, 3.53it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 61/766 [00:17<03:21, 3.50it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.36435890197754, Poisson: -0.10736225545406342\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 62/766 [00:17<03:19, 3.54it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 62/766 [00:17<03:20, 3.51it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.459867477416992, Poisson: -0.08722960948944092\r\n",
"\r",
"Epoch 0: 8%| | 63/766 [00:17<03:18, 3.55it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 63/766 [00:17<03:19, 3.52it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.105688095092773, Poisson: -0.09594902396202087\r\n",
"\r",
"Epoch 0: 8%| | 64/766 [00:18<03:17, 3.55it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 64/766 [00:18<03:18, 3.53it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.239574432373047, Poisson: -0.10175144672393799\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 8%| | 65/766 [00:18<03:18, 3.54it/s, v_num=a0al, train_loss_step=20.\r",
"Epoch 0: 8%| | 65/766 [00:18<03:18, 3.53it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.30000877380371, Poisson: -0.08693327754735947\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 66/766 [00:18<03:16, 3.57it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 66/766 [00:18<03:17, 3.54it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.48712921142578, Poisson: -0.09302585572004318\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 67/766 [00:18<03:15, 3.58it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 67/766 [00:18<03:16, 3.55it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.451393127441406, Poisson: -0.11258357018232346\r\n",
"\r",
"Epoch 0: 9%| | 68/766 [00:18<03:14, 3.58it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 68/766 [00:19<03:16, 3.56it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.058046340942383, Poisson: -0.09530574083328247\r\n",
"\r",
"Epoch 0: 9%| | 69/766 [00:19<03:14, 3.59it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 69/766 [00:19<03:15, 3.57it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.952863693237305, Poisson: -0.08996559679508209\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 70/766 [00:19<03:14, 3.57it/s, v_num=a0al, train_loss_step=20.\r",
"Epoch 0: 9%| | 70/766 [00:19<03:14, 3.57it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.13503646850586, Poisson: -0.08106084913015366\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 71/766 [00:19<03:12, 3.60it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 71/766 [00:19<03:14, 3.58it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.00311279296875, Poisson: -0.09557122737169266\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 72/766 [00:19<03:12, 3.61it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 9%| | 72/766 [00:20<03:13, 3.59it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.45271110534668, Poisson: -0.09286917746067047\r\n",
"\r",
"Epoch 0: 10%| | 73/766 [00:20<03:11, 3.62it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 73/766 [00:20<03:12, 3.59it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.410478591918945, Poisson: -0.09247767925262451\r\n",
"\r",
"Epoch 0: 10%| | 74/766 [00:20<03:10, 3.62it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 74/766 [00:20<03:12, 3.60it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.642332077026367, Poisson: -0.0786345899105072\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 75/766 [00:20<03:11, 3.61it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 75/766 [00:20<03:11, 3.61it/s, v_num=a0al, train_loss_step=16."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.897323608398438, Poisson: -0.11011520773172379\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 76/766 [00:20<03:09, 3.64it/s, v_num=a0al, train_loss_step=16."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 76/766 [00:21<03:11, 3.61it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.767396926879883, Poisson: -0.08402471244335175\r\n",
"\r",
"Epoch 0: 10%| | 77/766 [00:21<03:09, 3.64it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 77/766 [00:21<03:10, 3.62it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.062463760375977, Poisson: -0.0955692008137703\r\n",
"\r",
"Epoch 0: 10%| | 78/766 [00:21<03:08, 3.65it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 78/766 [00:21<03:09, 3.62it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.32487678527832, Poisson: -0.0871487483382225\r\n",
"\r",
"Epoch 0: 10%| | 79/766 [00:21<03:08, 3.65it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 79/766 [00:21<03:09, 3.63it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.206655502319336, Poisson: -0.10164565593004227\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 80/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 10%| | 80/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.280014038085938, Poisson: -0.10702759772539139\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 81/766 [00:22<03:07, 3.66it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 81/766 [00:22<03:08, 3.64it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.242645263671875, Poisson: -0.10192742943763733\r\n",
"\r",
"Epoch 0: 11%| | 82/766 [00:22<03:06, 3.67it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 82/766 [00:22<03:07, 3.65it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.142255783081055, Poisson: -0.10121983289718628\r\n",
"\r",
"Epoch 0: 11%| | 83/766 [00:22<03:05, 3.67it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 83/766 [00:22<03:07, 3.65it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.358478546142578, Poisson: -0.1070261299610138\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 84/766 [00:22<03:05, 3.68it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 84/766 [00:22<03:06, 3.66it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18360137939453, Poisson: -0.10107354819774628\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 85/766 [00:23<03:05, 3.66it/s, v_num=a0al, train_loss_step=22.\r",
"Epoch 0: 11%| | 85/766 [00:23<03:05, 3.66it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.60392951965332, Poisson: -0.09856819361448288\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 86/766 [00:23<03:04, 3.69it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 86/766 [00:23<03:05, 3.67it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.474123001098633, Poisson: -0.09277226030826569\r\n",
"\r",
"Epoch 0: 11%| | 87/766 [00:23<03:03, 3.69it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 87/766 [00:23<03:04, 3.67it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.81633949279785, Poisson: -0.10398300737142563\r\n",
"\r",
"Epoch 0: 11%| | 88/766 [00:23<03:03, 3.70it/s, v_num=a0al, train_loss_step=19."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 11%| | 88/766 [00:23<03:04, 3.68it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.952714920043945, Poisson: -0.1099216490983963\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 89/766 [00:24<03:02, 3.70it/s, v_num=a0al, train_loss_step=21."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 89/766 [00:24<03:03, 3.68it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.675338745117188, Poisson: -0.09857542812824249\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 90/766 [00:24<03:03, 3.69it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 90/766 [00:24<03:03, 3.68it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.54332733154297, Poisson: -0.09850569814443588\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 91/766 [00:24<03:01, 3.71it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 91/766 [00:24<03:02, 3.69it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.736204147338867, Poisson: -0.083879254758358\r\n",
"\r",
"Epoch 0: 12%| | 92/766 [00:24<03:01, 3.71it/s, v_num=a0al, train_loss_step=20."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 92/766 [00:24<03:02, 3.69it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.93655014038086, Poisson: -0.09000393003225327\r\n",
"\r",
"Epoch 0: 12%| | 93/766 [00:25<03:00, 3.72it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 93/766 [00:25<03:01, 3.70it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.51058006286621, Poisson: -0.11284295469522476\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 94/766 [00:25<03:00, 3.72it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 94/766 [00:25<03:01, 3.70it/s, v_num=a0al, train_loss_step=23."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.92452621459961, Poisson: -0.10993973165750504\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 12%| | 95/766 [00:25<03:01, 3.71it/s, v_num=a0al, train_loss_step=23.\r",
"Epoch 0: 12%| | 95/766 [00:25<03:01, 3.71it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.880817413330078, Poisson: -0.08968962728977203\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 96/766 [00:25<02:59, 3.73it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 96/766 [00:25<03:00, 3.71it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.968830108642578, Poisson: -0.08979591727256775\r\n",
"\r",
"Epoch 0: 13%|▏| 97/766 [00:25<02:59, 3.73it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 97/766 [00:26<03:00, 3.71it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.777538299560547, Poisson: -0.08393401652574539\r\n",
"\r",
"Epoch 0: 13%|▏| 98/766 [00:26<02:58, 3.74it/s, v_num=a0al, train_loss_step=18."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 98/766 [00:26<02:59, 3.72it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.880767822265625, Poisson: -0.10982605814933777\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 99/766 [00:26<02:58, 3.74it/s, v_num=a0al, train_loss_step=17."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 99/766 [00:26<02:59, 3.72it/s, v_num=a0al, train_loss_step=22."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.429824829101562, Poisson: -0.09266551584005356\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 100/766 [00:26<02:58, 3.73it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 13%|▏| 100/766 [00:26<02:58, 3.73it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.114593505859375, Poisson: -0.10141497850418091\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 101/766 [00:26<02:57, 3.75it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 101/766 [00:27<02:58, 3.73it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.00738525390625, Poisson: -0.11572451889514923\r\n",
"\r",
"Epoch 0: 13%|▏| 102/766 [00:27<02:56, 3.75it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 102/766 [00:27<02:57, 3.73it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.775775909423828, Poisson: -0.0842253789305687\r\n",
"\r",
"Epoch 0: 13%|▏| 103/766 [00:27<02:56, 3.76it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 13%|▏| 103/766 [00:27<02:57, 3.74it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.294315338134766, Poisson: -0.10698876529932022\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 104/766 [00:27<02:56, 3.76it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 104/766 [00:27<02:56, 3.74it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.36329460144043, Poisson: -0.10711447149515152\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 105/766 [00:28<02:56, 3.74it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 14%|▏| 105/766 [00:28<02:56, 3.74it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.062660217285156, Poisson: -0.09556890279054642\r\n",
"\r",
"Epoch 0: 14%|▏| 106/766 [00:28<02:55, 3.77it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 106/766 [00:28<02:56, 3.75it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.25997543334961, Poisson: -0.10673705488443375\r\n",
"\r",
"Epoch 0: 14%|▏| 107/766 [00:28<02:54, 3.77it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 107/766 [00:28<02:55, 3.75it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.74195098876953, Poisson: -0.10431475937366486\r\n",
"\r",
"Epoch 0: 14%|▏| 108/766 [00:28<02:54, 3.77it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 108/766 [00:28<02:55, 3.75it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.445833206176758, Poisson: -0.09263034164905548\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 109/766 [00:28<02:54, 3.78it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 109/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.96492576599121, Poisson: -0.10998839139938354\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 110/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 14%|▏| 110/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.060997009277344, Poisson: -0.09550387412309647\r\n",
"\r",
"Epoch 0: 14%|▏| 111/766 [00:29<02:53, 3.78it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 14%|▏| 111/766 [00:29<02:54, 3.76it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.398094177246094, Poisson: -0.09251043945550919\r\n",
"\r",
"Epoch 0: 15%|▏| 112/766 [00:29<02:52, 3.78it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 112/766 [00:29<02:53, 3.77it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.765329360961914, Poisson: -0.08439560234546661\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 113/766 [00:29<02:52, 3.79it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 113/766 [00:29<02:53, 3.77it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.94915008544922, Poisson: -0.11012542247772217\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 114/766 [00:30<02:52, 3.79it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 114/766 [00:30<02:52, 3.77it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.100540161132812, Poisson: -0.07545100152492523\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 115/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 15%|▏| 115/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.88507843017578, Poisson: -0.11016397178173065\r\n",
"\r",
"Epoch 0: 15%|▏| 116/766 [00:30<02:51, 3.80it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 116/766 [00:30<02:52, 3.78it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.70903968811035, Poisson: -0.10451330244541168\r\n",
"\r",
"Epoch 0: 15%|▏| 117/766 [00:30<02:50, 3.80it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 117/766 [00:30<02:51, 3.78it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.082307815551758, Poisson: -0.0955430418252945\r\n",
"\r",
"Epoch 0: 15%|▏| 118/766 [00:31<02:50, 3.80it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 15%|▏| 118/766 [00:31<02:51, 3.78it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.241302490234375, Poisson: -0.12178383767604828\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 119/766 [00:31<02:50, 3.80it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 119/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.645946502685547, Poisson: -0.09858675301074982\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 120/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=25\r",
"Epoch 0: 16%|▏| 120/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.908796310424805, Poisson: -0.08985879272222519\r\n",
"\r",
"Epoch 0: 16%|▏| 121/766 [00:31<02:49, 3.81it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 121/766 [00:31<02:50, 3.79it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.289188385009766, Poisson: -0.10742945224046707\r\n",
"\r",
"Epoch 0: 16%|▏| 122/766 [00:32<02:49, 3.81it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 122/766 [00:32<02:49, 3.79it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.056671142578125, Poisson: -0.09564211219549179\r\n",
"\r",
"Epoch 0: 16%|▏| 123/766 [00:32<02:48, 3.81it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 123/766 [00:32<02:49, 3.80it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.75560760498047, Poisson: -0.10460510104894638\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 124/766 [00:32<02:48, 3.82it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 124/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.484085083007812, Poisson: -0.09247615933418274\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 125/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 16%|▏| 125/766 [00:32<02:48, 3.80it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.635929107666016, Poisson: -0.09875985234975815\r\n",
"\r",
"Epoch 0: 16%|▏| 126/766 [00:32<02:47, 3.82it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 16%|▏| 126/766 [00:33<02:48, 3.80it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.100229263305664, Poisson: -0.10157999396324158\r\n",
"\r",
"Epoch 0: 17%|▏| 127/766 [00:33<02:47, 3.82it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 127/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.291488647460938, Poisson: -0.10710974782705307\r\n",
"\r",
"Epoch 0: 17%|▏| 128/766 [00:33<02:46, 3.82it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 128/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.73076057434082, Poisson: -0.1042914167046547\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 129/766 [00:33<02:46, 3.83it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 129/766 [00:33<02:47, 3.81it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.011499404907227, Poisson: -0.08988802134990692\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 130/766 [00:34<02:46, 3.81it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 17%|▏| 130/766 [00:34<02:46, 3.81it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.779977798461914, Poisson: -0.10456342250108719\r\n",
"\r",
"Epoch 0: 17%|▏| 131/766 [00:34<02:45, 3.83it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 131/766 [00:34<02:46, 3.82it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.7364444732666, Poisson: -0.10426057130098343\r\n",
"\r",
"Epoch 0: 17%|▏| 132/766 [00:34<02:45, 3.83it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 132/766 [00:34<02:46, 3.82it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.27402114868164, Poisson: -0.08728134632110596\r\n",
"\r",
"Epoch 0: 17%|▏| 133/766 [00:34<02:45, 3.84it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 133/766 [00:34<02:45, 3.82it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.547163009643555, Poisson: -0.09258746355772018\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 134/766 [00:34<02:44, 3.84it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 17%|▏| 134/766 [00:35<02:45, 3.82it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.47853660583496, Poisson: -0.1130625307559967\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 135/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 18%|▏| 135/766 [00:35<02:44, 3.82it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.483755111694336, Poisson: -0.11301064491271973\r\n",
"\r",
"Epoch 0: 18%|▏| 136/766 [00:35<02:44, 3.84it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 136/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.89188003540039, Poisson: -0.11027445644140244\r\n",
"\r",
"Epoch 0: 18%|▏| 137/766 [00:35<02:43, 3.84it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 137/766 [00:35<02:44, 3.83it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.216276168823242, Poisson: -0.10149399191141129\r\n",
"\r",
"Epoch 0: 18%|▏| 138/766 [00:35<02:43, 3.85it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 138/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.98031234741211, Poisson: -0.09576455503702164\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 139/766 [00:36<02:42, 3.85it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 139/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.223608016967773, Poisson: -0.1015629693865776\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 140/766 [00:36<02:43, 3.84it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 18%|▏| 140/766 [00:36<02:43, 3.83it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.562938690185547, Poisson: -0.09860718995332718\r\n",
"\r",
"Epoch 0: 18%|▏| 141/766 [00:36<02:42, 3.85it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 18%|▏| 141/766 [00:36<02:42, 3.84it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.765254974365234, Poisson: -0.10466967523097992\r\n",
"\r",
"Epoch 0: 19%|▏| 142/766 [00:36<02:41, 3.85it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 142/766 [00:36<02:42, 3.84it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.77707290649414, Poisson: -0.10455359518527985\r\n",
"\r",
"Epoch 0: 19%|▏| 143/766 [00:37<02:41, 3.86it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 143/766 [00:37<02:42, 3.84it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.388973236083984, Poisson: -0.10733388364315033\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 144/766 [00:37<02:41, 3.86it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 144/766 [00:37<02:41, 3.84it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.004207611083984, Poisson: -0.09571236371994019\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 145/766 [00:37<02:41, 3.85it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 19%|▏| 145/766 [00:37<02:41, 3.84it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.35780906677246, Poisson: -0.10723188519477844\r\n",
"\r",
"Epoch 0: 19%|▏| 146/766 [00:37<02:40, 3.86it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 146/766 [00:37<02:41, 3.85it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.21773338317871, Poisson: -0.10146009176969528\r\n",
"\r",
"Epoch 0: 19%|▏| 147/766 [00:38<02:40, 3.86it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 147/766 [00:38<02:40, 3.85it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.026350021362305, Poisson: -0.09577830880880356\r\n",
"\r",
"Epoch 0: 19%|▏| 148/766 [00:38<02:39, 3.86it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 148/766 [00:38<02:40, 3.85it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.944095611572266, Poisson: -0.11019343137741089\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 149/766 [00:38<02:39, 3.87it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 19%|▏| 149/766 [00:38<02:40, 3.85it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.010438919067383, Poisson: -0.1160307228565216\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 150/766 [00:38<02:39, 3.85it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 20%|▏| 150/766 [00:38<02:39, 3.85it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.618486404418945, Poisson: -0.09847906231880188\r\n",
"\r",
"Epoch 0: 20%|▏| 151/766 [00:39<02:38, 3.87it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 151/766 [00:39<02:39, 3.86it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.028602600097656, Poisson: -0.11033559590578079\r\n",
"\r",
"Epoch 0: 20%|▏| 152/766 [00:39<02:38, 3.87it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 152/766 [00:39<02:39, 3.86it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.400075912475586, Poisson: -0.09291176497936249\r\n",
"\r",
"Epoch 0: 20%|▏| 153/766 [00:39<02:38, 3.87it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 153/766 [00:39<02:38, 3.86it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.339847564697266, Poisson: -0.10720396786928177\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 154/766 [00:39<02:37, 3.87it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 154/766 [00:39<02:38, 3.86it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.87708854675293, Poisson: -0.0903216078877449\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 155/766 [00:40<02:38, 3.86it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 20%|▏| 155/766 [00:40<02:38, 3.86it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.164257049560547, Poisson: -0.08146476745605469\r\n",
"\r",
"Epoch 0: 20%|▏| 156/766 [00:40<02:37, 3.88it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 156/766 [00:40<02:37, 3.86it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.632192611694336, Poisson: -0.09867066890001297\r\n",
"\r",
"Epoch 0: 20%|▏| 157/766 [00:40<02:37, 3.88it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 20%|▏| 157/766 [00:40<02:37, 3.87it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.666051864624023, Poisson: -0.0988137349486351\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 158/766 [00:40<02:36, 3.88it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 158/766 [00:40<02:37, 3.87it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.503517150878906, Poisson: -0.09298935532569885\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 159/766 [00:40<02:36, 3.88it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 159/766 [00:41<02:36, 3.87it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.657108306884766, Poisson: -0.09871099889278412\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 160/766 [00:41<02:36, 3.87it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 21%|▏| 160/766 [00:41<02:36, 3.87it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.22437286376953, Poisson: -0.10139385610818863\r\n",
"\r",
"Epoch 0: 21%|▏| 161/766 [00:41<02:35, 3.88it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 161/766 [00:41<02:36, 3.87it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.209266662597656, Poisson: -0.10157406330108643\r\n",
"\r",
"Epoch 0: 21%|▏| 162/766 [00:41<02:35, 3.89it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 162/766 [00:41<02:35, 3.87it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.142242431640625, Poisson: -0.08123155683279037\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 163/766 [00:41<02:35, 3.89it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 163/766 [00:42<02:35, 3.88it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.951335906982422, Poisson: -0.0898410826921463\r\n",
"\r",
"Epoch 0: 21%|▏| 164/766 [00:42<02:34, 3.89it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 21%|▏| 164/766 [00:42<02:35, 3.88it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.9245548248291, Poisson: -0.09010511636734009\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 165/766 [00:42<02:34, 3.88it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 22%|▏| 165/766 [00:42<02:34, 3.88it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.871036529541016, Poisson: -0.11026760935783386\r\n",
"\r",
"Epoch 0: 22%|▏| 166/766 [00:42<02:34, 3.89it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 166/766 [00:42<02:34, 3.88it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.756982803344727, Poisson: -0.10451257973909378\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 167/766 [00:42<02:33, 3.89it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 167/766 [00:43<02:34, 3.88it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.007671356201172, Poisson: -0.0958346351981163\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 168/766 [00:43<02:33, 3.90it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 168/766 [00:43<02:33, 3.88it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.315797805786133, Poisson: -0.08711469173431396\r\n",
"\r",
"Epoch 0: 22%|▏| 169/766 [00:43<02:33, 3.90it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 169/766 [00:43<02:33, 3.88it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.435083389282227, Poisson: -0.09278573840856552\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 170/766 [00:43<02:33, 3.89it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 22%|▏| 170/766 [00:43<02:33, 3.89it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.58577537536621, Poisson: -0.09869471937417984\r\n",
"\r",
"Epoch 0: 22%|▏| 171/766 [00:43<02:32, 3.90it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 171/766 [00:43<02:33, 3.89it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.860515594482422, Poisson: -0.09017433971166611\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 172/766 [00:44<02:32, 3.90it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 22%|▏| 172/766 [00:44<02:32, 3.89it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.060638427734375, Poisson: -0.11591766029596329\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 173/766 [00:44<02:31, 3.90it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 173/766 [00:44<02:32, 3.89it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.262035369873047, Poisson: -0.08708906173706055\r\n",
"\r",
"Epoch 0: 23%|▏| 174/766 [00:44<02:31, 3.90it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 174/766 [00:44<02:32, 3.89it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.997161865234375, Poisson: -0.09561602771282196\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 175/766 [00:44<02:31, 3.89it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 23%|▏| 175/766 [00:44<02:31, 3.89it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.397085189819336, Poisson: -0.10748428851366043\r\n",
"\r",
"Epoch 0: 23%|▏| 176/766 [00:45<02:31, 3.91it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 176/766 [00:45<02:31, 3.89it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.636493682861328, Poisson: -0.09865304082632065\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 177/766 [00:45<02:30, 3.91it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 177/766 [00:45<02:31, 3.90it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.90738296508789, Poisson: -0.11019153892993927\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 178/766 [00:45<02:30, 3.91it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 178/766 [00:45<02:30, 3.90it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.74561309814453, Poisson: -0.10456544160842896\r\n",
"\r",
"Epoch 0: 23%|▏| 179/766 [00:45<02:30, 3.91it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 179/766 [00:45<02:30, 3.90it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.05675506591797, Poisson: -0.11581964790821075\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 23%|▏| 180/766 [00:46<02:30, 3.90it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 23%|▏| 180/766 [00:46<02:30, 3.90it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.544309616088867, Poisson: -0.0925077348947525\r\n",
"\r",
"Epoch 0: 24%|▏| 181/766 [00:46<02:29, 3.91it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 181/766 [00:46<02:29, 3.90it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.366865158081055, Poisson: -0.1072160005569458\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 182/766 [00:46<02:29, 3.91it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 182/766 [00:46<02:29, 3.90it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.010889053344727, Poisson: -0.09560806304216385\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 183/766 [00:46<02:28, 3.91it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 183/766 [00:46<02:29, 3.90it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.048059463500977, Poisson: -0.09576614946126938\r\n",
"\r",
"Epoch 0: 24%|▏| 184/766 [00:46<02:28, 3.92it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 184/766 [00:47<02:29, 3.90it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.188583374023438, Poisson: -0.10139279067516327\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 185/766 [00:47<02:28, 3.91it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 24%|▏| 185/766 [00:47<02:28, 3.90it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.821096420288086, Poisson: -0.10449020564556122\r\n",
"\r",
"Epoch 0: 24%|▏| 186/766 [00:47<02:28, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 186/766 [00:47<02:28, 3.91it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.78985023498535, Poisson: -0.10434942692518234\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 187/766 [00:47<02:27, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 24%|▏| 187/766 [00:47<02:28, 3.91it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.14258575439453, Poisson: -0.10168063640594482\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▏| 188/766 [00:47<02:27, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▏| 188/766 [00:48<02:27, 3.91it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.629594802856445, Poisson: -0.09855164587497711\r\n",
"\r",
"Epoch 0: 25%|▏| 189/766 [00:48<02:27, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▏| 189/766 [00:48<02:27, 3.91it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.340116500854492, Poisson: -0.10711188614368439\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▏| 190/766 [00:48<02:27, 3.91it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 25%|▏| 190/766 [00:48<02:27, 3.91it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.662609100341797, Poisson: -0.07837900519371033\r\n",
"\r",
"Epoch 0: 25%|▏| 191/766 [00:48<02:26, 3.92it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▏| 191/766 [00:48<02:26, 3.91it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.059072494506836, Poisson: -0.09556883573532104\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▎| 192/766 [00:48<02:26, 3.92it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▎| 192/766 [00:49<02:26, 3.91it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.55850601196289, Poisson: -0.09870775789022446\r\n",
"\r",
"Epoch 0: 25%|▎| 193/766 [00:49<02:25, 3.93it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▎| 193/766 [00:49<02:26, 3.91it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.497467041015625, Poisson: -0.09287054091691971\r\n",
"\r",
"Epoch 0: 25%|▎| 194/766 [00:49<02:25, 3.93it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▎| 194/766 [00:49<02:26, 3.92it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.54609489440918, Poisson: -0.07833902537822723\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 25%|▎| 195/766 [00:49<02:25, 3.92it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 25%|▎| 195/766 [00:49<02:25, 3.92it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.772235870361328, Poisson: -0.1042647436261177\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 196/766 [00:49<02:25, 3.93it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 196/766 [00:50<02:25, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.815385818481445, Poisson: -0.10429618507623672\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 197/766 [00:50<02:24, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 197/766 [00:50<02:25, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.140241622924805, Poisson: -0.10133679211139679\r\n",
"\r",
"Epoch 0: 26%|▎| 198/766 [00:50<02:24, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 198/766 [00:50<02:24, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.803293228149414, Poisson: -0.10434843599796295\r\n",
"\r",
"Epoch 0: 26%|▎| 199/766 [00:50<02:24, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 199/766 [00:50<02:24, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.736957550048828, Poisson: -0.1044909879565239\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 200/766 [00:50<02:24, 3.92it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 26%|▎| 200/766 [00:50<02:24, 3.92it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.19703483581543, Poisson: -0.12156690657138824\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 201/766 [00:51<02:23, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 201/766 [00:51<02:24, 3.92it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.888446807861328, Poisson: -0.10990883409976959\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 202/766 [00:51<02:23, 3.93it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 26%|▎| 202/766 [00:51<02:23, 3.92it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.0115909576416, Poisson: -0.11590568721294403\r\n",
"\r",
"Epoch 0: 27%|▎| 203/766 [00:51<02:23, 3.94it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 203/766 [00:51<02:23, 3.93it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.955623626708984, Poisson: -0.08990591764450073\r\n",
"\r",
"Epoch 0: 27%|▎| 204/766 [00:51<02:22, 3.94it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 204/766 [00:51<02:23, 3.93it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.459075927734375, Poisson: -0.11290311068296432\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 205/766 [00:52<02:22, 3.93it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 27%|▎| 205/766 [00:52<02:22, 3.93it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.095584869384766, Poisson: -0.09583885967731476\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 206/766 [00:52<02:22, 3.94it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 206/766 [00:52<02:22, 3.93it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.653268814086914, Poisson: -0.07830702513456345\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 207/766 [00:52<02:21, 3.94it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 207/766 [00:52<02:22, 3.93it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.061912536621094, Poisson: -0.09579043090343475\r\n",
"\r",
"Epoch 0: 27%|▎| 208/766 [00:52<02:21, 3.94it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 208/766 [00:52<02:21, 3.93it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.23405647277832, Poisson: -0.1015443205833435\r\n",
"\r",
"Epoch 0: 27%|▎| 209/766 [00:53<02:21, 3.94it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 209/766 [00:53<02:21, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.337825775146484, Poisson: -0.10725290328264236\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 27%|▎| 210/766 [00:53<02:21, 3.93it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 27%|▎| 210/766 [00:53<02:21, 3.93it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.747053146362305, Poisson: -0.10438595712184906\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 211/766 [00:53<02:20, 3.94it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 211/766 [00:53<02:21, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.78754997253418, Poisson: -0.1044032946228981\r\n",
"\r",
"Epoch 0: 28%|▎| 212/766 [00:53<02:20, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 212/766 [00:53<02:20, 3.93it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.73340606689453, Poisson: -0.08447693288326263\r\n",
"\r",
"Epoch 0: 28%|▎| 213/766 [00:53<02:20, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 213/766 [00:54<02:20, 3.93it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.85317039489746, Poisson: -0.11031051725149155\r\n",
"\r",
"Epoch 0: 28%|▎| 214/766 [00:54<02:19, 3.95it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 214/766 [00:54<02:20, 3.94it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.320491790771484, Poisson: -0.1070985198020935\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 215/766 [00:54<02:19, 3.94it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 215/766 [00:54<02:19, 3.94it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.711591720581055, Poisson: -0.10438373684883118\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 216/766 [00:54<02:19, 3.95it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 216/766 [00:54<02:19, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.738370895385742, Poisson: -0.10432856529951096\r\n",
"\r",
"Epoch 0: 28%|▎| 217/766 [00:54<02:19, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 217/766 [00:55<02:19, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.16659164428711, Poisson: -0.10148350149393082\r\n",
"\r",
"Epoch 0: 28%|▎| 218/766 [00:55<02:18, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 28%|▎| 218/766 [00:55<02:19, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.867549896240234, Poisson: -0.09012211859226227\r\n",
"\r",
"Epoch 0: 29%|▎| 219/766 [00:55<02:18, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 219/766 [00:55<02:18, 3.94it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.603740692138672, Poisson: -0.09860372543334961\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 220/766 [00:55<02:18, 3.94it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 29%|▎| 220/766 [00:55<02:18, 3.94it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.064496994018555, Poisson: -0.09578673541545868\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 221/766 [00:55<02:17, 3.95it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 221/766 [00:56<02:18, 3.94it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.167654037475586, Poisson: -0.1014397069811821\r\n",
"\r",
"Epoch 0: 29%|▎| 222/766 [00:56<02:17, 3.95it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 222/766 [00:56<02:17, 3.94it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.180723190307617, Poisson: -0.08138734847307205\r\n",
"\r",
"Epoch 0: 29%|▎| 223/766 [00:56<02:17, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 223/766 [00:56<02:17, 3.94it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.466264724731445, Poisson: -0.11305644363164902\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 224/766 [00:56<02:17, 3.95it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 224/766 [00:56<02:17, 3.94it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.176498413085938, Poisson: -0.10131478309631348\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 29%|▎| 225/766 [00:57<02:17, 3.95it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 29%|▎| 225/766 [00:57<02:17, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.572397232055664, Poisson: -0.09865520149469376\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 226/766 [00:57<02:16, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 226/766 [00:57<02:16, 3.95it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.918230056762695, Poisson: -0.110383540391922\r\n",
"\r",
"Epoch 0: 30%|▎| 227/766 [00:57<02:16, 3.96it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 227/766 [00:57<02:16, 3.95it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.728248596191406, Poisson: -0.10435037314891815\r\n",
"\r",
"Epoch 0: 30%|▎| 228/766 [00:57<02:15, 3.96it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 228/766 [00:57<02:16, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.025983810424805, Poisson: -0.1158100888133049\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 229/766 [00:57<02:15, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 229/766 [00:57<02:15, 3.95it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.17369270324707, Poisson: -0.10164433717727661\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 230/766 [00:58<02:15, 3.95it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 30%|▎| 230/766 [00:58<02:15, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.44134521484375, Poisson: -0.09278088063001633\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 231/766 [00:58<02:15, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 231/766 [00:58<02:15, 3.95it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.792768478393555, Poisson: -0.08994679898023605\r\n",
"\r",
"Epoch 0: 30%|▎| 232/766 [00:58<02:14, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 232/766 [00:58<02:15, 3.95it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.13924789428711, Poisson: -0.10129155963659286\r\n",
"\r",
"Epoch 0: 30%|▎| 233/766 [00:58<02:14, 3.96it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 30%|▎| 233/766 [00:58<02:14, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.57449722290039, Poisson: -0.09882348030805588\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 234/766 [00:59<02:14, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 234/766 [00:59<02:14, 3.95it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.53229522705078, Poisson: -0.09858986735343933\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 235/766 [00:59<02:14, 3.95it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 31%|▎| 235/766 [00:59<02:14, 3.95it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.744096755981445, Poisson: -0.10428847372531891\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 236/766 [00:59<02:13, 3.96it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 236/766 [00:59<02:14, 3.95it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.346569061279297, Poisson: -0.1072879359126091\r\n",
"\r",
"Epoch 0: 31%|▎| 237/766 [00:59<02:13, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 237/766 [00:59<02:13, 3.95it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.879066467285156, Poisson: -0.08986541628837585\r\n",
"\r",
"Epoch 0: 31%|▎| 238/766 [01:00<02:13, 3.96it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 238/766 [01:00<02:13, 3.96it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.27837371826172, Poisson: -0.10729967057704926\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 239/766 [01:00<02:12, 3.97it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 239/766 [01:00<02:13, 3.96it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.026735305786133, Poisson: -0.09572773426771164\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 240/766 [01:00<02:12, 3.96it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 31%|▎| 240/766 [01:00<02:12, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.477657318115234, Poisson: -0.11297494918107986\r\n",
"\r",
"Epoch 0: 31%|▎| 241/766 [01:00<02:12, 3.97it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 31%|▎| 241/766 [01:00<02:12, 3.96it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.000240325927734, Poisson: -0.09568235278129578\r\n",
"\r",
"Epoch 0: 32%|▎| 242/766 [01:01<02:12, 3.97it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 242/766 [01:01<02:12, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.904945373535156, Poisson: -0.11020729690790176\r\n",
"\r",
"Epoch 0: 32%|▎| 243/766 [01:01<02:11, 3.97it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 243/766 [01:01<02:12, 3.96it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.478771209716797, Poisson: -0.09279295802116394\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 244/766 [01:01<02:11, 3.97it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 244/766 [01:01<02:11, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.575580596923828, Poisson: -0.09879402071237564\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 245/766 [01:01<02:11, 3.96it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 32%|▎| 245/766 [01:01<02:11, 3.96it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.602312088012695, Poisson: -0.0987543910741806\r\n",
"\r",
"Epoch 0: 32%|▎| 246/766 [01:01<02:10, 3.97it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 246/766 [01:02<02:11, 3.96it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.788652420043945, Poisson: -0.10461057722568512\r\n",
"\r",
"Epoch 0: 32%|▎| 247/766 [01:02<02:10, 3.97it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 247/766 [01:02<02:10, 3.96it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.194067001342773, Poisson: -0.08147089183330536\r\n",
"\r",
"Epoch 0: 32%|▎| 248/766 [01:02<02:10, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 32%|▎| 248/766 [01:02<02:10, 3.96it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02446174621582, Poisson: -0.09592930972576141\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 249/766 [01:02<02:10, 3.97it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 249/766 [01:02<02:10, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.48725128173828, Poisson: -0.09317978471517563\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 250/766 [01:03<02:10, 3.96it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 33%|▎| 250/766 [01:03<02:10, 3.96it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.3828125, Poisson: -0.10734201222658157\r\n",
"\r",
"Epoch 0: 33%|▎| 251/766 [01:03<02:09, 3.97it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 251/766 [01:03<02:09, 3.96it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.242563247680664, Poisson: -0.10164496302604675\r\n",
"\r",
"Epoch 0: 33%|▎| 252/766 [01:03<02:09, 3.97it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 252/766 [01:03<02:09, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.66411781311035, Poisson: -0.09871603548526764\r\n",
"\r",
"Epoch 0: 33%|▎| 253/766 [01:03<02:09, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 253/766 [01:03<02:09, 3.97it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.409164428710938, Poisson: -0.10761073976755142\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 254/766 [01:03<02:08, 3.98it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 254/766 [01:04<02:09, 3.97it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.343318939208984, Poisson: -0.1072956845164299\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 255/766 [01:04<02:08, 3.97it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 33%|▎| 255/766 [01:04<02:08, 3.97it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.058074951171875, Poisson: -0.09569811075925827\r\n",
"\r",
"Epoch 0: 33%|▎| 256/766 [01:04<02:08, 3.98it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 33%|▎| 256/766 [01:04<02:08, 3.97it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18549346923828, Poisson: -0.10135468095541\r\n",
"\r",
"Epoch 0: 34%|▎| 257/766 [01:04<02:07, 3.98it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 257/766 [01:04<02:08, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.718685150146484, Poisson: -0.10441819578409195\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 258/766 [01:04<02:07, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 258/766 [01:04<02:07, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.021270751953125, Poisson: -0.09564517438411713\r\n",
"\r",
"Epoch 0: 34%|▎| 259/766 [01:05<02:07, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 259/766 [01:05<02:07, 3.97it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.88401222229004, Poisson: -0.09017042070627213\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 260/766 [01:05<02:07, 3.97it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 34%|▎| 260/766 [01:05<02:07, 3.97it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.753528594970703, Poisson: -0.08439047634601593\r\n",
"\r",
"Epoch 0: 34%|▎| 261/766 [01:05<02:06, 3.98it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 261/766 [01:05<02:07, 3.97it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.46993064880371, Poisson: -0.09300607442855835\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 262/766 [01:05<02:06, 3.98it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 262/766 [01:05<02:06, 3.97it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.609466552734375, Poisson: -0.09894497692584991\r\n",
"\r",
"Epoch 0: 34%|▎| 263/766 [01:06<02:06, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 263/766 [01:06<02:06, 3.97it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.225048065185547, Poisson: -0.1015915721654892\r\n",
"\r",
"Epoch 0: 34%|▎| 264/766 [01:06<02:06, 3.98it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 34%|▎| 264/766 [01:06<02:06, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.738489151000977, Poisson: -0.10424954444169998\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 265/766 [01:06<02:06, 3.97it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 35%|▎| 265/766 [01:06<02:06, 3.97it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.935768127441406, Poisson: -0.08997920900583267\r\n",
"\r",
"Epoch 0: 35%|▎| 266/766 [01:06<02:05, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 266/766 [01:06<02:05, 3.97it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.35933494567871, Poisson: -0.08705601096153259\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 267/766 [01:07<02:05, 3.98it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 267/766 [01:07<02:05, 3.97it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.61872673034668, Poisson: -0.07853472977876663\r\n",
"\r",
"Epoch 0: 35%|▎| 268/766 [01:07<02:05, 3.98it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 268/766 [01:07<02:05, 3.98it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18777084350586, Poisson: -0.10162300616502762\r\n",
"\r",
"Epoch 0: 35%|▎| 269/766 [01:07<02:04, 3.98it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 269/766 [01:07<02:04, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.008358001708984, Poisson: -0.09578895568847656\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 270/766 [01:07<02:04, 3.98it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 35%|▎| 270/766 [01:07<02:04, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.775907516479492, Poisson: -0.10441450029611588\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 271/766 [01:08<02:04, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 35%|▎| 271/766 [01:08<02:04, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.45305824279785, Poisson: -0.09294477105140686\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 272/766 [01:08<02:03, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 272/766 [01:08<02:04, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.475934982299805, Poisson: -0.09292714297771454\r\n",
"\r",
"Epoch 0: 36%|▎| 273/766 [01:08<02:03, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 273/766 [01:08<02:03, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.073951721191406, Poisson: -0.11599202454090118\r\n",
"\r",
"Epoch 0: 36%|▎| 274/766 [01:08<02:03, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 274/766 [01:08<02:03, 3.98it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.46885871887207, Poisson: -0.09305798262357712\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 275/766 [01:09<02:03, 3.98it/s, v_num=a0al, train_loss_step=24\r",
"Epoch 0: 36%|▎| 275/766 [01:09<02:03, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.743139266967773, Poisson: -0.1045631468296051\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 276/766 [01:09<02:02, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 276/766 [01:09<02:03, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.68687629699707, Poisson: -0.08435137569904327\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 277/766 [01:09<02:02, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 277/766 [01:09<02:02, 3.98it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.068389892578125, Poisson: -0.09600233286619186\r\n",
"\r",
"Epoch 0: 36%|▎| 278/766 [01:09<02:02, 3.99it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 278/766 [01:09<02:02, 3.98it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.4620361328125, Poisson: -0.09318304061889648\r\n",
"\r",
"Epoch 0: 36%|▎| 279/766 [01:09<02:02, 3.99it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 36%|▎| 279/766 [01:10<02:02, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.180936813354492, Poisson: -0.08134336769580841\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 280/766 [01:10<02:02, 3.98it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 280/766 [01:10<02:02, 3.98it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.297426223754883, Poisson: -0.08734682202339172\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 281/766 [01:10<02:01, 3.99it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 281/766 [01:10<02:01, 3.98it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.0560245513916, Poisson: -0.09584958106279373\r\n",
"\r",
"Epoch 0: 37%|▎| 282/766 [01:10<02:01, 3.99it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 282/766 [01:10<02:01, 3.98it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.90555191040039, Poisson: -0.11051429063081741\r\n",
"\r",
"Epoch 0: 37%|▎| 283/766 [01:10<02:00, 3.99it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 283/766 [01:11<02:01, 3.98it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.218358993530273, Poisson: -0.10195266455411911\r\n",
"\r",
"Epoch 0: 37%|▎| 284/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 284/766 [01:11<02:00, 3.98it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.60708236694336, Poisson: -0.09894034266471863\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 285/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 37%|▎| 285/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.4676513671875, Poisson: -0.11325454711914062\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 286/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 286/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.798397064208984, Poisson: -0.1048145592212677\r\n",
"\r",
"Epoch 0: 37%|▎| 287/766 [01:11<01:59, 3.99it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 37%|▎| 287/766 [01:11<02:00, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.460453033447266, Poisson: -0.11320748180150986\r\n",
"\r",
"Epoch 0: 38%|▍| 288/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 288/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.974878311157227, Poisson: -0.0959002673625946\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 289/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 289/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.074901580810547, Poisson: -0.11613652110099792\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 290/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 38%|▍| 290/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.31687355041504, Poisson: -0.10731612145900726\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 291/766 [01:12<01:58, 4.00it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 291/766 [01:12<01:59, 3.99it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.34976577758789, Poisson: -0.10729110985994339\r\n",
"\r",
"Epoch 0: 38%|▍| 292/766 [01:13<01:58, 4.00it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 292/766 [01:13<01:58, 3.99it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.7636661529541, Poisson: -0.1047850176692009\r\n",
"\r",
"Epoch 0: 38%|▍| 293/766 [01:13<01:58, 4.00it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 293/766 [01:13<01:58, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.743812561035156, Poisson: -0.1044720709323883\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 294/766 [01:13<01:58, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 38%|▍| 294/766 [01:13<01:58, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.24456024169922, Poisson: -0.08739378303289413\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 295/766 [01:13<01:58, 3.99it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 39%|▍| 295/766 [01:13<01:58, 3.99it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.432994842529297, Poisson: -0.09291543066501617\r\n",
"\r",
"Epoch 0: 39%|▍| 296/766 [01:14<01:57, 4.00it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 296/766 [01:14<01:57, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.49660301208496, Poisson: -0.09277672320604324\r\n",
"\r",
"Epoch 0: 39%|▍| 297/766 [01:14<01:57, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 297/766 [01:14<01:57, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.835451126098633, Poisson: -0.10490729659795761\r\n",
"\r",
"Epoch 0: 39%|▍| 298/766 [01:14<01:57, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 298/766 [01:14<01:57, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.90227699279785, Poisson: -0.11049213260412216\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 299/766 [01:14<01:56, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 299/766 [01:14<01:56, 3.99it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.16424560546875, Poisson: -0.08146752417087555\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 300/766 [01:15<01:56, 3.99it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 39%|▍| 300/766 [01:15<01:56, 3.99it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.780229568481445, Poisson: -0.10456757247447968\r\n",
"\r",
"Epoch 0: 39%|▍| 301/766 [01:15<01:56, 4.00it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 301/766 [01:15<01:56, 3.99it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.533884048461914, Poisson: -0.09327836334705353\r\n",
"\r",
"Epoch 0: 39%|▍| 302/766 [01:15<01:55, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 39%|▍| 302/766 [01:15<01:56, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.492950439453125, Poisson: -0.0929587110877037\r\n",
"\r",
"Epoch 0: 40%|▍| 303/766 [01:15<01:55, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 303/766 [01:15<01:55, 3.99it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.189559936523438, Poisson: -0.12199914455413818\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 304/766 [01:15<01:55, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 304/766 [01:16<01:55, 3.99it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.94580841064453, Poisson: -0.11033787578344345\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 305/766 [01:16<01:55, 4.00it/s, v_num=a0al, train_loss_step=25\r",
"Epoch 0: 40%|▍| 305/766 [01:16<01:55, 3.99it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.95433235168457, Poisson: -0.09568342566490173\r\n",
"\r",
"Epoch 0: 40%|▍| 306/766 [01:16<01:54, 4.00it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 306/766 [01:16<01:55, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.165157318115234, Poisson: -0.10151126980781555\r\n",
"\r",
"Epoch 0: 40%|▍| 307/766 [01:16<01:54, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 307/766 [01:16<01:54, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.755311965942383, Poisson: -0.10471049696207047\r\n",
"\r",
"Epoch 0: 40%|▍| 308/766 [01:16<01:54, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 308/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.50486946105957, Poisson: -0.11317897588014603\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 309/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 309/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.089189529418945, Poisson: -0.09579245746135712\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 40%|▍| 310/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 40%|▍| 310/766 [01:17<01:54, 4.00it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.973127365112305, Poisson: -0.09630625694990158\r\n",
"\r",
"Epoch 0: 41%|▍| 311/766 [01:17<01:53, 4.00it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 311/766 [01:17<01:53, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.46533203125, Poisson: -0.09327004104852676\r\n",
"\r",
"Epoch 0: 41%|▍| 312/766 [01:17<01:53, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 312/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.34235954284668, Poisson: -0.08725226670503616\r\n",
"\r",
"Epoch 0: 41%|▍| 313/766 [01:18<01:53, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 313/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.199413299560547, Poisson: -0.10223302245140076\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 314/766 [01:18<01:52, 4.01it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 314/766 [01:18<01:53, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.443342208862305, Poisson: -0.0930081233382225\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 315/766 [01:18<01:52, 4.00it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 41%|▍| 315/766 [01:18<01:52, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.190698623657227, Poisson: -0.10157360881567001\r\n",
"\r",
"Epoch 0: 41%|▍| 316/766 [01:18<01:52, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 316/766 [01:19<01:52, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.79853057861328, Poisson: -0.1045677438378334\r\n",
"\r",
"Epoch 0: 41%|▍| 317/766 [01:19<01:52, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 41%|▍| 317/766 [01:19<01:52, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.742656707763672, Poisson: -0.08421915024518967\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 318/766 [01:19<01:51, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 318/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.330219268798828, Poisson: -0.08722923696041107\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 319/766 [01:19<01:51, 4.01it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 319/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.47544288635254, Poisson: -0.11358413100242615\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 320/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 42%|▍| 320/766 [01:19<01:51, 4.00it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.467269897460938, Poisson: -0.09299182146787643\r\n",
"\r",
"Epoch 0: 42%|▍| 321/766 [01:20<01:51, 4.01it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 321/766 [01:20<01:51, 4.00it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.6672306060791, Poisson: -0.09876550734043121\r\n",
"\r",
"Epoch 0: 42%|▍| 322/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 322/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.132488250732422, Poisson: -0.1016790047287941\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 323/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 323/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.278751373291016, Poisson: -0.08713781833648682\r\n",
"\r",
"Epoch 0: 42%|▍| 324/766 [01:20<01:50, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 324/766 [01:20<01:50, 4.00it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.759872436523438, Poisson: -0.1045098751783371\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 42%|▍| 325/766 [01:21<01:50, 4.00it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 42%|▍| 325/766 [01:21<01:50, 4.00it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.067596435546875, Poisson: -0.11585790663957596\r\n",
"\r",
"Epoch 0: 43%|▍| 326/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 326/766 [01:21<01:49, 4.00it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.932355880737305, Poisson: -0.08997628837823868\r\n",
"\r",
"Epoch 0: 43%|▍| 327/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 327/766 [01:21<01:49, 4.00it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.26723861694336, Poisson: -0.08742334693670273\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 328/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 328/766 [01:21<01:49, 4.01it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.192481994628906, Poisson: -0.10159247368574142\r\n",
"\r",
"Epoch 0: 43%|▍| 329/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 329/766 [01:22<01:49, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.774616241455078, Poisson: -0.1045759841799736\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 330/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 43%|▍| 330/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.918109893798828, Poisson: -0.11037542670965195\r\n",
"\r",
"Epoch 0: 43%|▍| 331/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 331/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.097972869873047, Poisson: -0.08164189010858536\r\n",
"\r",
"Epoch 0: 43%|▍| 332/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 332/766 [01:22<01:48, 4.01it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.439931869506836, Poisson: -0.09298811107873917\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 333/766 [01:22<01:47, 4.01it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 43%|▍| 333/766 [01:23<01:48, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.610124588012695, Poisson: -0.09887861460447311\r\n",
"\r",
"Epoch 0: 44%|▍| 334/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 334/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.422632217407227, Poisson: -0.09310529381036758\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 335/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 44%|▍| 335/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.64097785949707, Poisson: -0.09903261810541153\r\n",
"\r",
"Epoch 0: 44%|▍| 336/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 336/766 [01:23<01:47, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.74676513671875, Poisson: -0.10450173169374466\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 337/766 [01:23<01:46, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 337/766 [01:24<01:47, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.961679458618164, Poisson: -0.11031122505664825\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 338/766 [01:24<01:46, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 338/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.743120193481445, Poisson: -0.10468468070030212\r\n",
"\r",
"Epoch 0: 44%|▍| 339/766 [01:24<01:46, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 339/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.76246452331543, Poisson: -0.10444938391447067\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 44%|▍| 340/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 44%|▍| 340/766 [01:24<01:46, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.034584045410156, Poisson: -0.0958065316081047\r\n",
"\r",
"Epoch 0: 45%|▍| 341/766 [01:24<01:45, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 341/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.187660217285156, Poisson: -0.1017942950129509\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 342/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 342/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.44845962524414, Poisson: -0.0929780825972557\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 343/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 343/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.508325576782227, Poisson: -0.09298436343669891\r\n",
"\r",
"Epoch 0: 45%|▍| 344/766 [01:25<01:45, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 344/766 [01:25<01:45, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.59770393371582, Poisson: -0.09871768206357956\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 345/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 45%|▍| 345/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.810609817504883, Poisson: -0.09023444354534149\r\n",
"\r",
"Epoch 0: 45%|▍| 346/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 346/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.55392837524414, Poisson: -0.09318080544471741\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 347/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 347/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.338863372802734, Poisson: -0.10776587575674057\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 348/766 [01:26<01:44, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 45%|▍| 348/766 [01:26<01:44, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.75655746459961, Poisson: -0.10445886850357056\r\n",
"\r",
"Epoch 0: 46%|▍| 349/766 [01:26<01:43, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 349/766 [01:26<01:43, 4.01it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.552867889404297, Poisson: -0.07851012051105499\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 350/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 46%|▍| 350/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.313129425048828, Poisson: -0.10756148397922516\r\n",
"\r",
"Epoch 0: 46%|▍| 351/766 [01:27<01:43, 4.02it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 351/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.432830810546875, Poisson: -0.09281440079212189\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 352/766 [01:27<01:42, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 352/766 [01:27<01:43, 4.01it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.061098098754883, Poisson: -0.0959169864654541\r\n",
"\r",
"Epoch 0: 46%|▍| 353/766 [01:27<01:42, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 353/766 [01:27<01:42, 4.01it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.313295364379883, Poisson: -0.10738380998373032\r\n",
"\r",
"Epoch 0: 46%|▍| 354/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 354/766 [01:28<01:42, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.32628631591797, Poisson: -0.10728715360164642\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 355/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 46%|▍| 355/766 [01:28<01:42, 4.01it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.63421630859375, Poisson: -0.09855731576681137\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 356/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 46%|▍| 356/766 [01:28<01:42, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.052242279052734, Poisson: -0.09578649699687958\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 357/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 357/766 [01:28<01:41, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.189252853393555, Poisson: -0.10169314593076706\r\n",
"\r",
"Epoch 0: 47%|▍| 358/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 358/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.797149658203125, Poisson: -0.10486872494220734\r\n",
"\r",
"Epoch 0: 47%|▍| 359/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 359/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.569082260131836, Poisson: -0.09853038191795349\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 360/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 47%|▍| 360/766 [01:29<01:41, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.498085021972656, Poisson: -0.0929093137383461\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 361/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 361/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.900672912597656, Poisson: -0.0900539830327034\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 362/766 [01:29<01:40, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 362/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.629474639892578, Poisson: -0.09880076348781586\r\n",
"\r",
"Epoch 0: 47%|▍| 363/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 47%|▍| 363/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.05840492248535, Poisson: -0.09586732089519501\r\n",
"\r",
"Epoch 0: 48%|▍| 364/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 364/766 [01:30<01:40, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.877946853637695, Poisson: -0.11037742346525192\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 365/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 48%|▍| 365/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.996482849121094, Poisson: -0.0957464724779129\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 366/766 [01:30<01:39, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 366/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.613861083984375, Poisson: -0.11876354366540909\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 367/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 367/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.33321189880371, Poisson: -0.10739743709564209\r\n",
"\r",
"Epoch 0: 48%|▍| 368/766 [01:31<01:38, 4.03it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 368/766 [01:31<01:39, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.935331344604492, Poisson: -0.11003967374563217\r\n",
"\r",
"Epoch 0: 48%|▍| 369/766 [01:31<01:38, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 369/766 [01:31<01:38, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.00196075439453, Poisson: -0.09576454013586044\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 370/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 48%|▍| 370/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.221132278442383, Poisson: -0.10173946619033813\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 371/766 [01:32<01:38, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 48%|▍| 371/766 [01:32<01:38, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.060192108154297, Poisson: -0.09579496085643768\r\n",
"\r",
"Epoch 0: 49%|▍| 372/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 372/766 [01:32<01:37, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.589139938354492, Poisson: -0.09876739978790283\r\n",
"\r",
"Epoch 0: 49%|▍| 373/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 373/766 [01:32<01:37, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.740642547607422, Poisson: -0.10431542992591858\r\n",
"\r",
"Epoch 0: 49%|▍| 374/766 [01:32<01:37, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 374/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.341379165649414, Poisson: -0.08717776834964752\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 375/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 49%|▍| 375/766 [01:33<01:37, 4.02it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.19436264038086, Poisson: -0.10196632891893387\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 376/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 376/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.506832122802734, Poisson: -0.11307775229215622\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 377/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 377/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.549787521362305, Poisson: -0.09868539124727249\r\n",
"\r",
"Epoch 0: 49%|▍| 378/766 [01:33<01:36, 4.03it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 378/766 [01:33<01:36, 4.02it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.48164176940918, Poisson: -0.09314829111099243\r\n",
"\r",
"Epoch 0: 49%|▍| 379/766 [01:34<01:36, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 49%|▍| 379/766 [01:34<01:36, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.741195678710938, Poisson: -0.1045418530702591\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▍| 380/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 50%|▍| 380/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.425317764282227, Poisson: -0.09279096126556396\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▍| 381/766 [01:34<01:35, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▍| 381/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.347368240356445, Poisson: -0.10702072829008102\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▍| 382/766 [01:34<01:35, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▍| 382/766 [01:34<01:35, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.530473709106445, Poisson: -0.11321685463190079\r\n",
"\r",
"Epoch 0: 50%|▌| 383/766 [01:35<01:35, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▌| 383/766 [01:35<01:35, 4.02it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.423980712890625, Poisson: -0.0931314155459404\r\n",
"\r",
"Epoch 0: 50%|▌| 384/766 [01:35<01:34, 4.03it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▌| 384/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.317577362060547, Poisson: -0.10730933398008347\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▌| 385/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 50%|▌| 385/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.54821014404297, Poisson: -0.07840119302272797\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▌| 386/766 [01:35<01:34, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 50%|▌| 386/766 [01:35<01:34, 4.02it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.455156326293945, Poisson: -0.09276847541332245\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 387/766 [01:36<01:34, 4.03it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 387/766 [01:36<01:34, 4.02it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.45780372619629, Poisson: -0.1133873239159584\r\n",
"\r",
"Epoch 0: 51%|▌| 388/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 388/766 [01:36<01:33, 4.02it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.704776763916016, Poisson: -0.08420784771442413\r\n",
"\r",
"Epoch 0: 51%|▌| 389/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 389/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.756298065185547, Poisson: -0.10439086705446243\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 390/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=17\r",
"Epoch 0: 51%|▌| 390/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 14.827723503112793, Poisson: -0.06981196999549866\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 391/766 [01:36<01:33, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 391/766 [01:37<01:33, 4.03it/s, v_num=a0al, train_loss_step=14"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.138330459594727, Poisson: -0.10137390345335007\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 392/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=14"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 392/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.355375289916992, Poisson: -0.1073036715388298\r\n",
"\r",
"Epoch 0: 51%|▌| 393/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 393/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.599882125854492, Poisson: -0.09854143857955933\r\n",
"\r",
"Epoch 0: 51%|▌| 394/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 51%|▌| 394/766 [01:37<01:32, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.186933517456055, Poisson: -0.10148259997367859\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 395/766 [01:38<01:32, 4.03it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 52%|▌| 395/766 [01:38<01:32, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.297380447387695, Poisson: -0.08710375428199768\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 396/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 396/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.292890548706055, Poisson: -0.10710746794939041\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 397/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 397/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.311077117919922, Poisson: -0.08705893158912659\r\n",
"\r",
"Epoch 0: 52%|▌| 398/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 398/766 [01:38<01:31, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.50176429748535, Poisson: -0.09276818484067917\r\n",
"\r",
"Epoch 0: 52%|▌| 399/766 [01:38<01:30, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 399/766 [01:39<01:31, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.905942916870117, Poisson: -0.08994127064943314\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 400/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 400/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.189510345458984, Poisson: -0.12159111350774765\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 401/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 401/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.315311431884766, Poisson: -0.10756457597017288\r\n",
"\r",
"Epoch 0: 52%|▌| 402/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 52%|▌| 402/766 [01:39<01:30, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.505334854125977, Poisson: -0.09276691824197769\r\n",
"\r",
"Epoch 0: 53%|▌| 403/766 [01:39<01:29, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 403/766 [01:40<01:30, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.212017059326172, Poisson: -0.10146214812994003\r\n",
"\r",
"Epoch 0: 53%|▌| 404/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 404/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.59143829345703, Poisson: -0.09869526326656342\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 405/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 53%|▌| 405/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.556276321411133, Poisson: -0.09864883124828339\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 406/766 [01:40<01:29, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 406/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.144487380981445, Poisson: -0.10163812339305878\r\n",
"\r",
"Epoch 0: 53%|▌| 407/766 [01:40<01:28, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 407/766 [01:40<01:29, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.869873046875, Poisson: -0.11037392169237137\r\n",
"\r",
"Epoch 0: 53%|▌| 408/766 [01:41<01:28, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 408/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.9237060546875, Poisson: -0.08994999527931213\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 409/766 [01:41<01:28, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 53%|▌| 409/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.603927612304688, Poisson: -0.09861285984516144\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 410/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 54%|▌| 410/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.39595603942871, Poisson: -0.0930929183959961\r\n",
"\r",
"Epoch 0: 54%|▌| 411/766 [01:41<01:27, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 411/766 [01:41<01:28, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.3190860748291, Poisson: -0.08724711090326309\r\n",
"\r",
"Epoch 0: 54%|▌| 412/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 412/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.606151580810547, Poisson: -0.09849604219198227\r\n",
"\r",
"Epoch 0: 54%|▌| 413/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 413/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.42496109008789, Poisson: -0.09279781579971313\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 414/766 [01:42<01:27, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 414/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.367816925048828, Poisson: -0.1075163185596466\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 415/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 54%|▌| 415/766 [01:42<01:27, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.783443450927734, Poisson: -0.10439547151327133\r\n",
"\r",
"Epoch 0: 54%|▌| 416/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 416/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.364540100097656, Poisson: -0.1071932464838028\r\n",
"\r",
"Epoch 0: 54%|▌| 417/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 54%|▌| 417/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.86842155456543, Poisson: -0.09013441950082779\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 418/766 [01:43<01:26, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 418/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.042606353759766, Poisson: -0.11599228531122208\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 419/766 [01:43<01:25, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 419/766 [01:43<01:26, 4.03it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.2088680267334, Poisson: -0.08152053505182266\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 420/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 55%|▌| 420/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.612220764160156, Poisson: -0.09851660579442978\r\n",
"\r",
"Epoch 0: 55%|▌| 421/766 [01:44<01:25, 4.04it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 421/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.886459350585938, Poisson: -0.11061673611402512\r\n",
"\r",
"Epoch 0: 55%|▌| 422/766 [01:44<01:25, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 422/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.761281967163086, Poisson: -0.10456900298595428\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 423/766 [01:44<01:24, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 423/766 [01:44<01:25, 4.03it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.932226181030273, Poisson: -0.11030212044715881\r\n",
"\r",
"Epoch 0: 55%|▌| 424/766 [01:44<01:24, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 424/766 [01:45<01:24, 4.03it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.940080642700195, Poisson: -0.09015578031539917\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 55%|▌| 425/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 55%|▌| 425/766 [01:45<01:24, 4.03it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.045801162719727, Poisson: -0.09577429294586182\r\n",
"\r",
"Epoch 0: 56%|▌| 426/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 426/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.947856903076172, Poisson: -0.09007912874221802\r\n",
"\r",
"Epoch 0: 56%|▌| 427/766 [01:45<01:23, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 427/766 [01:45<01:24, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.19378662109375, Poisson: -0.10146069526672363\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 428/766 [01:45<01:23, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 428/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.0323429107666, Poisson: -0.09605273604393005\r\n",
"\r",
"Epoch 0: 56%|▌| 429/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 429/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.192474365234375, Poisson: -0.10159146785736084\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 430/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 56%|▌| 430/766 [01:46<01:23, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.593473434448242, Poisson: -0.09863156080245972\r\n",
"\r",
"Epoch 0: 56%|▌| 431/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 431/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.461124420166016, Poisson: -0.11302480101585388\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 432/766 [01:46<01:22, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 56%|▌| 432/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.86369514465332, Poisson: -0.08997032791376114\r\n",
"\r",
"Epoch 0: 57%|▌| 433/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 433/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.45587158203125, Poisson: -0.11291443556547165\r\n",
"\r",
"Epoch 0: 57%|▌| 434/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 434/766 [01:47<01:22, 4.04it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.766098022460938, Poisson: -0.10475729405879974\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 435/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 57%|▌| 435/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.309078216552734, Poisson: -0.10760536789894104\r\n",
"\r",
"Epoch 0: 57%|▌| 436/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 436/766 [01:47<01:21, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.89189338684082, Poisson: -0.11007551848888397\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 437/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 437/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.434974670410156, Poisson: -0.11298343539237976\r\n",
"\r",
"Epoch 0: 57%|▌| 438/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 438/766 [01:48<01:21, 4.04it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.913671493530273, Poisson: -0.1100585088133812\r\n",
"\r",
"Epoch 0: 57%|▌| 439/766 [01:48<01:20, 4.04it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 439/766 [01:48<01:20, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.782474517822266, Poisson: -0.10438213497400284\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 57%|▌| 440/766 [01:48<01:20, 4.04it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 57%|▌| 440/766 [01:48<01:20, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02216339111328, Poisson: -0.09585492312908173\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 441/766 [01:49<01:20, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 441/766 [01:49<01:20, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.8275203704834, Poisson: -0.08999547362327576\r\n",
"\r",
"Epoch 0: 58%|▌| 442/766 [01:49<01:20, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 442/766 [01:49<01:20, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.660303115844727, Poisson: -0.09890352189540863\r\n",
"\r",
"Epoch 0: 58%|▌| 443/766 [01:49<01:19, 4.04it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 443/766 [01:49<01:19, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.554729461669922, Poisson: -0.09874079376459122\r\n",
"\r",
"Epoch 0: 58%|▌| 444/766 [01:49<01:19, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 444/766 [01:49<01:19, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.20897674560547, Poisson: -0.10159632563591003\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 445/766 [01:50<01:19, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 445/766 [01:50<01:19, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.04391860961914, Poisson: -0.09576905518770218\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 446/766 [01:50<01:19, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 446/766 [01:50<01:19, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.425464630126953, Poisson: -0.09294290095567703\r\n",
"\r",
"Epoch 0: 58%|▌| 447/766 [01:50<01:18, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 447/766 [01:50<01:18, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.152692794799805, Poisson: -0.10169364511966705\r\n",
"\r",
"Epoch 0: 58%|▌| 448/766 [01:50<01:18, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 58%|▌| 448/766 [01:50<01:18, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.199073791503906, Poisson: -0.10210412740707397\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 449/766 [01:50<01:18, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 449/766 [01:51<01:18, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.188983917236328, Poisson: -0.10162121802568436\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 450/766 [01:51<01:18, 4.04it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 59%|▌| 450/766 [01:51<01:18, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.88677215576172, Poisson: -0.10998804867267609\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 451/766 [01:51<01:17, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 451/766 [01:51<01:17, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.793415069580078, Poisson: -0.10432875901460648\r\n",
"\r",
"Epoch 0: 59%|▌| 452/766 [01:51<01:17, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 452/766 [01:51<01:17, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.90279769897461, Poisson: -0.11046546697616577\r\n",
"\r",
"Epoch 0: 59%|▌| 453/766 [01:51<01:17, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 453/766 [01:52<01:17, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.31927490234375, Poisson: -0.10741424560546875\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 454/766 [01:52<01:17, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 454/766 [01:52<01:17, 4.04it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.186891555786133, Poisson: -0.10152309387922287\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 59%|▌| 455/766 [01:52<01:16, 4.04it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 59%|▌| 455/766 [01:52<01:16, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.055156707763672, Poisson: -0.0958203449845314\r\n",
"\r",
"Epoch 0: 60%|▌| 456/766 [01:52<01:16, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 456/766 [01:52<01:16, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.19700813293457, Poisson: -0.1016567125916481\r\n",
"\r",
"Epoch 0: 60%|▌| 457/766 [01:52<01:16, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 457/766 [01:53<01:16, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.08208656311035, Poisson: -0.09579449892044067\r\n",
"\r",
"Epoch 0: 60%|▌| 458/766 [01:53<01:16, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 458/766 [01:53<01:16, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.43222427368164, Poisson: -0.09304789453744888\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 459/766 [01:53<01:15, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 459/766 [01:53<01:15, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.26169204711914, Poisson: -0.10192114859819412\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 460/766 [01:53<01:15, 4.04it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 60%|▌| 460/766 [01:53<01:15, 4.04it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.59422492980957, Poisson: -0.09882223606109619\r\n",
"\r",
"Epoch 0: 60%|▌| 461/766 [01:53<01:15, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 461/766 [01:53<01:15, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.98931312561035, Poisson: -0.09591254591941833\r\n",
"\r",
"Epoch 0: 60%|▌| 462/766 [01:54<01:15, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 462/766 [01:54<01:15, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.62569236755371, Poisson: -0.09887341409921646\r\n",
"\r",
"Epoch 0: 60%|▌| 463/766 [01:54<01:14, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 60%|▌| 463/766 [01:54<01:14, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.513959884643555, Poisson: -0.0929412916302681\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 464/766 [01:54<01:14, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 464/766 [01:54<01:14, 4.04it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.668134689331055, Poisson: -0.09903602302074432\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 465/766 [01:54<01:14, 4.04it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 61%|▌| 465/766 [01:54<01:14, 4.04it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.615325927734375, Poisson: -0.11899266391992569\r\n",
"\r",
"Epoch 0: 61%|▌| 466/766 [01:55<01:14, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 466/766 [01:55<01:14, 4.04it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.833663940429688, Poisson: -0.09003962576389313\r\n",
"\r",
"Epoch 0: 61%|▌| 467/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 467/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.29035758972168, Poisson: -0.10729990154504776\r\n",
"\r",
"Epoch 0: 61%|▌| 468/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 468/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.32347869873047, Poisson: -0.08711374551057816\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 469/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 469/766 [01:55<01:13, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.881690979003906, Poisson: -0.0901159793138504\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 470/766 [01:56<01:13, 4.05it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 61%|▌| 470/766 [01:56<01:13, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.574710845947266, Poisson: -0.09879805892705917\r\n",
"\r",
"Epoch 0: 61%|▌| 471/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 61%|▌| 471/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.176883697509766, Poisson: -0.1018044650554657\r\n",
"\r",
"Epoch 0: 62%|▌| 472/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 472/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.632938385009766, Poisson: -0.09889015555381775\r\n",
"\r",
"Epoch 0: 62%|▌| 473/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 473/766 [01:56<01:12, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.012971878051758, Poisson: -0.09589537978172302\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 474/766 [01:57<01:12, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 474/766 [01:57<01:12, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.88588523864746, Poisson: -0.09001684188842773\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 475/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 62%|▌| 475/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.06610679626465, Poisson: -0.0959547832608223\r\n",
"\r",
"Epoch 0: 62%|▌| 476/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 476/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.90119171142578, Poisson: -0.0900401920080185\r\n",
"\r",
"Epoch 0: 62%|▌| 477/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 477/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.705995559692383, Poisson: -0.10461120307445526\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 478/766 [01:57<01:11, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 62%|▌| 478/766 [01:58<01:11, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.32662010192871, Poisson: -0.10744621604681015\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 479/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 479/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.06502914428711, Poisson: -0.09612105786800385\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 480/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 63%|▋| 480/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.90214729309082, Poisson: -0.0900801345705986\r\n",
"\r",
"Epoch 0: 63%|▋| 481/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 481/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.344295501708984, Poisson: -0.10750877112150192\r\n",
"\r",
"Epoch 0: 63%|▋| 482/766 [01:58<01:10, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 482/766 [01:59<01:10, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.892011642456055, Poisson: -0.1104309931397438\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 483/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 483/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.33388328552246, Poisson: -0.0872010588645935\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 484/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 484/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.70537757873535, Poisson: -0.10474186390638351\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 485/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 63%|▋| 485/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.626005172729492, Poisson: -0.09885644912719727\r\n",
"\r",
"Epoch 0: 63%|▋| 486/766 [01:59<01:09, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 63%|▋| 486/766 [02:00<01:09, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.320764541625977, Poisson: -0.08730830252170563\r\n",
"\r",
"Epoch 0: 64%|▋| 487/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 487/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.204618453979492, Poisson: -0.10184206813573837\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 488/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 488/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.476268768310547, Poisson: -0.09295077621936798\r\n",
"\r",
"Epoch 0: 64%|▋| 489/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 489/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.75274658203125, Poisson: -0.10457751154899597\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 490/766 [02:00<01:08, 4.05it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 64%|▋| 490/766 [02:01<01:08, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.167760848999023, Poisson: -0.10183519124984741\r\n",
"\r",
"Epoch 0: 64%|▋| 491/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 491/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.753326416015625, Poisson: -0.10484641790390015\r\n",
"\r",
"Epoch 0: 64%|▋| 492/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 492/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.363622665405273, Poisson: -0.10764535516500473\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 493/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 493/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.632719039916992, Poisson: -0.0988030731678009\r\n",
"\r",
"Epoch 0: 64%|▋| 494/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 64%|▋| 494/766 [02:01<01:07, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.926715850830078, Poisson: -0.11051718145608902\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 495/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 65%|▋| 495/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.477825164794922, Poisson: -0.09308557212352753\r\n",
"\r",
"Epoch 0: 65%|▋| 496/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 496/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.22157096862793, Poisson: -0.10167962312698364\r\n",
"\r",
"Epoch 0: 65%|▋| 497/766 [02:02<01:06, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 497/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.785541534423828, Poisson: -0.10460682958364487\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 498/766 [02:02<01:06, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 498/766 [02:02<01:06, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.182655334472656, Poisson: -0.10170533508062363\r\n",
"\r",
"Epoch 0: 65%|▋| 499/766 [02:03<01:05, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 499/766 [02:03<01:05, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.012969970703125, Poisson: -0.09600904583930969\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 500/766 [02:03<01:05, 4.05it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 65%|▋| 500/766 [02:03<01:05, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.6324462890625, Poisson: -0.09878172725439072\r\n",
"\r",
"Epoch 0: 65%|▋| 501/766 [02:03<01:05, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 65%|▋| 501/766 [02:03<01:05, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.375688552856445, Poisson: -0.10759124159812927\r\n",
"\r",
"Epoch 0: 66%|▋| 502/766 [02:03<01:05, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 502/766 [02:03<01:05, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.72992706298828, Poisson: -0.10490220785140991\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 503/766 [02:04<01:04, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 503/766 [02:04<01:04, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.729093551635742, Poisson: -0.08433445543050766\r\n",
"\r",
"Epoch 0: 66%|▋| 504/766 [02:04<01:04, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 504/766 [02:04<01:04, 4.05it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.33883285522461, Poisson: -0.10752850025892258\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 505/766 [02:04<01:04, 4.05it/s, v_num=a0al, train_loss_step=17\r",
"Epoch 0: 66%|▋| 505/766 [02:04<01:04, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.393917083740234, Poisson: -0.09316592663526535\r\n",
"\r",
"Epoch 0: 66%|▋| 506/766 [02:04<01:04, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 506/766 [02:04<01:04, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.15519905090332, Poisson: -0.1016923263669014\r\n",
"\r",
"Epoch 0: 66%|▋| 507/766 [02:04<01:03, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 507/766 [02:05<01:03, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.47632598876953, Poisson: -0.09319717437028885\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 508/766 [02:05<01:03, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 508/766 [02:05<01:03, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.30472183227539, Poisson: -0.08737389743328094\r\n",
"\r",
"Epoch 0: 66%|▋| 509/766 [02:05<01:03, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 66%|▋| 509/766 [02:05<01:03, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.87850570678711, Poisson: -0.1103426143527031\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 510/766 [02:05<01:03, 4.05it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 67%|▋| 510/766 [02:05<01:03, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.92397689819336, Poisson: -0.09042582660913467\r\n",
"\r",
"Epoch 0: 67%|▋| 511/766 [02:05<01:02, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 511/766 [02:06<01:02, 4.05it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.052078247070312, Poisson: -0.09605080634355545\r\n",
"\r",
"Epoch 0: 67%|▋| 512/766 [02:06<01:02, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 512/766 [02:06<01:02, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.039587020874023, Poisson: -0.09586768597364426\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 513/766 [02:06<01:02, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 513/766 [02:06<01:02, 4.05it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.329599380493164, Poisson: -0.10767664760351181\r\n",
"\r",
"Epoch 0: 67%|▋| 514/766 [02:06<01:02, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 514/766 [02:06<01:02, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.762454986572266, Poisson: -0.10471668094396591\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 515/766 [02:07<01:01, 4.05it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 67%|▋| 515/766 [02:07<01:01, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.3719425201416, Poisson: -0.10770434141159058\r\n",
"\r",
"Epoch 0: 67%|▋| 516/766 [02:07<01:01, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 516/766 [02:07<01:01, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.777860641479492, Poisson: -0.10472965985536575\r\n",
"\r",
"Epoch 0: 67%|▋| 517/766 [02:07<01:01, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 67%|▋| 517/766 [02:07<01:01, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.621591567993164, Poisson: -0.09896160662174225\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 518/766 [02:07<01:01, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 518/766 [02:07<01:01, 4.05it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.540300369262695, Poisson: -0.07868669927120209\r\n",
"\r",
"Epoch 0: 68%|▋| 519/766 [02:07<01:00, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 519/766 [02:08<01:00, 4.05it/s, v_num=a0al, train_loss_step=16"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.385284423828125, Poisson: -0.10761824995279312\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 520/766 [02:08<01:00, 4.05it/s, v_num=a0al, train_loss_step=16\r",
"Epoch 0: 68%|▋| 520/766 [02:08<01:00, 4.05it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.501117706298828, Poisson: -0.11348327249288559\r\n",
"\r",
"Epoch 0: 68%|▋| 521/766 [02:08<01:00, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 521/766 [02:08<01:00, 4.05it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.739992141723633, Poisson: -0.10472604632377625\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 522/766 [02:08<01:00, 4.06it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 522/766 [02:08<01:00, 4.05it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.185258865356445, Poisson: -0.10173660516738892\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 523/766 [02:08<00:59, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 523/766 [02:08<00:59, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.91326332092285, Poisson: -0.0903579592704773\r\n",
"\r",
"Epoch 0: 68%|▋| 524/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 68%|▋| 524/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.332609176635742, Poisson: -0.10764188319444656\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 525/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 69%|▋| 525/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.16383171081543, Poisson: -0.10190869122743607\r\n",
"\r",
"Epoch 0: 69%|▋| 526/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 526/766 [02:09<00:59, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.098201751708984, Poisson: -0.10189001262187958\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 527/766 [02:09<00:58, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 527/766 [02:09<00:58, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.59515380859375, Poisson: -0.09890273213386536\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 528/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 528/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.494415283203125, Poisson: -0.09319400042295456\r\n",
"\r",
"Epoch 0: 69%|▋| 529/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 529/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.47793197631836, Poisson: -0.09325416386127472\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 530/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 69%|▋| 530/766 [02:10<00:58, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.84056282043457, Poisson: -0.09030847996473312\r\n",
"\r",
"Epoch 0: 69%|▋| 531/766 [02:10<00:57, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 531/766 [02:10<00:57, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.482717514038086, Poisson: -0.09316873550415039\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 532/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 69%|▋| 532/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.619293212890625, Poisson: -0.09898217767477036\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 533/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 533/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.45407485961914, Poisson: -0.11366017907857895\r\n",
"\r",
"Epoch 0: 70%|▋| 534/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 534/766 [02:11<00:57, 4.06it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.06269645690918, Poisson: -0.09609104692935944\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 535/766 [02:11<00:56, 4.06it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 70%|▋| 535/766 [02:11<00:56, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.31659698486328, Poisson: -0.08762583136558533\r\n",
"\r",
"Epoch 0: 70%|▋| 536/766 [02:11<00:56, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 536/766 [02:12<00:56, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.4761962890625, Poisson: -0.09314439445734024\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 537/766 [02:12<00:56, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 537/766 [02:12<00:56, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.999032974243164, Poisson: -0.11631442606449127\r\n",
"\r",
"Epoch 0: 70%|▋| 538/766 [02:12<00:56, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 538/766 [02:12<00:56, 4.06it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.194652557373047, Poisson: -0.10190042108297348\r\n",
"\r",
"Epoch 0: 70%|▋| 539/766 [02:12<00:55, 4.06it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 539/766 [02:12<00:55, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.30335235595703, Poisson: -0.08749409019947052\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 70%|▋| 540/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 70%|▋| 540/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.20626449584961, Poisson: -0.10183284431695938\r\n",
"\r",
"Epoch 0: 71%|▋| 541/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 541/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.777469635009766, Poisson: -0.10488604754209518\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 542/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 542/766 [02:13<00:55, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.33306121826172, Poisson: -0.10774660110473633\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 543/766 [02:13<00:54, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 543/766 [02:13<00:54, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.98891830444336, Poisson: -0.09613889455795288\r\n",
"\r",
"Epoch 0: 71%|▋| 544/766 [02:13<00:54, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 544/766 [02:14<00:54, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.997255325317383, Poisson: -0.09609785676002502\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 545/766 [02:14<00:54, 4.06it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 71%|▋| 545/766 [02:14<00:54, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.197553634643555, Poisson: -0.10187359899282455\r\n",
"\r",
"Epoch 0: 71%|▋| 546/766 [02:14<00:54, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 546/766 [02:14<00:54, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.859487533569336, Poisson: -0.11060630530118942\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 547/766 [02:14<00:53, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 71%|▋| 547/766 [02:14<00:53, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.890262603759766, Poisson: -0.11062599718570709\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 548/766 [02:14<00:53, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 548/766 [02:15<00:53, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.20583724975586, Poisson: -0.10201893746852875\r\n",
"\r",
"Epoch 0: 72%|▋| 549/766 [02:15<00:53, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 549/766 [02:15<00:53, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.77797508239746, Poisson: -0.10483971983194351\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 550/766 [02:15<00:53, 4.06it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 72%|▋| 550/766 [02:15<00:53, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.097412109375, Poisson: -0.09617631137371063\r\n",
"\r",
"Epoch 0: 72%|▋| 551/766 [02:15<00:52, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 551/766 [02:15<00:52, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.389467239379883, Poisson: -0.10784398019313812\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 552/766 [02:15<00:52, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 552/766 [02:15<00:52, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.91577911376953, Poisson: -0.11067967116832733\r\n",
"\r",
"Epoch 0: 72%|▋| 553/766 [02:16<00:52, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 553/766 [02:16<00:52, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.879379272460938, Poisson: -0.11067167669534683\r\n",
"\r",
"Epoch 0: 72%|▋| 554/766 [02:16<00:52, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 554/766 [02:16<00:52, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.61037254333496, Poisson: -0.09901925921440125\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 72%|▋| 555/766 [02:16<00:51, 4.06it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 72%|▋| 555/766 [02:16<00:51, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.037792205810547, Poisson: -0.09608427435159683\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 556/766 [02:16<00:51, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 556/766 [02:16<00:51, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.957345962524414, Poisson: -0.09041019529104233\r\n",
"\r",
"Epoch 0: 73%|▋| 557/766 [02:17<00:51, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 557/766 [02:17<00:51, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.77267837524414, Poisson: -0.0846007689833641\r\n",
"\r",
"Epoch 0: 73%|▋| 558/766 [02:17<00:51, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 558/766 [02:17<00:51, 4.06it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.03352165222168, Poisson: -0.09616824984550476\r\n",
"\r",
"Epoch 0: 73%|▋| 559/766 [02:17<00:50, 4.06it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 559/766 [02:17<00:50, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.01946449279785, Poisson: -0.09619680792093277\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 560/766 [02:17<00:50, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 560/766 [02:17<00:50, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.306650161743164, Poisson: -0.08740312606096268\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 561/766 [02:18<00:50, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 561/766 [02:18<00:50, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.354135513305664, Poisson: -0.10777858644723892\r\n",
"\r",
"Epoch 0: 73%|▋| 562/766 [02:18<00:50, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 562/766 [02:18<00:50, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.208993911743164, Poisson: -0.10200832784175873\r\n",
"\r",
"Epoch 0: 73%|▋| 563/766 [02:18<00:49, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 73%|▋| 563/766 [02:18<00:49, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.653942108154297, Poisson: -0.09901890903711319\r\n",
"\r",
"Epoch 0: 74%|▋| 564/766 [02:18<00:49, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 564/766 [02:18<00:49, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.73474884033203, Poisson: -0.10498061031103134\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 565/766 [02:19<00:49, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 565/766 [02:19<00:49, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.364513397216797, Poisson: -0.10782323777675629\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 566/766 [02:19<00:49, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 566/766 [02:19<00:49, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02947425842285, Poisson: -0.09621088206768036\r\n",
"\r",
"Epoch 0: 74%|▋| 567/766 [02:19<00:48, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 567/766 [02:19<00:48, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.660968780517578, Poisson: -0.0989837497472763\r\n",
"\r",
"Epoch 0: 74%|▋| 568/766 [02:19<00:48, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 568/766 [02:19<00:48, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.195329666137695, Poisson: -0.10204056650400162\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 569/766 [02:19<00:48, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 569/766 [02:20<00:48, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.745546340942383, Poisson: -0.1049041748046875\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 74%|▋| 570/766 [02:20<00:48, 4.06it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 74%|▋| 570/766 [02:20<00:48, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.060508728027344, Poisson: -0.09597116708755493\r\n",
"\r",
"Epoch 0: 75%|▋| 571/766 [02:20<00:47, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 571/766 [02:20<00:48, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.203964233398438, Poisson: -0.10202384740114212\r\n",
"\r",
"Epoch 0: 75%|▋| 572/766 [02:20<00:47, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 572/766 [02:20<00:47, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.170516967773438, Poisson: -0.12212321162223816\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 573/766 [02:20<00:47, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 573/766 [02:21<00:47, 4.06it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.194564819335938, Poisson: -0.10183016955852509\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 574/766 [02:21<00:47, 4.07it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▋| 574/766 [02:21<00:47, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.293495178222656, Poisson: -0.10767456889152527\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▊| 575/766 [02:21<00:47, 4.06it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 75%|▊| 575/766 [02:21<00:47, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.32612419128418, Poisson: -0.10786134749650955\r\n",
"\r",
"Epoch 0: 75%|▊| 576/766 [02:21<00:46, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▊| 576/766 [02:21<00:46, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.417272567749023, Poisson: -0.09323275834321976\r\n",
"\r",
"Epoch 0: 75%|▊| 577/766 [02:21<00:46, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▊| 577/766 [02:22<00:46, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.920013427734375, Poisson: -0.11068196594715118\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▊| 578/766 [02:22<00:46, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 75%|▊| 578/766 [02:22<00:46, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.92017936706543, Poisson: -0.11057084798812866\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 579/766 [02:22<00:45, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 579/766 [02:22<00:46, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.585636138916016, Poisson: -0.09909996390342712\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 580/766 [02:22<00:45, 4.06it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 76%|▊| 580/766 [02:22<00:45, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.35625648498535, Poisson: -0.08748295158147812\r\n",
"\r",
"Epoch 0: 76%|▊| 581/766 [02:22<00:45, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 581/766 [02:22<00:45, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.28192710876465, Poisson: -0.08736550807952881\r\n",
"\r",
"Epoch 0: 76%|▊| 582/766 [02:23<00:45, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 582/766 [02:23<00:45, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.358203887939453, Poisson: -0.08747058361768723\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 583/766 [02:23<00:44, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 583/766 [02:23<00:45, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.040454864501953, Poisson: -0.09612678736448288\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 584/766 [02:23<00:44, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 584/766 [02:23<00:44, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.636125564575195, Poisson: -0.09899233281612396\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 76%|▊| 585/766 [02:23<00:44, 4.06it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 76%|▊| 585/766 [02:23<00:44, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.294565200805664, Poisson: -0.10777721554040909\r\n",
"\r",
"Epoch 0: 77%|▊| 586/766 [02:24<00:44, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 586/766 [02:24<00:44, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.7929630279541, Poisson: -0.10491427034139633\r\n",
"\r",
"Epoch 0: 77%|▊| 587/766 [02:24<00:44, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 587/766 [02:24<00:44, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.575651168823242, Poisson: -0.09907319396734238\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 588/766 [02:24<00:43, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 588/766 [02:24<00:43, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.472824096679688, Poisson: -0.09318910539150238\r\n",
"\r",
"Epoch 0: 77%|▊| 589/766 [02:24<00:43, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 589/766 [02:24<00:43, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.570619583129883, Poisson: -0.09897415339946747\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 590/766 [02:25<00:43, 4.06it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 77%|▊| 590/766 [02:25<00:43, 4.06it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.23089027404785, Poisson: -0.10216553509235382\r\n",
"\r",
"Epoch 0: 77%|▊| 591/766 [02:25<00:43, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 591/766 [02:25<00:43, 4.06it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.470623016357422, Poisson: -0.09319072216749191\r\n",
"\r",
"Epoch 0: 77%|▊| 592/766 [02:25<00:42, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 592/766 [02:25<00:42, 4.06it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.3203067779541, Poisson: -0.10772398114204407\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 593/766 [02:25<00:42, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 77%|▊| 593/766 [02:25<00:42, 4.06it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.481698989868164, Poisson: -0.11348134279251099\r\n",
"\r",
"Epoch 0: 78%|▊| 594/766 [02:25<00:42, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 594/766 [02:26<00:42, 4.06it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.397077560424805, Poisson: -0.0875048041343689\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 595/766 [02:26<00:42, 4.07it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 78%|▊| 595/766 [02:26<00:42, 4.06it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.281776428222656, Poisson: -0.08735102415084839\r\n",
"\r",
"Epoch 0: 78%|▊| 596/766 [02:26<00:41, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 596/766 [02:26<00:41, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.900833129882812, Poisson: -0.09037599712610245\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 597/766 [02:26<00:41, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 597/766 [02:26<00:41, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.041349411010742, Poisson: -0.09603632241487503\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 598/766 [02:26<00:41, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 598/766 [02:27<00:41, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.15169906616211, Poisson: -0.10192982852458954\r\n",
"\r",
"Epoch 0: 78%|▊| 599/766 [02:27<00:41, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 599/766 [02:27<00:41, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.998565673828125, Poisson: -0.09607094526290894\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 600/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 78%|▊| 600/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.630062103271484, Poisson: -0.09916112571954727\r\n",
"\r",
"Epoch 0: 78%|▊| 601/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 78%|▊| 601/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.03969955444336, Poisson: -0.11638204008340836\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 602/766 [02:27<00:40, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 602/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.574071884155273, Poisson: -0.09899208694696426\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 603/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 603/766 [02:28<00:40, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.062055587768555, Poisson: -0.09617772698402405\r\n",
"\r",
"Epoch 0: 79%|▊| 604/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 604/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.900306701660156, Poisson: -0.11064215004444122\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 605/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 79%|▊| 605/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.45872688293457, Poisson: -0.0931672751903534\r\n",
"\r",
"Epoch 0: 79%|▊| 606/766 [02:28<00:39, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 606/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.61547088623047, Poisson: -0.09892795234918594\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 607/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 607/766 [02:29<00:39, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.748783111572266, Poisson: -0.1048136055469513\r\n",
"\r",
"Epoch 0: 79%|▊| 608/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 79%|▊| 608/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.059423446655273, Poisson: -0.0961209386587143\r\n",
"\r",
"Epoch 0: 80%|▊| 609/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 609/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.010229110717773, Poisson: -0.09607797861099243\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 610/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 80%|▊| 610/766 [02:29<00:38, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.505292892456055, Poisson: -0.11350621283054352\r\n",
"\r",
"Epoch 0: 80%|▊| 611/766 [02:30<00:38, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 611/766 [02:30<00:38, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.647974014282227, Poisson: -0.09891486167907715\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 612/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 612/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.9503173828125, Poisson: -0.11074764281511307\r\n",
"\r",
"Epoch 0: 80%|▊| 613/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 613/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.505077362060547, Poisson: -0.09333376586437225\r\n",
"\r",
"Epoch 0: 80%|▊| 614/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 614/766 [02:30<00:37, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.72488021850586, Poisson: -0.10470889508724213\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 615/766 [02:31<00:37, 4.07it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 80%|▊| 615/766 [02:31<00:37, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.20842170715332, Poisson: -0.12234365195035934\r\n",
"\r",
"Epoch 0: 80%|▊| 616/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 80%|▊| 616/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.443464279174805, Poisson: -0.09326735883951187\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 617/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 617/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.06236457824707, Poisson: -0.09638020396232605\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 618/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 618/766 [02:31<00:36, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.628524780273438, Poisson: -0.0990341380238533\r\n",
"\r",
"Epoch 0: 81%|▊| 619/766 [02:32<00:36, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 619/766 [02:32<00:36, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.54349136352539, Poisson: -0.09893681108951569\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 620/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 81%|▊| 620/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.07408332824707, Poisson: -0.09616923332214355\r\n",
"\r",
"Epoch 0: 81%|▊| 621/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 621/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.465482711791992, Poisson: -0.11332377791404724\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 622/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 622/766 [02:32<00:35, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.919734954833984, Poisson: -0.11065004765987396\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 623/766 [02:33<00:35, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 623/766 [02:33<00:35, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.591352462768555, Poisson: -0.09888836741447449\r\n",
"\r",
"Epoch 0: 81%|▊| 624/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 81%|▊| 624/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.751014709472656, Poisson: -0.10491835325956345\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 625/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 82%|▊| 625/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.052703857421875, Poisson: -0.09618022292852402\r\n",
"\r",
"Epoch 0: 82%|▊| 626/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 626/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.039308547973633, Poisson: -0.11643020063638687\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 627/766 [02:33<00:34, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 627/766 [02:34<00:34, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.47958755493164, Poisson: -0.09322299808263779\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 628/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 628/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.07942771911621, Poisson: -0.09612985700368881\r\n",
"\r",
"Epoch 0: 82%|▊| 629/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 629/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.62293815612793, Poisson: -0.09899833798408508\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 630/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 82%|▊| 630/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.180599212646484, Poisson: -0.10211139917373657\r\n",
"\r",
"Epoch 0: 82%|▊| 631/766 [02:34<00:33, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 82%|▊| 631/766 [02:35<00:33, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.43415069580078, Poisson: -0.09314945340156555\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 632/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 632/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.04065704345703, Poisson: -0.0961098000407219\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 633/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 633/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.605737686157227, Poisson: -0.09911376237869263\r\n",
"\r",
"Epoch 0: 83%|▊| 634/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 634/766 [02:35<00:32, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.07790756225586, Poisson: -0.09614162147045135\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 635/766 [02:36<00:32, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 83%|▊| 635/766 [02:36<00:32, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.729204177856445, Poisson: -0.10484756529331207\r\n",
"\r",
"Epoch 0: 83%|▊| 636/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 636/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.351215362548828, Poisson: -0.08738947659730911\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 637/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 637/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.743696212768555, Poisson: -0.1048312559723854\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 638/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 638/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.89417266845703, Poisson: -0.11054882407188416\r\n",
"\r",
"Epoch 0: 83%|▊| 639/766 [02:36<00:31, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 83%|▊| 639/766 [02:37<00:31, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.442575454711914, Poisson: -0.09320535510778427\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 640/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 84%|▊| 640/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.3289852142334, Poisson: -0.08738420903682709\r\n",
"\r",
"Epoch 0: 84%|▊| 641/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 641/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.37784194946289, Poisson: -0.09329848736524582\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 642/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 642/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.74251365661621, Poisson: -0.10472816228866577\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 643/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 643/766 [02:37<00:30, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.053268432617188, Poisson: -0.09602104127407074\r\n",
"\r",
"Epoch 0: 84%|▊| 644/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 644/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.585439682006836, Poisson: -0.0989774540066719\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 645/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 84%|▊| 645/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.32609748840332, Poisson: -0.10777314007282257\r\n",
"\r",
"Epoch 0: 84%|▊| 646/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 646/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.32042694091797, Poisson: -0.10790190100669861\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 647/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 84%|▊| 647/766 [02:38<00:29, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.8642578125, Poisson: -0.09032086282968521\r\n",
"\r",
"Epoch 0: 85%|▊| 648/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 648/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.9396915435791, Poisson: -0.09038145840167999\r\n",
"\r",
"Epoch 0: 85%|▊| 649/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 649/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.03581428527832, Poisson: -0.11642525345087051\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 650/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 85%|▊| 650/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.04706573486328, Poisson: -0.11630402505397797\r\n",
"\r",
"Epoch 0: 85%|▊| 651/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 651/766 [02:39<00:28, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.720582962036133, Poisson: -0.08459357917308807\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 652/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 652/766 [02:40<00:28, 4.07it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.181501388549805, Poisson: -0.0816231518983841\r\n",
"\r",
"Epoch 0: 85%|▊| 653/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 653/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.45564842224121, Poisson: -0.11357060074806213\r\n",
"\r",
"Epoch 0: 85%|▊| 654/766 [02:40<00:27, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 85%|▊| 654/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.209218978881836, Poisson: -0.10190658271312714\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 655/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=23\r",
"Epoch 0: 86%|▊| 655/766 [02:40<00:27, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.60980224609375, Poisson: -0.09892372041940689\r\n",
"\r",
"Epoch 0: 86%|▊| 656/766 [02:40<00:26, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 656/766 [02:41<00:27, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.1964054107666, Poisson: -0.10195201635360718\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 657/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 657/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.74530792236328, Poisson: -0.10479382425546646\r\n",
"\r",
"Epoch 0: 86%|▊| 658/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 658/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.213241577148438, Poisson: -0.10197531431913376\r\n",
"\r",
"Epoch 0: 86%|▊| 659/766 [02:41<00:26, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 659/766 [02:41<00:26, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.750818252563477, Poisson: -0.10478349030017853\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 660/766 [02:42<00:26, 4.07it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 86%|▊| 660/766 [02:42<00:26, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.8967227935791, Poisson: -0.11057905852794647\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 661/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 661/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.184062957763672, Poisson: -0.10184145718812943\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 662/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 86%|▊| 662/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.182111740112305, Poisson: -0.08155201375484467\r\n",
"\r",
"Epoch 0: 87%|▊| 663/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 663/766 [02:42<00:25, 4.07it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.304018020629883, Poisson: -0.10763479024171829\r\n",
"\r",
"Epoch 0: 87%|▊| 664/766 [02:42<00:25, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 664/766 [02:43<00:25, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.771034240722656, Poisson: -0.1047348603606224\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 665/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 87%|▊| 665/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.2939395904541, Poisson: -0.0873831957578659\r\n",
"\r",
"Epoch 0: 87%|▊| 666/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 666/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.618026733398438, Poisson: -0.09891299903392792\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 667/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 667/766 [02:43<00:24, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.608810424804688, Poisson: -0.09906212240457535\r\n",
"\r",
"Epoch 0: 87%|▊| 668/766 [02:43<00:24, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 668/766 [02:44<00:24, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.322099685668945, Poisson: -0.10757517069578171\r\n",
"\r",
"Epoch 0: 87%|▊| 669/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 669/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.787355422973633, Poisson: -0.1049061119556427\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 87%|▊| 670/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 87%|▊| 670/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.766332626342773, Poisson: -0.10484164953231812\r\n",
"\r",
"Epoch 0: 88%|▉| 671/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 671/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.71084976196289, Poisson: -0.10478072613477707\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 672/766 [02:44<00:23, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 672/766 [02:44<00:23, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.059356689453125, Poisson: -0.11647970974445343\r\n",
"\r",
"Epoch 0: 88%|▉| 673/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 673/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.204574584960938, Poisson: -0.10201311111450195\r\n",
"\r",
"Epoch 0: 88%|▉| 674/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 674/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.584774017333984, Poisson: -0.1193760558962822\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 675/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 88%|▉| 675/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.663442611694336, Poisson: -0.09896506369113922\r\n",
"\r",
"Epoch 0: 88%|▉| 676/766 [02:45<00:22, 4.08it/s, v_num=a0al, train_loss_step=24"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 676/766 [02:45<00:22, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.19459342956543, Poisson: -0.1019076555967331\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 677/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 88%|▉| 677/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.178815841674805, Poisson: -0.08161477744579315\r\n",
"\r",
"Epoch 0: 89%|▉| 678/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 678/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.783733367919922, Poisson: -0.10473403334617615\r\n",
"\r",
"Epoch 0: 89%|▉| 679/766 [02:46<00:21, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 679/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.467281341552734, Poisson: -0.11346116662025452\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 680/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 89%|▉| 680/766 [02:46<00:21, 4.07it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.04427146911621, Poisson: -0.09604737907648087\r\n",
"\r",
"Epoch 0: 89%|▉| 681/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 681/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.142990112304688, Poisson: -0.101886086165905\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 682/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 682/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.709651947021484, Poisson: -0.1048181876540184\r\n",
"\r",
"Epoch 0: 89%|▉| 683/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 683/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.052165985107422, Poisson: -0.09610036015510559\r\n",
"\r",
"Epoch 0: 89%|▉| 684/766 [02:47<00:20, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 684/766 [02:47<00:20, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 15.995346069335938, Poisson: -0.07582859694957733\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 89%|▉| 685/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 89%|▉| 685/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=15"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.30655288696289, Poisson: -0.08745232969522476\r\n",
"\r",
"Epoch 0: 90%|▉| 686/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=15"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 686/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.636791229248047, Poisson: -0.09912166744470596\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 687/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 687/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.205366134643555, Poisson: -0.10186842828989029\r\n",
"\r",
"Epoch 0: 90%|▉| 688/766 [02:48<00:19, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 688/766 [02:48<00:19, 4.07it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.31546401977539, Poisson: -0.10762669146060944\r\n",
"\r",
"Epoch 0: 90%|▉| 689/766 [02:48<00:18, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 689/766 [02:49<00:18, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.30607795715332, Poisson: -0.10776594281196594\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 690/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 90%|▉| 690/766 [02:49<00:18, 4.07it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.707111358642578, Poisson: -0.08455497026443481\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 691/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 691/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.056251525878906, Poisson: -0.09613558650016785\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 692/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 692/766 [02:49<00:18, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.71844482421875, Poisson: -0.0844632163643837\r\n",
"\r",
"Epoch 0: 90%|▉| 693/766 [02:49<00:17, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 90%|▉| 693/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.011394500732422, Poisson: -0.09618280827999115\r\n",
"\r",
"Epoch 0: 91%|▉| 694/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 694/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.220170974731445, Poisson: -0.1019996628165245\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 695/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 695/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.046058654785156, Poisson: -0.09611000120639801\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 696/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 696/766 [02:50<00:17, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.776384353637695, Poisson: -0.08445380628108978\r\n",
"\r",
"Epoch 0: 91%|▉| 697/766 [02:50<00:16, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 697/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02248191833496, Poisson: -0.0961388349533081\r\n",
"\r",
"Epoch 0: 91%|▉| 698/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 698/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.788034439086914, Poisson: -0.10468914359807968\r\n",
"\r",
"Epoch 0: 91%|▉| 699/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 699/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.033443450927734, Poisson: -0.09614822268486023\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 91%|▉| 700/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=21\r",
"Epoch 0: 91%|▉| 700/766 [02:51<00:16, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.43999671936035, Poisson: -0.0931275263428688\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 701/766 [02:51<00:15, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 701/766 [02:51<00:15, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02468490600586, Poisson: -0.09612933546304703\r\n",
"\r",
"Epoch 0: 92%|▉| 702/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 702/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.720537185668945, Poisson: -0.08454470336437225\r\n",
"\r",
"Epoch 0: 92%|▉| 703/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 703/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.856117248535156, Poisson: -0.11041641235351562\r\n",
"\r",
"Epoch 0: 92%|▉| 704/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 704/766 [02:52<00:15, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.749130249023438, Poisson: -0.10481631755828857\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 705/766 [02:52<00:14, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 705/766 [02:52<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.150577545166016, Poisson: -0.10191968083381653\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 706/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 706/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.73512077331543, Poisson: -0.10484417527914047\r\n",
"\r",
"Epoch 0: 92%|▉| 707/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 707/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.735193252563477, Poisson: -0.08444802463054657\r\n",
"\r",
"Epoch 0: 92%|▉| 708/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 92%|▉| 708/766 [02:53<00:14, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.171815872192383, Poisson: -0.10175672918558121\r\n",
"\r",
"Epoch 0: 93%|▉| 709/766 [02:53<00:13, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 709/766 [02:53<00:13, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.140235900878906, Poisson: -0.10202343016862869\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 710/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 710/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.266754150390625, Poisson: -0.08740904927253723\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 711/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 711/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.7435245513916, Poisson: -0.1047716736793518\r\n",
"\r",
"Epoch 0: 93%|▉| 712/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 712/766 [02:54<00:13, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.33485221862793, Poisson: -0.10773944854736328\r\n",
"\r",
"Epoch 0: 93%|▉| 713/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 713/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.449464797973633, Poisson: -0.11353325098752975\r\n",
"\r",
"Epoch 0: 93%|▉| 714/766 [02:54<00:12, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 714/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.8591365814209, Poisson: -0.09025004506111145\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 715/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 715/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.2924861907959, Poisson: -0.08760888129472733\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 716/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 93%|▉| 716/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.450326919555664, Poisson: -0.0932532325387001\r\n",
"\r",
"Epoch 0: 94%|▉| 717/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 717/766 [02:55<00:12, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.754545211791992, Poisson: -0.10489057004451752\r\n",
"\r",
"Epoch 0: 94%|▉| 718/766 [02:55<00:11, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 718/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.621562957763672, Poisson: -0.09915497153997421\r\n",
"\r",
"Epoch 0: 94%|▉| 719/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 719/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.644670486450195, Poisson: -0.09900769591331482\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 720/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 720/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.482215881347656, Poisson: -0.09339278936386108\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 721/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 721/766 [02:56<00:11, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.218652725219727, Poisson: -0.10194966197013855\r\n",
"\r",
"Epoch 0: 94%|▉| 722/766 [02:56<00:10, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 722/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.85138511657715, Poisson: -0.09043137729167938\r\n",
"\r",
"Epoch 0: 94%|▉| 723/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 94%|▉| 723/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.170902252197266, Poisson: -0.10202258080244064\r\n",
"\r",
"Epoch 0: 95%|▉| 724/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 724/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.220476150512695, Poisson: -0.10178679972887039\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 725/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 725/766 [02:57<00:10, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.098186492919922, Poisson: -0.09609003365039825\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 726/766 [02:57<00:09, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 726/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.026958465576172, Poisson: -0.09608585387468338\r\n",
"\r",
"Epoch 0: 95%|▉| 727/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 727/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.762968063354492, Poisson: -0.10483455657958984\r\n",
"\r",
"Epoch 0: 95%|▉| 728/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 728/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.0172061920166, Poisson: -0.0962311178445816\r\n",
"\r",
"Epoch 0: 95%|▉| 729/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 729/766 [02:58<00:09, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.594770431518555, Poisson: -0.09916572272777557\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 730/766 [02:58<00:08, 4.08it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 95%|▉| 730/766 [02:58<00:08, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.910221099853516, Poisson: -0.09029804170131683\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 731/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 95%|▉| 731/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.605316162109375, Poisson: -0.09899063408374786\r\n",
"\r",
"Epoch 0: 96%|▉| 732/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 732/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.684709548950195, Poisson: -0.08461694419384003\r\n",
"\r",
"Epoch 0: 96%|▉| 733/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 733/766 [02:59<00:08, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.47001838684082, Poisson: -0.09328693896532059\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 734/766 [02:59<00:07, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 734/766 [02:59<00:07, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.442007064819336, Poisson: -0.11350321769714355\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 735/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 96%|▉| 735/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.5697021484375, Poisson: -0.0991271361708641\r\n",
"\r",
"Epoch 0: 96%|▉| 736/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 736/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.006288528442383, Poisson: -0.09610603004693985\r\n",
"\r",
"Epoch 0: 96%|▉| 737/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 737/766 [03:00<00:07, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.479318618774414, Poisson: -0.11350422352552414\r\n",
"\r",
"Epoch 0: 96%|▉| 738/766 [03:00<00:06, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 738/766 [03:00<00:06, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.866106033325195, Poisson: -0.09043769538402557\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 739/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 96%|▉| 739/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.89381980895996, Poisson: -0.11064435541629791\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 740/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=18\r",
"Epoch 0: 97%|▉| 740/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.364917755126953, Poisson: -0.0875319167971611\r\n",
"\r",
"Epoch 0: 97%|▉| 741/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 741/766 [03:01<00:06, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.492168426513672, Poisson: -0.09326649457216263\r\n",
"\r",
"Epoch 0: 97%|▉| 742/766 [03:01<00:05, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 742/766 [03:01<00:05, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.37800407409668, Poisson: -0.08742114901542664\r\n",
"\r",
"Epoch 0: 97%|▉| 743/766 [03:01<00:05, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 743/766 [03:02<00:05, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.02222442626953, Poisson: -0.09628087282180786\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 744/766 [03:02<00:05, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 744/766 [03:02<00:05, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.798154830932617, Poisson: -0.08459708094596863\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 745/766 [03:02<00:05, 4.08it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 97%|▉| 745/766 [03:02<00:05, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.28070068359375, Poisson: -0.10781168937683105\r\n",
"\r",
"Epoch 0: 97%|▉| 746/766 [03:02<00:04, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 97%|▉| 746/766 [03:02<00:04, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.335309982299805, Poisson: -0.10783626139163971\r\n",
"\r",
"Epoch 0: 98%|▉| 747/766 [03:02<00:04, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 747/766 [03:03<00:04, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.987077713012695, Poisson: -0.09613754600286484\r\n",
"\r",
"Epoch 0: 98%|▉| 748/766 [03:03<00:04, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 748/766 [03:03<00:04, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.21087074279785, Poisson: -0.08178546279668808\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 749/766 [03:03<00:04, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 749/766 [03:03<00:04, 4.08it/s, v_num=a0al, train_loss_step=17"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.066789627075195, Poisson: -0.09616988152265549\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 750/766 [03:03<00:03, 4.08it/s, v_num=a0al, train_loss_step=17\r",
"Epoch 0: 98%|▉| 750/766 [03:03<00:03, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.929866790771484, Poisson: -0.11071749031543732\r\n",
"\r",
"Epoch 0: 98%|▉| 751/766 [03:03<00:03, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 751/766 [03:04<00:03, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.374528884887695, Poisson: -0.08747097849845886\r\n",
"\r",
"Epoch 0: 98%|▉| 752/766 [03:04<00:03, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 752/766 [03:04<00:03, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.452871322631836, Poisson: -0.11337430030107498\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 753/766 [03:04<00:03, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 753/766 [03:04<00:03, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.317773818969727, Poisson: -0.10782989114522934\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 754/766 [03:04<00:02, 4.08it/s, v_num=a0al, train_loss_step=23"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 98%|▉| 754/766 [03:04<00:02, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.17522621154785, Poisson: -0.10185651481151581\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 755/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 99%|▉| 755/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.87299156188965, Poisson: -0.09019719064235687\r\n",
"\r",
"Epoch 0: 99%|▉| 756/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 756/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.004302978515625, Poisson: -0.09615175426006317\r\n",
"\r",
"Epoch 0: 99%|▉| 757/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=18"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 757/766 [03:05<00:02, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.4959716796875, Poisson: -0.09322664886713028\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 758/766 [03:05<00:01, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 758/766 [03:05<00:01, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.062843322753906, Poisson: -0.09629540145397186\r\n",
"\r",
"Epoch 0: 99%|▉| 759/766 [03:05<00:01, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 759/766 [03:05<00:01, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.584623336791992, Poisson: -0.09911467134952545\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 760/766 [03:06<00:01, 4.08it/s, v_num=a0al, train_loss_step=20\r",
"Epoch 0: 99%|▉| 760/766 [03:06<00:01, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18416404724121, Poisson: -0.10187393426895142\r\n",
"\r",
"Epoch 0: 99%|▉| 761/766 [03:06<00:01, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 761/766 [03:06<00:01, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.091028213500977, Poisson: -0.0962648093700409\r\n",
"\r",
"Epoch 0: 99%|▉| 762/766 [03:06<00:00, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 99%|▉| 762/766 [03:06<00:00, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.143661499023438, Poisson: -0.10190434008836746\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 100%|▉| 763/766 [03:06<00:00, 4.08it/s, v_num=a0al, train_loss_step=20"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 100%|▉| 763/766 [03:06<00:00, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.87443733215332, Poisson: -0.11070720106363297\r\n",
"\r",
"Epoch 0: 100%|▉| 764/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=21"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 100%|▉| 764/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=22"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 25.80946159362793, Poisson: -0.12509529292583466\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 100%|▉| 765/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=22\r",
"Epoch 0: 100%|▉| 765/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=25"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.509836196899414, Poisson: -0.09323473274707794\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Epoch 0: 100%|█| 766/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=25\r",
"Epoch 0: 100%|█| 766/766 [03:07<00:00, 4.08it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation: | | 0/? [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[A\r\n",
"\r",
"Validation: | | 0/? [00:00, ?it/s]\u001b[A\r\n",
"\r",
"Validation DataLoader 0: 0%| | 0/71 [00:00, ?it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.715787887573242, Poisson: -0.08436138182878494\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 1%|▎ | 1/71 [00:00<00:06, 11.09it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.200572967529297, Poisson: -0.08141561597585678\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 3%|▌ | 2/71 [00:00<00:06, 11.25it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.471235275268555, Poisson: -0.11336066573858261\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 4%|▊ | 3/71 [00:00<00:05, 11.34it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.440092086791992, Poisson: -0.08726721256971359\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 6%|█ | 4/71 [00:00<00:05, 11.35it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.81659698486328, Poisson: -0.10460898280143738\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 7%|█▎ | 5/71 [00:00<00:05, 11.38it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.75787353515625, Poisson: -0.08423992991447449\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 8%|█▌ | 6/71 [00:00<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.787721633911133, Poisson: -0.08449624478816986\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 10%|█▊ | 7/71 [00:00<00:05, 11.40it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.402618408203125, Poisson: -0.10755792260169983\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 11%|██▏ | 8/71 [00:00<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.863603591918945, Poisson: -0.11044501513242722\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 13%|██▍ | 9/71 [00:00<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.60628890991211, Poisson: -0.09900801628828049\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 14%|██▌ | 10/71 [00:00<00:05, 11.40it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.1425838470459, Poisson: -0.08161227405071259\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 15%|██▊ | 11/71 [00:00<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.7701358795166, Poisson: -0.10494677722454071\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 17%|███ | 12/71 [00:01<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.06895637512207, Poisson: -0.0960720032453537\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 18%|███▎ | 13/71 [00:01<00:05, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.843286514282227, Poisson: -0.10461314767599106\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 20%|███▌ | 14/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.058631896972656, Poisson: -0.11612120270729065\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation DataLoader 0: 21%|███▊ | 15/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.56452178955078, Poisson: -0.09911813586950302\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 23%|████ | 16/71 [00:01<00:04, 11.40it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.65570831298828, Poisson: -0.07856716960668564\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 24%|████▎ | 17/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.95334243774414, Poisson: -0.11065673828125\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 25%|████▌ | 18/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.04860496520996, Poisson: -0.09602082520723343\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 27%|████▊ | 19/71 [00:01<00:04, 11.40it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.706954956054688, Poisson: -0.1192212849855423\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 28%|█████ | 20/71 [00:01<00:04, 11.40it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.63813591003418, Poisson: -0.09894131869077682\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 30%|█████▎ | 21/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.087890625, Poisson: -0.11592213809490204\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 31%|█████▌ | 22/71 [00:01<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.508792877197266, Poisson: -0.0931679755449295\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 32%|█████▊ | 23/71 [00:02<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.87944221496582, Poisson: -0.09000281989574432\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 34%|██████ | 24/71 [00:02<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.072032928466797, Poisson: -0.09605841338634491\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 35%|██████▎ | 25/71 [00:02<00:04, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.435930252075195, Poisson: -0.09318968653678894\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 37%|██████▌ | 26/71 [00:02<00:03, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.161197662353516, Poisson: -0.10190277546644211\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 38%|██████▊ | 27/71 [00:02<00:03, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.20120620727539, Poisson: -0.08157812058925629\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 39%|███████ | 28/71 [00:02<00:03, 11.41it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.60438346862793, Poisson: -0.0989345908164978\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 41%|███████▎ | 29/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.479650497436523, Poisson: -0.11340800672769547\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 42%|███████▌ | 30/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.724428176879883, Poisson: -0.10450900346040726\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 44%|███████▊ | 31/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.462779998779297, Poisson: -0.0931844487786293\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 45%|████████ | 32/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.786657333374023, Poisson: -0.10474784672260284\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 46%|████████▎ | 33/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.443769454956055, Poisson: -0.09325771033763885\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 48%|████████▌ | 34/71 [00:02<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.488693237304688, Poisson: -0.09301599860191345\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 49%|████████▊ | 35/71 [00:03<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.99990463256836, Poisson: -0.11622511595487595\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 51%|█████████▏ | 36/71 [00:03<00:03, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.03451156616211, Poisson: -0.07577572762966156\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation DataLoader 0: 52%|█████████▍ | 37/71 [00:03<00:02, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.938905715942383, Poisson: -0.11054838448762894\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 54%|█████████▋ | 38/71 [00:03<00:02, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.075075149536133, Poisson: -0.09620349854230881\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 55%|█████████▉ | 39/71 [00:03<00:02, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.006479263305664, Poisson: -0.11629815399646759\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 56%|██████████▏ | 40/71 [00:03<00:02, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.275978088378906, Poisson: -0.08737660944461823\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 58%|██████████▍ | 41/71 [00:03<00:02, 11.42it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 15.497014045715332, Poisson: -0.07280221581459045\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 59%|██████████▋ | 42/71 [00:03<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.40738296508789, Poisson: -0.10748593509197235\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 61%|██████████▉ | 43/71 [00:03<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.083242416381836, Poisson: -0.11616585403680801\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 62%|███████████▏ | 44/71 [00:03<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.93613624572754, Poisson: -0.11035705357789993\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 63%|███████████▍ | 45/71 [00:03<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.766260147094727, Poisson: -0.10466591268777847\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 65%|███████████▋ | 46/71 [00:04<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18513298034668, Poisson: -0.10168717801570892\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 66%|███████████▉ | 47/71 [00:04<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.081628799438477, Poisson: -0.11622646450996399\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation DataLoader 0: 68%|████████████▏ | 48/71 [00:04<00:02, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.872467041015625, Poisson: -0.09034885466098785\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 69%|████████████▍ | 49/71 [00:04<00:01, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.83187484741211, Poisson: -0.08994244039058685\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 70%|████████████▋ | 50/71 [00:04<00:01, 11.43it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.023244857788086, Poisson: -0.09600641578435898\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 72%|████████████▉ | 51/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.718595504760742, Poisson: -0.1045961007475853\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 73%|█████████████▏ | 52/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.44754409790039, Poisson: -0.11339467763900757\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 75%|█████████████▍ | 53/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.757465362548828, Poisson: -0.10470139235258102\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 76%|█████████████▋ | 54/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.18140983581543, Poisson: -0.10165048390626907\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 77%|█████████████▉ | 55/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 18.28765296936035, Poisson: -0.0873810350894928\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 79%|██████████████▏ | 56/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.64971923828125, Poisson: -0.09891148656606674\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 80%|██████████████▍ | 57/71 [00:04<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.557477951049805, Poisson: -0.0990000069141388\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 82%|██████████████▋ | 58/71 [00:05<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.666433334350586, Poisson: -0.11910754442214966\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation DataLoader 0: 83%|██████████████▉ | 59/71 [00:05<00:01, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 23.435176849365234, Poisson: -0.11322159320116043\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 85%|███████████████▏ | 60/71 [00:05<00:00, 11.44it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.201955795288086, Poisson: -0.10189700126647949\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 86%|███████████████▍ | 61/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 16.05048942565918, Poisson: -0.0757397785782814\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 87%|███████████████▋ | 62/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.027244567871094, Poisson: -0.09596437960863113\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 89%|███████████████▉ | 63/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.64202308654785, Poisson: -0.09875722229480743\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 90%|████████████████▏ | 64/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.188655853271484, Poisson: -0.1018335297703743\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 92%|████████████████▍ | 65/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 22.29461669921875, Poisson: -0.10748331248760223\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 93%|████████████████▋ | 66/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 19.94753646850586, Poisson: -0.09584959596395493\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 94%|████████████████▉ | 67/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 17.73777961730957, Poisson: -0.08445792645215988\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 96%|█████████████████▏| 68/71 [00:05<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 21.220571517944336, Poisson: -0.10187307000160217\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 97%|█████████████████▍| 69/71 [00:06<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 20.63282585144043, Poisson: -0.09883726388216019\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r",
"Validation DataLoader 0: 99%|█████████████████▋| 70/71 [00:06<00:00, 11.45it/s]\u001b[A"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Multinomial: 24.04494857788086, Poisson: -0.11639108508825302\r\n",
"\r\n",
"\r",
"Validation DataLoader 0: 100%|██████████████████| 71/71 [00:06<00:00, 11.45it/s]\u001b[A\r\n",
"\r",
" \u001b[A\r",
"Epoch 0: 100%|█| 766/766 [03:14<00:00, 3.93it/s, v_num=a0al, train_loss_step=19\r",
"Epoch 0: 100%|█| 766/766 [03:14<00:00, 3.93it/s, v_num=a0al, train_loss_step=19"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"`Trainer.fit` stopped: `max_epochs=1` reached.\r\n",
"\r",
"Epoch 0: 100%|█| 766/766 [03:20<00:00, 3.83it/s, v_num=a0al, train_loss_step=19\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1;34mwandb\u001b[0m: \r\n",
"\u001b[1;34mwandb\u001b[0m: 🚀 View run \u001b[33mfinetune_test_0\u001b[0m at: \u001b[34m\u001b[0m\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0m"
]
}
],
"source": [
"! CUDA_VISIBLE_DEVICES=0 decima finetune \\\n",
"--name finetune_test_0 \\\n",
"--model 0 \\\n",
"--device 0 \\\n",
"--matrix-file {ad_file_path} \\\n",
"--h5-file {h5_file_path} \\\n",
"--outdir {outdir} \\\n",
"--learning-rate {lr} \\\n",
"--loss-total-weight {total_weight} \\\n",
"--gradient-accumulation {grad} \\\n",
"--batch-size 1 \\\n",
"--max-seq-shift {shift} \\\n",
"--epochs 1 \\\n",
"--logger {logger} \\\n",
"--num-workers {workers}"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "538d1250-8fc2-460b-b5fc-61ec083cca86",
"metadata": {
"collapsed": true,
"execution": {
"iopub.execute_input": "2025-11-21T22:35:16.458997Z",
"iopub.status.busy": "2025-11-21T22:35:16.458827Z",
"iopub.status.idle": "2025-11-21T22:35:16.461130Z",
"shell.execute_reply": "2025-11-21T22:35:16.460540Z"
},
"jupyter": {
"outputs_hidden": true
}
},
"outputs": [],
"source": [
"# Uncomment to log in to Weights & Biases if the wandb logger needs authentication\n",
"# import wandb\n",
"# wandb.login(host=\"https://genentech.wandb.io\", anonymous=\"never\", relogin=True)"
]
},
{
"cell_type": "markdown",
"id": "0ce40323-32c8-4984-9578-3579bafa1436",
"metadata": {},
"source": [
"## 8. Make and evaluate predictions using trained models"
]
},
{
"cell_type": "markdown",
"id": "03c4bc32-afc1-4498-9dcb-20d08bc4f29b",
"metadata": {},
"source": [
"Using the training commands above, we trained two model replicates. Now, we can use these models to predict gene expression:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "11e3047e",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:35:16.462669Z",
"iopub.status.busy": "2025-11-21T22:35:16.462525Z",
"iopub.status.idle": "2025-11-21T22:35:16.465688Z",
"shell.execute_reply": "2025-11-21T22:35:16.465188Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt\n"
]
}
],
"source": [
"# Take the first checkpoint found; sorting makes the choice reproducible\n",
"checkpoint = sorted(glob.glob(os.path.join(outdir, \"lightning_logs/*/checkpoints/*.ckpt\")))[0]\n",
"print(checkpoint)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "2d04b6e9-373f-4cf4-80ca-662f1dc60a75",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:35:16.467043Z",
"iopub.status.busy": "2025-11-21T22:35:16.466885Z",
"iopub.status.idle": "2025-11-21T22:35:16.469956Z",
"shell.execute_reply": "2025-11-21T22:35:16.469456Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt,./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt'"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Comma-separated list of model checkpoints, one per replicate.\n",
"# Here the same checkpoint is listed twice only to illustrate the format.\n",
"checkpoint_list = \",\".join([checkpoint, checkpoint])\n",
"checkpoint_list"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "5b94d98f-7889-4223-8c0a-7cec1a365a73",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:35:16.471038Z",
"iopub.status.busy": "2025-11-21T22:35:16.470916Z",
"iopub.status.idle": "2025-11-21T22:39:05.747907Z",
"shell.execute_reply": "2025-11-21T22:39:05.747013Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n",
" warnings.warn(\r\n",
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.\r\n",
" warnings.warn(\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Using device: 0 and genome: hg38 for prediction.\r\n",
"decima - INFO - Loading model ['./example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt', './example/lightning_logs/g20ya0al/checkpoints/epoch=0-step=154.ckpt']...\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"decima - INFO - Making predictions\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/__init__.py:1617: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)\r\n",
"💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.\r\n",
"GPU available: True (cuda), used: True\r\n",
"TPU available: False, using: 0 TPU cores\r\n",
"HPU available: False, using: 0 HPUs\r\n",
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torch/utils/data/dataloader.py:627: UserWarning: This DataLoader will create 32 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"SLURM auto-requeueing enabled. Setting signal handlers.\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting: | | 0/? [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting: | | 0/? [00:00, ?it/s]\r",
"Predicting DataLoader 0: 0%| | 0/115 [00:00, ?it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 1%|▏ | 1/115 [00:03<06:53, 0.28it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 2%|▎ | 2/115 [00:05<04:55, 0.38it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 3%|▍ | 3/115 [00:06<04:18, 0.43it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 3%|▋ | 4/115 [00:08<03:59, 0.46it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 4%|▊ | 5/115 [00:10<03:47, 0.48it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 5%|▉ | 6/115 [00:12<03:38, 0.50it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 6%|█ | 7/115 [00:13<03:31, 0.51it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 7%|█▎ | 8/115 [00:15<03:26, 0.52it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 8%|█▍ | 9/115 [00:17<03:21, 0.53it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 9%|█▍ | 10/115 [00:18<03:17, 0.53it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 10%|█▋ | 11/115 [00:20<03:14, 0.54it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 10%|█▊ | 12/115 [00:22<03:10, 0.54it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 11%|█▉ | 13/115 [00:23<03:07, 0.54it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 12%|██ | 14/115 [00:25<03:05, 0.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 13%|██▏ | 15/115 [00:27<03:02, 0.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 14%|██▎ | 16/115 [00:29<02:59, 0.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 15%|██▌ | 17/115 [00:30<02:57, 0.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 16%|██▋ | 18/115 [00:32<02:54, 0.55it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 17%|██▊ | 19/115 [00:34<02:52, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 17%|██▉ | 20/115 [00:35<02:50, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 18%|███ | 21/115 [00:37<02:48, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 19%|███▎ | 22/115 [00:39<02:46, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 20%|███▍ | 23/115 [00:40<02:43, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 21%|███▌ | 24/115 [00:42<02:41, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 22%|███▋ | 25/115 [00:44<02:39, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 23%|███▊ | 26/115 [00:46<02:37, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 23%|███▉ | 27/115 [00:47<02:35, 0.56it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 24%|████▏ | 28/115 [00:49<02:33, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 25%|████▎ | 29/115 [00:51<02:31, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 26%|████▍ | 30/115 [00:52<02:29, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 27%|████▌ | 31/115 [00:54<02:28, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 28%|████▋ | 32/115 [00:56<02:26, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 29%|████▉ | 33/115 [00:58<02:24, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 30%|█████ | 34/115 [00:59<02:22, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 30%|█████▏ | 35/115 [01:01<02:20, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 31%|█████▎ | 36/115 [01:03<02:18, 0.57it/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Predicting DataLoader 0: 100%|████████████████| 115/115 [03:18<00:00, 0.58it/s]\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/celikm5/miniforge3/envs/decima2/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: The ``compute`` method of metric WarningCounter was called before the ``update`` method which may lead to errors, as metric states have not yet been updated.\r\n",
"decima - INFO - Creating anndata\r\n",
"decima - INFO - Evaluating performance\r\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Performance on genes in the train dataset.\r\n",
"Mean Pearson Correlation per gene: Mean: 0.01.\r\n",
"Mean Pearson Correlation per gene using size factor (baseline): 0.03.\r\n",
"Mean Pearson Correlation per pseudobulk: 0.00\r\n",
"\r\n",
"Performance on genes in the val dataset.\r\n",
"Mean Pearson Correlation per gene: Mean: -0.01.\r\n",
"Mean Pearson Correlation per gene using size factor (baseline): 0.06.\r\n",
"Mean Pearson Correlation per pseudobulk: -0.01\r\n",
"\r\n",
"Performance on genes in the test dataset.\r\n",
"Mean Pearson Correlation per gene: Mean: -0.02.\r\n",
"Mean Pearson Correlation per gene using size factor (baseline): -0.00.\r\n",
"Mean Pearson Correlation per pseudobulk: -0.02\r\n",
"\r\n"
]
}
],
"source": [
"! CUDA_VISIBLE_DEVICES=0 decima predict-genes \\\n",
"--output example/test_preds.h5ad \\\n",
"--model {checkpoint_list} \\\n",
"--metadata {ad_file_path} \\\n",
"--device 0 \\\n",
"--batch-size 8 \\\n",
"--num-workers 32 \\\n",
"--max_seq_shift 0 \\\n",
"--genome hg38 \\\n",
"--save-replicates"
]
},
{
"cell_type": "markdown",
"id": "b6c253a0-b2d7-4a5d-9c46-32410fbfaecb",
"metadata": {},
"source": [
"We can open the output h5ad file to see the individual predictions and metrics."
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "e7bceeb6-5e91-455e-b40f-2b1c02f90d39",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:39:05.750232Z",
"iopub.status.busy": "2025-11-21T22:39:05.750008Z",
"iopub.status.idle": "2025-11-21T22:39:05.781095Z",
"shell.execute_reply": "2025-11-21T22:39:05.780468Z"
}
},
"outputs": [],
"source": [
"ad_out = anndata.read_h5ad(\"example/test_preds.h5ad\")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "af5c37c9-53df-439e-9501-e06eed5ae8f7",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:39:05.783023Z",
"iopub.status.busy": "2025-11-21T22:39:05.782863Z",
"iopub.status.idle": "2025-11-21T22:39:05.785830Z",
"shell.execute_reply": "2025-11-21T22:39:05.785382Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AnnData object with n_obs × n_vars = 50 × 920\n",
" obs: 'cell_type', 'tissue', 'disease', 'study', 'size_factor', 'train_pearson', 'val_pearson', 'test_pearson'\n",
" var: 'chrom', 'start', 'end', 'strand', 'gene_start', 'gene_end', 'gene_length', 'gene_mask_start', 'gene_mask_end', 'dataset', 'pearson', 'size_factor_pearson'\n",
" layers: 'preds', 'preds_finetune_test_0'"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad_out"
]
},
{
"cell_type": "markdown",
"id": "42f4da2d-33e2-4b0e-ad39-af57c71ea37a",
"metadata": {},
"source": [
"`.layers['preds_finetune_test_0']` contains the predictions made by the individual model (with `--save-replicates`, one such layer is saved for each model checkpoint), whereas `.layers['preds']` contains the predictions averaged across models. Performance metrics have also been added to both `.obs` and `.var`."
]
},
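{
"cell_type": "markdown",
"id": "5d2c8e41-per-gene-pearson-sketch",
"metadata": {},
"source": [
"The relationship between the replicate layers, the averaged layer and the per-gene metric can be illustrated with a minimal sketch on toy arrays. It assumes (this is not confirmed by the Decima documentation) that the averaged layer is the element-wise mean of the replicate layers and that the `pearson` column in `.var` is the correlation across pseudobulks between observed and predicted expression. The helper `per_gene_pearson` and all `toy_*` names are hypothetical; to try it on real data, one would pass `ad_out.X` and `ad_out.layers['preds']` instead, after converting to dense arrays if needed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b7e3f60-per-gene-pearson-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def per_gene_pearson(observed, predicted):\n",
"    # Hypothetical helper: Pearson correlation of each column (gene)\n",
"    # computed across rows (pseudobulks).\n",
"    obs = np.asarray(observed, dtype=float)\n",
"    pred = np.asarray(predicted, dtype=float)\n",
"    obs_c = obs - obs.mean(axis=0)\n",
"    pred_c = pred - pred.mean(axis=0)\n",
"    denom = np.sqrt((obs_c**2).sum(axis=0) * (pred_c**2).sum(axis=0))\n",
"    # Columns with zero variance give NaN instead of raising a warning.\n",
"    with np.errstate(divide='ignore', invalid='ignore'):\n",
"        return (obs_c * pred_c).sum(axis=0) / denom\n",
"\n",
"\n",
"# Toy stand-ins for the observed matrix and two replicate prediction layers.\n",
"rng = np.random.default_rng(0)\n",
"toy_obs = rng.normal(size=(50, 4))\n",
"toy_replicates = [toy_obs + rng.normal(scale=0.5, size=toy_obs.shape) for _ in range(2)]\n",
"\n",
"# Assumption: the averaged layer is the element-wise mean of the replicates.\n",
"toy_mean = np.mean(toy_replicates, axis=0)\n",
"\n",
"toy_r = per_gene_pearson(toy_obs, toy_mean)\n",
"print(toy_r.round(2))"
]
},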
{
"cell_type": "code",
"execution_count": 35,
"id": "c507929c-a73b-4641-b1de-9f066385a972",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:39:05.787091Z",
"iopub.status.busy": "2025-11-21T22:39:05.786958Z",
"iopub.status.idle": "2025-11-21T22:39:05.794747Z",
"shell.execute_reply": "2025-11-21T22:39:05.794293Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>cell_type</th><th>tissue</th><th>disease</th><th>study</th><th>size_factor</th><th>train_pearson</th><th>val_pearson</th><th>test_pearson</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>pseudobulk_0</th><td>ct_0</td><td>t_0</td><td>d_0</td><td>st_0</td><td>4946.397461</td><td>0.010020</td><td>0.171944</td><td>0.122095</td></tr>\n",
"<tr><th>pseudobulk_1</th><td>ct_0</td><td>t_0</td><td>d_1</td><td>st_0</td><td>4858.091797</td><td>-0.024151</td><td>0.061900</td><td>-0.169406</td></tr>\n",
"<tr><th>pseudobulk_2</th><td>ct_0</td><td>t_0</td><td>d_2</td><td>st_1</td><td>4921.185547</td><td>0.007005</td><td>-0.079252</td><td>-0.094602</td></tr>\n",
"<tr><th>pseudobulk_3</th><td>ct_0</td><td>t_0</td><td>d_0</td><td>st_1</td><td>4928.486816</td><td>0.016869</td><td>-0.023038</td><td>0.007967</td></tr>\n",
"<tr><th>pseudobulk_4</th><td>ct_0</td><td>t_0</td><td>d_1</td><td>st_2</td><td>4756.819336</td><td>0.050297</td><td>0.160398</td><td>-0.101163</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" cell_type tissue disease study size_factor train_pearson \\\n",
"pseudobulk_0 ct_0 t_0 d_0 st_0 4946.397461 0.010020 \n",
"pseudobulk_1 ct_0 t_0 d_1 st_0 4858.091797 -0.024151 \n",
"pseudobulk_2 ct_0 t_0 d_2 st_1 4921.185547 0.007005 \n",
"pseudobulk_3 ct_0 t_0 d_0 st_1 4928.486816 0.016869 \n",
"pseudobulk_4 ct_0 t_0 d_1 st_2 4756.819336 0.050297 \n",
"\n",
" val_pearson test_pearson \n",
"pseudobulk_0 0.171944 0.122095 \n",
"pseudobulk_1 0.061900 -0.169406 \n",
"pseudobulk_2 -0.079252 -0.094602 \n",
"pseudobulk_3 -0.023038 0.007967 \n",
"pseudobulk_4 0.160398 -0.101163 "
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad_out.obs.head()"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "121a7787-4c74-465f-ae93-b529564cc2fa",
"metadata": {
"execution": {
"iopub.execute_input": "2025-11-21T22:39:05.796225Z",
"iopub.status.busy": "2025-11-21T22:39:05.796089Z",
"iopub.status.idle": "2025-11-21T22:39:05.802450Z",
"shell.execute_reply": "2025-11-21T22:39:05.801934Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>chrom</th><th>start</th><th>end</th><th>strand</th><th>gene_start</th><th>gene_end</th><th>gene_length</th><th>gene_mask_start</th><th>gene_mask_end</th><th>dataset</th><th>pearson</th><th>size_factor_pearson</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>gene_0</th><td>chr1</td><td>26191000</td><td>26715288</td><td>+</td><td>26354840</td><td>26879128</td><td>524288</td><td>163840</td><td>524288</td><td>train</td><td>0.177304</td><td>-0.062494</td></tr>\n",
"<tr><th>gene_1</th><td>chr19</td><td>41275257</td><td>41799545</td><td>-</td><td>41111417</td><td>41635705</td><td>524288</td><td>163840</td><td>524288</td><td>train</td><td>0.049450</td><td>-0.037428</td></tr>\n",
"<tr><th>gene_2</th><td>chr1</td><td>79937866</td><td>80462154</td><td>-</td><td>79774026</td><td>80298314</td><td>524288</td><td>163840</td><td>524288</td><td>train</td><td>-0.095439</td><td>0.240203</td></tr>\n",
"<tr><th>gene_4</th><td>chr16</td><td>3905208</td><td>4429496</td><td>-</td><td>3741368</td><td>4265656</td><td>524288</td><td>163840</td><td>524288</td><td>train</td><td>-0.092946</td><td>-0.042283</td></tr>\n",
"<tr><th>gene_5</th><td>chr10</td><td>22495641</td><td>23019929</td><td>+</td><td>22659481</td><td>23183769</td><td>524288</td><td>163840</td><td>524288</td><td>train</td><td>-0.310151</td><td>-0.069181</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chrom start end strand gene_start gene_end gene_length \\\n",
"gene_0 chr1 26191000 26715288 + 26354840 26879128 524288 \n",
"gene_1 chr19 41275257 41799545 - 41111417 41635705 524288 \n",
"gene_2 chr1 79937866 80462154 - 79774026 80298314 524288 \n",
"gene_4 chr16 3905208 4429496 - 3741368 4265656 524288 \n",
"gene_5 chr10 22495641 23019929 + 22659481 23183769 524288 \n",
"\n",
" gene_mask_start gene_mask_end dataset pearson size_factor_pearson \n",
"gene_0 163840 524288 train 0.177304 -0.062494 \n",
"gene_1 163840 524288 train 0.049450 -0.037428 \n",
"gene_2 163840 524288 train -0.095439 0.240203 \n",
"gene_4 163840 524288 train -0.092946 -0.042283 \n",
"gene_5 163840 524288 train -0.310151 -0.069181 "
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ad_out.var.head()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "decima2",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.14"
}
},
"nbformat": 4,
"nbformat_minor": 5
}