!pip3 install kaggle
!pip3 install prophet
!pip3 install pystan==2.19.1.1
Requirement already satisfied: kaggle in /usr/local/lib/python3.7/dist-packages (1.5.12)
Requirement already satisfied: six>=1.10 in /usr/local/lib/python3.7/dist-packages (from kaggle) (1.15.0)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from kaggle) (2.23.0)
Requirement already satisfied: python-slugify in /usr/local/lib/python3.7/dist-packages (from kaggle) (6.1.1)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/dist-packages (from kaggle) (1.24.3)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from kaggle) (4.63.0)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/dist-packages (from kaggle) (2.8.2)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/dist-packages (from kaggle) (2021.10.8)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.7/dist-packages (from python-slugify->kaggle) (1.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->kaggle) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->kaggle) (3.0.4)
Requirement already satisfied: prophet in /usr/local/lib/python3.7/dist-packages (1.0.1)
Requirement already satisfied: pystan~=2.19.1.1 in /usr/local/lib/python3.7/dist-packages (from prophet) (2.19.1.1)
Requirement already satisfied: numpy>=1.15.4 in /usr/local/lib/python3.7/dist-packages (from prophet) (1.21.5)
Requirement already satisfied: holidays>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from prophet) (0.10.5.2)
Requirement already satisfied: pandas>=1.0.4 in /usr/local/lib/python3.7/dist-packages (from prophet) (1.3.5)
Requirement already satisfied: convertdate>=2.1.2 in /usr/local/lib/python3.7/dist-packages (from prophet) (2.4.0)
Requirement already satisfied: python-dateutil>=2.8.0 in /usr/local/lib/python3.7/dist-packages (from prophet) (2.8.2)
Requirement already satisfied: tqdm>=4.36.1 in /usr/local/lib/python3.7/dist-packages (from prophet) (4.63.0)
Requirement already satisfied: matplotlib>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from prophet) (3.2.2)
Requirement already satisfied: cmdstanpy==0.9.68 in /usr/local/lib/python3.7/dist-packages (from prophet) (0.9.68)
Requirement already satisfied: setuptools-git>=1.2 in /usr/local/lib/python3.7/dist-packages (from prophet) (1.2)
Requirement already satisfied: LunarCalendar>=0.0.9 in /usr/local/lib/python3.7/dist-packages (from prophet) (0.0.9)
Requirement already satisfied: Cython>=0.22 in /usr/local/lib/python3.7/dist-packages (from prophet) (0.29.28)
Requirement already satisfied: ujson in /usr/local/lib/python3.7/dist-packages (from cmdstanpy==0.9.68->prophet) (5.1.0)
Requirement already satisfied: pymeeus<=1,>=0.3.13 in /usr/local/lib/python3.7/dist-packages (from convertdate>=2.1.2->prophet) (0.5.11)
Requirement already satisfied: korean-lunar-calendar in /usr/local/lib/python3.7/dist-packages (from holidays>=0.10.2->prophet) (0.2.1)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from holidays>=0.10.2->prophet) (1.15.0)
Requirement already satisfied: hijri-converter in /usr/local/lib/python3.7/dist-packages (from holidays>=0.10.2->prophet) (2.2.3)
Requirement already satisfied: pytz in /usr/local/lib/python3.7/dist-packages (from LunarCalendar>=0.0.9->prophet) (2018.9)
Requirement already satisfied: ephem>=3.7.5.3 in /usr/local/lib/python3.7/dist-packages (from LunarCalendar>=0.0.9->prophet) (4.1.3)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.0.0->prophet) (1.4.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.0.0->prophet) (3.0.7)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.0.0->prophet) (0.11.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib>=2.0.0->prophet) (3.10.0.2)
Requirement already satisfied: pystan==2.19.1.1 in /usr/local/lib/python3.7/dist-packages (2.19.1.1)
Requirement already satisfied: Cython!=0.25.1,>=0.22 in /usr/local/lib/python3.7/dist-packages (from pystan==2.19.1.1) (0.29.28)
Requirement already satisfied: numpy>=1.7 in /usr/local/lib/python3.7/dist-packages (from pystan==2.19.1.1) (1.21.5)
import tqdm
import numpy as np
import pandas as pd
import seaborn as sns

from zipfile import ZipFile
from prophet import Prophet
from matplotlib import pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 12)

Before running the cell below, upload your Kaggle API token (kaggle.json) to the working directory; otherwise the copy step fails.

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
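A quick way to confirm the token ended up where the cell above put it is sketched below. The helper name token_is_ready is hypothetical, and the check assumes the path ~/.kaggle/kaggle.json and the 600 permissions set by the chmod command above.

```python
import os
import stat

def token_is_ready(path):
    """Return True if the file exists and its permission bits are exactly 600 (hypothetical helper)."""
    if not os.path.isfile(path):
        return False
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600

# Check the location used by the setup cell above.
print(token_is_ready(os.path.expanduser('~/.kaggle/kaggle.json')))
```

If this prints False, re-upload kaggle.json and rerun the setup cell before downloading the data.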
!kaggle competitions download -c tabular-playground-series-mar-2022
tabular-playground-series-mar-2022.zip: Skipping, found more recently modified local copy (use --force to force download)
with ZipFile('/content/tabular-playground-series-mar-2022.zip', 'r') as zf:
    zf.extractall('./')

Loading the data

train = pd.read_csv('train.csv', index_col='row_id', parse_dates=['time'])
train.head()
time x y direction congestion
row_id
0 1991-04-01 0 0 EB 70
1 1991-04-01 0 0 NB 49
2 1991-04-01 0 0 SB 24
3 1991-04-01 0 1 EB 18
4 1991-04-01 0 1 NB 60
train.info()
train.describe()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 848835 entries, 0 to 848834
Data columns (total 5 columns):
 #   Column      Non-Null Count   Dtype         
---  ------      --------------   -----         
 0   time        848835 non-null  datetime64[ns]
 1   x           848835 non-null  int64         
 2   y           848835 non-null  int64         
 3   direction   848835 non-null  object        
 4   congestion  848835 non-null  int64         
dtypes: datetime64[ns](1), int64(3), object(1)
memory usage: 38.9+ MB
x y congestion
count 848835.000000 848835.000000 848835.000000
mean 1.138462 1.630769 47.815305
std 0.801478 1.089379 16.799392
min 0.000000 0.000000 0.000000
25% 0.000000 1.000000 35.000000
50% 1.000000 2.000000 47.000000
75% 2.000000 3.000000 60.000000
max 2.000000 3.000000 100.000000
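# Correlate only the numeric columns; newer pandas versions raise an error on non-numeric ones.
sns.heatmap(train.select_dtypes('number').corr(), annot=True, vmin=-1, vmax=1, cmap='RdYlGn')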
<matplotlib.axes._subplots.AxesSubplot at 0x7fa74ad97410>
test = pd.read_csv('test.csv', index_col='row_id', parse_dates=['time'])
test.head()
time x y direction
row_id
848835 1991-09-30 12:00:00 0 0 EB
848836 1991-09-30 12:00:00 0 0 NB
848837 1991-09-30 12:00:00 0 0 SB
848838 1991-09-30 12:00:00 0 1 EB
848839 1991-09-30 12:00:00 0 1 NB

There are no missing values in the data, as the check below confirms.

if train.isna().any().any():
    print(train.isna().sum()/train.shape[0])
else:
    print("No Missing values")
No Missing values

Preparation

# Placeholder target so that train and test share the same columns. It also makes
# the later merge label the predicted column 'congestion_y'.
test['congestion'] = 0.0

keys = ['x', 'y', 'direction']

# One Prophet-ready frame per (x, y, direction) instance:
# 'ds' holds the timestamp and 'y' the congestion target.
train_dict = {
    key: grp.sort_values('time')[['time', 'congestion']]
            .rename(columns={'time': 'ds', 'congestion': 'y'})
            .reset_index(drop=True)
    for key, grp in train.groupby(keys)
}

# Test frames carry only the timestamps to predict.
test_dict = {
    key: grp.sort_values('time')[['time']]
            .rename(columns={'time': 'ds'})
            .reset_index(drop=True)
    for key, grp in test.groupby(keys)
}

Modelling

Approach-1

In this approach, I group the data into instances and fit a separate model that makes the predictions for each instance.

An instance is uniquely identified by a key that combines its coordinates (x, y) and its direction.
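To make the keying concrete, here is a minimal sketch on a toy frame. The values are made up for illustration and are not taken from the competition data; only the column names match train.csv. Grouping by x, y, and direction yields one time series per instance, renamed to the ds and y columns that Prophet expects.

```python
import pandas as pd

# Toy data with the same columns as train.csv (values are made up for illustration).
toy = pd.DataFrame({
    'time': pd.to_datetime(['1991-04-01 00:00', '1991-04-01 00:00',
                            '1991-04-01 00:20', '1991-04-01 00:20']),
    'x': [0, 0, 0, 0],
    'y': [0, 1, 0, 1],
    'direction': ['EB', 'NB', 'EB', 'NB'],
    'congestion': [70, 60, 68, 59],
})

# One frame per (x, y, direction) key, with the column names Prophet expects.
instances = {
    key: grp[['time', 'congestion']]
            .rename(columns={'time': 'ds', 'congestion': 'y'})
            .reset_index(drop=True)
    for key, grp in toy.groupby(['x', 'y', 'direction'])
}

# Show each key and the number of observations in its series.
for key, frame in instances.items():
    print(key, len(frame))
```

Each value in the dictionary is an independent series, so a separate model can be fitted to each one.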

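# Fit one Prophet model per (x, y, direction) instance and forecast its test timestamps.
for idx, train_data in tqdm.tqdm(train_dict.items()):
    model = Prophet()
    model.fit(train_data)
    forecast = model.predict(test_dict[idx])
    # Round to whole numbers, since congestion is integer-valued in the training data.
    test_dict[idx]['congestion'] = np.round(forecast['yhat'])
    # Restore the key columns so the predictions can be merged back onto the test set.
    test_dict[idx]['x'] = idx[0]
    test_dict[idx]['y'] = idx[1]
    test_dict[idx]['direction'] = idx[2]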
  0%|          | 0/65 [00:00<?, ?it/s]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  2%|▏         | 1/65 [00:08<08:55,  8.36s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  3%|▎         | 2/65 [00:12<06:23,  6.09s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  5%|▍         | 3/65 [00:20<06:55,  6.70s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  6%|▌         | 4/65 [00:29<07:53,  7.76s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  8%|▊         | 5/65 [00:36<07:21,  7.37s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
  9%|▉         | 6/65 [00:41<06:39,  6.77s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 11%|█         | 7/65 [00:48<06:37,  6.85s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 12%|█▏        | 8/65 [00:55<06:24,  6.75s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 14%|█▍        | 9/65 [01:00<05:49,  6.24s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 15%|█▌        | 10/65 [01:06<05:35,  6.10s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 17%|█▋        | 11/65 [01:11<05:08,  5.71s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 18%|█▊        | 12/65 [01:16<04:56,  5.60s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 20%|██        | 13/65 [01:24<05:25,  6.25s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 22%|██▏       | 14/65 [01:28<04:46,  5.63s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 23%|██▎       | 15/65 [01:34<04:46,  5.72s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 25%|██▍       | 16/65 [01:39<04:31,  5.55s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 26%|██▌       | 17/65 [01:45<04:26,  5.55s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 28%|██▊       | 18/65 [01:51<04:29,  5.74s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 29%|██▉       | 19/65 [01:57<04:23,  5.72s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 31%|███       | 20/65 [02:02<04:14,  5.65s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 32%|███▏      | 21/65 [02:06<03:41,  5.03s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 34%|███▍      | 22/65 [02:11<03:46,  5.26s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 35%|███▌      | 23/65 [02:18<03:52,  5.54s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 37%|███▋      | 24/65 [02:21<03:24,  4.99s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 38%|███▊      | 25/65 [02:28<03:43,  5.58s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 40%|████      | 26/65 [02:37<04:09,  6.40s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 42%|████▏     | 27/65 [02:44<04:15,  6.72s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 43%|████▎     | 28/65 [02:51<04:10,  6.78s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 45%|████▍     | 29/65 [02:56<03:44,  6.24s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 46%|████▌     | 30/65 [03:02<03:34,  6.14s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 48%|████▊     | 31/65 [03:08<03:34,  6.29s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 49%|████▉     | 32/65 [03:14<03:17,  5.99s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 51%|█████     | 33/65 [03:19<03:07,  5.85s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 52%|█████▏    | 34/65 [03:23<02:43,  5.28s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 54%|█████▍    | 35/65 [03:29<02:39,  5.33s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 55%|█████▌    | 36/65 [03:35<02:47,  5.76s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 57%|█████▋    | 37/65 [03:41<02:41,  5.77s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 58%|█████▊    | 38/65 [03:46<02:30,  5.57s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 60%|██████    | 39/65 [03:49<02:04,  4.78s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 62%|██████▏   | 40/65 [03:55<02:07,  5.12s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 63%|██████▎   | 41/65 [04:03<02:24,  6.03s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 65%|██████▍   | 42/65 [04:10<02:20,  6.10s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 66%|██████▌   | 43/65 [04:14<02:03,  5.62s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 68%|██████▊   | 44/65 [04:21<02:06,  6.01s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 69%|██████▉   | 45/65 [04:27<02:01,  6.08s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 71%|███████   | 46/65 [04:32<01:48,  5.70s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 72%|███████▏  | 47/65 [04:37<01:39,  5.54s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 74%|███████▍  | 48/65 [04:42<01:29,  5.29s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 75%|███████▌  | 49/65 [04:49<01:34,  5.92s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 77%|███████▋  | 50/65 [04:55<01:25,  5.69s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 78%|███████▊  | 51/65 [05:00<01:17,  5.51s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 80%|████████  | 52/65 [05:04<01:07,  5.20s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 82%|████████▏ | 53/65 [05:11<01:08,  5.74s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 83%|████████▎ | 54/65 [05:17<01:05,  5.93s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 85%|████████▍ | 55/65 [05:23<00:57,  5.77s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 86%|████████▌ | 56/65 [05:28<00:51,  5.73s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 88%|████████▊ | 57/65 [05:35<00:48,  6.03s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 89%|████████▉ | 58/65 [05:40<00:40,  5.78s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 91%|█████████ | 59/65 [05:46<00:33,  5.58s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 92%|█████████▏| 60/65 [05:51<00:28,  5.67s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 94%|█████████▍| 61/65 [05:55<00:20,  5.05s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 95%|█████████▌| 62/65 [05:59<00:14,  4.68s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 97%|█████████▋| 63/65 [06:04<00:09,  4.79s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
 98%|█████████▊| 64/65 [06:08<00:04,  4.56s/it]INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
100%|██████████| 65/65 [06:11<00:00,  5.72s/it]
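The log above repeats the same INFO message for every instance. A minimal sketch for quieting it follows, assuming Prophet emits these messages through Python's standard logging module under the logger name prophet, as the INFO:prophet: prefix suggests. It would need to run before the training loop.

```python
import logging

# Raise the threshold of the 'prophet' logger so that INFO messages are suppressed.
prophet_logger = logging.getLogger('prophet')
prophet_logger.setLevel(logging.WARNING)
```

Warnings and errors from Prophet would still be shown, so genuine problems remain visible while the tqdm progress bar stays readable.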
preds_semi_final = pd.concat(test_dict.values(), ignore_index=True)
preds_final = test.reset_index().merge(preds_semi_final, left_on=['time', 'x', 'y', 'direction'], right_on=['ds', 'x', 'y', 'direction'])[['row_id', 'congestion_y']]
submission = pd.read_csv('/content/sample_submission.csv')
submission = submission.merge(preds_final, on='row_id')[['row_id', 'congestion_y']].rename(columns={'congestion_y': 'congestion'})
submission.to_csv('output.csv', index=False)
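Because the merges above are inner joins, rows without a match would be dropped silently. A small sanity check before submitting can catch that. The sketch below is illustrative only: check_submission is a hypothetical helper, and the toy frame uses made-up values.

```python
import pandas as pd

def check_submission(submission, expected_ids):
    """Raise ValueError if the submission frame is malformed (hypothetical helper)."""
    if list(submission.columns) != ['row_id', 'congestion']:
        raise ValueError('unexpected columns')
    if submission['congestion'].isna().any():
        raise ValueError('missing predictions')
    if not submission['row_id'].is_unique:
        raise ValueError('duplicate row_id values')
    if set(submission['row_id']) != set(expected_ids):
        raise ValueError('row_id values do not match the expected set')

# Toy example with made-up predictions.
toy_submission = pd.DataFrame({'row_id': [848835, 848836], 'congestion': [50.0, 47.0]})
check_submission(toy_submission, expected_ids=[848835, 848836])
```

In this notebook the call would be check_submission(submission, test.index), run before the submit command.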
!kaggle competitions submit -c tabular-playground-series-mar-2022 -f output.csv -m "FB Prophet correct 2 with round"
100% 27.4k/27.4k [00:00<00:00, 150kB/s]
Successfully submitted to Tabular Playground Series - Mar 2022