Impact of minimum wage increase on employment rate of fast food restaurants

import numpy as np
import pandas as pd

# pull the data
dataset = pd.read_csv("datasets/njmin3.csv")

dataset.head()

	NJ	fte	bk	roys	co_owned	centralj	demp
0	1	15.00	1	0	0	1	12.00
1	1	15.00	1	0	0	1	6.50
2	1	24.00	0	1	0	1	-1.00
3	1	19.25	0	1	1	0	2.25
4	1	21.50	1	0	0	0	13.00

NJ: whether the fast food restaurant is located in New Jersey (1) or Pennsylvania (0)
POST_APRIL92: whether the observation was recorded after (1) or before (0) April 1992
NJ_POST_APRIL92: product of NJ and POST_APRIL92
fte: full-time employment rate

Each row of the dataframe represents one observation of fte at a fast food restaurant.
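As a small illustration of how the interaction column relates to the two indicators, the sketch below builds it on a toy frame. The frame `toy` and its values are made up for the example and are not taken from njmin3.csv.

```python
import pandas as pd

# Toy rows (hypothetical, not from njmin3.csv): one row per state/period cell.
toy = pd.DataFrame({
    "NJ":           [0, 0, 1, 1],
    "POST_APRIL92": [0, 1, 0, 1],
})

# The interaction equals 1 only for New Jersey restaurants observed after April 1992.
toy["NJ_POST_APRIL92"] = toy["NJ"] * toy["POST_APRIL92"]
print(toy)
```

Only the last row, the treated group in the post-treatment period, gets a 1 in the interaction column.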

dataset.shape

(820, 14)

dataset.describe()

	NJ	POST_APRIL92	NJ_POST_APRIL92	fte	bk	kfc	roys	wendys	co_owned	centralj	southj	pa1	pa2	demp
count	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000	820.000000
mean	0.807317	0.500000	0.403659	21.026511	0.417073	0.195122	0.241463	0.146341	0.343902	0.153659	0.226829	0.087805	0.104878	-0.070443
std	0.394647	0.500305	0.490930	9.271972	0.493376	0.396536	0.428232	0.353664	0.475299	0.360841	0.419037	0.283184	0.306583	8.725511
min	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	-41.500000
25%	1.000000	0.000000	0.000000	15.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	-3.500000
50%	1.000000	0.500000	0.000000	20.500000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
75%	1.000000	1.000000	1.000000	25.500000	1.000000	0.000000	0.000000	0.000000	1.000000	0.000000	0.000000	0.000000	0.000000	4.000000
max	1.000000	1.000000	1.000000	85.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000	34.000000

dataset.isnull().sum()

NJ                  0
POST_APRIL92        0
NJ_POST_APRIL92     0
fte                26
bk                  0
kfc                 0
roys                0
wendys              0
co_owned            0
centralj            0
southj              0
pa1                 0
pa2                 0
demp               52
dtype: int64

# replacing null values with averages
from sklearn.impute import SimpleImputer

missingvalues_imputer = SimpleImputer(missing_values = np.nan, strategy = 'mean')
missingvalues_imputer.fit(dataset[['fte', 'demp']])
dataset[['fte', 'demp']] = missingvalues_imputer.transform(dataset[['fte', 'demp']])
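Mean imputation simply replaces each missing value with the mean of the non-missing values in the same column. A minimal pandas-only sketch of the same idea follows; the frame `toy` and its values are hypothetical and only serve to show the mechanics.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value per column.
toy = pd.DataFrame({
    "fte":  [10.0, np.nan, 20.0],
    "demp": [1.0, 3.0, np.nan],
})

# DataFrame.mean() skips NaN by default, so each column mean is computed
# from the observed values only; fillna then fills column by column.
filled = toy.fillna(toy.mean())
print(filled)
```

Here the missing fte becomes 15.0 (the mean of 10.0 and 20.0) and the missing demp becomes 2.0 (the mean of 1.0 and 3.0).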

DiD with Aggregated Metrics

dataset.groupby(['NJ', 'POST_APRIL92'])['fte'].mean().reset_index()

	NJ	POST_APRIL92	fte
0	0	0	23.272823
1	0	1	21.162064
2	1	0	20.457145
3	1	1	21.027396

(NJ fte after treatment) - (NJ fte before treatment) = 21.03 - 20.46 = 0.57
(PA fte after treatment) - (PA fte before treatment) = 21.16 - 23.27 = -2.11
DiD = 0.57 - (-2.11) = 0.57 + 2.11 = 2.68
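The arithmetic above can be reproduced in a few lines. This sketch hard-codes the four group means printed by the groupby call rather than reading the CSV; the dictionary `means` is an illustrative helper, not part of the original notebook.

```python
# Group means of fte keyed by (NJ, POST_APRIL92), copied from the table above.
means = {
    (0, 0): 23.272823,  # Pennsylvania, before
    (0, 1): 21.162064,  # Pennsylvania, after
    (1, 0): 20.457145,  # New Jersey, before
    (1, 1): 21.027396,  # New Jersey, after
}

# Change over time within each state.
nj_change = means[(1, 1)] - means[(1, 0)]
pa_change = means[(0, 1)] - means[(0, 0)]

# Difference-in-differences: treated change minus control change.
did = nj_change - pa_change

print(f"NJ change: {nj_change:.2f}")   # 0.57
print(f"PA change: {pa_change:.2f}")   # -2.11
print(f"DiD:       {did:.2f}")         # 2.68
```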

The full-time employment (fte) rate in New Jersey increased by 2.68 relative to Pennsylvania, which this analysis attributes to the minimum wage increase policy.

In other words, the point estimate suggests that increasing the minimum wage had a positive impact on the employment rate of fast food restaurants in New Jersey.

DiD with Linear Regression

Let G denote NJ and T denote POST_APRIL92. The linear regression then has the functional form:

\[fte(G,T) = \beta_0 + \beta_1 G + \beta_2 T + \beta_3 T G\]

\[DiD = [fte(1,1) - fte(1,0)] - [fte(0,1) - fte(0,0)]\]

\[DiD = [\beta_0 + \beta_1 + \beta_2 + \beta_3 - \beta_0 - \beta_1] - [\beta_0 + \beta_2 - \beta_0]\]

\[DiD = \beta_2 + \beta_3 - \beta_2 = \beta_3\]

\[DiD = \beta_3\]
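The derivation can be checked numerically. Because the model has four coefficients and G and T define exactly four cells, the regression is saturated and its fitted values equal the four group means, so the coefficients can be recovered by solving a 4x4 linear system. The sketch below does this with the group means reported earlier; the names `A`, `m`, and `beta` are illustrative.

```python
import numpy as np

# One design row [1, G, T, T*G] per (G, T) cell.
cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
A = np.array([[1, g, t, g * t] for g, t in cells], dtype=float)

# Group means of fte in the same (G, T) order, copied from the groupby table.
m = np.array([23.272823, 21.162064, 20.457145, 21.027396])

# Solve A @ beta = m for (beta_0, beta_1, beta_2, beta_3).
beta = np.linalg.solve(A, m)

for name, value in zip(["beta_0", "beta_1", "beta_2", "beta_3"], beta):
    print(f"{name} = {value:.4f}")
```

The last coefficient, beta_3, comes out at about 2.68, the same value as the difference-in-differences computed from the aggregated means.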

X = dataset[['NJ', 'POST_APRIL92', 'NJ_POST_APRIL92']]
y = dataset['fte'].values

import statsmodels.api as sm
X = sm.add_constant(X)
model1 = sm.OLS(y, X).fit()

print(model1.summary(yname="FTE",
                     xname=("intercept", "New Jersey", "After April 1992", "New Jersey and after April 1992"),
                     title="Model 1: FTE ~ NJ + POST_APRIL92 + NJ_POST_APRIL92"))

              Model 1: FTE ~ NJ + POST_APRIL92 + NJ_POST_APRIL92              
==============================================================================
Dep. Variable:                    FTE   R-squared:                       0.007
Model:                            OLS   Adj. R-squared:                  0.004
Method:                 Least Squares   F-statistic:                     1.974
Date:                Wed, 28 Dec 2022   Prob (F-statistic):              0.116
Time:                        20:11:03   Log-Likelihood:                -2986.2
No. Observations:                 820   AIC:                             5980.
Df Residuals:                     816   BIC:                             5999.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
===================================================================================================
                                      coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------------------------
intercept                          23.2728      1.041     22.349      0.000      21.229      25.317
New Jersey                         -2.8157      1.159     -2.430      0.015      -5.091      -0.541
After April 1992                   -2.1108      1.473     -1.433      0.152      -5.001       0.780
New Jersey and after April 1992     2.6810      1.639      1.636      0.102      -0.536       5.898
==============================================================================
Omnibus:                      232.659   Durbin-Watson:                   1.847
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              908.337
Skew:                           1.289   Prob(JB):                    5.72e-198
Kurtosis:                       7.465   Cond. No.                         11.4
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

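The coefficient on NJ_POST_APRIL92 (labeled "New Jersey and after April 1992") is 2.68, equal to the value found by the aggregation method for DiD. Its p-value of 0.102 and a 95% confidence interval that includes zero (-0.536 to 5.898) mean the estimate is not statistically significant at the 5% level.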
