International Journal of Data Science and Big Data Analytics (IJDSBDA)

ISSN: 2710-2599

Publisher: SvedbergOpen

Article ID: ijdsbda-1-2-004

DOI: 10.51483/IJDSBDA.1.2.2021.31-38

Research Paper

Restaurant tip prediction using linear regression

Alex Mirugwe¹*

¹Department of Statistical Sciences, Faculty of Science, University of Cape Town, Cape Town, South Africa. E-mail: Mrgale005@myuct.ac.za

*Corresponding author: Alex Mirugwe, Department of Statistical Sciences, Faculty of Science, University of Cape Town, Cape Town, South Africa. E-mail: Mrgale005@myuct.ac.za

May 2021, Volume 1, Issue 2, pp. 31-38

Abstract

The objective of this paper is to build a linear model that predicts the average tip, in dollars, that a waiter can expect to earn at a restaurant, given the following predictor variables: total bill paid, day, gender of the customer (sex), time of the party, smoker, and size of the party. The model is based on data collected by one waiter at a restaurant in California, who recorded information about each tip he received. The model can be applied at any restaurant with similar predictor variables to estimate the tip amount. The final analysis yielded a regression model with a minimum prediction Root Mean Square Error (RMSE) of 1.1815.

Keywords: Machine learning, Linear regression, Mean Squared Error (MSE), Root Mean Squared Error (RMSE)
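
As a minimal sketch of the approach summarised in the abstract, the following Python code fits a linear model by ordinary least squares and scores it with RMSE on held-out rows. It makes several assumptions: the data are synthetic stand-ins generated for illustration, not the waiter's records; only three numeric predictors are used; and the helper names `fit_linear`, `predict`, and `rmse` are hypothetical. The RMSE it prints is therefore unrelated to the value reported in this paper.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_linear(X, y):
    """Ordinary least squares fit; an intercept column is prepended to X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef  # coef[0] is the intercept


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to new rows of X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


# Synthetic stand-in data (assumption): NOT the data set used in the paper.
rng = np.random.default_rng(0)
n = 200
total_bill = rng.uniform(5.0, 50.0, n)   # bill in dollars
size = rng.integers(1, 7, n)             # party size, 1 to 6
smoker = rng.integers(0, 2, n)           # 0/1 indicator
X = np.column_stack([total_bill, size, smoker])
tip = 1.0 + 0.1 * total_bill + 0.2 * size + rng.normal(0.0, 1.0, n)

# Hold out the last 50 rows to estimate prediction error.
train, test = slice(0, 150), slice(150, n)
coef = fit_linear(X[train], tip[train])
test_rmse = rmse(tip[test], predict(coef, X[test]))
print("held-out RMSE:", round(test_rmse, 4))
```

Categorical predictors such as day, sex, and time would need to be encoded as indicator columns before being passed to `fit_linear`; that step is omitted here for brevity.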
