<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "C:\nlm\converter\journal-publishing-dtd-2.0\journalpublishing.dtd">
<article>
<front>
<journal-meta>
<journal-id journal-id-type="publisher">IJDSBDA</journal-id>
<journal-title>International Journal of Data Science and Big Data Analytics</journal-title>
<issn pub-type="epub">2710-2599</issn>
<publisher>
<publisher-name>SvedbergOpen</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="other">ijdsbda-1-2-004</article-id>
<doi-group>
<article-doi><ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.51483/IJDSBDA.1.2.2021.31-38">10.51483/IJDSBDA.1.2.2021.31-38</ext-link></article-doi>
</doi-group>
<article-categories>
<subj-group>
<subject>Research Paper</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Restaurant tip prediction using linear regression</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Mirugwe</surname><given-names>Alex</given-names></name>
<xref ref-type="aff" rid="aff001"><sup>1</sup></xref>
<xref ref-type="corresp" rid="cor001"><sup>*</sup></xref>
</contrib>
</contrib-group>
<aff id="aff001"><sup>1</sup><deptname>Department of Statistical Sciences, Faculty of Science, University of Cape Town</deptname>, <instaddress>Cape Town</instaddress>, <instcountry>South Africa</instcountry>. E-mail: <email>Mrgale005@myuct.ac.za</email></aff>
<author-notes>
<corresp id="cor001"><sup>*</sup>Corresponding author: Alex Mirugwe, <deptname>Department of Statistical Sciences, Faculty of Science, University of Cape Town</deptname>, <instaddress>Cape Town</instaddress>, <instcountry>South Africa</instcountry>. E-mail: <email>Mrgale005@myuct.ac.za</email></corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>05</month>
<year>2021</year>
</pub-date>
<volume>1</volume>
<issue>2</issue>
<fpage>31</fpage>
<lpage>38</lpage>
<abstract>
<title>Abstract</title>
<p>The objective of this paper is to build a linear model for predicting the average tip, in dollars, that a waiter can expect to earn at a restaurant given the following predictor variables: total bill paid, day, gender of the customer (sex), time of the party, smoker status, and size of the party. The model is based on data collected by one waiter at a restaurant in California, who recorded information about each tip he received. The model can be applied at any restaurant with similar predictor variables to estimate the tip amount. The analysis produced a regression model with a minimum prediction Root Mean Squared Error (RMSE) of 1.1815.</p>
</abstract>
<kwd-group>
<title>Keywords</title>
<kwd>Machine learning</kwd>
<kwd>Linear regression</kwd>
<kwd>Mean Squared Error (MSE)</kwd>
<kwd>Root Mean Squared Error (RMSE)</kwd>
</kwd-group>
<counts>
<ref-count count="11"/>
<page-count count="8"/>
</counts>
</article-meta>
</front>
<back>
<ref-list>
<title>References</title>
<ref id="bib001"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Berk</surname><given-names>R.A.</given-names></name></person-group> (<year>2020</year>). <source>Statistical Learning from a Regression Perspective</source>. <publisher-name>Springer International Publishing</publisher-name>.</citation></ref>
<ref id="bib002"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Briscoe</surname><given-names>E.</given-names></name><name><surname>Feldman</surname><given-names>J.</given-names></name></person-group> (<year>2011</year>). <article-title>Conceptual complexity and the bias/variance tradeoff</article-title>. <source>Cognition</source>, <volume>118</volume> (<issue>1</issue>), <fpage>2</fpage>&#x2013;<lpage>16</lpage>.</citation></ref>
<ref id="bib003"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Doan</surname><given-names>T.</given-names></name><name><surname>Kalita</surname><given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>Selecting machine learning algorithms using regression models</article-title>. <source>Proc. - 15th IEEE Int. Conf. Data Min. Work. ICDMW 2015</source>, pp. <fpage>1498</fpage>&#x2013;<lpage>1505</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ICDMW.2015.43</pub-id>.</citation></ref>
<ref id="bib004"><citation citation-type="book"><person-group person-group-type="author"><name><surname>James</surname><given-names>G.</given-names></name><name><surname>Witten</surname><given-names>D.</given-names></name><name><surname>Hastie</surname><given-names>T.</given-names></name><name><surname>Tibshirani</surname><given-names>R.</given-names></name></person-group> (<year>2013</year>). <source>An introduction to statistical learning</source>, Vol. <volume>112</volume>. <publisher-name>Springer</publisher-name>.</citation></ref>
<ref id="bib005"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jarque</surname><given-names>C.M.</given-names></name><name><surname>Bera</surname><given-names>A.K.</given-names></name></person-group> (<year>1980</year>). <article-title>Efficient tests for normality, homoscedasticity and serial independence of regression residuals</article-title>. <source>Econ. Lett.</source>, <volume>6</volume> (<issue>3</issue>), <fpage>255</fpage>&#x2013;<lpage>259</lpage>.</citation></ref>
<ref id="bib006"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Kavitha</surname><given-names>S.</given-names></name><name><surname>Varuna</surname><given-names>S.</given-names></name><name><surname>Ramya</surname><given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>A comparative analysis on linear regression and support vector regression</article-title>. <source>Proc. 2016 Online Int. Conf. Green Eng. Technol. IC-GET 2016</source>. doi: <pub-id pub-id-type="doi">10.1109/GET.2016.7916627</pub-id>.</citation></ref>
<ref id="bib007"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Kologlu</surname><given-names>Y.</given-names></name><name><surname>Birinci</surname><given-names>H.</given-names></name><name><surname>Kanalmaz</surname><given-names>S.I.</given-names></name><name><surname>&#x00D6;zy&#x0131;lmaz</surname><given-names>B.</given-names></name></person-group> (<year>2018</year>). <article-title>A multiple linear regression approach for estimating the market value of football players in forward position</article-title>. <source>arXiv</source>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>.</citation></ref>
<ref id="bib008"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lim</surname><given-names>H. Il.</given-names></name></person-group> (<year>2019</year>). <article-title>A linear regression approach to modeling software characteristics for classifying similar software</article-title>. <source>Proc. - Int. Comput. Softw. Appl. Conf.</source>, <volume>1</volume>, <fpage>942</fpage>&#x2013;<lpage>943</lpage>. doi: <pub-id pub-id-type="doi">10.1109/COMPSAC.2019.00152</pub-id>.</citation></ref>
<ref id="bib009"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Neal</surname><given-names>B.</given-names></name></person-group> <etal>et al.</etal> (<year>2018</year>). <article-title>A modern take on the bias-variance tradeoff in neural networks</article-title>. <source>arXiv preprint arXiv:1810.08591</source>.</citation></ref>
<ref id="bib0010"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Shalev-Shwartz</surname><given-names>S.</given-names></name><name><surname>Ben-David</surname><given-names>S.</given-names></name></person-group> (<year>2014</year>). <source>Understanding machine learning: From theory to algorithms</source>. <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="bib0011"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>K.</given-names></name><name><surname>Li</surname><given-names>X.</given-names></name><name><surname>Chen</surname><given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>Integration of machine learning and human learning for training optimization in robust linear regression</article-title>. <source>ICASSP</source>, <fpage>2613</fpage>&#x2013;<lpage>2617</lpage>.</citation></ref>
</ref-list>
</back>
</article>