Chapter 15 References

Ackley, David H, Geoffrey E Hinton, and Terrence J Sejnowski. 1985. “A Learning Algorithm for Boltzmann Machines.” Cognitive Science.

“AI and Compute.” 2019. https://openai.com/blog/ai-and-compute.

“Apache Solr.” 2019. http://lucene.apache.org/solr/.

“Apache Spark and CERN Open Data Analysis, an Example.” 2017. https://bit.ly/2KoTGlc.

“Apache Spark Officially Sets a New Record in Large-Scale Sorting.” 2014. https://bit.ly/2r95oHX.

“AutoML: Automatic Machine Learning.” 2019. http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html.

“Azure Wikipedia.” 2018. https://en.wikipedia.org/wiki/Microsoft_Azure.

“Big Compute Nimbix.” 2019. https://www.nimbix.net/glossary/big-compute/.

“Big Compute Vs Big Data.” 2013. https://bit.ly/2FhjStV.

“Big Data Wikipedia.” 2019. https://bit.ly/2XnLHec.

“Bioinformatics Applications on Apache Spark.” 2018. https://bit.ly/2KUZEdb.

Ceruzzi, Paul E. 2012. Computing: A Concise History. MIT Press.

Chang, Winston. 2012. R Graphics Cookbook: Practical Recipes for Visualizing Data. O’Reilly Media, Inc.

Chollet, Francois, and J.J. Allaire. 2018. Deep Learning with R. Manning Publications.

Cleveland, William S. 2001. “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.”

“Cloudera Wikipedia.” 2018. https://en.wikipedia.org/wiki/Cloudera.

Codd, Edgar F. 1970. “A Relational Model of Data for Large Shared Data Banks.” ACM.

Cook, Darren. 2016. Practical Machine Learning with H2O: Powerful, Scalable Techniques for Deep Learning and AI. O’Reilly Media, Inc.

“CRAN - Package Sparklyr.” 2019. https://cran.r-project.org/web/packages/sparklyr/index.html.

“Databricks Community Edition.” 2019. https://community.cloud.databricks.com.

“Databricks Documentation.” 2018. https://docs.databricks.com/spark/latest/sparkr/sparklyr.html.

“Databricks Wikipedia.” 2018. https://en.wikipedia.org/wiki/Databricks.

“Dataproc Wikipedia.” 2018. https://en.wikipedia.org/wiki/Google_Cloud_Dataproc.

Dean, Jeffrey, and Sanjay Ghemawat. 2004. “MapReduce: Simplified Data Processing on Large Clusters.” In USENIX Symposium on Operating System Design and Implementation (OSDI).

Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung. 2003. “The Google File System.” In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles. New York, NY, USA: ACM.

Greenacre, Michael. 2017. Correspondence Analysis in Practice. Chapman and Hall/CRC.

World Bank Group. 2016. The Data Revolution. World Bank Publications.

“Higgs Boson Machine Learning Challenge.” 2019. https://www.kaggle.com/c/higgs-boson.

Hinton, Geoffrey E, Simon Osindero, and Yee-Whye Teh. 2006. “A Fast Learning Algorithm for Deep Belief Nets.” Neural Computation 18 (7): 1527–54.

“Hortonworks Microsoft.” 2018. https://hortonworks.com/partner/microsoft/.

“Hortonworks Wikipedia.” 2018. https://en.wikipedia.org/wiki/Hortonworks.

“Human Genome.” 2019. https://en.wikipedia.org/wiki/Human_genome.

“IBM Cloud Wikipedia.” 2018. https://en.wikipedia.org/wiki/IBM_cloud_computing.

Kim, Albert Y, and Adriana Escobedo-Land. 2015. “OKCupid Data for Introductory Statistics and Data Science Courses.” Journal of Statistics Education 23 (2).

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” In Advances in Neural Information Processing Systems, 1097–1105.

Kuhn, Max, and Kjell Johnson. 2019. Feature Engineering and Selection: A Practical Approach for Predictive Models. Chapman and Hall/CRC.

Laudon, Kenneth C, Carol Guercio Traver, and Jane P Laudon. 1996. “Information Technology and Systems.” Cambridge, MA: Course Technology.

“MapR Wikipedia.” 2018. https://en.wikipedia.org/wiki/MapR.

“Maven Repository: Repositories.” 2019. https://mvnrepository.com/repos.

Minsky, Marvin, and Seymour A Papert. 2017. Perceptrons: An Introduction to Computational Geometry. MIT Press.

“Netflix at Spark.” 2018. https://bit.ly/2MUlqwf.

“Profvis.” 2018. https://rstudio.github.io/profvis/.

Rosenblatt, Frank. 1958. “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain.” Psychological Review.

“RSparkling — H2O Sparkling Water 2.3.31 Documentation.” 2019. http://docs.h2o.ai/sparkling-water/2.3/latest-stable/doc/rsparkling.html.

“RStudio Connect.” 2019. https://www.rstudio.com/products/connect/.

“RStudio Profiler.” 2018. https://bit.ly/2RqJPw8.

“RStudio Server Pro.” 2019. https://www.rstudio.com/products/rstudio-server-pro/.

“Running Spark on Mesos.” 2018. https://spark.apache.org/docs/latest/running-on-mesos.html.

“Running Spark on Yarn.” 2018. https://spark.apache.org/docs/latest/running-on-yarn.html.

Samuel, Arthur L. 1959. “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development 3 (3): 210–29.

Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. O’Reilly Media, Inc.

“Sort Benchmark.” 2014. http://sortbenchmark.org/.

“Spark-Solr Spark Package.” 2019. https://spark-packages.org/package/LucidWorks/spark-solr.

“Spark Wins Cloudsort Benchmark as the Most Efficient Engine.” 2016. https://bit.ly/2DexBmm.

“Spark with R in Gitter.” 2019. https://gitter.im/rstudio/sparklyr.

“The History of R’s Predecessor, S, from Co-Creator Rick Becker.” 2016. https://bit.ly/2MSTm0j.

Merriam-Webster. 2006. “Merriam-Webster Online Dictionary.” Merriam-Webster.

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

Wu, C.F. Jeff. 1997. “Statistics = Data Science?”

Xie, Yihui, J.J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. 1st ed. CRC Press.

Zaharia, Matei, Mosharaf Chowdhury, Michael J Franklin, Scott Shenker, and Ion Stoica. 2010. “Spark: Cluster Computing with Working Sets.” In Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud ’10).

Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20.