New Book Announcement
Machine Learning with R Cookbook (R语言机器学习参考手册)
Published: 2016-05-12

 

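[Description]
  R is a powerful open-source functional programming language. At its core, R is a statistical programming language that provides a rich set of tools for analyzing data and creating advanced graphics.
  Machine Learning with R Cookbook (R语言机器学习参考手册, English-language reprint edition) by Yu-Wei Chiu (丘祐玮) introduces the basics of R by setting up a user-friendly programming environment and performing data ETL in R. It provides data exploration examples that show how R's data visualization and machine learning features can uncover hidden relationships. You will then work through important machine learning topics in depth, including classification, regression, clustering, association rule mining, and dimension reduction.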
[Table of Contents]
Preface
Chapter 1: Practical Machine Learning with R
  Introduction
  Downloading and installing R
  Downloading and installing RStudio
  Installing and loading packages
  Reading and writing data
  Using R to manipulate data
  Applying basic statistics
  Visualizing data
  Getting a dataset for machine learning
Chapter 2: Data Exploration with RMS Titanic
  Introduction
  Reading a Titanic dataset from a CSV file
  Converting types on character variables
  Detecting missing values
  Imputing missing values
  Exploring and visualizing data
  Predicting passenger survival with a decision tree
  Validating the power of prediction with a confusion matrix
  Assessing performance with the ROC curve
Chapter 3: R and Statistics
  Introduction
  Understanding data sampling in R
  Operating a probability distribution in R
  Working with univariate descriptive statistics in R
  Performing correlations and multivariate analysis
  Operating linear regression and multivariate analysis
  Conducting an exact binomial test
  Performing Student's t-test
  Performing the Kolmogorov-Smirnov test
  Understanding the Wilcoxon Rank Sum and Signed Rank test
  Working with Pearson's Chi-squared test
  Conducting a one-way ANOVA
  Performing a two-way ANOVA
Chapter 4: Understanding Regression Analysis
  Introduction
  Fitting a linear regression model with lm
  Summarizing linear model fits
  Using linear regression to predict unknown values
  Generating a diagnostic plot of a fitted model
  Fitting a polynomial regression model with lm
  Fitting a robust linear regression model with rlm
  Studying a case of linear regression on SLID data
  Applying the Gaussian model for generalized linear regression
  Applying the Poisson model for generalized linear regression
  Applying the Binomial model for generalized linear regression
  Fitting a generalized additive model to data
  Visualizing a generalized additive model
  Diagnosing a generalized additive model
Chapter 5: Classification (I) - Tree, Lazy, and Probabilistic
  Introduction
  Preparing the training and testing datasets
  Building a classification model with recursive partitioning trees
  Visualizing a recursive partitioning tree
  Measuring the prediction performance of a recursive partitioning tree
  Pruning a recursive partitioning tree
  Building a classification model with a conditional inference tree
  Visualizing a conditional inference tree
  Measuring the prediction performance of a conditional inference tree
  Classifying data with the k-nearest neighbor classifier
  Classifying data with logistic regression
  Classifying data with the Naive Bayes classifier
Chapter 6: Classification (II) - Neural Network and SVM
  Introduction
  Classifying data with a support vector machine
  Choosing the cost of a support vector machine
  Visualizing an SVM fit
  Predicting labels based on a model trained by a support vector machine
  Tuning a support vector machine
  Training a neural network with neuralnet
  Visualizing a neural network trained by neuralnet
  Predicting labels based on a model trained by neuralnet
  Training a neural network with nnet
  Predicting labels based on a model trained by nnet
Chapter 7: Model Evaluation
  Introduction
  Estimating model performance with k-fold cross-validation
  Performing cross-validation with the e1071 package
  Performing cross-validation with the caret package
  Ranking the variable importance with the caret package
  Ranking the variable importance with the rminer package
  Finding highly correlated features with the caret package
  Selecting features using the caret package
  Measuring the performance of the regression model
  Measuring prediction performance with a confusion matrix
  Measuring prediction performance using ROCR
  Comparing an ROC curve using the caret package
  Measuring performance differences between models with the caret package
Chapter 8: Ensemble Learning
  Introduction
  Classifying data with the bagging method
  Performing cross-validation with the bagging method
  Classifying data with the boosting method
  Performing cross-validation with the boosting method
  Classifying data with gradient boosting
  Calculating the margins of a classifier
  Calculating the error evolution of the ensemble method
  Classifying data with random forest
  Estimating the prediction errors of different classifiers
Chapter 9: Clustering
  Introduction
  Clustering data with hierarchical clustering
  Cutting trees into clusters
  Clustering data with the k-means method
  Drawing a bivariate cluster plot
  Comparing clustering methods
  Extracting silhouette information from clustering
  Obtaining the optimum number of clusters for k-means
  Clustering data with the density-based method
  Clustering data with the model-based method
  Visualizing a dissimilarity matrix
  Validating clusters externally
Chapter 10: Association Analysis and Sequence Mining
  Introduction
  Transforming data into transactions
  Displaying transactions and associations
  Mining associations with the Apriori rule
  Pruning redundant rules
  Visualizing association rules
  Mining frequent itemsets with Eclat
  Creating transactions with temporal information
  Mining frequent sequential patterns with cSPADE
Chapter 11: Dimension Reduction
  Introduction
  Performing feature selection with FSelector
  Performing dimension reduction with PCA
  Determining the number of principal components using the scree test
  Determining the number of principal components using the Kaiser method
  Visualizing multivariate data using biplot
  Performing dimension reduction with MDS
  Reducing dimensions with SVD
  Compressing images with SVD
  Performing nonlinear dimension reduction with ISOMAP
  Performing nonlinear dimension reduction with Local Linear Embedding
Chapter 12: Big Data Analysis (R and Hadoop)
  Introduction
  Preparing the RHadoop environment
  Installing rmr2
  Installing rhdfs
  Operating HDFS with rhdfs
  Implementing a word count problem with RHadoop
  Comparing the performance between an R MapReduce program and a standard R program
  Testing and debugging the rmr2 program
  Installing plyrmr
  Manipulating data with plyrmr
  Conducting machine learning with RHadoop
  Configuring RHadoop clusters on Amazon EMR
Appendix A: Resources for R and Machine Learning
Appendix B: Dataset - Survival of Passengers on the Titanic
Index

 



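Copyright: Xi'an Jiaotong University Library. Design and production: Xi'an Jiaotong University Data and Information Center.
Address: No. 28 Xianning West Road, Beilin District, Xi'an, Shaanxi Province. Postal code: 710049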
