Feature Engineering for Machine Learning_Principles and Techniques for Data Scientists(2018.03).A4
2022-11-18 14:57:30 6.16MB Machine Learning Feature Engineering
Feature Engineering for Machine Learning and Data Analytics (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
ISBN-10: 1138744387
ISBN-13: 9781138744387
Edition: 1
Publication date: 2018-04-04
Pages: 418

Chapter 1 Preliminaries and Overview (Guozhu Dong and Huan Liu)

Part I Feature Engineering for Various Data Types
Chapter 2 Feature Engineering for Text Data (Chase Geigle, Qiaozhu Mei, and ChengXiang Zhai)
Chapter 3 Feature Extraction and Learning for Visual Data (Parag S. Chandakkar, Ragav Venkatesan, and Baoxin Li)
Chapter 4 Feature-Based Time-Series Analysis (Ben D. Fulcher)
Chapter 5 Feature Engineering for Data Streams (Yao Ma, Jiliang Tang, and Charu Aggarwal)
Chapter 6 Feature Generation and Feature Engineering for Sequences (Guozhu Dong, Lei Duan, Jyrki Nummenmaa, and Peng Zhang)
Chapter 7 Feature Generation for Graphs and Networks (Yuan Yao, Hanghang Tong, Feng Xu, and Jian Lu)

Part II General Feature Engineering Techniques
Chapter 8 Feature Selection and Evaluation (Yun Li and Tao Li)
Chapter 9 Automating Feature Engineering in Supervised Learning (Udayan Khurana)
Chapter 10 Pattern-Based Feature Generation (Yunzhe Jia, James Bailey, Ramamohanarao Kotagiri, and Christopher Leckie)
Chapter 11 Deep Learning for Feature Representation (Suhang Wang and Huan Liu)

Part III Feature Engineering in Special Applications
Chapter 12 Feature Engineering for Social Bot Detection (Onur Varol, Clayton A. Davis, Filippo Menczer, and Alessandro Flammini)
Chapter 13 Feature Generation and Engineering for Software Analytics (Xin Xia and David Lo)
Chapter 14 Feature Engineering for Twitter-Based Applications (Sanjaya Wijeratne, Amit Sheth, Shreyansh Bhatt, Lakshika Balasuriya, Hussein S. Al-Olimat, Manas Gaur, Amir Hossein Yazdavar, Krishnaprasad Thirunarayan)

Index
2022-11-18 14:53:08 22.18MB Machine lear
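sklearn-feature-engineering

Preface: I recently took part in several Kaggle competitions and found that feature engineering is a very important part of them. sklearn is the most commonly used and handiest tool, bar none, for feature engineering (and for building models and tuning algorithms), so I have summarized some of my experience to share with everyone, and I hope it helps. You can also read it on my blog.

There is a saying widely circulated in the industry: data and features determine the upper limit of machine learning, while models and algorithms merely approach that limit. So what exactly is feature engineering? As the name suggests, it is essentially an engineering activity whose goal is to extract as many useful features as possible from the raw data for algorithms and models to use.

Feature engineering is mainly divided into three parts:
Data preprocessing. Corresponding sklearn package:
Feature selection. Corresponding sklearn package:
Dimensionality reduction. Corresponding sklearn package:

This article uses the IRIS dataset that ships with sklearn to illustrate the feature processing functions. First, the code to import the IRIS dataset is as follows: from sklearn.datasets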
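The three parts named above can be sketched on the IRIS data as follows. This is a minimal illustration, not the blog author's code: the listing does not say which sklearn modules the post links to, so the choice of `StandardScaler`, `SelectKBest` with `f_classif`, and `PCA`, and the value `2` for the number of kept features and components, are assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler

# Load the IRIS data: 150 samples, 4 numeric features, 3 classes.
iris = load_iris()
X, y = iris.data, iris.target

# 1. Data preprocessing: rescale each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# 2. Feature selection: keep the 2 features with the highest ANOVA F-score
#    against the class labels (k=2 is an arbitrary choice for illustration).
X_selected = SelectKBest(score_func=f_classif, k=2).fit_transform(X_scaled, y)

# 3. Dimensionality reduction: project the scaled features onto
#    2 principal components (again an arbitrary choice).
X_reduced = PCA(n_components=2).fit_transform(X_scaled)

print(X.shape, X_selected.shape, X_reduced.shape)
```

Feature selection keeps a subset of the original columns, while dimensionality reduction builds new columns as combinations of all of them, which is why the two steps are listed separately even though both shrink the feature matrix.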
2022-04-25 12:37:34 8KB sklearn kaggle feature-engineering Python
A feature engineering book by an expert; very comprehensive.
2022-01-27 09:37:32 838KB feature engineer
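House Price Prediction

The Ames housing dataset is taken from a Kaggle competition. The goal of this project is to predict the prices of houses in the Boston Housing Dataset. Two files are provided, train and test, and the prices for the test data are to be estimated. Here I have used XGBoost for the prediction. Thanks to Krish Naik for making the wonderful videos that helped in understanding and implementing house price prediction. Later I will add exploratory data analysis and compare the results of the XGBoost model with other regression techniques.

Steps for house price prediction:
1. Loading the data
2. Data exploration
   2.1 Features with null values
   2.2 Numerical features
       2.2.1 Year features
       2.2.2 Discrete features
       2.2.3 Continuous features
   2.3 Categorical features
3. Data cleaning
4. Data transformation
   4.1 Handling rare categorical features
5. Base model performance (XGBoost)
6. Hyperparameter tuning
7. Final model
8. Visualizing results

1. Loading the data
df = pd.read_csv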
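Step 4.1 in the outline above, handling rare categorical features, is commonly done by merging infrequent categories into a single placeholder label. The project's own code is not shown in this listing, so the following is only a sketch under assumptions: the function name `group_rare_categories`, the 1% threshold, and the label `"Rare_var"` are all hypothetical, and plain Python lists stand in for a DataFrame column.

```python
from collections import Counter


def group_rare_categories(values, min_fraction=0.01, rare_label="Rare_var"):
    """Replace categories seen in fewer than min_fraction of rows with rare_label.

    The threshold and the label are illustrative defaults, not values taken
    from the project described above.
    """
    counts = Counter(values)
    total = len(values)
    # Categories that occur often enough to keep as they are.
    frequent = {cat for cat, n in counts.items() if n / total >= min_fraction}
    return [v if v in frequent else rare_label for v in values]


# Hypothetical column: 'A' and 'B' are common, 'C' appears once in 200 rows.
column = ["A"] * 120 + ["B"] * 79 + ["C"]
grouped = group_rare_categories(column)
print(Counter(grouped))
```

Merging rare levels before encoding keeps the number of generated columns small and avoids categories that appear in the test file but were almost never seen in training.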
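Feature Engineering for Machine Learning and Data Analytics: original English PDF, no watermarks. All pages of the PDF have been tested and open correctly in FoxitReader, PDF-XChangeViewer, SumatraPDF, and Firefox. This resource was reposted from the internet; in case of infringement, please contact the uploader or CSDN to have it removed. For details about the book, search for it on Amazon's US site.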
2021-11-28 17:28:23 22.33MB Feature Engineering Machine Learning
The Machine Learning Pipeline
  Data
  Tasks
  Models
  Features

2. Basic Feature Engineering for Text Data: Flatten and Filter
  Turning Natural Text into Flat Vectors
  Bag-of-words
  Implementing bag-of-words: parsing and tokenization
  Bag-of-N-Grams
  Collocation Extraction for Phrase Detection
  Quick summary
  Filtering for Cleaner Features
  Stopwords
  Frequency-based filtering
  Stemming
  Summary

3. The Effects of Feature Scaling: From Bag-of-Words to Tf-Idf
  Tf-Idf: A Simple Twist on Bag-of-Words
  Feature Scaling
  Min-max scaling
  Standardization (variance scaling)
  L2 normalization
  Putting it to the Test
  Creating a classification dataset
  Implementing tf-idf and feature scaling
  First try: plain logistic regression
  Second try: logistic regression with regularization
  Discussion of results
  Deep Dive: What is Happening?
2021-11-18 10:03:35 3.63MB ML Feature data
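Heart Rate Variability analysis

hrvanalysis is a Python module for heart rate variability analysis of RR intervals, built on top of SciPy, AstroPy, Nolds, and NumPy and distributed under the GPLv3 license. Development of the library started in July 2018 as part of a research and development team's project, and it is maintained by Robin Champseix.

Full documentation:
Website:
GitHub:
Version: 1.0.4

Installation / Prerequisites

User installation
The easiest way to install hrv-analysis is with pip:
  $ pip install hrv-analysis
You can also clone the repository:
  $ git clone https://github.com/Aura-healthcare/hrv-analysis.git
  $ python setup.py install

Dependencies
hrvanalysis requires the following: Python (>=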
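To make "heart rate variability analysis of RR intervals" concrete, the sketch below computes three widely used time-domain measures with the standard library only. It is not the hrvanalysis API: the function name `time_domain_features`, the dictionary keys, and the sample RR values are hypothetical, and SDNN is computed here as the sample standard deviation (n - 1 in the denominator), which is an assumption about convention.

```python
import math
import statistics


def time_domain_features(rr_intervals_ms):
    """Compute basic time-domain HRV measures from RR intervals in milliseconds."""
    # Differences between each pair of successive RR intervals.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    squared_diffs = [d * d for d in diffs]
    return {
        # Mean RR interval.
        "mean_rr": statistics.fmean(rr_intervals_ms),
        # SDNN: sample standard deviation of the RR intervals.
        "sdnn": statistics.stdev(rr_intervals_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd": math.sqrt(statistics.fmean(squared_diffs)),
    }


# Hypothetical RR intervals in milliseconds, for illustration only.
rr = [800, 810, 790, 805, 795]
features = time_domain_features(rr)
print(features)
```

SDNN summarizes overall variability across the whole recording, while RMSSD reacts to beat-to-beat changes, so the two are usually reported together.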
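Recent updates
Exploring the use of [...]. Model stacking, TBC.

Updates
Note: updates to this project are listed here, but the model results in the report remain as they were on Dec 3, 2020.
2020.12.15: Tried Sentence-BERT (Siamese BERT) to improve Model 3; it is no better than BERT with Siamese BiLSTM features.
2020.12.15: Tried ESIM.

Documentation for the duplicate question pair identification project
Author: YUAN Yan Zhe, yanzheyuan23@sina.com, written on DEC 3rd, 2020
Collaborators: WEN Ze @WENZe79, YU Jia Hui @YUJIAHUII

Project description
Text similarity is a hot topic in natural language processing (NLP). Measuring the similarity between sentences or phrases is especially important in some NLP subfields, such as dialogue systems and information retrieval. Quora Question P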
Sanet.st.Feature Engineering Made Easy - Sinan Ozdemir.epub (official release)

This book will cover the topic of feature engineering. A huge part of the data science and machine learning pipeline, feature engineering includes the ability to identify, clean, construct, and discover new characteristics of data for the purpose of interpretation and predictive analysis. In this book, we will be covering the entire process of feature engineering, from inspection to visualization, transformation, and beyond. We will be using both basic and advanced mathematical measures to transform our data into a form that's much more digestible by machines and machine learning pipelines. By discovering and transforming, we, as data scientists, will be able to gain a whole new perspective on our data, enhancing not only our algorithms but also our insights.
2019-12-21 21:23:23 5.51MB Feature Engineering