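IBM HR Employee Attrition. The data is taken from here. The main business problem addressed is how to build a system that helps a large company control attrition by identifying which employees are likely to leave, so that they can be offered incentives to stay. How to navigate? Note: the project uses only Python 3.X and Tableau 10.0 or later for analysis. PPT: contains the business problem and its conversion to a data science problem. Tableau: EDA insights. Feature selection. Various classification models. Final PPT: explanation. Report. Installation: $ pip install imblearn # For Smote Problem statement: Our client, ABC, is a leading company that performs well in its field. Recently, its employee attrition has risen sharply: over the past year, the attrition rate has gone from 14% to 25%. We have been asked to devise a strategy that addresses the problem immediately, so that it does not affect the company's business, and to propose an effective long-term employee satisfaction program. Currently, no such program exists. Further pay raises are not an option. The slides are at Exploratory data analysis: The data is imbalanced; 83% of the employees have not yet left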
2024-10-11 07:03:26 16.14MB python data-science data random-forest
1
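The IBM HR attrition entry above notes that the data is imbalanced, with 83% of employees not having left. A minimal sketch of why that matters, assuming hypothetical labels (not the real dataset) with the same 83/17 split: a model that always predicts "stays" looks accurate yet finds no leavers, which is why the project installs a resampling tool such as SMOTE.

```python
# Hypothetical labels, not the real dataset: 1 = employee left, 0 = stayed.
# The 83/17 split mirrors the imbalance mentioned in the description.
labels = [0] * 83 + [1] * 17

# A trivial "model" that always predicts the majority class (stayed).
predictions = [0] * len(labels)

# Accuracy: share of predictions that match the true label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on leavers: share of actual leavers that the model flags.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

On such data, recall (or a similar per-class metric) is a more informative target than raw accuracy when comparing classification models.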
Thoughtful Data Science: A Programmer's Toolset for Data Analysis and Artificial Intelligence with Python, Jupyter Notebook, and PixieDust Bridge the gap between developer and data scientist by creating a modern open-source, Python-based toolset that works with Jupyter Notebook, and PixieDust. Key Features Think deeply as a developer about your strategy and toolset in data science Discover the best tools that will suit you as a developer in your data analysis Accelerate the road to data insight as a programmer using Jupyter Notebook Deep dive into multiple industry data science use cases Book Description Thoughtful Data Science brings new strategies and a carefully crafted programmer's toolset to work with modern, cutting-edge data analysis. This new approach is designed specifically to give developers more efficiency and power to create cutting-edge data analysis and artificial intelligence insights. Industry expert David Taieb bridges the gap between developers and data scientists by creating a modern open-source, Python-based toolset that works with Jupyter Notebook, and PixieDust. You'll find the right balance of strategic thinking and practical projects throughout this book, with extensive code files and Jupyter projects that you can integrate with your own data analysis. David Taieb introduces four projects designed to connect developers to important industry use cases in data science. The first is an image recognition application with TensorFlow, to meet the growing importance of AI in data analysis. The second analyses social media trends to explore big data issues and natural language processing. The third is a financial portfolio analysis application using time series analysis, pivotal in many data science applications today. The fourth involves applying graph algorithms to solve data problems. Taieb wraps up with a deep look into the future of data science for developers and his views on AI for data science. 
What you will learn Bridge the gap between developer and data scientist with a Python-based toolset Get the most out of Jupyter Notebooks with new productivity-enhancing tools Explore and visualize data using Jupyter Notebooks and PixieDust Work with and assess the impact of artificial intelligence in data science Work with TensorFlow, graphs, natural language processing, and time series Deep dive into multiple industry data science use cases Look into the future of data analysis and where to develop your skills Who this book is for This book is for established developers who want to bridge the gap between programmers and data scientists. With the introduction of PixieDust from its creator, the book will also be a great desk companion for the already accomplished Data Scientist. Some fluency in data interpretation and visualization is also assumed since this book addresses data professionals such as business and general data analysts. It will be helpful to have some knowledge of Python, using Python libraries, and some proficiency in web development. Table of Contents Chapter 1 Perspectives on Data Science from a Developer Chapter 2 Data Science at Scale with Jupyter Notebooks and PixieDust Chapter 3 PixieApp under the Hood Chapter 4 Deploying PixieApps to the Web with the PixieGateway Server Chapter 5 Best Practices and Advanced PixieDust Concepts Chapter 6 Image Recognition with TensorFlow Chapter 7 Big Data Twitter Sentiment Analysis Chapter 8 Financial Time Series Analysis and Forecasting Chapter 9 US Domestic Flight Data Analysis Using Graphs Chapter 10 Final Thoughts
2024-07-28 12:25:03 22.87MB Data  Science AI  Financial
1
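The Python Data Science Handbook by Jake VanderPlas is an authoritative guide to data science and machine learning tools, aimed at scientists and data analysts who are already familiar with Python programming. The 2023 edition is fully updated and is intended to help readers master the core tools for data analysis in Python. 1. **IPython and Jupyter**: IPython is an interactive computing environment, and Jupyter Notebook is a web-based interface that lets scientists write and present code, data, and visualizations interactively. Together they give data scientists a powerful and flexible working platform that supports multiple languages and makes collaboration and documentation easier. 2. **NumPy**: NumPy is a core Python library that provides the multidimensional `ndarray` data structure for storing and processing large arrays efficiently. It also includes a library of mathematical functions and supports vector and matrix operations, making it the foundation for numerical computing. 3. **Pandas**: Pandas is a data processing library built on top of NumPy. Its DataFrame object offers an efficient way to organize and manipulate structured or labeled data. DataFrames let users clean, transform, and merge data easily, which makes them well suited to data preprocessing. 4. **Matplotlib**: Matplotlib is the most commonly used plotting library in Python and supports static, animated, and interactive visualizations. It provides a MATLAB-like API, can draw 2D and 3D graphics, and supports customizing colors, styles, labels, and other elements to meet complex visualization needs. 5. **Scikit-Learn**: Scikit-Learn is a widely used machine learning library for Python that provides a large number of ready-made algorithms, covering supervised learning (such as classification and regression) and unsupervised learning (such as clustering). Its API is concise, which makes building and evaluating machine learning models straightforward. 6. **Other related tools**: Beyond the tools above, the book may also cover other supporting tools, such as libraries for larger-scale data processing (for example Dask and PySpark), Statsmodels for statistical analysis, and TensorFlow and Keras for deep learning. With this book, readers will be able to: - Use IPython and Jupyter Notebook for efficient data exploration and analysis. - Store, clean, transform, and manipulate data with NumPy and Pandas. - Create a variety of charts with Matplotlib to express data visually. - Understand and apply Scikit-Learn to build machine learning models, including training, validating, and tuning them. - Explore and integrate other related tools to extend the Python data science toolbox. The author, Jake VanderPlas, has extensive experience. He works as a software engineer at Google Research, focusing on tools that support data-intensive research, including Python libraries such as Scikit-Learn, which keeps the book's content both practical and current. The book is an essential reference for Python data scientists, useful to beginners and experienced professionals alike.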
2024-07-24 11:37:14 19.7MB python
1
Python Data Science Handbook (English version)
2024-07-24 11:30:15 20.47MB python
1
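Phishing_Website_Detection: This project detects fraudulent phishing websites using a random forest classification algorithm. It is implemented in the Python programming language with the Django framework.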
2024-05-20 11:25:47 53KB python security data-science machine-learning
1
R for Data Science
2024-02-20 21:27:03 32.31MB R Data Science
1
California fire zone analysis: the Los Angeles Times' analysis of California buildings located within fire hazard zones
2024-02-03 21:50:48 1.2GB python data-science news jupyter-notebook
1
data_science_byFLW-master.zip
2023-11-26 17:33:00 41.4MB python
1
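Python Cheat Sheets. This is a growing list of Python cheat sheets. Found a typo or have a suggestion? Fork it, contribute, and adapt it to your taste! Currently includes: Installation: If you want to install packages individually, go to the corresponding .md file for installation instructions. From the project root: Via pip: $ pip install -r requirements.txt Via Anaconda: $ conda install --yes --file requirements.txt This installs all packages into your environment. Future additions: basic Python syntax, (probably) pandas, PyBind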
2023-11-23 23:09:13 535KB python data-science numpy scikit-learn
1
A polished set of cheat sheets for commonly used Python libraries, 7 in total, covering numpy, scipy, pandas, matplotlib, and more
2023-11-19 06:02:05 1.89MB Python DataScience Numpy Pandas
1