写在前面 ETL Pipeline 学习资源 Ref: 使用 AWS Glue 和 Amazon Athena 实现无服务器的自主型机器学习 Ref: AWS Glue 常见问题 Extract is the process of reading data from a database. In this stage, the data is collected, often from multiple and different types of sources. Transform is t…
Data Engineering Data Pipeline Outline [DE] How to learn Big Data[了解大数据] [DE] Pipeline for Data Engineering[工作流案例示范] [DE] ML on Big data: MLlib[大数据的机器学习方案] DE基础(厦大) [Spark] 00 - Install Hadoop & Spark[ing] [Spark] 01 - What is Spark[大数据生态库] [Spark]…