Preface


Let's go to https://www.kaggle.com/

Kaggle Notebooks contain worked examples with the practice steps recorded.

1. Linear fitting of noisy data

[Sklearn] Linear regression models to fit noisy data
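As a rough illustration of what fitting a linear model to noisy data involves, here is a minimal sketch. It does not reproduce the linked notebook and does not use scikit-learn: it relies only on the Python standard library and the closed-form ordinary least squares solution for a single feature. The true slope and intercept (2.0 and 1.0), the noise level, and the helper name `fit_line` are all assumptions chosen for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data: a known line plus Gaussian noise (all values are assumptions).
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"estimated slope={slope:.3f}, intercept={intercept:.3f}")
```

The estimates should land close to the assumed true values, and the gap shrinks as more points are added. In scikit-learn, `LinearRegression().fit(X, y)` solves the same least squares problem for this single-feature case.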

2. Building a Pipeline

[Feature] Final pipeline: custom transformers

Resource queue


A collection of source code and discussions from Kaggle competitions.

Algorithmic Trading Challenge (25)

Allstate Purchase Prediction Challenge (3)

Amazon.com – Employee Access Challenge (6)

AMS 2013-2014 Solar Energy Prediction Contest (2)

Belkin Energy Disaggregation Competition (1)

Challenges in Representation Learning: Facial Expression Recognition Challenge (4)

Challenges in Representation Learning: The Black Box Learning Challenge (1)

Challenges in Representation Learning: Multi-modal Learning (2)

Detecting Insults in Social Commentary

EMI Music Data Science Hackathon

Galaxy Zoo – The Galaxy Challenge

Global Energy Forecasting Competition 2012 – Wind Forecasting

KDD Cup 2013 – Author-Paper Identification Challenge (Track 1) (2)

KDD Cup 2013 – Author Disambiguation Challenge (Track 2) (1)

Large Scale Hierarchical Text Classification (4)

Loan Default Prediction – Imperial College London

Merck Molecular Activity Challenge (1)

MLSP 2013 Bird Classification Challenge

Observing the Dark World

PAKDD 2014 – ASUS Malfunctional Components Prediction

Personalize Expedia Hotel Searches – ICDM 2013

Predicting a Biological Response (1)

Predicting Closed Questions on Stack Overflow

See Click Predict Fix (1)

See Click Predict Fix – Hackathon (1)

StumbleUpon Evergreen Classification Challenge

The Analytics Edge (15.071x)

The Marinexplore and Cornell University Whale Detection Challenge

Walmart Recruiting – Store Sales Forecasting (1)

Thank you to Foxtrot, James Petterson, and Ben S for providing some of the links and solutions above.

