Installing scikit-learn

http://scikit-learn.org/stable/install.html

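There are different ways to get scikit-learn installed: install an official release, install the version provided by a third-party distribution, or build the development version from source.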

Note

If you wish to contribute to the project, it’s recommended you install the latest development version.

Installing an official release

Scikit-learn requires:

  • Python (>= 2.6 or >= 3.3),
  • NumPy (>= 1.6.1),
  • SciPy (>= 0.9).
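
Before installing, it can help to confirm that the interpreter and the two dependencies meet these minimums. The following is a minimal sketch, assuming Python 3 for the print calls; the helper names (version_tuple, meets_minimum, python_supported, check_dependencies) are illustrative and not part of scikit-learn. Versions are compared as integer tuples so that, for example, 1.10.0 is correctly treated as newer than 1.6.1.

```python
import re
import sys

# Minimum versions quoted in the requirements list above.
MINIMUMS = {"numpy": (1, 6, 1), "scipy": (0, 9)}


def version_tuple(version):
    """Turn '1.8.0rc1' into (1, 8, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if the version string is at least the minimum tuple."""
    return version_tuple(version) >= minimum


def python_supported(info=sys.version_info):
    """Apply the 'Python >= 2.6 or >= 3.3' rule to a version_info-like tuple."""
    major_minor = tuple(info[:2])
    if major_minor[0] == 2:
        return major_minor >= (2, 6)
    return major_minor >= (3, 3)


def check_dependencies():
    """Map each dependency to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            module = __import__(name)
        except ImportError:
            report[name] = None
            continue
        report[name] = meets_minimum(module.__version__, minimum)
    return report


if __name__ == "__main__":
    print("python", python_supported())
    for name, ok in sorted(check_dependencies().items()):
        print(name, ok)
```

A result of None for a dependency means it could not be imported at all, which is a different problem from an outdated version.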

Windows

First you need to install numpy and scipy from their own official installers.

Wheel packages (.whl files) for scikit-learn from PyPI can be installed with the pip utility. Open a console and type the following to install or upgrade scikit-learn to the latest stable release:

pip install -U scikit-learn

If there are no binary packages matching your Python version, you might try to install scikit-learn and its dependencies from Christoph Gohlke's Unofficial Windows installers or from a Python distribution instead.

Mac OSX

Scikit-learn and its dependencies are all available as wheel packages for OSX:

pip install -U numpy scipy scikit-learn

Linux

At this time scikit-learn does not provide official binary packages for Linux so you have to build from source.

Installing build dependencies

Installing from source requires you to have installed the scikit-learn runtime dependencies, Python development headers and a working C/C++ compiler. Under Debian-based operating systems, which include Ubuntu, you can install all these requirements by issuing:

sudo apt-get install build-essential python-dev python-setuptools \
python-numpy python-scipy \
libatlas-dev libatlas3gf-base

On recent Debian and Ubuntu (e.g. Ubuntu 13.04 or later) make sure that ATLAS is used to provide the implementation of the BLAS and LAPACK linear algebra routines:

sudo update-alternatives --set libblas.so.3 \
/usr/lib/atlas-base/atlas/libblas.so.3
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/atlas-base/atlas/liblapack.so.3

Note

In order to build the documentation and run the example code contained in this documentation, you will need matplotlib:

sudo apt-get install python-matplotlib

Note

The above installs the ATLAS implementation of BLAS (the Basic Linear Algebra Subprograms library). Ubuntu 11.10 and later, and recent (testing) versions of Debian, offer an alternative implementation called OpenBLAS.

Using OpenBLAS can give speedups in some scikit-learn modules, but can freeze joblib/multiprocessing prior to OpenBLAS version 0.2.8-4, so using it is not recommended unless you know what you’re doing.

If you do want to use OpenBLAS, then replacing ATLAS only requires a couple of commands. ATLAS has to be removed, otherwise NumPy may not work:

sudo apt-get remove libatlas3gf-base libatlas-dev
sudo apt-get install libopenblas-dev
sudo update-alternatives --set libblas.so.3 \
/usr/lib/openblas-base/libopenblas.so.0
sudo update-alternatives --set liblapack.so.3 \
/usr/lib/lapack/liblapack.so.3

On Red Hat and clones (e.g. CentOS), install the dependencies using:

sudo yum -y install gcc gcc-c++ numpy python-devel scipy

Building scikit-learn with pip

This is usually the fastest way to install or upgrade to the latest stable release:

pip install --user --install-option="--prefix=" -U scikit-learn

The --user flag asks pip to install scikit-learn in the $HOME/.local folder, so root permission is not required. This flag should also make pip ignore any old version of scikit-learn previously installed on the system while still benefiting from the system packages for numpy and scipy, which can be long and complex to build correctly from source.

The --install-option="--prefix=" flag is only required if Python has a distutils.cfg configuration with a predefined prefix= entry.
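
To see where a --user install would end up for a given interpreter, the standard library's site module can be queried. This is a small sketch, assuming Python 2.7 or later run outside of very old virtualenv setups (which replaced the site module); on Linux the user base usually resolves to $HOME/.local, but the exact paths depend on the platform.

```python
import site

# Base directory for per-user installs (typically ~/.local on Linux).
user_base = site.getuserbase()

# The site-packages directory inside that base, where pip --user puts packages.
user_site = site.getusersitepackages()

print("user base:", user_base)
print("user site-packages:", user_site)

# False or None means the interpreter will not look in the user site at all.
print("user site enabled:", site.ENABLE_USER_SITE)
```

If the user site is reported as disabled, packages installed with --user will not be importable from that interpreter.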

From source package

Download the source package from http://pypi.python.org/pypi/scikit-learn/, unpack the sources and cd into the source directory.

This package uses distutils, which is the default way of installing Python modules. The install command is:

python setup.py install

Third party distributions of scikit-learn

Some third-party distributions are now providing versions of scikit-learn integrated with their package-management systems.

These can make installation and upgrading much easier for users since the integration includes the ability to automatically install dependencies (numpy, scipy) that scikit-learn requires.

The following is an incomplete list of Python and OS distributions that provide their own version of scikit-learn.

Debian and derivatives (Ubuntu)

The Debian package is named python-sklearn (formerly python-scikits-learn) and can be installed using the following command:

sudo apt-get install python-sklearn

Additionally, backport builds of the most recent release of scikit-learn for existing releases of Debian and Ubuntu are available from the NeuroDebian repository.

A quick-‘n’-dirty way of rolling your own .deb package is to use stdeb.

Python(x,y) for Windows

The Python(x,y) project distributes scikit-learn as an additional plugin, which can be found in the Additional plugins page.

Canopy and Anaconda for all supported platforms

Canopy and Anaconda both ship a recent version of scikit-learn, in addition to a large set of scientific Python libraries.

MacPorts for Mac OSX

The MacPorts package is named py<XY>-scikit-learn, where XY denotes the Python version. It can be installed by typing the following command:

sudo port install py26-scikit-learn

or:

sudo port install py27-scikit-learn

Arch Linux

Arch Linux’s package is provided through the official repositories as python-scikit-learn for Python 3 and python2-scikit-learn for Python 2. It can be installed by typing the following command:

# pacman -S python-scikit-learn

or:

# pacman -S python2-scikit-learn

depending on the version of Python you use.

NetBSD

scikit-learn is available via pkgsrc-wip:

http://pkgsrc.se/wip/py-scikit_learn

Fedora

The Fedora package is called python-scikit-learn for the Python 2 version and python3-scikit-learn for the Python 3 version. Both versions can be installed using yum:

$ sudo yum install python-scikit-learn

or:

$ sudo yum install python3-scikit-learn

Building on Windows

To build scikit-learn on Windows you need a working C/C++ compiler in addition to numpy, scipy and setuptools.

Picking the right compiler depends on the version of Python (2 or 3) and the architecture of the Python interpreter (32-bit or 64-bit). You can check the Python version by running the following in a cmd or PowerShell console:

python --version

and the architecture with:

python -c "import struct; print(struct.calcsize('P') * 8)"

The above commands assume that you have the Python installation folder in your PATH environment variable.

For 32-bit Python it is possible to use the standalone installers for Microsoft Visual C++ Express 2008 for Python 2 or Microsoft Visual C++ Express 2010 for Python 3.

Once installed you should be able to build scikit-learn without any particular configuration by running the following command in the scikit-learn folder:

python setup.py install

For the 64-bit architecture, you either need the full Visual Studio or the free Windows SDKs that can be downloaded from the links below.

The Windows SDKs include the MSVC compilers for both 32-bit and 64-bit architectures. They come as a GRMSDKX_EN_DVD.iso file that can be mounted as a new drive with a setup.exe installer in it.

Both SDKs can be installed in parallel on the same host. To use the Windows SDKs, you need to set up the environment of a cmd console launched with the following flags (at least for SDK v7.0):

cmd /E:ON /V:ON /K

Then configure the build environment with:

SET DISTUTILS_USE_SDK=1
SET MSSdk=1
"C:\Program Files\Microsoft SDKs\Windows\v7.0\Setup\WindowsSdkVer.exe" -q -version:v7.0
"C:\Program Files\Microsoft SDKs\Windows\v7.0\Bin\SetEnv.cmd" /x64 /release

Finally you can build scikit-learn in the same cmd console:

python setup.py install

Replace v7.0 with v7.1 in the above commands to do the same for Python 3 instead of Python 2.

Replace /x64 by /x86 to build for 32-bit Python instead of 64-bit Python.
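
The compiler choices described in this section can be summarized as a small lookup. The sketch below is illustrative only: suggested_toolchain is a hypothetical helper, not part of scikit-learn or distutils, and it encodes just the pairings given in the text above (Visual C++ Express 2008 or 2010 for 32-bit Python, Windows SDK v7.0 or v7.1 for 64-bit Python).

```python
import struct
import sys


def suggested_toolchain(python_major, bits):
    """Return the toolchain this guide pairs with a Python major version and word size."""
    if python_major not in (2, 3):
        raise ValueError("python_major must be 2 or 3")
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    if bits == 32:
        if python_major == 2:
            return "Microsoft Visual C++ Express 2008"
        return "Microsoft Visual C++ Express 2010"
    # 64-bit builds need the full Visual Studio or the matching Windows SDK.
    if python_major == 2:
        return "Windows SDK v7.0"
    return "Windows SDK v7.1"


if __name__ == "__main__":
    # Same architecture check as the one-liner shown earlier in this section.
    bits = struct.calcsize("P") * 8
    print(suggested_toolchain(sys.version_info[0], bits))
```

Running the script with the interpreter you intend to build for prints the pairing for that interpreter, since both the major version and the pointer size are read from the running process.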

The .whl package and .exe installers can be built with:

pip install wheel
python setup.py bdist_wheel bdist_wininst -b doc/logos/scikit-learn-logo.bmp

The resulting packages are generated in the dist/ folder.

It is possible to use MinGW (a port of GCC to Windows OS) as an alternative to MSVC for 32-bit Python. Note that extensions built with mingw32 cannot be redistributed as reusable packages, as they depend on GCC runtime libraries that are typically not installed in end users' environments.

To force the use of a particular compiler, pass the --compiler flag to the build step:

python setup.py build --compiler=my_compiler install

where my_compiler should be one of mingw32 or msvc.

Bleeding Edge

See section Retrieving the latest code on how to get the development version. Then follow the previous instructions to build from source depending on your platform.

Testing

Testing scikit-learn once installed

Testing requires having the nose library. After installation, the package can be tested by executing from outside the source directory:

$ nosetests -v sklearn

Under Windows, it is recommended to use the following command (adjust the path to the python.exe program), as using the nosetests.exe program can interact badly with tests that use multiprocessing:

C:\Python34\python.exe -c "import nose; nose.main()" -v sklearn

This should give you a lot of output (and some warnings) but eventually should finish with a message similar to:

Ran 3246 tests in 260.618s
OK (SKIP=20)

Otherwise, please consider posting an issue to the bug tracker or to the mailing list, including the traceback of the individual failures and errors.

Testing scikit-learn from within the source folder

Scikit-learn can also be tested without having the package installed. For this you must compile the sources in place from the source directory:

python setup.py build_ext --inplace

Tests can now be run using nosetests:

nosetests -v sklearn/

This is automated by the commands:

make in

and:

make test

You can also install a symlink named site-packages/scikit-learn.egg-link to the development folder of scikit-learn with:

pip install --editable .
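
With an installed release, a --user install and an editable checkout all possible on one machine, it is worth checking which copy the interpreter would actually pick up. A minimal sketch, assuming Python 3.4 or later (for importlib.util.find_spec); package_location is a hypothetical helper name. It looks the package up without importing it, so it also works when the compiled extensions are not built yet.

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # For an editable install this should point into the development folder.
    print("sklearn would be imported from:", package_location("sklearn"))
```

A path inside the development folder indicates the editable install is the one in use; None means no copy of the package is visible to this interpreter.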
