NumPy: Basic Statistics

from:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-4-numpy?ex=13

  • Average versus median

You now know how to use numpy functions to get a better feeling for your data. It basically comes down to importingnumpy and then calling several simple functions on the numpyarrays:

import numpy as np
x = [1, 4, 8, 10, 12]
np.mean(x)
np.median(x)

# np_baseball is available

# Import numpy
import numpy as np

# Create np_height from np_baseball
np_height = np.array(np_baseball)[:,0]

# Print out the mean of np_height
print(np.mean(np_height))

# Print out the median of np_height
print(np.median(np_height))

  • Explore the baseball data

# np_baseball is available

# Import numpy
import numpy as np

# Print mean height (first column)
avg = np.mean(np_baseball[:,0])
print("Average: " + str(avg))

# Print median height. Replace 'None'
med = np.median(np_baseball[:,0])
print("Median: " + str(med))

# Print out the standard deviation on height. Replace 'None'
stddev = np.std(np_baseball[:,0])
print("Standard Deviation: " + str(stddev))

# Print out correlation between first and second column. Replace 'None'
corr = np.corrcoef(np_baseball[:,0],np_baseball[:,1])
print("Correlation: " + str(corr))

  • Blend it all together

You've contacted FIFA for some data and they handed you two lists. The lists are the following:

positions = ['GK', 'M', 'A', 'D', ...]
heights = [191, 184, 185, 180, ...]

Each element in the lists corresponds to a player. The first list,positions, contains strings representing each player's position. The possible positions are: 'GK' (goalkeeper), 'M' (midfield),'A' (attack) and 'D' (defense). The second list, heights, contains integers representing the height of the player in cm. The first player in the lists is a goalkeeper and is pretty tall (191 cm).

You're fairly confident that the median height of goalkeepers is higher than that of other players on the soccer field. Some of your friends don't believe you, so you are determined to show them using the data you received from FIFA and your newly acquired Python skills.

# heights and positions are available as lists

# Import numpy
import numpy as np

# Convert positions and heights to numpy arrays: np_positions, np_heights
np_positions = np.array(positions)
np_heights = np.array(heights)

# Heights of the goalkeepers: gk_heights
gk_heights = np_heights[np_positions == "GK"]

# Heights of the other players: other_heights
other_heights = np_heights[np_positions != "GK"]

# Print out the median height of goalkeepers. Replace 'None'
print("Median height of goalkeepers: " + str(np.median(gk_heights)))

# Print out the median height of other players. Replace 'None'
print("Median height of other players: " + str(np.median(other_heights)))

Intro to Python for Data Science Learning 8 - NumPy: Basic Statistics的更多相关文章

  1. Intro to Python for Data Science Learning 6 - NumPy

    NumPy From:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-4-numpy?ex=1 ...

  2. Intro to Python for Data Science Learning 7 - 2D NumPy Arrays

    2D NumPy Arrays from:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-4- ...

  3. Intro to Python for Data Science Learning 5 - Packages

    Packages From:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-3-functio ...

  4. Intro to Python for Data Science Learning 2 - List

    List from:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-2-python-list ...

  5. Intro to Python for Data Science Learning 4 - Methods

    Methods From:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-3-function ...

  6. Intro to Python for Data Science Learning 3 - functions

    Functions from:https://campus.datacamp.com/courses/intro-to-python-for-data-science/chapter-3-functi ...

  7. Intermediate Python for Data Science learning 2 - Histograms

    Histograms from:https://campus.datacamp.com/courses/intermediate-python-for-data-science/matplotlib? ...

  8. Intermediate Python for Data Science learning 1 - Basic plots with matplotlib

    Basic plots with matplotlib from:https://campus.datacamp.com/courses/intermediate-python-for-data-sc ...

  9. Intermediate Python for Data Science learning 3 - Customization

    Customization from:https://campus.datacamp.com/courses/intermediate-python-for-data-science/matplotl ...

随机推荐

  1. 使用atomic一定是线程安全的吗?

    这个问题很少遇到,但是答案当然不是.atomic在set方法里加了锁,防止了多线程一直去写这个property,造成难以预计的数值.但这也只是读写的锁定.跟线程安全其实还是差一些.看下面. @inte ...

  2. 教你写gulp plugin

    前端开发近两年工程化大幅飙升.随着Nodejs大放异彩,静态文件处理不再需要其他语言辅助.主要的两大工具即为基于文件的grunt,基于流的gulp.简单来说,如果需要的只是文件处理,gulp绝对首选. ...

  3. git回退之前版本

    所有没有 commit 的本地改动,都会随着 reset --hard 丢掉,无法恢复. 如果只是想回到 pull 之前当前分支所在的commit位置,则可以.比方说你在 master 分支上,可以用 ...

  4. [分布式系统学习] 6.824 LEC2 RPC和线程 笔记

    6.824的课程通常是在课前让你做一些准备.一般来说是先读一篇论文,然后请你提一个问题,再请你回答一个问题.然后上课,然后布置Lab. 第二课的准备-Crawler 第二课的准备不是论文,是让你实现G ...

  5. C++虚函数virtual,纯虚函数pure virtual和Java抽象函数abstract,接口interface与抽象类abstract class的比较

    由于C++和Java都是面向对象的编程语言,它们的多态性就分别靠虚函数和抽象函数来实现. C++的虚函数可以在子类中重写,调用是根据实际的对象来判别的,而不是通过指针类型(普通函数的调用是根据当前指针 ...

  6. python----并发之协程

    <python并发之协程>一: 单线程下实现并发,即只在一个主线程,并且cpu只有一个的情况下实现并发.(并发的本质:切换+保存状态) cpu正在运行一个任务,会在两种情况下切去执行其他的 ...

  7. poj1001 Exponentiation【java大数】

    Exponentiation Time Limit: 500MS   Memory Limit: 10000K Total Submissions: 183034   Accepted: 44062 ...

  8. 一般图的着色 - [Welch Powell法][贪心]

    原本这是离散数学的期末作业,因为对图论比较熟悉,就先看了一下图论题: 引用<离散数学(左孝凌版)>(其实就是我们的离散数学课本……): 用韦尔奇·鲍威尔法对图G进行着色,其方法是: a)将 ...

  9. TCP报文

    源端口和目的端口: 各占16位 ,服务相对应的源端口和目的端口. 序列号: 占32位,它的范围在[0~2^32-1],序号随着通信的进行不断的递增,当达到最大值的时候重新回到0在开始递增.TCP是面向 ...

  10. Python几种并发实现方案的性能比较

    http://blog.csdn.net/permike/article/details/54846831 Python几种并发实现方案的性能比较 2017-02-03 14:33 1541人阅读 评 ...