Brief introduction to Scala and Breeze for statistical computing
Introduction
In the previous post I outlined why I think Scala is a good language for statistical computing and data science. In this post I want to give a quick taste of Scala and the Breeze numerical library to whet the appetite of the uninitiated. This post certainly won’t provide enough material to get started using Scala in anger – but I’ll try to provide a few pointers along the way. It also won’t be very interesting to anyone who knows Scala – I’m not introducing any of the very cool Scala stuff here – I think that some of the most powerful and interesting Scala language features can be a bit frightening for new users.
To reproduce the examples, you need to install Scala and Breeze. This isn’t very tricky, but I don’t want to get bogged down with a detailed walk-through here – I want to concentrate on the Scala language and Breeze library. You just need to install a recent version of Java, then Scala, and then Breeze. You might also want SBT and/or the ScalaIDE, though neither of these is necessary. Then you need to run the Scala REPL with the Breeze library on the classpath. There are several ways to do this. The most obvious is to just run scala with the path to Breeze manually specified (or specified in an environment variable). Alternatively, you could run a console from an sbt session with a Breeze dependency (which is what I actually did for this post), or you could use a Scala Worksheet from inside a ScalaIDE project with a Breeze dependency.
A Scala REPL session
A first glimpse of Scala
We’ll start with a few simple Scala concepts that are not dependent on Breeze. For further information, see the Scala documentation.
Welcome to Scala version 2.10.3 (OpenJDK 64-Bit Server VM, Java 1.7.0_25).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val a = 5
a: Int = 5

scala> a
res0: Int = 5
So far, so good. Using the Scala REPL is much like using the Python or R command line, so will be very familiar to anyone used to these or similar languages. The first thing to note is that labels need to be declared on first use. We have declared a to be a val. These are immutable values, which cannot simply be re-assigned, as the following code illustrates.
scala> a = 6
<console>:8: error: reassignment to val
a = 6
^
scala> a
res1: Int = 5
Immutability seems to baffle people unfamiliar with functional programming. But fear not, as Scala allows declaration of mutable variables as well:
scala> var b = 7
b: Int = 7

scala> b
res2: Int = 7

scala> b = 8
b: Int = 8

scala> b
res3: Int = 8
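One subtlety is worth a quick sketch: a val fixes the binding, not necessarily the object it points to. The snippet below is my own illustration rather than part of the session above (the names buf, frozen and extended are made up), and it can be pasted straight into the REPL.

```scala
import scala.collection.mutable.ArrayBuffer

// A val cannot be re-pointed, but a mutable object behind it can still change.
val buf = ArrayBuffer(1, 2)
buf += 3                 // allowed: mutates the buffer in place
// buf = ArrayBuffer(9)  // would not compile: reassignment to val

// An immutable collection behind a val cannot change at all:
// "adding" an element builds a new list and leaves the old one alone.
val frozen = List(1, 2)
val extended = frozen :+ 3
```

So a val combined with an immutable collection gives a value that really cannot change underneath you, which is the combination functional code tends to favour.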
The Zen of functional programming is to realise that immutability is generally a good thing, but that really isn’t the point of this post. Scala has excellent support for both mutable and immutable collections as part of the standard library. See the API docs for more details. For example, it has immutable lists.
scala> val c = List(3,4,5,6)
c: List[Int] = List(3, 4, 5, 6)

scala> c(1)
res4: Int = 4

scala> c.sum
res5: Int = 18

scala> c.length
res6: Int = 4

scala> c.product
res7: Int = 360
Again, this should be pretty familiar stuff for anyone familiar with Python. Note that the sum and product methods are special cases of reduce operations, which are well supported in Scala. For example, we could compute the sum reduction using
scala> c.foldLeft(0)((x,y) => x+y)
res8: Int = 18
or the slightly more condensed form given below, and similarly for the product reduction.
scala> c.foldLeft(0)(_+_)
res9: Int = 18

scala> c.foldLeft(1)(_*_)
res10: Int = 360
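Folds are more general than sums and products, because the accumulator can be of any type. As a rough sketch (the names total, n, s, avg and largest are mine, not from the session), here are a few other reductions over the same list:

```scala
val c = List(3, 4, 5, 6)

// reduce uses the first element as the starting value (so it fails on an empty list)
val total = c.reduce(_ + _)

// foldLeft can carry any accumulator type, here a (count, sum) pair
val (n, s) = c.foldLeft((0, 0)) { case ((count, sum), x) => (count + 1, sum + x) }
val avg = s.toDouble / n

// a running maximum, seeded with the smallest possible Int
val largest = c.foldLeft(Int.MinValue)((m, x) => if (x > m) x else m)
```

The (count, sum) fold computes the mean in a single pass over the data, which is the sort of pattern that becomes useful once the collections get large.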
Scala also has a nice immutable Vector class, which offers a range of constant time operations (but note that this has nothing to do with the mutable Vector class that is part of the Breeze library).
scala> val d = Vector(2,3,4,5,6,7,8,9)
d: scala.collection.immutable.Vector[Int] = Vector(2, 3, 4, 5, 6, 7, 8, 9)

scala> d
res11: scala.collection.immutable.Vector[Int] = Vector(2, 3, 4, 5, 6, 7, 8, 9)

scala> d.slice(3,6)
res12: scala.collection.immutable.Vector[Int] = Vector(5, 6, 7)

scala> val e = d.updated(3,0)
e: scala.collection.immutable.Vector[Int] = Vector(2, 3, 4, 0, 6, 7, 8, 9)

scala> d
res13: scala.collection.immutable.Vector[Int] = Vector(2, 3, 4, 5, 6, 7, 8, 9)

scala> e
res14: scala.collection.immutable.Vector[Int] = Vector(2, 3, 4, 0, 6, 7, 8, 9)
Note that when e is created as an updated version of d the whole of d is not copied – only the parts that have been updated. And we don’t have to worry that aspects of d and e point to the same information in memory, as they are both immutable… As should be clear by now, Scala has excellent support for functional programming techniques. In addition to the reduce operations mentioned already, maps and filters are also well covered.
scala> val f=(1 to 10).toList
f: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> f
res15: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> f.map(x => x*x)
res16: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

scala> f map {x => x*x}
res17: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

scala> f filter {_ > 4}
res18: List[Int] = List(5, 6, 7, 8, 9, 10)
Note how Scala allows methods with a single argument to be written as an infix operator, making for more readable code.
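Maps, filters and reductions chain together naturally. The short sketch below (the names squaresDot, squaresInfix and sumSqEven are just illustrative) shows the dot and infix forms side by side, and then a small pipeline:

```scala
val f = (1 to 10).toList

// dot notation and infix notation are interchangeable for one-argument methods
val squaresDot   = f.map(x => x * x)
val squaresInfix = f map { x => x * x }

// a small pipeline: the sum of the squares of the even numbers
val sumSqEven = f.filter(_ % 2 == 0).map(x => x * x).sum
```

Here sumSqEven works out as 4 + 16 + 36 + 64 + 100 = 220.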
A first look at Breeze
The next part of the session requires the Breeze library – see the Breeze quickstart guide for further details. We begin by taking a quick look at everyone’s favourite topic of non-uniform random number generation. Let’s start by generating a couple of draws from a Poisson distribution with mean 3.
scala> import breeze.stats.distributions._
import breeze.stats.distributions._

scala> val poi = Poisson(3.0)
poi: breeze.stats.distributions.Poisson = Poisson(3.0)

scala> poi.draw
res19: Int = 2

scala> poi.draw
res20: Int = 3
If more than a single draw is required, an iid sample can be obtained.
scala> val x = poi.sample(10)
x: IndexedSeq[Int] = Vector(2, 3, 3, 4, 2, 2, 1, 2, 4, 2)

scala> x
res21: IndexedSeq[Int] = Vector(2, 3, 3, 4, 2, 2, 1, 2, 4, 2)

scala> x.sum
res22: Int = 25

scala> x.length
res23: Int = 10

scala> x.sum.toDouble/x.length
res24: Double = 2.5
Note that this sample is an immutable Scala Vector (typed here as an IndexedSeq), not a mutable Breeze vector. The probability mass function (PMF) of the Poisson distribution is also available.
scala> poi.probabilityOf(2)
res25: Double = 0.22404180765538775

scala> x map {x => poi.probabilityOf(x)}
res26: IndexedSeq[Double] = Vector(0.22404180765538775, 0.22404180765538775, 0.22404180765538775, 0.16803135574154085, 0.22404180765538775, 0.22404180765538775, 0.14936120510359185, 0.22404180765538775, 0.16803135574154085, 0.22404180765538775)

scala> x map {poi.probabilityOf(_)}
res27: IndexedSeq[Double] = Vector(0.22404180765538775, 0.22404180765538775, 0.22404180765538775, 0.16803135574154085, 0.22404180765538775, 0.22404180765538775, 0.14936120510359185, 0.22404180765538775, 0.16803135574154085, 0.22404180765538775)
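As a sanity check on what probabilityOf is returning, the Poisson PMF is easy to write down directly in plain Scala. This is only a sketch for comparison (poissonPmf and p2 are my own names, and this is not how Breeze implements it):

```scala
// Poisson PMF: P(X = k) = exp(-lambda) * lambda^k / k!
// computed on the log scale to avoid overflow for larger k
def poissonPmf(lambda: Double, k: Int): Double = {
  val logFactorial = (1 to k).map(i => math.log(i.toDouble)).sum
  math.exp(-lambda + k * math.log(lambda) - logFactorial)
}

val p2 = poissonPmf(3.0, 2)   // compare with poi.probabilityOf(2) above
```

For a mean of 3 and k = 2 this is exp(-3) × 9 / 2, which agrees with the 0.224 reported by Breeze.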
Obviously, Gaussian variables (and Gamma, and several others) are supported in a similar way.
scala> val gau=Gaussian(0.0,1.0)
gau: breeze.stats.distributions.Gaussian = Gaussian(0.0, 1.0)

scala> gau.draw
res28: Double = 1.606121255846881

scala> gau.draw
res29: Double = -0.1747896055492152

scala> val y=gau.sample(20)
y: IndexedSeq[Double] = Vector(-1.3758577012869702, -1.2148314970824652, -0.022501190144116855, 0.3244006323566883, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> y
res30: IndexedSeq[Double] = Vector(-1.3758577012869702, -1.2148314970824652, -0.022501190144116855, 0.3244006323566883, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> y.sum/y.length
res31: Double = -0.34064156102380994

scala> y map {gau.logPdf(_)}
res32: IndexedSeq[Double] = Vector(-1.8654307403000054, -1.6568463163564844, -0.9191916849836235, -0.9715564183413823, -0.9836614354155007, -1.3847302992371653, -1.0023094506890617, -0.9256472309869705, -1.3059361584943119, -0.975419259871957, -1.1669755840586733, -1.6444202843394145, -0.93783943912556, -0.9683690047171869, -0.9209315167224245, -2.090114759123421, -1.6843650876361744, -1.0915455053203147, -1.359378517654625, -1.1399116208702693)

scala> Gamma(2.0,3.0).sample(5)
res33: IndexedSeq[Double] = Vector(2.38436441278546, 2.125017198373521, 2.333118708811143, 5.880076392566909, 2.0901427084667503)
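The logPdf values can be checked by hand in the same spirit. Below is a minimal plain-Scala sketch of the Gaussian log-density (gaussianLogPdf and lp are hypothetical names of mine, written for comparison only):

```scala
// log-density of N(mu, sigma^2):
//   -0.5*log(2*pi) - log(sigma) - (x - mu)^2 / (2*sigma^2)
def gaussianLogPdf(x: Double, mu: Double = 0.0, sigma: Double = 1.0): Double = {
  val z = (x - mu) / sigma
  -0.5 * math.log(2.0 * math.Pi) - math.log(sigma) - 0.5 * z * z
}

val lp = gaussianLogPdf(-1.3758577012869702)  // first element of y above
```

This reproduces the first entry of the logPdf output above (about -1.8654). Working on the log scale like this is usually what you want for MCMC, where densities get multiplied together.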
This is all good stuff for those of us who like to do Markov chain Monte Carlo. There are not masses of statistical data analysis routines built into Breeze, but a few basic tools are provided, including some basic summary statistics.
scala> import breeze.stats.DescriptiveStats._
import breeze.stats.DescriptiveStats._

scala> mean(y)
res34: Double = -0.34064156102380994

scala> variance(y)
res35: Double = 0.574257149387757

scala> meanAndVariance(y)
res36: (Double, Double) = (-0.34064156102380994,0.574257149387757)
Support for linear algebra is an important part of any scientific library. Here the Breeze developers have made the wise decision to provide a nice Scala interface to netlib-java. This in turn calls out to any native optimised BLAS or LAPACK libraries installed on the system, but will fall back to Java code if no optimised libraries are available. This means that linear algebra code using Scala and Breeze should run as fast as code written in any other language, including C, C++ and Fortran, provided that optimised libraries are installed on the system. For further details see the Breeze linear algebra guide. Let’s start by creating and messing with a dense vector.
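For reference, here is roughly what these summaries compute, sketched in plain Scala with the usual n − 1 denominator for the variance (the value printed above for y is consistent with that convention). The names sampleMean, sampleVariance and data are mine, and the data are made up for illustration.

```scala
// sample mean and (unbiased) sample variance in plain Scala
def sampleMean(xs: Seq[Double]): Double = xs.sum / xs.length

def sampleVariance(xs: Seq[Double]): Double = {
  val m = sampleMean(xs)
  xs.map(x => (x - m) * (x - m)).sum / (xs.length - 1)
}

// small made-up data set: mean 5, sum of squared deviations 32
val data = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
val dataMean = sampleMean(data)
val dataVar = sampleVariance(data)
```

This two-pass version is fine for small samples; for long streams of MCMC output a one-pass update would be preferable.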
scala> import breeze.linalg._
import breeze.linalg._

scala> val v=DenseVector(y.toArray)
v: breeze.linalg.DenseVector[Double] = DenseVector(-1.3758577012869702, -1.2148314970824652, -0.022501190144116855, 0.3244006323566883, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> v(1) = 0

scala> v
res38: breeze.linalg.DenseVector[Double] = DenseVector(-1.3758577012869702, 0.0, -0.022501190144116855, 0.3244006323566883, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> v(1 to 3) := 1.0
res39: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 1.0, 1.0)

scala> v
res40: breeze.linalg.DenseVector[Double] = DenseVector(-1.3758577012869702, 1.0, 1.0, 1.0, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> v(1 to 3) := DenseVector(1.0,1.5,2.0)
res41: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 1.5, 2.0)

scala> v
res42: breeze.linalg.DenseVector[Double] = DenseVector(-1.3758577012869702, 1.0, 1.5, 2.0, 0.35978577573558407, 0.9651857500320781, -0.40834034207848985, 0.11583348205331555, -0.8797699986810634, -0.33609738668214695, 0.7043252811790879, -1.2045594639823656, 0.19442688045065826, -0.31442160076087067, 0.06313451540562891, -1.5304745838587115, -1.2372764884467027, 0.5875490994217284, -0.9385520597707431, -0.6647903243363228)

scala> v :> 0.0
res43: breeze.linalg.BitVector = BitVector(1, 2, 3, 4, 5, 7, 10, 12, 14, 17)

scala> (v :> 0.0).toArray
res44: Array[Boolean] = Array(false, true, true, true, true, true, false, true, false, false, true, false, true, false, true, false, false, true, false, false)
Next let’s create and mess around with some dense matrices.
scala> val m = new DenseMatrix(5,4,linspace(1.0,20.0,20).toArray)
m: breeze.linalg.DenseMatrix[Double] =
1.0 6.0 11.0 16.0
2.0 7.0 12.0 17.0
3.0 8.0 13.0 18.0
4.0 9.0 14.0 19.0
5.0 10.0 15.0 20.0

scala> m
res45: breeze.linalg.DenseMatrix[Double] =
1.0 6.0 11.0 16.0
2.0 7.0 12.0 17.0
3.0 8.0 13.0 18.0
4.0 9.0 14.0 19.0
5.0 10.0 15.0 20.0

scala> m.rows
res46: Int = 5

scala> m.cols
res47: Int = 4

scala> m(::,1)
res48: breeze.linalg.DenseVector[Double] = DenseVector(6.0, 7.0, 8.0, 9.0, 10.0)

scala> m(1,::)
res49: breeze.linalg.DenseMatrix[Double] = 2.0 7.0 12.0 17.0

scala> m(1,::) := linspace(1.0,2.0,4)
res50: breeze.linalg.DenseMatrix[Double] = 1.0 1.3333333333333333 1.6666666666666665 2.0

scala> m
res51: breeze.linalg.DenseMatrix[Double] =
1.0 6.0 11.0 16.0
1.0 1.3333333333333333 1.6666666666666665 2.0
3.0 8.0 13.0 18.0
4.0 9.0 14.0 19.0
5.0 10.0 15.0 20.0

scala> val n = m.t
n: breeze.linalg.DenseMatrix[Double] =
1.0 1.0 3.0 4.0 5.0
6.0 1.3333333333333333 8.0 9.0 10.0
11.0 1.6666666666666665 13.0 14.0 15.0
16.0 2.0 18.0 19.0 20.0

scala> n
res52: breeze.linalg.DenseMatrix[Double] =
1.0 1.0 3.0 4.0 5.0
6.0 1.3333333333333333 8.0 9.0 10.0
11.0 1.6666666666666665 13.0 14.0 15.0
16.0 2.0 18.0 19.0 20.0

scala> val o = m*n
o: breeze.linalg.DenseMatrix[Double] =
414.0 59.33333333333333 482.0 516.0 550.0
59.33333333333333 9.555555555555555 71.33333333333333 77.33333333333333 83.33333333333333
482.0 71.33333333333333 566.0 608.0 650.0
516.0 77.33333333333333 608.0 654.0 700.0
550.0 83.33333333333333 650.0 700.0 750.0

scala> o
res53: breeze.linalg.DenseMatrix[Double] =
414.0 59.33333333333333 482.0 516.0 550.0
59.33333333333333 9.555555555555555 71.33333333333333 77.33333333333333 83.33333333333333
482.0 71.33333333333333 566.0 608.0 650.0
516.0 77.33333333333333 608.0 654.0 700.0
550.0 83.33333333333333 650.0 700.0 750.0

scala> val p = n*m
p: breeze.linalg.DenseMatrix[Double] =
52.0 117.33333333333333 182.66666666666666 248.0
117.33333333333333 282.77777777777777 448.22222222222223 613.6666666666667
182.66666666666666 448.22222222222223 713.7777777777778 979.3333333333334
248.0 613.6666666666667 979.3333333333334 1345.0

scala> p
res54: breeze.linalg.DenseMatrix[Double] =
52.0 117.33333333333333 182.66666666666666 248.0
117.33333333333333 282.77777777777777 448.22222222222223 613.6666666666667
182.66666666666666 448.22222222222223 713.7777777777778 979.3333333333334
248.0 613.6666666666667 979.3333333333334 1345.0
So, messing around with vectors and matrices is more-or-less as convenient as in well-known dynamic and math languages. To conclude this section, let us see how to simulate some data from a regression model and then solve the least squares problem to obtain the estimated regression coefficients. We will simulate 1,000 observations from a model with 5 covariates.
scala> val X = new DenseMatrix(1000,5,gau.sample(5000).toArray)
X: breeze.linalg.DenseMatrix[Double] =
-0.40186606934180685 0.9847148198711287 ... (5 total)
-0.4760404521336951 -0.833737041320742 ...
-0.3315199616926892 -0.19460446824586297 ...
-0.14764615494496836 -0.17947658245206904 ...
-0.8357372755800905 -2.456222113596015 ...
-0.44458309216683184 1.848007773944826 ...
0.060314034896221065 0.5254462055311016 ...
0.8637867740789016 -0.9712570453363925 ...
0.11620167261655819 -1.2231380938032232 ...
-0.3335514290842617 -0.7487303696662753 ...
-0.5598937433421866 0.11083382409013512 ...
-1.7213395389510568 1.1717491221846357 ...
-1.078873342208984 0.9386859686451607 ...
-0.7793854546738327 -0.9829373863442161 ...
-1.054275201631216 0.10100826507456745 ...
-0.6947188686537832 1.215...
scala> val b0 = linspace(1.0,2.0,5)
b0: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 1.25, 1.5, 1.75, 2.0)

scala> val y0 = X * b0
y0: breeze.linalg.DenseVector[Double] = DenseVector(0.08200546839589107, -0.5992571365601228, -5.646398002309553, -7.346136663325798, -8.486423788193362, 1.451119214541837, -0.25792385841948406, 2.324936340609002, -1.2285599639827862, -4.030261316643863, -4.1732627416377674, -0.5077151099958077, -0.2087263741903591, 0.46678616461409383, 2.0244342278575975, 1.775756468177401, -4.799821190728213, -1.8518388060564481, 1.5892306875621767, -1.6528539564387008, 1.4064864330994125, -0.8734630221484178, -7.75470002781836, -0.2893619536998493, -5.972958583649336, -4.952666733286302, 0.5431255990489059, -2.477076684976403, -0.6473617571867107, -0.509338416957489, -1.5415350935719594, -0.47068802465681125, 2.546118380362026, -7.940401988804477, -1.037049442788122, -1.564016663370888, -3.3147087994...
scala> val y = y0 + DenseVector(gau.sample(1000).toArray)
y: breeze.linalg.DenseVector[Double] = DenseVector(-0.572127338358624, -0.16481167194161406, -4.213873268823003, -10.142015065601388, -7.893898543052863, 1.7881055848475076, -0.26987820512025357, 3.3289433195054148, -2.514141419925489, -4.643625974157769, -3.8061000214061886, 0.6462624993109218, 0.23603338389134149, 1.0211137806779267, 2.0061727641393317, 0.022624943149799348, -5.429601401989341, -1.836181225242386, 1.0265599173053048, -0.1673732536615371, 0.8418249443853956, -1.1547110533101967, -8.392100167478764, -1.1586377992526877, -6.400362975646245, -5.487018086963841, 0.3038055584347069, -1.2247410435868684, -0.06476921390724344, -1.5039074374120407, -1.0189111630970076, 1.307339668865724, 2.048320821568789, -8.769328824477714, -0.9104251029228555, -1.3533910178496698, -2.178788...
scala> val b = X \ y // defaults to a QR-solve of the least squares problem
b: breeze.linalg.DenseVector[Double] = DenseVector(0.9952708232116663, 1.2344546192238952, 1.5543512339052412, 1.744091673457169, 1.9874158953720507)
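To see what the backslash operator is doing, it can help to compare it against the textbook normal equations on a tiny problem. The following is a sketch only: it assumes Breeze is on the classpath, the data are made up, and the names Xs, ys, bLS, bNE and diff are mine. The normal-equations route is shown purely for comparison, since it is numerically less stable than a QR-based solve.

```scala
import breeze.linalg._

// tiny made-up design matrix: an intercept column and one covariate
val Xs = DenseMatrix((1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0))
val ys = DenseVector(1.1, 2.9, 5.2, 6.8)

// least squares solve, as in the session above
val bLS = Xs \ ys

// normal equations (X'X) b = X'y, for comparison only
val bNE = (Xs.t * Xs) \ (Xs.t * ys)

// the two estimates should agree to within rounding error
val diff = norm(bLS - bNE)
```

For these data the fitted intercept and slope work out as 1.09 and 1.94 by hand, so both solves should land there.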
So all of the most important building blocks for statistical computing are included in the Breeze library.
At this point it is really worth reminding yourself that Scala is actually a statically typed language, despite the fact that in this session we have not explicitly declared the type of anything at all! This is because Scala has type inference, which makes type declarations optional when it is straightforward for the compiler to figure out what the types must be. For example, for our very first expression, val a = 5, because the RHS is an Int, it is clear that the LHS must also be an Int, and so the compiler infers that the type of a must be an Int, and treats the code as if the type had been declared as val a: Int = 5. This type inference makes Scala feel very much like a dynamic language in general use. Typically, we carefully specify the types of function arguments (and often the return type of the function, too), but then for the main body of each function, just let the compiler figure out all of the types and write code as if the language were dynamic. To me, this seems like the best of all worlds: the convenience of dynamic languages with the safety of static typing.
Declaring the types of function arguments is not usually a big deal, as the following simple example demonstrates.
scala> def mean(arr: Array[Int]): Double = {
| arr.sum.toDouble/arr.length
| }
mean: (arr: Array[Int])Double
scala> mean(Array(3,1,4,5))
res55: Double = 3.25
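The same idea extends to functions that work for more than one element type. The sketch below is a slightly more advanced variation that goes a little beyond the basics covered in this post. The name meanOf is my own (chosen to avoid clashing with the mean functions above), and it uses the standard library Numeric type class so that one definition covers Int, Double and friends.

```scala
// Argument and return types are declared; everything inside is inferred.
def meanOf[T](xs: Seq[T])(implicit num: Numeric[T]): Double = {
  val total = xs.sum            // inferred as T
  num.toDouble(total) / xs.length
}

val m1 = meanOf(List(3, 1, 4, 5))        // Int elements
val m2 = meanOf(Vector(0.5, 1.5, 2.5))   // Double elements
```

Calling meanOf on a list of strings would be rejected at compile time, which is the safety of static typing in action.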
A complete Scala program
For completeness, I will finish this post with a very simple but complete Scala/Breeze program. In a previous post I discussed a simple Gibbs sampler in Scala, but in that post I used the Java COLT library for random number generation. Below is a version using Breeze instead.
object BreezeGibbs {

  import breeze.stats.distributions._
  import scala.annotation.tailrec
  import scala.math.sqrt

  // current state of the chain
  class State(val x: Double, val y: Double)

  // one full Gibbs sweep: draw x given y, then y given the new x
  def nextIter(s: State): State = {
    val newX = Gamma(3.0, 1.0 / ((s.y) * (s.y) + 4.0)).draw()
    new State(newX, Gaussian(1.0 / (newX + 1), 1.0 / sqrt(2 * newX + 2)).draw())
  }

  // advance the chain by `left` sweeps, returning only the final state
  @tailrec
  def nextThinnedIter(s: State, left: Int): State = {
    if (left == 0) s
    else nextThinnedIter(nextIter(s), left - 1)
  }

  // print the state at each thinned iteration from `current` up to `stop`
  @tailrec
  def genIters(s: State, current: Int, stop: Int, thin: Int): State = {
    if (!(current > stop)) {
      println(s"$current ${s.x} ${s.y}")
      genIters(nextThinnedIter(s, thin), current + 1, stop, thin)
    } else s
  }

  def main(args: Array[String]): Unit = {
    println("Iter x y")
    genIters(new State(0.0, 0.0), 1, 50000, 1000)
  }

}
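Reading the two draws in nextIter back into mathematical notation gives the full conditionals that the sampler alternates between. This assumes that Breeze's Gamma takes (shape, scale) and that Gaussian takes (mean, standard deviation), which is my understanding of the parameterisation rather than something stated in this post.

```latex
x \mid y \sim \mathrm{Gamma}\!\left(\text{shape} = 3,\ \text{scale} = \frac{1}{y^2 + 4}\right),
\qquad
y \mid x \sim \mathrm{N}\!\left(\frac{1}{x + 1},\ \frac{1}{2(x + 1)}\right)
```

Here the second argument of the normal distribution is the variance, that is, the square of the standard deviation 1/sqrt(2x + 2) used in the code.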
Summary
In this post I’ve tried to give a quick taste of the Scala language and the Breeze library for those used to dynamic languages for statistical computing. Hopefully I’ve illustrated that the basics don’t look too different, so there is no reason to fear Scala. It is perfectly possible to start using Scala as a better and faster Python or R. Once you’ve mastered the basics, you can then start exploring the full power of the language. There’s loads of introductory Scala material to be found on-line. It probably makes sense to start with the links I’ve highlighted above. After that, just start searching – there’s an interesting set of tutorials I noticed just the other day. A very time-efficient way to learn Scala quickly is to do the FP with Scala course on Coursera, but whether this makes sense will depend on when it is next running. For those who prefer real books, the book Programming in Scala is the standard reference, and I’ve also found Functional programming in Scala to be useful (free text of the first edition of the former and a draft of the latter can be found on-line).
REPL Script
Below is a copy of the complete REPL script, for reference.
// start with non-Breeze stuff

val a = 5
a
a = 6
a

var b = 7
b
b = 8
b

val c = List(3,4,5,6)
c(1)
c.sum
c.length
c.product
c.foldLeft(0)((x,y) => x+y)
c.foldLeft(0)(_+_)
c.foldLeft(1)(_*_)

val d = Vector(2,3,4,5,6,7,8,9)
d
d.slice(3,6)
val e = d.updated(3,0)
d
e

val f=(1 to 10).toList
f
f.map(x => x*x)
f map {x => x*x}
f filter {_ > 4}

// introduce breeze through random distributions
// https://github.com/scalanlp/breeze/wiki/Quickstart

import breeze.stats.distributions._
val poi = Poisson(3.0)
poi.draw
poi.draw
val x = poi.sample(10)
x
x.sum
x.length
x.sum.toDouble/x.length
poi.probabilityOf(2)
x map {x => poi.probabilityOf(x)}
x map {poi.probabilityOf(_)}

val gau=Gaussian(0.0,1.0)
gau.draw
gau.draw
val y=gau.sample(20)
y
y.sum/y.length
y map {gau.logPdf(_)}

Gamma(2.0,3.0).sample(5)

import breeze.stats.DescriptiveStats._
mean(y)
variance(y)
meanAndVariance(y)

// move on to linear algebra
// https://github.com/scalanlp/breeze/wiki/Breeze-Linear-Algebra

import breeze.linalg._
val v=DenseVector(y.toArray)
v(1) = 0
v
v(1 to 3) := 1.0
v
v(1 to 3) := DenseVector(1.0,1.5,2.0)
v
v :> 0.0
(v :> 0.0).toArray

val m = new DenseMatrix(5,4,linspace(1.0,20.0,20).toArray)
m
m.rows
m.cols
m(::,1)
m(1,::)
m(1,::) := linspace(1.0,2.0,4)
m

val n = m.t
n
val o = m*n
o
val p = n*m
p

// regression and QR solution

val X = new DenseMatrix(1000,5,gau.sample(5000).toArray)
val b0 = linspace(1.0,2.0,5)
val y0 = X * b0
val y = y0 + DenseVector(gau.sample(1000).toArray)
val b = X \ y // defaults to a QR-solve of the least squares problem

// a simple function example

def mean(arr: Array[Int]): Double = {
  arr.sum.toDouble/arr.length
}

mean(Array(3,1,4,5))