Experimenting with gradient descent in Scala on Spark

Published: 2025/5/22 (豆豆)

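I started from this article: http://blog.csdn.net/sadfasdgaaaasdfa/article/details/45970185

The functions it uses are too old, though, so the code had to be changed. My other starting point was the gradient descent figure in my own article: http://www.cnblogs.com/charlesblc/p/6206198.html

After many revisions, and a lot of time spent on the randomly initialized vector, I finally got it working. The code is as follows: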

package com.spark.my

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import breeze.linalg.DenseVector
import breeze.numerics.exp

/**
 * Created by baidu on 16/11/28.
 */
object GradientDemo {

  case class DataPoint(x: DenseVector[Double], y: Double) // case class: see below

  def parsePoint(x: Array[Double]): DataPoint = {
    // DataPoint(Vectors.dense(x.slice(0, x.size-2)), x(x.size-1))
    // Note: the end index of slice is exclusive, so slice(0, x.size-2) also drops
    // the last feature column; slice(0, x.size-1) would keep every column before the label.
    DataPoint(DenseVector(x.slice(0, x.size-2)), x(x.size-1))
  }

  def main(args: Array[String]): Unit = {
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)

    val conf = new SparkConf()
    val sc = new SparkContext(conf)

    println("Begin load gradient file")

    // Load the data set
    val text = sc.textFile("hdfs://master.Hadoop:8390/gradient_data/spam.data.txt")
    val lines = text.map { line =>
      line.split(" ").map(_.toDouble)
    }
    val points = lines.map(parsePoint(_)) // map(parsePoint) appears to behave the same

    // Same length as the feature vectors built by parsePoint
    var w = DenseVector.rand(lines.first().size - 2)
    val iterations = 100
    for (i <- 1 to iterations) {
      // Sum the per-point gradient terms over the whole data set
      val gradient = points.map(p =>
        (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
      ).reduce(_ + _)
      w -= gradient
    }

    println("Finish data loading, w num: " + w.length + "; w: " + w)
  }
}

Then, on machine m42n05, I first copied the file http://www-stat.stanford.edu/~tibs/ElemStatLearn/datasets/spam.data to Hadoop:

$ hadoop fs -mkdir /gradient_data
$ hadoop fs -put spam.data.txt /gradient_data/
$ hadoop fs -ls /gradient_data/
Found 1 items
-rw-r--r-- 3 work supergroup 698341 2016-12-21 17:59 /gradient_data/spam.data.txt

Then I copied the jar over as well and ran this command:

$ ./bin/spark-submit --class com.spark.my.GradientDemo --master spark://10.117.146.12:7077 myjars/scala-demo.jar

Output:

16/12/21 18:17:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/21 18:17:58 INFO util.log: Logging initialized @1689ms
16/12/21 18:17:58 INFO server.Server: jetty-9.2.z-SNAPSHOT
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@107ed6fc{/jobs,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1643d68f{/jobs/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@186978a6{/jobs/job,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2e029d61{/jobs/job/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@482d776b{/stages,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4052274f{/stages/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@132ddbab{/stages/stage,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@297ea53a{/stages/stage/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@acb0951{/stages/pool,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5bf22f18{/stages/pool/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@267f474e{/storage,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7a7471ce{/storage/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@28276e50{/storage/rdd,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@62e70ea3{/storage/rdd/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3efe7086{/environment,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@675d8c96{/environment/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@741b3bc3{/executors,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2ed3b1f5{/executors/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@63648ee9{/executors/threadDump,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68d6972f{/executors/threadDump/json,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@45be7cd5{/static,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7651218e{/,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3185fa6b{/api,null,AVAILABLE}
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d366c9b{/stages/stage/kill,null,AVAILABLE}
16/12/21 18:17:58 INFO server.ServerConnector: Started ServerConnector@53e211ee{HTTP/1.1}{0.0.0.0:4040}
16/12/21 18:17:58 INFO server.Server: Started @1811ms
16/12/21 18:17:58 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e0d4a8{/metrics/json,null,AVAILABLE}
Begin load gradient file
16/12/21 18:18:00 INFO mapred.FileInputFormat: Total input paths to process : 1
16/12/21 18:18:02 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
16/12/21 18:18:02 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
Finish data loading, w num: 56; w: DenseVector(0.5742670447735152, 0.3793477463119241, 0.9681722093411653, 0.5967720119758925, 1.513648869152009, 0.8246263930800145, 0.8513296345703405, 0.5016541916805365, 0.10371045067354999, 1.0622529560536655, 0.7333760424194737, 2.1149483032187897, 0.9299367625800867, 0.7255747859512406, 0.13008556580706143, 1.4831202765138185, 0.7729907277492736, 0.9723309264036033, 13.394753146641808, 0.5531526429090097, 2.7444722115693665, 0.11325813324181622, 0.5096129116641023, 0.7201439311127137, 0.44719912156747926, 0.8273500952621051, 0.6736417633922696, 0.046531684571481415, 0.017895929000231802, 0.4726397794671698, 0.394438566392741, 0.8438784726078483, 0.4144073806784945, 0.18873920886297268, 0.4760240368798872, 0.31604719205329873, 0.694745503752298, 0.721380820951884, 0.988535475648986, 0.13515871744899247, 0.15694652560543523, 0.6939378895510522, 0.9279201378471407, 0.3336083293555714, 0.38938263676999685, 0.17159756568171308, 0.18897754115255144, 0.7281027812135723, 0.7233165381530381, 1.1093715737790655, 0.15675561193336351, 2.059622965151493, 0.6839713282339183, 0.11528695729374866, 7.413534050555067, 23.13404922028611)
16/12/21 18:18:07 INFO server.ServerConnector: Stopped ServerConnector@53e211ee{HTTP/1.1}{0.0.0.0:4040}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6d366c9b{/stages/stage/kill,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3185fa6b{/api,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7651218e{/,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@45be7cd5{/static,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@68d6972f{/executors/threadDump/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@63648ee9{/executors/threadDump,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2ed3b1f5{/executors/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@741b3bc3{/executors,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@675d8c96{/environment/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3efe7086{/environment,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@62e70ea3{/storage/rdd/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@28276e50{/storage/rdd,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7a7471ce{/storage/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@267f474e{/storage,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5bf22f18{/stages/pool/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@acb0951{/stages/pool,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@297ea53a{/stages/stage/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@132ddbab{/stages/stage,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4052274f{/stages/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@482d776b{/stages,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2e029d61{/jobs/job/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@186978a6{/jobs/job,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1643d68f{/jobs/json,null,UNAVAILABLE}
16/12/21 18:18:07 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@107ed6fc{/jobs,null,UNAVAILABLE}

As the output shows, the data was processed without errors.

To watch the process, add the following line inside the iteration loop:

println("In data loading, w num: " + w.length + "; w: " + w)

Then copy the jar over again and rerun. There is now a lot of intermediate output, but w changes very little from one iteration to the next; in some entries only the last few digits differ:

In data loading, w num: 56; w: DenseVector(0.8387794911469437, 0.041931950643148204, 0.610593576873822, 0.775693127624059, 0.9595814255406686, 0.8346753461732199, 1.3049939469403333, 0.7056665962054256, 0.4607139317388798, 0.7272237992038442, 0.658182563650663, 0.733627042229442, 0.49543528179048996, 0.43928474305383947, 0.7784540121519834, 3.3618947233533456, 0.8863247999385253, 0.4007587753541083, 2.0631977325748334, 0.8211289850510815, 1.2076387347473903, 0.43209585536401196, 0.8361371667999544, 0.3902040623717107, 0.9249800607229486, 0.9684655358995048, 0.7122113545634148, 0.7564214721597596, 0.9295754044438086, 0.0667831407627083, 0.8262226990678785, 0.9866253536733688, 0.7214690647928418, 0.5992067836236182, 0.801215365214358, 1.0206941788488395, 0.8887684894893382, 0.39696145592511084, 0.7994301499483707, 0.39766237687949973, 0.3213782652296576, 0.3959330364022269, 0.6573698429264838, 0.5725594506918451, 0.932872703406284, 0.4276515117478306, 0.8908902872993782, 0.6281143587881469, 0.5136752276267151, 1.0933173640821512, 0.10820509511118362, 1.9426418431339785, 0.2017114624971559, 0.9827542778431644, 5.224634203803431, 16.694903977208174)
In data loading, w num: 56; w: DenseVector(0.8387794911469437, 0.041931950643148204, 0.6105935768739001, 0.775693127624059, 0.9595814255414439, 0.8346753461732199, 1.3049939469403333, 0.7056665962054256, 0.4607139317388798, 0.7272237992038442, 0.658182563650663, 0.733627042229442, 0.49543528179048996, 0.43928474305383947, 0.7784540121519834, 3.3618947233534118, 0.8863247999385373, 0.4007587753541083, 2.0631977325749897, 0.8211289850510815, 1.2076387347474142, 0.43209585536401196, 0.8361371667999544, 0.3902040623717107, 0.9249800607229486, 0.9684655358995048, 0.7122113545634148, 0.7564214721597596, 0.9295754044438086, 0.0667831407627083, 0.8262226990678785, 0.9866253536733688, 0.7214690647928418, 0.5992067836236182, 0.801215365214358, 1.0206941788488395, 0.8887684894893382, 0.39696145592511084, 0.7994301499483707, 0.3976623768795117, 0.3213782652296576, 0.3959330364022269, 0.6573698429264838, 0.5725594506918451, 0.932872703406296, 0.4276515117478306, 0.8908902872993782, 0.6281143587881469, 0.5136752276267151, 1.093317364082217, 0.10820509511118362, 1.942641843152015, 0.2017114624971559, 0.982754277843168, 5.22463420411604, 16.694903977520784)


How gradient descent works

For a good explanation of the principle behind gradient descent, see:

http://blog.csdn.net/woxincd/article/details/7040944

And this one:

http://www.cnblogs.com/maybe2030/p/5089753.html?utm_source=tuicool&utm_medium=referral

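Looking closely, the formula in those articles does not seem to be quite the same as the one in the code. The code appears to use the sigmoid function.

I still need to think this through properly.

The main formula used in the code above is:

(1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
Here p.x is an n-dimensional vector and p.y is a number.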
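
As far as I can tell, this expression matches the gradient of the logistic loss, which would explain where the sigmoid comes from. The derivation below is a sketch under that assumption; the symbols \ell, \sigma, x_i and y_i are notation introduced here, and it assumes the labels are y_i \in \{-1, +1\}.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
\ell(w) = \sum_i \log\!\left(1 + e^{-y_i\, w^\top x_i}\right)

\nabla_w \ell(w)
  = \sum_i \frac{-y_i\, x_i\, e^{-y_i\, w^\top x_i}}{1 + e^{-y_i\, w^\top x_i}}
  = \sum_i \left(\sigma(y_i\, w^\top x_i) - 1\right) y_i\, x_i

w \leftarrow w - \nabla_w \ell(w)
```

The second equality uses e^{-z} / (1 + e^{-z}) = 1 - \sigma(z). Each summand has the same form as the per-point expression in the code, and the update is the code's w -= gradient with an implicit step size of 1.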

Then reduce(_+_) adds up the results for every row, so the final result is again an n-dimensional vector.

Then w -= gradient

Iterating N times then yields a new w.
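
To make these steps easier to follow without a cluster, here is a minimal local sketch of the same loop in plain Scala, with no Spark or Breeze. The names LocalGradientSketch, Point, dot, pointGradient, step and train, as well as the toy data, are made up for illustration and are not part of the original program.

```scala
object LocalGradientSketch {
  // Hypothetical stand-in for the post's DataPoint, using a plain Array instead of Breeze.
  case class Point(x: Array[Double], y: Double)

  // Dot product of two equally sized arrays.
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (u, v) => u * v }.sum

  // Per-point term from the post: (1 / (1 + exp(-y * (w dot x))) - 1) * y * x
  def pointGradient(w: Array[Double], p: Point): Array[Double] = {
    val scale = (1.0 / (1.0 + math.exp(-p.y * dot(w, p.x))) - 1.0) * p.y
    p.x.map(_ * scale)
  }

  // Sum the per-point terms (the reduce(_ + _) step), then apply w -= gradient.
  def step(w: Array[Double], points: Seq[Point]): Array[Double] = {
    val gradient = points.map(pointGradient(w, _)).reduce { (a, b) =>
      a.zip(b).map { case (u, v) => u + v }
    }
    w.zip(gradient).map { case (wi, gi) => wi - gi }
  }

  // Repeat the update a fixed number of times, starting from w0.
  def train(points: Seq[Point], w0: Array[Double], iterations: Int): Array[Double] =
    (1 to iterations).foldLeft(w0)((w, _) => step(w, points))

  def main(args: Array[String]): Unit = {
    // Toy data with labels in {-1, +1}; purely illustrative.
    val points = Seq(Point(Array(1.0, 0.5), 1.0), Point(Array(-1.0, -0.3), -1.0))
    val w = train(points, Array(0.0, 0.0), 100)
    println("w: " + w.mkString(", "))
  }
}
```

One thing the sketch makes visible: a point with y = 0 yields a zero term, because the whole expression is multiplied by p.y. If the labels in the data file are 0/1 rather than -1/+1 (worth checking against the data), the rows labelled 0 would contribute nothing to the gradient.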


case class

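For the difference between a case class and a class, see: http://www.tuicool.com/articles/yEZr6ve

Scala has case classes. A case class is essentially an ordinary class, but it differs from one in a few ways:

1. It can be instantiated without new, although you may still write it; in Scala 2 an ordinary class always requires new;

2. toString has a nicer implementation;

3. equals and hashCode are implemented by default;

4. It is serializable by default, that is, it implements Serializable;

5. It automatically inherits some methods from scala.Product;

6. The constructor parameters of a case class are public, so they can be accessed directly;

7. It supports pattern matching.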
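
The points above can be tried out with a small sketch like the following. Sample and describe are hypothetical names chosen for illustration; Sample uses a Vector[Double] field rather than the Breeze DenseVector from the program above so that it runs with the standard library alone.

```scala
object CaseClassSketch {
  // Hypothetical case class, similar in shape to DataPoint.
  case class Sample(features: Vector[Double], label: Double)

  // Pattern matching on the constructor parameters (point 7).
  def describe(s: Sample): String = s match {
    case Sample(_, 1.0) => "positive"
    case Sample(fs, _)  => s"other, with ${fs.size} features"
  }

  def main(args: Array[String]): Unit = {
    val a = Sample(Vector(1.0, 2.0), 1.0)      // no `new` needed (point 1)
    val b = new Sample(Vector(1.0, 2.0), 1.0)  // `new` is still allowed
    println(a)                                 // generated toString (point 2)
    println(a == b)                            // structural equals (point 3)
    println(a.hashCode == b.hashCode)          // matching hashCode (point 3)
    println(a.isInstanceOf[java.io.Serializable]) // point 4
    println(a.productArity)                    // inherited from scala.Product (point 5)
    println(a.label)                           // public constructor parameter (point 6)
    println(describe(a))
  }
}
```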


Breeze

Note that the DenseVector used above is the class from Breeze.


LinearRegressionWithSGD

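This is Spark's implementation of linear regression, based on stochastic gradient descent. There are similar functions as well:

The linear regression algorithms available in MLlib are LinearRegressionWithSGD, RidgeRegressionWithSGD and LassoWithSGD. The main classes involved in MLlib regression analysis are GeneralizedLinearAlgorithm and GradientDescent.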


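Using Java from Scala

The final version above uses DenseVector, so the snippet below was not needed. It does show, however, that Java classes can be used from Scala: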

import java.util.Random
val rand = new Random(53)
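
As a rough illustration of how such a generator could have been used to build the initial weight vector, here is a small sketch. RandomInitSketch and randomWeights are hypothetical names, and this is not what the final program does, since it calls DenseVector.rand instead.

```scala
import java.util.Random

object RandomInitSketch {
  // Hypothetical helper: draw n initial weights in [0, 1) from a seeded java.util.Random.
  def randomWeights(n: Int, seed: Long): Array[Double] = {
    val rand = new Random(seed)
    Array.fill(n)(rand.nextDouble())
  }

  def main(args: Array[String]): Unit = {
    // 56 matches the "w num" printed by the program above; 53 is the seed from the snippet.
    val w = randomWeights(56, 53L)
    println("w num: " + w.length + "; w: " + w.mkString(", "))
  }
}
```

Using a fixed seed makes the starting point reproducible between runs, which helps when comparing the intermediate values of w.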
