Datasets (Part 2)

Published: 2023/12/15

1. Climate monitoring dataset: http://cdiac.ornl.gov/ftp/ndp026b

2. Several useful sites for downloading test datasets
   Data for MATLAB hackers (Handwritten Digits, Faces, Text)
   http://www.cs.toronto.edu/~roweis/data.html

3. UCI KDD Archive (datasets of many kinds)
   http://kdd.ics.uci.edu/summary.task.type.html
   http://kdd.ics.uci.edu/summary.data.type.html

4. Machine learning datasets collected by UCI
   ftp://pami.sjtu.edu.cn/
   http://www.ics.uci.edu/~mlearn//MLRepository.htm

5. Sample databases
   http://kdd.ics.uci.edu/
   WWW pages that were manually classified
   http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/

6. CMU World Wide Knowledge Base (Web->KB) project (classified web pages, relational data describing pages and hyperlinks)
   http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/

7. Artificial intelligence and machine learning
   http://duch-links.wikispaces.com/

8. Text classification: the datasets used with rainbow
   http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html

9. StatLib: a library of mathematical statistics software
   http://liama.ia.ac.cn/SCILAB/scilabindexgb.htm
   http://lib.stat.cmu.edu/
   http://lib.stat.cmu.edu/datasets/
   http://lib.stat.cmu.edu/modules.php?op=modload&name=Downloads&file=index&req=viewdownload&cid=2

10. Cancer genes:
   http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi

11. Financial and medical data:
   http://lisp.vse.cz/pkdd99/Challenge/chall.htm

12. Time series data
   http://www.stat.wisc.edu/~reinsel/bjr-data/

13. KDnuggets: links to a variety of datasets:
   http://www.kdnuggets.com/datasets/index.html

14. German intelligent analysis and information systems
   http://www.mlnet.org/cgi-bin/mlnetois.pl/?File=datasets.html
   http://dctc.sjtu.edu.cn/adaptive/datasets/
   http://fimi.cs.helsinki.fi/data/

15. IBM intelligent information
   http://www-958.ibm.com/software/data/cognos/manyeyes/datasets
   http://www.almaden.ibm.com/software/quest/Resources/index.shtml

16. Frequent Set Counting
   http://miles.cnuce.cnr.it/~palmeri/datam/DCI/datasets.php

17. Rating datasets
   MovieLens: movie rating data
   Basic description: includes the following three datasets:
   a. 100,000 ratings from 943 users on 1,682 movies
   b. 1 million ratings from 6,040 users on 3,900 movies
   c. 10 million ratings from 71,567 users on 10,681 movies
   http://www.grouplens.org/

   Book-Crossing: book rating data
   Basic description: contains 1,149,780 ratings from 278,858 users on 271,379 books. Cai-Nicolas Ziegler crawled the dataset from the Book-Crossing community over four weeks in August and September 2004.
   http://www.informatik.uni-freiburg.de/~cziegler/BX/

   Jester Joke Data Set: joke ratings
   A dataset for recommender systems released by Ken Goldberg of UC Berkeley. Contains 4.1 million continuous ratings of 100 jokes from 73,496 users.
   http://www.ieor.berkeley.edu/~goldberg/jester-data/

   Netflix dataset
   Also a movie rating dataset: 480,189 users, 17,770 movies, and 100,480,507 rating records. By rating count, the MovieLens sets above are one to three orders of magnitude smaller (the 1M set is about two orders smaller). The Netflix data will probably take over MovieLens's position over time, which is simply what progress looks like.

   Note: all four of the above are user rating datasets.
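
The user, item, and rating counts quoted in item 17 are enough to compare how sparse each rating matrix is. The sketch below computes the density (ratings divided by users times items) for each dataset. The counts are copied from the descriptions above; the labels "MovieLens 100K/1M/10M" and the function name `rating_density` are naming assumptions made for this example, not something the sources define.

```python
def rating_density(users: int, items: int, ratings: int) -> float:
    """Fraction of the user x item matrix that actually holds a rating."""
    return ratings / (users * items)


# (users, items, ratings) as quoted in the list above.
# The dictionary keys are illustrative labels, not official names.
DATASETS = {
    "MovieLens 100K": (943, 1_682, 100_000),
    "MovieLens 1M": (6_040, 3_900, 1_000_000),
    "MovieLens 10M": (71_567, 10_681, 10_000_000),
    "Book-Crossing": (278_858, 271_379, 1_149_780),
    "Jester": (73_496, 100, 4_100_000),
    "Netflix": (480_189, 17_770, 100_480_507),
}

for name, (users, items, ratings) in DATASETS.items():
    density = rating_density(users, items, ratings)
    print(f"{name:15s} density = {density:.4%}")
```

Density is a rough guide only: Jester is dense because it has just 100 items, while Book-Crossing is extremely sparse, which usually matters more for choosing a recommender algorithm than the raw rating count.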

18. GPS trajectory data
   GeoLife GPS Trajectories
   http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/default.aspx

   GPS Trajectories with transportation mode labels
   http://research.microsoft.com/apps/pubs/?id=141896

   Movebank: animal tracks
   http://www.movebank.org/
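
A minimal sketch for reading one GeoLife trajectory file. It assumes the `.plt` layout described in the dataset's user guide: six header lines, then comma-separated latitude, longitude, an unused field, altitude in feet, a fractional day count, a date string, and a time string. That layout is an assumption here and is not stated in the list above, so check it against the files you download. `GpsPoint` and `parse_plt` are hypothetical names introduced for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class GpsPoint(NamedTuple):
    latitude: float
    longitude: float
    altitude_ft: float
    timestamp: datetime


def parse_plt(text: str, header_lines: int = 6) -> list[GpsPoint]:
    """Parse the text of one .plt file (assumed layout, see lead-in)."""
    points = []
    for line in text.splitlines()[header_lines:]:
        fields = line.strip().split(",")
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        timestamp = datetime.strptime(
            f"{fields[5]} {fields[6]}", "%Y-%m-%d %H:%M:%S"
        )
        points.append(
            GpsPoint(
                latitude=float(fields[0]),
                longitude=float(fields[1]),
                altitude_ft=float(fields[3]),
                timestamp=timestamp,
            )
        )
    return points


# Made-up sample in the assumed layout: six header lines, two points.
SAMPLE = "\n".join(
    ["header"] * 6
    + [
        "39.984702,116.318417,0,492,39744.1201851852,2008-10-23,02:53:04",
        "39.984683,116.31845,0,492,39744.1202546296,2008-10-23,02:53:10",
    ]
)

for point in parse_plt(SAMPLE):
    print(point)
```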

19. Mobile phone, Wi-Fi, and Bluetooth
   A Community Resource for Archiving Wireless Data At Dartmouth
   http://crawdad.cs.dartmouth.edu/
   crowdflow: mobile phone and Wi-Fi trajectories
   http://crowdflow.net/

20. OpenStreetMap Data
   planet.openstreetmap.org or http://metro.teczno.com/

21. OpenPaths: data upload plus API
   https://openpaths.cc/

22. FOURSQUARE

23. GeoTime
   http://www.geotime.com/GeoTime(s)/January-2012/Cupid-Strikes-Again--Time-Series---GIS--Together-a.aspx

24. Datatang
   http://www.datatang.com/

25. http://www.kdnuggets.com/datasets/

26. http://appsrv.cse.cuhk.edu.hk/~kdd/data_collection.html

IBM Almaden Research Center Data Mining Projects

Data Sets:

· Synthetic Data Generation Code for Associations and Sequential Patterns

· Synthetic Data Generation Code for Classification

· "Dense" Data-Sets (apriori binary format, 3.2Mb)

· Enron Email Data Set

Demos:

· General Visualizations for Associations

· Visualization Demo: Market Basket Analysis

IBM Intelligent Miner:

· IBM Intelligent Miner for Data

· Video and image clips from IBM Data Mining T.V. Ad

IBM Data Mining Resources:

· Business Intelligence Solutions: Our colleagues offering data mining consultancy and services.

· Data Abstraction Research Group: Our colleagues in IBM Thomas J. Watson Research Center. Our colleagues in France.

· Data Mining: Extending the Information Warehouse Framework: IBM White Paper on Data Mining.

The Reuters dataset can be found at the following address:
   http://www.research.att.com/~lewis/reuters21578.html

Sites on data mining for investment funds
   http://www.gotofund.com/index.asp
   http://lans.ece.utexas.edu/~strehl/

Reuters dataset
   http://www.research.att.com/~lewis/reuters21578.html
   http://www-2.cs.cmu.edu/webkb
   http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-75.pdf

Related:
   http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
   http://www.phys.uni.torun.pl/~duch/software.html

WEKA:
   http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar

1. A jarfile containing 37 classification problems, originally obtained from the UCI repository
   http://prdownloads.sourceforge.net/weka/datasets-UCI.jar

2. A jarfile containing 37 regression problems, obtained from various sources
   http://prdownloads.sourceforge.net/weka/datasets-numeric.jar

3. A jarfile containing 30 regression datasets collected by Luis Torgo
   http://prdownloads.sourceforge.net/weka/regression-datasets.jar
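
A jarfile is an ordinary zip archive, so the WEKA bundles above can be inspected without Java. The sketch below lists the dataset files inside one. It assumes the datasets are stored as `.arff` files, which is WEKA's usual format but is not stated in the list above. `list_arff_members` is a hypothetical helper name, and the demo builds a tiny in-memory archive rather than downloading anything.

```python
import io
import zipfile


def list_arff_members(jar_source) -> list[str]:
    """Return the sorted names of .arff entries in a jar (zip) archive.

    jar_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(jar_source) as jar:
        return sorted(
            name for name in jar.namelist() if name.lower().endswith(".arff")
        )


# Demo with a made-up in-memory archive standing in for a downloaded jar.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    archive.writestr("numeric/example.arff", "@relation example\n")
demo.seek(0)

print(list_arff_members(demo))
```

For a real download, pass the local path of the jar (for example the file saved from one of the URLs above) instead of the in-memory buffer.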

Data mining competitions and their datasets

  • 2005 University of California data mining contest, predicting bad accounts and their churn date using real-world CRM data, deadline June 30, 2005.

  • ILP 2005 Challenge, on the prediction of functional classes of genes.

  • KDD Cup 2005, on classifying internet user search queries, deadline July 8.

  • Data Mining Cup 2005 (Chemnitz, Germany), for students; topic: how data mining can ascertain the risk of loss of payments and reduce this risk.

  • KDD Cup 2004, focuses on data mining for several performance criteria using datasets from bioinformatics and quantum physics.

  • InfoVis 2004 Contest, The History of InfoVis.

  • DATA MINING CUP 2004 (Chemnitz, Germany), for students.

  • InfoVis 2003 Contest: Visualization and Pair Wise Comparison of Trees, results announced Sep 5, 2003.

  • KDD CUP 2003

  • http://www.cs.cornell.edu/projects/kddcup/index.html

  • KDD Cup 2003, focuses on problems motivated by network mining and the analysis of usage logs.

  • DATA MINING CUP 2003 (Chemnitz, Germany). The task is to identify spam emails before they reach the user's mailbox.

  • KDD Cup 2002, focuses on data mining in molecular biology.

  • Student Data Mining Cup (2002), Chemnitz University and Prudential Systems.

Reposted from: https://www.cnblogs.com/codeOfLife/p/6773825.html
