Some large network datasets, usable for social and information network analysis
[ 2012/7/25 15:56:00 | By: 梦翔儿 ]
 
CS322:
(Social and Information) Network Analysis
Autumn 2009

Resources


Datasets

Snap network datasets

Yahoo! Webscope Catalog of datasets

  • Note: Jure Leskovec will have to apply for any sets you want, and we must agree not to distribute them further.
    There may be a delay, so get requests in early.

Coauthorship and Citation Networks

Internet Topology

Wikipedia

Movie Ratings

Who trusts whom data at Trustlet

Mark Newman's pointers


Software Tools

a C++ library for working with massive network datasets (Windows, Linux, Mac).

a program for large network analysis (Windows or Linux via Wine).

an exploratory data analysis and visualization tool for graphs and networks.

software framework for information visualization (Linux, MacOSX, Windows).

software for social network analysis (Windows).

graph visualization software.

a Python package for the study of the structure of complex networks.

a large-scale network analysis, modeling and visualization toolkit

tools for fitting heavy-tailed distributions to data
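
As a rough illustration of what fitting a heavy-tailed distribution involves, here is a minimal sketch of the standard maximum-likelihood estimate of a continuous power-law exponent, alpha = 1 + n / sum(ln(x_i / xmin)). It is not necessarily what the tools listed above do: the function names, the synthetic data, and the fixed choice of xmin are all assumptions made for the example.

```python
import math
import random


def sample_power_law(n, alpha, xmin, seed=0):
    """Draw n synthetic samples from a continuous power law (illustration only)."""
    rng = random.Random(seed)
    # Inverse-transform sampling: x = xmin * (1 - u) ** (-1 / (alpha - 1))
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(samples, xmin):
    """Maximum-likelihood exponent for the tail x >= xmin (xmin assumed known)."""
    tail = [x for x in samples if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above xmin")
    return 1.0 + len(tail) / log_sum


# Synthetic data with a known exponent, so the estimate can be checked.
data = sample_power_law(20000, alpha=2.5, xmin=1.0)
alpha_hat = fit_alpha(data, xmin=1.0)
print(round(alpha_hat, 2))
```

On real degree or citation data xmin is usually not known in advance and has to be chosen as well, which this sketch leaves out.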


Websites

Some websites that may be interesting to do analysis on:


Similar Courses

from: http://snap.stanford.edu/na09/resources.html

==========

DBLP dataset:

Data characteristics:
  • Over 1,200,000 objects
  • Over 2,480,000 links
  • 12 object attributes
  • 6 link attributes
Additional information:


[Figure: citation links omitted]
The PROXIMITY DBLP database presents information on computer science publications listed in the DBLP Computer Science Bibliography. The data in this dataset were derived from a snapshot of the bibliography as of April 12, 2006. The PROXIMITY DBLP dataset maps each entry in the original DBLP data to one of six types of objects representing different types of publications. It includes links from publications to their authors and editors and from papers to the journal, proceedings, or book in which they appear, as well as citation links from one publication to another.

See the README for additional information on the DBLP database.

Acknowledgments:
Please include the following acknowledgment in all publications that describe work using this database:

The PROXIMITY DBLP database is based on data from the DBLP Computer Science Bibliography with additional preparation performed by the Knowledge Discovery Laboratory, University of Massachusetts Amherst.

===============

http://kdl.cs.umass.edu/data/msn/msn-info.html

Similar datasets for large-scale social networks can also be found in the left-hand sidebar of that page:

Databases
HEP-Th
Can-o-sleep
Mobile Social Networks
DBLP

===============

Some datasets for social computing and graph mining.

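1. snap.stanford.edu/na09/resources.html: this site lists a great many useful datasets, including DBLP data, KDD data, the IMDb database, email networks, blog networks, and so on. It also lists practical tools for network analysis, data visualization, and the like.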

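2. citeseerx.ist.psu.edu/about/metadata: this page explains how to download the CiteSeer data, which includes coauthor and citation information. For the download procedure, see another post on this blog, "How to download CiteSeer data".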

3. The Cora dataset can be downloaded from www.cs.umass.edu/~mccallum/code-data.html. For a more detailed description of the data, see hi.baidu.com/zhudaohui/blog/item/4e6f86fdc4df791e08244d12.html

4. DBLP data can be downloaded from dblp.uni-trier.de/xml/. The dataset is fairly large and includes coauthors and dates, but generally does not include citations.
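
Since the DBLP download is XML with per-publication author lists, a coauthorship network can be derived from it fairly directly. Below is a minimal sketch, assuming records shaped like `<article>` / `<inproceedings>` elements with `<author>` children. The sample records are made up for illustration, and the function name `coauthor_edges` is hypothetical. The real dump is far larger and relies on a DTD for character entities, so in practice it would need a streaming parser and entity handling rather than `ET.fromstring`.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature sample in the style of dblp.xml; not real records.
SAMPLE_XML = """<dblp>
  <article key="journals/x/A1">
    <author>Alice</author>
    <author>Bob</author>
    <title>Paper one</title>
    <year>2005</year>
  </article>
  <inproceedings key="conf/y/B2">
    <author>Alice</author>
    <author>Bob</author>
    <author>Carol</author>
    <title>Paper two</title>
    <year>2006</year>
  </inproceedings>
</dblp>"""


def coauthor_edges(xml_text):
    """Count how many publications each unordered author pair shares."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for pub in root:  # each child element is one publication record
        # De-duplicate and sort so (a, b) and (b, a) map to the same key.
        authors = sorted({a.text for a in pub.findall("author") if a.text})
        for pair in itertools.combinations(authors, 2):
            edges[pair] += 1
    return edges


edges = coauthor_edges(SAMPLE_XML)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weighted edge list can then be loaded into any of the network analysis tools mentioned above.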

 
 
  • Tags: datasets

    梦翔儿's website, where dreams take flight: http://www.dreamflier.net