305-8573 1-1-1
277-8568 5-1-5
() 141-0031 8-3-6
113-0033 7-3-1
101-8430 2-1-2
1.
blogWatcher 4[1]
2.
Globe of Blogs 5
Blogwise 7
[2][6]
[4] 88%
75%
i)
ii)
[3], [7]
[4][6]
[5] TREC
Blog06
adsense 9
3.
1
3
i)
ii)
iii)
3. 1
[4], [6]
[10]
i)
1 http://technorati.com/
2 http://www.blogpulse.com/
ii)
3 http://kizasi.jp/
iii) 10
4 http://blogwatcher.pi.titech.ac.jp/
iv)
5 http://www.globeofblogs.com/
6 http://www.misohoni.com/bba/
9 http://google.com/adsense
7 http://www.blogwise.com/
10
8 http://trec.nist.gov/
(%)
80.5
31.0
8.1
42.1
14.3
70.8
27.1
2.9
[11]
3.6
12.7
[6]
SEO
11.5
[11]
4.5
49.5
36.9
v) [11]
3. 2
5. 1
2.
a)
i)
ii)
iii)
b)
i) ii)iii) 6.
iv)
4. 2
v) [11]
( i) iii) )
3. 3
i)
4.
4. 1
ii)
iii)
(a)
iv) [6]
(b)
(a)
1
(b)
2 (2007 12 3 0:00 )
3,591,306 192,699,276
1,355
196,975
4. 2
(3-a)
adsense
(3-b)
5. 2
[13]
2 50
RSS 12
Atom
50
Juman 13
50
11
5.
5. 1
2
(2007 3 ) 2004
360 1 9300
5. 3
5. 1
11
[12]
3.
1 2 50
URL 2007
2 URL 50
60URL
110 URL 50URL 1 3
60URL 1
2
3 URL
a URL
i.
ii.
b URL .
5
6.
4. 2 50 4 22
22
6. 1
3 , 88% 3
2
50%
14
4 URL
14 [12]
Doorway Doorway
S C
J A L G Y
 192   142    54    24                      26   442
 203   115   169   355   128   130   207   396  1703
 395   257   223   379   131   131   207   422  2145
48.6  55.3  24.2   6.3   2.3   0.8   0.0   6.2  20.6  (%)
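The figures in the table above can be checked for internal consistency. The sketch below is only an arithmetic check under stated assumptions: the row labels did not survive extraction, so the names `first_row`, `second_row`, and `third_row` are placeholders. It also assumes that the third row is the sum of the first two and that the percentage row is the first row divided by the third. Only columns whose values are all present in the text are included.

```python
# Values copied from the table; only the columns where all three counts
# and the percentage survive in the text are used.
# Row names are placeholders (assumption): the original labels are lost.
first_row = [192, 142, 54, 24, 26, 442]
second_row = [203, 115, 169, 355, 396, 1703]
third_row = [395, 257, 223, 379, 422, 2145]
reported_pct = [48.6, 55.3, 24.2, 6.3, 6.2, 20.6]

# Assumption 1: the third row is the column-wise sum of the first two rows.
sums_match = all(a + b == c for a, b, c in zip(first_row, second_row, third_row))

# Assumption 2: the percentage row is first_row / third_row, to one decimal.
recomputed_pct = [round(100.0 * a / c, 1) for a, c in zip(first_row, third_row)]

print(sums_match)
print(recomputed_pct)
```

If both assumptions hold, the recomputed percentages should agree with the reported ones to one decimal place.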
ID
( 1 )
115 (42.3%)
ZARD
Wii
2
56
(20.6%)
30
(11.0%)
()
26
(9.6%)
()
20
(7.4%)
(
)
10
(3.7%)
(2.5%)
(1.5%)
(0.7%)
10
(0.7%)
272
1
10%
442
2 10
442 272(61.5%)
10
10
3
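The proportions quoted in this subsection (272 of 442, and the per-row shares of the 272) can be recomputed from the counts that survive in the text. This is a minimal arithmetic sketch, not the authors' procedure. The category labels were lost in extraction, so the counts are kept as an unlabeled list, and the variable and function names are illustrative.

```python
# Counts as they appear in the text; category labels are not recoverable,
# so the rows are kept in their original order without names (assumption).
TOTAL_EXAMINED = 442   # the larger total quoted in the text
TOTAL_IN_TABLE = 272   # the subset broken down in the table
row_counts = [115, 56, 30, 26, 20, 10]  # rows whose counts are present


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


overall_share = share(TOTAL_IN_TABLE, TOTAL_EXAMINED)
row_shares = [share(c, TOTAL_IN_TABLE) for c in row_counts]

print(overall_share)
print(row_shares)
```

The remaining rows of the table give only percentages in the extracted text, so they are left out rather than back-filled.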
6. 2
22 5
5 22
30%30 10%10% 3
2
4
(1) 30% 5 4
5 (
50%)
(%)
ID
(%)
(%)
89.2
92.4
2, 6, 8
38.5
88.1
94.8
27.8
58.1
90.2
3, 4
12.0
40.9
18.5
36.1
58.7
5, 7
19.8
37.4
24.4
14.3
1, 10
21.7
22.5
11.1
20.5
22.1
0.0
22.1
19.1
0.0
19.1
15.2
80.0
1, 6
3.4
15.1
0.0
15.1
14.3
14.3
12.2
6.9
71.4
1, 3
2.1
ZARD
4.7
20.0
3.8
4.7
20.0
3.8
2.9
100.0
0.0
Wii
2.8
66.7
1.0
2.8
33.3
1.9
2.0
0.0
2.0
1.8
50.0
0.9
0.0
0.0
0.0
0.0
0.0
0.0
20.5
61.5
1 - 10
9.0
(6)
(2) 4 10%
7.
30%
[4][6]
[7], [9]
(3) 3 (2)
(4) 10% 6
(5) 30% 5
4
[1] T. Nanno, T. Fujiki, Y. Suzuki, and M. Okumura. Automatically collecting, monitoring, and mining Japanese weblogs. In WWW Alt. '04: Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pp. 320-321. ACM Press, 2004.
[2] Z. Gyöngyi and H. Garcia-Molina. Web spam taxonomy. In AIRWeb '05: Proceedings of the 1st International Workshop on Adversarial Information Retrieval on the Web, pp. 39-47, 2005.
[3] Wikipedia, Spam blog. http://en.wikipedia.org/wiki/Spam_blog.
[4] P. Kolari, A. Joshi, and T. Finin. Characterizing the splogosphere. In Proceedings of WWW 2006 3rd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 2006.
[5] C. Macdonald and I. Ounis. The TREC Blogs06 collection: Creating and analysing a blog test collection. Technical Report TR-2006-224, University of Glasgow, Department of Computing Science, 2006.
[6] P. Kolari, T. Finin, and A. Joshi. Spam in blogs and social media. In Tutorial at ICWSM, 2007.
[7] Y.-R. Lin, H. Sundaram, Y. Chi, J. Tatemura, and B. L. Tseng. Splog detection using self-similarity analysis on blog temporal dynamics. In AIRWeb '07: Proceedings of the 3rd International Workshop on Adversarial Information Retrieval on the Web, pp. 1-8, 2007.
[8] . .
Web (DBWeb2007)
. , 2007.
[9] P. Kolari, T. Finin, and A. Joshi. SVMs for the Blogosphere: Blog identification and Splog detection. In Proceedings of the 2006 AAAI Spring Symposium on Computational Approaches to Analyzing Weblogs, pp. 92-99, 2006.
[10] Y. Sato, T. Utsuro, T. Fukuhara, Y. Kawada, Y. Murakami, H. Nakagawa, and N. Kando. Collecting and analyzing Japanese splogs based on characteristics of keywords. In Proceedings of ICWSM, 2008.
[11] Wikipedia, Word salad (computer science). http://en.wikipedia.org/wiki/Word_salad_%28computer_science%29.
[12] Y.M. Wang, M. Ma, Y. Niu, and H. Chen. Spam double-funnel: Connecting web spammers with advertisers. In Proceedings of the 16th WWW Conference, pp. 291-300, 2007.
[13] , , .
.
13 Web
, pp. 40-43, 2007.