305-8573 1-1-1
305-8573 1-1-1
101-8457 2-2
() 141-0031 8-3-6
277-8568 5-1-5
SVM
SVM
SVM
SVM
1.
Technorati(1), BlogPulse(2), kizasi.jp(3),
blogWatcher(4) [1]
Globe of Blogs
2.
SVM
2. 1
Blogwise
2. 1. 1
()
[12], [13]
[2][6]
[4] 88%
75%
[3], [7]
2. 1. 2
[5]
TREC(8) Blog06
URL
[4], [8]
[10]
2. 2
2. 3
SVM
SVM
URL URL
2. 3. 1
SEO
2. 3. 2 URL URL
[8], [9]
1 http://technorati.com/
2 http://www.blogpulse.com/
3. SVM
3 http://kizasi.jp/
3. 1 SVM
4 http://blogwatcher.pi.titech.ac.jp/
SVM TinySVM(10)
5 http://www.globeofblogs.com/
6 http://www.misohoni.com/bba/
7 http://www.blogwise.com/
8 http://trec.nist.gov/
9 (http://chasen-legacy.sourceforge.jp/) ipadic
10 http://chasen.org/~taku/software/TinySVM/
1: 2: 3: 4:
1
1: +2: +3: +
2
[12], [13]
2. 2
4647 1695
F 3
761 934
F 0.875
10
F 0.902
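The F values reported here are presumably the standard balanced F-measure, the harmonic mean of precision and recall. The surviving text does not define it, so that reading is an assumption. A minimal Python sketch under that assumption follows; the helper name `f_measure` and the counts are hypothetical and are not the paper's data.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from true-positive, false-positive,
    and false-negative counts."""
    if tp == 0:
        # No correct positive predictions: treat the score as zero
        # instead of dividing by zero.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted splogs that are splogs
    recall = tp / (tp + fn)     # share of actual splogs that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not taken from the paper.
print(f_measure(tp=8, fp=2, fn=2))
```

With equal precision and recall, as in these made-up counts, the F-measure equals both; when the two diverge, the harmonic mean is pulled toward the lower one.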
SVM
4. 2
[12], [13]
4.
ID
4. 1
ID
ID=15
4 (a)
3 SVM (+)
4 (b)
ID=2
ID=2
5 (a)
(b) ID=2
(b)
5.
90%
ID
[1] T. Nanno, T. Fujiki, Y. Suzuki, and M. Okumura. Automatically collecting, monitoring, and mining Japanese weblogs. In WWW Alt. '04: Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pp. 320-321. ACM Press, 2004.
[2] Z. Gyöngyi and H. Garcia-Molina. Web spam taxonomy. In AIRWeb '05: Proceedings of the 1st International Workshop on Adversarial Information Retrieval on the Web, pp. 39-47, 2005.
[Figure 4: panels (a) and (b); F-measure]
[Figure 5: panels (a) and (b); ID=2; axes: F-measure]