Professional Documents
Culture Documents
Abstract are made by users themselves. These users are mostly con-
arXiv:1703.00800v1 [cs.SI] 2 Mar 2017
User
Gender Inferred from a user’s name.
Focus A user’s specialized topics
Country A user’s main place of residence
Followers # of users follow this user Figure 2: 16 W3C Colors
Following # of users this user follows
Appreciations # of users liked this user’s projects
Comments # of comments on this user’s projects of the user and then repeated the process from the chosen
Counts # of designs created by this user user until a desired number of users are collected. With the
Views # of users viewed this user’s projects probability of 0.15, the process randomly jumped back to
the starting user in the seed set. To avoid being stuck in sink
Project nodes, we did a random jump after every 1000 steps of walk-
Appreciation # of users liked this project ing the network by again randomly choosing another starting
Topics Topics to which this project belongs user in the seed set.
H,S,V Avg.pixels in the HSV color space Once we collected a total of 50,000 users from the random
W3C Colors % of pixels close to 16 W3C Colors walk, we extracted 881,208 projects and 2,608,052 apprecia-
Colorfulness1 Defined by Hasler and Suesstrunk tions from the users. We limited the number of appreciations
Colorfulness2 Defined by Yendrikhovskij et al. to at most 80 per user since the number can be tremendously
large and thus intractable.
for both female and male users. Since we only consider users # Views # Comment # Count
of whom we were accurately able to predict gender, many 1.00
Table 3 summarizes the predictors and overall fit of the nega- 0.25
tive binomial model using the number of appreciations as the
0.00
dependent variable. The list of predictors is ordered based on 0 5 10 15 0 5 10 15 0 5 10 15
their relative importance which we derived by standardizing
the beta coefficients. While all predictors including top 15 Figure 3: Complementary cumulative distributions of
topics and 10 countries were significant, we did not include project view, comment and count variables. The x-axis is in
them all due to their low importance and space constraint. log scale.
Table 4: The result of a negative binomial regression with R2-What topics are popular?
the number of followers as the dependent variable. The table
only lists a subset of predictors in the order of relative im- We summed and normalized the topic vectors of the 668,581
portance based on standardized coefficients. However, the β projects to derive the topic vector t̂C , where each element
coefficients here remain on their original scales. in the vector represents the proportion of each topic in our
dataset. About 1.2% of projects have topics that do not be-
long to the official 68 topics and only 0.03% of projects did
Predictor β std.err z p not specify topics.
−2
intercept 4.22 2.17e 194.7 <2e−16 Figure 5 show the overall rankings of top 30 topics,
appreciations 1.47e−4 2.32e−6 63.33 <2e−16 which account for more than 90% of projects. The topic
comments 1.35e−3 2.37e−5 56.95 <2e−16 popularity follows a power law distribution (p<0.05, using
following 2.51e−4 3.72e−6 67.35 <2e−16 the Kolmogorov-Smirnov test). Graphic design, illustration,
views 4.20e−6 1.42e−7 29.46 <2e−16 art direction together already cover about 33% of projects,
counts 1.24e−2 2.20e−4 56.55 <2e−16 while industrial design, product design, interaction design,
gendermale 3.32e−1 1.75e−2 18.95 <2e−16 and icon design are not popular together account for less
from united states 4.22e−1 2.35e−2 17.98 <2e−16 than 3%.
from art direction 3.24e−1 2.11e−2 15.37 <2e−16 For validation, we compared the topic rankings directly
from india −6.75e−1 4.32e−2 -15.61 <2e−16 derived from projects with another ranking derived from
users’ specialization topics which are automatically com-
Summary null.dev res.dev χ2 p puted by Behance. Using Kendall’s rank correlation tau-b,
92.0K 47.5K 44.5K <2e−16 the agreement between the two ranking vectors is significant
(τ =0.88, p <0.001).
We also conducted pairwise comparisons of the topic
vector t̂C and two additional topic vectors derived from
able for predicting the number of appreciations. Country and projects created by female and male users respectively.
gender variables are not as important as others, and thus not The three rankings show a strong agreement each other
included in the table. However, we observed that being male (∀p<0.001) with some differences; the agreement between
suggests more appreciations, β=2.49e−1 , p<2e−16 . To take female and male users was the lowest with τ =0.81. Top-
a deeper look at the difference in gender, we calculated the ics in top 10 male topics but not found in top 10 fe-
complementary cumulative distribution function (CCDF) of male topics include fashion (M: 3.07%, F:1.90%) and fine
each project-related variable by gender (Figure 3). While the arts (M: 2.79%, F:1.84%), while we observed web design
number of project counts is comparable across gender, male (M: 2.17%, F:3.02%), typography (M: 2.37%, F:2.37%)
users tend to receive more views and comments. All predic- in the opposite direction. For men, illustration (M: 15.2%,
tors have positive effects on increasing attraction except a F:11.78%) is the most popular topic while graphic design
few country variables; for example, users from United States (M: 14.46%, F:13.71%) is at the top for women.
and Brazil mean less appreciations.
R2-How topics are related?
Figure 6 shows the result of the hierarchical clustering of
R1-What attracts followers? top 20 topics using Jaccard distance and Wards minimum
criterion. The 20 topics account for more than %80 percent
Table 4 presents the results of our negative binomial regres- of all projects. We excluded other topics as each of them
sion model using follower count as the dependent variable; only accounts for a negligible percent of projects.
it only lists top predictors based on their relative importance. The inter-topic relationships mostly confirm our intuitions
The model provides a considerable improvement in de- about how topics are related. The most closely related top-
viance, χ2 (31, N=37.7K)=92.0K-47.5K=44.5K, p<2e−16 .
Having more project appreciations leads to more follow-
ers. Residing in the United States also suggests more fol- gender female male
lowers, while from India, Brazil, and Egypt indicates fewer # Follower # Following # Appreciation
followers,∀β<0, ∀p<2e−16 . We also observed specializing 1.00
0.50
can specialize in more than one topic. Other top topics, not
included in the table, that have positive effects include ty- 0.25
pography, fashion, illustration, and digital arts. Male users
0.00
attract more followers. The differences in gender can be also 0 5 10 0 5 10 0 5 10
observed from activity patterns shown in Figure 3 and 4;
male users follow more users and draw more attention, while Figure 4: Complementary cumulative distributions of fol-
both male and female users produce a similar number of lower, following, and appreciation variables. The x-axis is
projects. in log scale.
0.6
Fields
graphic_design character_design motion_graphics
illustration digital_photography animation
Normalized count
Figure 5: Distribution of designs created by users across Behance topics. The x-axis shows a selected set of top 30 topics and
the y-axis shows the proportion of designs in each category.
the two most popular topics, they do not seem to cluster to-
editorial_design gether.
print_design
typography R2-Are users specialists or generalists?
architecture By computing the entropy of each user’s topic vector, we
advertising investigated the diversity of the user’s interests; that is, we
art_direction measured whether the user’s projects are evenly distributed
branding
across different topics. The maximum entropy for a user
is ln(68)=4.22 when the user created an equal number of
graphic_design
projects in each topic.
Top 20 Topics
digital_photography
Similar to Chang et al., we divided users into three equal-
photography sized groups based on project counts and compared the aver-
fashion age topic entropy across the groups with the hypothesis that
retouching a user with a larger number of projects has a diverse interest.
ui/ux Figure 7 shows the entropy distributions for the three
groups. The differences between the groups were signifi-
web_design
cant, ∀p<0.001. The effect size between the first and sec-
digital_art
ond groups is 0.84 (large) while the effect size between the
illustration second and third groups is 0.11 (negligible). We did not find
drawing any significant difference across gender in general.
character_design Overall, the mean entropy values of the groups all are less
painting than 1.95. This value is the maximum entropy when there
are only 7 topics. We also observed the mean entropy of all
fine_arts
users is 1.67. Given the fact that the maximum entropy of
0.0 0.4 0.8 1.2 68 topics is 4.22, the overall topic diversity seems relatively
Jaccard Distance low, suggesting that most users are specialists while some
differences may exist.
Figure 6: Hierarchical clustering of top 20 topics For example, in the third group, we found a generalist (en-
tropy=2.60, project count: 619) who created projects in 48
topics. However, 71.5% of the projects are covered by only
ics are web design & UI/UX. Other closely related top- 6 topics which include graphic design, branding, web de-
ics include digital photography & photography, branding & sign, UI/UX, calligraphy, and typography; in addition, some
graphic design, digital art & illustration, advertising & art of the 6 topics are related to each other such as web design
direction, painting & fine arts, and editorial design & print and UI/UX.
design in the order of similarity.
There are 6 clusters around the Jaccard distance of 1.0,
R2-Is there homophily in appreciating others?
each consisting of reasonably related topics. For instance, The average homophily of all users is 0.91 (Std=0.08) where
digital photography, photography, fashion, & retouching we computed each homophily of a user through the cosine
group together as do digital art, illustration, drawing, & char- similarity between the two sets of topics used in the projects
acter design. Although graphic design and illustration are created (ûPu ) and appreciated (ûAu ) by the user.
Table 5: The result of a negative binomial regression model
4 for color analysis. The table only lists a subset of predictors
in the order of relative importance based on standardized co-
●
efficients. The β coefficients here remain on their original
●
3
●
scales.
factor(group)
Diversity
[1,9]
Predictor Estimate std.err z p
2 [9,20] −2
intercept 4.02 1.33e 302.9 <2e−16
[20,4788] followers 1.83e−4 2.16e−7 848.9 <2e−16
1 white −5.98e−1 1.86e−2 -32.09 <2e−16
●
●
●
●
●
●
●
●
●
●
gray −4.18e−1 1.57e−2 -26.63 <2e−16
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
value 1.28e−3 9.11e−5 14.10 <2e−16
−1.08e−3 7.83e−5 <2e−16
●
0 ● ● ●
saturation -13.79
[1,9] [9,20] [20,4788] colorfulness1 −1.34e−3 1.21e−4 -11.05 <2e−16
Groups based on number of projects black −1.75e−1 1.95e−2 -9.00 <2e−16
olive −5.97e−1 4.88e−2 -12.23 <2e−16
Figure 7: Topic diversity (entropy) of three equal-sized colorfulness2 1.07e−3 1.09e−4 9.81 <2e−16
groups of users based on project counts.
Summary null.dev res.dev χ2 p
859K 680K 179K <2e−16
1.0
●
●
●
●
●
●
●
R3-What colors attract appreciations?
0.8 ●
●
●
●
●
●
●
●
● factor(group)
Homophily
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
Table 5 presents the result of a negative binomial regression
●
●
● ●
● [0,1.55]
●
●
●
●
●
●
●
●
model using the color metrics as experimental variables.
●
● ●
●
0.6 ●
●
●
●
●
● [1.55,1.98] The reduction in deviance from the full model to the null
●
●
● ●
● ●
●
●
●
●
●
●
●
●
●
●
● ●
●
● [1.98,3.2] model is significant, χ2 (21, N=542K)=859K-680K=179K,
●
●
●
●
●
●
●
● p<2e−16 .
●
0.4 ●
●
● We also compared the full model with a simpler model
● using the control variable alone (number of followers). The
[0,1.55] [1.55,1.98] [1.98,3.2] AIC8 value of the full model was lower than that of the
Groups based on diversity control-only model (1.90K>0), meaning that the full model
has a better fit. The chi-square test comparing the two mod-
Figure 8: Homophily (cosine similarity of ûPu and ûAu ) of els indicates that the color variables are significant predic-
three equal-sized groups of users based on topic diversity. tors, χ2 (20), N=542K)=1.90K, p<2e−16 .
Among all color variables, the highest effect on appreci-
ation is found in white color, β=-5.98e−1 , p<2e−16 . Gray
color, saturation, and colorfulness1 (Hasler and Suesstrunk
We initially hypothesized that users with high topic diver- 2003) had a negative effect while colorfulness2 (Yen-
sity would have higher tendency to appreciate projects of di- drikhovskij, Blommaert, and de Ridder 1998) and value
verse topics, thus resulting in lower homophily. We divided (brightness) has a positive effect on appreciation.
users into three equal-sized groups based on topic diversity Other significant predictors not included in the table are
and plotted the homophily distribution per each group (Fig- red, aqua, teal, and lime colors in the order of their stan-
ure 8). The graph shows an opposing trend of our hypoth- dardized contribution. The β coefficients for red and lime
esis (differences among groups are significant (∀p <0.001, colors are negative, while aqua and teal increases the chance
dgroup1,group2 =0.78, dgroup2,group3 =0.45). We observed that the of receiving appreciations. Silver, maroon, purple, fuchsia,
same trend holds when we divided users into five groups green, yellow, navy, and blue do not seem to impact appre-
(p <0.001). ciation.
We also tested whether men and women differed in the
homophily of their interests. Overall, there is a significant Discussion
difference across gender (p <0.001), suggesting that women In this section, we discuss our findings in light of research
tend to create and appreciate projects of similar topics to questions. We also qualitatively compare our results with
a greater extent than men. However, Cohen’s effect size previous research on Pinterest.
(d=0.067) value suggests a negligible practical significance.
Among the three groups, there is only a significant differ- 8
AIC=2k-2ln(L̂), where L̂ is the maximum value of the likeli-
ence between the first and second groups (p <0.001), but hood function of the model and k is the number of estimated pa-
the effect size is also very small (d=0.10). rameters.
R1-Activity: What attract attention? created by different users. We believe homophily of inter-
Our regression models capture 44% of the variance around ests among creative professionals is an interesting research
the number of appreciations and 48% of the variance around direction and needs further investigations; for example, do
the number of followers. The models provide a fairly good appreciations from people who specialize in different topics
explanatory power although additional features would fur- carry a different degree of recognition? what about appreci-
ther improve prediction accuracy. ations from different gender?
With regard to predicting appreciations, the number of
project comments, views, and counts is a strong indicator. R3-Color: What colors attract appreciations?
While comments and views are out of a user’s control, this Our analysis on colors shows that grayscale colors (black,
indicates that the user may produce more works to be more white, and gray) have a negative impact on attracting appre-
recognized. Most predictors also showed expected behav- ciation. This corresponds to the result of similar color anal-
ior; we see specializing in popular topics, followers, and fol- ysis on Pinterest (Bakhshi and Gilbert 2015) where black &
lowees all contribute. Interestingly, we see that being from white images are not shared as frequently as colored images.
the U.S has a negative impact on appreciation, β=−1.03e−1 , Similarly, the effect of lightness (value in the HSV color
p<2e−16 . space) indicates that the more brighter an image is, the more
For predicting followers, the number of appreciations, appreciation it can expect, β=1.28e−3 , p <2e−16 . This also
comments, views, counts also highly contributes so as the conforms to previous research showing that people consider
number of followees. Focusing on popular topics is also bright objects as good, whereas dark objects are bad (Meier,
helpful except graphic design which we found has a nega- Robinson, and Clore 2004).
tive impact, β=−1.85e−1 , p<2e−16 . Users from the United On the other hand, we observed mixed effects of colors
States are likely to attract more followers; this result con- that have non-saturated hues. Only aqua and teal, which lie
forms to the follower model on Pinterest by Gilbert et al.. at 180°in the hue dimension, shows a positive impact on ap-
What is more surprising is that male users attract more preciation. Red, olive, lime, and green colors, which covers
appreciations and followers than female users. Similarly, 0°to 120°have the opposite effect, while other colors have
Gilbert et al. found that being female users suggest fewer no effect. Our result is not consistent with the result on Pin-
followers on Pinterest. Our result might be due to the larger terest (Bakhshi and Gilbert 2015) where red, purple, and
population of male users on Behance. Further studies are pink colors promotes diffusion of content (i.e, repins). Fur-
necessary for a deeper understanding of the effect of gender. ther studies will be necessary to understand the differences.
The effect of other color features would be also interesting
R2-Topic: How people engage with various topics? to investigate (Khosla, Das Sarma, and Hamid 2014).
Our topic rankings are community-based, derived by aggre-
gating user-specified topics for projects. The two most pop- Limitations
ular topics are graphic design and illustration. Since they are Our work has a number of limitations. It is possible that our
generally used for any visual communication, we expect that sample may not represent the actual population on Behance;
they are specified along with many other topics. we sampled data twice and briefly, confirmed that our major
We see some discrepancy between top 12 topics in our findings hold. In addition, our analysis depends on a com-
rankings and 12 official popular topics from Behance which mercial service for predicting gender. While we manually
had 6 different topics, including interaction design, in- verified a few sample of predictions, we did not confirm
dustrial design, motion graphics, fashion, architecture, and its true accuracy. Because of the gender prediction, we hap-
UI/UX, while we had digital art, advertising, drawing, ty- pened to exclude some Asian countries to which our results
pography, character design, and digital photography. While may not generalize. For color analysis, we used web safe
this discrepancy might be due to the lack of samples we have colors that might not appropriate for the images of artworks.
in our data, Behance seems to select topics that are popular
and distinctive at the same time. Conclusion
The overall entropy of all users (M=1.67) is lower than the
mean entropy (M 2.02, estimated) that Chang et al. found In this paper, we provided a statistical overview of Behance,
on Pinterest data which even has a lower number of cat- a social network site for creative professionals. We found
egories (33, ln(32)=3.50 vs 68 topics, ln(68)=4.22). This several significant results. First, being male leads to more
suggests that Behance users are more specialized. We did followers and appreciations. Second, most users are special-
not find any significant difference across gender, which also ists focusing on a few topics and they tend to appreciate
contrasts with the result from Chang et al. where they found projects in their specialized topics. Finally, grayscale col-
that women curate significantly more diverse content than ors suggest less appreciations. For future work, we plan to
men. investigate another creative community site, Dribble, to see
A high similarity between two sets of topics, one from if our findings hold on the site as well.
projects created by users and another from those appreciated
by the users, suggests that there is homophily among users. Acknowledgement
While we did not directly analyze inter-user relationships This work has also been made possible through support the
(e.g., topics from followers and followees), this is a reason- Kwanjeong Educational Foundation.
able proxy for homophily as the projects appreciated are
References [Kwak et al. 2010] Kwak, H.; Lee, C.; Park, H.; and Moon,
S. 2010. What is twitter, a social network or a news me-
[Akdag Salah and Salah 2013] Akdag Salah, A. A., and dia? In Proceedings of the 19th International Conference
Salah, A. A. 2013. Flow of innovation in deviantart: follow- on World Wide Web, WWW ’10, 591–600. New York, NY,
ing artists on an online social network site. Mind & Society USA: ACM.
12(1):137–149.
[Leskovec and Faloutsos 2006] Leskovec, J., and Faloutsos,
[Bakhshi and Gilbert 2015] Bakhshi, S., and Gilbert, E. C. 2006. Sampling from large graphs. In Proceedings of the
2015. Red, purple and pink: The colors of diffusion on pin- 12th ACM SIGKDD international conference on Knowledge
terest. PloS one 10(2):e0117148. discovery and data mining, 631–636. ACM.
[Cameron and Trivedi 2013] Cameron, A. C., and Trivedi, [Meier, Robinson, and Clore 2004] Meier, B. P.; Robinson,
P. K. 2013. Regression analysis of count data, volume 53. M. D.; and Clore, G. L. 2004. Why good guys wear
Cambridge university press. white automatic inferences about stimulus valence based on
[Chang et al. 2014a] Chang, S.; Kumar, V.; Gilbert, E.; and brightness. Psychological science 15(2):82–87.
Terveen, L. G. 2014a. Specialization, homophily, and [Ottoni et al. 2013] Ottoni, R.; Pesce, J. P.; Casas, D. L.; Jr.,
gender in a social curation site: findings from pinterest. G. F.; Jr., W. M.; Kumaraguru, P.; and Almeida, V. 2013.
In Proceedings of the 17th ACM conference on Computer Ladies first: Analyzing gender roles and behaviors in pinter-
supported cooperative work & social computing, 674–686. est. In International AAAI Conference on Web and Social
ACM. Media.
[Chang et al. 2014b] Chang, Y.; Tang, L.; Inagaki, Y.; and [Perkel 2011] Perkel, D. 2011. Making art, creating infras-
Liu, Y. 2014b. What is tumblr: A statistical overview tructure: Deviantart and the production of the web. In Ph.D.
and comparison. ACM SIGKDD Explorations Newsletter dissertation.
16(1):21–29. [Reinecke et al. 2013] Reinecke, K.; Yeh, T.; Miratrix, L.;
[Deka et al. 2015] Deka, B.; Yu, H.; Ho, D.; Huang, Z.; Tal- Mardiko, R.; Zhao, Y.; Liu, J.; and Gajos, K. Z. 2013. Pre-
ton, J. O.; and Kumar, R. 2015. Ranking designs and users dicting users’ first impressions of website aesthetics with a
in online social networks. In Proceedings of the 33rd Annual quantification of perceived visual complexity and colorful-
ACM Conference Extended Abstracts on Human Factors in ness. In Proceedings of the SIGCHI Conference on Human
Computing Systems, 1887–1892. ACM. Factors in Computing Systems, CHI ’13, 2049–2058. New
York, NY, USA: ACM.
[Florida 2012] Florida, R. 2012. The rise of the creative
class: Revisited. New York. [Rudolph, Hoffman, and Hertzmann 2016] Rudolph, M. R.;
Hoffman, M.; and Hertzmann, A. 2016. A joint model
[Gilbert et al. 2013] Gilbert, E.; Bakhshi, S.; Chang, S.; and for who-to-follow and what-to-view recommendations on
Terveen, L. 2013. I need to try this?: a statistical overview behance. In Proceedings of the 25th International Confer-
of pinterest. In Proceedings of the SIGCHI conference on ence Companion on World Wide Web, 581–584. Interna-
human factors in computing systems, 2427–2436. ACM. tional World Wide Web Conferences Steering Committee.
[Halstead, Serrano, and Proctor 2015] Halstead, S.; Serrano, [Salah et al. 2012] Salah, A. A.; Salah, A. A.; Buter, B.; Di-
H. D.; and Proctor, S. 2015. Finding top ui/ux design talent jkshoorn, N.; Modolo, D.; Nguyen, Q.; van Noort, S.; and
on adobe behance. Procedia Computer Science 51:2426– van de Poel, B. 2012. Deviantart in spotlight: A network of
2434. artists. Leonardo 45(5):486–487.
[Han et al. 2014] Han, J.; Choi, D.; Chun, B.-G.; Kwon, T.; [Salah 2010] Salah, A. A. 2010. The online potential of
Kim, H.-c.; and Choi, Y. 2014. Collecting, organizing, and art creation and dissemination: deviantart as the next art
sharing pins in pinterest: Interest-driven or social-driven? venue. In Proceedings of the 2010 international confer-
SIGMETRICS Perform. Eval. Rev. 42(1):15–27. ence on Electronic Visualisation and the Arts, 16–22. British
[Hasler and Suesstrunk 2003] Hasler, D., and Suesstrunk, Computer Society.
S. E. 2003. Measuring colorfulness in natural images. Proc. [Scolere and Humphreys 2016] Scolere, L., and Humphreys,
SPIE 5007:87–95. L. 2016. Pinning design: The curatorial labor
[Hu et al. 2014] Hu, Y.; Manikonda, L.; Kambhampati, S.; of creative professionals. Social Media + Society
et al. 2014. What we instagram: A first analysis of insta- 2(1):2056305116633481.
gram photo content and user types. In ICWSM. [Tan and Yuen 2015] Tan, Y. Y., and Yuen, A. H. 2015.
destuckification: Use of social media for enhancing design
[Jones 2015] Jones, B. L. 2015. Collective learning re-
practices. In New Media, Knowledge Practices and Multilit-
sources: Connecting social-learning practices in deviantart
eracies. Springer. 67–75.
to art education. Studies in Art Education 56(4):341–354.
[Yendrikhovskij, Blommaert, and de Ridder 1998]
[Khosla, Das Sarma, and Hamid 2014] Khosla, A.; Yendrikhovskij, S.; Blommaert, F. J.; and de Ridder,
Das Sarma, A.; and Hamid, R. 2014. What makes an H. 1998. Optimizing color reproduction of natural images.
image popular? In Proceedings of the 23rd International In Color and Imaging Conference, 140–145. Society for
Conference on World Wide Web, WWW ’14, 867–876. New Imaging Science and Technology.
York, NY, USA: ACM.