
TensorFlow-based Automatic Personality Recognition Used in

Asynchronous Video Interviews

Abstract- The automated analysis of video interviews to determine individual personality characteristics has been an important topic of study, with applications in personality computing, human-computer interaction, and psychological evaluation, since the emergence of artificial intelligence (AI). Thanks to advances in computer vision and pattern recognition based on deep learning (DL) approaches, convolutional neural network (CNN) models can effectively analyze human nonverbal signals captured by a camera and infer personality traits from them. In this study, an end-to-end AI interviewing system was developed using asynchronous video interview (AVI) processing and a TensorFlow AI engine to perform automatic personality recognition (APR), based on features extracted from the facial expressions recorded in the AVIs and ground-truth personality scores from the self-reported questionnaires of 120 real job applicants. The experimental findings reveal that our AI-based interview agent can correctly distinguish an interviewee's "big five" traits with an accuracy of 90.9 to 97.4 percent. Our research also shows that, despite the lack of large-scale data and without time-consuming human annotation and labelling, the semi-supervised DL technique performed unexpectedly well at automated personality detection. The AI-based interview agent may supplement or replace existing self-reported personality evaluation techniques, which job candidates may falsify to obtain socially desirable outcomes.

Index Terms- Big five, convolutional neural network (CNN), personality computing,

TensorFlow
I. Introduction

Personality is a universal predictor used in job selection, according to industrial and organizational (I/O) psychologists [1]. Some businesses use self-reported surveys to assess job candidates' personalities; however, job seekers may lie about their personality traits in order to increase their chances of landing a job [2]. Because candidates have a hard time faking nonverbal signals, some employers rely on facial expressions and other nonverbal indicators during job interviews to assess candidates' personalities [3]. Due to cost and time constraints, it is not feasible for every job candidate to attend a live job interview in person or to participate in interviews conducted over the telephone or via web conferencing [4]. One-way asynchronous video interview (AVI) software may be used to interview job candidates automatically at a single point in time; employers may then evaluate the audio-visual recordings at a later date [5]. However, human raters find it difficult to accurately assess applicants' personality attributes from video images when using AVI [6]. According to Barrick et al. [7], human raters were unable to effectively identify an applicant's personality just by viewing recorded video interviews.

Because applying AI techniques to audio-visual datasets can achieve greater reliability and predictive power than human raters, both I/O psychology and computer science scholars have suggested that artificial intelligence (AI) may surpass humans in recognizing or predicting an applicant's personality for screening job applicants [8]–[11]. "AI is a discipline of computer science that tries to create intelligent computers that act in a way comparable to human intelligence" [12], and it "aims to expand and enhance human capability and efficiency in activities of remaking nature" [13]. Machine learning (ML) is a key component of AI, since it "allows computers to learn without having to be explicitly programmed" [14]. Deep learning (DL) is an ML implementation methodology that can "mimic the human brain process to comprehend input including pictures, sounds, and texts" [15]. In contrast to classical ML, DL feature extraction is automated rather than laborious [12].

ML/DL methods fall into three types: supervised learning, unsupervised learning, and semi-supervised learning [12]. Unsupervised learning can automatically learn patterns from a huge quantity of data without needing predetermined labels [10], [16]. Supervised learning tasks are generally accomplished by classification using predetermined labelled training data (called "ground truth"). Semi-supervised learning combines the two techniques by employing a large quantity of unlabeled data along with a small amount of labelled data for pattern recognition; as a result, this strategy can minimize labelling effort while maintaining high accuracy.
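As an illustration of the semi-supervised setting (not this paper's own pipeline), the following sketch uses scikit-learn's self-training wrapper, in which unlabeled samples carry the placeholder label -1 and later receive pseudo-labels from the classifier's confident predictions; the data here is synthetic.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)   # true labels for a toy task
y[20:] = -1                     # -1 marks the 180 unlabeled samples

# Self-training: the base classifier is first fit on the 20 labeled
# points, then its confident predictions pseudo-label the rest.
clf = SelfTrainingClassifier(LogisticRegression())
clf.fit(X, y)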

Previous automated personality recognition (APR) research relied on supervised machine learning (ML), which requires time-intensive human labelling [17]. Because convolutional neural networks (CNNs) have been shown to be high-performing models for automatically processing images and inferring first impressions from camera images, this study used semi-supervised DL methods, including CNNs, to develop an AI-based interview agent that can automatically recognize a job applicant's personality from relatively small datasets. The rest of the article is organized as follows: Section II discusses the background of APR from audio-visual data. Section III states the research problem, and Section IV describes our data processing strategy. Section V presents the model in detail along with its results. Finally, Sections VI and VII discuss and summarize our findings as well as our future work.

II. Related Work

A. Personality Taxonomy

Personality refers to individual differences in characteristic patterns of thinking, feeling, and behaving [19]. This concept is often used to predict whether a job seeker will do well in a particular work function and participate effectively in a prospective cultural environment [20]. The "big five" traits, also known as the five-factor model (FFM) or OCEAN model, offer academics and practitioners a well-defined taxonomy for choosing job candidates [20]. The five fundamental characteristics of openness, conscientiousness, extraversion, agreeability, and neuroticism (low emotional stability) are recognized and used across many cultural contexts [21]:

• Openness: the degree to which a person is imaginative and creative.
• Conscientiousness: an individual's level of organization, thoroughness, and thoughtfulness.
• Extraversion: how talkative, lively, and assertive a person is.
• Agreeability: a person's willingness to compromise, sympathy, kindness, and warmth.
• Neuroticism: a person's stress, moodiness, and anxiety are reflected in this trait.

B. Personality Computing

Different methodologies, including self-rating and observer rating, exist to measure a person's big five traits. Self-ratings reveal one's own personality, whereas observer ratings represent other people's subjective judgments of a person's personality. From the self-rating perspective, personality covers the motives, intentions, emotions, and actions reported by a person. From the observer's point of view, personality includes information on the social reputation of an individual, although ideally only intimate contacts such as partners, friends, or colleagues can provide reliable observer ratings [19]. In I/O psychology, self-rated evaluations are the basic information used to anticipate individual workplace behaviour and performance when it is difficult for an observer to determine a person's true big five traits. Self-ratings may also be used in a zero-acquaintance setting, such as a job interview [19], to forecast whether a job applicant is a good match for the job criteria and the business culture.

Interviewees use distal cues to externalize their apparent personality (i.e., any observable behaviour that can be perceived by the interviewer, such as facial expression, gaze, posture, body movement, speech, and prosody). In turn, the interviewer uses proximal cues (i.e., any interviewee behaviours that are actually perceived by the interviewer, including indirectly observable cues) to attribute unobservable personality traits to the interviewee; these cues translate into the interviewer's perceptions [6], [23].

Although only the candidates' heads and torsos are visible in an AVI, personality psychology researchers have found that an interviewer or rater may still use nonverbal cues to determine the candidates' personality traits [24]. Several experimental investigations have shown that individuals can infer genuine personality attributes at zero acquaintance based on short video snippets [23]. In personality computing, an emerging research area connecting AI and personality psychology, the lens model is used to automatically recognize, perceive, and synthesize human behavioural signals and personality. APR, automatic personality perception (APP), and automatic personality synthesis (APS) are the three methodologies in personality computing for auto-assessing personality [6].

APR is designed to auto-recognize an interviewee's self-assessed personality from distal cues by extracting features from the audio-visual data of the AVI [9], [25]–[27]. APP, on the other hand, is designed to predict an interviewee's perceived personality from proximal cues [28]. Because investigating proximal cues is difficult, APP relies on distal cues as a proxy, as discussed in [8], [10], [17], [29]–[33]. In APS, artificial agents, avatars, or robots are designed to exhibit human-like distal cues, with the purpose of having people recognize and infer these cues as preset attributes [34], [35]. An AVI may be adapted and integrated with APR to auto-recognize interviewees' "actual scores," i.e., their self-rated personality attributes, yielding AI-based APR [6].

C. Mel-Frequency Cepstrum

The mel-frequency cepstrum (MFC) is an audio representation based on the long-term power spectrum. It is obtained by taking the linear cosine transform of a log power spectrum on a nonlinear mel frequency scale. Mel-frequency cepstral coefficients (MFCCs) collectively make up an MFC. MFCCs are derived from a type of cepstral representation of the audio clip, i.e., a spectrum of a spectrum. In an MFC, the frequency bands are equally spaced on the mel scale, which approximates the human auditory system's response more closely than the linearly spaced frequency bands used in the conventional cepstrum.

The steps to generate mel-frequency cepstral coefficients are as follows (Fig-1; a code sketch follows the figure):

1. Take the Fourier transform of a windowed snippet of the audio signal.
2. Map the powers of the resulting spectrum onto the mel scale using triangular overlapping windows.
3. Take the log of the power at each mel frequency.
4. Take the discrete cosine transform of the list of mel log powers.
5. The amplitudes of the resulting spectrum are the required MFCCs.


Fig-1 Steps to derive MFC
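The following is a minimal NumPy/SciPy sketch of this five-step pipeline; the hyperparameters (frame length, hop size, number of mel bands and coefficients) are common illustrative defaults, not values from this study.

import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr, frame_len=400, hop=160, n_fft=512, n_mels=26, n_ceps=13):
    # Step 1: Fourier transform of Hamming-windowed snippets of the signal.
    window = np.hamming(frame_len)
    frames = np.array([signal[i:i + frame_len] * window
                       for i in range(0, len(signal) - frame_len + 1, hop)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2

    # Step 2: map the spectral powers onto the mel scale with a bank of
    # triangular overlapping windows.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, left:center] = (np.arange(left, center) - left) / max(center - left, 1)
        fbank[m - 1, center:right] = (right - np.arange(center, right)) / max(right - center, 1)

    # Step 3: log of the power in each mel band.
    log_mel = np.log(power @ fbank.T + 1e-10)

    # Steps 4-5: the DCT of the mel log powers yields the MFCCs.
    return dct(log_mel, type=2, axis=1, norm='ortho')[:, :n_ceps]

# Example: 13 MFCCs per frame for one second of synthetic audio at 16 kHz.
coeffs = mfcc(np.random.randn(16000), sr=16000)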

III. Problem Statement

Work productivity and organizational effectiveness are associated with interpersonal communication skills and personal qualities. For information to be conveyed accurately, both verbal and nonverbal means of communication are used, including gestures, posture, and tone. To this end, this study proposes a novel technique that uses machine learning algorithms to identify personality traits.

IV. Data Processing

A. Data Collection

We created cloud-based AVI software, similar to the work in [42], to build our dataset in a genuine job interview setting. The AVI server can accept recorded video prompts, construct interview scripts, send video prompts during the interview, and receive video answers using Google Cloud Storage. The content of the video replies can subsequently be used for algorithmic studies, such as audio and visual data analytics of the responses. Within the AVI, interview responses may be recorded at one moment in time but assessed afterwards by an algorithm, human raters, or both.


We ran an experiment with the help of an Ahmedabad-based nonprofit human resources (HR) organization. Our sponsor made a job description available to its members on a website for recruiting 2–3 HR experts for affiliated firms in Ahmedabad, Rajkot, and Baroda. Interested members sent their resumes to the researchers, who vetted them against the job description. A total of 120 candidates were asked to log into the AVI, which delivered pre-programmed interview questions to the interviewees' web-camera-equipped mobile or computer devices at any time of day through a browser. The respondents' replies, including both audio and visual information, were captured for examination. Our algorithms recorded and evaluated the entirety of the candidates' interview procedures and replies, including audio and video, and used them as a reference for hiring recommendations.

During the AVI, the interview questions were presented in a standardized fashion. The same five questions were given to all candidates, and they were behaviorally oriented to evaluate the candidates' communication abilities in light of the job description. Each question was shown on a separate screen, and as the candidates entered the screen, the audio for the text question began immediately. The candidates were allowed a maximum of 3 minutes to answer each question; the questions were displayed on-screen one at a time in order. Within the 3-minute limit, candidates could opt to skip to the following question; after 3 minutes, a new screen with the following question opened automatically. The whole video interview procedure took around 20 minutes, including one practice attempt.

B. Data Labelling

To acquire ground-truth ratings for the individual big five traits [6], we used the 50-item International Personality Item Pool (IPIP) assessment created in [43] to evaluate the applicants' self-rated big five traits. All candidates were requested to take the IPIP survey online before participating in the AVI and were told that the survey data would be available to the researchers alone and would have no bearing on the hiring recommendation. This procedure was carried out to lessen the effects of social desirability, which may distort self-rated personality traits in the pursuit of a job [44].

C. Feature Extraction

To capture the candidates' facial expressions, we began with the Inception-v3 model pretrained on ImageNet, a dataset of over 14 million photos organized into 1,000 classes. In addition, as shown in Figure-2, we trained our face recognition model using OpenCV and Dlib while tracking 86 facial landmark points in every frame (see the sketch after Fig-2). Because these landmark points are unaffected by facial emotions, we used them as anchor point locations during feature extraction to limit background noise and reduce errors such as those caused by head motion [45].

Fig-2 Image Annotation
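As a rough illustration of this step, per-frame landmark extraction with OpenCV and Dlib might look like the sketch below. The model path is a placeholder (the paper does not name one), and note that Dlib's stock predictor provides 68 points, whereas the model used in this study tracks 86.

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Placeholder path; a predictor trained for 86 points would be loaded here.
predictor = dlib.shape_predictor("shape_predictor.dat")

def landmarks_for_frame(frame):
    """Return (x, y) landmark points for the first detected face, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)          # upsample once to catch small faces
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return [(p.x, p.y) for p in shape.parts()]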


Using FFmpeg, we extracted images frame by frame from our AVI dataset to create the feature extractor (a preprocessing sketch follows Fig-3). All images were normalized to 640 pixels in width, with the height of each image determined by the capture device's aspect ratio. As shown in Figure-3, we retrieved the features of the 86 landmark points from each frame over a 5-second period from all of the AVI data for each applicant. We converted all images to grayscale to improve image categorization and eliminate background interference from hair and cosmetics. More than 10,000 images were used in this experiment's test scenarios.

Fig-3 Extracted Video Frames
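Illustratively, the preprocessing described above can be sketched as follows, assuming FFmpeg is on the PATH; the file name and the one-frame-per-second sampling rate are examples, not the study's exact settings.

import glob
import os
import subprocess

import cv2

os.makedirs("frames", exist_ok=True)

# Sample one frame per second from a candidate's recorded answer.
subprocess.run(["ffmpeg", "-i", "candidate_01.mp4", "-vf", "fps=1",
                "frames/frame_%04d.png"], check=True)

for path in glob.glob("frames/*.png"):
    img = cv2.imread(path)
    h, w = img.shape[:2]
    img = cv2.resize(img, (640, int(h * 640 / w)))   # normalize width to 640 px
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # drop hair/cosmetic color cues
    cv2.imwrite(path, gray)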


V. Modeling and Results

A. Model Building

To train our APR model, which was built with a deep CNN developed in Python on the TensorFlow DL engine [46], we combined the 120 applicants' personality-labeled data with their extracted features. Before feeding the features into the neural network model, we normalized them by rescaling their values to the range [0, 1]. The initial layers of the neural network model, which had over 72 thousand neurons, were designed to automatically extract information from the images preprocessed as described in the preceding section. The extracted features were then concatenated with additional features before being sent to the output layer for classification. Our CNN structure included four convolutional layers, three pooling layers, ten mixed layers, a fully connected layer, and a softmax output layer, as shown in Figure-4.

Fig-4 Steps in CNN

The neural network's inputs were the extracted features of the applicants' facial expressions, and the outputs were their self-assessed big five personality trait scores. As illustrated in Figure-4, the fully connected network was formed using the relationships between distinct nodes. The input was represented as a grayscale image. Three filter functions were used in each of the four convolutional layers; convolutional layer 2 had 64 convolution filters, up from 32 in convolutional layer 1. A pooling layer followed each convolutional layer. The dropout rate was set at 0.1, and the pooling layers (one average-pooling and two max-pooling) had a stride of 2 × 2. The last fully connected layer had 2,048 neurons, with dropout rates of 0.4 and 0.5. The proposed CNN's last layer was a softmax layer with 50 potential outputs (10 interval-scale classes from 1.1 to 6.0 for each of the big five personality dimensions). To avoid overfitting, we added dropout after the fully connected layers at the end of our convolutional network, beginning with a probability of 0.5 and progressively decreasing the dropout rate until performance was optimized.
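The exact network (ten Inception-style mixed layers atop Inception-v3 features) is richer than what fits here; the following Keras sketch only mirrors the conv/pool/dropout skeleton described above, with an assumed 160 × 160 grayscale face-crop input.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(160, 160, 1)),        # grayscale face crops (assumed size)
    layers.Conv2D(32, 3, activation="relu"),  # conv layer 1: 32 filters
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),  # conv layer 2: 64 filters
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.AveragePooling2D(2),               # one average-pooling, two max-pooling
    layers.Conv2D(64, 3, activation="relu"),
    layers.Dropout(0.1),
    layers.Flatten(),
    layers.Dense(2048, activation="relu"),    # final fully connected layer
    layers.Dropout(0.5),                      # tuned downward until performance peaked
    layers.Dense(50, activation="softmax"),   # 10 score bins x 5 traits
])

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])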

We divided the data in these studies as follows: 50% for the test set and 50% for the validation set, drawn from the same sampling. In this dataset, each candidate possessed five target attributes (the big five traits). A total of 4,000 training iterations were completed, with a training batch size of 256, a learning rate of 0.01, and an evaluation frequency of 10.
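A hypothetical training call consistent with the reported batch size might look as follows; the data arrays are synthetic stand-ins for the normalized feature images and binned trait-score labels, and the epoch count is an assumption (the paper reports 4,000 iterations in total).

import numpy as np

train_x = np.random.rand(120, 160, 160, 1).astype("float32")  # already in [0, 1]
train_y = np.random.randint(0, 50, size=120)                  # placeholder bins

model.fit(train_x, train_y, batch_size=256, epochs=40)  # epoch count illustrative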

B. Criteria for the Assessment

To analyze the performance of our APR, we used concurrent validity, i.e., how well the novel measuring technique (the APR) corresponded with the well-established measuring procedure (the self-reported inventory). In this experiment, we used the Pearson correlation coefficient (r) to quantify concurrent validity, as described in [6] and [47]. Following [8], we also calculated the coefficient of determination (R2) and the mean square error (MSE). R2 reflects the proportion of variation in the dependent variable (y) that the predictor can predict or explain; the higher the R2, the better the model. The MSE quantifies the regression model's goodness of fit; the higher the value, the bigger the error.
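For concreteness, these three metrics can be computed as in the following sketch with SciPy and scikit-learn; the score arrays are illustrative placeholders, not the study's data.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score, mean_squared_error

y_true = np.array([3.2, 4.1, 2.8, 5.0, 3.9])  # self-reported scores (placeholders)
y_pred = np.array([3.0, 4.3, 2.9, 4.8, 4.0])  # APR predictions (placeholders)

r, p = pearsonr(y_true, y_pred)           # concurrent validity
r2 = r2_score(y_true, y_pred)             # proportion of variance explained
mse = mean_squared_error(y_true, y_pred)  # lower is better; 0 is ideal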
C. Results

Prior to evaluating our APR's performance, we tested the construct validity and internal consistency reliability of the self-reported personality measures using IBM's Statistical Package for the Social Sciences (SPSS v23). A confirmatory factor analysis revealed that each factor loading was larger than 0.6, while the Kaiser-Meyer-Olkin (KMO) value was larger than 0.8, indicating adequate construct validity in this research. The internal consistency reliability was good, as the Cronbach's alpha (α) values were all larger than 0.7: openness to experience (α = .75), conscientiousness (α = .83), extraversion (α = .88), agreeableness (α = .80), and neuroticism (α = .84).

The TensorFlow AI engine effectively learned and predicted all dimensions of the big five traits, as shown in Table-1. The APR could predict all of the genuine big five personality self-assessment scores. Each dimension's Pearson correlation was between 0.966 and 0.976, its R2 ranged from 0.933 to 0.952, and its MSE was between 0.053 and 0.120; all correlations were significant (p < 0.01). The greater the R2, the better the estimator (100 percent is ideal); conversely, the lower the MSE, the smaller the estimator's error (0 is ideal). Furthermore, the average classification accuracy (ACC) of the classifiers was 95.36 percent.

TABLE-1 Experiment Results

Personality Traits       r      R2     MSE    ACC (%)
Openness to experience   0.966  0.933  0.053  97.4
Conscientiousness        0.976  0.952  0.094  96.7
Extraversion             0.974  0.949  0.120  97.0
Agreeability             0.971  0.943  0.069  90.9
Neuroticism              0.968  0.937  0.092  94.8

p < .01
VI. Discussion

A. Comparison with Related Works

This research responds to calls for personality computing research [40], [48], [49]. In classical personality computing, validating APR using manually labelled features from any discernible distal cues was fairly difficult [6]. As a result, several recent studies have used deep learning architectures to predict personality using third-party datasets such as Amazon's Mechanical Turk or ChaLearn's First Impressions dataset [40]. However, most of these experiments employed APP, in which the DL engines acted as observers detecting nonverbal cues from interviews and making inferences about the interviewees' personality traits in the setting of zero-acquaintance judgments. In other words, these studies employed subjective personality perceptions rather than real personality scores as ground truth [6], which might have introduced biases [18].

B. Future Works

Previous research has shown that deep neural networks that learn multimodal features (image frames and audio) outperform those using unimodal features in predicting the big five traits. In future research, we may combine our visual technique with prosodic features to learn to discern an interviewee's personality. Furthermore, the participants in this research were a specialized type of professional, which may limit the generalizability of the findings. A more varied participant pool should be included in future studies.

VII. Conclusion

Based on just 120 genuine samples of job candidates, this article developed an AVI integrated with a TensorFlow-based semi-supervised DL model to successfully auto-recognize an interviewee's genuine personality. In the context of nonverbal communication, our APR technique achieved an accuracy of over 90%, exceeding earlier relevant laboratory experiments with accuracies ranging from 61% to 75% [6]. The high-performing APR used in this AVI may be employed to enhance or replace self-reported personality evaluation approaches, which job candidates may skew owing to the social desirability of being hired.

References

[1] A. M. Ryan et al., "Culture and testing practices: is the world flat?" Appl. Psychology, vol. 66, no. 3, pp. 434–467, Mar. 2017.

[2] M. J. W. McLarnon, A. C. DeLongchamp, and T. J. Schneider, "Faking it! Individual

differences in types and degrees of faking behavior," Personality and Individual

Differences, vol. 138, pp. 88–95, Feb. 2019.

[3] T. DeGroot and J. Gooty, "Can nonverbal cues be used to make meaningful

personality attributions in employment interviews?" J. Bus. Psychology, vol. 24, no.

2, pp. 179–192, Feb. 2009.

[4] I. Nikolaou and K. Foti, "Personnel selection and personality," in The SAGE

Handbook of Personality and Individual Differences, V. Zeigler-Hill and T.

Shackelford, Eds., London, UK: Sage, 2018, pp. 659–677.

[5] F. S. Brenner, T. M. Ortner, and D. Fay, "Asynchronous video interviewing as a new

technology in personnel selection: the applicant’s point of view," Frontiers in

Psychology, vol. 7, no. 863, pp. 1–11, Jun. 2016.

[6] A. Vinciarelli and G. Mohammadi, "A survey of personality computing," IEEE

Trans. Affective Comput., vol. 5, no. 3, pp. 273–291, Jul. 2014.

[7] M. R. Barrick, G. K. Patton, and S. N. Haugland, "Accuracy of interviewer

judgments of job applicant personality traits," Personnel Psychology, vol. 53, no. 4,
pp. 925–951, Dec. 2000.

[8] O. Celiktutan and H. Gunes, "Automatic prediction of impressions in time and

across varying context: personality, attractiveness and likeability," IEEE Trans.

Affective Comput., vol. 8, no. 1, pp. 29–42, Jan. 2017.

[9] N. Y. Asabere, A. Acakpovi, and M. B. Michael, "Improving socially-aware

recommendation accuracy through personality," IEEE Trans. Affective Comput., vol.

9, no. 3, pp. 351–361, Aug. 2018.

[10] I. Naim, M. I. Tanveer, D. Gildea, and M. E. Hoque, "Automated analysis and

prediction of job interview performance," IEEE Trans. Affective Comput., vol. 9, no. 2,

pp. 191–204, Apr. 2018.

[11] M. Langer, C. J. König, and K. Krause, "Examining digital interviews for personnel

selection: applicant reactions and interviewer ratings," Int. J. Selection Assessment,

vol. 25, no. 4, pp. 371–382, Dec. 2017.

[12] Y. Xin et al., "Machine learning and deep learning methods for cybersecurity," IEEE Access, vol. 6, pp. 35365–35381, May 2018.

[13] J. Liu et al., "Artificial intelligence in the 21st century," IEEE Access, vol. 6, pp.

34403–34421, Mar. 2018.

[14] M. I. Jordan and T. M. Mitchell, "Machine learning: trends, perspectives, and

prospects," Science, vol. 349, no. 6245, pp. 255–260, Jul. 2015.

[15] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning,"

Nature, vol. 521, no. 7553, pp. 436–444, May 2015.

[16] K. K. Htike and O. O. Khalifa, "Comparison of supervised and unsupervised learning

classifiers for human posture recognition," in Int. Conf. Comput. Commun. Eng.

(ICCCE'10), Kuala Lumpur, Malaysia, 2010, pp. 1–6.

[17] B. Aydin, A. A. Kindiroglu, O. Aran, and L. Akarun, "Automatic personality


prediction from audiovisual data using random forest regression," in 23rd Int. Conf.

Pattern Recognition (ICPR), Cancun, Mexico, 2016, pp. 37–42.

[18] H. J. Escalante et al., "Explaining first impressions: modeling, recognizing, and

explaining apparent personality from videos," arXiv:1802.00745, 2018.

[19] M. Mikulincer, P. R. Shaver, M. L. Cooper, and R. J. Larsen, APA Handbooks in

Psychology. APA Handbook of Personality and Social Psychology. Personality

Processes and Individual Differences. Washington, DC: American Psychological

Association, 2015.

[20] L. M. Hough and S. Dilchert, "Personality: its measurement and validity employee

selection," in Handbook of Employee Selection, J. L. Farr and N. T. Tippins, Eds.,

New York, NY: Routledge, 2017, pp. 298–325.

[21] D. P. Schmitt, J. Allik, R. R. McCrae, and V. Benet-Martínez, "The geographic

distribution of big five personality traits," J. Cross-Cultural Psychology, vol. 38, no. 2,

pp. 173–212, Mar. 2007.

[22] J. B. Walther, "Theories of computer-mediated communication and interpersonal

relations," in The Handbook of Interpersonal Communication, M. L. Knapp and J. A.

Daly, Eds., Thousand Oaks, CA: Sage, 2011, pp. 443–479.

[23] S. Nestler and M. D. Back, "Applications and extensions of the lens model to

understand interpersonal judgments at zero acquaintance," Current Directions

Psychological Sci., vol. 22, no. 5, pp. 374–379, Sep. 2013.

[24] C. A. Gorman, J. Robinson, and J. S. Gamble, "An investigation into the validity of

asynchronous web-based video employment-interview ratings," Consulting

Psychology J., vol. 70, no. 2, pp. 129–146, Jun. 2018.

[25] L. Batrinca, B. Lepri, N. Mana, and F. Pianesi, "Multimodal recognition of personality

traits in human-computer collaborative tasks," in Proc. 14th ACM Int. Conf.
Multimodal Interaction, Santa Monica, CA, 2012, pp. 39–46.

[26] L. M. Batrinca, N. Mana, B. Lepri, F. Pianesi, and N. Sebe, "Please, tell me about

yourself: automatic personality assessment using short self-presentations," in Proc.

13th Int. Conf. Multimodal Interfaces, Alicante, Spain, 2011, pp. 255–262.

[27] A. V. Ivanov, G. Riccardi, A. J. Sporka, and J. Franc, "Recognition of personality

traits from human spoken conversations," in Proc. 12th Annu. Conf. Int. Speech

Commun. Assoc. (INTERSPEECH 2011), Florence, Italy, 2011, pp. 1549–1552.

[28] Y. Güçlütürk et al., "Multimodal first impression analysis with deep residual

networks," IEEE Trans. Affective Comput., vol. 9, no. 3, pp. 316–329, Sep. 2018.

[29] L. Chen, F. Zaromb, Z. Yang, C. W. Leong, and M. Martin-Raugh, "Can a machine

pass a situational judgment test measuring personality perception?" in Seventh Int.

Conf. Affective Comput. Intelligent Interaction Workshops and Demos (ACIIW), San

Antonio, TX, 2017, pp. 7–11.

[30] L. Chen et al., "Automated video interview judgment on a large-sized corpus

collected online," in Seventh Int. Conf. Affective Comput. Intelligent Interaction

(ACII), San Antonio, TX, 2017, pp. 504–509.

[31] L. Chen et al., "Automatic scoring of monologue video interviews using multimodal

cues," in Proc. INTERSPEECH 2016, the 17th Annu. Conf. Int. Speech Commun.

Assoc., Princeton, NJ, 2016, pp. 32–36.

[32] L. S. Nguyen and D. Gatica-Perez, "Hirability in the wild: analysis of online

conversational video resumes," IEEE Trans. Multimedia, vol. 18, no. 7, pp. 1422–

1437, Jul. 2016.

[33] A. T. Rupasinghe, N. L. Gunawardena, S. Shujan, and D.S. Atukorale, "Scaling

personality traits of interviewees in an online job interview by vocal spectrum and

facial cue analysis," in Sixteenth Int. Conf. Advances in ICT for Emerging Regions
(ICTer), Negombo, Sri Lanka, 2016, pp. 288–295.

[34] M. McRorie et al., "Evaluation of four designed virtual agent personalities," IEEE

Trans. Affective Comput., vol. 3, no. 3, pp. 311–322, Jul. 2012.

[35] S. K. Ötting and G. W. Maier, "The importance of procedural justice in human–

machine interactions: intelligent systems as new decision agents in organizations,"

Comput. Human Behavior, vol. 89, pp. 27–39, Dec. 2018.

[36] L. P. Naumann, S. Vazire, P. J. Rentfrow, and S. D. Gosling, "Personality judgments

based on physical appearance," Personality and Social Psychology Bulletin, vol. 35,

no. 12, pp. 1661–1671, Sep. 2009.

[37] J. C. S. J. Junior et al., "First impressions: a survey on computer vision based

apparent personality trait analysis," arXiv:1804.08046, 2018.

[38] R. Petrican, A. Todorov, and C. Grady, "Personality at face value: facial appearance

predicts self and other personality judgments among strangers and spouses," J.

Nonverbal Behavior, vol. 38, no. 2, pp. 259–277, Jan. 2014.

[39] A. Basu et al., "A portable personality recognizer based on affective state

classification using spectral fusion of features," IEEE Trans. Affective Comput., vol.

9, no. 3, pp. 330–342, Jul. 2018.

[40] S. Escalera, X. Baro, I. Guyon, and H. J. Escalante, "Guest editorial: apparent

personality analysis," IEEE Trans. Affective Comput., vol. 9, no. 3, pp. 299–302, Jul.

2018.

[41] G. Pons and D. Masip, "Supervised committee of convolutional neural networks in

automated facial expression analysis," IEEE Trans. Affective Comput., vol. 9, no. 3,

pp. 343–350, Jul. 2018.

[42] S. Rasipuram, S. B. P. Rao, and D. B. Jayagopi, "Asynchronous video interviews vs.

face-to-face interviews for communication skill measurement: a systematic study,"


in Proc. 18th ACM Int. Conf. Multimodal Interaction, Tokyo, Japan, 2016, pp. 370–

377.

[43] L. R. Goldberg, "The development of markers for the big-five factor structure,"

Psychological Assessment, vol. 4, no. 1, pp. 26–42, Mar. 1992.

[44] B. W. Swider, M. R. Barrick, and T. B. Harris, "Initial impressions: what they are,

what they are not, and how they influence structured interview outcomes," J. Appl.

Psychology, vol. 101, no. 5, pp. 625–638, May 2016.

[45] S. Yang and B. Bhanu, "Understanding discrete facial expressions in video using an

emotion avatar image," IEEE Trans. Syst., Man, Cybern. Part B, vol. 42, no. 4, pp.

980–992, Aug. 2012.

[46] K. Bregar and M. Mohorcic, "Improving indoor localization using convolutional

neural networks on computationally restricted devices," IEEE Access, vol. 6, pp.

17429–17441, Mar. 2018.

[47] L. Zheng et al., "Reliability and concurrent validation of the IPIP big-five factor

markers in China: consistencies in factor structure between Internet-obtained

heterosexual and homosexual samples," Personality and Individual Differences, vol.

45, no. 7, pp. 649–654, Nov. 2008.

[48] D. Xue et al., "Personality recognition on social media with label distribution

learning," IEEE Access, vol. 5, pp. 13478–13488, Jun. 2017.

[49] M. M. Tadesse, H. Lin, B. Xu, and L. Yang, "Personality predictions based on user

behavior on the facebook social media platform," IEEE Access, vol. 6, pp. 61959–

61969, Oct. 2018.
