Introduction

To be able to survive and develop sustainably in today's market economy, each business must
ensure stable financial capacity. As a result, analytical work is becoming increasingly important, not
only for corporate administrators but also for investors, shareholders, creditors, government
agencies, and so on. Therefore, as a Research Analyst at SSI Securities Corporation, I chose to prepare
business reports on some Vietnamese companies and industries using statistical methods to assist
managers in understanding the current state of the business, predicting risks, and identifying
opportunities for future growth. From there, managers can develop appropriate policies and plans to
reduce risks and improve operational efficiency. In particular, in this report, I will evaluate and
analyze business data (financial information, stock market data), microeconomic and recent
macroeconomic issues, and future trends related to my research topic.
The investigation of these issues will be limited to the asset turnover of
companies listed on the stock exchange. At the same time, this study will provide
reference material for investors assessing a business's financial position.
Reason for choosing the topic: Real estate investment and business activities are a unique business
field due to a large amount of capital involved, the long capital recovery time, and the highly
sustainable products. As a result, the role of financial analysis in real estate investment projects is
even more critical. Because real estate projects are always large-scale endeavors, errors in financial
analysis are not tolerated. If there is an error in the calculation, the analysis results will have a
significant impact on not only the investment project but also the existence and development of the
business. Realizing the importance of the above urgent problem while studying and researching at
university, as well as learning the actual situation of project formulation and financial analysis of
the company's investment projects, I decided to choose this topic for research.
Research objective: provide statistics for investors.
Research scope: companies listed on the stock exchange 2010-2021.
The study uses quantitative, qualitative, and legacy research methods and aims to provide data in
the most understandable way for investors:
• On the practical side, providing statistics on financial indicators for investors to grasp the
market of the industry.
• Theoretically, the study provides data and definitions of industry companies as a reference
source for future research in the same field.
I. Evaluate the nature and process of business and economic data/information from a range of
different published sources.
Data management is a lexically demanding discipline. The terms "data," "information," and
"knowledge" play an important role in that vocabulary challenge. These three terms are so
frequently misused, abused, and used interchangeably that their true meanings are frequently
obscured. For illustration, Thierauf (1999) defines the three components as follows: data is the lowest
level, a set of unstructured facts and figures; information is the next level and is considered structured
data; knowledge is ultimately defined as "information about information". To begin tackling the
vocabulary challenge and creating an official program framework, these three terms must be
formally defined - to create the data-information-knowledge cycle - and used consistently (Michael
Brackett, 2015).
1. Data
Data is defined as a collection of individual facts or statistics (bloomfire, n.d). Although datum is
the singular form of data, it is not commonly used in everyday speech. Text, observations, figures,
images, numbers, charts, and symbols are all examples of data. At the same time, data is a type of
raw knowledge that has no meaning or purpose on its own. In other words, in order for the data to
make sense, it must be interpreted. Data can appear simple and even ineffective until it is analyzed,
organized, and interpreted.
There are two main types of data:
• Quantitative data is provided in numerical form, such as the weight, volume, or cost of an item.
• Qualitative data is descriptive rather than numerical, such as a person's name, gender, or eye color.
2. Information
Knowledge acquired through study, communication, research, or instruction is referred to as
information. Essentially, information is the result of data analysis and interpretation. While data
refers to specific figures, numbers, or graphs, information refers to people's perceptions of those
pieces of knowledge (dataversity, n.d).
For example, a dataset could include temperature readings from a specific location over time. Those
temperatures make no sense in the absence of any additional context. When that data is analyzed
and organized, however, seasonal temperature patterns or even broader climate trends can be
identified. Data can only be useful to others if it is organized and aggregated in a useful manner.
3. Knowledge
Knowledge is a collection of facts, experiences, and insights that help an individual or organization.
Furthermore, knowledge encompasses information gained through experience, learning, familiarity,
association, perception, and comprehension. Besides that, it includes standards for evaluating inputs
from our surroundings (linkedin, 2021).
Moreover, knowledge is classified into two types: tacit and explicit. Tacit knowledge is
information that a person holds in mind; it can be difficult to communicate to others and to
distribute widely. Explicit knowledge, also known as formal knowledge, is information that has been
formalized and stored in a variety of formats for the benefit of humanity, such as books, periodicals,
audio recordings, presentations, and so on, and is kept in a reference library or on the internet. It is
easily transferable to different mediums and widely available.
4. How data can be turned into information and information into knowledge
Age Percentage of lung cancer cases
20-34 0.2 %
35-44 0.9 %
45-54 6.1 %
55-64 21.9 %
65-74 34.4 %
75-84 26.6 %
Over 84 9.7 %

The statistics table above exemplifies how data can be transformed into information and
information into knowledge. The data shown here are the age groups (20-34, 35-44, 45-54, 55-64,
65-74, 75-84, and over 84) and the percentage of lung cancer cases in each group, taken from
American Cancer Society (ACS) statistics. According to the data collected, the age groups with the
largest share of lung cancer cases are 65-74 years old, with a rate of 34.4%; 75-84 years old, with a
rate of 26.6%; and 55-64 years old, with a rate of 21.9%. There will be an estimated 235,760 new cases of lung cancer and
131,880 deaths from the disease in 2021. While half of all lung cancers are diagnosed between the
ages of 55 and 74, non-small cell lung cancer can occur at any age, including children. Notably,
lung cancer is more common in women under the age of 40 than in men. In a study of people aged
20 to 49 with lung cancer, 28% of women and 18.6% of men had never smoked. Women's high
exposure to tobacco smoke, living in environments with radon gas (a natural air pollutant that
penetrates homes through cracks and small holes in the ground), genetics, environmental exposure,
or an occupational factor all contribute to this rate's rise. According to numerous recent studies, the
human papillomavirus (HPV) can also cause lung cancer. Furthermore, the female body is
extremely sensitive to the carcinogens found in tobacco. As a result, if they are exposed, they are
more likely to develop lung cancer. Overall, lung cancer is most common between the ages of 55
and 84, but it can occur at any age, so special attention is needed.
5. The analysis of data collection and the process of converting data into information and
knowledge
The study's data is obtained from the website cafef.vn. The author then uses the financial statements
of five companies to build a data set with at least six variables, including one categorical variable,
and sixty observations. The results are entered into an Excel file to calculate the
ROA, ROE, and ROS ratios before being loaded into SPSS for analysis.
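The ratio calculation described above can also be sketched in Python. The formulas below are the standard definitions (net income divided by total assets, shareholder's equity, and net sales revenue, expressed as percentages); the report does not state them explicitly, so they are an assumption, although they do reproduce the ROA, ROE, and ROS values listed for SID in 2010 in the data table of Section III. The function name and argument names are illustrative only.

```python
# Sketch only: standard ratio definitions, assumed rather than quoted from the report.

def profitability_ratios(net_income, total_assets, equity, net_sales):
    """Return ROA, ROE, and ROS as percentages."""
    return {
        "ROA": net_income / total_assets * 100,  # return on assets
        "ROE": net_income / equity * 100,        # return on equity
        "ROS": net_income / net_sales * 100,     # return on sales
    }

# Figures for SID in 2010, copied from the data table in Section III (VND).
sid_2010 = profitability_ratios(
    net_income=67_145_256_166,
    total_assets=1_228_369_861_357,
    equity=818_936_922_498,
    net_sales=17_721_170_924,
)

for name, value in sid_2010.items():
    print(f"{name}: {value:.3f}%")
```

Keeping the three ratios in one function makes it easy to apply the same calculation to every company-year row before the data is passed to a statistics package.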
II. Evaluate data from a variety of sources using different methods of analysis.
1. Descriptive analysis
Descriptive statistics is a branch of statistics that aims at describing a number of features of data
usually involved in a study. The main purpose of descriptive statistics is to provide a brief summary
of the samples and the measures taken in a particular study. Coupled with graphical
analysis, descriptive statistics form a major component of almost all quantitative data analysis
(aresearchguide, n.d). Besides that, descriptive statistics are frequently used to present quantitative
data analysis in a straightforward manner. There are numerous variables that are commonly
measured in a study. As a result, descriptive statistics arose to simplify this massive amount of data.
Not only that, but descriptive statistics will assist everyone in the organization in making better
decisions to help the company's activities run smoothly. Simultaneously, managers can quickly
assess how well the company is performing and where adjustments may be required because it
reveals trends that may be hidden in raw data. Furthermore, descriptive statistics aid in the
provision of summary statistics for various data sets, allowing for comparison.
Example: Class A's scores are {70, 85, 90, 65} and Class B's scores are {60, 40, 89, 96}. The average
score of each class is then the mean, 77.5 for class A and 71.25 for class B. This signifies that the average
score of class A is higher than that of class B.
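The class-average example can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the variable names are illustrative.

```python
from statistics import mean

# Scores from the example above.
class_a = [70, 85, 90, 65]
class_b = [60, 40, 89, 96]

mean_a = mean(class_a)
mean_b = mean(class_b)

print(mean_a, mean_b)  # 77.5 71.25
print("Class A has the higher average" if mean_a > mean_b else "Class B has the higher average")
```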
2. Exploratory data analysis
According to Mel Resrori, in data mining, Exploratory Data Analysis (EDA) is an approach to
analyzing datasets to summarize their main characteristics, often with visual methods. EDA is used
for seeing what the data can tell us before the modeling task. It is not easy to look at a column of
numbers or a whole spreadsheet and determine important characteristics of the data. The primary
goal of EDA is to assist in the review of data prior to making any assumptions. It can assist in
identifying obvious errors, as well as better understanding data patterns, detecting outliers, and
discovering interesting relationships between variables (ibm, n.d.). Managers can use exploratory
analytics to ensure that the results they produce are accurate and applicable to any desired business
outcome or goal. EDA also assists stakeholders by ensuring that they are asking the appropriate
questions and can provide answers to questions about standard deviations, categorical variables, and
confidence intervals. When the EDA is finished and the insights have been extracted, the features
can be used for more sophisticated data modeling or analysis.
For example, if a new feature is added to an existing app, product researchers will want to see how
users react to the new feature. The study is not exploratory if the feature added to the app is an
already existing feature.
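One of the EDA tasks mentioned above, detecting outliers, can be sketched with the interquartile range (IQR) rule. The rule and the 1.5 multiplier are a common convention chosen here for illustration, not a method stated in this report; the ROA values are those listed for SID (2010-2021) in the data table of Section III.

```python
from statistics import quantiles

# ROA (%) for SID, 2010-2021, copied from the data table in Section III.
sid_roa = [5.466, 41.682, 4.756, 7.048, 5.542, 4.259,
           1.661, 2.561, 3.172, 2.864, 2.791, 1.350]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # first quartile, median, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

outliers = iqr_outliers(sid_roa)
print(outliers)  # [41.682]
```

A flagged value such as the 2011 figure is a prompt to go back to the source financial statement before modeling, not proof that the value is wrong.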
3. Confirmatory factor analysis

Confirmatory factor analysis, or CFA, is a statistical technique used to test how effectively
measured variables represent specific constructs. CFA works similarly to exploratory factor
analysis, except that researchers specify in advance the number of factors in the data and which
measured variable relates to which latent variable. Typically, CFA reflects or confirms one or
more concepts (indeed, 2022).
For example, this process has aided the development of theories in the social sciences, where several
latent variables can emerge. The equation for the CFA is as follows: x = Λ_x ξ + δ
In this formula, x represents the observed variables, ξ represents the latent variables, Λ_x represents
the loading coefficients connecting the latent and observed variables, and δ
represents any measurement error. Usually, the model for this equation is shown in diagrams where
the latent variable is connected to many observed variables. By using this formula, experts can
determine if their model is viable.
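To make the measurement equation concrete, the sketch below evaluates x = Λ_x ξ + δ for a single respondent. It only computes the right-hand side for given values; it does not estimate a CFA model. The loadings, latent score, and error terms are hypothetical numbers chosen for illustration.

```python
def observed_scores(loadings, latent, errors):
    """Evaluate x = Lambda_x * xi + delta.

    loadings: one row per observed variable, one column per latent variable.
    latent:   scores on the latent variables (xi).
    errors:   one measurement error per observed variable (delta).
    """
    return [
        sum(l * xi for l, xi in zip(row, latent)) + err
        for row, err in zip(loadings, errors)
    ]

# Hypothetical example: three observed indicators loading on one latent factor.
loadings = [[0.8], [0.6], [0.7]]
latent = [2.0]
errors = [0.1, -0.1, 0.0]

x = observed_scores(loadings, latent, errors)
print([round(v, 2) for v in x])
```

In an actual CFA the loadings are unknown and are estimated from the covariances among the observed variables, typically with dedicated software.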
Through example, it can be seen that analysis and statistics are very important to understand the
performance of data as well as in different research models.
III. The analysis of qualitative and quantitative raw business data from a range of examples,
using appropriate statistical methods
Research question: What is the relationship between profit after tax and ROA, and how is it determined?
Defining the study population: all companies that have published their financial statements on the
Vietnamese stock exchange.
Sampling strategy: convenience sampling, by reading financial statements on the website cafef.vn, then
filtering the data and processing it in an Excel file.
Number of observations: 60.

No Stock code Year Average net sales revenue Total assets Shareholder's equity Net income ROA ROE ROS
1 SID 2010 17.721.170.924 1.228.369.861.357 818.936.922.498 67.145.256.166 5,466 8,199 378,899
2 SID 2011 17.384.748.243 2.212.815.317.654 1.667.813.037.295 922.335.580.747 41,682 55,302 5305,430
3 SID 2012 51.179.201.643 2.228.932.605.313 1.757.872.112.275 106.008.650.532 4,756 6,031 207,132
4 SID 2013 92.038.687.855 2.390.823.445.506 1.970.460.202.879 168.516.566.610 7,048 8,552 183,093
5 SID 2014 164.958.799.296 2.190.672.436.121 2.032.436.590.889 121.407.498.996 5,542 5,973 73,599
6 SID 2015 244.555.756.477 2.289.341.637.089 2.069.251.338.539 97.499.593.198 4,259 4,712 39,868
7 SID 2016 127.911.246.403 2.407.695.692.411 2.117.152.739.756 39.996.892.252 1,661 1,889 31,269
8 SID 2017 231.114.134.827 2.333.809.430.493 2.095.824.722.584 59.779.771.729 2,561 2,852 25,866
9 SID 2018 202.784.787.666 2.372.964.204.148 2.176.810.794.760 75.269.821.467 3,172 3,458 37,118
10 SID 2019 108.992.303.383 2.313.287.847.630 2.174.168.234.236 66.258.133.286 2,864 3,048 60,792
11 SID 2020 86.670.263.775 2.374.943.363.222 2.238.678.550.058 66.278.730.395 2,791 2,961 76,472
12 SID 2021 73.419.784.953 2.405.664.611.387 2.269.465.559.600 32.482.427.146 1,350 1,431 44,242
13 BAX 2010 23.071.696.032 477.400.417.520 111.779.943.770 21.540.729.865 4,512 19,271 93,364
14 BAX 2011 35.731.789.665 452.036.164.291 103.825.567.796 22.398.813.948 4,955 21,574 62,686
15 BAX 2012 40.831.036.884 437.600.710.642 104.936.551.427 20.976.092.467 4,793 19,989 51,373
16 BAX 2013 47.131.595.521 478.514.187.762 113.490.290.843 25.283.627.708 5,284 22,278 53,645
17 BAX 2014 61.902.906.539 488.122.441.298 132.367.048.723 24.593.426.018 5,038 18,580 39,729
18 BAX 2015 67.397.918.810 494.718.101.882 135.099.459.973 22.889.375.079 4,627 16,943 33,962
19 BAX 2016 59.390.847.713 498.488.422.785 137.665.347.648 22.693.122.052 4,552 16,484 38,210
20 BAX 2017 70.830.789.628 499.596.364.252 144.237.278.886 25.722.525.837 5,149 17,833 36,315
21 BAX 2018 68.369.463.986 638.879.060.339 131.720.367.492 23.068.344.901 3,611 17,513 33,741
22 BAX 2019 183.774.710.168 890.661.989.439 181.142.451.107 85.024.748.660 9,546 46,938 46,266
23 BAX 2020 311.296.216.195 868.615.011.033 279.585.838.943 145.588.712.320 16,761 52,073 46,769
24 BAX 2021 171.905.798.278 828.373.196.874 226.416.428.053 60.333.386.197 7,283 26,647 35,097
25 SDU 2010 252.265.261.662 854.055.627.225 347.806.895.189 25.007.249.340 2,928 7,190 9,913
26 SDU 2011 68.983.924.820 827.653.905.222 328.183.037.458 1.383.391.609 0,167 0,422 2,005
27 SDU 2012 126.584.229.844 781.920.046.120 330.437.647.452 2.254.609.994 0,288 0,682 1,781
28 SDU 2013 27.328.317.714 799.571.078.732 332.025.452.201 1.587.804.749 0,199 0,478 5,810
29 SDU 2014 127.616.546.791 983.908.860.773 334.723.377.644 2.697.925.443 0,274 0,806 2,114
30 SDU 2015 75.510.193.484 1.190.745.049.977 335.530.223.732 1.076.638.632 0,090 0,321 1,426
31 SDU 2016 537.298.474.545 865.756.928.794 341.130.029.173 4.381.888.779 0,506 1,285 0,816
32 SDU 2017 37.381.982.602 1.038.202.070.920 342.975.315.879 1.867.122.110 0,180 0,544 4,995
33 SDU 2018 23.911.237.986 1.060.815.030.799 348.536.204.303 5.589.951.511 0,527 1,604 23,378
34 SDU 2019 88.375.259.018 1.084.954.619.198 350.934.521.024 612.273.812 0,056 0,174 0,693
35 SDU 2020 87.672.504.218 1.186.505.965.185 349.494.818.987 127.569.832 0,011 0,037 0,146
36 SDU 2021 53.866.473.736 1.185.282.337.276 350.389.589.168 894.770.181 0,075 0,255 1,661
37 IDV 2010 28.664.937.728 223.929.788.432 42.290.468.874 13.460.833.842 6,011 31,829 46,959
38 IDV 2011 29.906.436.640 291.915.203.700 54.176.055.002 14.821.563.187 5,077 27,358 49,560
39 IDV 2012 15.800.764.770 277.183.260.607 44.693.309.627 8.550.686.666 3,085 19,132 54,116
40 IDV 2013 33.562.964.210 318.028.609.860 50.689.103.491 18.121.997.150 5,698 35,751 53,994
41 IDV 2014 66.094.359.522 417.363.308.849 85.879.055.624 47.994.792.201 11,500 55,886 72,616
42 IDV 2015 63.706.304.553 503.999.580.054 117.884.060.462 48.053.166.887 9,534 40,763 75,429
43 IDV 2016 115.244.249.371 617.885.053.479 156.184.755.968 73.465.414.389 11,890 47,038 63,748
44 IDV 2017 109.977.086.307 704.570.355.196 193.425.152.639 80.819.668.566 11,471 41,783 73,488
45 IDV 2018 74.397.622.401 758.871.363.169 222.177.353.493 68.707.770.706 9,054 30,925 92,352
46 IDV 2019 124.523.774.684 921.123.995.842 270.022.110.579 98.369.987.517 10,679 36,430 78,997
47 IDV 2020 220.409.914.409 1.260.852.444.841 423.752.729.619 210.142.001.834 16,667 49,591 95,341
48 IDV 2021 120.917.840.262 1.408.767.706.633 551.300.565.561 155.620.988.680 11,047 28,228 128,700
49 TIP 2010 74.049.614.476 452.257.678.529 368.192.784.383 31.620.305.181 6,992 8,588 42,702
50 TIP 2011 82.482.606.656 438.055.980.081 345.069.039.772 35.851.395.053 8,184 10,390 43,465
51 TIP 2012 172.257.444.506 602.753.326.397 348.468.704.929 32.594.559.366 5,408 9,354 18,922
52 TIP 2013 188.260.735.900 558.281.976.982 353.921.974.809 39.469.347.213 7,070 11,152 20,965
53 TIP 2014 155.438.204.704 539.137.371.317 357.577.900.428 36.352.520.459 6,743 10,166 23,387
54 TIP 2015 194.838.432.309 552.879.473.883 409.861.605.321 72.546.569.990 13,122 17,700 37,234
55 TIP 2016 192.646.418.501 508.240.521.602 404.159.492.500 62.630.793.766 12,323 15,497 32,511
56 TIP 2017 189.766.450.449 560.610.767.450 476.759.602.717 59.898.255.962 10,684 12,564 31,564
57 TIP 2018 196.922.943.329 675.072.151.414 490.584.306.944 96.301.905.503 14,265 19,630 48,903
58 TIP 2019 216.904.874.667 812.886.131.659 517.431.485.001 89.765.998.057 11,043 17,348 41,385
59 TIP 2020 261.044.397.114 1.021.772.109.127 611.490.563.028 138.392.407.892 13,544 22,632 53,015
60 TIP 2021 247.433.037.564 959.916.031.924 693.801.796.537 92.845.295.303 9,672 13,382 37,523

(Financial statements of real estate and construction joint stock companies over 12 years, 2023)
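The table writes monetary figures with a period as the thousands separator and ratios with a comma as the decimal mark. Before such rows can be analyzed in code they have to be converted to plain numbers; the sketch below shows one way to do this. The helper names and dictionary keys are hypothetical, and the column order is assumed to match the table header.

```python
def parse_vn_number(text):
    """Convert '1.228.369.861.357' or '5,466' to a float.

    Assumes '.' is a thousands separator and ',' is the decimal mark.
    """
    return float(text.replace(".", "").replace(",", "."))

def parse_row(line):
    """Split one table row into named fields (column order assumed from the header)."""
    no, code, year, *figures = line.split()
    sales, assets, equity, income, roa, roe, ros = (parse_vn_number(f) for f in figures)
    return {
        "no": int(no), "code": code, "year": int(year),
        "net_sales": sales, "total_assets": assets, "equity": equity,
        "net_income": income, "ROA": roa, "ROE": roe, "ROS": ros,
    }

row = parse_row(
    "1 SID 2010 17.721.170.924 1.228.369.861.357 818.936.922.498 "
    "67.145.256.166 5,466 8,199 378,899"
)
print(row["code"], row["year"], row["net_income"], row["ROA"])
```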

1. The application of descriptive statistics

(Statistics descriptive, 2023)

Mean: 6.588878 is the average ROA value across the 60 observations.
Median: 8.84035 is the middle ROA value when the 60 observations are sorted in order.
Valid: 60 is the total number of observations (N = 60).
Missing: 0, which means there are no missing values.
Mode: 0.11a is the most frequent value in ROA.
Range: 41.671 is the range, the difference between the maximum and minimum values.
Minimum: 0.011 is the lowest value in the ROA data.
Maximum: 41.682 is the highest value in the ROA data.
Standard deviation: 6.424594 is the square root of the variance and estimates the spread of the
set of observations.
Variance: 58.822 is the variance, reflecting the spread of the data set.
The more widely the values are spread around the mean, the larger the variance.
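The measures listed above can be computed with Python's statistics module. The sketch below uses only the twelve ROA values listed for BAX in the data table, so its results are not the 60-observation figures reported above. The sample (n - 1) variance is used, which is an assumption about how the reported figures were produced.

```python
import statistics

# ROA (%) for BAX, 2010-2021, copied from the data table in Section III.
bax_roa = [4.512, 4.955, 4.793, 5.284, 5.038, 4.627,
           4.552, 5.149, 3.611, 9.546, 16.761, 7.283]

summary = {
    "n": len(bax_roa),
    "mean": statistics.mean(bax_roa),
    "median": statistics.median(bax_roa),
    "minimum": min(bax_roa),
    "maximum": max(bax_roa),
    "range": max(bax_roa) - min(bax_roa),      # maximum minus minimum
    "variance": statistics.variance(bax_roa),  # sample variance (n - 1)
    "std_dev": statistics.stdev(bax_roa),      # square root of the variance
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because the standard deviation is the square root of the variance, squaring the reported standard deviation is a quick consistency check on any summary table.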

Sig.: the significance value for lnst (net profit after tax) is 0.000, which is less than 0.05, so lnst is
significantly correlated with ROA. The correlation between lnst and ROA is 0.230, a positive correlation.
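The correlation coefficient reported above is a Pearson correlation, which can be sketched in plain Python as follows. The example uses four IDV rows (2010-2013) from the data table and assumes that the lnst variable corresponds to the net income column, so it illustrates the calculation only and does not reproduce the 60-observation result.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# IDV, 2010-2013: net income (VND) and ROA (%), copied from the data table.
net_income = [13_460_833_842, 14_821_563_187, 8_550_686_666, 18_121_997_150]
roa = [6.011, 5.077, 3.085, 5.698]

r = pearson_r(net_income, roa)
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one; the significance value (Sig.) tests whether r differs from zero.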

(Statistics descriptive, 2023)


Looking at the chart, we can see that ROA is mainly concentrated between 0 and 20, with the peak
between 2 and 11, where the largest share of the 60 observations falls.
2. The application of exploratory data analysis

Based on the chart above, we can see that the trend line slopes upward, so ROA moves in the same
direction as the variable on the horizontal axis. The observations are concentrated mainly in the
range of 0 to 250,000,000,000 and appear only sparsely at about 900,000,000,000 to 1,000,000,000,000.
3. Confirmatory factor analysis

ROA = 6.623 + 5.209E-11 × NI − 3.444E-12 × TA

The Sig. coefficient is 0.000, so the variables are significantly related to ROA.
Effect on the dependent variable: if NI increases by 1 unit, ROA increases by 5.209E-11 units, and
if TA increases by 1 unit, ROA decreases by 3.444E-12 units, holding the other variable constant.
The standardized beta indicates the relative importance of the two variables; NI has the highest beta,
so NI has the strongest impact on the dependent variable.
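The fitted equation can be turned into a small prediction function. The coefficients are the ones reported above; the sign of the TA coefficient is taken as negative, which is an assumption based on the interpretation that ROA falls as TA rises. The function and constant names are illustrative.

```python
# Coefficients as reported in the text; the negative sign on TA is an assumption.
INTERCEPT = 6.623
B_NI = 5.209e-11    # change in ROA per 1 VND of net income (NI)
B_TA = -3.444e-12   # change in ROA per 1 VND of total assets (TA)

def predict_roa(net_income, total_assets):
    """Predicted ROA (%) from the fitted linear equation."""
    return INTERCEPT + B_NI * net_income + B_TA * total_assets

# Example: SID in 2010, figures copied from the data table in Section III.
predicted = predict_roa(net_income=67_145_256_166, total_assets=1_228_369_861_357)
print(f"Predicted ROA: {predicted:.3f}%")
```

Because NI and TA are measured in VND, the unstandardized coefficients are tiny; an increase of 10 billion VND in net income corresponds to a predicted rise of about 0.52 percentage points in ROA.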

Conclusion

In conclusion, this research has given readers a better understanding of statistical methodologies
and how to evaluate business and economic data from publicly available sources. It has paid
particular attention to the nature and processing of business and economic data and information
gathered from published sources, and to evaluating data from a variety of sources using different
methods of analysis. At the same time, based on this analysis, businesses can identify risks
and plan for future growth.

Reference

Stedman, C. and Vaughan, J. (2022) What is data management and why is it important?, Data
Management. TechTarget. Available at:
https://www.techtarget.com/searchdatamanagement/definition/data-management (Accessed:
February 20, 2023).
Knight, M. (2023) What is data management?, DATAVERSITY. Available at:
https://www.dataversity.net/what-is-data-management/ (Accessed: February 20, 2023).
Hajric, E. (n.d) Knowledge management tools, Knowledge Information Data. Available at:
http://www.knowledge-management-tools.net/knowledge-information-data.html (Accessed:
February 20, 2023).
Seismic (2022) Data, information, and knowledge: What's the difference?, Seismic. Available at:
https://seismic.com/blog/data-information-and-knowledge-whats-the-difference/ (Accessed:
February 22, 2023).
Mohamed Saber, M. (n.d) The differences between data, information, and knowledge, and why you
never find it when it's needed!, LinkedIn. Available at:
https://www.linkedin.com/pulse/differences-between-data-information-knowledge-why-mohamed-1e (Accessed: March 20, 2023).
Villegas, F. (2022) Descriptive analysis: What it is + best research tips, QuestionPro. Available at:
https://www.questionpro.com/blog/descriptive-analysis/ (Accessed: February 22, 2023).
Descriptive Data Analysis (no date) Urban Institute. Available at:
https://www.urban.org/research/data-methods/data-analysis/quantitative-data-analysis/descriptive-data-analysis
(Accessed: February 22, 2023).
What is exploratory data analysis ? (2022) GeeksforGeeks. GeeksforGeeks. Available at:
https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/ (Accessed: February 26, 2023).
What is exploratory data analysis? (no date) IBM. Available at:
https://www.ibm.com/topics/exploratory-data-analysis (Accessed: February 26, 2023).
Confirmatory factor analysis (2023) Statistics Solutions. Available at:
https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/confirmatory-factor-analysis/
(Accessed: February 28, 2023).
What is confirmatory factor analysis? (formula and steps) (no date). Available at:
https://uk.indeed.com/career-advice/career-development/confirmatory-factor-analysis (Accessed:
February 28, 2023).
