Title Goes here

Thesis Proposal

Supervisor: Dr. Tamim Ahmed Khan

Submitted by: Qanetah Ahmed


Roll No.: 01-241191-123
1. Abstract:
The semantic web helps process data and convert it into machine-readable form, whereas
web mining lets us use machine-readable data and extract relevant information from it by
applying different web mining techniques. The combination of these two fields is called
semantic web mining. The aim of this research is to convert different forms of data into
one form, to cluster the relevant information by applying keywords or rules, and finally to
find hidden patterns in the (textual) data by selecting a trend or sentiment. We will use
data available in different formats. We will convert data in RDF, JSON or XML form into
machine-readable form by using an ontological framework or a data file converter. After
conversion, the RDF files will be in the form of RDF triples and the files in the other
formats will be converted according to their own format; we will then be able to apply
SPARQL queries to the converted data. This leaves the data in a more refined and
understandable form. Lastly, we will apply web mining techniques to this refined data to
analyze it, clustering the relevant data and discovering hidden patterns.

2. Introduction:
Motivation goes here…

Web usage mining requires large data sets. The data gathered for this technique is in the
form of logs [3-5].

3. Literature Review:
360 of your title…

semantic web techniques or ontologies with simple data mining techniques[6].

Research Gap:

Researchers have extracted and analyzed certain relations from data using data mining
techniques. We find research done in this field to find out

the relevant data present in a generic formatted form.

1.1. Problem Statement:


Researchers or sentiment.

1.2. Research Questions:


1. How can we use semantic web mining to extract hidden patterns from an ontological
standpoint?
2. How can we interpret these hidden patterns to formulate behavioral patterns?

1.3. Objectives:
The main objectives of our research are to understand the working of semantic web
ontologies and frameworks, to gather data from different sources in different formats
that are built on ontological vocabularies and frameworks, and to apply web mining
techniques to gather relevant information and find hidden patterns.

1. We can use semantic web mining to find patterns in data by using converter tools,
available online and in downloadable form, that convert formatted files such as XML and
JSON (built from ontological vocabularies and semantic frameworks) into basic CSV form.
Converters are rarely available for formats such as RDF or OWL. Converting RDF to CSV,
or CSV to RDF, requires installing libraries for a specific language such as R, Python
or Java. Relevant data can then be extracted by using specific rules and keywords.

2. Not all of the data present on the web is in pre-processed form; data is also present
in files of different formats. It is important to know what kind of data these files
carry and whether that data is relevant, helpful or related in some way. Knowing the
context is important, which is why finding hidden patterns in these types of files is
necessary. We can apply a SPARQL query to the dataset acquired from the RDF triples.
We can then use the extracted keywords or rules together with web mining techniques to
find the relevant data, by clustering the relevant information and identifying
behavioral patterns using a specific sentiment or trend.
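
The two points above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the converter tool or the mining technique this proposal will finally adopt. The sample records, the field names (id, text), the keyword rules and all function names are hypothetical and invented for illustration. The sketch brings JSON and XML records into one row format, writes them as CSV, and then groups the rows by simple keyword rules standing in for a sentiment.

```python
import csv
import io
import json
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical inputs: field names and texts are invented for illustration.
SAMPLE_JSON = '[{"id": "r1", "text": "great phone, excellent battery"}]'
SAMPLE_XML = (
    "<reviews>"
    "<review><id>r2</id><text>terrible screen and poor support</text></review>"
    "<review><id>r3</id><text>arrived on time</text></review>"
    "</reviews>"
)

# Hypothetical keyword rules: sentiment label -> trigger words.
RULES = {
    "positive": {"great", "excellent", "good"},
    "negative": {"terrible", "poor", "bad"},
}


def json_to_rows(json_text):
    """Parse a JSON array of flat objects into a list of dict rows."""
    return [dict(record) for record in json.loads(json_text)]


def xml_to_rows(xml_text, record_tag):
    """Turn each <record_tag> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]


def rows_to_csv(rows):
    """Serialize dict rows to CSV text; the header is the sorted union of keys."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def group_by_rules(rows, rules, text_field="text", id_field="id"):
    """Place each row id under every label whose keywords occur in its text."""
    groups = defaultdict(list)
    for row in rows:
        words = set(re.findall(r"[a-z]+", row.get(text_field, "").lower()))
        labels = [label for label, keywords in rules.items() if words & keywords]
        # Rows that trigger no rule are kept under a separate label.
        for label in labels or ["unlabelled"]:
            groups[label].append(row.get(id_field))
    return dict(groups)


rows = json_to_rows(SAMPLE_JSON) + xml_to_rows(SAMPLE_XML, "review")
csv_text = rows_to_csv(rows)
groups = group_by_rules(rows, RULES)
```

Keyword matching of this kind is only a placeholder for the rule or sentiment step; RDF and OWL files are not covered here and would need a dedicated library, as noted in point 1.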

1.4. Alternative Solutions:


An alternative solution is to take a file in RDF format and apply SPARQL to it, which
yields a structured, labelled dataset consisting of URI entities, to which k-means can
be applied to cluster the information [7]. This approach yields clusters of relevant
information.
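
The clustering step of this alternative can be sketched as follows. This is a plain implementation of Lloyd's k-means algorithm, not the procedure of [7]. The feature vectors are hypothetical: they assume each URI entity from the SPARQL result has already been turned into two numeric features. The centroids start at the first k points so that the run is deterministic, a simplification rather than a recommended initialisation.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with centroids initialised to the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical feature vectors, one per URI entity in the structured dataset.
FEATURES = [
    (1.0, 1.0), (8.0, 9.0), (1.5, 2.0),
    (9.0, 8.0), (1.0, 0.5), (8.5, 9.5),
]

labels, centers = kmeans(FEATURES, k=2)
```

With these six points the two groups are well separated, so the assignment settles after the first pass; real data would need feature scaling and a considered choice of k.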

1.5. Expected Solution:


4. In step one, we will select data files present in different formats (RDF, JSON or
XML). The files will be converted using a Data File Converter, which will give us RDF
triples, meaning subject, predicate and object, for the RDF files; the files in the
other formats will be converted according to their own format.
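
To make the subject, predicate and object form concrete, the sketch below reads a few lines written in a simplified N-Triples style into Python tuples. It is an illustration under stated assumptions, not the Data File Converter named above. The example.org URIs, the sample data and the function name are hypothetical, and the pattern handles only URI subjects and predicates with URI or plain-literal objects; blank nodes, language tags and datatypes are skipped.

```python
import re

# Matches: <subject> <predicate> <object-uri> .   or   <s> <p> "plain literal" .
TRIPLE_LINE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"((?:[^"\\]|\\.)*)")\s*\.\s*$'
)


def parse_simple_ntriples(text):
    """Return (subject, predicate, object) tuples for the supported lines."""
    triples = []
    for line in text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        found = TRIPLE_LINE.match(line)
        if found is None:
            continue  # unsupported syntax is ignored in this sketch
        subject, predicate, uri_object, literal_object = found.groups()
        obj = uri_object if uri_object is not None else literal_object
        triples.append((subject, predicate, obj))
    return triples


# Hypothetical sample data.
SAMPLE = '''
# two supported lines and one unsupported line
<http://example.org/review1> <http://example.org/hasText> "great phone" .
<http://example.org/review1> <http://example.org/about> <http://example.org/phoneX> .
_:b0 <http://example.org/hasText> "blank-node subjects are not handled" .
'''

triples = parse_simple_ntriples(SAMPLE)
```

A full RDF parser would be needed for real files; the point here is only that each record reduces to a three-part statement that later steps can query.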

5. Research Methodology:
In this applied research, we will discuss the problem of data being present in different
formats and how it can be converted into machine-readable form so that useful
information and patterns can be extracted, and we will implement the solution. The
following steps will be followed to find hidden relations in unstructured data:

6. Conclusion:
The method we have proposed, which uses unstructured data to form clusters of relevant
information and to find relations within this information through a sentiment or trend,
is an effective approach because it lets us use data available in different formats,
make it machine-readable and, by applying mining techniques, gain valuable information
and results.

References:
1. Singh, S. and M.S. Aswal, Semantic Web Mining: Survey and Analysis. Journal of Web
Engineering & Technology, 2019. 5(3): p. 20-31.
2. Kabir, S., et al., Knowledge-based data mining using semantic web. IERI Procedia, 2014. 7: p.
113-119.
3. Jokar, N., et al., Web mining and Web usage mining techniques. Bulletin de la Société des
Sciences de Liège, 2016. 85(1): p. 321-328.
4. Available: https://www.geeksforgeeks.org/web-mining/
5. Berendt, B., G. Stumme, and A. Hotho, Usage mining for and on the semantic web. Data
Mining: Next Generation Challenges and Future Directions, 2004: p. 461-480.
6. Available: https://data-mining.philippe-fournier-viger.com/lessons-from-the-past-the-semantic-web-ontologies-and-why-it-failed/
7. Mohammed, W.M.S. and M.M. Saraee, Mining Semantic Web Data Using K-means
Clustering Algorithm. Journal of Advances in Mathematics and Computer Science, 2016: p.
1-14.
8. Mughal, M.J.H., Data Mining: Web Data Mining Techniques, Tools and Algorithms: An
Overview. Information Retrieval, 2018. 9(6).
9. Ristoski, P. and H. Paulheim, Semantic Web in data mining and knowledge discovery: A
comprehensive survey. Journal of Web Semantics, 2016. 36: p. 1-22.
10. Gujrani, S. and A. Phakatkar, A survey on: Keyword search and similarity using RDF schema.
2016.
11. Manuja, M. and D. Garg, Semantic web mining of un-structured data: challenges and
opportunities. International Journal of Engineering (IJE), 2011. 5(3): p. 268.
12. Chomboon, K., et al., Data mining in semantic web data. International Journal of Computer
Theory and Engineering, 2014. 6(6): p. 472.
13. Sint, R., et al. Combining unstructured, fully structured and semi-structured information in
semantic wikis. in CEUR Workshop Proceedings. 2009.
14. Berendt, B., et al. A roadmap for web mining: from web to semantic web. in European Web
Mining Forum. 2003. Springer.
15. Feldman, D., Using subject-predicate-object triplets for opinion mining.

We need something like this in steps format where we argue that this setting is not done before, or we
have improved results with better steps taken and considered
