Logbook
Evidence Record
1. INTRODUCTION: This logbook helps interns keep a record of their daily work. It shows the work done and the time spent on each task.
4. MONTHLY REPORT: This is a summary of the work done in a week and a report on the work covered. The employee is required to present the logbook to the supervisor weekly for a progress review.
Day of the week / Planning / Actual work done (logged below; also record below any deviation from the plan and reflect on the work done)
Monday 25 Mar
Planning:
Review and analyze the referral website www.xplorio.com.
Understanding the flow of the website.
Getting a full understanding of the website features.
Document initial findings.
Understanding the uniqueness.
Actual work done:
Reviewed and analyzed the referral website.
Understood the flow of the website.
Gained a full understanding of the website features.
Understood the uniqueness of the project.
Tuesday 26 Mar
Planning:
Overview of Big Data: definition, characteristics, and importance.
Understanding the 3Vs of Big Data: Volume, Velocity, and Variety.
Examples of real-world applications of Big Data in various industries.
Introduction to popular Big Data frameworks and technologies such as Hadoop, Spark, Kafka, and Flink.
Comparison of different Big Data processing paradigms: batch processing vs. real-time processing.
Actual work done: As planned; the logged entries are identical to the Planning column.
Wednesday 27 Mar
Planning:
Overview of data capture methods: batch processing, real-time streaming, and micro-batching.
Detailed discussion on streaming data capture using tools like Apache Kafka and Apache Flume.
Importance of data quality and strategies for data cleansing and preprocessing.
Hands-on session with Apache Kafka for setting up data streaming pipelines.
Actual work done: As planned; the logged entries are identical to the Planning column.
Thursday 28 Mar
Planning:
Introduction to relational databases and NoSQL databases.
Understanding the CAP theorem and its implications on distributed databases.
Discussion on key-value stores, document stores, column-family stores, and graph databases.
Actual work done: As planned; the logged entries are identical to the Planning column.
Friday 29 Mar
Planning:
Introduction to distributed file systems: HDFS (Hadoop Distributed File System) and Amazon S3.
Overview of distributed computing frameworks: Apache Hadoop and Apache Spark.
Use cases and best practices for choosing between different storage and processing frameworks.
Actual work done: As planned; the logged entries are identical to the Planning column.
Monday 25 Mar (5 Hours)
Review and analyze the referral website www.xplorio.com.
Understanding the flow of the website.
Getting a full understanding of the website features.
Document initial findings.
Understanding the uniqueness.
Tuesday 26 Mar (5 Hours)
Overview of Big Data: definition, characteristics, and importance.
Understanding the 3Vs of Big Data: Volume, Velocity, and Variety.
Examples of real-world applications of Big Data in various industries.
Introduction to popular Big Data frameworks and technologies such as Hadoop, Spark, Kafka, and Flink.
Comparison of different Big Data processing paradigms: batch processing vs. real-time processing.
Wednesday 27 Mar (5 Hours)
Overview of data capture methods: batch processing, real-time streaming, and micro-batching.
Detailed discussion on streaming data capture using tools like Apache Kafka and Apache Flume.
Importance of data quality and strategies for data cleansing and preprocessing.
Hands-on session with Apache Kafka for setting up data streaming pipelines.
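Note (illustrative sketch, not part of the logged session): the three capture methods named in the Wednesday entry can be contrasted in a few lines of plain Python. The record names, the batch size, and the helper `micro_batches` are hypothetical and chosen only for illustration. No Kafka or Flume API is used here.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a stream of records into small batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(stream)
    while True:
        # Pull up to batch_size records; the final batch may be shorter.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical records standing in for captured events.
events = [f"event-{i}" for i in range(7)]

batch_run = [events]                    # batch processing: everything in one run
streamed = [[e] for e in events]        # real-time streaming: one record at a time
micro = list(micro_batches(events, 3))  # micro-batching: small groups of records
```

With seven records and a batch size of three, micro-batching sits between the two extremes: it handles fewer units of work than record-by-record streaming, while each unit stays much smaller than a full batch run.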
Distributed File System) and Amazon S3.
Overview of distributed computing frameworks: Apache Hadoop
and Apache Spark.
Use cases and best practices for choosing between different
storage and processing frameworks.
Comments: Learned about the importance of carefully evaluating use case requirements when choosing between
storage and processing frameworks, considering factors like scalability, latency, fault tolerance, and integration.
Additionally, understanding the strengths and trade-offs of each framework enables informed decision-making for
building efficient and effective data processing pipelines.
Day of the week / Planning / Actual work done (logged above; also record below any deviation from the plan)
Monday 1 April
Planning:
Importance of data modeling in database design.
Understanding different data modeling techniques: Entity-Relationship (ER) diagrams, UML diagrams, etc.
Hands-on exercise on designing database schemas for specific use cases.
Actual work done: As planned; the logged entries are identical to the Planning column.
processes in Big Data projects.
Friday 5 April
Planning:
Deep dive into the domain relevant to the data capture and management system.
Analysis of industry-specific challenges and opportunities related to data handling.
Case studies of successful implementations in similar domains.
Actual work done: As planned; the logged entries are identical to the Planning column.
Wednesday 03/04/2024 (4 hours)
Importance of stakeholder interviews and requirements gathering in system design.
Techniques for eliciting and documenting user requirements: interviews, surveys, and use cases.
Case study analysis of successful requirement analysis processes in Big Data projects.
Day of the week / Planning / Actual work done (logged above; also record below any deviation from the plan)
Monday 8 April
Planning:
Conduct stakeholder interviews to gather additional requirements and insights.
Refinement of previously defined use cases based on stakeholder feedback.
Prioritization of features and functionalities based on business value and feasibility.
Actual work done:
In practical research efforts, stakeholder interviews were conducted to gather additional requirements, refine existing use cases based on feedback, and prioritize features and functionalities according to business value and feasibility.
Tuesday 9 April
Planning:
Validation of documented requirements with stakeholders to ensure alignment.
Addressing any discrepancies or ambiguities in the requirement documentation.
Formal sign-off on the requirement document to proceed to the design phase.
Actual work done:
The following tasks were accomplished: validating documented requirements with stakeholders to ensure alignment, resolving any discrepancies or ambiguities in the requirement documentation, and obtaining formal sign-off on the requirement document to proceed to the design phase.
Friday 12 April
Planning:
Breakdown of high-level components into detailed design specifications.
Designing data ingestion pipelines, storage layers, processing modules, and interfaces.
Consideration of fault tolerance, data consistency, and performance optimizations.
Actual work done:
Breaking down high-level components into detailed design specifications; designing data ingestion pipelines.
Update the logbook.
design phase.
Wednesday 10/04/2024 (4 Hours)
The following tasks were completed: introducing architectural design patterns and principles; analyzing scalability, reliability, and security considerations; and identifying architectural trade-offs to make informed design decisions.
Revised project proposal.
Thursday 11/04/2024 (4 Hours)
Do research in practical execution. High-level architectural diagrams depicting system components and interactions were created, followed by discussions on component interfaces.
Friday (4 Hours)
Breaking down high-level components into detailed design specifications; designing data ingestion pipelines.
Comments: All these steps represent the crucial transition from conceptualization to practical implementation, ensuring a robust and efficient system design that meets both technical requirements and stakeholder expectations.
Day of the week / Planning / Actual work done (logged above; also record below any deviation from the plan)
Monday 15 April
Planning:
Evaluation of technology options for each component based on requirements and design constraints.
Prototyping critical components to validate technology choices and design assumptions.
Finalization of the technology stack for the data capture and management system.
Actual work done:
Took a call with the supervisor.
Revised the project plan.
DAY / DESCRIPTION OF WORK DONE / HOURS
16/04/2024
Wednesday 17/04/2024 (4 hours)
Work on documentation.
Make table of contents.
Thursday 18/04/2024 (4 hours)
Work on documentation.
Make Gantt chart.
Friday 19/04/2024 (4 hours)
Work on documentation.
Proofread the documentation.
Update logbook.
Total Hours Covered: 20 hours
Day of the week / Planning / Actual work done (logged above; also record below any deviation from the plan)
Monday 22 April
Planning:
Evaluation of technology options for each component based on requirements and design constraints.
Prototyping critical components to validate technology choices and design assumptions.
Finalization of the technology stack for the data capture and management system.
Monday 22/04/2024
Tuesday 23/04/2024
Wednesday 24/04/2024
Thursday 25/04/2024
Friday 26/04/2024
Total Hours Covered:
Comments
EMPLOYEE’S SUMMARY REPORT: The report should contain a summary of work done this
month. This concludes the highlights of the project the EMPLOYEE was involved in. The
EMPLOYEE is expected to point out the weak and strong points of this document.
……………………………………………………………………………………………………………….
………………………… …………………………
Date: Date: