

CMPE 226 (44709) Fall 2014 (Wednesday), version 1

CMPE 226 Databases


John Gash


Office Hours:

Immediately following class or by appointment.

Monday and Wednesday, Engr, R/337

Class Days/Time:

Wednesday: Lecture 6:00-8:45 pm, includes lab and discussion


Engr, R/337

Instructure is our primary channel for distributing information and for submitting projects.

Java development, general relational database (JDBC) proficiency**


OOA/D, UML and software engineering concepts/experience

*Email subject must start with 226 - subject
**Java is used extensively to demonstrate concepts and in the preparation of class projects. Java is not
the only language used in lecture or assignments. You are encouraged to broaden your language skills.

Faculty Web Page and MYSJSU Messaging

You are responsible for regularly checking the following: 1) the messaging system through MySJSU; 2) the class course content for 226.

Course Catalog Description

Entity Relationship and relational model, functional dependencies and decompositions, SQL, failure recovery,
concurrency control, transaction management, distributed storage, hybrid models, databases, data mining,
database systems and the Internet.

Course Format and Notes

CMPE 226 explores storage architectures, practices, and technologies in support of a wide range of storage
challenges in enterprise systems, including massively scaled systems and unstructured data. This course
focuses on opportunities to research and acquire experience with emerging concepts in relational and
non-relational storage theory and technologies. As a result, the schedule is dynamic and strongly
influenced by our discussion; we may spend more time on one subject or branch into new areas as

We use a hybrid classroom format - a combination of a flipped classroom and discussion, ~50% of each.

Course Goals and Learning Outcomes

The goals and learning objectives of CMPE 226 are to provide individuals with an understanding of and
experience with:

OOA/D approaches for relational database design, development, and testing

Research in distributed, replicated, storage methodologies, practices and systems

Opportunities to investigate alternative storage architectures

Hands-on experience with current and emerging technologies

Gash Draft, not for distribution


Required Texts/Readings
No specific textbook is required for this course. You are encouraged to use all resources normally
available in a working environment (e.g., books, Internet, papers, discussions, COTS or F/OSS software
packages, sample code). You must cite your references in all your work.

Supporting Readings

UML Distilled, Third Edition, by Fowler, 2003

Applying UML and Patterns, 3rd Edition, by Larman, 2004

Additional references and papers provided during lecture

Additional references included at the end of each lecture

Classroom Protocol

1. Try to arrive on time. Arriving early can give you time to talk to your team members regarding the projects,
or hold general discussions on technologies, design, or coordination.
2. Cell phones must be muted during the lecture. If you need to take a call, please quietly use the hallway.
3. You are welcome to talk with me after my other classes. Please note that I try to give students in the current
lecture priority. I will do my best to stay as late as needed to accommodate as many people as possible. Your
patience is appreciated as we have a large number of students this semester.

Dropping and Adding

Students are responsible for understanding the policies and procedures about add/drop, grade forgiveness, etc.
Refer to the current semester's Catalog Policies section at
Add/drop deadlines can be found on the current academic calendar web page located at The Late Drop Policy is available at Students should be aware of the current deadlines and
penalties for dropping classes.

Information about the latest changes and news is available at the Advising Hub at

Assignments and Grading Policy

SJSU classes are designed such that in order to be successful, it is expected that students will spend a minimum
of forty-five hours for each unit of credit (normally three hours per unit per week), including preparing for class,
participating in course activities, completing assignments, and so on. More details about student workload can
be found at

NOTE that University policy F69-24 states: "Students should attend all meetings of their classes, not only because they
are responsible for material discussed therein, but because active participation is frequently essential to insure
maximum benefit for all members of the class. Attendance per se shall not be used as a criterion for grading."

Grading (A curve is

(65 pts with distribution)
100 - 90 A
89 - 80 B
79 - 70 C
69 - 60 D
59 - 0 F

Project 1
Project 2
Deep-dive w/ paper
Final Examination



Grading is based on points accumulated through individual work (final and paper) and group-based projects.
Group grades are assigned individually based on project participation. You may observe large fluctuations in
your percentage grade at the beginning of the semester. Should you experience these large variations, please
keep in mind that as you accumulate points the variations will be dampened. Instantaneous grade calculations
are only an indicator of where you are at a moment in time, and help you understand the potential within the
course. It is your responsibility to set your goals.

Regarding group projects, each group is responsible for ensuring equal contribution to a project. Your level of
contribution to project deliverables is a long-term investment. While gaming the system is always possible, you
would be doing yourself a disservice.

University Policies:

Dropping and Adding
Students are responsible for understanding the policies and procedures about add/drop, grade forgiveness, etc. Refer to the
current semester's Catalog Policies section at Add/drop deadlines can be
found on the current academic year calendars document on the Academic Calendars webpage at The Late Drop Policy is available at Students should be aware of the current deadlines and penalties for
dropping classes.

Information about the latest changes and news is available at the Advising Hub at
Consent for Recording of Class and Public Sharing of Instructor Material
University Policy S12-7 requires students to obtain the instructor's permission to
record the course.

Common courtesy and professional behavior dictate that you notify someone when you are recording him/her. You
must obtain the instructor's permission to make audio or video recordings in this class. Such permission allows the
recordings to be used for your private, study purposes only. The recordings are the intellectual property of the
instructor; you have not been given any rights to reproduce or distribute the material.
It is suggested that the greensheet include the instructor's process for granting permission, whether in
writing or orally and whether for the whole semester or on a class-by-class basis.
In classes where active participation of students or guests may be on the recording, permission of those
students or guests should be obtained as well.

Course material developed by the instructor is the intellectual property of the instructor and cannot be shared
publicly without his/her approval. You may not publicly share or upload instructor generated material for this
course such as exam questions, lecture notes, or homework solutions without instructor consent.

Academic integrity
Your commitment as a student to learning is evidenced by your enrollment at San Jose State University. The University's
Academic Integrity Policy requires you to be honest in all your academic course
work. Faculty members are required to report all infractions to the office of Student Conduct and Ethical Development. The
Student Conduct and Ethical Development website is available at
Campus Policy in Compliance with the Americans with Disabilities Act
If you need course adaptations or accommodations because of a disability, or if you need to make special arrangements in case
the building must be evacuated, please make an appointment with me as soon as possible, or see me during office hours.
Presidential Directive 97-03 requires that students with
disabilities requesting accommodations must register with the Accessible Education Center (AEC)
to establish a record of their disability.


Storage perspective
Project 2 (No-SQL DBs strategies)

Project 1 (RDBMS scaling)

Tentative Schedule (subject to change)
Dates & Assignments Topics and Objectives (est. number of lectures)
Aug 27
Introductions, ingredients, and other administrative stuff (.4)

1. Schedule and projects

2. Development environment and tools
3. Using architecture patterns and supporting your decisions

Provisioning your computer
1. Languages, tools, and such
Sep 3
Relational database frameworks and strategies
Project 1 handout

Concepts (.5)
Sep 10
1. Reviewing the Normal Forms - NF

2. Patterns
Sep 17

Object-Relational Mapping frameworks (2.5)
Sep 24
1. Java-based (Hibernate, JPA)

2. Non-Java based (ActiveRecord, TBD)

Scaling and Failover (1)
1. Shared nothing - Sharding
2. Latency in replicated data
3. In-memory databases
Oct 1
Alternate storage models and tools

Oct 8
Tools (1)

1. Anaconda (IPython)
Oct 15

Project 1 due
Storage (2)
Project 2 handout
1. In-memory databases

2. File-based (HDF5/NetCDF, CSV/XML/JSON, ...)

3. Hive's data model
Oct 22
No-SQL Solutions
Project 1 roundtable

Concepts: (.5)
Oct 29
1. Consistency, Availability, Partition tolerance - CAP theorem

2. Social data analysis
Nov 5

Indexing frameworks (1)
Nov 12

Distributed Hash Tables - DHTs (1)
Nov 19
1. Highly available storage solutions
Project 2 due
a. Riak

b. Cassandra
Nov 26

Thanksgiving Holiday
Document-based frameworks (1.5)

1. MongoDB
2. Cross platform/language support

Mixing storage solutions - designing for diverse storage models (1)

Graph databases (.5) - TBD
Dec 3
Review and presentations continued
Deep dive


Dec 10
Dec 17
Final Exam (All material and deep dives)


Note: topics span multiple lectures and are subject to revisions and changes due to factors such as time
constraints, travel, or extended discussion (except for the exam date, which is fixed).


Additional information

The following information is provided to help improve your overall experience with CMPE 226.

This course includes interactive discussions and hands-on software development and research relating to
computing methodologies and technologies for storage systems. Consequently, a considerable level of effort and
time will be required for research and software development (two software projects). Details follow.

Class/Lecture. Class discussion is an interactive exploration of concepts and ideas focusing on real world
situations (businesses, social, and research), which includes participation in critical problem solving, articulating
concepts, defending positions, and presenting ideas within a group environment. You are required to prepare
for each meeting by researching and investigating topics; this may include literature searching, prototyping, and
Internet investigation. We look at and discuss situations that you may experience, including but not
limited to:


Design concepts of ORM frameworks

Legacy Integration

Distributed data storage (what is big


Research and Exploration

NoSQL (alternatives to RDBMS

Highly Scalable, Massively

scalable I/O design

Mixed storage designs

Fault Tolerant

Planning for availability and failure

Open Source trends

Large scale design

Distributed data repositories

Deep-dive Papers

The deep-dive assignment gives each individual an opportunity to explore, prototype, and expand his/her
horizons in an area of choice. This can include concepts related to the technologies or ideas discussed during
class. Research is also a chance to explore ideas relating to (or supporting) a Master's project topic.

1. Papers are associated with or complementary to the topics under study for each project
2. Choose one technology or concept for your investigation. This is a study in depth, not breadth (not a survey)
3. Papers with supporting code are due following the policies of a project and should be submitted


Projects are a key component of the class. Projects give you a challenging, real problem to which you apply your
engineering skills and the concepts of the course. In order to maximize interactions, a team and individual approach
is used to foster perspectives and collaboration. Each project is composed of a team and an individual effort. They
are (Note: Additional details will be provided in class):

1. Team design, problem solving, project implementation and report.
2. Individual assigned work, participation in design, development, and supporting other team members.

Teams. Please create teams as soon as possible, as projects will begin/be due as early as the second or third
lecture. Teams are self-forming, with each team composed of three or four persons; larger or smaller teams will
not be accepted. In addition to interactions within a team, select projects will require cross team interaction to
solve distributed challenges.

Topics. Projects are organized around one or more key concepts or scenarios (see above). Guidelines, objectives,
and expectations will be provided prior to the start of each assignment (projects will be structured to allow
completion in 2-3 weeks).

Deliverables. Each project (unless noted) will include two deliverables: a report and a team project (source
code, test cases, and supporting data). Projects are submitted in electronic format within a 48-hour window of
the due date.


In order to facilitate grading and prompt feedback, projects should be submitted one per group and in the
following format:

Project directory must include the group ID/Name (e.g., project1-caffeine)

Within the project directory include the report, source code, and test data. If a large quantity of data is
to be provided, only include data for testing. Do not include libraries (jars) used to build your project. The
report should include a list of dependencies and how to retrieve/install/configure.

List all group members on the report cover

For example:

Team crunch is submitting their work for project 1. They have provided the following files and source directory:

1. Create a directory for your project submission:


Installation notes

crunch-project-report.doc (.pdf)

project-1/ (do not send class or jar files)

2. Archive directory - (.tar.gz, .zip, .rar):

zip -r project1-crunch.zip project1-crunch
tar cfz project1-crunch.tar.gz project1-crunch

3. Upload to - DO NOT email me your assignment, I will lose it!
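The packaging steps above can be sketched as a small shell script. This is only an illustration under stated assumptions, not a required procedure: the team name, the stand-in project tree, and the file name INSTALL.txt are hypothetical, and the script works in a scratch directory so that it can be run on its own.

```shell
#!/bin/sh
# Sketch of the submission packaging steps; all names below are illustrative.
set -eu

team="crunch"                     # hypothetical team name, taken from the example
submission="project1-${team}"     # the directory name carries the group ID/name
workdir="$(mktemp -d)"            # scratch area; delete it yourself when done
cd "$workdir"

# Stand-in for a real project tree so the sketch is self-contained.
mkdir -p src/project-1/lib
echo 'public class App {}' > src/project-1/App.java
: > src/project-1/App.class       # build output that must not be submitted
: > src/project-1/lib/dep.jar     # library that must not be submitted
echo 'report placeholder' > src/crunch-project-report.pdf
echo 'install notes placeholder' > src/INSTALL.txt   # hypothetical file name

# 1. Create the submission directory holding the report, notes, and source.
mkdir "$submission"
cp -R src/. "$submission"/

# Remove class and jar files; dependencies are listed in the report instead.
find "$submission" -type f \( -name '*.class' -o -name '*.jar' \) -exec rm -f {} +

# 2. Archive the directory.
tar czf "${submission}.tar.gz" "$submission"

# List the archive contents as a final check before uploading.
tar tzf "${submission}.tar.gz"
```

Listing the archive before uploading is a cheap way to confirm that the report is present and that no class or jar files slipped in.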

Assignments and Grading Policy

Grading is based on both group (projects) and individual contributions (papers, final) see above for point

Your grade (letter) is determined using a curve and varies from class to class. There is no fixed percentage of
assigned As, Bs, etc.; grades are awarded based on where one falls within the letter tiers. This means that if everyone
in the class scores 85 or better (curve applied), everyone gets an A. The converse holds true as well. More
information will be provided during the first lecture.

1. Nonparticipation in group projects will affect the offending individual's grade determined by
2. Your grade is your responsibility, not your group's.
3. Late assignments are assessed a late fee unless otherwise noted

Academic integrity
Instances of academic dishonesty will not be tolerated. Cheating on exams or plagiarism (presenting the work of
another as your own, or the use of another person's ideas without giving proper credit) will result in a failing
grade and sanctions by the University. For this class, all assignments are to be completed by the individual
student unless otherwise specified. If you would like to include in your assignment any material you have
submitted, or plan to submit for another class, please note that SJSU's Academic Policy F06-1 requires approval
of instructors.


Work/Life Balance. Life is not predictable and occasionally we need to rebalance family, class, and work
obligations. There are times when no amount of planning allows one to satisfy all requirements and conflicts
arise; if you find yourself in such a situation, please talk with me to see if there are options or adjustments that
will allow you to be successful.

Philosophy. As mentioned previously, this class is built upon interactive research and discussion. This format
requires individuals to perform investigations and research prior to each discussion topic. Sessions or lectures
are for the discussion and examination of the topic at hand. Everyone is required to fully participate in all

Projects. Real-world projects are not completed in isolation (bring me a rock); they are interactive, exploratory,
evolving, and advisory in nature. Our class projects mimic these traits.

In support of project assignments, lectures and lab include time to foster and support discussions on project
approaches, strategies, and implementation. Please plan your time accordingly; projects are software
intensive and require a significant investment of time and research. They are also rewarding, as they
provide a unique opportunity to practice and validate one's research and studies of the domain and