Active database

1. What is this database application? An active database is a database that includes an event-driven architecture which can respond to conditions both inside and outside the database. Possible uses include security monitoring, alerting, statistics gathering and authorization. Most modern relational databases include active database features in the form of database triggers.

2. What is the architectural design of this database? Active databases extend conventional database functionality with event-based rule processing. The behavior of an active database is defined by a set of ECA rules (Event-Condition-Action rules) associated with the database. Once a certain event is detected, the relevant rules are triggered. Triggering a rule involves evaluating a condition against the database and, if it holds, executing the corresponding action. An active database gains its power from the range of events it can respond to and the types of actions it can perform in response. Active databases support the creation of triggers which fire when certain operations occur on the database. Most industrial relational databases today include active database capabilities (e.g. Microsoft SQL Server, Oracle, Postgres, Sybase and Teradata), and all of these support SQL triggers.

3. How does it run?

Active database systems support mechanisms that enable them to respond automatically to events that are taking place either inside or outside the database system itself. Considerable effort has been directed towards improving understanding of such systems in recent years, and many different proposals have been made and applications suggested. This high level of activity has not yielded a single agreed-upon standard approach to the integration of active functionality with conventional database systems, but has led to improved understanding of active behavior description languages, execution models, and architectures. This survey presents the fundamental characteristics of active database systems, describes a collection of representative systems within a common framework, considers the consequences for implementations of certain design decisions, and discusses tools for developing active applications.

4. Who are the people involved in this database application? What are their functions? There is no doubt that computers have changed people's lives. Nowadays almost every kind of task is performed on a computer of one kind or another, and their popularity keeps increasing, as does the development of new technology. There are many things about technology that can make one's life easier, and one of them is the database. Databases have been designed to help people keep their data organized and to let them select only the data they need without losing any information. There are different types of database software, but here one can learn more about active databases.

5. When it can be implemented? An active database system is a database system that monitors situations of interest and, when they occur, triggers an appropriate response in a timely manner. The desired behavior is expressed in production rules (also called event-condition-action rules), which are defined and stored in the database. This has the benefit that the rules can be shared by many application programs, and the database system can optimize their implementation. The production rule paradigm originated in the field of Artificial Intelligence (AI) with expert systems rule languages such as OPS5.

6. Why it is needed? Active database systems aim at the representation of more real-world semantics in the database by supporting event-condition-action rules (ECA rules). ECA rules can be interpreted as "when the specified event occurs and the condition holds, execute the action." An event indicates the point in time when some sort of reaction is required from the DBMS. For primitive events, this point in time can be specified by an occurrence in the database, by an occurrence in the DBMS, or by an occurrence in the database environment. For composite events, the point in time is defined on the basis of other points in time which represent other primitive and/or composite events (called component events). These components are combined by means of event constructors such as negation, conjunction, disjunction, sequence, etc. The action describes what is to be done when a specific event happens and some condition holds. The potential uses of reactive behavior are significant: active rules support data derivations, integrity maintenance, workflow management, replication management, and more.

7. What are the examples of this database application?

- SQL'99
- CS561
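The event-condition-action pattern described in this section can be tried out with an ordinary SQL trigger. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the table names (accounts, alerts) and the trigger name are hypothetical and chosen only for illustration.

```python
import sqlite3


def run_eca_demo():
    """Create one ECA rule as a SQL trigger and return the alerts it produced."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL NOT NULL);
        CREATE TABLE alerts (account_id INTEGER, old_balance REAL, new_balance REAL);

        -- Event:     an UPDATE of accounts.balance
        -- Condition: the new balance is negative
        -- Action:    record a row in the alerts table
        CREATE TRIGGER low_balance_alert
        AFTER UPDATE OF balance ON accounts
        WHEN NEW.balance < 0
        BEGIN
            INSERT INTO alerts VALUES (NEW.id, OLD.balance, NEW.balance);
        END;

        INSERT INTO accounts VALUES (1, 100.0), (2, 50.0);
        """
    )
    # Event occurs, condition does not hold (balance stays positive): no action.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
    # Event occurs and condition holds (balance goes negative): action runs.
    conn.execute("UPDATE accounts SET balance = balance - 80 WHERE id = 2")
    alerts = conn.execute(
        "SELECT account_id, old_balance, new_balance FROM alerts"
    ).fetchall()
    conn.close()
    return alerts


if __name__ == "__main__":
    print(run_eca_demo())
```

Only the second update satisfies the condition, so only one alert row is written. The application code never calls the rule itself; the database reacts to the event, which is the point of storing rules in the database rather than in each program.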

Cloud database

1. What is this database application? A cloud database is a database that relies on cloud technology. Both the database and most of its DBMS reside remotely, "in the cloud," while its applications are developed by programmers and later maintained and used by the application's end-users through a Web browser and open APIs. More and more such database products are emerging, both from new vendors and from virtually all established database vendors.

2. What is the architectural design of this database?

Most database services offer web-based consoles, which the end user can use to provision and configure database instances. For example, the Amazon Web Services web console enables users to launch database instances, create snapshots (similar to backups) of databases, and monitor database statistics.

Database services consist of a database manager component, which controls the underlying database instances using a service API. The service API is exposed to the end user, and permits users to perform maintenance and scaling operations on their database instances. For example, the Amazon Relational Database Service's service API enables creating a database instance, modifying the resources available to a database instance, deleting a database instance, creating a snapshot (similar to a backup) of a database, and restoring a database from a snapshot. Database services take care of scalability and high availability of the database. Scalability features differ between vendors: some offer auto-scaling, others enable the user to scale up using an API, but do not scale automatically. There is typically a commitment for a certain level of high availability (e.g. 99.9% or 99.99%).

3. How does it run? There are two primary methods to run a database on the cloud:

Virtual Machine Image: cloud platforms allow users to purchase virtual machine instances for a limited time. It is possible to run a database on these virtual machines. Users can either upload their own machine image with a database installed on it, or use ready-made machine images that already include an optimized installation of a database. For example, Oracle provides a ready-made machine image with an installation of Oracle Database 11g Enterprise Edition on Amazon EC2.

Database as a Service: some cloud platforms offer options for using a database as a service, without physically launching a virtual machine instance for the database. In this configuration, application owners do not have to install and maintain the database on their own. Instead, the database service provider takes responsibility for installing and maintaining the database, and application owners pay according to their usage. For example, Amazon Web Services provides two database services as part of its cloud offering: SimpleDB, which is a NoSQL key-value store, and Amazon Relational Database Service, which is an SQL-based database service with a MySQL interface.

A third option is managed database hosting on the cloud, where the database is not offered as a service, but the cloud provider hosts the database and manages it on the application owner's behalf. For example, cloud provider Rackspace offers managed hosting for MySQL databases.

4. Who are the people involved in this database application? What are their functions? To many of us today, the cloud seems like a bit of magic. We often simply use the services of a cloud-based system without really thinking about where the cloud is located, or who keeps it running. Ultimately, behind every "cloud" there are real people managing real machines. What is marketed as a cloud is really a rack of machines, with a very real person who has to keep them running. To that person, the administrator, the cloud isn't "in the cloud"; it's in his own data center! The administrator must put together a set of machines, software and administrative tools that enable everything to be viewed in a completely hands-off way by the users, so that they think of it as a "cloud". The challenge vendors face when trying to market and sell something for the cloud is that the definition of the cloud is so broad and varied. What passes as the cloud to one person is simply a set of machines to another person. Our new SQL Anywhere OnDemand Edition (code named "Fuji"), currently in beta test, is one such product designed to help the administrator of those machines to create a data cloud. While the administrator certainly will know what machines are in use, where the database servers are running and where the databases are located, the end user will be able to simply view their database as being "in the cloud". The administrator of the cloud system will use one of the primary components of Fuji: the administrative console. The console is designed to enable an administrator to easily keep track of the various host machines that are part of the system, the SQL Anywhere servers that are running on each host, and the databases being served by each database server. The console also provides access to all the various tasks that an administrator might want to execute on their running cloud, including starting and stopping database servers, adding a new database into the cloud, setting up high availability for a database, and backup/restore operations. The console is completely web-driven, so it can be accessed using a standard Flash-enabled web browser.

5. Where it can be implemented? SQL databases, such as Oracle Database, Microsoft SQL Server and MySQL, are one type of database which can be run on the cloud (either as a Virtual Machine Image or as a service, depending on the vendor). SQL databases are difficult to scale, meaning they are not natively suited to a cloud environment, although cloud database services based on SQL are attempting to address this challenge. NoSQL databases, such as Apache Cassandra, CouchDB and MongoDB, are another type of database which can run on the cloud. NoSQL databases are built to service heavy read/write loads and are able to scale up and down easily, and therefore they are more natively suited to running on the cloud. However, most contemporary applications are built around an SQL data model, so working with NoSQL databases often requires a complete rewrite of application code.

6. Why it is needed?
It is important to have a database as an organic part of the cloud for one key reason: to avoid the dedicated and complex maintenance required to babysit the odd child in the cloud infrastructure. Any non-cloud service becomes an exception which requires special maintenance, skill sets, procedures, etc. This is definitely true long-term, but even in their day-to-day operations today, DBAs and system administrators can attest to how much time and energy (and admin cost) goes into monitoring their database and ensuring it plays well and is properly integrated with the other components of the cloud. In addition, non-native cloud services will not enjoy the benefits achieved by natural tenants of the cloud. These benefits include, for example, automation, resource optimization, dynamic networking, and more. Databases should be regarded as an integral part of the cloud so that the IT infrastructure can really be a commodity, which can be bought, traded, re-allocated and moved around as needed. On the other side of the equation, to live up to the theory, the cloud database technology itself must deliver a convincing paradigm and proof of its ability to keep the data safe, secured and always available, at least on the same level as enterprise databases today.
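The availability commitments mentioned earlier in this section (e.g. 99.9% or 99.99%) are easier to compare when converted into a downtime budget. The following is a small worked calculation, assuming a 365-day year; the function name is hypothetical and the figures are arithmetic only, not any vendor's actual terms.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Return the downtime, in minutes, that an availability level permits."""
    total_minutes = period_days * 24 * 60
    unavailable_fraction = 1 - availability_percent / 100
    return total_minutes * unavailable_fraction


if __name__ == "__main__":
    for level in (99.9, 99.99):
        minutes = allowed_downtime_minutes(level)
        print(f"{level}% availability allows about {minutes:.1f} minutes "
              f"({minutes / 60:.2f} hours) of downtime per year")
```

Under these assumptions, 99.9% works out to roughly 8.76 hours of downtime per year, while 99.99% is under an hour, so one extra "nine" shrinks the budget by a factor of ten.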

7. What are the examples of this database application?
- Oracle Database
- IBM DB2
- Ingres (database)
- PostgreSQL
- MySQL

Data warehouse

1. What is this database application? Data warehouses archive data from operational databases and often from external sources such as market research firms. Often operational data undergoes transformation on its way into the warehouse, getting summarized, anonymized, reclassified, etc. The warehouse becomes the central source of data for use by managers and other end-users who may not have access to operational data. For example, sales data might be aggregated to weekly totals and converted from internal product codes to use UPCs so that it can be compared with ACNielsen data. Some basic and essential components of data warehousing include retrieving, analyzing, and mining data, and transforming, loading and managing data so as to make it available for further use. Operations in a data warehouse are typically concerned with bulk data manipulation, and as such, it is unusual and inefficient to target individual rows for update, insert or delete. Bulk native loaders for input data and bulk SQL passes for aggregation are the norm.

2. What is the architectural design of this database?

Data Warehouse Configurations

A Data Warehouse configuration, also known as the logical architecture, includes the following components:
- one Enterprise Data Store (EDS): a central repository which supplies atomic (detail level) integrated information to the whole organization
- (optional) one Operational Data Store: a "snapshot" of a moment in time's enterprise-wide data
- (optional) one or more individual Data Mart(s): a summarized subset of the enterprise's data specific to a functional area or department, geographical region, or time period
- one or more Metadata Store(s) or Repository(ies): catalog(s) of reference information about the primary data. Metadata is divided into two categories: information for technical use, and information for business end-users.

Bottom-up design

Ralph Kimball, a well-known author on data warehousing, is a proponent of an approach to data warehouse design which he describes as bottom-up. In the bottom-up approach data marts are first created to provide reporting and analytical capabilities for specific business processes. It is important to note, though, that in the Kimball methodology the bottom-up process is the result of an initial business-oriented top-down analysis of the relevant business processes to be modelled.

Data marts contain, primarily, dimensions and facts. Facts can contain atomic data and, if necessary, summarized data. A single data mart often models a specific business area such as "Sales" or "Production." These data marts can eventually be integrated to create a comprehensive data warehouse. The integration of data marts is managed through the implementation of what Kimball calls "a data warehouse bus architecture". The data warehouse bus architecture is primarily an implementation of "the bus", a collection of conformed dimensions and conformed facts; conformed dimensions are dimensions that are shared (in a specific way) between facts in two or more data marts.

The integration of the data marts in the data warehouse is centered on the conformed dimensions (residing in "the bus") that define the possible integration "points" between data marts. The actual integration of two or more data marts is then done by a process known as "drill across". A drill-across works by grouping (summarizing) the data along the keys of the (shared) conformed dimensions of each fact participating in the drill-across, followed by a join on the keys of these grouped (summarized) facts.

Maintaining tight management over the data warehouse bus architecture is fundamental to maintaining the integrity of the data warehouse. The most important management task is making sure dimensions among data marts are consistent. In Kimball's words, this means that the dimensions "conform".

Some consider it an advantage of the Kimball method that the data warehouse ends up being "segmented" into a number of logically self-contained (up to and including the bus) and consistent data marts, rather than a big and often complex centralized model. Business value can be returned as quickly as the first data marts can be created, and the method lends itself well to an exploratory and iterative approach to building data warehouses.
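The drill-across step described above (summarize each fact along the shared conformed dimension key, then join the summaries on that key) can be sketched in a few lines. This is a minimal illustration using plain Python dictionaries rather than a real warehouse; the key name product_id, the measure names and the sample rows are all hypothetical.

```python
from collections import defaultdict


def summarize(facts, key, measure):
    """Group fact rows by a conformed dimension key and total one measure."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[key]] += row[measure]
    return dict(totals)


def drill_across(sales_facts, production_facts, key="product_id"):
    """Summarize each fact table on the shared key, then join the summaries."""
    sold = summarize(sales_facts, key, "units_sold")
    produced = summarize(production_facts, key, "units_produced")
    # Inner join: keep only keys that appear in both summarized facts.
    return {
        k: {"units_sold": sold[k], "units_produced": produced[k]}
        for k in sorted(sold.keys() & produced.keys())
    }


if __name__ == "__main__":
    sales = [
        {"product_id": "P1", "units_sold": 10},
        {"product_id": "P1", "units_sold": 5},
        {"product_id": "P2", "units_sold": 7},
    ]
    production = [
        {"product_id": "P1", "units_produced": 20},
        {"product_id": "P2", "units_produced": 4},
        {"product_id": "P2", "units_produced": 6},
        {"product_id": "P3", "units_produced": 9},
    ]
    print(drill_across(sales, production))
```

The join only makes sense because both fact tables use the same product key with the same meaning, which is what "conformed" is intended to guarantee. Summarizing before joining avoids multiplying rows when each side has several facts per key.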

For example, the data warehousing effort might start in the "Sales" department, by building a Sales data mart. Upon completion of the Sales data mart, the business might then decide to expand the warehousing activities into, say, the "Production" department, resulting in a Production data mart. The requirement for the Sales data mart and the Production data mart to be integrated is that they share the same "bus"; that is, the data warehousing team has made the effort to identify and implement the conformed dimensions in the bus, and the individual data marts link to that information from the bus. Note that this does not require 100% awareness from the onset of the data warehousing effort; no master plan is required upfront. The Sales data mart is good as it is (assuming that the bus is complete) and the Production data mart can be constructed virtually independently of the Sales data mart (but not independently of the bus).

If integration via the bus is achieved, the data warehouse, through its two data marts, will not only be able to deliver the specific information that the individual data marts are designed to deliver, in this example either "Sales" or "Production" information, but can also deliver integrated Sales-Production information, which is often of critical business value. This integration can be achieved in a flexible and iterative fashion.

Top-down design

Bill Inmon, one of the first authors on the subject of data warehousing, has defined a data warehouse as a centralized repository for the entire enterprise. Inmon is one of the leading proponents of the top-down approach to data warehouse design, in which the data warehouse is designed using a normalized enterprise data model. "Atomic" data, that is, data at the lowest level of detail, are stored in the data warehouse. Dimensional data marts containing data needed for specific business processes or specific departments are created from the data warehouse. In the Inmon vision the data warehouse is at the center of the "Corporate Information Factory" (CIF), which provides a logical framework for delivering business intelligence (BI) and business management capabilities.

Inmon states that the data warehouse is:
- Subject-oriented: the data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together.
- Time-variant: for an operational system, the stored data contains the current value; the data warehouse, by contrast, retains values over time.
- Non-volatile: data in the data warehouse are never over-written or deleted; once committed, the data are static, read-only, and retained for future reporting.
- Integrated: the data warehouse contains data from most or all of an organization's operational systems and these data are made consistent.

The top-down design methodology generates highly consistent dimensional views of data across data marts since all data marts are loaded from the centralized repository. Top-down design has also proven to be robust against business changes. Generating new dimensional data marts against the data stored in the data warehouse is a relatively simple task.

The main disadvantage to the top-down methodology is that it represents a very large project with a very broad scope. The up-front cost for implementing a data warehouse using the top-down methodology is significant, and the duration of time from the start of project to the point that end users experience initial benefits can be substantial. In addition, the top-down methodology can be inflexible and unresponsive to changing departmental needs during the implementation phases.

Hybrid design

Data warehouse (DW) solutions often resemble hub and spoke architecture. Legacy systems feeding the DW/BI solution often include customer relationship management (CRM) and enterprise resource planning (ERP) solutions, generating large amounts of data. To consolidate these various data models, and facilitate the extract transform load (ETL) process, DW solutions often make use of an operational data store (ODS). The information from the ODS is then parsed into the actual DW. To reduce data redundancy, larger systems will often store the data in a normalized way. Data marts for specific reports can then be built on top of the DW solution.

It is important to note that the DW database in a hybrid solution is kept in third normal form to eliminate data redundancy. A normal relational database, however, is not efficient for business intelligence reports where dimensional modeling is prevalent. Small data marts can shop for data from the consolidated warehouse and use the filtered, specific data for the fact tables and dimensions required. The DW effectively provides a single source of information from which the data marts can read, creating a highly flexible solution from a BI point of view. The hybrid architecture allows a DW to be replaced with a master data management solution where operational, not static, information could reside.

The Data Vault Modeling components follow hub and spoke architecture. This modeling style is a hybrid design, consisting of the best of breed practices from both 3rd normal form and star schema. The Data Vault model is not a true 3rd normal form, and breaks some of the rules that 3NF dictates be followed. It is, however, a top-down architecture with a bottom-up design. The Data Vault model is geared to be strictly a data warehouse. It is not geared to be end-user accessible, which, when built, still requires the use of a data mart or star schema based release area for business purposes.

3. How does it run?

Implementation

Once the Planning and Design stages are complete, the project to implement the current Data Warehouse iteration can proceed quickly. Necessary hardware, software and middleware components are purchased and installed, the development and test environment is established, and the configuration management processes are implemented. Programs are developed to extract, cleanse, transform and load the source data and to periodically refresh the existing data in the Warehouse, and the programs are individually unit tested against a test database with sample source data. Metrics are captured for the load process. The metadata repository is loaded with transformational and business user metadata. Canned production reports are developed and sample ad-hoc queries are run against the test database, and the validity of the output is measured. User access to the data in the Warehouse is established. Once the programs have been developed and unit tested and the components are in place, system functionality and user acceptance testing is conducted for the complete integrated Data Warehouse system.
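The extract, cleanse, transform and load programs mentioned above can be pictured with a very small pipeline. The sketch below is an illustration only, assuming Python's built-in sqlite3 module as a stand-in for the warehouse; the sample rows, the weekly_sales table and the code-to-UPC mapping are hypothetical values invented for the example. It follows the earlier sales example: daily rows are rolled up to weekly totals and internal product codes are converted to UPCs.

```python
import sqlite3
from collections import defaultdict
from datetime import date

# Hypothetical mapping from internal product codes to UPCs.
UPC_BY_INTERNAL_CODE = {"A-100": "012345678905", "B-200": "036000291452"}


def extract():
    """Stand-in for reading rows from an operational source system."""
    return [
        {"day": "2024-01-01", "code": "A-100", "qty": 3},
        {"day": "2024-01-03", "code": "A-100", "qty": 2},
        {"day": "2024-01-08", "code": "A-100", "qty": 4},
        {"day": "2024-01-02", "code": "B-200", "qty": None},  # incomplete row
        {"day": "2024-01-04", "code": "B-200", "qty": 6},
    ]


def cleanse(rows):
    """Drop rows with a missing quantity or an unknown product code."""
    return [
        r for r in rows
        if r["qty"] is not None and r["code"] in UPC_BY_INTERNAL_CODE
    ]


def transform(rows):
    """Aggregate to (ISO year, ISO week, UPC) totals."""
    totals = defaultdict(int)
    for r in rows:
        iso = date.fromisoformat(r["day"]).isocalendar()
        totals[(iso[0], iso[1], UPC_BY_INTERNAL_CODE[r["code"]])] += r["qty"]
    return [(year, week, upc, units) for (year, week, upc), units in totals.items()]


def load(conn, weekly_rows):
    """Bulk-insert the summarized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weekly_sales "
        "(iso_year INTEGER, iso_week INTEGER, upc TEXT, units INTEGER)"
    )
    conn.executemany("INSERT INTO weekly_sales VALUES (?, ?, ?, ?)", weekly_rows)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")
    load(conn, transform(cleanse(extract())))
    rows = conn.execute(
        "SELECT iso_year, iso_week, upc, units FROM weekly_sales "
        "ORDER BY upc, iso_year, iso_week"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

Keeping each stage as a separate function mirrors the unit testing described above: each step can be checked against sample source data before the whole pipeline is run.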

System support processes of database security, system backup and recovery, system disaster recovery, and data archiving are implemented and tested as the system is prepared for deployment. The final step is to conduct the Production Readiness Review prior to transitioning the Data Warehouse system into production. During this review, the system is evaluated for acceptance by the customer organization.

Transition to Production

The Transition to Production stage moves the Data Warehouse development project into the production environment. The production database is created, and the extraction/cleanse/transformation routines are run on the operations system source data. The development team works with the Operations staff to perform the initial load of this data to the Warehouse and execute the first refresh cycle. The Operations staff is trained, and the Data Warehouse programs and processes are moved into the production libraries and catalogs. Rollout presentations and tool demonstrations are given to the entire customer community, and end-user training is scheduled and conducted. The Help Desk is established and put into operation. A Service Level Agreement is developed and approved by the customer organization. Finally, the new system is positioned for ongoing maintenance through the establishment of a Change Management Board and the implementation of change control procedures for future development cycles.

4. Who are the people involved in this database application? What are their functions? Data Warehouses and Data Warehouse applications are designed primarily to support executives, senior managers, and business analysts in making complex business decisions. Data Warehouse applications provide the business community with access to accurate, consolidated information from various internal and external sources. The primary objective of Data Warehousing is to bring together information from disparate sources and put the information into a format that is conducive to making business decisions. This objective necessitates a set of activities that are far more complex than just collecting data and reporting against it. Data Warehousing requires both business and technical expertise and involves the following activities:
- Accurately identifying the business information that must be contained in the Warehouse
- Identifying and prioritizing subject areas to be included in the Data Warehouse
- Managing the scope of each subject area which will be implemented into the Warehouse on an iterative basis
- Developing a scalable architecture to serve as the Warehouse's technical and application foundation, and identifying and selecting the hardware/software/middleware components to implement it
- Extracting, cleansing, aggregating, transforming and validating the data to ensure accuracy and consistency
- Defining the correct level of summarization to support business decision making
- Establishing a refresh program that is consistent with business needs, timing and cycles
- Providing user-friendly, powerful tools at the desktop to access the data in the Warehouse
- Educating the business community about the realm of possibilities that are available to them through Data Warehousing
- Establishing a Data Warehouse Help Desk and training users to effectively utilize the desktop tools

- Establishing processes for maintaining, enhancing, and ensuring the ongoing success and applicability of the Warehouse

5. Where it can be implemented? Some of the applications data warehousing can be used for are:
- Decision support
- Trend analysis
- Financial forecasting
- Churn prediction for Telecom subscribers, Credit Card users etc.
- Insurance fraud analysis
- Call record analysis
- Logistics and Inventory management
- Agriculture

6. Why it is needed? A data warehouse maintains a copy of information from the source transaction systems. This architectural complexity provides the opportunity to:
- Maintain data history, even if the source transaction systems do not.
- Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable, but particularly so when the organization has grown by merger.
- Improve data quality, by providing consistent codes and descriptions, flagging or even fixing bad data.
- Present the organization's information consistently.
- Provide a single common data model for all data of interest regardless of the data's source.
- Restructure the data so that it makes sense to the business users.
- Restructure the data so that it delivers excellent query performance, even for complex analytic queries, without impacting the operational systems.
- Add value to operational business applications, notably customer relationship management (CRM) systems.

7. What are the examples of this database application?
- AdventureWorksDW2008R2
- ETL

Distributed Database

1. What is this database application? The definition of a distributed database is broad, and the term may be used with different meanings. In general it typically refers to a modular DBMS architecture that allows distinct DBMS instances to cooperate as a single DBMS over processes, computers, and sites, while managing a single database that is itself distributed over multiple computers and different sites.
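As a rough illustration of several database instances cooperating as one logical database, the sketch below treats two in-memory SQLite databases as two "sites" holding different rows of the same orders table. This is a toy model under stated assumptions, not how a production distributed DBMS is built: the class name, site names and table layout are hypothetical, and real systems also handle transactions, failures and query optimization across sites.

```python
import sqlite3


def make_site(rows):
    """Create one site: an independent database holding part of the data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


class DistributedOrders:
    """Presents several sites as a single logical orders table."""

    def __init__(self, sites):
        self.sites = sites  # mapping of site name -> sqlite3 connection

    def local_total(self, site):
        # Answered by one site only; no data from other sites is needed.
        cursor = self.sites[site].execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders"
        )
        return cursor.fetchone()[0]

    def global_total(self):
        # Needs data from every site; partial results are combined here.
        return sum(self.local_total(name) for name in self.sites)


if __name__ == "__main__":
    db = DistributedOrders({
        "east": make_site([(1, "east", 10.0), (2, "east", 5.0)]),
        "west": make_site([(3, "west", 7.5)]),
    })
    print(db.local_total("east"))
    print(db.global_total())
```

The caller of global_total does not need to know how many sites exist or where each row lives, which is the sense in which the instances behave as a single database.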

2. What is the architectural design of this database? A distributed database system allows applications to access data from local and remote databases. In a homogeneous distributed database system, each database is an Oracle database. In a heterogeneous distributed database system, at least one of the databases is a non-Oracle database. Distributed databases use client/server architecture to process information requests. A distributed database does not share main memory or disks. A database user accesses the distributed database through:
- Local applications: applications which do not require data from other sites.
- Global applications: applications which do require data from other sites.

3. How does it run? The First Steps script takes care of setting up database tables for Telecom Web Services Server (TWSS) when you are using a consolidated or shared database. However, additional steps may be required when you are using a distributed database configuration.

Before you begin

This procedure is a sub-procedure within the overall migration process of your test (non-production) or production system. Make sure that you have already installed TWSS version 7.2 components and that you have performed the procedures described in one of the following post-installation configuration topics:
- Creating and configuring the DB2 database server instance
- Creating and configuring the Oracle database server instance

Note that the First Steps script assumes that all nodes in the cluster are at a TWSS version 7.2 level.

About this task

If a distributed or partitioned topology is used for databases, you will encounter one of the following scenarios:

- One database per cluster, with each cluster having its own database. For example, you may have the Access Gateway running in one cluster and the Service Platform components running in another cluster. If this is the case, when you run the First Steps script on the Access Gateway cluster, all of the database parameters should refer to the Access Gateway database (typically named AGDB). Then when you run the script on the Service Platform cluster, the database parameters should refer to the Service Platform database (typically SPMDB). No additional special considerations apply.
- More than one database per cluster. For example, you may have a separate database for the WAP Push service on the same cluster with the database for the Service Platform components. If this is the case, use the procedure in this topic to create a temporary database and then configure new databases to work with TWSS version 7.2.

Note: When the First Steps script sets up your configuration for running TWSS version 7.2, it does not modify your existing configuration. Therefore, if you made changes to the default configuration for a previous version of TWSS, your changes are preserved during the migration process.

Procedure

To migrate a distributed database configuration when there is more than one database per cluster, perform the following steps.
a. On one of the nodes in the cluster, create a temporary database by running the First Steps script with the Initial Configuration Mode option pointing to a new temporary database. For detailed information, see the topic Running the First Steps configuration script. The First Steps script creates the minimum necessary data sources to support a distributed topology and points to the new single (temporary) database instance.
b. Create additional data sources, as necessary, and update the necessary JNDI bindings to the respective data sources.
c. Drop the temporary database. Note: Migration of runtime data is generally not supported, except in the cases of some Direct Connect-based web service implementations. (For details, refer to the topic Planning to migrate from the previous version of Telecom Web Services Server.) In these cases, be sure that you have copied all of the existing data to the new database before deleting the old database and database tables.
d. Verify databases and logs. Verify that the newly populated data coexists with the previous data in cases where migration of runtime data is supported for a given web service or feature.
e. Verify the new migrated environment by running any successful service logic test case.

4. Who are the people involved in this database application? What are their functions? In a distributed database environment, coordinate with the database administrator to determine the best location for the data. Some issues to consider are:

- Number of transactions posted from each location
- Amount of data (portion of table) used by each node
- Performance characteristics and reliability of the network
- Speed of various nodes, capacities of disks
- Importance of a node or link when it is unavailable
- Need for referential integrity among tables

5. Where it can be implemented?
The explosion of individual databases running on PC platforms can provide new opportunities to heads of departments, but may also pose problems for the organization as a whole. Two major factors have led many business users to reject the centralized database model: the natural tendency for humans not to share, and the introduction of personal computer (PC)-based DBMSs powerful enough to handle many concurrent users. Armed with such tools, departments and workgroups can easily build their own databases, wresting control of the information resource from the administrators of the organization's central databases and satisfying their natural tendency not to share. Information that could benefit the entire organization often becomes out of reach for users unable to access it or unaware of its existence. Data inconsistency is another problem that occurs in such an environment, as the same data is stored in many databases with no system for managing the multiple copies. Additionally, because of the cheaper hardware and software used, and the generally lower skills of the personnel administering these systems, reliability can be significantly less than with centralized systems.

6. Why it is needed?
Business leaders today understand the importance of information as a business resource. With centralized database systems, an organization's information is maintained and controlled by a few highly skilled individuals at one location. The centralized and decentralized models described above both generate major problems for large organizations. Some type of architecture that provides the advantages of both without the drawbacks would be ideal. This architecture should allow decentralized use of data, while providing for database administration that can be performed by personnel with the interests of the whole firm in mind.

7. What are the examples of this database application?
Examples are databases of local work-groups and departments at regional offices, branch offices, manufacturing plants and other work sites. These databases can include both segments shared by multiple sites, and segments specific to one site and used only locally in that site.

Document-oriented database

1. What is this database application?
A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are

one of the main categories of so-called NoSQL databases, and the popularity of the term "document-oriented database" (or "document store") has grown with the use of the term NoSQL itself.

2. What is the architectural design of this database?
Document-based Database (the details below match Apache CouchDB) is not a relational database management system. Instead of storing data in rows and columns, the database manages a collection of JSON documents. The documents in a collection need not share a schema, but retain query abilities via views. Views are defined with aggregate functions and filters, and are computed in parallel, much like MapReduce. Views are generally stored in the database and their indexes updated continuously, although queries may introduce temporary views. Document-based Database supports a view system using external socket servers and a JSON-based protocol. As a consequence, view servers have been developed in a variety of languages.

Document-based Database exposes a RESTful HTTP API and a large number of pre-written clients are available. Additionally, a plugin architecture allows for using different computer languages as the view server, such as JavaScript (default), PHP, Ruby, Python and Erlang. Support for other languages can be easily added. Since version 0.11, Document-based Database supports CommonJS' Module specification. It is in use in many software projects and web sites, including Ubuntu, where it is used to synchronize address and bookmark data.

Document-based Database design and philosophy borrows heavily from Web architecture and the concepts of resources, methods and representations, and can be simplified as the following.

Features

Document Storage
Document-based Database stores documents in their entirety. You can think of a document as one or more field/value pairs expressed as JSON. Field values can be simple things like strings, numbers, or dates, but you can also use ordered lists and associative maps. Every document in a Document-based Database database has a unique id and there is no required document schema.

ACID Semantics

Like many relational database engines, Document-based Database provides ACID semantics. It does this by implementing a form of Multi-Version Concurrency Control (MVCC) not unlike InnoDB or Oracle. That means Document-based Database can handle a high volume of concurrent readers and writers without conflict.

Map/Reduce Views and Indexes
To provide some structure to the data stored in Document-based Database, you can develop views that are similar to their relational database counterparts. In Document-based Database, each view is constructed by a JavaScript function (server-side JavaScript by using CommonJS and SpiderMonkey) that acts as the Map half of a map/reduce operation. The function takes a document and transforms it into a single value which it returns. The logic in your JavaScript functions can be arbitrarily complex. Since computing a view over a large database can be an expensive operation, Document-based Database can index views and keep those indexes updated as documents are added, removed, or updated. This provides a powerful indexing mechanism that offers more control than most databases provide.

Distributed Architecture with Replication
Document-based Database was designed with bidirectional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time. The biggest gotcha typically associated with this level of flexibility is conflicts.

Eventual Consistency
According to the CAP theorem it is impossible for a distributed system to simultaneously provide consistency, availability and partition tolerance guarantees. A distributed system can satisfy any two of these guarantees at the same time, but not all three. Document-based Database guarantees eventual consistency to be able to provide both availability and partition tolerance.

REST API
Document-based Database treats all stored items (there are others besides documents) as a resource. All items have a unique URI that gets exposed via HTTP. REST uses the HTTP methods POST, GET, PUT and DELETE for the four basic CRUD (Create, Read, Update, Delete) operations on all resources. HTTP is a widely understood, interoperable, scalable and proven technology. A lot of tools, software and hardware, are available to do all sorts of things with HTTP like caching, proxying and load balancing.

3. How does it run?
The central concept of a document-oriented database is the notion of a Document. While each document-oriented database implementation differs on the details of this definition, in general, they all assume documents encapsulate and encode data (or information) in some standard format(s) (or encoding(s)). Encodings in use include XML, YAML, JSON and BSON, as well as binary forms like PDF and Microsoft Office documents (MS Word, Excel, and so on).
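
The map/reduce view mechanism described under "Map/Reduce Views and Indexes" can be illustrated with a small in-memory sketch. This is plain Python, not the JavaScript view server the text describes. The documents, the `map_tags` function, and the `build_view` helper are hypothetical names invented for the illustration. The map function here emits one key/value pair per tag, which is a simplifying assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical schema-free documents; ids and fields are illustrative only.
documents = [
    {"_id": "post-1", "type": "post", "author": "alice", "tags": ["db", "nosql"]},
    {"_id": "post-2", "type": "post", "author": "bob", "tags": ["nosql"]},
    {"_id": "note-1", "type": "note", "text": "this document has no tags field"},
]


def map_tags(doc):
    """Map step: emit one (key, value) pair per tag; documents without tags emit nothing."""
    for tag in doc.get("tags", []):
        yield tag, 1


def build_view(docs, map_fn, reduce_fn):
    """Run map_fn over every document, group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Sorting by key mimics a view index that is kept ordered by key.
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


tag_counts = build_view(documents, map_tags, sum)
print(tag_counts)
```

A real document store would persist this index and update it incrementally as documents change, rather than recomputing it on every call as this sketch does.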

Documents inside a document-oriented database are similar, in some ways, to records or rows in relational databases, but they are less rigid. They are not required to adhere to a standard schema nor will they have all the same sections, slots, parts, keys, or the like. For example, here's a document:

FirstName="Bob", Address="5 Oak St.", Hobby="sailing".

Another document could be:

FirstName="Jonathan", Address="15 Wanamassa Point Road", Children=[{Name:"Michael", Age:10}, {Name:"Jennifer", Age:8}, {Name:"Samantha", Age:5}, {Name:"Elena", Age:2}].

Both documents have some similar information and some different. Unlike a relational database where each record would have the same set of fields and unused fields might be kept empty, there are no empty 'fields' in either document (record) in this case. This system allows new information to be added and it doesn't require explicitly stating if other pieces of information are left out.

Keys
Documents are addressed in the database via a unique key that represents that document. Often, this key is a simple string. In some cases, this string is a URI or path. Regardless, you can use this key to retrieve the document from the database. Typically, the database retains an index on the key such that document retrieval is fast.

Retrieval
One of the other defining characteristics of a document-oriented database is that, beyond the simple key-document (or key-value) lookup that you can use to retrieve a document, the database will offer an API or query language that will allow you to retrieve documents based on their contents. For example, you may want a query that gets you all the documents with a certain field set to a certain value. The set of query APIs or query language features available, as well as the expected performance of the queries, varies significantly from one implementation to the next.

Organization
Implementations offer a variety of ways of organizing documents, including notions of:
- Collections
- Tags
- Non-visible Metadata
- Directory hierarchies

4. Who are the people involved in this database application? What are their functions?
People go for databases like CouchDB not because there is a need for schema-less data but for the other cool features these databases give us, like replication (master to master), conflict resolution etc., which can be leveraged to have an offline capability for applications. I think this itself is the smell: because the application domain needs relational data, features like these should not drive one to use NoSQL databases.
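
To make the key lookup and content-based retrieval ideas from question 3 concrete, here is a minimal in-memory sketch in Python. It reuses the two example documents from the text. The key strings (`person/bob`, `person/jonathan`) and the helper names `get`, `find`, and `has_field` are assumptions made for the illustration and do not correspond to any particular product's API.

```python
# The two example documents from the text, stored under hypothetical string keys.
store = {
    "person/bob": {"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"},
    "person/jonathan": {
        "FirstName": "Jonathan",
        "Address": "15 Wanamassa Point Road",
        "Children": [
            {"Name": "Michael", "Age": 10},
            {"Name": "Jennifer", "Age": 8},
            {"Name": "Samantha", "Age": 5},
            {"Name": "Elena", "Age": 2},
        ],
    },
}


def get(key):
    """Key lookup: return the document stored under key, or None if absent."""
    return store.get(key)


def find(field, value):
    """Content-based retrieval: keys of all documents whose field equals value."""
    return [key for key, doc in store.items() if doc.get(field) == value]


def has_field(field):
    """Keys of all documents that contain field at all (no empty placeholders exist)."""
    return [key for key, doc in store.items() if field in doc]


print(get("person/bob")["Hobby"])
print(find("FirstName", "Jonathan"))
print(has_field("Children"))
```

The dictionary plays the role of the key index, so `get` is a direct lookup, while `find` scans every document. A real document database would typically back content queries with an index or view instead of a full scan.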

Anyway, the choice is made (yet to be challenged), and we had to figure out how to model our documents and persist them in a database that has the parent/child relationship. CouchDB itself does not care how we store documents; it is up to the API to manage these relationships. Ektorp provides a way to achieve this. Let's take an example which is quoted in the documentation of Ektorp itself: a BlogPost and a bunch of Comments. What if we want to store a BlogPost and Comment as different documents? Can we add comments to the blog post and, just by saving the blog post, save all the comments as individual documents? The answer is yes, it is possible with the @DocumentReferences annotation.

5. Where it can be implemented?

Name | Publisher | License | Language | Notes | RESTful API
Lotus Notes | IBM | Proprietary | | | (unknown)
askSam | askSam Systems | Proprietary | | | (unknown)
Apstrata | Apstrata | Proprietary | | | (unknown)
Datawasp | Significant Data Systems | Proprietary | | | (unknown)
Clusterpoint | Clusterpoint Ltd. | Free community license / Commercial | C++ | Scalable, high-performance, schema-free, document-oriented database management system platform with server-based data storage, fast full-text search engine functionality, information ranking for search relevance and clustering. | Yes
CRX | Day Software | Proprietary | | | (unknown)
MUMPS Database | | Proprietary and GNU Affero GPL | MUMPS | Commonly used in health applications. | (unknown)
UniVerse | Rocket Software | Proprietary | | | Yes (Beta)
UniData | Rocket Software | Proprietary | | | Yes (Beta)
Jackrabbit | Apache Software Foundation | Apache License | Java | | (unknown)
CouchDB | Couchbase, Apache Software Foundation | Apache License | Erlang | JSON over REST/HTTP with Multi-Version Concurrency Control and ACID properties. Uses map and reduce for views and queries. | Yes (there is only RESTful API)
FleetDB | FleetDB | MIT License | Clojure | A JSON-based schema-free database optimized for agile development. | (unknown)
GemFire Enterprise | VMWare | Commercial | Java, C++ | Memory-oriented, fast, key-value database with indexing and querying support. | Yes
MongoDB | 10gen, Inc | GNU AGPL v3.0 | C, C++, Java, PHP, JavaScript, Ruby, Python, Perl, Erlang, Haskell, Scala, .NET (C#, F#, PowerShell, etc.)[4] | Fast, document-oriented database optimized for highly transient data. | Optional using external tools[7]
OrientDB | Orient Technologies | Apache License | Java | JSON over HTTP | Yes
RavenDB | RavenDB | Commercial or GNU AGPL v3.0 | .NET | A .NET LINQ-enabled Document Database, focused on providing a high-performance, transactional, schema-less, flexible and scalable NoSQL data store for the .NET and Windows platforms. | Yes
Redis | | BSD License | ANSI C | Key-value store supporting lists and sets with fast, simple and binary-safe protocol. | (unknown)
StrokeDB | | MIT License | | Alpha software. | (unknown)
Terrastore | | Apache License | Java | JSON/HTTP | (unknown)
ThruDB | | BSD License | C++, Java | Built on top of Apache Thrift framework that provides indexing and document storage services for building and scaling websites. Alternate implementation is being developed in Java. Alpha software. | (unknown)
DBSlayer | DBSlayer | Apache License | C | Database abstraction layer (over MySQL) used by the New York Times. JSON over HTTP. | (unknown)
Eloquera DB | Eloquera | Proprietary | .NET | High performance. Based on dynamic objects. Supports LINQ, SQL queries. | (unknown)
Persevere | Persevere | BSD License | | A JSON database and JavaScript Application Server. Provides RESTful JSON interface for create, read, update, and delete access to data. Also supports JSONQuery/JSONPath querying. | Yes

6. Why it is needed?
Utilized to conveniently store, manage, edit and retrieve documents.

7. What are the examples of these database applications?
- Apache CouchDB
- MongoDb
- Github
- Sourceforge
- IBM Lotus Notes

Embedded database

1. What is this database application?
An embedded database system is a DBMS which is tightly integrated with an application software that requires access to stored data, in a way that the DBMS is hidden from the application's end-user and requires little or no ongoing maintenance. It is actually a broad technology category that includes

DBMSs with differing properties and target markets. The term "embedded database" can be confusing because only a small subset of embedded database products is used in real-time embedded systems such as telecommunications switches and consumer electronics devices.

2. What is the architectural design of this database?
Embedded database system supports many application programming interfaces in several programming languages. The C programming language has the most APIs, including the low-level kernel MR Routines, Embedded SQL, MSCALL and ODBC. There are also APIs for C++ and Java. Applications developed using these APIs may be run in standalone and/or server modes. The layered architecture design provides levels of system optimization for application development.

3. How does it run?
Embedded Database is a full-function, relational database that has been embedded into applications by organizations small to large, with deployment environments including medical systems, network routers, nuclear power plant monitors, satellite management systems, and other embedded system applications that require reliability and power. It is an ACID-compliant SQL database engine with C, C++, Java, JDBC, ODBC, SQL, ADO.NET and kernel-level APIs. Embedded Database runs on Linux, Unix, Microsoft Windows and real-time operating systems.

4. Who are the people involved in this database application? What are their functions?
Technology has managed, for quite a while now, to change the ways in which people choose to live their lives. Databases, computers in particular, and software have developed from machines able to perform only basic tasks, which people would probably no longer need anyway, to handling complex, multiple and various information in a minimum amount of time. Databases have moreover contributed to the emergence of a different way to keep one's information not only thoroughly organized but also easily accessible. This is why database software has constantly increased in popularity ever since it was first developed. One of the earliest widely used database programs for personal computers was dBase, which dates from the late 1970s and helped people manage their information better. Nowadays the database language has evolved and it is available in different types of databases, each potentially used by different people and in different settings. The embedded database is one of them.

The embedded database is a type of database system or DBMS which is very closely integrated into an application software that requires access to stored data. This means that the database system is actually hidden from the application that the end user will work with, which makes it much easier for individuals who do not have professional training in programming to work with this type of database, although programming languages are in most cases user friendly and one can learn to work with them very fast and without too much effort. An

embedded database does not require maintenance, or requires it only to a very small extent. This type of database, although very complex and complicated, makes it easier for the end user to work with or handle data. The term however depicts a very broad technological system which may include different application programming interfaces such as SQL, database architecture, storage modes or database models, as well as target markets.

5. Where it can be implemented?
Major embedded database products include, in alphabetical order: Advantage Database Server from Sybase Inc., Berkeley DB from Oracle Corporation, CSQL from csqlcache.com, EffiProz from EffiProz Systems, ElevateDB from Elevate Software, Inc., Empress Embedded Database from Empress Software, Extensible Storage Engine from Microsoft, eXtremeDB from McObject, Firebird Embedded, HSQLDB from HSQLDB.ORG, Informix Dynamic Server (IDS) from IBM, InnoDB from Oracle Corporation, ITTIA DB from ITTIA, RDM Embedded and RDM Server from Raima Inc., SolidDB from IBM, SQLite, SQL Server Compact from Microsoft Corporation, Valentina DB from Paradigma Software, and VistaDB from VistaDB Software, Inc.

6. Why it is needed?
Typically, when data is returned to the user, it must be copied from the data manager's buffer cache (or data page) into the application's memory. However, in an embedded environment, the robustness of the total software package is of paramount importance, not the isolation between the application and the data manager. As a result, it is possible for the data manager to avoid copies by giving applications direct references to data items in a shared memory cache. This is a significant performance optimization that can be allowed when the application and data manager are tightly integrated.

7. What are the examples of these database applications?
- MySQL

End-user database

1. What is this database application?
These databases consist of data developed by individual end-users. Examples of these are collections of documents, spreadsheets, presentations, multimedia, and other files. Several products exist to support such databases. Some of them are much simpler than full-fledged DBMSs, with more elementary DBMS functionality (e.g., not supporting multiple concurrent end-users on the same database), with basic programming interfaces, and a relatively small "foot-print" (not much code to run as in "regular" general-purpose databases). However, available general-purpose DBMSs can often be used for this purpose as well, if they provide basic user interfaces for straightforward database applications (limited query and data display, no real programming needed), while still enjoying the database qualities and protections that these DBMSs can provide.

2. What is the architectural design of this database?

The number, nature, and needs of the tenants you expect to serve all affect your data architecture decision in different ways. Some of the following questions may bias you toward a more isolated approach, while others may bias you toward a more shared approach.
- How many prospective tenants do you expect to target? You may be nowhere near being able to estimate prospective use with authority, but think in terms of orders of magnitude: are you building an application for hundreds of tenants? Thousands? Tens of thousands? More? The larger you expect your tenant base to be, the more likely you will want to consider a more shared approach.
- How much storage space do you expect the average tenant's data to occupy? If you expect some or all tenants to store very large amounts of data, the separate-database approach is probably best. (Indeed, data storage requirements may force you to adopt a separate-database model anyway. If so, it will be much easier to design the application that way from the beginning than to move to a separate-database approach later on.)
- How many concurrent end users do you expect the average tenant to support? The larger the number, the more appropriate a more isolated approach will be to meet end-user requirements.
- Do you expect to offer any per-tenant value-added services, such as per-tenant backup and restore capability? Such services are easier to offer through a more isolated approach.

3. How does it run?
If you continually gauge database performance from a strictly database-internal point of view, you are missing the boat. Begin watching your database end users and performance will begin to take on a whole new meaning.

4. Who are the people involved in this database application? What are their functions?

From a Discoverer end user's point of view, the ability to schedule workbooks is useful for:
- reports that take a long time to run
- reports that have to run at regular intervals
For example, a Discoverer end user might want to run a report that they know will take a long time to complete. The user can schedule the report to run overnight and have the results ready to view the next morning.

From a Discoverer manager's point of view, workbook scheduling is useful to prevent long-running queries from adversely affecting system performance. You can force users to schedule workbooks (either all workbooks, or only those workbooks that will exceed a predicted time that you specify), and you can further specify the time periods that scheduled workbooks are permitted to run.

5. Where it can be implemented?
In consultation with all potential users of the database, a database designer's first step is to draw up a data requirements document. The requirements document contains a concise and non-technical summary of what data items will be stored in the database, and how the various data items relate to one another. Taking the 'data requirements document', further analysis is done to give meaning to the data items, e.g., define the more detailed attributes of the data and define constraints if needed. The result of this analysis is a 'preliminary specifications' document (Batini et al., 1986). Taking the specifications document, the database designer models how the information is viewed by the database system and how it is processed and conveyed to the end user. In the implementation design phase, the conceptual design is translated into a more low-level, DBMS-specific design.

6. Why it is needed?
Lack of familiarity with database design methods could prevent many end users from effectively implementing their database management system packages. An inexpensive solution would be for end users to learn required database design skills from software tutors tailored to their needs. This research describes two tutors developed to teach these skills to end users. The tutors were based on a modified Entity-Relationship database design method. They improved an end user's natural learning process by incorporating design principles and facilitators. Empirical comparison of the tutors tested the teaching effectiveness of the facilitators. The results lead to recommendations for closing the gap between skills required and skills learned by end users in database design. Development of tutors that teach specific database design skills irrespective of the software package used in implementation has important implications for practitioners and researchers.

7. What are the examples of these database applications?
- Oracle

Federated database and multi-database

1. What is this database application?

A federated database is an integrated database that comprises several distinct databases, each with its own DBMS. It is handled as a single database by a federated database management system (FDBMS), which transparently integrates multiple autonomous DBMSs, possibly of different types (which makes it a heterogeneous database), and provides them with an integrated conceptual view. The constituent databases are interconnected via computer network, and may be geographically decentralized.

Sometimes the term multi-database is used as a synonym for federated database, though it may refer to a less integrated (e.g., without an FDBMS and a managed integrated schema) group of databases that cooperate in a single application. In this case, middleware for distribution is typically used, which typically includes an atomic commit protocol (ACP), e.g., the two-phase commit protocol, to allow distributed (global) transactions (vs. local transactions confined to a single DBMS) across the participating databases.

2. What is the architectural design of this database?
All systems need to evolve over time. In a federated system, new sources may be needed to meet the changing needs of the users' business. IBM makes it easy to add new sources. The federated database engine accesses sources via a software component known as a wrapper. Accessing a new type of data source is done by acquiring or creating a wrapper for that source. The wrapper architecture enables the creation of new wrappers. Once a wrapper exists, simple data definition (DDL) statements allow sources to be dynamically added to the federation without stopping ongoing queries or transactions.

Any data source can be wrapped. IBM supports the ANSI SQL/MED standard (MED stands for Management of External Data). This standard documents the protocols used by a federated server to communicate with external data sources. Any wrapper written to the SQL/MED interface can be used with IBM's federated database. Thus wrappers can be written by third parties as well as IBM, and used in conjunction with IBM's federated database.

3. How does it run?
Composite data virtualization lets you easily build and run federated views. With Composite you can create a reusable federated view to model, access, combine, federate, and deliver data from multiple relational and non-relational sources. The Composite Studio, with its easy-to-learn, point-and-click development environment and automated code generation tools, greatly simplifies federated view building. With Composite-provided APIs, you can include data from multiple relational databases, a

wide variety of files including Excel and other formats, application data such as from SAP, and even XML sources without worrying about difficult connections, transforms, or other barriers. The Composite Information Server stores these views, making them available at runtime to multiple consuming applications such as BI or portals via popular standards including JDBC, ODBC, and ADO.NET. When run, Composite optimizes the query across all the sources required, leveraging source system resources and myriad other optimization techniques, to achieve what Composite describes as performance levels unmatched by its competitors.

4. Who are the people involved in this database application? What are their functions?
1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.
2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.
3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?
Fundamental to the difference between an MDBS and an FDBS is the concept of autonomy. It is important to understand the aspects of autonomy for component databases and how they can be addressed when a component DBS participates in an FDBS. There are four kinds of autonomy addressed:
- Design autonomy refers to the ability to choose its design irrespective of data, query language or conceptualization, and functionality of the system implementation. Heterogeneities in an FDBS are primarily due to design autonomy.
- Communication autonomy refers to the general operation of the DBMS to communicate with other DBMSs or not.
- Execution autonomy allows a component DBMS to control the operations requested by local and external operations.
- Association autonomy gives a component DBS the power to disassociate itself from a federation, which means the FDBS can operate independently of any single DBS.

The ANSI/X3/SPARC Study Group outlined a three level data description architecture, the components of which are the conceptual schema, internal schema and external schema of databases. The three level

architecture is, however, inadequate for describing the architectures of an FDBS. It was therefore extended to support the three dimensions of the FDBS, namely Distribution, Autonomy and Heterogeneity. The five level schema architecture is explained below.

6. Why it is needed?
The five level schema architecture includes the following:
- Local Schema is the conceptual schema expressed in the primary data model of the component DBMS.
- Component Schema is derived by translating the local schema into a model called the canonical data model or common data model. Component schemas are useful when semantics missed in the local schema are incorporated in the component. They help in integration of data for tightly coupled FDBS.
- Export Schema represents a subset of a component schema that is available to the FDBS. It may include access control information regarding its use by specific federation users. The export schemas help in managing flow of control of data.
- Federated Schema is an integration of multiple export schemas. It includes information on data distribution that is generated when integrating export schemas.
- External Schema defines a schema for a user/application or a class of users/applications.

While accurately representing the state of the art in data integration, the Five Level Schema Architecture above does suffer from a major drawback, namely IT-imposed look and feel. Modern data users demand control over how data is presented; their needs are somewhat in conflict with such bottom-up approaches to data integration.

7. What are the examples of these database applications?
- IBM
- OBIEE

Graph database

1. What is this database application?
A graph database is a kind of NoSQL database that uses graph structures with nodes, edges, and properties to represent and store information. General graph databases that can store any graph are distinct from specialized graph databases such as triplestores and network databases.

2. What is the architectural design of this database?
Graph database is a conference that is organized by the community, for the community. Graph database is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers. The result is a high quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community.

3. How does it run?

With the success of Neo4j as a graph database in the NoSQL revolution, it's interesting to see another graph database, HyperGraphDB, in the mix. Their quick blurb on HyperGraphDB says it is a: general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects; it can also be used as an embedded object-oriented database for projects of all sizes. From the NoSQL Archive the summary on HyperGraphDB is: API: Java (and Java Langs), Written in: Java, Query Method: Java or P2P, Replication: P2P, Concurrency: STM, Misc: Open-Source, Especially for AI and Semantic Web.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of Open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The first group members write the needed application programs which the application comprises. The latter group members design the needed database and maintain it. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Why it is needed? Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications. They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad-hoc and changing data with evolving schemas.

6. Where it can be implemented? The following is a list of several well-known graph database projects:
- AllegroGraph - a scalable, high-performance RDF and graph database.
- Bigdata - a highly scalable RDF/graph database capable of 10B+ edges on a single node or clustered deployment for very high throughput.
- CloudGraph - a disk- and memory-based, fully transactional .NET graph database that uses graphs and key/value pairs to store data.
- Cytoscape - open-source platform, outgrowth of bioinformatics.
- DEX - a high-performance graph database from Sparsity Technologies, a technology transition company from DAMA-UPC.
- Filament - graph persistence framework and associated toolkits based on a navigational query style.
- GraphBase - a customizable, distributed, small-footprint, high-performance graph store with a rich tool set from FactNexus.
- Graphd - the proprietary backend of Freebase.
- Horton - a graph database from Microsoft Research Extreme Computing Group (XCG) based on the cloud programming infrastructure Orleans.
- HyperGraphDB - an open-source (LGPL) graph database supporting generalized hypergraphs where edges can point to other edges.
- InfiniteGraph - a highly scalable, distributed and cloud-enabled commercial product with flexible licensing for startups.
- InfoGrid - an open-source / commercial (AGPLv3, free for small entities) graph database with web front end and configurable storage engines (MySQL, PostgreSQL, Files, Hadoop).
- Neo4j - an open-source / commercial (GPLv3 community edition, AGPLv3 advanced and enterprise edition) graph database.
- OrientDB - a high-performance open source document-graph database.
- OQGRAPH - graph computation engine (GPLv2 licensed) for MySQL, MariaDB and Drizzle.
- sones GraphDB - an open-source / commercial (AGPLv3) graph database and universal access layer (funded by Deutsche Telekom AG).
- VertexDB - high performance graph database server that supports automatic garbage collection.
- Virtuoso Universal Server - a clustered high performance and scalable RDF graph database server.
- R2DF - R2DF framework for ranked path queries over weighted RDF graphs.
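To make the node, edge and property model more concrete, here is a minimal Python sketch of a toy property graph with a fewest-hops path query, the kind of graph-like query this section says graph databases handle naturally. It is only an illustration under stated assumptions: the class name PropertyGraph, its methods, and the sample people are hypothetical and do not correspond to the API of any product listed above.

```python
from collections import deque


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}  # node id -> properties dict
        self.edges = {}  # node id -> list of (neighbor id, properties dict)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.edges.setdefault(node_id, [])

    def add_edge(self, src, dst, **props):
        # Directed edge; both endpoints must already exist.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added first")
        self.edges[src].append((dst, props))

    def shortest_path(self, start, goal):
        """Breadth-first search: a fewest-hops path, or None if unreachable."""
        if start not in self.nodes or goal not in self.nodes:
            return None
        previous = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                # Walk the predecessor links back to the start.
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for neighbor, _props in self.edges[current]:
                if neighbor not in previous:
                    previous[neighbor] = current
                    queue.append(neighbor)
        return None


graph = PropertyGraph()
for name in ("alice", "bob", "carol", "dave", "erin"):
    graph.add_node(name, kind="person")
graph.add_edge("alice", "bob", relation="knows")
graph.add_edge("bob", "carol", relation="knows")
graph.add_edge("alice", "dave", relation="knows")
graph.add_edge("dave", "carol", relation="knows")
graph.add_edge("carol", "alice", relation="knows")

path = graph.shortest_path("alice", "carol")
```

The traversal follows edges directly from node to node instead of joining tables, which is the reason the text gives for graph databases suiting associative data.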

Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements. Graph databases are a powerful tool for graph-like queries, for example computing the shortest path between two nodes in the graph. Other graph-like queries can be performed over a graph database in a natural way (for example graph's diameter computations or community detection).

7. What are the examples of these database applications?
- OQGRAPH
- VertexDB
- R2DF

Hypermedia databases

1. What is this database application? The World Wide Web can be thought of as a database, albeit one spread across millions of independent computing systems. Web browsers "process" this data one page at a time, while Web crawlers and other software provide the equivalent of database indexes to support search and other activities. An extendable hypermedia system has a hypermedia data base for storing a hypermedia data, a display for providing a data display, a data processor for executing modification of the data stored in the hypermedia data base and also for executing addition of the type of the associated data, a view processor for executing modification of a view corresponding to the data processed by the data processor and also for executing addition of the type of the associated view, and a window system for executing display of the data corresponding to the view processed by the view processor on the display and also for informing the view processor of a data entered on the display. In the hypermedia system a programmer can arbitrarily add both data type and view type.

2. What is the architectural design of this database? Hypermedia may be developed a number of ways. Any programming tool can be used to write programs that link data from internal variables and nodes for external data files. Multimedia development software such as Adobe Flash, Adobe Director, Macromedia Authorware, and MatchWare Mediator may be used to create stand-alone hypermedia applications, with emphasis on entertainment content. Some database software such as Visual FoxPro and FileMaker Developer may be used to develop stand-alone hypermedia applications, with emphasis on educational and business content management.

3. How does it run? Hypermedia applications may be developed on embedded devices for the mobile and the Digital signage industries using the Scalable Vector Graphics (SVG) specification from W3C (World Wide Web Consortium). Software applications such as Ikivo Animator and Inkscape simplify the development of Hypermedia content based on SVG. Embedded devices such as iPhone natively support SVG specifications and may be used to create mobile and distributed Hypermedia applications.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of Open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The first group members write the needed application programs which the application comprises. The latter group members design the needed database and maintain it. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? Hyperlinks may also be added to data files using most business software via the limited scripting and hyperlinking features built in. Documentation software such as the Microsoft Office Suite and LibreOffice allow for hypertext links to other content within the same file, other external files, and URL links to files on external file servers. For more emphasis on graphics and page layout, hyperlinks may be added using most modern desktop publishing tools. This includes presentation programs, such as Microsoft Powerpoint and LibreOffice Impress, add-ons to print layout programs such as Quark Immedia, and tools to include hyperlinks in PDF documents such as Adobe InDesign for creating and Adobe Acrobat for editing. Hyper Publish is a tool specifically designed and optimized for hypermedia and hypertext management. Any HTML Editor may be used to build HTML files, accessible by any web browser. CD/DVD authoring tools such as DVD Studio Pro may be used to hyperlink the content of DVDs for DVD players or web links when the disc is played on a personal computer connected to the internet.

6. Why it is needed? The development and subsequent rapid advance of electronic computers in the second half of the twentieth century led to the development of database models that are far more efficient for dealing with large volumes of information than flat databases. The most notable is the relational model, which was proposed by E. F. Codd in 1970. Codd, a researcher at IBM, criticized existing data models for their inability to distinguish between the abstract descriptions of data structures and descriptions of the physical access mechanisms. Interest in this model was initially confined to academia, perhaps because the theoretical basis is not easy to understand, and thus the first commercial products, Oracle and DB2, did not appear until around 1980. Subsequently, relational databases became the dominant type for high performance applications because of their efficiency, ease of use, and ability to perform a variety of useful tasks that had not been originally envisioned. A relational database is a way of organizing data such that it appears to the user to be stored in a series of interrelated tables.

7. What are the examples of these database applications?
- Internet
- Inkscape

In-memory database

1. What is this database application? An in-memory database (IMDB, also main memory database or MMDB) is a database that primarily resides in main memory, but typically backed-up by non-volatile computer data storage. Main memory databases are faster than disk databases. Accessing data in memory reduces the I/O reading activity when, for example, querying the data. In applications where response time is critical, such as telecommunications network equipment, main memory databases are often used. An in-memory database is a database that runs entirely in main memory, without touching a disk. Often they run as an embedded database: created when a process starts, running embedded within that process, and is destroyed when the process finishes.
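As a small illustration of an embedded database that lives only in main memory, the sketch below uses Python's standard sqlite3 module, whose special ":memory:" name creates a database in RAM that vanishes when the connection closes. The routes table and its rows are made-up sample data, not something taken from the text.

```python
import sqlite3

# ":memory:" asks SQLite for a database held entirely in RAM; it is created
# with the connection and discarded when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE routes (destination TEXT PRIMARY KEY, next_hop TEXT)"
)
conn.executemany(
    "INSERT INTO routes VALUES (?, ?)",
    [("10.0.0.0/8", "192.168.1.1"), ("172.16.0.0/12", "192.168.1.2")],
)
conn.commit()

# Regular SQL works exactly as it would against a file-backed database.
row = conn.execute(
    "SELECT next_hop FROM routes WHERE destination = ?", ("10.0.0.0/8",)
).fetchone()
next_hop = row[0]
route_count = conn.execute("SELECT COUNT(*) FROM routes").fetchone()[0]
conn.close()

# A new ":memory:" connection is a brand-new, empty database: nothing
# from the previous one was persisted.
fresh = sqlite3.connect(":memory:")
tables_after_restart = fresh.execute(
    "SELECT name FROM sqlite_master"
).fetchall()
fresh.close()
```

Because nothing is written to disk, this style suits data that can be rebuilt on start-up; anything that must survive a restart still needs the non-volatile backing storage mentioned above.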

An IMDB usually features a strict memory-based architecture and direct data manipulation, storing and manipulating data in main memory.

2. What is the architectural design of this database? There are two routes people seem to take to an in-memory database for testing. The first one is to use a SQL in-memory database library. In Java-land the popular one seems to be HSQLDB. Elsewhere SQLite and Firebird come up. The nice thing about these tools is that they allow you to use regular SQL to query them. One issue is that they may not support quite the same dialects or have all the features of the target database. You can do something similar by running a file-based database on a ram disk, which allows you to keep the test and production deployments closer to each other. Another route is to abstract all the database access behind a Repository. Then you can swap out the database with regular in-memory data structures. Often just a bunch of hash-tables for the entry points to the object graph is enough. This means that your object-relational mapping system is also hidden inside the repository. One of the strengths of the repository approach is that it gives you a consistent way to access (and stub out) non SQL data sources too.

3. How does it run? While most people think of databases as large disk-centered creatures, there's a small but busy world of in-memory databases out there. There are applications which need fast access to some sort of managed data which doesn't need to be persisted either because it doesn't change, or it can be reconstructed (imagine a routing table in a router, or an EventPoster). Yet even developers of traditional database systems can find an in-memory database useful, particularly for testing. When you're developing an enterprise application, tests that hit the database can be a huge time drain when running your test suites. Switching to an in-memory database can have an order of magnitude effect which can dramatically reduce build times. Since most ThoughtWorkers get the shakes if they haven't had a green bar recently, this makes a big difference to us.
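The repository route for testing can be sketched roughly as follows. This is a minimal Python illustration, assuming a hypothetical CustomerRepository interface with one SQL-backed implementation and one backed by a plain hash table (a dict); none of these class or function names come from the text, and a real repository would cover far more operations.

```python
import sqlite3
from abc import ABC, abstractmethod


class CustomerRepository(ABC):
    """Hypothetical repository interface: the only door to customer data."""

    @abstractmethod
    def add(self, customer_id, name):
        ...

    @abstractmethod
    def find(self, customer_id):
        ...


class SqlCustomerRepository(CustomerRepository):
    """SQL-backed implementation; all SQL stays hidden in this class."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, customer_id, name):
        self._conn.execute(
            "INSERT INTO customers VALUES (?, ?)", (customer_id, name)
        )

    def find(self, customer_id):
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class InMemoryCustomerRepository(CustomerRepository):
    """Test double: just a hash table keyed by customer id."""

    def __init__(self):
        self._by_id = {}

    def add(self, customer_id, name):
        self._by_id[customer_id] = name

    def find(self, customer_id):
        return self._by_id.get(customer_id)


def greeting(repository, customer_id):
    # Domain code depends only on the interface, never on SQL.
    name = repository.find(customer_id)
    return f"Hello, {name}!" if name else "Unknown customer"


sql_repo = SqlCustomerRepository(sqlite3.connect(":memory:"))
fast_repo = InMemoryCustomerRepository()
for repo in (sql_repo, fast_repo):
    repo.add(1, "Ada")
greetings = [greeting(repo, 1) for repo in (sql_repo, fast_repo)]
```

The domain function greeting never sees SQL, so a test suite can pass it the dict-backed repository while production code passes the SQL-backed one. One caveat of this sketch: the two implementations are not perfectly equivalent (for example, they treat a duplicate id differently), which mirrors the dialect and feature gaps the text warns about.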

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of Open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The first group members write the needed application programs which the application comprises. The latter group members design the needed database and maintain it. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? Indeed a few people actively dislike using SQL in-memory databases under the belief that they encourage spreading either SQL or object-relational mapper code around the domain model. Running SQL in-memory may remove much of the pain of slow access but acts as a deodorant to cover the smell of a missing repository.

6. Why it is needed? Testing is the main driver thus far, but I think there's more to come from in-memory databases. Memory sizes are now enough that many application databases can be loaded into memory. If you use an approach that keeps an event log of all changes to your application state, you can treat the in-memory database as a cache of the result of applying the log, rebuilding it and snapshotting it as you need. Such styles can be very scalable and have high performance in cases where you have lots of readers and few writers. I've run into a few cases where people have used in-memory databases for very high performance applications. A difference here is that these experiences tend to be with niche commercial databases while for testing people seem to prefer open-source. Prevayler got a lot of attention for taking this kind of approach. People I know who tried it found its tight coupling to the in-memory objects and lack of migration tools caused serious problems. But I think the approach of persistent change logs as systems of record is a fertile ground to explore in the future.

7. What are the examples of these database applications?
- Java DB 10.5.1.1
- ASE
- Berkeley DB
- Adaptive Server Enterprise (ASE) 15.5
- Apache Derby
- Altibase
- CSQL
- BlackRay

- Eloquera
- eXtremeDB

Knowledge base database

1. What is this database application? A knowledge base is a special kind of database for knowledge management, providing the means for the computerized collection, organization, and retrieval of knowledge. Also a collection of data representing problems with their solutions and related experiences.

2. What is the architectural design of this database? Knowledge base database supports 64-bit file I/O to allow use of files larger than 4 gigabytes (GB). In addition, physical and logical raw files are supported as data, log, and control files to support Knowledge base database Real Application Clusters (RAC) on Windows and for those cases where performance needs to be maximized. With Knowledge base database 11g Release 1 (11.1), instead of using the operating system kernel NFS client, you can configure Knowledge base database to access NFS V3 servers directly using a Knowledge base internal Direct NFS client. Through this integration, it is able to optimize the I/O path between itself and the NFS server, providing significantly superior performance. In addition, Direct NFS client simplifies and optimizes the NFS client configuration for database workloads. Knowledge base Direct NFS Client currently supports up to four parallel network paths to provide scalability and high availability. Direct NFS Client delivers optimized performance by automatically load balancing requests across all specified paths. If one network path fails, then Direct NFS Client will reissue commands over any remaining paths, ensuring fault tolerance and high availability.

3. How does it run? A knowledge-based system (KBS) is a system that uses artificial intelligence techniques in problem-solving processes to support human decision-making, learning, and action. A comprehensive text and resource, Knowledge-Based Systems provides access to the most current information in KBS and new artificial intelligences, as well as neural networks, fuzzy logic, genetic algorithms, and soft systems. Each of the 12 chapters is designed to be modular, providing instructors with the flexibility to model the book to their own course needs. Exercises are incorporated throughout the text to highlight certain aspects of the material presented and to stimulate thought and discussion. It assumes basic computer science skills and a math background that includes set theory, relations, elementary probability, and introductory concepts of artificial intelligence.

Knowledge bases are essentially closed or open information repositories and can be categorised under two main headings:
- Machine-readable knowledge bases store knowledge in a computer-readable form, usually for the purpose of having automated deductive reasoning applied to them. They contain a set of data, often in the form of rules that describe the knowledge in a logically consistent manner. An ontology can define the structure of stored data: what types of entities are recorded and what their relationships are. Logical operators, such as And (conjunction), Or (disjunction), material implication and negation may be used to build it up from simpler pieces of information. Consequently, classical deduction can be used to reason about the knowledge in the knowledge base. Some machine-readable knowledge bases are used with artificial intelligence, for example as part of an expert system that focuses on a domain like prescription drugs or customs law. Such knowledge bases are also used by the semantic web.
- Human-readable knowledge bases are designed to allow people to retrieve and use the knowledge they contain. They are commonly used to complement a help desk or for sharing information among employees within an organization. They might store troubleshooting information, articles, white papers, user manuals, knowledge tags, or answers to frequently asked questions. Typically, a search engine is used to locate information in the system, or users may browse through a classification scheme.
- A text based system that can include groups of documents including hyperlinks between them is known as a Hypertext System. Hypertext systems support the decision process by relieving the user of the significant effort it takes to relate and remember things. Knowledge bases can exist on both computers and mobile phones in a hypertext format.
- Knowledge base analysis and design (also known as KBAD) is an approach that allows people to conduct analysis and design in a way that results in a knowledge base, which can later be used to make informative decisions. This approach was first implemented by Dr. Steven H. Dam.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of Open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The first group members write the needed application programs which the application comprises. The latter group members design the needed database and maintain it. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.
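The machine-readable case described in this section, rules built from conjunction and implication with deduction applied to them, can be sketched with a tiny forward-chaining loop in Python. This is a toy under stated assumptions: the function forward_chain, the rule format (a set of premises joined by And, implying one conclusion) and the symptom facts are all hypothetical and far simpler than a real expert system.

```python
def forward_chain(facts, rules):
    """Derive everything that follows from the facts.

    Each rule is (premises, conclusion): if every premise in the set is
    known (conjunction), the conclusion becomes known (implication).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True  # a new fact may enable further rules
    return known


# Hypothetical knowledge base for illustration only.
facts = {"has_fever", "has_cough"}
rules = [
    ({"has_fever", "has_cough"}, "suspect_flu"),
    ({"suspect_flu"}, "recommend_rest"),
    ({"has_rash"}, "suspect_allergy"),
]

derived = forward_chain(facts, rules)
```

The loop repeats until no rule adds a new fact, so conclusions that depend on other conclusions are still reached, while rules whose premises are never satisfied stay silent.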

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? Electronic commerce (e-commerce) has been sweeping the globe. Today, it has become a part of the way people live their lives, which makes e-commerce more than a fad. For a long time, information technology was mainly passive; users went online and downloaded data. Now, information technology offers interactive and communication functions. Dynamic Web sites contain Web pages that display constantly changing content, a technique that is an important foundation of e-commerce. There are two ways to achieve dynamic content generation: programmatic content generation, and template-based content generation. Java servlets fall into the first category, while JavaServer Pages (JSP) belong to the second category. JSP is a simple but powerful technology used to generate dynamic HTML on the server side. JSPs are a direct extension of Java servlets and provide a way to separate content generation from content presentation. JSP is more convenient to write by using conventional HTML writing tools and easier to modify because only the dynamic parts need to be changed when updating a web page. Furthermore, JSP with Java is more flexible because they are platform independent. In this project, Tomcat was adopted to be the JSP engine and this thesis designs an online health consulting and shopping center that can provide users personal dietary assessment, special diets for patients, professional knowledge of nutrition, nutritional news, the posting of questions and shopping for nutritional supply products. On the other hand, the project connects to a Microsoft Access database using a type 1 JDBC-ODBC Bridge, plus ODBC driver. In this way, staff can easily manage different kinds of information in the database of this health center.

6. Why it is needed? Ideal for advanced-undergraduate and graduate students, as well as business professionals, this text is designed to help users develop an appreciation of KBS and their architecture and understand a broad variety of knowledge-based techniques for decision support and planning. In relation to information technology (IT), a knowledge base is a machine-readable resource for the dissemination of information, generally online or with the capacity to be put online. An integral component of knowledge management systems, a knowledge base is used to optimize information collection, organization, and retrieval for an organization, or for the general public. In general, a knowledge base is a centralized repository for information: a public library, a database of related information about a particular subject, and whatis.com could all be considered to be examples of knowledge bases. A well-organized knowledge base can save an enterprise money by decreasing the amount of employee time spent trying to find information about, among myriad possibilities, tax laws or company policies and procedures. As a customer relationship management (CRM) tool, a knowledge base can give customers easy access to information that would otherwise require contact with an organization's staff; as a rule, this capacity should make the interaction simpler for both the customer and the organization. A number of software applications are available that allow users to create their own knowledge bases, either separately (these are usually called knowledge management software) or as part of another application, such as a CRM package.

In general, a knowledge base is not a static collection of information, but a dynamic resource that may itself have the capacity to learn, as part of an artificial intelligence (AI) expert system, for example. According to the World Wide Web Consortium (W3C), in the future the Internet may become a vast and complex global knowledge base known as the Semantic Web.

7. What are the examples of these database applications?
- PaperCut KB

Operational database

1. What is this database application? In Data Warehousing, the Operational Database is one which is accessed by an Operational System to carry out regular operations of an organization. Operational Databases usually use an OLTP Database which is optimized for faster transaction processing: Inserting, Deleting, and Updating data. On the other side, Data Warehouses use an OLAP Database (Online Analytical Processing) which is optimized for faster queries. An Operational Database is usually put on a separate machine from the Data Warehouse to increase performance. These databases store detailed data about the operations of an organization. They are typically organized by subject matter and process relatively high volumes of updates using transactions. Essentially every major organization on earth uses such databases. Examples include customer databases that record contact, credit, and demographic information about a business' customers, personnel databases that hold information such as salary, benefits, and skills data about employees, Enterprise resource planning databases that record details about product components and parts inventory, and financial databases that keep track of the organization's money, accounting and financial dealings.

2. What is the architectural design of this database? An operational database contains enterprise data which are up to date and modifiable. In an enterprise data management system, an operational database could be said to be an opposite counterpart of a decision support database, which contains non-modifiable data that are extracted for the purpose of statistical analysis. An example use of a decision support database is that it provides data so that the average salary of many different kinds of workers can be determined, while the operational database contains the same data which would be used to calculate the amount for pay checks of the workers depending on the number of days that they have reported in any given period of time.

Because of the very dynamic nature of an operational database, there are certain issues that need to be addressed appropriately. An operational database can grow very fast in size and bulk, so database administrators and IT analysts must purchase high powered computer hardware and top notch database management systems. Most business organizations have regulations and requirements that dictate storing data for longer periods of time for operation. This can create an even more complex setup in relation to database performance and usability. With ever increasing or expanding operational data volume, operational databases will have additional stress on processing of transactions, leading to slowing down of things. As a general trend, the more data there are in the operational database, the less efficient the transactions running against the database tend to be. There are several reasons for this. One of the most obvious is that table scans need to reference more pages of data before they can give results. Indexes can also grow in size to support larger data volumes, and with this increase, access by the index could degrade as there would be more levels that need to be traversed. Some IT professionals address this problem by having solutions that offload older data to data stores for archive. Operational databases are just part of the entire enterprise data management, and some of the data that need to be archived go directly to the data warehouse.

3. How does it run? An operational database, as the name implies, is the database that is currently and progressively in use, capturing real time data and supplying data for real time computations and other analyzing processes. For example, an operational database is the one which is used for taking orders and fulfilling them in a store, whether it is a traditional store or an online store. Other areas in business that use an operational database are catalog fulfillment systems and any other Point of Sale system which is used in retail stores. An operational database is used for keeping track of payments and inventory. It takes information and amounts from credit cards, and accountants use the operational database because it must balance up to the last penny. An operational database is also used for supporting IRS tax filings and regulations, which is why it is sometimes managed by the IT for the finance and operations groups in a business organization. Companies can seldom run successfully without using an operational database, as this database is based on accounts and transactions.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of Open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The first group members write the needed application programs which the application comprises. The latter group members design the needed database and maintain it. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

Furthermore. personal employee information. The overall idea of using an operational database is to expedite the retrieval of large amounts of information with peak efficiency. Operational databases have the ability to flag specific information that may need to be retrieved on a continuous basis. medical doctors. Operational databases can be known by another name production database. Operational databases need continuous management. Depending on the type of operational database being used will determine how much information it can hold. Where it can be implemented? Operational Databases are very important to a business. Oracle can store larger amounts of information than Access. 3. For instance. Operational databases have the ability to flag specific information that may need to be retrieved on a continuous basis. Storing information in a centralized area can increase retrieval time for users. etc. For instance. Application's end-users (e. an operational database has the ability to be modified. Since day-to-day information is important to a business. and previous proposal information. Known by different names.These people know the application and its end-user interfaces. Operational databases also have other features that focus on the business environment. Storing information in a centralized area can increase retrieval time for users. The overall idea of using an operational database is to expedite the retrieval of large amounts of information with peak efficiency.. 5.product. For instance. Operational databases also have other features that focus on the business environment. Thus. Operational databases can store different types of information such as training status.g. insurance people. they are only indirectly involved with it. gather. or sold separately as an add-on to the DBMS. Storing information in a centralized area can increase retrieval time for users. and previous proposal information. subject to proper DBMS licensing). accountants. 
the management of that information becomes just as important. but need not know nor understand the underlying DBMS. though they are the intended and main beneficiaries of a DBMS. and retrieve specific company information. Why it is needed? Operational databases can store different types of information such as training status. users can misunderstand what the database is supposed to be used for within a business. For instance. These databases allow a business to enter. Operational databases can store different types of information such as training status. Users depend on the accuracy of this information. which may include the DBMS inside (see Embedded database. and previous proposal information. Furthermore. Having someone continually monitor the information being input into the database will make the information retrieved even more valuable because it will be accurate. An important feature of storing information in an operational database is the ability to share information across the company. provide simultaneous read/write requests through pre-defined queries. provide simultaneous read/write requests through pre-defined queries. personal employee information. .) . an operational database has the ability to be modified. personal employee information. Another feature of an operational database is how much information can be stored that pertains to a business. Operational databases are important when information is needed quickly. 6. transaction database could mean that the information stored focuses on financial information even though it may not.
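The pre-defined read/write queries mentioned in this section, and the contrast between pay check calculation and salary statistics, can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions: the attendance table, its columns, the sample employees, and the daily-rate pay rule are all hypothetical and do not come from any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance ("
    " employee TEXT PRIMARY KEY, job TEXT,"
    " daily_rate REAL, days_reported INTEGER)"
)

# Pre-defined (parameterized) queries, as an operational system might use.
ADD_EMPLOYEE = "INSERT INTO attendance VALUES (?, ?, ?, 0)"
REPORT_DAY = (
    "UPDATE attendance SET days_reported = days_reported + 1"
    " WHERE employee = ?"
)
PAY_CHECK = (
    "SELECT daily_rate * days_reported FROM attendance WHERE employee = ?"
)
AVG_RATE_BY_JOB = (
    "SELECT job, AVG(daily_rate) FROM attendance GROUP BY job ORDER BY job"
)


def pay_check(employee):
    """Operational use: compute one worker's pay from the days reported."""
    return conn.execute(PAY_CHECK, (employee,)).fetchone()[0]


def average_rates():
    """Decision-support use: read-only statistics over the same data."""
    return dict(conn.execute(AVG_RATE_BY_JOB).fetchall())


with conn:  # commits the writes as one transaction
    conn.executemany(ADD_EMPLOYEE, [
        ("ana", "clerk", 100.0),
        ("ben", "clerk", 120.0),
        ("cy", "driver", 90.0),
    ])
    for _ in range(3):  # ana reports three working days
        conn.execute(REPORT_DAY, ("ana",))
```

Here `pay_check("ana")` reads the frequently updated operational row, while `average_rates()` only aggregates; in practice the second kind of query would usually run against a separate decision support database or data warehouse.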

The operational database is the source of data for the data warehouse. It contains detailed data used to run the day-to-day operations of the business. The data continually changes as updates are made, and reflects the current value of the last transaction. The operational database is the database of record, consisting of system-specific reference data and event data belonging to a transaction-update system. It may also contain system control data such as indicators, flags, and counters.

7. What are the examples of these database applications?

- MS SharePoint Server

Parallel database

1. What is this database application?

A parallel database, run by a parallel DBMS, seeks to improve performance through parallelization of various operations, such as loading data, building indexes and evaluating queries. Parallel databases improve processing and input/output speeds by using multiple central processing units (CPUs), including multicore processors, and storage such as disks in parallel. In parallel processing, many operations are performed simultaneously, as opposed to serial (sequential) processing, in which the computational steps are performed one after another with no time overlap. Centralized and client-server database systems are not powerful enough to handle such applications. Although data may be stored in a distributed fashion, the distribution is governed solely by performance considerations.

The major parallel DBMS architectures, which are induced by the underlying hardware architecture, are:

- Shared memory architecture, where multiple processors share the main memory space, as well as other data storage.
- Shared disk architecture, where each processing unit (typically consisting of multiple processors) has its own main memory, but all units share the other storage.
- Shared nothing architecture, where each processing unit has its own main memory and other storage.

Parallel processing divides a large task into many smaller tasks, and executes the smaller tasks concurrently on several nodes. As a result, the larger task completes more quickly.

2. What is the architectural design of this database?

A parallel processing system has the following characteristics:

- Each processor in a system can perform tasks concurrently.
- Tasks may need to be synchronized.
- Nodes usually share resources, such as data, disks, and other devices.

3. How does it run?

It runs using one of these methods:

- Shared memory architecture, where multiple processors share the main memory space, as well as mass storage (e.g. hard disk drives).
- Shared disk architecture, where each node has its own main memory, but all nodes share mass storage, usually a storage area network. In practice, each node usually also has multiple processors.
- Shared nothing architecture, where each node has its own mass storage as well as main memory.
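The idea of dividing a large task into smaller tasks that run concurrently on several nodes can be sketched as follows. This is a rough illustration only, written in the spirit of the shared nothing method: worker threads stand in for nodes, each "node" scans only its own partition, and the partial results are merged at the end. The function names and the sample rows are hypothetical, and Python threads do not give true CPU parallelism, so the sketch shows the structure rather than the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, node_count):
    """Hash-partition rows by key so each node owns a disjoint slice."""
    nodes = [[] for _ in range(node_count)]
    for key, amount in rows:
        nodes[hash(key) % node_count].append((key, amount))
    return nodes


def local_sum(node_rows):
    """Smaller task: a node scans only its own partition."""
    return sum(amount for _, amount in node_rows)


def parallel_total(rows, node_count=4):
    """Split the large task, run the pieces concurrently, then merge."""
    parts = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, parts))
    return sum(partials)  # merge step, done once by the coordinator


rows = [(f"cust{i}", i) for i in range(1, 101)]
total = parallel_total(rows)
```

The merge step at the end is the only point where the tasks have to be synchronized, which is why an aggregate such as a sum is a good candidate for this approach.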

Note: A node is a separate processor, often on a separate machine. Multiple processors, however, can reside on a single machine.

Some tasks can be effectively divided, and thus are good candidates for parallel processing. Other tasks, however, do not lend themselves to this approach. For example, in a bank with only one teller, all customers must form a single queue to be served. With two tellers, the task can be effectively split so that customers form two queues and are served twice as fast, or they can form a single queue to provide fairness. This is an instance in which parallel processing is an effective solution. By contrast, if the bank manager must approve all loan requests, parallel processing will not necessarily speed up the flow of loans. No matter how many tellers are available to process loans, all the requests must form a single queue for bank manager approval. No amount of parallel processing can overcome this built-in bottleneck to the system.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g. Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g. MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.
2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.
3. Application's end-users (e.g. accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Parallel database software is often specialized, usually to serve as query processors. Since they are designed to serve a single function, however, specialized servers do not provide a common foundation for integrated operations. These include online decision support, batch reporting, data warehousing, OLTP, distributed operations, and high availability systems. Specialized servers have been used most successfully in the area of very large databases: in DSS applications, for example.

Versatile parallel database software should offer excellent price/performance on open systems hardware, and be designed to serve a wide variety of enterprise computing needs. Features such as online backup, data replication, portability, interoperability, and support for a wide variety of client tools can enable a parallel server to support application integration, distributed operations, and mixed application workloads.

6. Why it is needed?

A variety of hardware architectures allow multiple computers to share access to data, software, or peripheral devices. A parallel database is designed to take advantage of such architectures by running multiple instances which "share" a single physical database. In appropriate applications, a parallel server can allow access to a single database by users on multiple machines, with increased performance.

Parallel database software must effectively deploy the system's processing power to handle diverse applications: online transaction processing (OLTP) applications, decision support system (DSS) applications, as well as a mixed OLTP and DSS workload. OLTP applications are characterized by short transactions which have low CPU and I/O usage. DSS applications are characterized by long transactions, with high CPU and I/O usage.

A parallel server processes transactions in parallel by servicing a stream of transactions using multiple CPUs on different nodes, where each CPU processes an entire transaction. Using parallel data manipulation language, one transaction can also be performed by multiple nodes. This is an efficient approach because many applications consist of online insert and update transactions which tend to have short data access requirements. In addition to balancing the workload among CPUs, the parallel database provides for concurrent access to data and protects data integrity.

7. What are the examples of these database applications?

- Non-Uniform Memory Architecture (NUMA), which involves Non-Uniform Memory Access.
- Cluster (shared nothing + shared disk: SAN/NAS), which is formed by a group of connected computers.

Real-time database

1. What is this database application?

A real-time database is a processing system designed to handle workloads whose state is constantly changing (Buchmann). This differs from traditional databases containing persistent data, mostly unaffected by time. For example, a stock market changes very rapidly and is dynamic. The graphs of the different markets appear to be very unstable and yet a database has to keep track of current values for all of the markets of the New York Stock Exchange (Kanitkar). Real-time processing means that a transaction is processed fast enough for the result to come back and be acted on right away (Capron). Real-time databases are useful for accounting, banking, law, medical records, multi-media, process control, reservation systems, and scientific data analysis (Snodgrass). As computers increase in power and can store more data, they are integrating themselves into our society and are employed in many applications.

2. What is the architectural design of this database?

Although the real-time database system may seem like a simple system, problems arise during overload when two or more database transactions require access to the same portion of the database. A transaction is usually the result of an execution of a program that accesses or changes the contents of a database (Singhal). A transaction is different from a stream because a stream only allows read-only operations, and transactions can do both read and write operations. This means in a stream, multiple users can read from the same piece of data, but they cannot both modify it (Abbot). A database must let only one transaction operate at a time to preserve data consistency. For example, if two students demand to take the remaining spot for a section of a class and they hit submit at the same time, only one student should be able to register for it (Abbot).

Real-time databases can process these requests utilizing scheduling algorithms for concurrency control, prioritizing both students' requests in some way. Throughout this discussion, we assume that the system has a single processor, a disk-based database, and a main memory pool (Haritsa).

In real-time databases, deadlines are formed and different kinds of systems respond differently to data that does not meet its deadline. In a real-time system, each transaction uses a timestamp to schedule the transactions (Abbot). A priority mapper unit assigns a level of importance to each transaction upon its arrival in the database system that is dependent on how the system views times and other priorities. The timestamp method relies on the arrival time in the system. Researchers indicate that for most studies, transactions are sporadic with unpredictable arrival times. For example, the system gives an earlier request deadline a higher priority and a later deadline a lower priority (Haritsa).
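The priority mapper and the class registration example can be sketched in a few lines. This is a minimal illustration, not the design of any real RTDBS: it assumes the single-processor model stated above, maps each transaction to a priority by its deadline (earlier deadline first, arrival timestamp as tie-breaker), and runs transactions strictly one at a time. The function names, student names, and time values are hypothetical.

```python
import heapq


def priority_mapper(arrival, deadline):
    """Map a transaction to a sortable priority.

    A smaller tuple means more urgent: earliest deadline first,
    with the arrival timestamp breaking ties.
    """
    return (deadline, arrival)


def run_schedule(transactions, seats_left):
    """Run transactions one at a time, in priority order.

    Each transaction is (student, arrival, deadline) and tries to take
    one seat. Returns the students who registered and those rejected.
    """
    queue = [
        (priority_mapper(arrival, deadline), student)
        for student, arrival, deadline in transactions
    ]
    heapq.heapify(queue)
    registered, rejected = [], []
    while queue:
        _, student = heapq.heappop(queue)
        if seats_left > 0:
            seats_left -= 1  # the write happens inside one transaction
            registered.append(student)
        else:
            rejected.append(student)
    return registered, rejected


# Two students submit at the same instant for the last remaining seat.
requests = [("lee", 10.0, 30.0), ("kim", 10.0, 25.0)]
registered, rejected = run_schedule(requests, seats_left=1)
```

Because only one transaction operates on the seat count at a time, exactly one of the two students can register, which is the consistency property the registration example calls for.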

Assigning a higher priority to an earlier deadline and a lower priority to a later one (Haritsa) is one of several possible policies. Below is a comparison of different scheduling algorithms. Here PT is the priority of a transaction, DT its deadline and VT its value; under these formulas a smaller PT means the transaction is served sooner.

- Earliest Deadline (PT = DT): The value of a transaction is not important. An example is a group of people calling to order a product.
- Highest Value (PT = 1/VT): The deadline is not important. Some transactions should get to the CPU based on criticalness, not fairness. This is an example of least slack that can wait the least amount of time. If the telephone switchboards were overloaded, people who call 911 should get priority (Snodgrass).
- Value Inflated Deadline (PT = DT/VT): Gives equal weight to deadline and values based on scheduling. An example is registering for classes where the student selects a block of classes that he wishes to take and presses submit. In this scenario, higher priorities often take precedence. A school registration system probably uses this technique when the server receives two registration transactions. If one student had 22 credits and the other had 100 credits, the person with 100 credits would take priority (value-based scheduling).

3. How does it run?

Database management systems provide tools for organizing data, so in recent years there has been interest in "merging" database and real-time technology. The resulting integrated system, which provides database operations with real-time constraints, is generally called a real-time database system (RTDBS) [1]. Like a conventional database system, a RTDBS functions as a repository of data, provides efficient storage, and performs retrieval and manipulation of information. However, as a part of a real-time system, whose "tasks" are associated with time constraints, a RTDBS has the added burden of ensuring some degree of confidence in meeting the system's timing requirements.

Example applications that handle large amounts of data and have stringent timing requirements include telephone switching (e.g. translating an 800 number into an actual number), radar tracking and others. Arbitrage trading, for example, involves trading commodities in different markets at different prices. Since price discrepancies are usually short-lived, automated searching and processing of large amounts of trading information are very desirable. In order to capitalize on the opportunities, buy-sell decisions have to be made promptly, often with a time constraint so that the financial overhead in performing the trade actions is well compensated by the benefit resulting from the trade. As another example, a radar surveillance system detects aircraft "images" or "radar signatures".

4. Who are the people involved in this database application? What are their functions?

For example, a stock market changes very rapidly and is dynamic. The graphs of the different markets appear to be very unstable and yet a database has to keep track of current values for all of the markets of the New York Stock Exchange (Kanitkar).

5. Where it can be implemented?

Real-time databases are traditional databases that use an extension to give the additional power to yield reliable responses. They use timing constraints that represent a certain range of values for which the data are valid. This range is called temporal validity. A conventional database cannot work under these circumstances because the inconsistencies between the real world objects and the data that represents them are too severe for simple modifications. An effective system needs to be able to handle time-sensitive queries, return only temporally valid data, and support priority scheduling.

To enter the data in the records, often a sensor or an input device monitors the state of the physical system and updates the database with new information to reflect the physical system more accurately (Abbot). When designing a real-time database system, one should consider how to represent valid time and how facts are associated with the real-time system. Also, consider how to represent attribute values in the database so that process transactions and data consistency have no violations.

When designing a system, it is important to consider what the system should do when deadlines are not met. For example, an air-traffic control system constantly monitors hundreds of aircraft and makes decisions about incoming flight paths and determines the order in which aircraft should land based on data such as fuel, altitude, and speed. If any of this information is late, the result could be devastating. To address issues of obsolete data, the timestamp can support transactions by providing clear time references.

6. Why it is needed?

Real-time databases are traditional databases that use an extension to give the additional power to yield reliable responses. An additional way of dealing with conflict resolution in a real-time database system besides deadlines is a wait policy method. This process helps ensure the latest information in time-critical systems. The policy avoids conflict by asking all non-requesting blocks to wait until the most essential block of data is processed (Abbot). While studies in labs have found that data-deadline based policies do not improve performance significantly, the forced wait policy can improve performance by 50 percent (Porkka).

The forced wait policy may involve waiting for higher priority transactions to process in order to prevent deadlock. Another example of when data can be delayed is when a block of data is about to expire. The forced wait policy delays processing until the data is updated using new input data. The latter method helps increase the accuracy of the system and can cut down on the number of necessary processes that are aborted (Kang). Generally, relying on wait policies is not optimal (Kang).

7. What are the examples of these database applications?

- ADDM
- Real-Time Database Systems 2001

Spatial database

1. What is this database application?

A spatial database is a database that is optimized to store and query data that is related to objects in space, including points, lines and polygons. While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. These are typically called geometry or feature. The Open Geospatial Consortium created the Simple Features specification and sets standards for adding spatial functionality to database systems.

2. What is the architectural design of this database?

Database systems use indexes to quickly look up values and the way that most databases index data is not optimal for spatial queries. Instead, spatial databases use a spatial index to speed up database operations.

In addition to typical SQL queries such as SELECT statements, spatial databases can perform a wide variety of spatial operations. The following query types and many more are supported by the Open Geospatial Consortium:

- Spatial Measurements: Finds the distance between points, polygon area, etc.
- Spatial Functions: Modify existing features to create new ones, for example by providing a buffer around them, intersecting features, etc.
- Spatial Predicates: Allows true/false queries such as 'is there a residence located within a mile of the area we are planning to build the landfill?' (see DE-9IM)
- Constructor Functions: Creates new features with an SQL query specifying the vertices (points of nodes) which can make up lines. If the first and last vertex of a line are identical the feature can also be of the type polygon (a closed line).
- Observer Functions: Queries which return specific information about a feature such as the location of the center of a circle

Not all spatial databases support these query types.

3. How does it run?
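As a rough illustration of how a few of the query types listed above might run, here is a minimal pure-Python sketch. It is not the Simple Features API or any product's implementation; the function names and the landfill and residence coordinates are made up for illustration, and a real spatial database would run such operations in SQL with the help of a spatial index.

```python
import math


def distance(p, q):
    """Spatial measurement: straight-line distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def is_polygon(vertices):
    """Constructor check: a line whose first and last vertex match is closed."""
    return len(vertices) >= 4 and vertices[0] == vertices[-1]


def area(polygon):
    """Spatial measurement: polygon area by the shoelace formula."""
    pairs = zip(polygon, polygon[1:])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


def contains(polygon, point):
    """Spatial predicate: ray-casting test for a point inside a polygon."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounding_box(vertices):
    """Observer function: minimum bounding rectangle of a feature."""
    xs, ys = zip(*vertices)
    return (min(xs), min(ys), max(xs), max(ys))


# A closed line (first vertex repeated at the end) forms a polygon.
landfill = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
residence = (5, 3)
```

With these helpers, a true/false question such as whether the residence lies inside the planned landfill becomes `contains(landfill, residence)`, and the distance to its nearest corner is `distance(residence, (4, 3))`.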

A data warehouse system gets some data from different GDLs and integrates them in a single database. Users, instead of consulting several different GDLs, could do a one-stop initial search in this data warehouse and find the preliminary information they need, using a computerized procedure to define their needs (the system could perform a preliminary but very useful filtering of the data sets). If more precise information is needed about the successful data sets, users could go from there to the legacy GDL's data source to find what they are looking for. Using such an architecture greatly reduces the problems previously mentioned (and completely eliminates them if the metadata stored/derived in the warehouse are sufficient).

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g. Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g. MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.
2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.
3. Application's end-users (e.g. accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

We proposed a storage and access architecture for geographic database systems. This architecture integrates a number of various concepts and techniques for efficient query processing. Small queries are processed by single page accesses as described before. If a range query specifies a larger query region, all scenes intersecting the query region, i.e. subtrees of the R*-tree, are transferred into the main memory. For each scene just one search operation on secondary storage is necessary. Using our scene architecture, small queries as well as large queries are processed efficiently.

- All OpenGIS Specifications compliant products
- Open source spatial databases and APIs, some of which are OpenGIS compliant
- Boeing's Spatial Query Server spatially enables Sybase ASE.
- Smallworld VMDS, the native GE Smallworld GIS database
- SpatiaLite extends Sqlite with spatial datatypes, functions, and utilities.

- IBM DB2 Spatial Extender can be used to enable any edition of DB2, including the free DB2 Express-C, with support for spatial types
- Oracle Spatial
- Microsoft SQL Server has support for spatial types since version 2008
- PostgreSQL DBMS (database management system) uses the spatial extension PostGIS to implement the standardized datatype geometry and corresponding functions.
- MySQL DBMS implements the datatype geometry plus some spatial functions that haven't been implemented according to the OpenGIS specifications. Functions that test spatial relationships are limited to working with minimum bounding rectangles rather than the actual geometries. MySQL versions earlier than 5.0.16 only supported spatial data in MyISAM tables. As of MySQL 5.0.16, InnoDB, NDB, BDB, and ARCHIVE also support spatial features.
- Neo4j - Graph database that can build 1D and 2D indexes as Btree, Quadtree and Hilbert curve directly in the graph
- AllegroGraph - a Graph database that provides a novel mechanism for efficient storage and retrieval of two-dimensional geospatial coordinates for Resource Description Framework data. It includes extension syntax for SPARQL queries
- MongoDB supports geospatial indexes in 2D

6. Why it is needed?

It creates flexible and scalable solutions that integrate easily into any organization's IT environment by adopting security policies, adapting to corporate standards, and adjusting to multiple types of user-profile demands. Organizations, enterprises, and governments manage and display spatial data in an intuitive, easy-to-understand format that facilitates collaboration and data interoperability. It happens when all the pieces of spatial data infrastructure are in place, following OGC, ISO, and INSPIRE standards.

7. What are the examples of these database applications?

- Census Data
- NASA satellites imagery - terabytes of data per day
- Weather and Climate Data
- Rivers, Farms, ecological impact
- Medical Imaging
- Road Map

Common spatial index methods include:

- Grid (spatial index)
- Z-order (curve)
- Quadtree
- Octree
- UB-tree
- R-tree: R+ tree, R* tree, Hilbert R-tree
- X-tree
- kd-tree

Temporal database

1. What is this database application?

A temporal database is a database with built-in time aspects. A time period attached to the data expresses when it was valid or stored in the database. By attaching a time period to the data, it becomes possible to store different database states. The valid time stresses the period for which a fact is true in the real world. Transaction time is the time a transaction was made.

2. What is the architectural design of this database?

A bi-temporal relation contains both valid and transaction time. This is good because it provides both temporal rollback and historical information. Historical information (e.g.: "Where did John live in 1992?") can be derived from valid time. Temporal rollback (e.g.: "In 1992, where did the database believe John lived?") is provided by the transaction time. The answers to these example questions may not be identical - the database may have been altered since 1992, causing the queries to produce different results.

3. How does it run?

More specifically, the temporal aspects usually include valid-time and transaction-time. These attributes go together to form bitemporal data.

- Valid time denotes the time period during which a fact is true with respect to the real world.
- Transaction time is the time period during which a fact is stored in the database.
- Bitemporal data combines both Valid and Transaction Time.

Note that these two time periods do not have to be the same for a single fact. Imagine that we come up with a temporal database storing data about the 18th century. The valid time of these facts is somewhere between 1701 and 1800, whereas the transaction time starts when we insert the facts into the database, for example, January 21, 1998.

To record transaction time, two more fields are added to the Person table: Transaction-From and Transaction-To. Transaction-From is the time a transaction was made, and Transaction-To is the time that the transaction was superseded (or infinity if it has not yet been superseded). This enables queries that show the state of the database at a given time.

It is possible to have timelines other than Valid Time and Transaction Time, such as Decision Time, in the database. In that case the database would be called a multitemporal database as opposed to a bitemporal database. However, this approach introduces additional complexities such as dealing with the validity of (foreign) keys.

In order to achieve perfect archival quality it is of key importance to store the data under the schema version under which they first appeared. However, even the simplest temporal query rewriting the history of an attribute value would have to be manually rewritten under each of the schema versions.

-TimeDB is a free temporal relational DBMS by TimeConsult. in the case of Open source DBMSs (e.) . etc. -Provide query algebra to deal with temporal data.. medical doctors. IBM. 2. Application developers and Database administrators . -Prevent fragmentation of an object description. They are typically the employees of a DBMS vendor (e. Teradata version 13. subject to proper DBMS licensing).These are the people that design and build the DBMS product. A common solution is to provide automatic query rewriting. which may include the DBMS inside (see Embedded database. What are the examples of these database applications? . proposed and historical versions of data in the same database.a feature of Oracle Database. Microsoft. or.10 has temporal features built into the database. Why it is needed? y y y y y -It is use in dealing with variation of data over time. Who are the people involved in this database application? What are their functions? 1. y y y 6. The first group members write the needed application programs which the application comprises.attribute value would be required to be manually rewritten under each of the schema versions. DBMS development is a complicated task. insurance people. Application's end-users (e. Oracle.g. volunteers or people supported by interested companies and organizations. The latest version complies with TSQL2. accountants.g. -Teradata version13. It runs as a frontend to Oracle that accepts TSQL2 statements and generates SQL92 statements. The latter group members design the needed database and maintain it. Sometimes the application itself is packaged and sold as a separate product. and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades. 4. Sybase). Thus. 3. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. and the only ones who touch its code. but need not know nor understand the underlying DBMS. 
or sold separately as an add-on to the DBMS.. MySQL). 7. They are typically skilled systems programmers. DBMS developers .These are the people that design and build a database-based application that uses the DBMS. though they are the intended and main beneficiaries of a DBMS.. -Compatible with old database without temporal data. Where it can be implemented? The following implementations implement a bitemporal database in a relational database management system (RDBMS): y -Oracle Workspace Manager. enables application developers and DBAs to manage current.g. -PostgreSQL has an open-source contributed package that can be installed in the database to manage temporal data. they are only indirectly involved with it. -Identification of an appropriate data type for time.These people know the application and its end-user interfaces. 5.10 has temporal features built into the database. This process would be particularly taxing for users.
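As an illustration of the valid-time and transaction-time questions discussed for temporal databases, the following is a minimal sketch in Python using the standard sqlite3 module. It is not how Oracle, TimeDB, PostgreSQL or Teradata implement temporal support. The table name person_history, the column names, the use of whole years, the sentinel INFINITY, the city names and the helper lived_in are all assumptions made for this example; only the Transaction-From/Transaction-To idea and the two questions about John in 1992 come from the text.

```python
import sqlite3

INFINITY = 9999  # assumed sentinel year: "still valid" or "not yet superseded"


def build_db():
    """Create an in-memory bitemporal table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE person_history (
               name TEXT, city TEXT,
               valid_from INTEGER, valid_to INTEGER,  -- valid time (real world)
               tx_from INTEGER, tx_to INTEGER         -- transaction time (database)
           )"""
    )
    rows = [
        # Recorded in 1990: John lives in Smallville from 1990 onward.
        # This belief was superseded in 1995.
        ("John", "Smallville", 1990, INFINITY, 1990, 1995),
        # Correction recorded in 1995: John had actually moved to Bigtown in 1991.
        ("John", "Smallville", 1990, 1991, 1995, INFINITY),
        ("John", "Bigtown", 1991, INFINITY, 1995, INFINITY),
    ]
    conn.executemany("INSERT INTO person_history VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def lived_in(conn, name, year, as_of):
    """City where `name` lived in `year`, as the database believed at `as_of`.

    Both periods are treated as half-open intervals [from, to).
    """
    row = conn.execute(
        """SELECT city FROM person_history
           WHERE name = ?
             AND valid_from <= ? AND ? < valid_to
             AND tx_from <= ? AND ? < tx_to""",
        (name, year, year, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


conn = build_db()
# Historical information: "Where did John live in 1992?" (current belief)
historical = lived_in(conn, "John", 1992, as_of=2000)
# Temporal rollback: "In 1992, where did the database believe John lived?"
rollback = lived_in(conn, "John", 1992, as_of=1992)
print(historical, rollback)  # prints: Bigtown Smallville
```

The two answers differ because the sample data was corrected in 1995, which is the situation the text describes when it says the database may have been altered since 1992.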

7. What are the examples of these database applications?
- A temporal data model
- A temporal version of Structured Query Language
- Short biography

Unstructured-data database

1. What is this database application? An unstructured-data database is intended to store in a manageable and protected way diverse objects that do not fit naturally and conveniently in common databases. It may include email messages, documents, journals, multimedia objects, etc. The name may be misleading since some objects can be highly structured. However, the entire possible object collection does not fit into a predefined structured framework. Most established DBMSs now support unstructured data in various ways, and new dedicated DBMSs are emerging.

2. What is the architectural design of this database? Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional computer programs as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents. The term is imprecise for several reasons:
- structure, while not formally defined, can still be implied;
- data with some form of structure may still be characterized as unstructured if its structure is not helpful for the desired processing task; and
- unstructured information might have some structure (semi-structured) or even be highly structured but in ways that are unanticipated or unannounced.

3. How does it run?

Unstructured information represents the vast majority of the data collected and accessible to enterprises. Creating order from an abundance of sources in many formats, unstructured information management (UIM) makes this data accessible and searchable via management systems and applications, making it understandable to the users. UIM applications are software systems that analyze unstructured information (text, audio, video, images, etc.) to discover, organize, and deliver relevant knowledge to the user. In analyzing unstructured information, UIM applications make use of a variety of analysis technologies, including statistical and rule-based Natural Language Processing (NLP), Information Retrieval (IR), machine learning, and ontology.

4. Who are the people involved in this database application? What are their functions?
1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.
2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database, subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.
3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? Data mining, text analytics and noisy text analytics techniques are different methods used to find patterns in, or otherwise interpret, this information. Common techniques for structuring text usually involve manual tagging with metadata or part-of-speech tagging for further text mining-based structuring. UIMA (Unstructured Information Management Architecture) provides a common framework for processing this information to extract meaning and create structured data about the information.

6. Why it is needed? It creates machine-processable structure that exploits the linguistic, auditory, and visual structure that is inherent in all forms of human communication. This inherent structure can be inferred from text, for instance, by examining word morphology, sentence syntax, and other small- and large-scale patterns. Unstructured information can then be enriched and tagged to address ambiguities, and relevancy-based techniques can then be used to facilitate search and discovery. It is also needed because of its:
- Seemingly infinite scalability
- High availability
- Extraordinary fault tolerance
- Design-friendly lack of schema
- Integration of both RESTful and cloud computing technologies
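The idea that structure can be inferred from text-heavy information and then attached as tags can be illustrated with a very small sketch. It uses only Python's standard re module with hand-written patterns, which is far simpler than the NLP or UIMA-style analysis mentioned in this section. The function name tag_text, the three patterns (ISO-style dates, e-mail addresses, plain numbers) and the sample message are assumptions made for illustration only.

```python
import re

# Assumed, deliberately simple patterns; real text analytics needs far more.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")            # e.g. 2011-12-12
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
NUMBER_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\b")


def tag_text(text):
    """Extract dates, e-mail addresses and numbers from free text as metadata tags."""
    dates = DATE_PATTERN.findall(text)
    # Blank out dates first so their digits are not counted again as numbers.
    remaining = DATE_PATTERN.sub(" ", text)
    emails = EMAIL_PATTERN.findall(remaining)
    # Blank out e-mail addresses too, in case they contain digits.
    remaining = EMAIL_PATTERN.sub(" ", remaining)
    numbers = NUMBER_PATTERN.findall(remaining)
    return {"dates": dates, "emails": emails, "numbers": numbers}


message = "Invoice 42 was sent to ana@example.com on 2011-12-12 for 199.50 pesos."
tags = tag_text(message)
print(tags)
```

The resulting dictionary could be stored alongside the original text so that the message becomes searchable by date, sender or amount, which is the kind of enrichment the section describes.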

7. What are the examples of these database applications?
- Books
- Journals
- Documents
- Metadata
- Health records
- Audio
- Video
- Files
- Unstructured text (such as the body of an e-mail message, Web page, or word processor document)

Niño. 2011 . Eliza Mapanoo Submitted On: December 12. City of Biñan. Jocson. Karl Neil Anthony C.University of Perpetual Help System Laguna Sto. Laguna Project in Database Management Submitted By: Garde II. Henry D. Submitted To: Mrs. Verlourd S. Russelle Ralph Ubando. Espeleta.