Active database

1. What is this database application?
An active database is a database that includes an event-driven architecture which can respond to conditions both inside and outside the database. Possible uses include security monitoring, alerting, statistics gathering and authorization. Most modern relational databases include active database features in the form of database triggers.

2. What is the architectural design of this database?
Active databases extend conventional database functionality with event-based rule processing. The behavior of an active database is defined by a set of ECA rules (Event-Condition-Action rules) associated with the database. Once a certain event is detected, the relevant rules are triggered. Triggering a rule involves evaluating a condition against the database and, if it holds, executing the corresponding action. An active database gains its power from the range of events it can respond to and the types of actions it can perform in response. Active databases support the creation of triggers, which fire when certain operations occur on the database. Most commercial relational databases today include active database capabilities (e.g. Microsoft SQL Server, Oracle, Postgres, Sybase and Teradata), and all of these support SQL triggers.

3. How does it run?

Active database systems support mechanisms that enable them to respond automatically to events taking place either inside or outside the database system itself. Considerable effort has been directed towards improving understanding of such systems in recent years, and many different proposals have been made and applications suggested. This high level of activity has not yielded a single agreed-upon standard approach to the integration of active functionality with conventional database systems, but it has led to improved understanding of active behavior description languages, execution models, and architectures. This survey presents the fundamental characteristics of active database systems, describes a collection of representative systems within a common framework, considers the consequences for implementations of certain design decisions, and discusses tools for developing active applications.

4. Who are the people involved in this database application? What are their functions?
There is no doubt that computers have changed people's lives. Nowadays almost every kind of task is performed on a computer of one kind or another, and their popularity keeps increasing, as does the development of new technology. Technology can make one's life easier in many ways, and one of them is the database. Databases are designed to help people keep their data organized and to let them select only the data they need without losing any information. There are different types of database software; this section covers active databases.

5. When can it be implemented?
An active database system, in contrast to a conventional passive one, is a database system that monitors situations of interest and, when they occur, triggers an appropriate response in a timely manner.
The desired behavior is expressed in production rules (also called event-condition-action rules), which are defined and stored in the database. This has the benefit that the rules can be shared by many application programs, and the database system can optimize their implementation. The production rule paradigm originated in the field of Artificial Intelligence (AI) with expert system rule languages such as OPS5.

6. Why is it needed?
Active database systems aim to represent more real-world semantics in the database by supporting event-condition-action rules (ECA rules). ECA rules can be interpreted as "when the specified event occurs and the condition holds, execute the action." An event indicates the point in time when some sort of reaction is required from the DBMS. For primitive events, this point in time can be specified by an occurrence in the database, by an occurrence in the DBMS, or by an occurrence in the database environment. For composite events, the point in time is defined on the basis of other points in time which represent other primitive and/or composite events (called component events). These components are combined by means of event constructors such as negation, conjunction, disjunction, and sequence. The action describes the processing to carry out when a specific event happens and the condition holds. The potential uses of reactive behavior are significant: active rules support data derivations, integrity maintenance, workflow management, replication management, and more.

7. What are the examples of this database application?

- SQL'99
- CS561
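
The event-condition-action pattern described in this section can be illustrated with an SQL trigger. The following is a minimal sketch using Python's built-in sqlite3 module; the table names, the trigger name and the overdraft rule are assumptions made up for the illustration, not something taken from a particular active database product.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one ECA-style trigger (illustrative names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
        CREATE TABLE alert (account_id INTEGER, message TEXT);

        -- Event:     an UPDATE of the balance column on account
        -- Condition: the new balance is negative
        -- Action:    record a row in the alert table
        CREATE TRIGGER overdraft_alert
        AFTER UPDATE OF balance ON account
        WHEN NEW.balance < 0
        BEGIN
            INSERT INTO alert (account_id, message)
            VALUES (NEW.id, 'overdrawn');
        END;
        """
    )
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    return conn


def alerts_after_update(conn: sqlite3.Connection, new_balance: int) -> list:
    """Set the balance of account 1 and return every alert recorded so far."""
    conn.execute("UPDATE account SET balance = ? WHERE id = 1", (new_balance,))
    return conn.execute("SELECT account_id, message FROM alert").fetchall()
```

The application only issues the UPDATE; the database itself evaluates the condition and performs the action, which is what makes the rule shareable across application programs.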

Cloud database

1. What is this database application?
A cloud database is a database that relies on cloud technology. Both the database and most of its DBMS reside remotely, "in the cloud," while its applications are developed by programmers and later maintained and used by end users through a web browser and open APIs. More and more such database products are emerging, both from new vendors and from virtually all established database vendors.

2. What is the architectural design of this database?

Most database services offer web-based consoles, which the end user can use to provision and configure database instances. For example, the Amazon Web Services web console enables users to launch database instances, create snapshots (similar to backups) of databases, and monitor database statistics.
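
The console operations mentioned above (launching an instance, taking a snapshot, reading statistics) can be sketched as a small in-memory model. This is only an illustration: the class and method names below are hypothetical and do not correspond to the Amazon Web Services API or to any other vendor's API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Instance:
    """A toy database instance; rows stands in for the stored data."""
    name: str
    storage_gb: int
    rows: Dict[str, str] = field(default_factory=dict)


class DatabaseService:
    """Hypothetical stand-in for a provider's management console."""

    def __init__(self) -> None:
        self._instances: Dict[str, Instance] = {}
        self._snapshots: Dict[str, Dict[str, str]] = {}

    def launch_instance(self, name: str, storage_gb: int) -> Instance:
        if name in self._instances:
            raise ValueError(f"instance {name!r} already exists")
        instance = Instance(name=name, storage_gb=storage_gb)
        self._instances[name] = instance
        return instance

    def create_snapshot(self, instance_name: str, snapshot_id: str) -> None:
        # Copy the rows so that later writes do not alter the snapshot.
        self._snapshots[snapshot_id] = dict(self._instances[instance_name].rows)

    def snapshot_contents(self, snapshot_id: str) -> Dict[str, str]:
        return dict(self._snapshots[snapshot_id])

    def statistics(self, instance_name: str) -> Dict[str, int]:
        instance = self._instances[instance_name]
        return {"storage_gb": instance.storage_gb, "row_count": len(instance.rows)}
```

The point of the sketch is the separation of roles: the user asks the service for an instance or a snapshot, and the service decides how and where that is carried out.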

Database services consist of a database manager component, which controls the underlying database instances using a service API. The service API is exposed to the end user, and permits users to perform maintenance and scaling operations on their database instances. For example, the Amazon Relational Database Service's service API enables creating a database instance, modifying the resources available to a database instance, deleting a database instance, creating a snapshot (similar to a backup) of a database, and restoring a database from a snapshot. Database services take care of scalability and high availability of the database. Scalability features differ between vendors: some offer auto-scaling, while others enable the user to scale up using an API but do not scale automatically. There is typically a commitment to a certain level of availability (e.g. 99.9% or 99.99%).

3. How does it run?
There are two primary methods to run a database on the cloud:

- Virtual Machine Image: cloud platforms allow users to purchase virtual machine instances for a limited time, and it is possible to run a database on these virtual machines. Users can either upload their own machine image with a database installed on it, or use ready-made machine images that already include an optimized installation of a database. For example, Oracle provides a ready-made machine image with an installation of Oracle Database 11g Enterprise Edition on Amazon EC2.
- Database as a Service: some cloud platforms offer options for using a database as a service, without physically launching a virtual machine instance for the database. In this configuration, application owners do not have to install and maintain the database on their own. Instead, the database service provider takes responsibility for installing and maintaining the database, and application owners pay according to their usage. For example, Amazon Web Services provides two database services as part of its cloud offering: SimpleDB, which is a NoSQL key-value store, and Amazon Relational Database Service, which is an SQL-based database service with a MySQL interface.

A third option is managed database hosting on the cloud, where the database is not offered as a service, but the cloud provider hosts the database and manages it on the application owner's behalf. For example, cloud provider Rackspace offers managed hosting for MySQL databases.

4. Who are the people involved in this database application? What are their functions?
To many of us today, the cloud seems like a bit of magic. We often simply use the services of a cloud-based system without really thinking about where the cloud is located, or who keeps it running. Ultimately, behind every "cloud" there are real people managing real machines. What is marketed as a cloud is really a rack of machines, with a very real person who has to keep them running. To that person, the administrator, the cloud isn't "in the cloud"; it's in their own data center. The administrator must put together a set of machines, software and administrative tools that enable everything to be viewed in a completely hands-off way by the users, so that they think of it as a "cloud".

The challenge vendors face when trying to market and sell something for the cloud is that the definition of the cloud is so broad and varied. What passes as the cloud to one person is simply a set of machines to another person. Our new SQL Anywhere OnDemand Edition (code named "Fuji"), currently in beta test, is one such product designed to help the administrator of those machines to create a data cloud. While the administrator certainly will know what machines are in use, where the database servers are running and where the databases are located, the end user will simply view their database as being "in the cloud".

The administrator of the cloud system will use one of the primary components of Fuji: the administrative console. The console is designed to enable an administrator to easily keep track of the various host machines that are part of the system, the SQL Anywhere servers that are running on each host, and the databases being served by each database server. The console also provides access to the various tasks that an administrator might want to execute on their running cloud, including starting and stopping database servers, adding a new database into the cloud, setting up high availability for a database, and backup/restore operations. The console is completely web-driven, so it can be accessed using a standard Flash-enabled web browser.

5. Where can it be implemented?
SQL databases, such as Oracle Database, Microsoft SQL Server and MySQL, are one type of database which can be run on the cloud (either as a Virtual Machine Image or as a service, depending on the vendor). SQL databases are difficult to scale, meaning they are not natively suited to a cloud environment, although cloud database services based on SQL are attempting to address this challenge.

NoSQL databases, such as Apache Cassandra, CouchDB and MongoDB, are another type of database which can run on the cloud. NoSQL databases are built to service heavy read/write loads and are able to scale up and down easily, and therefore they are more natively suited to running on the cloud. However, most contemporary applications are built around an SQL data model, so working with NoSQL databases often requires a complete rewrite of application code.

6. Why is it needed?
It is important to have a database as an organic part of the cloud for one key reason: to avoid the dedicated and complex maintenance required to babysit the odd child in the cloud infrastructure. Any non-cloud service becomes an exception which requires special maintenance, skill sets, procedures and so on. This is definitely true in the long term, but even in day-to-day operations today, DBAs and system administrators can attest to how much time and energy (and admin cost) goes into monitoring their database and ensuring that it plays well and is properly integrated with the other components of the cloud. In addition, non-native cloud services will not enjoy the benefits achieved by natural tenants of the cloud. These benefits include, for example, automation, resource optimization, dynamic networking, and more. Databases should be regarded as an integral part of the cloud so that the IT infrastructure can really be a commodity, which can be bought, traded, re-allocated and moved around as needed. On the other side of the equation, to live up to the theory, cloud database technology itself must deliver a convincing paradigm and proof of its ability to keep the data safe, secure and always available, at least at the same level as enterprise databases today.
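
To put "always available" in concrete terms, the availability commitments mentioned earlier in this section (99.9% or 99.99%) can be converted into a maximum amount of downtime per year. The helper below is a small worked calculation; the function name and the 365-day year are assumptions chosen for the illustration.

```python
def max_downtime_minutes_per_year(availability_percent: float,
                                  days_per_year: float = 365.0) -> float:
    """Return the downtime budget, in minutes per year, implied by an availability level."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    minutes_per_year = days_per_year * 24 * 60
    return unavailable_fraction * minutes_per_year
```

With a 365-day year there are 525,600 minutes, so 99.9% availability allows roughly 525.6 minutes (about 8.8 hours) of downtime per year, while 99.99% allows roughly 52.6 minutes.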

7. What are the examples of this database application?
- Oracle Database
- IBM DB2
- Ingres (database)
- PostgreSQL
- MySQL

Data warehouse

1. What is this database application?
Data warehouses archive data from operational databases and often from external sources such as market research firms. Often operational data undergoes transformation on its way into the warehouse, getting summarized, anonymized, reclassified, etc. The warehouse becomes the central source of data for use by managers and other end-users who may not have access to operational data. For example, sales data might be aggregated to weekly totals and converted from internal product codes to use UPCs so that it can be compared with ACNielsen data. Some basic and essential components of data warehousing include retrieving, analyzing, and mining data, and transforming, loading and managing data so as to make it available for further use.

Operations in a data warehouse are typically concerned with bulk data manipulation, and as such, it is unusual and inefficient to target individual rows for update, insert or delete. Bulk native loaders for input data and bulk SQL passes for aggregation are the norm.

2. What is the architectural design of this database?
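
The sales example given for question 1 (aggregating to weekly totals and converting internal product codes to UPCs) can be sketched in a few lines. This is a minimal illustration, not a real loader: the product codes and UPC values are placeholders, and grouping by ISO week is an assumption about what "weekly" means.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

# Placeholder mapping from internal product code to UPC (illustrative values only).
CODE_TO_UPC = {"P-001": "012345678905", "P-002": "036000291452"}


def weekly_totals(
    sales: Iterable[Tuple[date, str, float]],
    code_to_upc: Dict[str, str],
) -> Dict[Tuple[int, int, str], float]:
    """Sum sale amounts per (ISO year, ISO week, UPC).

    Each input record is (sale date, internal product code, amount).
    """
    totals: Dict[Tuple[int, int, str], float] = defaultdict(float)
    for day, code, amount in sales:
        iso_year, iso_week, _weekday = day.isocalendar()
        totals[(iso_year, iso_week, code_to_upc[code])] += amount
    return dict(totals)
```

The whole batch is transformed in one pass and no individual warehouse row is updated in place, which matches the bulk-oriented style of operation described above.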

Data Warehouse Configurations
A Data Warehouse configuration, also known as the logical architecture, includes the following components:
- one Enterprise Data Store (EDS): a central repository which supplies atomic (detail level) integrated information to the whole organization.
- (optional) one Operational Data Store: a "snapshot" of a moment in time's enterprise-wide data
- (optional) one or more individual Data Mart(s): summarized subset of the enterprise's data specific to a functional area or department, geographical region, or time period
- one or more Metadata Store(s) or Repository(ies): catalog(s) of reference information about the primary data. Metadata is divided into two categories: information for technical use, and information for business end-users.

Bottom-up design
Ralph Kimball, a well-known author on data warehousing, is a proponent of an approach to data warehouse design which he describes as bottom-up. In the bottom-up approach data marts are first created to provide reporting and analytical capabilities for specific business processes. It is important to note, though, that in the Kimball methodology the bottom-up process is the result of an initial business-oriented top-down analysis of the relevant business processes to be modelled.

Data marts contain, primarily, dimensions and facts. Facts can contain atomic data and, if necessary, summarized data. The single data mart often models a specific business area such as "Sales" or "Production." These data marts can eventually be integrated to create a comprehensive data warehouse. The integration of data marts is managed through the implementation of what Kimball calls "a data warehouse bus architecture". The data warehouse bus architecture is primarily an implementation of "the bus", a collection of conformed dimensions and conformed facts, which are dimensions that are shared (in a specific way) between facts in two or more data marts.

The integration of the data marts in the data warehouse is centered on the conformed dimensions (residing in "the bus") that define the possible integration "points" between data marts. The actual integration of two or more data marts is then done by a process known as "drill across". A drill-across works by grouping (summarizing) the data along the keys of the (shared) conformed dimensions of each fact participating in the drill-across, followed by a join on the keys of these grouped (summarized) facts.

Maintaining tight management over the data warehouse bus architecture is fundamental to maintaining the integrity of the data warehouse. The most important management task is making sure dimensions among data marts are consistent. In Kimball's words, this means that the dimensions "conform".

Some consider it an advantage of the Kimball method that the data warehouse ends up being "segmented" into a number of logically self-contained (up to and including the bus) and consistent data marts, rather than a big and often complex centralized model. Business value can be returned as quickly as the first data marts can be created, and the method lends itself well to an exploratory and iterative approach to building data warehouses. For example, the data warehousing effort might start in the "Sales" department, by building a Sales data mart. Upon completion of the Sales data mart, the business might then decide to expand the warehousing activities into, say, the "Production" department, resulting in a Production data mart. The requirement for the Sales data mart and the Production data mart to be integrable is that they share the same bus, that is, that the data warehousing team has made the effort to identify and implement the conformed dimensions in the bus, and that the individual data marts link to that information from the bus. Note that this does not require 100% awareness from the onset of the data warehousing effort; no master plan is required upfront. The Sales data mart is good as it is (assuming that the bus is complete) and the Production data mart can be constructed virtually independently of the Sales data mart (but not independently of the bus).

If integration via the bus is achieved, the data warehouse, through its two data marts, will not only be able to deliver the specific information that the individual data marts are designed to deliver, in this example either "Sales" or "Production" information, but can also deliver integrated Sales-Production information, which is often of critical business value. This integration can (possibly) be achieved in a flexible and iterative fashion.

Top-down design
Bill Inmon, one of the first authors on the subject of data warehousing, has defined a data warehouse as a centralized repository for the entire enterprise. Inmon is one of the leading proponents of the top-down approach to data warehouse design, in which the data warehouse is designed using a normalized enterprise data model. "Atomic" data, that is, data at the lowest level of detail, are stored in the data warehouse. Dimensional data marts containing data needed for specific business processes or specific departments are created from the data warehouse. In the Inmon vision the data warehouse is at the center of the "Corporate Information Factory" (CIF), which provides a logical framework for delivering business intelligence (BI) and business management capabilities.

Inmon states that the data warehouse is:
- Subject-oriented: The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together.
- Time-variant: For an operational system, the stored data contains the current value.
- Non-volatile: Data in the data warehouse are never over-written or deleted; once committed, the data are static, read-only, and retained for future reporting.
- Integrated: The data warehouse contains data from most or all of an organization's operational systems and these data are made consistent.

The top-down design methodology generates highly consistent dimensional views of data across data marts since all data marts are loaded from the centralized repository. Top-down design has also proven to be robust against business changes. Generating new dimensional data marts against the data stored in the data warehouse is a relatively simple task. The main disadvantage of the top-down methodology is that it represents a very large project with a very broad scope. The up-front cost for implementing a data warehouse using the top-down methodology is significant, and the duration of time from the start of the project to the point that end users experience initial benefits can be substantial. In addition, the top-down methodology can be inflexible and unresponsive to changing departmental needs during the implementation phases.

Hybrid design
Data warehouse (DW) solutions often resemble a hub and spoke architecture. Legacy systems feeding the DW/BI solution often include customer relationship management (CRM) and enterprise resource planning (ERP) solutions, generating large amounts of data. To consolidate these various data models, and facilitate the extract transform load (ETL) process, DW solutions often make use of an operational data store (ODS). The information from the ODS is then parsed into the actual DW. To reduce data redundancy, larger systems will often store the data in a normalized way. Data marts for specific reports can then be built on top of the DW solution.

It is important to note that the DW database in a hybrid solution is kept in third normal form to eliminate data redundancy. A normal relational database, however, is not efficient for business intelligence reports where dimensional modeling is prevalent. Small data marts can shop for data from the consolidated warehouse and use the filtered, specific data for the fact tables and dimensions required. The DW effectively provides a single source of information from which the data marts can read, creating a highly flexible solution from a BI point of view. The hybrid architecture allows a DW to be replaced with a master data management solution where operational, not static, information could reside.

The Data Vault Modeling components follow a hub and spoke architecture. This modeling style is a hybrid design, consisting of the best-of-breed practices from both third normal form and star schema. The Data Vault model is not a true third normal form, and breaks some of the rules that 3NF dictates be followed. It is, however, a top-down architecture with a bottom-up design. The Data Vault model is geared to be strictly a data warehouse. It is not geared to be end-user accessible, and when built it still requires the use of a data mart or star schema based release area for business purposes.

3. How does it run?

Implementation
Once the Planning and Design stages are complete, the project to implement the current Data Warehouse iteration can proceed quickly. Necessary hardware, software and middleware components are purchased and installed, the development and test environment is established, and the configuration management processes are implemented. Programs are developed to extract, cleanse, transform and load the source data and to periodically refresh the existing data in the Warehouse, and the programs are individually unit tested against a test database with sample source data. Metrics are captured for the load process. The metadata repository is loaded with transformational and business user metadata. Canned production reports are developed and sample ad-hoc queries are run against the test database, and the validity of the output is measured. User access to the data in the Warehouse is established. Once the programs have been developed and unit tested and the components are in place, system functionality and user acceptance testing is conducted for the complete integrated Data Warehouse system. System support processes of database security, system backup and recovery, system disaster recovery, and data archiving are implemented and tested as the system is prepared for deployment. The final step is to conduct the Production Readiness Review prior to transitioning the Data Warehouse system into production. During this review, the system is evaluated for acceptance by the customer organization.

Transition to Production
The Transition to Production stage moves the Data Warehouse development project into the production environment. The production database is created, and the extraction/cleanse/transformation routines are run on the operations system source data. The development team works with the Operations staff to perform the initial load of this data to the Warehouse and execute the first refresh cycle. The Operations staff is trained, and the Data Warehouse programs and processes are moved into the production libraries and catalogs. Rollout presentations and tool demonstrations are given to the entire customer community, and end-user training is scheduled and conducted. The Help Desk is established and put into operation. A Service Level Agreement is developed and approved by the customer organization. Finally, the new system is positioned for ongoing maintenance through the establishment of a Change Management Board and the implementation of change control procedures for future development cycles.

4. Who are the people involved in this database application? What are their functions?
Data Warehouses and Data Warehouse applications are designed primarily to support executives, senior managers, and business analysts in making complex business decisions. Data Warehouse applications provide the business community with access to accurate, consolidated information from various internal and external sources. The primary objective of Data Warehousing is to bring together information from disparate sources and put the information into a format that is conducive to making business decisions. This objective necessitates a set of activities that are far more complex than just collecting data and reporting against it. Data Warehousing requires both business and technical expertise and involves the following activities:
- Accurately identifying the business information that must be contained in the Warehouse
- Identifying and prioritizing subject areas to be included in the Data Warehouse
- Managing the scope of each subject area which will be implemented into the Warehouse on an iterative basis
- Developing a scaleable architecture to serve as the Warehouse's technical and application foundation, and identifying and selecting the hardware/software/middleware components to implement it
- Extracting, cleansing, aggregating, transforming and validating the data to ensure accuracy and consistency
- Defining the correct level of summarization to support business decision making
- Establishing a refresh program that is consistent with business needs, timing and cycles
- Providing user-friendly, powerful tools at the desktop to access the data in the Warehouse
- Educating the business community about the realm of possibilities that are available to them through Data Warehousing
- Establishing a Data Warehouse Help Desk and training users to effectively utilize the desktop tools

- Establishing processes for maintaining, enhancing, and ensuring the ongoing success and applicability of the Warehouse

5. Where can it be implemented?
Some of the applications data warehousing can be used for are:
- Decision support
- Trend analysis
- Financial forecasting
- Churn prediction for telecom subscribers, credit card users etc.
- Insurance fraud analysis
- Call record analysis
- Logistics and inventory management
- Agriculture

6. Why is it needed?
A data warehouse maintains a copy of information from the source transaction systems. This architectural complexity provides the opportunity to:
- Maintain data history, even if the source transaction systems do not.
- Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable, but particularly so when the organization has grown by merger.
- Improve data quality, by providing consistent codes and descriptions, flagging or even fixing bad data.
- Present the organization's information consistently.
- Provide a single common data model for all data of interest regardless of the data's source.
- Restructure the data so that it makes sense to the business users.
- Restructure the data so that it delivers excellent query performance, even for complex analytic queries, without impacting the operational systems.
- Add value to operational business applications, notably customer relationship management (CRM) systems.

7. What are the examples of this database application?
- AdventureWorksDW2008R2
- ETL

Distributed database

1. What is this database application?
The definition of a distributed database is broad, and the term may be used with different meanings. In general it refers to a modular DBMS architecture that allows distinct DBMS instances to cooperate as a single DBMS over processes, computers, and sites, while managing a single database that is itself distributed over multiple computers and different sites.
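
The idea of distinct DBMS instances cooperating as a single database can be sketched with a thin facade over several independent SQLite databases. This is a toy model under stated assumptions: each in-memory connection stands in for a "site", the class name is hypothetical, and placing keys by hash is only one possible placement policy, not one prescribed by the text.

```python
import sqlite3
import zlib
from typing import List, Optional


class DistributedKV:
    """One logical key-value table whose rows live in several separate databases."""

    def __init__(self, site_count: int) -> None:
        # Each connection plays the role of an independent DBMS instance at one site.
        self.sites: List[sqlite3.Connection] = [
            sqlite3.connect(":memory:") for _ in range(site_count)
        ]
        for site in self.sites:
            site.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def _site_for(self, key: str) -> sqlite3.Connection:
        # Deterministic placement: the same key always maps to the same site.
        return self.sites[zlib.crc32(key.encode("utf-8")) % len(self.sites)]

    def put(self, key: str, value: str) -> None:
        self._site_for(key).execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

    def get(self, key: str) -> Optional[str]:
        row = self._site_for(key).execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def count(self) -> int:
        # A "global" query has to consult every site and combine the results.
        return sum(
            site.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
            for site in self.sites
        )
```

Callers use put, get and count without knowing which site holds a row, which is the sense in which the instances behave as a single DBMS.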

2. What is the architectural design of this database?
A distributed database system allows applications to access data from local and remote databases. In a homogeneous distributed database system, each database is an Oracle database. In a heterogeneous distributed database system, at least one of the databases is a non-Oracle database. Distributed databases use client/server architecture to process information requests. A distributed database does not share main memory or disks. A database user accesses the distributed database through:
- Local applications: applications which do not require data from other sites.
- Global applications: applications which do require data from other sites.

3. How does it run?
The First Steps script takes care of setting up database tables for Telecom Web Services Server (TWSS) when you are using a consolidated or shared database. However, additional steps may be required when you are using a distributed database configuration.

Before you begin
This procedure is a sub-procedure within the overall migration process of your test (non-production) or production system. Make sure that you have already installed TWSS version 7.2 components and that you have performed the procedures described in one of the following post-installation configuration topics:
- Creating and configuring the DB2 database server instance
- Creating and configuring the Oracle database server instance

Note that the First Steps script assumes that all nodes in the cluster are at a TWSS version 7.2 level.

About this task
If a distributed or partitioned topology is used for databases, you will encounter one of the following scenarios:
- One database per cluster. For example, you may have the Access Gateway running in one cluster and the Service Platform components running in another cluster, with each cluster having its own database. If this is the case, when you run the First Steps script on the Access Gateway cluster, all of the database parameters should refer to the Access Gateway database (typically named AGDB). Then when you run the script on the Service Platform cluster, the database parameters should refer to the Service Platform database (typically SPMDB). No additional special considerations apply. For detailed information, see the topic Running the First Steps configuration script.
- More than one database per cluster. For example, you may have a separate database for the WAP Push service on the same cluster with the database for the Service Platform components. If this is the case, use the procedure in this topic to create a temporary database and then configure new databases to work with TWSS version 7.2.

Note: When the First Steps script sets up your configuration for running TWSS version 7.2, it does not modify your existing configuration. Therefore, if you made changes to the default configuration for a previous version of TWSS, your changes are preserved during the migration process.

Note: Migration of runtime data is generally not supported, except in the cases of some Direct Connect-based web service implementations. (For details, refer to the topic Planning to migrate from the previous version of Telecom Web Services Server.) In these cases, be sure that you have copied all of the existing data to the new database before deleting the old database and database tables.

To migrate a distributed database configuration when there is more than one database per cluster, perform the following steps.

Procedure
a. On one of the nodes in the cluster, create a temporary database by running the First Steps script with the Initial Configuration Mode option pointing to a new temporary database. The First Steps script creates the minimum necessary data sources to support a distributed topology and points to the new single (temporary) database instance.
b. Create additional data sources, as necessary, and update the necessary JNDI bindings to the respective data sources.
c. Drop the temporary database.
d. Verify databases and logs. Verify that the newly populated data coexists with the previous data in cases where migration of runtime data is supported for a given web service or feature.
e. Verify the new migrated environment by running any successful service logic test case.

4. Who are the people involved in this database application? What are their functions?
In a distributed database environment, coordinate with the database administrator to determine the best location for the data. Some issues to consider are:

- Number of transactions posted from each location
- Amount of data (portion of table) used by each node
- Performance characteristics and reliability of the network
- Speed of various nodes, capacities of disks
- Importance of a node or link when it is unavailable
- Need for referential integrity among tables

5. Where it can be implemented?

The explosion of individual databases running on PC platforms can provide new opportunities to heads of departments, but may also pose problems for the organization as a whole. Information that could benefit the entire organization often becomes out of reach for users unable to access it or unaware of its existence. Data inconsistency is another problem that occurs in such an environment, as the same data is stored in many databases with no system for managing the multiple copies. Additionally, because of the cheaper hardware and software used, and the generally lower skills of the personnel administering these systems, reliability can be significantly less than with centralized systems.

6. Why it is needed?

Business leaders today understand the importance of information as a business resource. With centralized database systems, an organization's information is maintained and controlled by a few highly skilled individuals at one location. Two major factors have led many business users to reject the centralized database model: the natural tendency for humans not to share, and the introduction of personal computer (PC)-based DBMSs powerful enough to handle many concurrent users. Armed with such tools, departments and workgroups can easily build their own databases, wresting control of the information resource from the administrators of the organization's central databases and satisfying their natural tendency not to share.

The centralized and decentralized models described above both generate major problems for large organizations. Some type of architecture that provides the advantages of both without the drawbacks would be ideal. This architecture should allow decentralized use of data, while providing for database administration that can be performed by personnel with the interests of the whole firm in mind.

7. What are the examples of this database application?

Examples are databases of local work-groups and departments at regional offices, branch offices, manufacturing plants and other work sites. These databases can include both segments shared by multiple sites, and segments specific to one site and used only locally in that site.

Document-oriented database

1. What is this database application?

A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented, or semi-structured, information.

Document-oriented databases are one of the main categories of so-called NoSQL databases, and the popularity of the term "document-oriented database" (or "document store") has grown with the use of the term NoSQL itself.

2. What is the architectural design of this database?

CouchDB, the document-based database described here, is not a relational database management system. Instead of storing data in rows and columns, the database manages a collection of JSON documents. The documents in a collection need not share a schema, but retain query abilities via views. Views are defined with aggregate functions and filters, and are computed in parallel, much like MapReduce. Views are generally stored in the database and their indexes updated continuously, although queries may introduce temporary views.

CouchDB supports a view system using external socket servers and a JSON-based protocol. As a consequence, view servers have been developed in a variety of languages. CouchDB exposes a RESTful HTTP API, and a large number of pre-written clients are available. Additionally, a plugin architecture allows for using different computer languages as the view server, such as JavaScript (default), PHP, Ruby, Python and Erlang. Support for other languages can be easily added. Since version 0.11, CouchDB supports the CommonJS Module specification.

CouchDB's design and philosophy borrow heavily from Web architecture and the concepts of resources, methods and representations. It is in use in many software projects and web sites, including Ubuntu, where it is used to synchronize address and bookmark data.

Features

Document Storage

CouchDB stores documents in their entirety. You can think of a document as one or more field/value pairs expressed as JSON. Field values can be simple things like strings, numbers, or dates, but you can also use ordered lists and associative maps. Every document in a CouchDB database has a unique id, and there is no required document schema.
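The document storage model described above can be illustrated with a small sketch. It is a toy in-memory store, not the API of any real product: the class name `DocumentStore` and its `put`/`get` methods are hypothetical, and the sample fields are invented for illustration. It only shows the three properties the text names: whole documents, a unique id per document, and no required schema.

```python
import json
import uuid


class DocumentStore:
    """Toy in-memory store: every document is kept whole, as JSON text, under a unique id."""

    def __init__(self):
        self._docs = {}  # document id -> JSON text

    def put(self, document, doc_id=None):
        """Store a document and return its id (a random id is generated if none is given)."""
        doc_id = doc_id or uuid.uuid4().hex
        self._docs[doc_id] = json.dumps(document)
        return doc_id

    def get(self, doc_id):
        """Return the document stored under doc_id, decoded back into Python objects."""
        return json.loads(self._docs[doc_id])


store = DocumentStore()

# Two documents with different fields: no shared schema is required.
post_id = store.put({
    "title": "Hello",
    "published": "2011-06-01",              # dates travel as strings in JSON
    "tags": ["intro", "news"],              # ordered list
    "author": {"name": "Ann", "age": 31},   # associative map
})
note_id = store.put({"text": "buy milk", "done": False})

print(store.get(post_id)["author"]["name"])
print(sorted(store.get(note_id)))
```

Serializing each document to JSON text on the way in is a deliberate choice here: it makes the point that the store never looks inside a document to enforce a schema.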

ACID Semantics

Like many relational database engines, CouchDB provides ACID semantics. It does this by implementing a form of Multi-Version Concurrency Control (MVCC), not unlike InnoDB or Oracle. That means CouchDB can handle a high volume of concurrent readers and writers without conflict.

Map/Reduce Views and Indexes

To provide some structure to the data stored in CouchDB, you can develop views that are similar to their relational database counterparts. In CouchDB, each view is constructed by a JavaScript function (server-side JavaScript, using CommonJS and SpiderMonkey) that acts as the Map half of a map/reduce operation. The function takes a document and transforms it into a single value which it returns. The logic in your JavaScript functions can be arbitrarily complex. Since computing a view over a large database can be an expensive operation, CouchDB can index views and keep those indexes updated as documents are added, removed, or updated. This provides a very powerful indexing mechanism that grants more control than most databases offer.

Distributed Architecture with Replication

CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time. The biggest gotcha typically associated with this level of flexibility is conflicts.

REST API

CouchDB treats all stored items (there are others besides documents) as resources. All items have a unique URI that gets exposed via HTTP. REST uses the HTTP methods POST, GET, PUT and DELETE for the four basic CRUD (Create, Read, Update, Delete) operations on all resources. HTTP is a widely understood, interoperable, scalable and proven technology. A lot of tools, software and hardware, are available to do all sorts of things with HTTP, like caching, proxying and load balancing.

Eventual Consistency

According to the CAP theorem, it is impossible for a distributed system to simultaneously provide consistency, availability and partition tolerance guarantees. A distributed system can satisfy any two of these guarantees at the same time, but not all three. CouchDB guarantees eventual consistency to be able to provide both availability and partition tolerance.

3. How does it run?

The central concept of a document-oriented database is the notion of a Document. While each document-oriented database implementation differs on the details of this definition, in general, they all assume documents encapsulate and encode data (or information) in some standard format(s) (or encoding(s)). Encodings in use include XML, YAML, JSON and BSON, as well as binary forms like PDF and Microsoft Office documents (MS Word, Excel, and so on).
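The map/reduce views described above can be sketched roughly as follows. This is a simplified illustration in Python rather than the server-side JavaScript the text mentions, and it is not the database's actual view engine: the `View` class, `on_change`, `query` and the `Hobby` field are all hypothetical names. The sketch assumes a map function that emits key/value pairs, and it keeps the index current by re-mapping only the document that changed.

```python
from collections import defaultdict


def map_by_hobby(doc):
    """Map function: emit one (key, value) pair per document that has a Hobby field."""
    if "Hobby" in doc:
        yield doc["Hobby"], 1


class View:
    """Toy view: remembers what each document emitted so updates stay incremental."""

    def __init__(self, map_fn, reduce_fn):
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn
        self._emitted = {}  # document id -> list of (key, value) pairs

    def on_change(self, doc_id, doc):
        """Re-index one document; pass doc=None when the document was removed."""
        if doc is None:
            self._emitted.pop(doc_id, None)
        else:
            self._emitted[doc_id] = list(self.map_fn(doc))

    def query(self):
        """Group emitted values by key and reduce each group to a single value."""
        grouped = defaultdict(list)
        for pairs in self._emitted.values():
            for key, value in pairs:
                grouped[key].append(value)
        return {key: self.reduce_fn(values) for key, values in sorted(grouped.items())}


view = View(map_by_hobby, sum)
view.on_change("a", {"FirstName": "Bob", "Hobby": "sailing"})
view.on_change("b", {"FirstName": "Ann", "Hobby": "sailing"})
view.on_change("c", {"FirstName": "Jonathan"})   # no Hobby field: emits nothing
view.on_change("d", {"FirstName": "Eve", "Hobby": "chess"})
counts_before = view.query()

view.on_change("a", None)                         # document removed
counts_after = view.query()
print(counts_before, counts_after)
```

Storing the emitted pairs per document is what makes the update cheap: a change to one document touches one entry instead of forcing a full recomputation of the view.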

Documents inside a document-oriented database are similar, in some ways, to records or rows in relational databases, but they are less rigid. They are not required to adhere to a standard schema, nor will they have all the same sections, slots, parts, keys, or the like. For example, here is a document:

FirstName="Bob", Address="5 Oak St.", Hobby="sailing".

Another document could be:

FirstName="Jonathan", Address="15 Wanamassa Point Road", Children=[{Name:"Michael",Age:10}, {Name:"Jennifer", Age:8}, {Name:"Samantha", Age:5}, {Name:"Elena", Age:2}].

Both documents have some similar information and some different. Unlike a relational database, where each record would have the same set of fields and unused fields might be kept empty, there are no empty 'fields' in either document (record) in this case. This system allows new information to be added, and it doesn't require explicitly stating if other pieces of information are left out.

Keys

Documents are addressed in the database via a unique key that represents that document. Often, this key is a simple string. In some cases, this string is a URI or path. Regardless, you can use this key to retrieve the document from the database. Typically, the database retains an index on the key such that document retrieval is fast.

Retrieval

One of the other defining characteristics of a document-oriented database is that, beyond the simple key-document (or key-value) lookup that you can use to retrieve a document, the database will offer an API or query language that will allow you to retrieve documents based on their contents. For example, you may want a query that gets you all the documents with a certain field set to a certain value. The set of query APIs or query language features available, as well as the expected performance of the queries, varies significantly from one implementation to the next.

Organization

Implementations offer a variety of ways of organizing documents, including notions of:

- Collections
- Tags
- Non-visible Metadata
- Directory hierarchies

4. Who are the people involved in this database application? What are their functions?

People go for databases like CouchDB not because there is a need for schema-less data but for the other cool features these databases give us, like replication (master to master), conflict resolution, etc., which can be leveraged to give applications an offline capability. I think this itself is the smell: features like these should not drive one to use NoSQL databases when the application domain needs relational data.
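To make the replication and conflict-resolution features mentioned above more concrete, here is a minimal sketch of a bidirectional sync between two offline replicas. It is a toy model and not CouchDB's real algorithm, which tracks revision histories: the function `sync`, the integer `rev` counter and the tie-break rule are assumptions made for illustration. The one idea it keeps is that a conflict is detected when both sides changed the same document, and that every replica must pick the same winner.

```python
import copy
import json


def sync(replica_a, replica_b):
    """Merge two replicas in both directions and return the ids that were in conflict.

    Each replica maps a document id to {"rev": int, "body": dict}.
    """
    conflicts = []
    for doc_id in sorted(set(replica_a) | set(replica_b)):
        a = replica_a.get(doc_id)
        b = replica_b.get(doc_id)
        if a is None or b is None:
            winner = a or b                      # only one side has the document
        elif a["rev"] != b["rev"]:
            winner = a if a["rev"] > b["rev"] else b   # one side is simply newer
        elif a["body"] == b["body"]:
            winner = a                           # identical, nothing to do
        else:
            # Same revision number, different content: both sides edited offline.
            conflicts.append(doc_id)
            # Deterministic tie-break so that every replica picks the same winner.
            winner = max(a, b, key=lambda d: json.dumps(d["body"], sort_keys=True))
        replica_a[doc_id] = copy.deepcopy(winner)
        replica_b[doc_id] = copy.deepcopy(winner)
    return conflicts


laptop = {"bob": {"rev": 2, "body": {"FirstName": "Bob", "Hobby": "sailing"}}}
phone = {
    "bob": {"rev": 2, "body": {"FirstName": "Bob", "Hobby": "fishing"}},
    "jonathan": {"rev": 1, "body": {"FirstName": "Jonathan"}},
}

conflicts = sync(laptop, phone)
print(conflicts, laptop == phone)
```

A real system would also keep the losing version so that the application can resolve the conflict later; this sketch only reports which ids conflicted.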

Any ways the choice is made (yet to be challenged) and we had to figure out how to model our documents and persist in the database which has the parent/child relationship. 5. documentoriented database management system platform with server based data storage. highperformance. What if we want to store a BlogPost and Comment as different documents. CouchDB itself does not care how we store documents its up to the API to manage these relationships. its is possible with @DocumentReferences annotation. askSam Systems Proprietary Apstrata Significant Data Systems Proprietary Proprietary Clusterpoint Clusterpoint Ltd. Where it can be implemented? Name Lotus Notes askSam Apstrata Datawasp Publisher IBM License Proprietary Language Notes RESTfulA PI (unknown) (unknown) (unknown) (unknown) Scalable. can we add comments to the blog post and just by saving blog post does it save all the comments as individual documents ? The answer is yes. schemafree. fast full text search engine functionality. Free community C++ license /Commercial Yes CRX MUMPSDataba se UniVerse UniData Jackrabbit Day Software Proprietary Proprietarya nd GNU MUMPS Affero GPL (unknown) Commonly used in health (unknown) applications.Apac Apache he Software License Foundation CouchDB Erlang . Yes (Beta) Yes (Beta) Java JSON over REST/HTTP with Multi-Version Concurrency (unknown) Yes (there is only RESTful API) Rocket Software Proprietary Rocket Software Proprietary Apache Software Apache Foundation License Couchbase. information ranking for search revelevance and clustering. Let s take an example which is quoted in the documentation of Ektorp itself. Ektorp provides a way to achieve this. a BlogPost and a bunch of Comments.

NET. flexible and scalable NoSQL data store for the . C++. key-value database with indexing and querying support. Python. Redis BSD License ANSI C (unknown) StrokeDB Terrastore MIT License Apache License Java (unknown) (unknown) JSON/HTTP Built on top of Apache Thrift framework that provides indexing and document storage ThruDB BSD License C++. Key-value store supporting lists and sets with fast.0 A . Java (unknown) . PHP. fast.NET and Windows platforms.NET( Fast. Javascript.0 C. Erlang. schemaYes less. Haskell. Perl. Scala Memory-oriented. Alpha software.NET LINQ-enabled Document Database. C++ Yes OrientDB Orient Technologies Apache License Java Yes RavenDB RavenDB commercial or GNU AGPL . (unknown) MongoDB 10gen. etc). document-oriented C# F#. .Name Publisher License Language Notes Control and ACID propert ies.Java. Inc GNU AGPL v3. Uses mapand reduce for views and queries. . database optimized for PowerShell. Ruby. highly transient data.[4] RESTfulA PI FleetDB FleetDB MIT License Clojure A JSON-based schemafree database optimized for agile development. simple and binary-safe protocol. transactional.NET v3. focused on providing high performance. JSON over HTTP Optional using external tools[7] GemFire Enterprise VMWare Commercial Java.

and Yes delete access to data. DBSlayer DBSlayer Apache License C Eloquera DB Eloquera Proprietary . read. JSON over HTTP. update.NET 6. 7. What is this database application? An embedded database system is a DBMS which is tightly integrated with an application software that requires access to stored data in a way that the DBMS is hidden from the application s end-user and requires little or no ongoing maintenance. RESTfulA PI Persevere Persevere BSD License A JSON database and JavaScript Application Server. What are the examples of these database applications? y y y y y Apache CouchDB MongoDb Github Sourceforge IBM Lotus Notes Embedded database 1. SQL queries. Based on Dynamic objects. Why it is needed? Utilized to conveniently store. Provides RESTful JSON interface for Create.Name Publisher License Language Notes services for building and scaling websites. It is actually a broad technology category that includes . High performance. edit and retrieve documents. Alternate implementation is being developed in Java. (unknown) Supports LINQ. database abstraction layer (over MySQL) used (unknown) by theNew York Times. Also supports JSONQuery/JSONPath querying. Alpha software. manage.

DBMSs with differing properties and target markets. The term "embedded database" can be confusing because only a small subset of embedded database products is used in real-time embedded systems such as telecommunications switches and consumer electronics devices.

2. What is the architectural design of this database?

An embedded database system such as Empress Embedded Database supports many application programming interfaces in several programming languages. The layered architecture design provides levels of system optimization for application development. Applications developed using these APIs may be run in standalone and/or server modes. The C programming language has the most APIs, including the low-level kernel MR Routines, Embedded SQL, MSCALL and ODBC. There are also APIs for C++ and Java.

3. How does it run?

Empress Embedded Database is a full-function, relational database that has been embedded into applications by organizations small to large, with deployment environments including medical systems, network routers, nuclear power plant monitors, satellite management systems, and other embedded system applications that require reliability and power. It is an ACID compliant, SQL database engine with C, C++, Java, JDBC, ODBC, SQL, ADO.NET and kernel level APIs. Applications developed using these APIs may be run in standalone and/or server modes. It runs on Linux, Unix, Microsoft Windows and real-time operating systems.

4. Who are the people involved in this database application? What are their functions?

Technology has managed, for quite a while now, to change the ways in which people choose to live their lives. Computers in particular, and software, have developed from machines that could perform only basic tasks into systems that handle complex, multiple and varied information in a minimum amount of time. Databases have moreover contributed to the emergence of a different way to keep one's information not only thoroughly organized but also easily accessible. This is why database software has constantly increased in popularity ever since it first reached personal computers in the late 1970s. One of the first widely used database programs of that era was dBase, which helped people manage their information better. Nowadays database technology has evolved and is available in different types of databases, each potentially used by different people and in different settings. The embedded database is one of them.

The embedded database is a type of database system or DBMS which is very closely integrated into an application that requires access to stored data. This means that the database system is actually hidden from the application that the end user will work with, which makes it much easier for individuals who do not have professional training in programming to work with this type of database.
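As a rough illustration of a database engine that is hidden inside an application, the sketch below uses SQLite, one of the embedded products this section lists, through Python's standard `sqlite3` module. The note-taking application, the `notes` table and the three helper functions are hypothetical; the point is only that the engine runs inside the application's own process, with no separate server for the end user to install or administer.

```python
import sqlite3


def open_notes_db(path=":memory:"):
    """Open (or create) the application's private database; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def add_note(conn, body):
    """Insert one note and return its generated id."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    return cursor.lastrowid


def list_notes(conn):
    """Return all note bodies in insertion order."""
    return [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]


conn = open_notes_db()
first_id = add_note(conn, "call the plumber")
second_id = add_note(conn, "renew passport")
print(list_notes(conn))
```

The end user of such an application only ever sees "notes"; the SQL and the storage file are implementation details of the program.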

An embedded database requires no maintenance, or only a very small amount. The term, however, covers a very broad range of technology, which may differ in application programming interface (such as SQL), database architecture, storage mode, database model, and target market. Databases, although they use a programming language, are in most cases user friendly, and one can learn to work with them quickly and without too much effort. This type of database, although complex internally, makes it easier for the end user to work with or handle data.

5. Where it can be implemented?

Major embedded database products include, in alphabetical order: Advantage Database Server from Sybase Inc., Berkeley DB from Oracle Corporation, CSQL from csqlcache.com, EffiProz from EffiProz Systems, ElevateDB from Elevate Software, Inc., Empress Embedded Database from Empress Software, Extensible Storage Engine from Microsoft, eXtremeDB from McObject, Firebird Embedded, HSQLDB from HSQLDB.ORG, Informix Dynamic Server (IDS) from IBM, InnoDB from Oracle Corporation, ITTIA DB from ITTIA, RDM Embedded and RDM Server from Raima Inc., SolidDB from IBM, SQLite, SQL Server Compact from Microsoft Corporation, Valentina DB from Paradigma Software, and VistaDB from VistaDB Software, Inc.

6. Why it is needed?

Typically, when data is returned to the user, it must be copied from the data manager's buffer cache (or data page) into the application's memory. However, in an embedded environment, the robustness of the total software package is of paramount importance, not the isolation between the application and the data manager. As a result, it is possible for the data manager to avoid copies by giving applications direct references to data items in a shared memory cache. This is a significant performance optimization that can be allowed when the application and data manager are tightly integrated.

7. What are the examples of these database applications?

- MySQL

End-user database

1. What is this database application?

These databases consist of data developed by individual end-users. Examples of these are collections of documents, spreadsheets, presentations, multimedia, and other files. Several products exist to support such databases. Some of them are much simpler than full-fledged DBMSs, with more elementary DBMS functionality (e.g., not supporting multiple concurrent end-users on the same database), with basic programming interfaces, and a relatively small "foot-print" (not much code to run, compared with "regular" general-purpose databases). However, available general-purpose DBMSs can also often be used for this purpose, if they provide basic user interfaces for straightforward database applications (limited query and data display, no real programming needed), while still enjoying the database qualities and protections that these DBMSs can provide.

2. What is the architectural design of this database?

The number, nature, and needs of the tenants you expect to serve all affect your data architecture decision in different ways. Some of the following questions may bias you toward a more isolated approach, while others may bias you toward a more shared approach.

- How many prospective tenants do you expect to target? You may be nowhere near being able to estimate prospective use with authority, but think in terms of orders of magnitude: are you building an application for hundreds of tenants? Thousands? Tens of thousands? More? The larger you expect your tenant base to be, the more likely you will want to consider a more shared approach.
- How much storage space do you expect the average tenant's data to occupy? If you expect some or all tenants to store very large amounts of data, the separate-database approach is probably best. (Indeed, data storage requirements may force you to adopt a separate-database model anyway. If so, it will be much easier to design the application that way from the beginning than to move to a separate-database approach later on.)
- How many concurrent end users do you expect the average tenant to support? The larger the number, the more appropriate a more isolated approach will be to meet end-user requirements.
- Do you expect to offer any per-tenant value-added services, such as per-tenant backup and restore capability? Such services are easier to offer through a more isolated approach.

3. How does it run?

If you continually gauge database performance from a strictly database-internal point of view, you are missing the boat. Begin watching your database end users and performance will begin to take on a whole new meaning.

4. Who are the people involved in this database application? What are their functions?

From a Discoverer end user's point of view, the ability to schedule workbooks is useful for:

- reports that take a long time to run
- reports that have to run at regular intervals

For example, a Discoverer end user might want to run a report that they know will take a long time to complete. The user can schedule the report to run overnight and have the results ready to view the next morning.

From a Discoverer manager's point of view, workbook scheduling is useful to prevent long-running queries from adversely affecting system performance. You can force users to schedule workbooks (either all workbooks, or only those workbooks that will exceed a predicted time that you specify), and you can further specify the time periods that scheduled workbooks are permitted to run.

5. Where it can be implemented?

In consultation with all potential users of the database, a database designer's first step is to draw up a data requirements document. The requirements document contains a concise and non-technical summary of what data items will be stored in the database, and how the various data items relate to one another. Taking the 'data requirements document', further analysis is done to give meaning to the data items, e.g. define the more detailed attributes of the data and define constraints if needed. The result of this analysis is a 'preliminary specifications' document (Batini et al., 1986). Taking the specifications document, the database designer models how the information is viewed by the database system, how it is processed, and how it is conveyed to the end user. In the implementation design phase, the conceptual design is translated into a more low-level, DBMS-specific design.

6. Why it is needed?

Lack of familiarity with database design methods could prevent many end users from effectively implementing their database management system packages. An inexpensive solution would be for end users to learn required database design skills from software tutors tailored to their needs. This research describes two tutors developed to teach these skills to end users. The tutors were based on a modified Entity-Relationship database design method. They improved an end user's natural learning process by incorporating design principles and facilitators. Empirical comparison of the tutors tested the teaching effectiveness of the facilitators. The results lead to recommendations for closing the gap between skills required and skills learned by end users in database design. Development of tutors that teach specific database design skills, irrespective of the software package used in implementation, has important implications for practitioners and researchers.

7. What are the examples of these database applications?

- Oracle

Federated database and multi-database

1. What is this database application?

A federated database is an integrated database that comprises several distinct databases, each with its own DBMS. It is handled as a single database by a federated database management system (FDBMS), which transparently integrates multiple autonomous DBMSs, possibly of different types (which makes it a heterogeneous database), and provides them with an integrated conceptual view. The constituent databases are interconnected via a computer network, and may be geographically decentralized.

Sometimes the term multi-database is used as a synonym for federated database, though it may refer to a less integrated (e.g., without an FDBMS and a managed integrated schema) group of databases that cooperate in a single application. In this case, middleware for distribution is typically used, which typically includes an atomic commit protocol (ACP), e.g., the two-phase commit protocol, to allow distributed (global) transactions (vs. local transactions confined to a single DBMS) across the participating databases.

2. What is the architectural design of this database?

All systems need to evolve over time. In a federated system, new sources may be needed to meet the changing needs of the users' business. IBM makes it easy to add new sources. The federated database engine accesses sources via a software component known as a wrapper. Accessing a new type of data source is done by acquiring or creating a wrapper for that source. The wrapper architecture enables the creation of new wrappers. Once a wrapper exists, simple data definition (DDL) statements allow sources to be dynamically added to the federation without stopping ongoing queries or transactions.

Any data source can be wrapped. IBM supports the ANSI SQL/MED standard (MED stands for Management of External Data). This standard documents the protocols used by a federated server to communicate with external data sources. Any wrapper written to the SQL/MED interface can be used with IBM's federated database. Thus wrappers can be written by third parties as well as IBM, and used in conjunction with IBM's federated database.

3. How does it run?

Composite data virtualization lets you easily build and run federated views. With Composite you can create a reusable federated view to model, access, federate, and deliver data from multiple relational and non-relational sources. The Composite Studio, with its easy-to-learn, point-and-click development environment and automated code generation tools, greatly simplifies federated view building. With Composite-provided APIs, you can include data from multiple relational databases and from non-relational sources.
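The wrapper idea described above, where each source is reached through its own adapter and a federated view combines them, can be sketched in a few lines. This is a simplified illustration only, not IBM's SQL/MED wrapper interface or Composite's API: the classes `SqliteWrapper`, `CsvWrapper` and `Federation`, and the sample table and file contents, are all hypothetical. The sketch assumes every wrapper exposes a single `rows()` method.

```python
import csv
import io
import sqlite3


class SqliteWrapper:
    """Exposes one table of a SQLite database as a stream of dict rows."""

    def __init__(self, connection, table):
        self.connection = connection
        self.table = table  # assumed to be a trusted identifier, not user input

    def rows(self):
        cursor = self.connection.execute(f"SELECT * FROM {self.table}")
        columns = [column[0] for column in cursor.description]
        for record in cursor:
            yield dict(zip(columns, record))


class CsvWrapper:
    """Exposes CSV text (a stand-in for a file source) as a stream of dict rows."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        for record in csv.DictReader(io.StringIO(self.text)):
            yield dict(record)


class Federation:
    """Holds named wrappers; sources can be added at any time without touching the others."""

    def __init__(self):
        self._wrappers = {}

    def add_source(self, name, wrapper):
        self._wrappers[name] = wrapper

    def view(self):
        """Federated view: every row from every source, tagged with where it came from."""
        for name, wrapper in self._wrappers.items():
            for row in wrapper.rows():
                yield {"source": name, **row}


crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (name TEXT, city TEXT)")
crm.execute("INSERT INTO customers VALUES ('Alice', 'Paris')")

federation = Federation()
federation.add_source("crm", SqliteWrapper(crm, "customers"))
federation.add_source("leads", CsvWrapper("name,city\nBob,Oslo\n"))

combined = list(federation.view())
print(combined)
```

Because the federation only depends on `rows()`, supporting a new kind of source means writing one more wrapper class, which mirrors the extensibility argument made in the text.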

Composite sources can include a wide variety of files, including Excel and other formats, application data such as from SAP, and even XML sources, without worrying about difficult connections, transforms, or other barriers. The Composite Information Server stores these views, making them available at runtime to multiple consuming applications such as BI or portals via popular standards including JDBC, ODBC, and ADO.NET. When run, Composite optimizes the query across all the sources required, leveraging source system resources and many other optimization techniques, to achieve performance levels that the vendor describes as unmatched by its competitors.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Fundamental to the difference between an MDBS and an FDBS is the concept of autonomy. It is important to understand the aspects of autonomy for component databases and how they can be addressed when a component DBS participates in an FDBS. Four kinds of autonomy are addressed:

- Design autonomy refers to the ability of a component DBS to choose its own design, irrespective of data, query language or conceptualization, and functionality of the system implementation.
- Communication autonomy refers to the ability of a DBMS to decide whether or not to communicate with other DBMSs.
- Execution autonomy allows a component DBMS to control the operations requested by local and external operations.
- Association autonomy gives a component DBS the power to disassociate itself from a federation, which means the FDBS can operate independently of any single DBS.

Heterogeneities in an FDBS are primarily due to design autonomy.

The ANSI/X3/SPARC Study Group outlined a three-level data description architecture, the components of which are the conceptual schema, internal schema and external schema of databases.

The three-level architecture is, however, inadequate for describing the architecture of an FDBS. It was therefore extended to support the three dimensions of the FDBS, namely Distribution, Autonomy and Heterogeneity. The five-level schema architecture is explained below.

6. Why it is needed?

The five-level schema architecture includes the following:

- Local Schema is the conceptual schema expressed in the primary data model of the component DBMS.
- Component Schema is derived by translating the local schema into a model called the canonical data model or common data model. Component schemas are useful when semantics missed in the local schema are incorporated in the component. They help in the integration of data for a tightly coupled FDBS.
- Export Schema represents a subset of a component schema that is available to the FDBS. It may include access control information regarding its use by specific federation users. Export schemas help in managing the flow of control of data.
- Federated Schema is an integration of multiple export schemas. It includes information on data distribution that is generated when integrating export schemas.
- External Schema defines a schema for a user/application or a class of users/applications.

While accurately representing the state of the art in data integration, the five-level schema architecture above does suffer from a major drawback, namely an IT-imposed look and feel. Modern data users demand control over how data is presented; their needs are somewhat in conflict with such bottom-up approaches to data integration.

7. What are the examples of these database applications?

- IBM
- OBIEE

Graph database

1. What is this database application?

A graph database is a kind of NoSQL database that uses graph structures with nodes, edges, and properties to represent and store information. General graph databases that can store any graph are distinct from specialized graph databases such as triplestores and network databases.

2. What is the architectural design of this database?

Graph database is a conference that is organized by the community, for the community. The result is a high quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. Graph database is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.

3. How does it run?

With the success of Neo4j as a graph database in the NoSQL revolution, it's interesting to see another graph database, HyperGraphDB, in the mix. Their quick blurb on HyperGraphDB says it is a general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects; it can also be used as an embedded object-oriented database for projects of all sizes. From the NoSQL Archive, the summary on HyperGraphDB is:

API: Java (and Java Langs)
Written in: Java
Query Method: Java or P2P
Replication: P2P
Concurrency: STM
Misc: Open-Source, Especially for AI and Semantic Web

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

The following is a list of several well-known graph database projects:

• AllegroGraph - a scalable, high-performance RDF and graph database.
• Bigdata - a highly scalable RDF/graph database capable of 10B+ edges on a single node or clustered deployment for very high throughput.
• CloudGraph - a disk- and memory-based, fully transactional .NET graph database that uses graphs and key/value pairs to store data.
• Cytoscape - open-source platform, outgrowth of bioinformatics.
• DEX - a high-performance graph database from Sparsity Technologies, a technology transition company from DAMA-UPC.
• Filament - graph persistence framework and associated toolkits based on a navigational query style.
• GraphBase - a customizable, distributed, small-footprint, high-performance graph store with a rich tool set from FactNexus.
• Graphd - the proprietary backend of Freebase.
• Horton - a graph database from Microsoft Research Extreme Computing Group (XCG) based on the cloud programming infrastructure Orleans.
• HyperGraphDB - an open-source (LGPL) graph database supporting generalized hypergraphs where edges can point to other edges.
• InfiniteGraph - a highly scalable, distributed and cloud-enabled commercial product with flexible licensing for startups.
• InfoGrid - an open-source / commercial (AGPLv3, free for small entities) graph database with web front end and configurable storage engines (MySQL, PostgreSQL, Files, Hadoop).
• Neo4j - an open-source / commercial (GPLv3 community edition, AGPLv3 advanced and enterprise edition) graph database.
• OrientDB - a high-performance open source document-graph database.
• OQGRAPH - graph computation engine (GPLv2 licensed) for MySQL, MariaDB and Drizzle.
• sones GraphDB - an open-source / commercial (AGPLv3) graph database and universal access layer (funded by Deutsche Telekom AG).
• VertexDB - high performance graph database server that supports automatic garbage collection.
• Virtuoso Universal Server - a clustered high performance and scalable RDF graph database server.
• R2DF - framework for ranked path queries over weighted RDF graphs.

6. Why it is needed?

Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications. They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad-hoc and changing data with evolving schemas.

Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements. Graph databases are a powerful tool for graph-like queries, for example computing the shortest path between two nodes in the graph. Other graph-like queries can be performed over a graph database in a natural way (for example, computing a graph's diameter or detecting communities).

7. What are the examples of these database applications?

• OQGRAPH
• VertexDB
• R2DF

Hypermedia databases

1. What is this database application?

The World Wide Web can be thought of as a database, albeit one spread across millions of independent computing systems. Web browsers "process" this data one page at a time, while Web crawlers and other software provide the equivalent of database indexes to support search and other activities.

An extendable hypermedia system has a hypermedia data base for storing hypermedia data; a data processor for modifying the data stored in the hypermedia data base and for adding new types of data; a view processor for modifying the view corresponding to the data processed by the data processor and for adding new types of view; a display for showing the data; and a window system for displaying the data corresponding to the view processed by the view processor and for informing the view processor of data entered on the display. In such a hypermedia system a programmer can arbitrarily add both data types and view types.
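The idea that a programmer can add both data types and view types can be sketched as a small registry. This is a hypothetical illustration only: the class `HypermediaSystem`, its method names, and the "link" type are all invented for this example and are not taken from any particular hypermedia product.

```python
class HypermediaSystem:
    """Toy model of an extendable hypermedia system (all names are hypothetical)."""

    def __init__(self):
        self.store = {}       # hypermedia data base: node id -> (type name, payload)
        self.data_types = {}  # data processor side: type name -> validator function
        self.view_types = {}  # view processor side: type name -> renderer function

    def add_data_type(self, name, validator):
        """Register a new kind of data together with a check for valid payloads."""
        self.data_types[name] = validator

    def add_view_type(self, name, renderer):
        """Register how data of the given type is turned into displayable text."""
        self.view_types[name] = renderer

    def put(self, node_id, type_name, payload):
        """Store a node after checking it against its registered data type."""
        if type_name not in self.data_types:
            raise KeyError(f"unknown data type: {type_name}")
        if not self.data_types[type_name](payload):
            raise ValueError(f"invalid payload for type {type_name}")
        self.store[node_id] = (type_name, payload)

    def display(self, node_id):
        """Render a stored node with the view registered for its type."""
        type_name, payload = self.store[node_id]
        return self.view_types[type_name](payload)


def build_example():
    system = HypermediaSystem()
    # A programmer adds a new data type and a matching view type at run time.
    system.add_data_type("link", lambda p: "label" in p and "target" in p)
    system.add_view_type("link", lambda p: f"[{p['label']}] -> {p['target']}")
    system.put("n1", "link", {"label": "Home", "target": "index.html"})
    return system


if __name__ == "__main__":
    print(build_example().display("n1"))
```

Keeping validators and renderers in separate tables mirrors the split between the data processor and the view processor described above.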

2. What is the architectural design of this database?

Hypermedia may be developed a number of ways. Any programming tool can be used to write programs that link data from internal variables and nodes for external data files. Multimedia development software such as Adobe Flash, Adobe Director, Macromedia Authorware, and MatchWare Mediator may be used to create stand-alone hypermedia applications, with emphasis on entertainment content. Some database software such as Visual FoxPro and FileMaker Developer may be used to develop stand-alone hypermedia applications, with emphasis on educational and business content management.

3. How does it run?

Hypermedia applications may be developed on embedded devices for the mobile and the digital signage industries using the Scalable Vector Graphics (SVG) specification from W3C (World Wide Web Consortium). Software applications such as Ikivo Animator and Inkscape simplify the development of hypermedia content based on SVG. Embedded devices such as iPhone natively support SVG specifications and may be used to create mobile and distributed hypermedia applications.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Hyperlinks may also be added to data files using most business software via the limited scripting and hyperlinking features built in. Documentation software such as the Microsoft Office Suite and LibreOffice allow for hypertext links to other content within the same file, other external files, and URL links to files on external file servers. For more emphasis on graphics and page layout, hyperlinks may be added using most modern desktop publishing tools. This includes presentation programs, such as Microsoft PowerPoint and LibreOffice Impress, add-ons to print layout programs such as Quark Immedia, and tools to include hyperlinks in PDF documents such as Adobe InDesign for creating and Adobe Acrobat for editing. Hyper Publish is a tool specifically designed and optimized for hypermedia and hypertext management. Any HTML editor may be used to build HTML files, accessible by any web browser. CD/DVD authoring tools such as DVD Studio Pro may be used to hyperlink the content of DVDs for DVD players or web links when the disc is played on a personal computer connected to the internet.

6. Why it is needed?

The development and subsequent rapid advance of electronic computers in the second half of the twentieth century led to the development of database models that are far more efficient for dealing with large volumes of information than flat databases. The most notable is the relational model, which was proposed by E. F. Codd in 1970. Codd, a researcher at IBM, criticized existing data models for their inability to distinguish between the abstract descriptions of data structures and descriptions of the physical access mechanisms. A relational database is a way of organizing data such that it appears to the user to be stored in a series of interrelated tables. Interest in this model was initially confined to academia, perhaps because the theoretical basis is not easy to understand, and thus the first commercial products, Oracle and DB2, did not appear until around 1980. Subsequently, relational databases became the dominant type for high performance applications because of their efficiency, ease of use, and ability to perform a variety of useful tasks that had not been originally envisioned.

7. What are the examples of these database applications?

• Internet
• Inkscape

In-memory database

1. What is this database application?

An in-memory database (IMDB, also main memory database or MMDB) is a database that primarily resides in main memory, but is typically backed up by non-volatile computer data storage. Main memory databases are faster than disk databases. Accessing data in memory reduces the I/O reading activity when, for example, querying the data. In applications where response time is critical, such as telecommunications network equipment, main memory databases are often used.

An in-memory database is a database that runs entirely in main memory, without touching a disk. Often they run as an embedded database: created when a process starts, running embedded within that process, and destroyed when the process finishes.
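As a minimal sketch of a database that lives only in main memory and disappears with its process, the example below uses Python's standard `sqlite3` module with the special `:memory:` database name. SQLite is only one possible choice here, and the `calls` table and the function names are assumptions made for illustration.

```python
import sqlite3


def count_active_calls():
    """Create a throwaway in-memory database, fill it, and query it."""
    connection = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        connection.execute(
            "CREATE TABLE calls (id INTEGER PRIMARY KEY, caller TEXT, active INTEGER)"
        )
        connection.executemany(
            "INSERT INTO calls (caller, active) VALUES (?, ?)",
            [("alice", 1), ("bob", 0), ("carol", 1)],
        )
        (count,) = connection.execute(
            "SELECT COUNT(*) FROM calls WHERE active = 1"
        ).fetchone()
        return count
    finally:
        connection.close()  # the in-memory database is discarded here


def table_names(connection):
    """List the tables that exist in the given SQLite connection."""
    rows = connection.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(count_active_calls())
    # A second connection to ":memory:" is a brand-new, empty database.
    fresh = sqlite3.connect(":memory:")
    print(table_names(fresh))
    fresh.close()
```

Because every `:memory:` connection starts empty, any data that must survive a restart has to be rebuilt or loaded from non-volatile storage.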

2. What is the architectural design of this database?

There are two routes people seem to take to an in-memory database for testing. The first one is to use a SQL in-memory database library. In Java-land the popular one seems to be HSQLDB. Elsewhere SQLite and Firebird come up. The nice thing about these tools is that they allow you to use regular SQL to query them. One issue is that they may not support quite the same dialects or have all the features of the target database. You can do something similar by running a file-based database on a RAM disk, which allows you to keep the test and production deployments closer to each other.

Another route is to abstract all the database access behind a Repository. Then you can swap out the database with regular in-memory data structures. Often just a bunch of hash tables for the entry points to the object graph is enough. One of the strengths of the repository approach is that it gives you a consistent way to access (and stub out) non-SQL data sources too. This means that your object-relational mapping system is also hidden inside the repository.

3. How does it run?

While most people think of databases as large disk-centered creatures, there's a small but busy world of in-memory databases out there. There are applications which need fast access to some sort of managed data which doesn't need to be persisted, either because it doesn't change or because it can be reconstructed (imagine a routing table in a router, or an EventPoster). Yet even developers of traditional database systems can find an in-memory database useful, particularly for testing.

When you're developing an enterprise application, tests that hit the database can be a huge time drain when running your test suites. Since most ThoughtWorkers get the shakes if they haven't had a green bar recently, this makes a big difference to us. Switching to an in-memory database can have an order of magnitude effect which can dramatically reduce build times.

An IMDB usually features a strict memory-based architecture and direct data manipulation, storing and manipulating data in main memory.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Indeed, a few people actively dislike using SQL in-memory databases under the belief that they encourage spreading either SQL or object-relational mapper code around the domain model. Running SQL in-memory may remove much of the pain of slow access but acts as a deodorant to cover the smell of a missing repository.

6. Why it is needed?

Testing is the main driver thus far, but I think there's more to come from in-memory databases. Memory sizes are now enough that many application databases can be loaded into memory. I've run into a few cases where people have used in-memory databases for very high performance applications. A difference here is that these experiences tend to be with niche commercial databases, while for testing people seem to prefer open-source. Such styles can be very scalable and have high performance in cases where you have lots of readers and few writers.

If you use an approach that keeps an event log of all changes to your application state, you can treat the in-memory database as a cache of the result of applying the log, rebuilding it and snapshotting it as you need. Prevayler got a lot of attention for taking this kind of approach. People I know who tried it found its tight coupling to the in-memory objects and lack of migration tools caused serious problems. But I think the approach of persistent change logs as systems of record is a fertile ground to explore in the future.

7. What are the examples of these database applications?

• Java DB 10.5.1.1
• ASE
• Berkeley DB
• Adaptive Server Enterprise (ASE) 15.5
• Apache Derby
• Altibase
• CSQL
• BlackRay

• Eloquera
• eXtremeDB

Knowledge base database

1. What is this database application?

A knowledge base is a special kind of database for knowledge management, providing the means for the computerized collection, organization, and retrieval of knowledge. It can also be described as a collection of data representing problems with their solutions and related experiences.

2. What is the architectural design of this database?

Oracle Database supports 64-bit file I/O to allow use of files larger than 4 gigabytes (GB). In addition, physical and logical raw files are supported as data, log, and control files to support Oracle Real Application Clusters (RAC) on Windows and for those cases where performance needs to be maximized.

With Oracle Database 11g Release 1 (11.1), instead of using the operating system kernel NFS client, you can configure the database to access NFS V3 servers directly using an internal Direct NFS Client. Through this integration, the database is able to optimize the I/O path between itself and the NFS server, providing significantly superior performance. In addition, Direct NFS Client simplifies and optimizes the NFS client configuration for database workloads. Direct NFS Client currently supports up to four parallel network paths to provide scalability and high availability. It delivers optimized performance by automatically load balancing requests across all specified paths. If one network path fails, Direct NFS Client will reissue commands over any remaining paths, ensuring fault tolerance and high availability.

3. How does it run?

A knowledge-based system (KBS) is a system that uses artificial intelligence techniques in problem-solving processes to support human decision-making, learning, and action.

A comprehensive text and resource, Knowledge-Based Systems provides access to the most current information in KBS and new artificial intelligences, as well as neural networks, fuzzy logic, genetic algorithms, and soft systems. Ideal for advanced-undergraduate and graduate students, as well as business professionals, this text is designed to help users develop an appreciation of KBS and their architecture and understand a broad variety of knowledge-based techniques for decision support and planning. It assumes basic computer science skills and a math background that includes set theory, relations, elementary probability, and introductory concepts of artificial intelligence. Each of the 12 chapters is designed to be modular, providing instructors with the flexibility to model the book to their own course needs. Exercises are incorporated throughout the text to highlight certain aspects of the material presented and to stimulate thought and discussion.

Knowledge bases are essentially closed or open information repositories and can be categorised under two main headings:

• Machine-readable knowledge bases store knowledge in a computer-readable form, usually for the purpose of having automated deductive reasoning applied to them. They contain a set of data, often in the form of rules that describe the knowledge in a logically consistent manner. An ontology can define the structure of stored data: what types of entities are recorded and what their relationships are. Logical operators, such as And (conjunction), Or (disjunction), material implication and negation, may be used to build it up from simpler pieces of information. Consequently, classical deduction can be used to reason about the knowledge in the knowledge base. Some machine-readable knowledge bases are used with artificial intelligence, for example as part of an expert system that focuses on a domain like prescription drugs or customs law. Such knowledge bases are also used by the semantic web.
• Human-readable knowledge bases are designed to allow people to retrieve and use the knowledge they contain. They are commonly used to complement a help desk or for sharing information among employees within an organization. They might store troubleshooting information, articles, white papers, user manuals, knowledge tags, or answers to frequently asked questions. Typically, a search engine is used to locate information in the system, or users may browse through a classification scheme.

Knowledge bases can exist on both computers and mobile phones in a hypertext format.

• A text-based system that can include groups of documents, including hyperlinks between them, is known as a hypertext system. Hypertext systems support the decision process by relieving the user of the significant effort it takes to relate and remember things.
• Knowledge base analysis and design (also known as KBAD) is an approach that allows people to conduct analysis and design in a way that results in a knowledge base, which can later be used to make informed decisions. This approach was first implemented by Dr. Steven H. Dam.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database; subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Electronic commerce (e-commerce) has been sweeping the globe. For a long time, information technology was mainly passive; users went online and downloaded data. Today, information technology offers interactive and communication functions, which makes e-commerce more than a fad. Now, it has become a part of the way people live their lives.

Dynamic Web sites contain Web pages that display constantly changing content, a technique that is an important foundation of e-commerce. There are two ways to achieve dynamic content generation: programmatic content generation and template-based content generation. Java servlets fall into the first category, while JavaServer Pages (JSP) belong to the second category. JSP is a simple but powerful technology used to generate dynamic HTML on the server side. JSP pages are a direct extension of Java servlets and provide a way to separate content generation from content presentation. JSP is more convenient to write by using conventional HTML writing tools and easier to modify because only the dynamic parts need to be changed when updating a web page. Furthermore, JSP with Java is more flexible because both are platform independent.

In this project, Tomcat was adopted as the JSP engine, and this thesis designs an online health consulting and shopping center that can provide users with personal dietary assessment, special diets for patients, professional knowledge of nutrition, nutritional news, the posting of questions, and shopping for nutritional supply products. On the other hand, staff can easily manage different kinds of information in the database of this health center. In short, the project connects to a Microsoft Access database using a type 1 JDBC-ODBC Bridge, plus ODBC driver.

6. Why it is needed?

In general, a knowledge base is a centralized repository for information: a public library, a database of related information about a particular subject, and whatis.com could all be considered to be examples of knowledge bases. In relation to information technology (IT), a knowledge base is a machine-readable resource for the dissemination of information, generally online or with the capacity to be put online. An integral component of knowledge management systems, a knowledge base is used to optimize information collection, organization, and retrieval for an organization, or for the general public.

A well-organized knowledge base can save an enterprise money by decreasing the amount of employee time spent trying to find information about, among myriad possibilities, tax laws or company policies and procedures. As a customer relationship management (CRM) tool, a knowledge base can give customers easy access to information that would otherwise require contact with an organization's staff; as a rule, this capacity should make the interaction simpler for both the customer and the organization. A number of software applications are available that allow users to create their own knowledge bases, either separately (these are usually called knowledge management software) or as part of another application, such as a CRM package.
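Machine-readable knowledge bases, as described earlier in this section, hold rules that automated deduction can act on. A minimal sketch of that idea is forward chaining over rules of the form "if all premises hold, then the conclusion holds" (conjunction plus implication). The rule set below is a made-up help-desk example, and the function name `forward_chain` is an assumption for illustration, not the API of any knowledge base product.

```python
def forward_chain(facts, rules):
    """Derive every fact that follows from the starting facts and the rules.

    Each rule is a pair (premises, conclusion): when all premises are known,
    the conclusion becomes known as well.
    """
    known = set(facts)
    changed = True
    while changed:  # keep sweeping until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical troubleshooting rules for a help-desk knowledge base.
RULES = [
    (("printer_offline", "cable_connected"), "restart_spooler"),
    (("restart_spooler",), "print_test_page"),
]

if __name__ == "__main__":
    print(sorted(forward_chain({"printer_offline", "cable_connected"}, RULES)))
```

A real expert system would add negation, disjunction, and an explanation of which rules fired; the loop above only shows the basic deduction step.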

In general, a knowledge base is not a static collection of information, but a dynamic resource that may itself have the capacity to learn, as part of an artificial intelligence (AI) expert system, for example. According to the World Wide Web Consortium (W3C), in the future the Internet may become a vast and complex global knowledge base known as the Semantic Web.

7. What are the examples of these database applications?

• PaperCut KB

Operational database

1. What is this database application?

In data warehousing, the operational database is the one accessed by an operational system to carry out the regular operations of an organization. Operational databases usually use an OLTP (online transaction processing) database, which is optimized for faster transaction processing: inserting, deleting, and updating data. On the other hand, data warehouses use an OLAP (online analytical processing) database, which is optimized for faster queries. An operational database is usually put on a separate machine from the data warehouse to increase performance.

These databases store detailed data about the operations of an organization. They are typically organized by subject matter and process relatively high volumes of updates using transactions. Essentially every major organization on earth uses such databases. Examples include customer databases that record contact, credit, and demographic information about a business's customers; personnel databases that hold information such as salary, benefits, and skills data about employees; enterprise resource planning databases that record details about product components and parts inventory; and financial databases that keep track of the organization's money, accounting and financial dealings.

2. What is the architectural design of this database?

An operational database contains enterprise data which are up to date and modifiable. In an enterprise data management system, an operational database could be said to be the opposite counterpart of a decision support database, which contains non-modifiable data that are extracted for the purpose of statistical analysis. For example, a decision support database provides data so that the average salary of many different kinds of workers can be determined, while the operational database contains the same data, which would be used to calculate the pay checks of the workers depending on the number of days that they have reported in any given period of time.

An operational database, as the name implies, is the database that is currently and progressively in use, capturing real-time data and supplying data for real-time computations and other analyzing processes. For example, an operational database is the one used for taking orders and fulfilling them in a store, whether it is a traditional store or an online store. Other areas in business that use an operational database include catalog fulfillment systems and any other point of sale system used in retail stores. An operational database is used for keeping track of payments and inventory. It takes information and amounts from credit cards, and accountants use the operational database because it must balance up to the last penny.

3. How does it run?

An operational database is also used to support IRS tax filings and regulations, which is why it is sometimes managed by the IT staff for the finance and operations groups in a business organization. Companies can seldom run successfully without using an operational database, as this database is based on accounts and transactions.

Because of the very dynamic nature of an operational database, there are certain issues that need to be addressed appropriately. An operational database can grow very fast in size and bulk, so database administrators and IT analysts must purchase high-powered computer hardware and top-notch database management systems. Most business organizations have regulations and requirements that dictate storing data for longer periods of time for operation. This can create an even more complex setup in relation to database performance and usability. With ever increasing operational data volume, operational databases will have additional stress on the processing of transactions, slowing things down.

As a general trend, the more data there are in the operational database, the less efficient the transactions running against the database tend to be. There are several reasons for this. One of the most obvious is that table scans need to reference more pages of data in order to give results. Indexes also grow in size to support larger data volumes, and with this increase, access by the index can degrade as there are more levels that need to be traversed. Some IT professionals address this problem with solutions that offload older data to data stores for archiving. Operational databases are just one part of the entire enterprise data management system, and some of the data that need to be archived go directly to the data warehouse.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate

Operational databases have the ability to flag specific information that may need to be retrieved on a continuous basis. and previous proposal information. For instance. Storing information in a centralized area can increase retrieval time for users. personal employee information.g. the management of that information becomes just as important. personal employee information. medical doctors. 3. Operational databases have the ability to flag specific information that may need to be retrieved on a continuous basis. insurance people. provide simultaneous read/write requests through pre-defined queries. For instance. These databases allow a business to enter. Known by different names. accountants.. The overall idea of using an operational database is to expedite the retrieval of large amounts of information with peak efficiency. but need not know nor understand the underlying DBMS. The overall idea of using an operational database is to expedite the retrieval of large amounts of information with peak efficiency. Storing information in a centralized area can increase retrieval time for users. Application's end-users (e. 5. Operational databases can store different types of information such as training status. Furthermore.) . Furthermore. Users depend on the accuracy of this information. gather. an operational database has the ability to be modified. personal employee information.These people know the application and its end-user interfaces. transaction database could mean that the information stored focuses on financial information even though it may not. Operational databases can store different types of information such as training status. and retrieve specific company information. which may include the DBMS inside (see Embedded database. Operational databases are important when information is needed quickly. provide simultaneous read/write requests through pre-defined queries. Operational databases also have other features that focus on the business environment. 
Operational databases also have other features that focus on the business environment. Oracle can store larger amounts of information than Access. Thus. An important feature of storing information in an operational database is the ability to share information across the company. Storing information in a centralized area can increase retrieval time for users. users can misunderstand what the database is supposed to be used for within a business. they are only indirectly involved with it. Another feature of an operational database is how much information can be stored that pertains to a business. Why it is needed? Operational databases can store different types of information such as training status. Having someone continually monitor the information being input into the database will make the information retrieved even more valuable because it will be accurate. Operational databases need continuous management. For instance. though they are the intended and main beneficiaries of a DBMS. and previous proposal information. an operational database has the ability to be modified. subject to proper DBMS licensing). . 6.product. For instance. Since day-to-day information is important to a business. Operational databases can be known by another name production database. Where it can be implemented? Operational Databases are very important to a business. and previous proposal information. or sold separately as an add-on to the DBMS. etc. Depending on the type of operational database being used will determine how much information it can hold.
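As a rough illustration of flagged records and pre-defined read/write queries in an operational database, the following minimal sketch uses Python's built-in sqlite3 module with an in-memory database. The table name employee, its columns, and the sample rows are hypothetical and chosen only for this example. They are not taken from any particular product.

```python
import sqlite3

# Pre-defined (parameterized) queries. The table and column names are
# hypothetical and exist only for this illustration.
INSERT_EMPLOYEE = (
    "INSERT INTO employee (name, training_status, flagged) VALUES (?, ?, ?)"
)
UPDATE_STATUS = "UPDATE employee SET training_status = ? WHERE name = ?"
SELECT_FLAGGED = (
    "SELECT name, training_status FROM employee "
    "WHERE flagged = 1 ORDER BY name"
)


def build_demo_db():
    """Create an in-memory database holding a few sample employee rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee ("
        "name TEXT PRIMARY KEY, "
        "training_status TEXT NOT NULL, "
        "flagged INTEGER NOT NULL DEFAULT 0)"
    )
    rows = [("Ana", "pending", 1), ("Ben", "complete", 0), ("Cruz", "pending", 1)]
    with conn:  # commits on success, rolls back on error
        conn.executemany(INSERT_EMPLOYEE, rows)
    return conn


def flagged_employees(conn):
    """Read request: return the rows flagged for continuous retrieval."""
    return conn.execute(SELECT_FLAGGED).fetchall()


conn = build_demo_db()
with conn:
    # Write request: the stored data is modifiable and reflects the latest update.
    conn.execute(UPDATE_STATUS, ("complete", "Ana"))
print(flagged_employees(conn))
```

Because the queries are fixed in advance and only the parameters change, the same statements can be reused for every request, which is one plausible reading of "pre-defined queries" in the text above.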

Operational Database is the database-of-record, consisting of system-specific reference data and event data belonging to a transaction-update system. It may also contain system control data such as indicators, flags, and counters. The operational database is the source of data for the data warehouse. It contains detailed data used to run the day-to-day operations of the business. The data continually changes as updates are made, and reflects the current value of the last transaction.

7. What are the examples of these database applications?
- MS SharePoint Server

Parallel database

1. What is this database application? A parallel database, run by a parallel DBMS, seeks to improve performance through parallelization of various operations, such as loading data, building indexes and evaluating queries. Parallel databases improve processing and input/output speeds by using multiple central processing units (CPUs) (including multicore processors) and storage, such as disks, in parallel. Centralized and client server database systems are not powerful enough to handle such applications. In parallel processing, many operations are performed simultaneously, as opposed to serial, sequential processing, in which the computational steps are performed one after another with no time overlap. Although data may be stored in a distributed fashion, the distribution is governed solely by performance considerations.

The major parallel DBMS architectures (which are induced by the underlying hardware architecture) are:
- Shared memory architecture, where multiple processors share the main memory space, as well as other data storage.
- Shared disk architecture, where each processing unit (typically consisting of multiple processors) has its own main memory, but all units share the other storage.
- Shared nothing architecture, where each processing unit has its own main memory and other storage.
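To make the shared nothing idea more concrete, here is a minimal sketch of hash partitioning, one common way to decide which processing unit stores a given row. The function names, the choice of CRC32 as the hash, and the sample rows are assumptions made for illustration. The source text does not say how rows are distributed.

```python
import zlib
from collections import defaultdict


def node_for_key(key, num_nodes):
    """Map a row key to a node index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition_rows(rows, num_nodes):
    """Assign each (key, value) row to exactly one node's private storage."""
    nodes = defaultdict(list)
    for key, value in rows:
        nodes[node_for_key(key, num_nodes)].append((key, value))
    return dict(nodes)


rows = [("acct-1", 100), ("acct-2", 250), ("acct-3", 75), ("acct-4", 310)]
layout = partition_rows(rows, num_nodes=3)
for node_id in sorted(layout):
    print(node_id, layout[node_id])
```

Each node can then scan or index its own partition without touching the others, which is why the distribution can be chosen purely for performance.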

2. What is the architectural design of this database? A parallel processing system has the following characteristics:
- Each processor in a system can perform tasks concurrently.
- Tasks may need to be synchronized.
- Nodes usually share resources, such as data, disks, and other devices.

3. How does it run? Parallel processing divides a large task into many smaller tasks, and executes the smaller tasks concurrently on several nodes. As a result, the larger task completes more quickly. Using this method, the following architectures are possible:
- Shared memory architecture, where multiple processors share the main memory space, as well as mass storage (e.g. hard disk drives).
- Shared disk architecture, where each node has its own main memory, but all nodes share mass storage, usually a storage area network. In practice, each node usually also has multiple processors.
- Shared nothing architecture, where each node has its own mass storage as well as main memory.
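The idea of dividing a large task into smaller tasks that run concurrently can be sketched with Python's standard concurrent.futures module. This is only an illustration under simplifying assumptions: worker threads on one machine stand in for nodes, the task is a plain sum, and the helper names split and parallel_sum are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Divide a list into at most `parts` contiguous chunks."""
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(items, workers=4):
    """Sum the chunks concurrently, then combine the partial results."""
    if not items:
        return 0
    chunks = split(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # The final combination step is the point where the tasks synchronize.
    return sum(partial_sums)


print(parallel_sum(list(range(1, 101))))  # 5050
```

In a real parallel DBMS the smaller tasks would run on separate processors or machines. Threads are used here only to keep the sketch self-contained.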

Note: A node is a separate processor, often on a separate machine. Multiple processors, however, can reside on a single machine.

Some tasks can be effectively divided, and thus are good candidates for parallel processing. Other tasks, however, do not lend themselves to this approach. For example, in a bank with only one teller, all customers must form a single queue to be served. With two tellers, the task can be effectively split so that customers form two queues and are served twice as fast, or they can form a single queue to provide fairness. This is an instance in which parallel processing is an effective solution. By contrast, if the bank manager must approve all loan requests, parallel processing will not necessarily speed up the flow of loans. No matter how many tellers are available to process loans, all the requests must form a single queue for bank manager approval. No amount of parallel processing can overcome this built-in bottleneck to the system.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database, subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? Parallel database software is often specialized, usually to serve as query processors. Since they are designed to serve a single function, however, specialized servers do not provide a common foundation for integrated operations. These include online decision support, batch reporting, data warehousing, OLTP, distributed operations, and high availability systems. Specialized servers have been used most successfully in the area of very large databases: in DSS applications, for example. Versatile parallel database software should offer excellent price/performance on open systems hardware, and be designed to serve a wide variety of enterprise computing needs. Features such as online backup, data replication, portability, interoperability, and support for a wide variety of client tools can enable a parallel server to support application integration, distributed operations, and mixed application workloads.

6. Why it is needed? A variety of hardware architectures allow multiple computers to share access to data, software, or peripheral devices. A parallel database is designed to take advantage of such architectures by running multiple instances which "share" a single physical database. In appropriate applications, a parallel server can allow access to a single database by users on multiple machines, with increased performance.

Parallel database software must effectively deploy the system's processing power to handle diverse applications: online transaction processing (OLTP) applications, decision support system (DSS) applications, as well as a mixed OLTP and DSS workload. OLTP applications are characterized by short transactions which have low CPU and I/O usage. DSS applications are characterized by long transactions, with high CPU and I/O usage.

A parallel server processes transactions in parallel by servicing a stream of transactions using multiple CPUs on different nodes, where each CPU processes an entire transaction. This is an efficient approach because many applications consist of online insert and update transactions which tend to have short data access requirements. Using parallel data manipulation language, one transaction can be performed by multiple nodes. In addition to balancing the workload among CPUs, the parallel database provides for concurrent access to data and protects data integrity.

7. What are the examples of these database applications?
- Non-Uniform Memory Architecture (NUMA), which involves non-uniform memory access.
- Cluster (shared nothing + shared disk: SAN/NAS), which is formed by a group of connected computers.

Real-time database

1. What is this database application? A real-time database is a processing system designed to handle workloads whose state is constantly changing (Buchmann). This differs from traditional databases containing persistent data, mostly unaffected by time. For example, a stock market changes very rapidly and is dynamic. The graphs of the different markets appear to be very unstable and yet a database has to keep track of current values for all of the markets of the New York Stock Exchange (Kanitkar). Real-time processing means that a transaction is processed fast enough for the result to come back and be acted on right away (Capron). Real-time databases are useful for accounting, banking, law, medical records, multi-media, process control, reservation systems, and scientific data analysis (Snodgrass). As computers increase in power and can store more data, they are integrating themselves into our society and are employed in many applications.

2. What is the architectural design of this database?

Although the real-time database system may seem like a simple system, problems arise during overload when two or more database transactions require access to the same portion of the database. A transaction is usually the result of an execution of a program that accesses or changes the contents of a database (Singhal). A transaction is different from a stream because a stream only allows read-only operations, and transactions can do both read and write operations. This means that in a stream, multiple users can read from the same piece of data, but they cannot both modify it (Abbot). A database must let only one transaction operate at a time to preserve data consistency. For example, if two students demand to take the remaining spot for a section of a class and they hit submit at the same time, only one student should be able to register for it (Abbot).

Real-time databases can process these requests utilizing scheduling algorithms for concurrency control, prioritizing both students' requests in some way. Throughout this discussion, we assume that the system has a single processor, a disk based database, and a main memory pool (Haritsa).

In real-time databases, deadlines are formed and different kinds of systems respond differently to data that does not meet its deadline. In a real-time system, each transaction uses a timestamp to schedule the transactions (Abbot). A priority mapper unit assigns a level of importance to each transaction upon its arrival in the database system that is dependent on how the system views times and other priorities. The timestamp method relies on the arrival time in the system. Researchers indicate that for most studies, transactions are sporadic with unpredictable arrival times. For example, the system gives a request with an earlier deadline a higher priority and a request with a later deadline a lower priority (Haritsa). Below is a comparison of different scheduling algorithms.

- Earliest Deadline (PT = DT): The value of a transaction is not important. An example is a group of people calling to order a product.
- Highest Value (PT = 1/VT): The deadline is not important. Some transactions should get to CPU based on criticalness, not fairness. This is an example of least slack that can wait the least amount of time. If the telephone switchboards were overloaded, people who call 911 should get priority (Snodgrass).
- Value inflated deadline (PT = DT/VT): Gives equal weight to deadline and values based on scheduling. An example is registering for classes where the student selects a block of classes that he wishes to take and presses submit. In this scenario, higher priorities often take precedence. A school registration system probably uses this technique when the server receives two registration transactions. If one student had 22 credits and the other had 100 credits, the person with 100 credits would take priority (Value based scheduling).

3. How does it run? Database management systems provide tools for such organization, so in recent years there has been interest in "merging" database and real-time technology. The resulting integrated system, which provides database operations with real-time constraints, is generally called a real-time database system (RTDBS) [1]. Like a conventional database system, a RTDBS functions as a repository of data, provides efficient storage, and performs retrieval and manipulation of information. However, as a part of a real-time system, whose "tasks" are associated with time constraints, a RTDBS has the added burden of ensuring some degree of confidence in meeting the system's timing requirements.

4. Who are the people involved in this database application? What are their functions? For example, a stock market changes very rapidly and is dynamic. The graphs of the different markets appear to be very unstable and yet a database has to keep track of current values for all of the markets of the New York Stock Exchange (Kanitkar). Example applications that handle large amounts of data and have stringent timing requirements include telephone switching (e.g. translating an 800 number into an actual number), radar tracking and others. Arbitrage trading, for example, involves trading commodities in different markets at different prices. Since price discrepancies are usually short-lived, automated searching and processing of large amounts of trading information are very desirable. In order to capitalize on the opportunities, buy-sell decisions have to be made promptly, often with a time constraint so that the financial overhead in performing the trade actions is well compensated by the benefit resulting from the trade. As another example, a radar surveillance system detects aircraft "images" or "radar signatures".

5. Where it can be implemented?
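One concrete setting is the class registration scenario from the scheduling comparison above. The following minimal sketch shows how a priority mapper might apply the three formulas (PT = DT, PT = 1/VT, PT = DT/VT). It assumes that PT is the priority of a transaction, DT its deadline, and VT its value, and that a smaller PT means the transaction is served earlier. The Transaction class and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    name: str
    deadline: float  # DT: time by which the transaction should finish
    value: float     # VT: importance of the transaction, assumed > 0


def earliest_deadline(t):
    """PT = DT: only the deadline matters."""
    return t.deadline


def highest_value(t):
    """PT = 1/VT: only the value matters."""
    return 1.0 / t.value


def value_inflated_deadline(t):
    """PT = DT/VT: weighs the deadline against the value."""
    return t.deadline / t.value


def schedule(transactions, priority):
    """Return transaction names in service order (smallest PT first)."""
    return [t.name for t in sorted(transactions, key=priority)]


# Two students submit at the same time with the same deadline.
requests = [
    Transaction("student_22_credits", deadline=10.0, value=22.0),
    Transaction("student_100_credits", deadline=10.0, value=100.0),
]
print(schedule(requests, value_inflated_deadline))
```

With equal deadlines, the earliest deadline rule cannot separate the two requests, while the two value-aware rules serve the student with 100 credits first, matching the registration example.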

For example, an air-traffic control system constantly monitors hundreds of aircraft and makes decisions about incoming flight paths and determines the order in which aircraft should land based on data such as fuel, altitude, and speed. If any of this information is late, the result could be devastating. To address issues of obsolete data, the timestamp can support transactions by providing clear time references. To enter the data in the records, often a sensor or an input device monitors the state of the physical system and updates the database with new information to reflect the physical system more accurately (Abbot). This process helps ensure the latest information in time critical systems.

When designing a real-time database system, one should consider how to represent valid time and how facts are associated with the real-time system. Also, consider how to represent attribute values in the database so that process transactions and data consistency have no violations. When designing a system, it is important to consider what the system should do when deadlines are not met.

An additional way of dealing with conflict resolution in a real-time database system besides deadlines is a wait policy method. The policy avoids conflict by asking all non-requesting blocks to wait until the most essential block of data is processed (Abbot). While studies in labs have found that data-deadline based policies do not improve performance significantly, the forced wait policy can improve performance by 50 percent (Porkka). The forced wait policy may involve waiting for higher priority transactions to process in order to prevent deadlock. Another example of when data can be delayed is when a block of data is about to expire. The forced wait policy delays processing until the data is updated using new input data. The latter method helps increase the accuracy of the system and can cut down on the number of necessary processes that are aborted (Kang). Generally, relying on wait policies is not optimal (Kang).

6. Why it is needed? Real-time databases are traditional databases that use an extension to give the additional power to yield reliable responses. They use timing constraints that represent a certain range of values for which the data are valid. This range is called temporal validity. A conventional database cannot work under these circumstances because the inconsistencies between the real world objects and the data that represents them are too severe for simple modifications. An effective system needs to be able to handle time-sensitive queries, return only temporally valid data, and support priority scheduling.

7. What are the examples of these database applications?
- ADDM
- Real-Time Database Systems 2001

Spatial database

1. What is this database application?

A spatial database is a database that is optimized to store and query data that is related to objects in space, including points, lines and polygons. While typical databases can understand various numeric and character types of data, additional functionality needs to be added for databases to process spatial data types. These are typically called geometry or feature. The Open Geospatial Consortium created the Simple Features specification and sets standards for adding spatial functionality to database systems.

2. What is the architectural design of this database? Database systems use indexes to quickly look up values, and the way that most databases index data is not optimal for spatial queries. Instead, spatial databases use a spatial index to speed up database operations.

In addition to typical SQL queries such as SELECT statements, spatial databases can perform a wide variety of spatial operations. The following query types and many more are supported by the Open Geospatial Consortium:
- Spatial Measurements: Finds the distance between points, polygon area, etc.
- Spatial Functions: Modify existing features to create new ones, for example by providing a buffer around them, intersecting features, etc.
- Spatial Predicates: Allows true/false queries such as 'is there a residence located within a mile of the area we are planning to build the landfill?' (see DE-9IM)
- Constructor Functions: Creates new features with an SQL query specifying the vertices (points of nodes) which can make up lines. If the first and last vertex of a line are identical the feature can also be of the type polygon (a closed line).
- Observer Functions: Queries which return specific information about a feature such as the location of the center of a circle.
Not all spatial databases support these query types.

3. How does it run?
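As a rough illustration of how a spatial measurement and a spatial predicate are evaluated, the sketch below implements point distance, polygon area, and a "within a given distance" test in plain Python. It assumes flat (planar) coordinates measured in miles and scans every point in turn. A real spatial database would work on geometry types and use a spatial index instead of a full scan. All function names and sample coordinates are made up for this example.

```python
import math


def distance(a, b):
    """Spatial measurement: straight-line distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def polygon_area(vertices):
    """Spatial measurement: area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def any_within(points, center, radius):
    """Spatial predicate: is any point within `radius` of `center`?"""
    return any(distance(p, center) <= radius for p in points)


landfill_site = (0.0, 0.0)
residences = [(3.0, 4.0), (0.3, 0.4)]

# Is there a residence within a mile of the planned landfill?
print(any_within(residences, landfill_site, 1.0))
print(polygon_area([(0, 0), (1, 0), (1, 1), (0, 1)]))
```

The predicate returns a plain true/false answer, which is what distinguishes it from the measurement functions that return a number.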

A data warehouse system gets some data from different GDLs and integrates them in a single database. Users, instead of consulting several different GDLs, could do a one-stop initial search in this data warehouse and find the preliminary information they need, using a computerized procedure to define their needs (the system could perform a preliminary but very useful filtering of the data sets). If more precise information is needed about the successful data sets, users could go to the legacy GDL's data source to find what they are looking for. Using such an architecture greatly reduces the problems previously mentioned (and completely eliminates them if the metadata stored/derived in the warehouse are sufficient).

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g., Oracle, IBM, Microsoft, Sybase), or, in the case of open source DBMSs (e.g., MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database, subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g., accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented? We proposed a storage and access architecture for geographic database systems. This architecture integrates a number of various concepts and techniques for efficient query processing. Using our scene architecture, small queries as well as large queries are processed efficiently. Small queries are processed by single page accesses as described before. If a range query specifies a larger query region, all scenes intersecting the query region, i.e. subtrees of the R*-tree, are transferred into the main memory. For each scene just one search operation on secondary storage is necessary.

Spatial database systems include:
- All OpenGIS Specifications compliant products
- Open source spatial databases and APIs, some of which are OpenGIS compliant
- Boeing's Spatial Query Server spatially enables Sybase ASE.
- Smallworld VMDS, the native GE Smallworld GIS database
- SpatiaLite extends Sqlite with spatial datatypes, functions, and utilities.
- IBM DB2 Spatial Extender can be used to enable any edition of DB2, including the free DB2 Express-C, with support for spatial types
- Oracle Spatial
- Microsoft SQL Server has support for spatial types since version 2008
- PostgreSQL DBMS (database management system) uses the spatial extension PostGIS to implement the standardized datatype geometry and corresponding functions.
- MySQL DBMS implements the datatype geometry plus some spatial functions that haven't been implemented according to the OpenGIS specifications. Functions that test spatial relationships are limited to working with minimum bounding rectangles rather than the actual geometries. MySQL versions earlier than 5.0.16 only supported spatial data in MyISAM tables. As of MySQL 5.0.16, InnoDB, NDB, BDB, and ARCHIVE also support spatial features.
- Neo4j - Graph database that can build 1D and 2D indexes as Btree, Quadtree and Hilbert curve directly in the graph
- AllegroGraph - a Graph database that provides a novel mechanism for efficient storage and retrieval of two-dimensional geospatial coordinates for Resource Description Framework data. It includes extension syntax for SPARQL queries
- MongoDB supports geospatial indexes in 2D

6. Why it is needed? It creates flexible and scalable solutions that integrate easily into any organization's IT environment by adopting security policies, adapting to corporate standards, and adjusting to multiple types of user-profile demands. Organizations, enterprises, and governments manage and display spatial data in an intuitive, easy-to-understand format that facilitates collaboration and data interoperability. It happens when all the pieces of spatial data infrastructure are in place, following OGC, ISO, and INSPIRE standards.

7. What are the examples of these database applications?
- Census Data
- NASA satellites imagery - terabytes of data per day
- Weather and Climate Data
- Rivers, Farms, ecological impact
- Medical Imaging
- Road Map
- Spatial index methods: Grid (spatial index), Z-order (curve), Quadtree, Octree, UB-tree, R-tree (R+ tree, R* tree, Hilbert R-tree), X-tree, kd-tree

Temporal database

1. What is this database application? A temporal database is a database with built-in time aspects. A time period attached to the data expresses when it was valid or stored in the database. By attaching a time period to the data, it becomes possible to store different database states.

2. What is the architectural design of this database? A bi-temporal relation contains both valid and transaction time. This is good because it provides both temporal rollback and historical information. Historical information (e.g.: "Where did John live in 1992?") can be derived from valid time. Temporal rollback (e.g.: "In 1992, where did the database believe John lived?") is provided by the transaction time. The answers to these example questions may not be identical: the database may have been altered since 1992, causing the queries to produce different results.

Two more fields are added to the Person table: Transaction-From and Transaction-To. Transaction-From is the time a transaction was made, and Transaction-To is the time that the transaction was superseded (or infinity if it has not yet been superseded). This enables queries that show the state of the database at a given time.

3. How does it run? More specifically, the temporal aspects usually include valid-time and transaction-time. These attributes go together to form bitemporal data.
- Valid time denotes the time period during which a fact is true with respect to the real world.
- Transaction time is the time period during which a fact is stored in the database.
- Bitemporal data combines both Valid and Transaction Time.

Note that these two time periods do not have to be the same for a single fact. Imagine that we come up with a temporal database storing data about the 18th century. The valid time of these facts is somewhere between 1701 and 1800, whereas the transaction time starts when we insert the facts into the database, for example, January 21, 1998.

It is possible to have timelines other than Valid Time and Transaction Time, such as Decision Time, in the database. In that case the database would be called a multitemporal database as opposed to a bitemporal database. However, this approach introduces additional complexities such as dealing with the validity of (foreign) keys.

In order to achieve perfect archival quality it is of key importance to store the data under the schema version under which they first appeared. However, even the simplest temporal query rewriting the history of an

These people know the application and its end-user interfaces. 2. -Identification of an appropriate data type for time. they are only indirectly involved with it. The latest version complies with TSQL2. and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades. Sybase). A common solution is to provide automatic query rewriting. -Provide query algebra to deal with temporal data. or.g. Oracle. -TimeDB is a free temporal relational DBMS by TimeConsult. What are the examples of these database applications? . Sometimes the application itself is packaged and sold as a separate product. The latter group members design the needed database and maintain it.10 has temporal features built into the database.) . which may include the DBMS inside (see Embedded database. Thus. enables application developers and DBAs to manage current. -PostgreSQL has an open-source contributed package that can be installed in the database to manage temporal data. proposed and historical versions of data in the same database.These are the people that design and build the DBMS product. medical doctors. insurance people.g. Application's end-users (e. 4. DBMS development is a complicated task. in the case of Open source DBMSs (e. They are typically the employees of a DBMS vendor (e. though they are the intended and main beneficiaries of a DBMS. They are typically skilled systems programmers. 7. This process would be particularly taxing for users. The first group members write the needed application programs which the application comprises.a feature of Oracle Database. -Prevent fragmentation of an object description. -Compatible with old database without temporal data. but need not know nor understand the underlying DBMS. MySQL). y y y 6.. etc. 5.g. accountants. Microsoft..attribute value would be required to be manually rewritten under each of the schema versions. Why it is needed? 
y y y y y -It is use in dealing with variation of data over time. volunteers or people supported by interested companies and organizations.. DBMS developers . Teradata version 13. subject to proper DBMS licensing).10 has temporal features built into the database. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work.These are the people that design and build a database-based application that uses the DBMS. -Teradata version13. Where it can be implemented? The following implementations implement a bitemporal database in a relational database management system (RDBMS): y -Oracle Workspace Manager. or sold separately as an add-on to the DBMS. and the only ones who touch its code. 3. Application developers and Database administrators . Who are the people involved in this database application? What are their functions? 1. It runs as a frontend to Oracle that accepts TSQL2 statements and generates SQL92 statements. IBM.
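The difference between historical information (valid time) and temporal rollback (transaction time) described in this section can be illustrated with a small sketch. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than any of the products listed above, and the table layout, the sample rows for John, the city names, the `FOREVER` marker and the function names `build_demo_db` and `city_as_believed` are all hypothetical.

```python
import sqlite3

# Assumption: "infinity" is represented by a far-future ISO date string.
FOREVER = "9999-12-31"


def build_demo_db():
    """Create an in-memory bitemporal Person table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE person (
               name TEXT, city TEXT,
               valid_from TEXT, valid_to TEXT,
               transaction_from TEXT, transaction_to TEXT)"""
    )
    rows = [
        # Original belief, recorded in 1975 and superseded in 1995.
        ("John", "Smallville", "1975-01-01", FOREVER, "1975-01-10", "1995-03-01"),
        # Correction recorded in 1995: John left Smallville in 1990 ...
        ("John", "Smallville", "1975-01-01", "1990-06-01", "1995-03-01", FOREVER),
        # ... and has lived in Bigtown since then.
        ("John", "Bigtown", "1990-06-01", FOREVER, "1995-03-01", FOREVER),
    ]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def city_as_believed(conn, name, valid_on, known_on):
    """Return the city that was valid on `valid_on`, according to what the
    database had stored on `known_on` (half-open periods: from <= t < to)."""
    row = conn.execute(
        """SELECT city FROM person
           WHERE name = ?
             AND valid_from <= ? AND ? < valid_to
             AND transaction_from <= ? AND ? < transaction_to""",
        (name, valid_on, valid_on, known_on, known_on),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_demo_db()
    # Historical information: where did John live in 1992 (as known in 2011)?
    print(city_as_believed(db, "John", "1992-07-01", "2011-12-12"))
    # Temporal rollback: in 1992, where did the database believe John lived?
    print(city_as_believed(db, "John", "1992-07-01", "1992-07-01"))
```

With these sample rows the two questions give different answers, because the record was altered after 1992. Dates are stored as ISO strings so that plain string comparison orders them correctly; a production system would more likely use a native date type.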

7. What are the examples of these database applications?

- A temporal data model
- A temporal version of Structured Query Language
- Short biography

Unstructured-data database

1. What is this database application?

An unstructured-data database is intended to store in a manageable and protected way diverse objects that do not fit naturally and conveniently in common databases. It may include email messages, documents, journals, multimedia objects, etc. The name may be misleading since some objects can be highly structured. However, the entire possible object collection does not fit into a predefined structured framework. Most established DBMSs now support unstructured data in various ways, and new dedicated DBMSs are emerging.

2. What is the architectural design of this database?

Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional computer programs as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents.

The term is imprecise for several reasons:

- structure, while not formally defined, can still be implied;
- data with some form of structure may still be characterized as unstructured if its structure is not helpful for the desired processing task; and
- unstructured information might have some structure (semi-structured) or even be highly structured but in ways that are unanticipated or unannounced.

3. How does it run?

Unstructured information represents the vast majority of the data collected and accessible to enterprises. Unstructured information management (UIM) makes this data accessible and searchable via management systems and applications, making it understandable to the users.

Unstructured information management (UIM) applications are software systems that analyze unstructured information (text, audio, video, images, etc.) to discover, organize, and deliver relevant knowledge to the user. In analyzing unstructured information, UIM applications make use of a variety of analysis technologies, including statistical and rule-based Natural Language Processing (NLP), Information Retrieval (IR), machine learning, and ontology. UIMA (Unstructured Information Management Architecture) provides a common framework for processing this information to extract meaning and create structured data about the information.

4. Who are the people involved in this database application? What are their functions?

1. DBMS developers - These are the people that design and build the DBMS product, and the only ones who touch its code. They are typically the employees of a DBMS vendor (e.g. Oracle, IBM, Microsoft, Sybase), or, in the case of open-source DBMSs (e.g. MySQL), volunteers or people supported by interested companies and organizations. They are typically skilled systems programmers. DBMS development is a complicated task, and some of the popular DBMSs have been under development and enhancement (also to follow progress in technology) for decades.

2. Application developers and Database administrators - These are the people that design and build a database-based application that uses the DBMS. The latter group members design the needed database and maintain it. The first group members write the needed application programs which the application comprises. Both are well familiar with the DBMS product and use its user interfaces (as well as usually other tools) for their work. Sometimes the application itself is packaged and sold as a separate product, which may include the DBMS inside (see Embedded database, subject to proper DBMS licensing), or sold separately as an add-on to the DBMS.

3. Application's end-users (e.g. accountants, insurance people, medical doctors, etc.) - These people know the application and its end-user interfaces, but need not know nor understand the underlying DBMS. Thus, though they are the intended and main beneficiaries of a DBMS, they are only indirectly involved with it.

5. Where it can be implemented?

Data mining, text analytics and noisy text analytics techniques are different methods used to find patterns in, or otherwise interpret, this information. Common techniques for structuring text usually involve manual tagging with metadata or part-of-speech tagging for further text mining-based structuring.

6. Why it is needed?

Software that creates machine-processable structure exploits the linguistic, auditory, and visual structure that is inherent in all forms of human communication. This inherent structure can be inferred from text, for instance, by examining word morphology, sentence syntax, and other small- and large-scale patterns. Unstructured information can then be enriched and tagged to address ambiguities, and relevancy-based techniques can then be used to facilitate search and discovery.

Other reasons include:

- Its seemingly infinite scalability.
- A design-friendly lack of schema.
- Extraordinary fault tolerance.
- High availability.
- Integration of both RESTful and cloud computing technologies.
- Creating order from an abundance of sources in many formats.
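As a rough illustration of how structure can be inferred from text-heavy content and attached as metadata tags, the following sketch pulls dates, e-mail addresses and amounts out of a free-text message with regular expressions. It is a minimal example and not how UIMA or any particular product works; the patterns, the sample message and the function name `tag_text` are assumptions made for this illustration.

```python
import re

# Hypothetical patterns for a few kinds of facts hidden in free text.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")               # ISO dates only
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")   # simple e-mail form
AMOUNT_RE = re.compile(r"\$\d+(?:\.\d{2})?")                 # dollar amounts


def tag_text(text):
    """Return a small structured record (metadata tags) for a text body."""
    return {
        "dates": DATE_RE.findall(text),
        "emails": EMAIL_RE.findall(text),
        "amounts": AMOUNT_RE.findall(text),
    }


if __name__ == "__main__":
    message = (
        "Invoice sent to ana@example.com on 2011-12-12 for $150.00; "
        "reminder due 2012-01-15."
    )
    print(tag_text(message))
```

The resulting dictionary could be stored next to the original text so that the message becomes searchable by date, sender or amount. Real text analytics would need far more robust patterns or NLP tooling than these three expressions.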

7. What are the examples of these database applications?

- Books
- Journals
- Documents
- Metadata
- Health records
- Audio
- Video
- Files
- Unstructured text (such as the body of an e-mail message, Web page, or word processor document)

Russelle Ralph Ubando. Henry D. Espeleta. Eliza Mapanoo Submitted On: December 12. Verlourd S. 2011 .University of Perpetual Help System Laguna Sto. Jocson. Niño. Laguna Project in Database Management Submitted By: Garde II. Karl Neil Anthony C. Submitted To: Mrs. City of Biñan.
