
A distributed application may use different storage offerings, for example, relational databases
and key-value storages, to store data. Alternatively, it may use stateful components developed
individually. Handling the complexity of accessing this data (authorization, querying, failure
handling, etc.) in user interface components or processing components tightly couples those
components to the storage offering used and complicates their implementation, because they
have to respect many idiosyncrasies of data handling. A later change to the application, for
example, the replacement of one storage offering with another, therefore causes significant
changes to other application components. Instead, different data sources should be integrated
to provide unified data access to other application components. Data may also be stored at
different cloud providers, which have to be integrated as well.

Scenario 5.3.6 Data Access Component

Introduction
The Data Access Component provides the capability to store, retrieve, and manipulate data
elements in such a way that the inherent complexity of data access is isolated and the
consistency of the data is assured. In application design, there is a risk that dealing with data
access complexity tightly couples application elements to the storage used. There is therefore
a need to integrate the various data sources and to unify access to the data for various
applications. Distributed applications use many types of storage solutions, ranging from
relational databases to key-value storages.

Results
Data access components, also referred to here as data retrieval components, are employed to
provide access to different data sources. When different storage offerings are in use, the data
access component coordinates data manipulation across them. With this solution, if an
interface changes or a storage offering is replaced, the only component that needs to be
modified is the data access component.
Fig. 5.3.6.1 Data retrieval components
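
The isolation described above can be illustrated with a minimal sketch. It assumes two in-memory stand-ins for storage offerings; the names StorageOffering, InMemoryKeyValueStore, InMemoryTableStore and DataAccessComponent are hypothetical and chosen only for this illustration. Other application components call read and write on the data access component and never see which storage offering holds a data element.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class StorageOffering(ABC):
    """Hypothetical adapter interface: one subclass per integrated storage offering."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def put(self, key: str, value: Any) -> None: ...


class InMemoryKeyValueStore(StorageOffering):
    """Stand-in for a key-value storage offering."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._items.get(key)

    def put(self, key: str, value: Any) -> None:
        self._items[key] = value


class InMemoryTableStore(StorageOffering):
    """Stand-in for a relational offering: rows of one table, keyed by primary key."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        row = self._rows.get(key)
        return dict(row) if row is not None else None

    def put(self, key: str, value: Dict[str, Any]) -> None:
        self._rows[key] = dict(value)


class DataAccessComponent:
    """Routes each kind of data element to the storage offering that holds it."""

    def __init__(self, routes: Dict[str, StorageOffering]) -> None:
        self._routes = routes

    def read(self, kind: str, key: str) -> Optional[Any]:
        return self._routes[kind].get(key)

    def write(self, kind: str, key: str, value: Any) -> None:
        self._routes[kind].put(key, value)


# Replacing a storage offering only changes this routing table.
dac = DataAccessComponent({
    "session": InMemoryKeyValueStore(),
    "customer": InMemoryTableStore(),
})
dac.write("customer", "c1", {"name": "Ada"})
dac.write("session", "s1", "token-123")
```

In this sketch, swapping the storage behind "session" means passing a different StorageOffering subclass in the routing table; callers of read and write are unchanged.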

Discussion
The data access component pattern isolates the complexity of accessing a variety of storage
types, which is caused by their unique interfaces, communication protocols, and authentication
techniques.

Cloud vendors may distribute their data storage; the data access component abstracts this
storage from other system components and introduces a unified data retrieval interface. This
makes it easy to change the cloud storage provider without affecting the rest of the
application. The data access component can also assure client-side consistency in addition to
the consistency assured by the cloud provider.
In simple implementations, client-side consistency models are assured by using versions on
data elements and histories of the operations executed by clients. We first cover briefly how
different client-side consistency assurances can be realized by a data access component. These
consistency assurances are also covered in detail by the eventual consistency pattern.
Afterwards, we cover how consistent knowledge of version numbers and operation identifiers
can be ensured in both approaches if the data access component itself is scaled out.
• Monotonic Reads: One client will never read data that is older than what it has read
before.
• Read Your Writes: One client will immediately see data alterations performed by it.
These two consistency levels can be realized by data access components using version
identifiers associated with each data element. Upon every write of a data element, its
version identifier is increased. The data access component can then keep track of the last
version accessed by a client and drop any results of read and write operations that
are too old. If the data access component is scaled out, all instances of it need to have
consistent information about the version last seen by a client (consistency for that
client) or by all clients (consistency for all clients).
• Monotonic Writes: Write operations of one client are executed in the order they were
issued. This client-side consistency can be ensured by storing the unique identifiers of
a client's operations in an operation history. If a data access component receives an
operation that it shall execute but the data to update does not yet reflect all previously
executed operations in the history, it can wait. Again, this requires that all instances of
a scaled-out data access component have consistent knowledge of the operation
history of a client (consistency for that client) or of all clients (consistency for all clients).
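
The version-based approach to monotonic reads and read-your-writes can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: ReplicatedStore is a toy eventually consistent store invented for the example, VersionedDataAccess is a hypothetical single-instance data access component, and dropped results are simply retried a bounded number of times.

```python
from typing import Any, Dict, List, Tuple

Versioned = Tuple[int, Any]  # (version identifier, value)


class ReplicatedStore:
    """Toy eventually consistent store (assumption for this sketch).

    Writes reach replica 0 first; reads rotate over all replicas, so a read
    may return a stale version until sync() has been called.
    """

    def __init__(self, replicas: int = 2) -> None:
        self._replicas: List[Dict[str, Versioned]] = [{} for _ in range(replicas)]
        self._next = 0

    def write(self, key: str, version: int, value: Any) -> None:
        self._replicas[0][key] = (version, value)

    def sync(self) -> None:
        """Propagate replica 0 to the others (stands in for background replication)."""
        for replica in self._replicas[1:]:
            replica.update(self._replicas[0])

    def read(self, key: str) -> Versioned:
        replica = self._replicas[self._next % len(self._replicas)]
        self._next += 1
        return replica.get(key, (0, None))


class VersionedDataAccess:
    """Single-instance data access component tracking versions per client."""

    def __init__(self, store: ReplicatedStore, max_attempts: int = 5) -> None:
        self._store = store
        self._max_attempts = max_attempts
        self._latest: Dict[str, int] = {}             # newest version written per key
        self._seen: Dict[Tuple[str, str], int] = {}   # (client, key) -> newest version seen

    def write(self, client: str, key: str, value: Any) -> int:
        version = self._latest.get(key, 0) + 1        # increased on every write
        self._latest[key] = version
        self._store.write(key, version, value)
        self._seen[(client, key)] = version           # basis for read-your-writes
        return version

    def read(self, client: str, key: str) -> Any:
        floor = self._seen.get((client, key), 0)
        for _ in range(self._max_attempts):
            version, value = self._store.read(key)
            if version >= floor:                      # not older than what the client saw
                self._seen[(client, key)] = version   # basis for monotonic reads
                return value
            # result is too old for this client: drop it and retry
        raise TimeoutError(f"no sufficiently new version of {key!r} available")
```

Because the version bookkeeping lives in one object here, the sketch sidesteps the scale-out question raised in the list above: with several instances, the contents of `_latest` and `_seen` would have to be kept consistent among them.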
Scaling of the data access component is significantly hindered if it enables a client-side
consistency assurance that is not assured by the storage offerings, because the version
identifiers and operation identifiers have to be kept consistent among data access component
instances. This can be realized in two ways. First, the data access components can be
stateful components that maintain the identifiers internally. If an identifier is to
be increased, the data access component instances do so in an ACID transaction, as
described in the strict consistency pattern. The second approach is to implement the
data access component as a stateless component and to store the version identifiers and
operation identifiers in a strictly consistent storage offering that is accessed by all data
access component instances. In either case, hybrid access to data elements can now be
realized: clients can decide on every read whether they would like to retrieve consistent data
(the version identifier and operation identifier are accessed) or whether eventually consistent
data is sufficient (only the eventually consistent storage offering is accessed).
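
The second, stateless approach and the resulting hybrid access can be sketched as follows. All names (StrictVersionStore, LaggingStore, StatelessDataAccess, StaleReadError) are hypothetical; the strictly consistent storage offering is modelled by a lock-protected dictionary, and a consistent read that finds only stale data raises an error where a real component would more likely wait and retry.

```python
import threading
from typing import Any, Dict, Tuple


class StrictVersionStore:
    """Stand-in for a strictly consistent storage offering holding version identifiers."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._versions: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        with self._lock:  # atomic read-modify-write
            self._versions[key] = self._versions.get(key, 0) + 1
            return self._versions[key]

    def current(self, key: str) -> int:
        with self._lock:
            return self._versions.get(key, 0)


class LaggingStore:
    """Toy eventually consistent store: writes become readable only after sync()."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[int, Any]] = {}
        self._visible: Dict[str, Tuple[int, Any]] = {}

    def write(self, key: str, version: int, value: Any) -> None:
        self._pending[key] = (version, value)

    def sync(self) -> None:
        self._visible.update(self._pending)
        self._pending.clear()

    def read(self, key: str) -> Tuple[int, Any]:
        return self._visible.get(key, (0, None))


class StaleReadError(Exception):
    """Raised when a consistent read only finds an outdated version."""


class StatelessDataAccess:
    """Keeps no identifiers itself; every instance shares the two stores."""

    def __init__(self, versions: StrictVersionStore, store: LaggingStore) -> None:
        self._versions = versions
        self._store = store

    def write(self, key: str, value: Any) -> int:
        version = self._versions.increment(key)
        self._store.write(key, version, value)
        return version

    def read(self, key: str, consistent: bool = False) -> Any:
        version, value = self._store.read(key)
        # Only consistent reads pay for the extra access to the strict store.
        if consistent and version < self._versions.current(key):
            raise StaleReadError(key)
        return value
```

Two instances created over the same pair of stores behave as one logical component: a write through one instance makes a consistent read through the other fail until the data has propagated, while an eventually consistent read returns whatever the storage offering currently exposes.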
This hybrid approach is also used by some storage offerings and is reflected in their
pay-per-use pricing models: customers pay less if they opt for eventually consistent reads, as
consistent reads are harder to realize. Introducing any consistency to an eventually consistent
storage offering, or to a set of integrated storage offerings, therefore always has to be weighed
against performance and partition tolerance, because according to the CAP theorem not all of
these can be optimized at the same time. In particular, changing the initial consistency
behaviour of a storage offering can significantly impact performance. Consider, for example, a
storage offering that assures no client-side consistency and for which read-your-writes
consistency shall be enabled. To this end, a version identifier is associated with every data
element, and all operations retrieving an obsolete data version are dropped and re-executed by
the data access component. However, this means that a read operation executed after a write
operation may take very long to return, as the data access component waits until it retrieves
the last or a newer version, which significantly affects the performance experienced by the
client. Also, when the data access component computes the new version identifier, it has to
coordinate with all other instances if it is scaled out, which reintroduces the problems of strict
consistency to the eventually consistent storage offering it accesses.
Configurability: to adjust the data structure supported by data access components, two
characteristics have to be ensured. First, the data elements and their structure have to be
extensible to support additional data elements and to extend existing data elements with
additional data fields. Second, configured or new data elements have to be queried using
generic functionality. Therefore, the interface of the data access component and the structure
of handled data elements have to support configurability. The extensibility of data elements is
realized by a certain data structure, where each data element is associated with a list of arbitrary
data elements. This list may either be filled directly with data values or may be used as a pointer
to other data elements that shall be associated with the extended data element. For example, if
an application handles children of a school and the result of a test not commonly made by
schools shall be stored with data elements representing children, one of the data fields may be
used for it. If the test shall instead be modelled as a different data element containing more
information, for example, when a child took it, the test result can also be modelled as a separate
data element referenced in a field. The second characteristic of configurable data access
components is the generic portion of their interfaces. To increase comprehensibility, interfaces
usually provide specialized, application-specific functions. These functions can, for example,
be used to specifically query children data elements in the above example. The semantics of
these functions are well-defined in the scope of the application they are used in and thus
significantly ease interaction with the interface. However, if the data elements provided by the data access
component are extended, new data fields and new data elements cannot be respected in
specialized functions defined for an application. Therefore, a data access component should
also provide generic functions to access arbitrary data elements handled by it. These generic
functions should at least be usable to create, read, update, and delete data elements, thus, they
are called CRUD functions. Using these functions, data elements may be accessed using a
unique identifier, which is passed to the operations as parameter. Arbitrary data elements
provided by the data access component can, therefore, be queried and manipulated using the
generic functions, if no specialized functions exist for this purpose. A drawback of such
extensible data elements and generic access functions is that readability of the data access
component interface is drastically reduced, as a lot of the provided functionality is hidden
behind the same interface. Furthermore, if multiple components access the same data access
component, each of these components needs to implement specialized functionality itself, e.g.,
to query children using the generic functionality rather than using an interface function
specifically created for this purpose. This may lead to a lot of redundancy in the application
implementation. Therefore, interface readability always needs to be weighed against the
flexibility of generic data manipulation interfaces.
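
The two configurability characteristics can be sketched together, using the school example from the text. This is an illustration under assumptions: DataElement and ConfigurableDataAccess are hypothetical names, the list of arbitrary data is modelled as a dictionary of named extension fields, and find_children_by_name stands for one specialized, application-specific function next to the generic CRUD functions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DataElement:
    kind: str                      # e.g. "child" or "test_result"
    fields: Dict[str, Any]         # the regular, predefined data fields
    # Extension slots: hold either direct values or ids of other data elements.
    extensions: Dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class ConfigurableDataAccess:
    def __init__(self) -> None:
        self._elements: Dict[str, DataElement] = {}

    # Generic CRUD functions: work for any data element via its unique identifier.
    def create(self, kind: str, fields: Dict[str, Any]) -> str:
        element = DataElement(kind, dict(fields))
        self._elements[element.id] = element
        return element.id

    def read(self, element_id: str) -> Optional[DataElement]:
        return self._elements.get(element_id)

    def update(self, element_id: str, **extensions: Any) -> None:
        self._elements[element_id].extensions.update(extensions)

    def delete(self, element_id: str) -> None:
        self._elements.pop(element_id, None)

    # Specialized function: readable, but unaware of extension fields.
    def find_children_by_name(self, name: str) -> List[DataElement]:
        return [e for e in self._elements.values()
                if e.kind == "child" and e.fields.get("name") == name]


cdac = ConfigurableDataAccess()
child_id = cdac.create("child", {"name": "Mia"})

# Variant 1: store the unusual test result directly in an extension field.
cdac.update(child_id, vision_test="passed")

# Variant 2: model the test as its own data element and reference it.
test_id = cdac.create("test_result", {"result": "passed", "taken_on": "2018-03-01"})
cdac.update(child_id, vision_test_ref=test_id)
```

The trade-off discussed above is visible in the sketch: find_children_by_name documents its purpose in its name, whereas everything stored through update is only reachable by callers that know the extension field names.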

Related Patterns
• Provider adapter: the unification and abstraction that the data access component
provides for integrated storage offerings should be used internally by every application
component accessing a provider interface. This best practice is described generally in
the provider adapter pattern, which ensures that application component implementations
are only loosely coupled to provider-specific interfaces.
• Storage offering: different storage offerings can be integrated by the data access
component, which then provides unified access to them. Many storage offerings are
suitable for this form of integration. The data access component can also provide
application-specific functionality for accessing data: rather than offering operations to
execute general queries, it may offer operations to query accounts, users, and inventory,
thereby adding more semantics to the interface than the generic storage interfaces
offer.
• Restricted data access component: the data access component can be extended to
restrict data access or to omit confidential data elements if the components accessing
the data do not have the same privileges to access the data as the rest of the application.
• Data abstractor: if the data access component can only provide data that is eventually
consistent, it can additionally implement the data abstractor pattern to conceal the fact
that data may be inconsistent from the other application components and the
application's users, provided the application's use case allows this.

Known Uses
The REST architectural style uses the CRUD functions mentioned in the discussion above
and limits the functions of interfaces to these basic operations. Amazon's Simple Storage
Service (S3) uses this style of interaction for its storage offerings. Fowler discussed how
access methods exceeding these basic data manipulation functions can be designed. Chong et
al. described how database tables can be adjusted to meet the needs of different customers,
which covers the configurability of data elements. This is a major enabling factor for sharing
database instances among many customers and reducing costs, thereby addressing a larger
customer market.

Conclusion
There is no single database product or technology that provides a complete solution for
data management challenges in cloud computing environments. Different products
(implementations) are used to target specific database design and implementation issues
in cloud environments. There is always a trade-off in using such technologies in terms of
performance, availability, consistency, concurrency, scalability, and elasticity.

References
Abtahizadeh, S. A. (2016) “Université De Montréal Understanding the Impact of Cloud
Computing Patterns on Performance and Energy Consumption.” Available at:
https://publications.polymtl.ca/2296/1/2016_SeyedAmirhosseinAbtahizadeh.pdf (Accessed:
June 24, 2018).
Adewojo, A. A., Bass, J. M. and Allison, I. K. (2015) “Enhanced cloud patterns: A case
study of multi-tenancy patterns,” International Conference on Information Society, i-Society
2015, pp. 53–58. doi: 10.1109/i-Society.2015.7366858.
Fehling, C. et al. (2014) Cloud Computing Patterns: Fundamentals to Design, Build, and
Manage Cloud Applications.
Fehling, C. (2015) “Cloud Computing Patterns: Identification, Design, and Application.” doi:
10.18419/opus-3596.
Ochei, L. C., Petrovski, A. and Bass, J. M. (2015) “Evaluating degrees of tenant isolation in
multitenancy patterns: A case study of cloud-hosted Version Control System (VCS),”
International Conference on Information Society, i-Society 2015, pp. 59–66.
doi: 10.1109/i-Society.2015.7366859.
Paraiso, F., Merle, P. and Seinturier, L. (2013) “Managing Elasticity Across Multiple Cloud
Providers,” pp. 53–60. Available at:
https://hal.inria.fr/hal-00790455/file/Managing_Elasticity_Across_Multiple_Cloud_Providers.pdf
Sumit, K. and Prabhakar, T. (2016) “The Multi-tenant Pattern,” PLoP’16. Available
at: https://pdfs.semanticscholar.org/1ad2/d4ba4385c4c9a685a899ebbef7cbe04abc09.pdf
Zaidman, A. and Bezemer, C.-P. (2010) Challenges of Reengineering into Multi-Tenant SaaS
Applications. Delft.