1. Source layer A data warehouse system uses heterogeneous sources of data. That data may be stored in corporate relational databases or legacy databases, or it may come from information systems outside the corporate walls.
2. Data staging The data stored in sources should be extracted, cleansed to remove inconsistencies and fill gaps, and integrated to merge heterogeneous sources into one common schema. The so-called Extraction, Transformation, and Loading (ETL) tools can merge heterogeneous schemata, extract, transform, cleanse, validate, filter, and load source data into a data warehouse (Jarke et al., 2000). Technologically speaking, this stage deals with problems that are typical for distributed information systems, such as inconsistent data management and incompatible data structures (Zhuge et al., 1996). Section 1.4 deals with a few points that are relevant to this stage.
3. Data warehouse layer Information is stored in one logically centralized, single repository: a data warehouse. The data warehouse can be directly accessed, but it can also be used as a source for creating data marts, which partially replicate data warehouse contents and are designed for specific enterprise departments. Meta-data repositories (section 1.6) store information on sources, access procedures, data staging, users, data mart schemata, and so on.
4. Analysis In this layer, integrated data is efficiently and flexibly accessed to issue reports, dynamically analyze information, and simulate hypothetical business scenarios. Technologically speaking, it should feature aggregate data navigators, complex query optimizers, and user-friendly GUIs. Section 1.7 deals with the different types of analysis.
What is ETL?
ETL is an abbreviation of Extract, Transform and Load. In this process, an ETL tool extracts the data from different RDBMS source systems, transforms the data by applying calculations, concatenations, and so on, and then loads the data into the Data Warehouse system.
It's tempting to think that creating a Data Warehouse is simply a matter of extracting data from multiple sources and loading it into the database of a Data Warehouse. This is far from the truth; it requires a complex ETL process. The ETL process requires active inputs from various stakeholders, including developers, analysts, testers, and top executives, and is technically challenging.
In order to maintain its value as a tool for decision-makers, a Data Warehouse system needs to change with business changes. ETL is a recurring activity (daily, weekly, monthly) of a Data Warehouse system and needs to be agile, automated, and well documented.
Why do you need ETL?
It helps companies to analyze their business data and make critical business decisions.
Transactional databases cannot answer the complex business questions that can be answered by ETL.
A Data Warehouse provides a common data repository.
ETL provides a method of moving data from various sources into the Data Warehouse.
As data sources change, the Data Warehouse is updated accordingly.
A well-designed and documented ETL system is almost essential to the success of a Data Warehouse project.
ETL allows verification of data transformation, aggregation, and calculation rules.
The ETL process allows sample data comparison between the source and the target system.
The ETL process can perform complex transformations and requires a staging area to store the data.
ETL helps to migrate data into a Data Warehouse and convert it to various formats and types so as to adhere to one consistent system.
ETL is a predefined process for accessing and manipulating source data into the target database.
ETL offers deep historical context for the business.
It helps to improve productivity because it codifies and reuses processes without a need for technical skills.
ETL Process in Data Warehouses
Step 1) Extraction
In this step, data is extracted from the source systems, which may differ in DBMS, hardware, operating systems, and communication protocols. Sources could include legacy applications like mainframes, customized applications, point-of-contact devices like ATMs and call switches, text files, spreadsheets, ERP systems, and data from vendors and partners, amongst others.
Hence one needs a logical data map before data is extracted and loaded physically. This data map describes the
relationship between sources and target data.
There are three data extraction methods:
1. Full Extraction
2. Partial Extraction - without update notification
3. Partial Extraction - with update notification
Irrespective of the method used, extraction should not affect the performance and response time of the source systems. These source systems are live production databases; any slowdown or locking could affect the company's bottom line.
Step 2) Transformation
Data extracted from the source server is raw and not usable in its original form. Therefore, it needs to be cleansed, mapped, and transformed. In fact, this is the key step where the ETL process adds value and changes data so that insightful BI reports can be generated.
In this step, you apply a set of functions to the extracted data. Data that does not require any transformation is called direct move or pass-through data.
In the transformation step, you can perform customized operations on data. For instance, the user may want sum-of-sales revenue, which is not in the database, or the first name and the last name in a table may be stored in different columns; it is possible to concatenate them before loading.
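As a minimal sketch of both examples, assuming hypothetical staging tables stg_customers and stg_sales with the columns shown, the transformations might be expressed in SQL like this:
-- Concatenate first and last name into a single full_name column
-- (stg_customers is an assumed staging table name).
SELECT first_name || ' ' || last_name AS full_name
FROM stg_customers;

-- Derive sum-of-sales revenue, which is not stored in the source
-- (stg_sales is an assumed staging table name).
SELECT customer_id,
       SUM(sale_amount) AS sum_of_sales_revenue
FROM stg_sales
GROUP BY customer_id;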
Step 3) Loading
Loading data into the target data warehouse database is the last step of the ETL process. In a typical Data Warehouse, a huge volume of data needs to be loaded in a relatively short period (overnight windows), so the load process should be optimized for performance.
In case of load failure, recovery mechanisms should be configured to restart from the point of failure without loss of data integrity. Data Warehouse admins need to monitor, resume, or cancel loads as per prevailing server performance.
Types of Loading:
Load verification
Ensure that the key field data is neither missing nor null (a SQL sketch follows after this list).
Test modeling views based on the target tables.
Check the combined values and calculated measures.
Run data checks on the dimension tables as well as the history tables.
Check the BI reports built on the loaded fact and dimension tables.
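For example, a couple of these checks can be written as simple queries against the loaded tables; fact_sales, customer_key, and stg_sales below are hypothetical names used only for illustration:
-- The key field should be neither missing nor null: expect a count of 0.
SELECT COUNT(*) AS null_keys
FROM fact_sales
WHERE customer_key IS NULL;

-- Compare row counts between the staging source and the loaded target.
SELECT (SELECT COUNT(*) FROM stg_sales)  AS source_rows,
       (SELECT COUNT(*) FROM fact_sales) AS target_rows
FROM dual;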
ETL tools
Many Data Warehousing tools are available in the market. Here are some of the most prominent ones:
1. MarkLogic:
MarkLogic is a data warehousing solution which makes data integration easier and faster using an array of enterprise
features. It can query different types of data like documents, relationships, and metadata.
http://developer.marklogic.com/products
2. Oracle:
Oracle is the industry-leading database. It offers a wide range of choice of Data Warehouse solutions for both on-premises
and in the cloud. It helps to optimize customer experiences by increasing operational efficiency.
https://www.oracle.com/index.html
3. Amazon RedShift:
Amazon Redshift is a Data Warehouse tool. It is a simple and cost-effective tool to analyze all types of data using standard SQL and existing BI tools. It also allows running complex queries against petabytes of structured data.
https://aws.amazon.com/redshift/?nc2=h_m1
Every organization would like to have all its data clean, but most are not ready to pay for it or to wait. To clean it all would simply take too long, so it is better not to try to cleanse all the data.
Always plan to clean something, because the biggest reason for building the Data Warehouse is to offer cleaner and more reliable data.
Before cleansing all the dirty data, it is important to determine the cleansing cost for every dirty data element.
To reduce storage costs, store summarized data on tape or other low-cost storage. Also, a trade-off between the volume of data to be stored and its detailed usage is required: trade off at the level of granularity of data to decrease the storage costs.
Operational Data Store (ODS)
Definition - What does Operational Data Store (ODS) mean?
An operational data store (ODS) is a type of database that collects data from multiple sources for processing, after which it sends the data to operational systems and data warehouses. It provides a central interface or platform for all operational data used by enterprise systems and applications.
The ODS is part of the data warehouse architecture: it is where you collect and integrate data, ensuring its completeness and accuracy, and it provides near real-time data to the warehouse. It is like "instant mix food" for hungry people: the ODS provides data for the impatient business analyst to analyze.
Full Extraction
Incremental Extraction
Full Extraction
The data is extracted completely from the source system. Since this extraction reflects all the data currently available on the source
system, there’s no need to keep track of changes to the data source since the last successful extraction. The source data will be
provided as-is and no additional logical information (for example, timestamps) is necessary on the source site. An example for a full
extraction may be an export file of a distinct table or a remote SQL statement scanning the complete source table.
Incremental Extraction
At a specific point in time, only the data that has changed since a well-defined event back in history will be extracted. This event
may be the last time of extraction or a more complex business event like the last booking day of a fiscal period. To identify this delta
change, there must be a way to identify all the changed information since this specific time event. This information can be provided either by the source data itself, such as an application column reflecting the last-changed timestamp, or by a change table where an appropriate additional mechanism keeps track of the changes besides the originating transactions. In most cases, using the latter method means adding extraction logic to the source system.
Full Extraction
In this method, data is completly extracted from the source system. The source data will be provided as-is and no additional
logical information is necessary on the source system. Since it is complete extraction, so no need to track source system for
changes.
Incremental Extraction
In incremental extraction, the changes in source data need to be tracked since the last successful extraction. Only these changes in
data will be extracted and then loaded. Identifying the last changed data itself is the complex process and involve many lo
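A minimal sketch of incremental extraction, assuming the source table carries a last-changed timestamp column (orders and last_changed are illustrative names) and the time of the last successful extraction is supplied as a bind variable:
-- Extract only the rows that changed since the last successful extraction.
SELECT *
FROM   orders
WHERE  last_changed > :last_extract_date;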
1. Full load: an entire data dump that takes place the first time a data source is loaded into the warehouse.
2. Incremental load: the delta between target and source data is dumped at regular intervals. The last extract date is stored so that only records added after this date are loaded.
An Oracle multi-table insert (INSERT ALL), for example, can load rows into several target tables in a single statement:
INSERT ALL
  INTO suppliers (supplier_id, supplier_name) VALUES (1000, 'IBM')
  INTO suppliers (supplier_id, supplier_name) VALUES (2000, 'Microsoft')
  INTO customers (customer_id, customer_name, city) VALUES (999999, 'Anderson Construction', 'New York')
SELECT * FROM dual;
Linux Cat Command Usage with Examples
The cat command can be used to view files, create files, concatenate files, and modify files. Its basic forms are:
$ cat filename
$ cat > filename
1) To view a file using cat command, you can use the following command.
$ cat filename
2) You can create a new file with the name file1.txt using the following cat command, and then type the text you want to insert in the file. Make sure you type 'Ctrl-d' at the end to save the file.
$ cat > file1.txt
Thanks
Now you can display the contents of the file file1.txt by using the following command.
$ cat file1.txt
Thanks
3) Suppose you create two sample files, sample1.txt and sample2.txt, and you need to concatenate them.
$ cat sample1.txt
$ cat sample2.txt
Now you can concatenate these two files and save the result to another file named sample3.txt. For this, use the command given below.
$ cat sample1.txt sample2.txt > sample3.txt
$ cat sample3.txt
4) To display the contents of all the .txt files in the current directory, use a wildcard.
$ cat *.txt
This is my first sample text file
5) To display the contents of a file with line numbers, use the following command.
$ cat -n file1.txt
3 Thanks
6) To copy the content of one file to another file, you can use the greater than '>' symbol with the cat command.
$ cat file1.txt > file2.txt
7) To append the contents of one file to another, you can use the double greater than '>>' symbol with the cat command.
$ cat file1.txt >> file2.txt
Grep Command Usage with Examples
You can specify the search string as a regular expression pattern. This will search for the lines which start with a number. Regular expressions are a huge topic and are not covered here; this example just demonstrates their usage.
By default, grep matches the given string/pattern even if it is found as a substring in a file. The -w option makes grep match only whole words.
Sometimes, if you are searching for an error in a log file, it is good to know the lines around the error lines to understand the cause of the error.
This prints the matched lines along with the two lines before them.
This displays the matched lines and also five lines before and after them.
You can search for a string in all the files under the current directory and sub-directories with the help of the -r option.
grep -r "string" *
You can display the lines that do not match the specified search string pattern using the -v option.
You can remove blank lines using the grep command.
We can find the number of lines that match the given string/pattern using the -c option, for example grep -c "string" file.txt.
We can display just the files that contain the given string/pattern.
grep -l "string" *
Display the file names that do not contain the pattern:
We can display the files which do not contain the matched string/pattern.
grep -L "string" *
We can make the grep command display the line number of the line which contains the matched string in a file using the -n option.
The -b option makes the grep command display the character position of the matched string in a file.
The ^ regular expression pattern specifies the start of a line. This can be used in grep to match the lines which start with the given string or pattern.
The $ regular expression pattern specifies the end of a line. This can be used in grep to match the lines which end with the given string or pattern.
Sed Command Usage with Examples
Consider the below text file as an input.
> cat file.txt
Sed command is mostly used to replace the text in a file. The below simple sed command replaces the word "unix" with "linux" in the file.
Here the "s" specifies the substitution operation. The "/" are delimiters. The "unix" is the search pattern and the "linux" is the replacement string.
By default, the sed command replaces the first occurrence of the pattern in each line and it won't replace the second, third...occurrence in the line.
Use the /1, /2 etc flags to replace the first, second occurrence of a pattern in a line. The below command replaces the second occurrence of the word
"unix" with "linux" in a line.
The substitute flag /g (global replacement) specifies the sed command to replace all the occurrences of the string in the line.
Use the combination of /1, /2 etc and /g to replace all the patterns from the nth occurrence of a pattern in a line. The following sed command replaces
the third, fourth, fifth... "unix" word with "linux" word in a line.
You can use any delimiter other than the slash. As an example, suppose you want to change one web URL to another URL. In this case, the URL contains the delimiter character which we used, so you have to escape the slash with a backslash character, otherwise the substitution won't work.
Using too many backslashes makes the sed command look awkward. In this case, we can change the delimiter to another character, as shown in the example below.
There might be cases where you want to search for a pattern and replace it by adding some extra characters to it. In such cases & comes in handy: the & represents the matched string.
The first pair of parentheses specified in the pattern represents \1, the second represents \2, and so on. \1 and \2 can be used in the replacement string to make changes to the source string. As an example, if you want to replace the word "unix" in a line with the word doubled, like "unixunix", use the sed command as below.
The parentheses need to be escaped with the backslash character. Another example: if you want to switch the word "unixlinux" to "linuxunix", the sed command is
The /p print flag prints the replaced line twice on the terminal. If a line does not have the search pattern and is not replaced, then the /p prints that line
only once.
Use the -n option along with the /p print flag to display only the replaced lines. Here the -n option suppresses the duplicate rows generated by the /p
flag and prints the replaced lines only one time.
If you use -n alone without /p, then the sed does not print anything.
You can run multiple sed commands by piping the output of one sed command as input to another sed command.
Sed provides -e option to run multiple sed commands in a single sed command. The above output can be achieved in a single sed command as shown
below.
You can restrict the sed command to replace the string on a specific line number. An example is
The above sed command replaces the string only on the third line.
You can specify a range of line numbers to the sed command for replacing a string.
Here the sed command replaces the lines with range from 1 to 3. Another example is
Here $ indicates the last line in the file. So the sed command replaces the text from second line to last line in the file.
You can specify a pattern to the sed command to match in a line. If the pattern match occurs, then only the sed command looks for the string to be
replaced and if it finds, then the sed command replaces the string.
Here the sed command first looks for the lines which has the pattern "linux" and then replaces the word "unix" with "centos".
You can delete lines in a file by specifying the line number or a range of numbers.
The d option in the sed command is used to delete a line. The syntax for deleting a line is:
> sed 'Nd' file
Here N indicates the Nth line in a file. In the following example, the sed command removes the first line in a file.
> sed '1d' file
unix
fedora
debian
ubuntu
The following sed command is used to remove the footer line in a file. The $ indicates the last line of a file.
> sed '$d' file
linux
unix
fedora
debian
This is similar to the first example. The below sed command removes the second line in a file.
> sed '2d' file
linux
fedora
debian
ubuntu
The sed command can be used to delete a range of lines. The syntax is shown below:
> sed 'm,nd' file
Here m and n are the min and max line numbers. The sed command removes the lines from m to n in the file. The following sed command deletes the lines ranging from 2 to 4:
> sed '2,4d' file
linux
ubuntu
Use the negation (!) operator with the d option in the sed command. The following sed command removes all the lines except the header line.
> sed '1!d' file
linux
Similarly, the following command removes all the lines except the footer (last) line.
> sed '$!d' file
ubuntu
Here the sed command removes lines other than the 2nd, 3rd and 4th.
> sed '2,4!d' file
unix
fedora
debian
You can specify the list of lines you want to remove in the sed command with a semicolon as a delimiter. The following command deletes the first and last lines.
> sed '1d;$d' file
unix
fedora
debian
The pattern ^$ tells the sed command to delete empty lines. However, this does not remove lines that contain spaces.
In the following examples, the sed command deletes the lines in the file which match the given pattern.
> sed '/^u/d' file
linux
fedora
debian
^ specifies the start of the line. The above sed command removes all the lines that start with the character 'u'.
> sed '/x$/d' file
fedora
debian
ubuntu
$ indicates the end of the line. The above command deletes all the lines that end with the character 'x'.
The following command deletes the lines that contain the pattern "debian".
> sed '/debian/d' file
linux
unix
fedora
ubuntu
14. Delete lines starting from a pattern till the last line
> sed '/fedora/,$d' file
linux
unix
Here the sed command removes the line that matches the pattern fedora and also deletes all the lines to the end of the file which appear after this matching line.
> sed '${/ubuntu/d;}' file
linux
unix
fedora
debian
Here $ indicates the last line, so the last line is deleted only if it contains the pattern. If you want to delete the Nth line only if it contains a pattern, then in place of $ put the line number.
Note: In all the above examples, the sed command prints the contents of the file on the terminal with the specified lines removed; it does not remove the lines from the source file itself. To remove the lines from the source file, use the -i option with the sed command.
If you don't wish to modify the original source file, you can redirect the output of the sed command to another file.
A fact is a central component of a multi-dimensional model which contains the measures to be analysed. Facts are related to dimensions, and can be:
Additive Facts
Semi-additive Facts
Non-additive Facts
Data mining is an analytical process designed to explore data. There are four main types of data mining tasks:
Regression (predictive)
Association Rule Discovery (descriptive)
Classification (predictive)
Clustering (descriptive)
Business Intelligence is the conversion of data into usable information for companies.
Data Mining is commonly defined as the analysis of data for relationships and
patterns that have not previously been discovered by applying statistical and
mathematical methods. Business intelligence (BI) describes processes and
procedures for systematically gathering, storing, analyzing, and providing access
to data to help enterprises in making better operative and strategic business
decisions. BI applications include the activities of decision support systems,
management information systems, query and reporting, online analytical
processing (OLAP), statistical analysis, forecasting, and data mining.
A Key Performance Indicator (KPI) is a measurable value that demonstrates how effectively a company is achieving key business objectives.
Organizations use KPIs to evaluate their success at reaching targets.
The second tier consists of MicroStrategy Intelligence Server, which executes your
reports against the data warehouse. For an introduction to Intelligence Server,
see Processing your data: Intelligence Server.
If MicroStrategy Developer users connect via a two-tier project source (also called a direct
connection), they can access the data warehouse without Intelligence Server. For more
information on two-tier project sources, see Tying it all together: projects and project sources.
The third tier in this system is MicroStrategy Web or Mobile Server, which delivers the reports
to a client. For an introduction to MicroStrategy Web, see Administering MicroStrategy Web
and Mobile.
The last tier is the MicroStrategy Web client or MicroStrategy Mobile app, which
provides documents and reports to the users.
3-Tier Architecture (Server Mode)
In the 3-tier architecture, the client connects to the metadata and the data warehouse through the Intelligence Server.
Multi-Tier Architecture (Server Mode)
In the multi-tier architecture, the client connects to the Intelligence Server (IS) through the Web server from the web browser. The IS in turn connects to the metadata and the data warehouse.
Difference between 2, 3, 4 tier connection?
In the 2-tier architecture, MicroStrategy Desktop itself queries against the Data Warehouse and the Metadata, without the intermediate tier of the Intelligence Server.
The 3-tier architecture adds an Intelligence Server between MicroStrategy Desktop and the Data Warehouse and Metadata.
The 4-tier architecture is the same as the 3-tier architecture, except that it has an additional component, MicroStrategy Web.
Intelligence Server is the architectural foundation of the MicroStrategy platform. It serves as a central point for the MicroStrategy metadata so you can manage thousands of end-user requests.
You are very limited in what you can do with the 2-tier architecture. Things like clustering, mobile, Distribution Services, Report Services, OLAP Services, scheduling, governing, Intelligent Cubes, and project administration are only available via Intelligence Server.
MicroStrategy Intelligence Server
MicroStrategy Intelligence Server is an analytical server optimized for
enterprise querying, reporting, and OLAP analysis. The important functions
of MicroStrategy Intelligence Server are:
• Sharing objects
• Sharing data
• Managing the sharing of data and objects in a controlled and secure environment
A primary key is a special constraint on a column or set of columns. A primary key constraint ensures
that the column(s) so designated have no NULL values, and that every value is unique. Physically, a
primary key is implemented by the database system using a unique index, and all the columns in the
primary key must have been declared NOT NULL. A table may have only one primary key, but it may
be composite (consist of more than one column).
A surrogate key is any column or set of columns that can be declared as the primary key instead of a
"real" or natural key. Sometimes there can be several natural keys that could be declared as the
primary key, and these are all called candidate keys. So a surrogate is a candidate key. A table could
actually have more than one surrogate key, although this would be unusual. The most common type
of surrogate key is an incrementing integer, such as an auto_increment column in MySQL, or a
sequence in Oracle, or an identity column in SQL Server.
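For illustration, a composite primary key might be declared as in the sketch below; order_items and its columns are hypothetical names, not taken from the text above:
-- Composite primary key: the pair (order_id, line_no) must be unique,
-- and both key columns must be NOT NULL.
CREATE TABLE order_items (
  order_id   NUMBER NOT NULL,
  line_no    NUMBER NOT NULL,
  product_id NUMBER,
  quantity   NUMBER,
  CONSTRAINT pk_order_items PRIMARY KEY (order_id, line_no)
);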
Natural Keys
If a key’s attribute is used for identification independently of the database scheme, it is called a
Natural Key. In layman’s language, it means keys are natural if people use them, for example, SSN,
Invoice ID, Tax ID, Vehicle ID, person unique identifiers, etc. The attributes of a natural key always
exist in the real world.
Pros:
Cons:
While using strings, joins are a bit slower compared to int data-type joins, and storage needs are higher as well. Since storage is higher, fewer data values get stored per index page. Also, reading strings is a two-step process in some RDBMS: one step to get the actual length of the string and a second to actually perform the read operation to get the value.
Locking contentions can arise while using an application-driven generation mechanism for the key.
You can't enter a record until the value is known, since the value has some meaning.
Surrogate Keys
Surrogate keys have no “business” meaning and their only purpose is to identify a record in the
table. They are always generated independently of the current row data. Their generation can be
managed by the database system or the server itself.
Pros:
Business Logic is not in the keys.
Small 4-byte key (the surrogate key will most likely be an integer; SQL Server, for example, requires only 4 bytes to store it, or 8 bytes if it is a bigint).
No locking contentions because of the unique constraint (this refers to the waits that develop when two sessions try to insert the same unique business key), as the surrogates are generated by the database and are cached.
Cons:
Sometimes the primary key is made up of real data and these are normally referred to as natural
keys, while other times the key is generated when a new record is inserted into a table.
When a primary key is generated at runtime, it is called a surrogate key. A surrogate key is typically a
numeric value.
A surrogate key in a database is a unique identifier for either an entity in the modeled world or an
object in the database. The surrogate key is not derived from application data, unlike a natural (or
business) key which is derived from application data.
The surrogate is internally generated by the system but is nevertheless visible to the user or application. The value contains no semantic meaning.
Surrogate keys are keys that have no business meaning and are solely used
to identify a record in the table.
Such keys are either database generated (example: Identity in SQL Server,
Sequence in Oracle, Sequence/Identity in DB2 UDB etc.) or system
generated values (like generated via a table in the schema).
Surrogate key = Artificial key generated internally that has no real
meaning outside the Db (e.g. a UniqueIdentifier or Int with Identity
property set, etc.). Implemented in SQL Server by a Primary Key Constraint on a column or set of columns.
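A minimal sketch of a surrogate key alongside a natural (business) key, using an Oracle 12c-style identity column (a sequence would serve in older versions); the table and column names are illustrative:
CREATE TABLE customers (
  customer_sk   NUMBER GENERATED ALWAYS AS IDENTITY,  -- surrogate key: no business meaning
  tax_id        VARCHAR2(20) NOT NULL,                -- natural/business key from the real world
  customer_name VARCHAR2(100),
  CONSTRAINT pk_customers PRIMARY KEY (customer_sk),  -- primary key on the surrogate
  CONSTRAINT uq_customers_tax_id UNIQUE (tax_id)      -- natural key kept unique
);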
Conceptual Model
The main aim of this model is to establish the entities, their attributes, and
their relationships. In this Data modeling level, there is hardly any detail
available of the actual Database structure.
The 3 basic tenets of a Data Model are:
Entity: A real-world thing
Attribute: Characteristics or properties of an entity
Relationship: Dependency or association between two entities
For example:
Customer and Product are two entities. Customer number and name are attributes of the Customer entity.
Product name and price are attributes of the Product entity.
Sale is the relationship between the customer and the product. (A possible relational sketch of this example follows.)
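As an illustration only, the Customer/Product/Sale example above might eventually map to physical tables along these lines; the column types and the sale_id key are assumptions, not part of the conceptual model itself:
CREATE TABLE customer (
  customer_number NUMBER PRIMARY KEY,     -- attribute of the Customer entity
  customer_name   VARCHAR2(100)
);

CREATE TABLE product (
  product_id   NUMBER PRIMARY KEY,        -- assumed surrogate identifier
  product_name VARCHAR2(100),             -- attributes of the Product entity
  price        NUMBER(10,2)
);

-- Sale models the relationship between Customer and Product.
CREATE TABLE sale (
  sale_id         NUMBER PRIMARY KEY,
  customer_number NUMBER REFERENCES customer(customer_number),
  product_id      NUMBER REFERENCES product(product_id),
  sale_date       DATE
);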
Conclusion
Data modeling is the process of developing data model for the data
to be stored in a Database.
Data Models ensure consistency in naming conventions, default
values, semantics, security while ensuring quality of the data.
Data Model structure helps to define the relational tables, primary
and foreign keys and stored procedures.
There are three types of data models: conceptual, logical, and physical.
The main aim of conceptual model is to establish the entities, their
attributes, and their relationships.
Logical data model defines the structure of the data elements and set
the relationships between them.
A Physical Data Model describes the database specific
implementation of the data model.
The main goal of designing a data model is to make certain that the data objects offered by the functional team are represented accurately.
The biggest drawback is that even a small change made in the structure requires modification of the entire application.
ROLAP - detailed data is stored in a relational database in 3NF, star, or snowflake form. Queries must summarize data on the fly.
MOLAP - data is stored in multidimensional form - dimensions and facts stored together. You can think of this as a persistent cube. The level of detail is determined by the intersection of the dimension hierarchies.
HOLAP - data is stored using a combination of relational and multi-dimensional storage. Summary data might persist as a cube, while detail data is stored relationally, but transitioning between the two is invisible to the end-user.
ROLAP
Advantages -
The advantage of this model is that it can handle a large amount of data and can leverage all the functionalities of the relational database.
Disadvantages -
The disadvantages are that performance is slow and that each ROLAP report is an SQL query, with all the limitations of the genre. It is also limited by SQL functionality.
ROLAP vendors have tried to mitigate this problem by building complex out-of-the-box functions into the tool, as well as providing users with the ability to define their own functions.
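For instance, a typical ROLAP report is just an aggregation query run against the relational star schema; the fact and dimension table names below are assumed for illustration:
-- Summarize sales on the fly by joining the fact table to a dimension.
SELECT d.region,
       SUM(f.sales_amount) AS total_sales
FROM   fact_sales f
JOIN   dim_store  d ON f.store_key = d.store_key
GROUP BY d.region;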
MOLAP
Advantages -
The advantage of this model is that it provides excellent query performance, and the cubes are built for fast data retrieval. All calculations are pre-generated when the cube is created and can be easily applied while querying data.
Disadvantages -
The disadvantages of this model are that it can handle only a limited amount of data. Since all calculations have been pre-built when the cube was created, the cube cannot be derived from a large volume of data. This deficiency can be bypassed by including only summary-level calculations while constructing the cube. This model also requires a huge additional investment, as cube technology is proprietary and the knowledge base may not exist in the organization.
Conclusion:
Whether to opt for ROLAP or MOLAP depends upon the performance and complexity of the queries. MOLAP becomes the choice when the user wants the fastest query performance.
Slice:-
The slice operation performs a selection on one dimension of the given cube, thus creating a sub-cube.
Dice:-
The dice operation performs a selection on two or more dimensions of a given cube and creates a sub-cube.
Roll-up:-
The roll-up operation (also called drill-up or aggregation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction.
Drill-down:-
Drill-down is the reverse operation of roll-up. It allows users to navigate among different levels of data, i.e. from most summarized (up) to most detailed (down). Drill-down refers to the process of viewing data at a level of increased detail, while roll-up refers to the process of viewing data with decreasing detail.
Pivot:-
Pivot, also known as rotation, changes the dimensional orientation of the cube, i.e. rotates the axes to view the data from different perspectives.
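In SQL terms against a relational (ROLAP) schema, a slice is essentially a filter on one dimension and a roll-up is an aggregation up a hierarchy; a rough sketch with assumed fact and dimension table names:
-- Slice: fix one dimension (the year) to obtain a sub-cube.
SELECT p.category, s.region, SUM(f.sales_amount) AS sales
FROM   fact_sales f
JOIN   dim_product p ON f.product_key = p.product_key
JOIN   dim_store   s ON f.store_key   = s.store_key
JOIN   dim_date    d ON f.date_key    = d.date_key
WHERE  d.year = 2014
GROUP BY p.category, s.region;

-- Roll-up: climb the date hierarchy up to year (dimension reduction).
SELECT d.year, SUM(f.sales_amount) AS sales
FROM   fact_sales f
JOIN   dim_date d ON f.date_key = d.date_key
GROUP BY d.year;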
This SQL tutorial provides a summary of some of the most common Oracle Scalar Functions.
Oracle Scalar Functions allow you to perform different calculations on data values. These functions operate on single rows only and produce one result per row. There are different types of Scalar Functions; this tutorial covers string, date, numeric, and conversion functions.
Note: this tutorial focuses on Oracle Scalar Functions rather than Group Functions.
Oracle String Functions

LENGTH - returns the number of characters of the specified string expression.
SELECT LENGTH('hello') FROM dual
-- Result: 5

REVERSE - returns the reverse order of a string value.
SELECT REVERSE('hello') FROM dual
-- Result: 'olleh'

SUBSTR - returns part of a text.
SELECT SUBSTR('hello', 2, 3) FROM dual
-- Result: 'ell'

LOWER - returns a character expression after converting uppercase character data to lowercase.
SELECT LOWER('HELLO') FROM dual
-- Result: 'hello'

UPPER - returns a character expression after converting lowercase character data to uppercase.
SELECT UPPER('hello') FROM dual
-- Result: 'HELLO'

Oracle Date Functions

LAST_DAY - returns a date representing the last day of the month for the specified date.
SELECT LAST_DAY('15-AUG-2014') FROM dual
-- Result: '31-AUG-2014'

Oracle Numeric Functions

ROUND - returns a number rounded to the nearest integer (or to a specified precision).
SELECT ROUND(59.1) FROM dual
-- Result: 59

Oracle Conversion Functions

TO_CHAR - converts a number or date value to a string, optionally using a format mask.
SELECT TO_CHAR(1506) FROM dual
-- Result: the string value '1506'
SELECT TO_CHAR(1507, '$9,999') FROM dual
-- Result: the string value '$1,507'

TO_DATE - converts a string value to a date.
SELECT TO_DATE('01-MAY-2015') FROM dual
-- Result: the date value '01-MAY-2015'
SELECT TO_DATE('01/05/2015', 'dd/mm/yyyy') FROM dual
-- Result: the date value '01-MAY-2015'

TO_NUMBER - converts a string value to a number.
SELECT TO_NUMBER('9432') FROM dual
-- Result: 9432
Check Points:
1) Data Integrity
2) Field Validations
3) Constraints
5) Triggers
6) Indexes
7) Transactions
8) Security
9) Performance
10) Miscellaneous
11) SQL Injection
12) Backup and Recovery
Sample checks:
Does execution of the stored procedure fire the required triggers?
Is the database normalized?
Are log events added in the database for all login events?
Advertisements
NOV
21
ETL Testing
Structure validation - Validate the source and target table structure against the corresponding mapping doc.
Data Consistency Issues - The data type and length for a particular attribute may vary in files or tables even though the semantic definition is the same. Example: an account number may be defined as Number(9) in one field or table and Varchar2(11) in another table.
Null Validation - Verify the null values where "Not Null" is specified for a column.
Duplicate check - Validate that the unique key, the primary key, and any other column that should be unique as per the business requirements (e.g. EXPOSURE_TYPE, EXPOSURE_OPEN_DATE, EXPOSURE_CLOSED_DATE, EXPOSURE_STATUS, PAYMENT) do not contain duplicate rows.
DATE Validation - Date values are used in many areas of ETL development.
Complete Data Validation (using minus and intersect) - To validate the complete data set in the source and target tables, a minus query is the best solution: run source minus target and target minus source (a sketch follows this table).
Some useful test scenarios - Verify that the extraction process did not extract duplicate data from the source (this usually happens in repeatable processes where at point zero we need to extract all data from the source file, but during the next intervals we only need to capture the modified and new rows).
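A minimal sketch of the complete-data-validation check above; source_table and target_table are placeholders for the actual source and target tables:
-- Rows present in the source but missing from the target.
SELECT * FROM source_table
MINUS
SELECT * FROM target_table;

-- Rows present in the target but not in the source.
SELECT * FROM target_table
MINUS
SELECT * FROM source_table;
-- Both queries should return zero rows if the load is complete and exact.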