1. Fixed-length structure
2. Variable-length structure
3. Linked Structure
Fixed-Length Storage
• Record-Oriented
– In fixed-length storage each line of print is viewed as a
record, where all records have the same length, i.e.
each record accommodates the same number of
characters. Assume our record has length 80 unless
otherwise stated.
• Suppose the input consists of a program. Using a
record-oriented, fixed-length storage medium,
the input data will appear in memory as shown in
the figure, where we assume that 200 is the address
of the first character of the program.
Program
C PROGRAM PRINTING TWO INTEGERS IN INCREASING ORDER
READ *,J,K
IF(J.LE.K) THEN
PRINT *,J,K
ELSE
PRINT *,K,J
ENDIF
STOP
END
[Figure: Records stored sequentially in the computer. Each program line occupies one fixed-length record; the panels show the record beginning "C PROGRAM PRINTING TWO ..." and the record holding "IF(J.LE.K) THEN", with memory addresses marked along each record.]
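The record layout described above can be sketched in a few lines of C++. This is only an illustration under the slide's stated assumptions (record length 80, first character at address 200); the names RECORD_LENGTH, BASE_ADDRESS, makeRecord, recordAddress and storeProgram are hypothetical and do not come from the slides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Values assumed on the slide: 80 characters per record,
// and address 200 for the first character of the program.
const int RECORD_LENGTH = 80;
const int BASE_ADDRESS = 200;

// Pad (or truncate) a line to exactly RECORD_LENGTH characters.
std::string makeRecord(const std::string& line) {
    std::string record = line.substr(0, RECORD_LENGTH);
    record.resize(RECORD_LENGTH, ' ');
    return record;
}

// Address of the first character of the k-th record (k starts at 1).
int recordAddress(int k) {
    assert(k >= 1);
    return BASE_ADDRESS + (k - 1) * RECORD_LENGTH;
}

// Store all lines back to back, one fixed-length record per line.
std::string storeProgram(const std::vector<std::string>& lines) {
    std::string memory;
    for (const std::string& line : lines) {
        memory += makeRecord(line);
    }
    return memory;
}
```

Because every record has the same length, the address of any record follows from its number alone, which is the "ease of access" that fixed-length storage offers.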
Advantages
• Advantages
– The ease of accessing data from any given record
– The ease of updating data in any given record (as
long as the length of the new data does not
exceed the record length)
• Disadvantages
– Time is wasted reading an entire record if most of
the storage consists of blank spaces.
– Certain records may require more space than is
available.
– When the correction consists of more or fewer
characters than the original text, changing a
misspelled word requires the entire record to be
changed.
Variable-Length Storage with
Fixed Maximum
• Although strings may be stored in fixed-length
memory locations as above, there are advantages
in knowing the actual length of each string; one
does not have to read the entire record when the
string occupies only the beginning part of the
memory location.
• The storage of variable-length strings in memory
cells with fixed lengths can be done in two
general ways:
1. One can use a marker, such as two dollar signs ($$), to signal
the end of the string.
2. One can list the length of the string as an additional
item in the pointer array.
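The two approaches can be sketched as follows. This is a minimal illustration, assuming a cell length of 80 and assuming the stored strings never contain the marker themselves; CELL_LENGTH, LengthEntry and the function names are hypothetical, and the struct merely stands in for one entry of the pointer array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const std::size_t CELL_LENGTH = 80;   // assumed fixed maximum
const std::string END_MARKER = "$$";  // end marker named on the slide

// Way 1: write the string followed by the end marker, then pad the cell.
std::string storeWithMarker(const std::string& s) {
    assert(s.size() + END_MARKER.size() <= CELL_LENGTH);
    std::string cell = s + END_MARKER;
    cell.resize(CELL_LENGTH, ' ');
    return cell;
}

// Read only up to the marker instead of using the whole cell.
std::string readWithMarker(const std::string& cell) {
    return cell.substr(0, cell.find(END_MARKER));
}

// Way 2: keep the actual length next to the fixed-length cell.
struct LengthEntry {
    std::size_t length;  // number of characters actually in use
    std::string cell;    // fixed-length storage
};

LengthEntry storeWithLength(const std::string& s) {
    assert(s.size() <= CELL_LENGTH);
    std::string cell = s;
    cell.resize(CELL_LENGTH, ' ');
    return LengthEntry{s.size(), cell};
}

std::string readWithLength(const LengthEntry& e) {
    return e.cell.substr(0, e.length);
}
```

In both cases the reader stops at the true end of the string; the marker costs two characters per cell, while the length costs one extra item per entry.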
Linked Storage
• A computer must be able to correct and
modify the printed matter, which usually
means deleting, changing, and inserting
words, phrases, sentences and even
paragraphs in the text. The fixed-length
memory cells do not easily lend
themselves to these operations. For this
reason strings are stored by means of
linked lists.
Linked List
[Figure: a linked list Head → A → B → C → ∅, where each node has a data field and a pointer field.]
Linked Storage
• Strings may be stored in linked lists as follows. Each
memory cell is assigned one character or a fixed
number of characters, and a link contained in the cell
gives the address of the cell containing the next
character or group of characters in the string. For
example:
To be or not to be, that is the question.
Linked Storage
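[Figure: the string above stored as linked lists, once with one character per node and once with several characters per node.]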
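A linked representation of this kind might look like the sketch below. It assumes C++14 or later; the Node type and the helper names buildList, toString and countNodes are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// One cell: a fixed number of characters plus a link to the next cell.
struct Node {
    std::string chars;           // up to perNode characters
    std::unique_ptr<Node> next;  // a null link marks the end of the list
};

// Build a linked list holding text, with perNode characters per cell.
std::unique_ptr<Node> buildList(const std::string& text, std::size_t perNode) {
    assert(perNode > 0);
    std::unique_ptr<Node> head;
    Node* tail = nullptr;
    for (std::size_t i = 0; i < text.size(); i += perNode) {
        auto node = std::make_unique<Node>();
        node->chars = text.substr(i, perNode);
        Node* raw = node.get();
        if (tail != nullptr) {
            tail->next = std::move(node);
        } else {
            head = std::move(node);
        }
        tail = raw;
    }
    return head;
}

// Follow the links and reassemble the string.
std::string toString(const Node* head) {
    std::string out;
    for (const Node* p = head; p != nullptr; p = p->next.get()) {
        out += p->chars;
    }
    return out;
}

// Count the cells in the list.
std::size_t countNodes(const Node* head) {
    std::size_t n = 0;
    for (const Node* p = head; p != nullptr; p = p->next.get()) {
        ++n;
    }
    return n;
}
```

Calling buildList(text, 1) gives one character per cell and buildList(text, 4) gives four per cell; inserting or deleting text then only requires relinking a few cells rather than shifting a whole record.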
• Indexing (find())
– Indexing refers to finding the location of the substring.
find(string)
find(string, positionFirstChar)
find(string, positionFirstChar, len)
rfind() (finds the last occurrence of a string or substring)
• Concatenation
– String concatenation is the operation of joining two character strings end to end.
For example, the strings "snow" and "ball" may be concatenated to give "snowball".
• Replace
s.replace(4,3,"x");
• Erase
s.erase(4,5);
s.erase(4);
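The calls listed above are members of C++ std::string, and the short sketch below exercises each of them. The sample strings and the function name stringOperationsDemo are illustrative assumptions. Note that std::string positions are 0-based and that a failed search returns std::string::npos.

```cpp
#include <cassert>
#include <string>

void stringOperationsDemo() {
    const std::string t = "to be or not to be";

    // Indexing. Positions are 0-based; npos means "not found".
    assert(t.find("be") == 3);                   // first occurrence
    assert(t.find("be", 4) == 16);               // search starts at position 4
    assert(t.find("better", 0, 2) == 3);         // only the first 2 characters ("be") are searched for
    assert(t.rfind("be") == 16);                 // last occurrence
    assert(t.find("xyz") == std::string::npos);  // absent

    // Concatenation.
    std::string s = std::string("snow") + "ball";
    assert(s == "snowball");

    // Replace: the 3 characters starting at position 4 ("bal") become "x".
    s.replace(4, 3, "x");
    assert(s == "snowxl");

    // Erase: remove 5 characters starting at position 4,
    // then everything from position 4 to the end.
    std::string e = "snowball fight";
    e.erase(4, 5);
    assert(e == "snowfight");
    e.erase(4);
    assert(e == "snow");
}
```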
Question
A. A text T and a pattern P are in memory. Write an algorithm which
deletes every occurrence of P in T.
1. [Find the index of P] Set K=Find(T,P)
2. Repeat while K != 0
a) [Delete P from T] Set T=Delete(T, Find(T,P), Length(P))
b) [Update index] Set K=Find(T,P)
[End of loop]
3. Write T
4. Exit
B. A text T and patterns P and Q are in memory. Write an algorithm
which replaces every occurrence of P in T by Q.
1. [Find the index of P] Set K=Find(T,P)
2. Repeat while K != 0
a) [Replace P by Q] Set T=Replace(T,P,Q)
b) [Update index] Set K=Find(T,P)
[End of loop]
3. Write T
4. Exit
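Both exercises can be sketched with std::string. The function names deleteAll and replaceAll are hypothetical. One deliberate assumption in the second function: after each replacement the search resumes just past the inserted Q rather than from the start of T, so that the loop still terminates when Q itself contains P.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Part A: delete every occurrence of p in t.
// Mirrors the find / delete / find-again loop of the exercise.
std::string deleteAll(std::string t, const std::string& p) {
    assert(!p.empty());  // an empty pattern would match forever
    std::size_t k = t.find(p);
    while (k != std::string::npos) {
        t.erase(k, p.size());
        k = t.find(p);
    }
    return t;
}

// Part B: replace every occurrence of p in t by q.
// Assumption: the search resumes after the inserted q,
// so the loop terminates even when q contains p.
std::string replaceAll(std::string t, const std::string& p, const std::string& q) {
    assert(!p.empty());
    std::size_t k = t.find(p);
    while (k != std::string::npos) {
        t.replace(k, p.size(), q);
        k = t.find(p, k + q.size());
    }
    return t;
}
```

Because deleteAll searches again from the start, an occurrence of P that only forms after a deletion is removed too: deleting "ab" from "aabb" leaves the empty string.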
Pattern matching Algorithm
• Given strings T (text) and P (pattern), the
pattern matching problem consists of finding a
substring of T equal to P.
• T: “the rain in spain stays mainly on the plain”
• P: “n th”
• We assume that the length of pattern does not
exceed the length of text.
• Applications:
– Text editors
– Web search engines (e.g. Google)
The Brute Force Algorithm
• Check each position in the text T to
see if the pattern P starts in that position
T: a n d r e w      T: a n d r e w
P: r e w            P:   r e w
P moves 1 char at a time through T
The Brute Force Algorithm
• The first pattern matching algorithm is the one in which we compare a given
pattern P with each substring of T, moving from left to right, until we
get a match.
• Let W_K denote the substring of T having the same length as P and beginning
with the Kth character of T:
W_K = Substring(T, K, Length(P))
• First we compare P, character by character, with the first substring W_1.
• If all the characters are the same, then P = W_1, so P appears in T and
Index(T,P) = 1.
• If some character of P is not the same as the corresponding character of W_1, then
P is not equal to W_1 and we move on to the next substring W_2.
• The process stops when we find a match of P with some substring W_K, so
that P appears in T and Index(T,P) = K, or
• when we exhaust all the W_K with no match, which means P does not appear in T.
• The maximum value of K is Length(T) - Length(P) + 1.
The Brute Force Algorithm
• P and T are strings with lengths R and S, respectively, and are stored
as arrays with one character per element. The algorithm finds the
index of P in T.
1. [Initialize] Set K=1 and MAX=S-R+1
2. Repeat Steps 3 to 5 while K<=MAX
3. Repeat for L=1 to R [Test each character of P]
If P[L] != T[K+L-1], then: Go to Step 5.
[End of inner loop]
4. [Success] Set INDEX=K, and Exit
5. Set K=K+1
[End of Step 2 outer loop]
6. [Failure] Set INDEX=0
7. Exit.
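The steps above can be sketched in C++ as follows. The function name bruteForceIndex is hypothetical. The sketch keeps the algorithm's 1-based INDEX convention (0 means failure) and assumes a non-empty pattern; because std::string is 0-based, P[L] becomes p[l - 1] and T[K+L-1] becomes t[k + l - 2].

```cpp
#include <cassert>
#include <string>

// Returns the 1-based index of the first occurrence of p in t,
// or 0 if p does not occur in t.
int bruteForceIndex(const std::string& t, const std::string& p) {
    assert(!p.empty());                        // assumption: pattern is non-empty
    const int s = static_cast<int>(t.size());  // S = Length(T)
    const int r = static_cast<int>(p.size());  // R = Length(P)
    const int maxK = s - r + 1;                // Step 1: last K worth trying
    for (int k = 1; k <= maxK; ++k) {          // Steps 2 and 5
        int l = 1;
        // Step 3: compare P with the window W_K, character by character.
        while (l <= r && p[l - 1] == t[k + l - 2]) {
            ++l;
        }
        if (l > r) {
            return k;                          // Step 4: all R characters matched
        }
    }
    return 0;                                  // Step 6: failure
}
```

If the pattern is longer than the text, maxK is not positive, the loop never runs, and the function reports failure.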
Analysis
• Brute-force pattern matching runs in O(mn) time in the worst case, where m is the length of the pattern P and n is the length of the text T.
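To make the bound concrete, the sketch below counts character comparisons. The function name countComparisons is hypothetical. A text consisting only of the letter a, searched for a pattern of a's ending in b, forces every window to be compared to its last character, giving m(n - m + 1) comparisons, which is O(mn).

```cpp
#include <cassert>
#include <string>

// Counts the character comparisons made by brute-force matching of p against t.
// Sets index to the 1-based position of the first match, or 0 if there is none.
long countComparisons(const std::string& t, const std::string& p, int& index) {
    assert(!p.empty());                        // assumption: pattern is non-empty
    const int n = static_cast<int>(t.size());  // length of the text
    const int m = static_cast<int>(p.size());  // length of the pattern
    long comparisons = 0;
    index = 0;
    for (int k = 1; k <= n - m + 1; ++k) {
        int l = 1;
        while (l <= m) {
            ++comparisons;
            if (p[l - 1] != t[k + l - 2]) {
                break;  // mismatch: move on to the next window
            }
            ++l;
        }
        if (l > m) {
            index = k;  // every character of the window matched
            break;
        }
    }
    return comparisons;
}
```

For a text of twenty a's and the pattern "aaab", each of the 17 windows costs 4 comparisons, 68 in total, and no match is found. When the pattern sits at the very start of the text, only m comparisons are needed.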