
Writing Algorithm

1. Accept the input file
2. Split the file into 4KB pieces
3. Inform the disk header to write these pieces into available free block numbers
4. Commit the entry in the FAT with block number details
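
The four steps above can be sketched in Python. This is a minimal in-memory model, not how a real file system driver works: the Disk class, its free_blocks list and the fat dictionary are hypothetical names introduced for illustration.

```python
BLOCK_SIZE = 4 * 1024  # 4KB pieces, as in step 2


class Disk:
    """Toy disk: a fixed number of blocks plus a FAT (hypothetical model)."""

    def __init__(self, num_blocks):
        self.blocks = {}                            # block number -> bytes
        self.free_blocks = list(range(num_blocks))  # available block numbers
        self.fat = {}                               # file name -> block numbers

    def write_file(self, name, data):
        # Step 2: split the input into 4KB pieces.
        pieces = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        if len(pieces) > len(self.free_blocks):
            raise OSError("not enough free blocks")
        # Step 3: write each piece into the next available free block.
        used = []
        for piece in pieces:
            block_no = self.free_blocks.pop(0)
            self.blocks[block_no] = piece
            used.append(block_no)
        # Step 4: commit the entry in the FAT with the block numbers.
        self.fat[name] = used
        return used


disk = Disk(num_blocks=16)
block_numbers = disk.write_file("f1.txt", b"x" * (20 * 1024))  # a 20K file
print(block_numbers)  # a 20K file occupies five 4KB blocks
```

In this toy model the FAT entry is written last, so a file only becomes visible once all of its pieces are on disk.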

cd c:\windows\System32
certutil.exe -hashfile "e:\Hadoop Administration\Others\Blk_1" MD5
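
The certutil command above prints the MD5 digest of a block file. A rough Python equivalent, for systems where certutil is not available, might look like the sketch below. The function name md5_of_file and the chunk size are assumptions made for illustration; only the standard hashlib module is used.

```python
import hashlib


def md5_of_file(path, chunk_size=4096):
    """Return the hex MD5 digest of the file at path, read in small chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read chunk by chunk so large block files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file on the same block file before and after a copy and comparing the two digests is one way to confirm that the block was not altered.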
Reading Algorithm

1. Contact the FAT and check whether a file entry is available
2. Take the block numbers and request the disk header to read the blocks
3. The header reads the blocks continuously
4. Once all blocks are read, arrange them in FAT block order and return the file
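
The read path can be sketched the same way. This is a minimal illustration under assumptions: read_file, fat and blocks are hypothetical names, and the disk is modelled as a plain dictionary from block number to bytes.

```python
def read_file(fat, blocks, name):
    """Toy read path: look up the FAT, read the blocks, reassemble in FAT order."""
    # Step 1: check whether the file has an entry in the FAT.
    if name not in fat:
        raise FileNotFoundError(name)
    # Steps 2-3: take the block numbers and read each block.
    block_numbers = fat[name]
    pieces = {n: blocks[n] for n in block_numbers}
    # Step 4: arrange the pieces in FAT block order and join them.
    return b"".join(pieces[n] for n in block_numbers)


fat = {"f2.txt": [6, 7, 8]}
blocks = {7: b"BBBB", 8: b"CC", 6: b"AAAA"}  # stored in arbitrary order
content = read_file(fat, blocks, "f2.txt")
print(content)
```

The order of the result comes from the FAT entry, not from where the blocks happen to sit on the disk.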
Problems with current industry hardware?

1. Because of the small block size (4K), a lot of space is wasted on metadata
Ex: 1024GB - 924GB = 100GB (FAT takes almost 10% of the total disk size)
2. Data and metadata sit on the same disk
3. Seek time
4. Data size (limited to disk capacity)
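
The arithmetic behind the first problem can be checked with a short calculation. The 1024GB, 924GB and 4K figures come from the example above. The 64MB block size used for comparison is an assumption added here to show how the number of blocks to track shrinks as the block size grows.

```python
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3

disk_size = 1024 * GB
usable = 924 * GB

# Space taken by the FAT in the example, and its share of the disk.
metadata = disk_size - usable
metadata_share = metadata / disk_size

# Number of blocks the FAT has to track at each block size.
blocks_4k = disk_size // (4 * KB)
blocks_64m = disk_size // (64 * MB)  # assumed larger block size, for comparison

print(metadata // GB, round(metadata_share * 100, 1))
print(blocks_4k, blocks_64m)
```

With 4K blocks the disk has about 268 million blocks to describe; with the assumed 64MB blocks it has 16384, which is why a larger block size needs far less metadata.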
Normal disk format:

Tracks
Blocks
Sectors
If the FAT does not exist, create an empty FAT (metadata file)
If the FAT already exists, truncate the FAT (metadata file)
Reserve space for the FAT
Create a disk header number for the disk

HDFS format:

No tracks/blocks/sectors because it is a logical layer


Create fsimage & edits
Create namespaceID for HDFS
Storage in normal hard disk

Sl. No  File Name  Path   Ownership      Permissions  Timestamps             Size  Block Numbers
1       f1.txt     /data  u: venkat      u: RWX       C: 2/14/2018 07:30:00  20K   B1, B2, B3, B4, B5
                          g: admingroup  g: R-X       M: 2/14/2018 07:35:00
                                                      A: 2/14/2018 07:40:00
2       f2.txt     /data  u: venkat      u: RWX       C: 2/14/2018 08:30:00  12K   B6, B7, B8
                          g: admingroup  g: R-X       M: 2/14/2018 08:35:00
                                                      A: 2/14/2018 08:40:00
-------
10      f10.txt    /data  u: venkat      u: RWX       C: 2/14/2018 09:30:00  16K   B51, B52, B53, B54
                          g: admingroup  g: R-X       M: 2/14/2018 09:35:00
                                                      A: 2/14/2018 09:40:00
11      f11.txt    /data  u: venkat      u: RWX       C: 2/14/2018 09:30:00  16K   B10001, B10002, B11003, B11004
                          g: admingroup  g: R-X       M: 2/14/2018 09:35:00
                                                      A: 2/14/2018 09:40:00
Storage in Master FAT

Sl. No  File Name  Path   Ownership      Permissions  Timestamps             Size  Replication  Block Numbers
1       f1.txt     /data  u: venkat      u: RWX       C: 2/14/2018 07:30:00  256M  3            D1:BLK0001, D2:BLK0002, D3:BLK0003, D1:BLK0004
                          g: admingroup  g: R-X       M: 2/14/2018 07:35:00
                                                      A: 2/14/2018 07:40:00
2       f2.txt     /data  u: venkat      u: RWX       C: 2/14/2018 08:30:00  192M  3            D3:BLK0006, D1:BLK0007, D2:BLK0008
                          g: admingroup  g: R-X       M: 2/14/2018 08:35:00
                                                      A: 2/14/2018 08:40:00
-------
10      f10.txt    /data  u: venkat      u: RWX       C: 2/14/2018 09:30:00  256M  3            D3:BLK00051, D1:BLK00052, D2:BLK00053, D3:BLK00054
                          g: admingroup  g: R-X       M: 2/14/2018 09:35:00
                                                      A: 2/14/2018 09:40:00

[Figure: a Metadata Disk plus data disks D1 (20G), D2 (120G) and D3 (510G), total 650G. Values shown under the disks: D1 8000 and 320, D2 1920, D3 8160]
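
The values 320, 1920 and 8160 shown with the disks are consistent with dividing each disk's capacity by a 64M block size. That block size is an assumption inferred from the numbers; the source does not state it, and the value 8000 shown with D1 is not explained by this calculation. A minimal sketch:

```python
BLOCK_MB = 64  # assumed block size in MB; not stated in the source

disks_gb = {"D1": 20, "D2": 120, "D3": 510}

# Number of 64M blocks that fit on each disk.
blocks_per_disk = {name: gb * 1024 // BLOCK_MB for name, gb in disks_gb.items()}
total_gb = sum(disks_gb.values())

print(blocks_per_disk)
print(total_gb)
```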
Storage in HDFS

Sl. No  File Name  Path   Ownership      Permissions  Timestamps             Size  Replication  Block Numbers
1       f1.txt     /data  u: venkat      u: RWX       C: 2/14/2018 07:30:00  256M  3            D1:BLK0001, D2:BLK0002, D3:BLK0003, D1:BLK0004
                          g: admingroup  g: R-X       M: 2/14/2018 07:35:00
                                                      A: 2/14/2018 07:40:00
2       f2.txt     /data  u: venkat      u: RWX       C: 2/14/2018 08:30:00  192M  3            D3:BLK0006, D1:BLK0007, D2:BLK0008
                          g: admingroup  g: R-X       M: 2/14/2018 08:35:00
                                                      A: 2/14/2018 08:40:00
-------
10      f10.txt    /data  u: venkat      u: RWX       C: 2/14/2018 09:30:00  256M  3            D3:BLK00051, D1:BLK00052, D2:BLK00053, D3:BLK00054
                          g: admingroup  g: R-X       M: 2/14/2018 09:35:00
                                                      A: 2/14/2018 09:40:00

[Figure: a Metadata Disk plus data disks D1, D2 and D3]
Sl.No  File name  Path       Owner               Permissions  Timestamp  Size  Replication  Blocks
Edits
1      f3.txt     /project1  venkat:venkatgroup  -rwxrw-rw-   MAC        50M   3            B7

FSImage
1      f1.txt     /project1  venkat:venkatgroup  -rwxrw-rw-   MAC        100M  3            B1, B2
2      f2.txt     /project1  venkat:venkatgroup  -rwxrw-rw-   MAC        256M  2            B3, B4, B5, B6

Block Report
Block  DataNodes
B1     DN1 DN2 DN4
B2     DN2 DN3 DN3
B3     DN1 DN2
B4     DN2 DN3
B5     DN3 DN4
B6     DN4 DN1
B7     DN1 DN3 DN4
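
The Edits/FSImage entries and the Block Report can be cross-checked with a short sketch. It assumes a 64M block size (not stated in the source, but consistent with the block counts listed) and assumes the Blocks listing maps B7 to f3.txt, B1 and B2 to f1.txt, and B3 to B6 to f2.txt. The variable names are hypothetical.

```python
import math

BLOCK_MB = 64  # assumed block size in MB

# File name -> (size in MB, replication), from the Edits and FSImage rows.
files = {"f1.txt": (100, 3), "f2.txt": (256, 2), "f3.txt": (50, 3)}

# Number of blocks each file needs at the assumed block size.
expected_blocks = {name: math.ceil(size / BLOCK_MB) for name, (size, _) in files.items()}

# Block -> owning file (assumed mapping, inferred from the Blocks listing).
block_owner = {"B1": "f1.txt", "B2": "f1.txt", "B3": "f2.txt", "B4": "f2.txt",
               "B5": "f2.txt", "B6": "f2.txt", "B7": "f3.txt"}

# Block -> DataNodes, copied from the Block Report as written (B2 lists DN3 twice).
block_report = {"B1": ["DN1", "DN2", "DN4"], "B2": ["DN2", "DN3", "DN3"],
                "B3": ["DN1", "DN2"], "B4": ["DN2", "DN3"], "B5": ["DN3", "DN4"],
                "B6": ["DN4", "DN1"], "B7": ["DN1", "DN3", "DN4"]}

# A block is under-replicated if it sits on fewer distinct DataNodes
# than the replication factor of the file it belongs to.
under_replicated = [b for b, nodes in block_report.items()
                    if len(set(nodes)) < files[block_owner[b]][1]]

print(expected_blocks)
print(under_replicated)
```

Counting distinct DataNodes flags B2, because the report as written lists DN3 twice for that block; whether that is intentional or a typo in the source is not clear.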
Cluster nodes and disks:

Node  IP              sda  sdb (/mnt/disk1)  sdc (/mnt/disk2)  sdd (/mnt/disk3)
NN    192.168.56.161  OS   50G               50G               50G
DN1   192.168.56.162  OS   50G               50G               50G
DN2   192.168.56.163  OS   50G               50G               50G
DN3   192.168.56.164  OS   50G               50G               50G
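
As a rough check of how much data such a cluster could hold, the sketch below adds up the DataNode disks. It rests on several assumptions not stated in the source: only the sdb, sdc and sdd disks of DN1 to DN3 store blocks, the NN disks hold metadata only, every file uses replication 3, and reserved or non-HDFS space is ignored.

```python
REPLICATION = 3  # assumed replication factor for all files

# DataNode -> sizes in GB of its data disks (sdb, sdc, sdd).
datanode_disks_gb = {
    "DN1": [50, 50, 50],
    "DN2": [50, 50, 50],
    "DN3": [50, 50, 50],
}

# Raw capacity is the sum of all data disks; each block is stored
# REPLICATION times, so the usable capacity is the raw figure divided by it.
raw_gb = sum(sum(disks) for disks in datanode_disks_gb.values())
usable_gb = raw_gb / REPLICATION

print(raw_gb, usable_gb)
```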
