
Lecture 4

Data representation
Storing information
A bit can hold only two values: 0 and 1
2 bits can hold four values: 00, 01, 10, 11

and so on:

8 bits can hold 256 different values
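The pattern above (each extra bit doubles the number of values) can be sketched as follows. The function name `distinct_values` is made up for this illustration.

```python
def distinct_values(n_bits):
    """Number of different bit patterns that n_bits can hold."""
    return 2 ** n_bits


for n in (1, 2, 8):
    print(n, "bit(s) can hold", distinct_values(n), "values")
```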


How to store whole numbers
We can store them in an 8-bit number, from 0 to 255
0 - 0000 0000
255 - 1111 1111
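A minimal sketch of the unsigned 8-bit representation, using Python's built-in `format`. The helper name `to_unsigned_byte` is an assumption made for this example.

```python
def to_unsigned_byte(value):
    """Return the 8-bit binary string for an integer in 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("an unsigned 8-bit number must be in 0..255")
    return format(value, "08b")


print(to_unsigned_byte(0))
print(to_unsigned_byte(255))
```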
How to store negative and positive integers in 8 bits
Use the first bit as a sign: 0 means positive and 1 means negative

0000 0100 is 4

1000 0100 is -4

But… this gives two zeros:

1000 0000 and 0000 0000, and they are not equal!

It stores numbers from -127 to 127, 255 numbers in total


For subtraction (adding a negative number), this approach gives the wrong result!

   53   0011 0101
+ −37   1010 0101
   16   1101 1010   (reads as −90 in sign-magnitude, not 16)
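The failure can be reproduced with a short sketch. The helper names `to_sign_magnitude` and `from_sign_magnitude` are hypothetical; the code simply adds the two bit patterns as plain binary numbers and keeps 8 bits, which is what ordinary adder hardware would do.

```python
def to_sign_magnitude(value):
    """Encode an integer in -127..127 as an 8-bit sign-magnitude pattern."""
    if not -127 <= value <= 127:
        raise ValueError("sign-magnitude byte must be in -127..127")
    sign = 0x80 if value < 0 else 0x00   # first bit is the sign
    return sign | abs(value)


def from_sign_magnitude(bits):
    """Decode an 8-bit sign-magnitude pattern back to an integer."""
    magnitude = bits & 0x7F
    return -magnitude if bits & 0x80 else magnitude


a = to_sign_magnitude(53)     # 0011 0101
b = to_sign_magnitude(-37)    # 1010 0101
naive_sum = (a + b) & 0xFF    # plain binary addition, truncated to 8 bits
print(format(naive_sum, "08b"), "decodes to", from_sign_magnitude(naive_sum))
```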
Two's complement
Negative numbers are stored as the two's complement of their absolute value

0000 0100 is 4
1110 0000 is -32
1101 0001 is -47
It stores 256 numbers, from -128 to 127
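A minimal sketch of two's complement encoding, relying on the fact that Python's `&` with a bit mask yields the two's complement pattern of a negative integer. The function name `to_twos_complement` is an assumption for this example. The last lines check that plain binary addition of 53 and -37 now gives 16.

```python
def to_twos_complement(value, bits=8):
    """Return the two's complement bit string of value using the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError("value does not fit in {} bits".format(bits))
    mask = (1 << bits) - 1
    return format(value & mask, "0{}b".format(bits))


for number in (4, -32, -47):
    print(number, to_twos_complement(number))

# Plain binary addition, truncated to 8 bits, now gives the right answer.
total = (int(to_twos_complement(53), 2) + int(to_twos_complement(-37), 2)) & 0xFF
print(format(total, "08b"))
```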
Two's complement
There are several ways of converting a negative decimal number into binary form:

1) Use place values in which the leftmost bit counts as -128:

-128 64 32 16 8 4 2 1
   1  1  0  1 1 0 1 0

-128+64+16+8+2 = -38

-128 64 32 16 8 4 2 1
   1  1  1  1 1 0 1 0

-128+64+32+16+8+2 = -6
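The place-value reading above can be sketched as a small decoder. The names `WEIGHTS` and `decode_twos_complement` are made up for this illustration.

```python
# Place values of an 8-bit two's complement number, leftmost bit first.
WEIGHTS = (-128, 64, 32, 16, 8, 4, 2, 1)


def decode_twos_complement(bit_string):
    """Sum the place values of the bits that are set to 1."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    return sum(weight for weight, bit in zip(WEIGHTS, bits) if bit == "1")


print(decode_twos_complement("1101 1010"))
print(decode_twos_complement("1111 1010"))
```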
Two's complement
2) For instance, we want to convert -34 into binary form. To do this, we first
find the binary form of +34.

128 64 32 16 8 4 2 1
  0  0  1  0 0 0 1 0

Then we replace all zeros with ones and all ones with zeros.

128 64 32 16 8 4 2 1
  1  1  0  1 1 1 0 1

Then we add 1 to the resulting binary number:

  1101 1101
+ 0000 0001
  1101 1110

Answer: 1101 1110
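The invert-and-add-one steps can be sketched like this. The function name `negate_by_invert_add_one` is hypothetical, and the sketch assumes an 8-bit width and a positive input from 1 to 128.

```python
def negate_by_invert_add_one(positive_value):
    """Return the 8-bit two's complement pattern of -positive_value."""
    if not 1 <= positive_value <= 128:
        raise ValueError("expected a value in 1..128")
    bits = format(positive_value, "08b")
    # Step 1: replace all zeros with ones and all ones with zeros.
    inverted = "".join("1" if bit == "0" else "0" for bit in bits)
    # Step 2: add 1, keeping only 8 bits.
    result = (int(inverted, 2) + 1) & 0xFF
    return format(result, "08b")


print(negate_by_invert_add_one(34))
```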
Two's complement
3) For instance, we want to convert -34 into binary form. To do this, we first
find the binary form of +34.

128 64 32 16 8 4 2 1
  0  0  1  0 0 0 1 0

Then we copy the bits unchanged from the right side up to and including the first
one. All the other bits are inverted, as in the previous approach.

0 0 1 0 0 0 1 0
1 1 0 1 1 1 1 0

Answer: 1101 1110
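The copy-then-invert shortcut can be sketched as follows. The function name `negate_by_copy_then_invert` is an assumption for this example; it works on 8-bit strings and a positive input from 1 to 128.

```python
def negate_by_copy_then_invert(positive_value):
    """Return the 8-bit two's complement pattern of -positive_value."""
    if not 1 <= positive_value <= 128:
        raise ValueError("expected a value in 1..128")
    bits = format(positive_value, "08b")
    last_one = bits.rfind("1")          # position of the rightmost 1
    kept = bits[last_one:]              # copied as is (rightmost 1 and the zeros after it)
    inverted = "".join("1" if bit == "0" else "0" for bit in bits[:last_one])
    return inverted + kept


print(negate_by_copy_then_invert(34))
```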
How to represent real numbers in binary
(11.1875)10 = (?.?)2
(11)10 = (1011)2
(.1875)10 = (.?)2

Number * 2    Result    Number after (.)    Number before (.)
0.1875*2      0.375     0.375               0
0.375*2       0.750     0.750               0
0.75*2        1.50      0.500               1
0.50*2        1.00      0.000               1

(.1875)10 = (.0011)2
(11.1875)10 = (1011.0011)2
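The repeated-doubling table can be sketched in code. The function names and the `max_digits` cut-off are assumptions added for this example; the cut-off is needed because some fractions (0.1, for instance) never terminate in binary. The sketch handles non-negative numbers only.

```python
def fraction_to_binary(fraction, max_digits=16):
    """Binary digits after the point for 0 <= fraction < 1, by repeated doubling."""
    digits = []
    while fraction > 0 and len(digits) < max_digits:
        fraction *= 2
        whole = int(fraction)        # the number before the point: 0 or 1
        digits.append(str(whole))
        fraction -= whole            # keep only the number after the point
    return "".join(digits)


def real_to_binary(value, max_digits=16):
    """Convert a non-negative real number to a binary string such as 1011.0011."""
    whole = int(value)
    return format(whole, "b") + "." + fraction_to_binary(value - whole, max_digits)


print(real_to_binary(11.1875))
```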


Storing text
To store text, early computer engineers created the ASCII code
ASCII is a 7-bit code, so it defines 128 characters; stored in an 8-bit byte, its extensions can hold 256 different characters

ASCII contains various symbols, digits and the English alphabet (uppercase and lowercase letters)
ASCII code
ASCII is the standard character coding in computers.
65 - 90 for capital letters
97 - 122 for lowercase letters
48 - 57 for digits
An ASCII character is stored in one byte of memory

A = 65 = 01000001
z = 122 = 01111010
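A quick way to inspect ASCII codes is Python's built-in `ord` and `chr`. The helper name `ascii_bits` is made up for this sketch.

```python
def ascii_bits(character):
    """Return the one-byte binary form of an ASCII character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")


for character in ("A", "Z", "a", "z", "0", "9"):
    print(character, ord(character), ascii_bits(character))

print(chr(65))  # back from code to character
```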
Problem: How to store non-English characters
Early approach: every alphabet used its own encoding.

Problem: How to store text that contains letters from different alphabets
Different encodings
Windows-1250 for Central European languages that use Latin script, (Polish,
Czech, Slovak, Hungarian, Slovene, Serbian, Croatian, Romanian and Albanian)
Windows-1251 for Cyrillic alphabets
Windows-1252 for Western languages
Windows-1253 for Greek
Windows-1254 for Turkish
Windows-1255 for Hebrew
etc.
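A small sketch of why per-alphabet encodings clash: the same byte decodes to a different letter under each code page. The byte 0xE4 is picked arbitrarily for illustration; `cp1251`, `cp1252` and `cp1253` are Python's codec names for the Windows code pages listed above.

```python
RAW = b"\xe4"  # one arbitrary byte, chosen only for illustration


def decode_with(codec):
    """Interpret the same byte under the given single-byte encoding."""
    return RAW.decode(codec)


for codec in ("cp1251", "cp1252", "cp1253"):
    print(codec, decode_with(codec))
```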
Problem: World alphabets
There are many alphabets that are used in the world:

Latin (Spanish, German, Finnish), Arabic (Persian), Hebrew, Chinese
characters, Korean, Japanese (Hiragana, Katakana), Cyrillic (Kazakh, Tatar,
Serbian, Ukrainian), Tamil, Armenian, Mongolian, Greek, Georgian.

How to represent all of them in one document?


Problem: how to write the following text:
‫ديجصخش‬
‫בוט םדא‬
좋은 사람
καλό πρόσωπο
មនុស្សម្នា ក់ដល្
நல் ல நபர
Using a different encoding for each script won't allow you to write text that
mixes different scripts
Unicode
All symbols are stored in one table.
The modern version contains 28 ancient and historic scripts (alphabets) and 72
modern scripts
Contains more than 110,000 characters
Can store text containing different scripts
UTF-8: what is it?
UTF-8 (UCS Transformation Format, 8-bit) is a variable-width encoding that
can represent every character in the Unicode character set.

It is compatible with ASCII

● this means that any file containing only symbols that are present in
ASCII is stored by UTF-8 exactly as it is stored by ASCII
UTF-8
A = 65 = 01000001

¢ = 11000010 10100010

€ = 11100010 10000010 10101100

欽 = U+6B3D = 11100110 10101100 10111101
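The byte sequences above can be checked with Python's built-in `str.encode`. The helper name `utf8_bits` is an assumption for this sketch; note how the number of bytes grows from one to three.

```python
def utf8_bits(character):
    """Return the UTF-8 bytes of a character as space-separated bit groups."""
    return " ".join(format(byte, "08b") for byte in character.encode("utf-8"))


for character in ("A", "\u00a2", "\u20ac", "\u6b3d"):   # A, cent sign, euro sign, U+6B3D
    print("U+{:04X}".format(ord(character)), utf8_bits(character))
```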
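Use Unicode symbols in Python
In Python 2, put the following on the first line of the Python code (Python 3 reads source files as UTF-8 by default)
# -*- coding: utf-8 -*-

# the u prefix marks a Unicode string; required in Python 2, optional in Python 3
print(u"қазақша")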
Images and colors
An image is a set of pixels.

A pixel is one cell on the screen, which contains only one color.

An image is stored as a sequence of pixels, each represented by its color


How to store color
Approach #1: Combine three colors: Cyan, Magenta, Yellow. Used in printers.
Approach #2: Combine three colors: Red, Green, Blue. Used in displays.

Computers mostly store every color as an RGB combination


CMYK vs RGB
A combination of Cyan+Magenta+Yellow gives black, and if there is no color it
gives white

Whilst a combination of Red+Green+Blue gives white, and if there is no color it
gives black

So why is CMY used in printing and RGB in monitors?


Image
An image is a sequence of pixels
A pixel is one cell on the monitor screen; it displays a color.

A color is a combination of three colors (RED, GREEN, BLUE)

A bitmap is a map of pixels


Bit depth
The number of colours that can be represented in a bitmapped image is
dictated by the bit depth.

Bit depth            Available colours
8 bits per pixel     256 (2^8)
16 bits per pixel    65,536 (2^16)
24 bits per pixel    16,777,216 (2^24)
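As a rough sketch of the table above, the number of colours is 2 raised to the bit depth. The `pack_rgb` helper is hypothetical and assumes the common layout in which a 24-bit pixel uses 8 bits each for red, green and blue.

```python
def available_colours(bit_depth):
    """Number of distinct colours a pixel of the given bit depth can take."""
    return 2 ** bit_depth


def pack_rgb(red, green, blue):
    """Pack three 0..255 channel values into one 24-bit integer (assumed layout: RRGGBB)."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in 0..255")
    return (red << 16) | (green << 8) | blue


for depth in (8, 16, 24):
    print(depth, "bits per pixel:", available_colours(depth))

print(format(pack_rgb(255, 128, 0), "06x"))
```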


PBM
PBM is a file format for representing black-and-white bitmap images.

Each pixel is one bit: 1 means black, and 0 means white
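A minimal sketch of writing the plain-text PBM variant, which starts with the marker `P1`, then the width and height, then one 0 or 1 per pixel. The function name `to_plain_pbm` and the sample grid are assumptions for this example.

```python
def to_plain_pbm(rows):
    """Build the text of a plain (P1) PBM file from a list of rows of 0/1 values."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    lines = ["P1", "{} {}".format(width, height)]
    lines += [" ".join(str(bit) for bit in row) for row in rows]
    return "\n".join(lines) + "\n"


checkerboard = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
print(to_plain_pbm(checkerboard))
```

Saving the returned text to a file with the `.pbm` extension should give an image that common viewers can open.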


PNM
PNM is a family of file formats (PBM, PGM, PPM); its PPM variant represents color images
Vector graphics
Vector graphics are stored as a list of attributes. The attributes are used by
the computer to create the graphic. Rather than storing the data for each
pixel, the computer will generate an object by looking at its attributes

They store geometric information about the image.


Raster (Bitmap) vs Vector graphics
● Loads faster: Raster
● Can be zoomed without loss of quality: Vector
● Takes less memory for simple figures: Vector
● Used in typography: Vector
● Best for real-world images: Raster
● Try to understand why.
SVG (Scalable Vector Graphics)
SVG is the most common vector graphics format.

It is used in web pages and in mobile applications

It is an XML-based format for describing vector graphics.
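To illustrate "a list of attributes", here is a sketch that builds a tiny SVG document describing one circle. The function name `circle_svg` and the default size and fill are assumptions; parsing the result with the standard `xml.etree.ElementTree` module only confirms that it is well-formed XML.

```python
import xml.etree.ElementTree as ET


def circle_svg(cx, cy, radius, fill="red", size=100):
    """Return SVG text for a single circle on a square canvas."""
    template = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        '<circle cx="{cx}" cy="{cy}" r="{radius}" fill="{fill}"/>'
        "</svg>"
    )
    return template.format(size=size, cx=cx, cy=cy, radius=radius, fill=fill)


svg_text = circle_svg(50, 50, 40)
root = ET.fromstring(svg_text)   # raises an error if the XML is malformed
print(svg_text)
```

The whole circle is described by four attributes, however large it is drawn, which is why it can be zoomed without loss of quality.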


Plain text data/file formats
These are standard data formats that are stored as plain text

These file formats are used to interchange data on the web, between applications, etc.

● JSON
● XML
● HTML
● CSV
XML: Extensible Markup Language
<group name="D03">
  <student id="332">John Black</student>
  <student id="321">Mike Pawn</student>
  <student id="320">Jeremy King</student>
</group>
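A short sketch of reading this document with Python's standard `xml.etree.ElementTree` module. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<group name="D03">
  <student id="332">John Black</student>
  <student id="321">Mike Pawn</student>
  <student id="320">Jeremy King</student>
</group>"""

group = ET.fromstring(XML_TEXT)
# Map each student's id attribute to the text inside the element.
students = {student.get("id"): student.text for student in group.findall("student")}

print(group.get("name"))
for student_id, name in students.items():
    print(student_id, name)
```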
JSON: JavaScript Object Notation
[ { "name": "A04",
    "students":
      [ {"id": "332", "name": "John Black"},
        {"id": "322", "name": "Jeremy King"} ]
  },{ "name": "B04",
    "students":
      [ {"id": "332", "name": "John Black"},
        {"id": "322", "name": "Jeremy King"} ]
  }
]
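A minimal sketch of loading such data with Python's standard `json` module. Strict JSON requires double-quoted keys and strings, so the text below is written that way; the variable names are assumptions.

```python
import json

JSON_TEXT = """
[ { "name": "A04",
    "students": [ {"id": "332", "name": "John Black"},
                  {"id": "322", "name": "Jeremy King"} ] },
  { "name": "B04",
    "students": [ {"id": "332", "name": "John Black"},
                  {"id": "322", "name": "Jeremy King"} ] } ]
"""

groups = json.loads(JSON_TEXT)   # a list of dictionaries
for group in groups:
    names = [student["name"] for student in group["students"]]
    print(group["name"], names)
```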
CSV
Tabular data saved in CSV format

name,surname,group
steve,jobs,A03
michael,phelps,B03

Can be opened in Excel; used for sending tabular information
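The same table can be read with Python's standard `csv` module. In this sketch the text is kept in a string and wrapped in `io.StringIO` instead of a real file; the variable names are assumptions.

```python
import csv
import io

CSV_TEXT = "name,surname,group\nsteve,jobs,A03\nmichael,phelps,B03\n"

# DictReader uses the first line as column names.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
for row in rows:
    print(row["name"], row["surname"], row["group"])
```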


Have questions?
What were these texts?
They were document formats
Who uses them?
Developers use them to send data between different applications
Can we use another format or create a format ourselves?
Yes, but these are standard formats, so everyone knows them, and there
are tons of libraries that work with them
HTML
HyperText Markup Language is a markup language that describes how
elements are placed on a web page

<html>
<body>
<h1>Header</h1>
</body>
</html>
HTML
<p>Paragraph</p>
<h1>Header</h1>
<img src="1.jpg"/>
<ul><li>Item</li><li>Item</li><li>Item</li></ul>
<a href="1.html">Link to item</a>
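As a sketch of how a program sees these elements, Python's standard `html.parser` module can list the tags in a fragment. The class name `TagCollector` is made up for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Remember the name of every opening tag that is encountered."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


HTML_TEXT = (
    "<p>Paragraph</p><h1>Header</h1>"
    '<img src="1.jpg"/><a href="1.html">Link to item</a>'
)

collector = TagCollector()
collector.feed(HTML_TEXT)
print(collector.tags)
```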
Browsers
Web browsers retrieve data (mostly HTML code) from a server and display it on
the screen

Nowadays browsers are free, but in the past people had to buy them
History of browsers
1990 - WorldWideWeb browser (later renamed Nexus)

1993 - Mosaic (its developers later created Netscape)

1995 - Internet Explorer, as an answer to Netscape

1996 - Opera

2003 - Apple's Safari

2004 - Firefox 1.0, built on code descended from Netscape

2008 - Google's Chrome


Usage of browsers
