Professional Documents
Culture Documents
3 Lexical analysis
In case of Python
1.3.3 Identifiers
1.3.3.1 Keywords
1.3.3.2 Reserved classes of identifiers
1.3.4 Literals
1.3.4.1 String literals
1.3.4.1.1 String literal concatenation
1.3.4.1.2 Formatted string literals
1.3.4.2 Bytes literals
1.3.4.3 Numeric literals
1.3.4.4 Integer literals
1.3.4.5 Floating point literals
1.3.4.6 Imaginary literals
1.3.5 Operators
1.3.6 Delimiters
https://en.wikipedia.org/wiki/Source_lines_of_code
1.3.1.1 Logical lines
The end of a logical line is represented by the token NEWLINE.
Statements cannot cross logical line boundaries except where NEWLINE is
allowed by the syntax (e.g., between statements in compound statements).
A logical line is constructed from one or more physical lines by
following the explicit or implicit line joining rules.
All of these forms can be used equally, regardless of platform. The end
of input also serves as an implicit terminator for the final physical line.
When embedding Python, source code strings should be passed to Python
APIs using the standard C conventions for newline characters (the \n character,
representing ASCII LF, is the line terminator).
1.3.1.3 Comments
A comment starts with a hash character (#) that is not part of a string
literal, and ends at the end of the physical line.
A comment signifies the end of the logical line unless the implicit
line joining rules are invoked.
Comments are ignored by the syntax.
the first group of this expression names the encoding of the source
code file.
The encoding declaration must appear on a line of its own.
If it is the second line, the first line must also be a comment-only
line.
In addition, if the first bytes of the file are the UTF-8 byte-
order mark (b'\xef\xbb\xbf'), the declared file encoding is UTF-8 (this is
supported, among others, by Microsoft’s notepad).
If an encoding is declared, the encoding name must be recognized
by Python. The encoding is used for all lexical analysis, including string
literals, comments and identifiers.
a = 33
b = 200
if b \#A line ending in a backslash cannot carry a
comment.[Execution terminates here]
> a:
a = 33
b = 200
if b \#A backslash does not continue a comment.
[Execution terminates here]
> a:
1.3.1.8 Indentation
Leading whitespace (spaces and tabs) at the beginning of a logical line
is used
to compute the indentation level of the line, which in turn is used to
determine the grouping of statements.
1. Tabs are replaced (from left to right) by one to eight spaces such
that the total number of characters up to and including
the replacement is a multiple of eight (this is intended to be
the same rule as used by Unix).
1.3.3 Identifiers
Identifiers (also referred to as names) are described by the following
lexical definitions.
The syntax of identifiers in Python is based on the Unicode standard annex
UAX-31, with elaboration and changes as defined below;
see also PEP 3131 for further details.Python 3.0 introduces additional
characters from outside the ASCII range (see PEP 3131).
For these characters, the classification uses the version of the Unicode
Character Database as included in the unicodedata module.
Lu - uppercase letters
Ll - lowercase letters
Lt - titlecase letters
Lm - modifier letters
Lo - other letters
Nl - letter numbers
Mn - nonspacing marks
Nd - decimal numbers
Pc - connector punctuations
Other_ID_Continue - likewise
All identifiers are converted into the normal form NFKC while
parsing; comparison of identifiers is based on NFKC.
A non-normative HTML file listing all valid
identifier characters for Unicode 4.1 can be found at
https://www.unicode.org/Public/13.0.0/ucd/DerivedCoreProperties.txt
1.3.3.1 Keywords
1.3.3.2 Reserved classes of identifiers
1.3.4 Literals
1.3.5 Operators
The following tokens are operators:
+ - * ** / // % @
<< >> & | ^ ~ :=
< > <= >= == !=
1.3.6 Delimiters
The following tokens serve as delimiters in the grammar:
( ) [ ] { }
, : . ; @ = ->
+= -= *= /= //= %= @=
&= |= ^= >>= <<= **=
The following printing ASCII characters have special meaning as part of other
tokens or are otherwise significant to the lexical analyzer:
' " # \
The following printing ASCII characters are not used in Python. Their
occurrence outside string literals and comments is an unconditional error:
$ ? `
2. Comments in Python start with the hash character, #, and extend to the end
of the physical line.
A comment may appear at the start of a line or following whitespace or code,
but not within a string literal.
Numbers:
1. Integer
2. Floating Point
Strings
1. strings, which can be expressed in several ways.
2. can be enclosed in single quotes ('...') or double quotes ("...")
with the same result.
3. \ can be used to escape quotes.
4. Unlike other languages, special characters such as \n have the same
meaning with both single ('...') and double ("...") quotes. The only difference
between the two is that within single quotes you don’t need to escape " (but you
have to escape \')
Python 3 basics
https://www.geeksforgeeks.org/python-3-basics/
1. Variables and Data Structures
2. Comments
3. Input and Output
4. Selection
5. Functions
6. Iteration (Looping)
7. Modules
In other programming languages like C, C++, and Java, you will need to
declare the type of variables but in Python
you don’t need to do that. Just type in the variable and when values
will be given to it, then it will automatically
know whether the value given would be an int, float, or char or even a
String.
myNumber2 = 4.5
print(myNumber2)
myNumber ="helloworld"
print(myNumber)
Output:
3
4.5
helloworld
Comments: