
10/4/21, 8:07 PM Data Science Intern_2nd round Assessment 

Data Science Intern_2nd round Assessment
The assessment will take approximately 120 minutes to complete. Please submit the answers
within the allotted time. 

1. Name  *

KONDA RAVI KIRAN

2. Email ID *

ravikiran.konda9999@gmail.com

3. Contact Number  *

9652580852

4. Write a function to split a paragraph into sentences. Do not use libraries for sentence
tokenization.


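def convert_into_sentances(data):
    # Pad the text and flatten newlines so the paragraph is one line.
    data = " " + data + " "
    data = data.replace("\n", " ")
    # Move end punctuation outside a closing quote so the quote stays with its sentence.
    data = data.replace(".”", "”.")
    data = data.replace(".\"", "\".")
    data = data.replace("!\"", "\"!")
    data = data.replace("?\"", "\"?")
    # Mark every sentence terminator with a <stop> token.
    data = data.replace(".", ".<stop>")
    data = data.replace("?", "?<stop>")
    data = data.replace("!", "!<stop>")
    # Restore periods protected as <prd>; a no-op unless the input already contains <prd>.
    data = data.replace("<prd>", ".")
    # Split on the markers and drop empty pieces, so a final sentence
    # without end punctuation is not lost.
    sentences = [s.strip() for s in data.split("<stop>")]
    return [s for s in sentences if s]

# Usage: store the paragraph in a variable and pass it to the function:
# sentences = convert_into_sentances(paragraph)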
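
The `<prd>` replacement in the answer above only has an effect if periods that do not end a sentence were swapped for `<prd>` before the `<stop>` markers are added. A minimal sketch of such a protection step follows; the abbreviation list `ABBREVIATIONS` and the function name `split_with_abbreviations` are hypothetical and not part of the original answer.

```python
# Hypothetical abbreviation list (an assumption); extend it as needed.
ABBREVIATIONS = ("Mr.", "Mrs.", "Dr.", "e.g.", "i.e.")

def split_with_abbreviations(paragraph):
    text = paragraph.replace("\n", " ")
    # Protect periods inside known abbreviations with a <prd> placeholder.
    for abbr in ABBREVIATIONS:
        text = text.replace(abbr, abbr.replace(".", "<prd>"))
    # Mark the remaining sentence terminators.
    for mark in (".", "?", "!"):
        text = text.replace(mark, mark + "<stop>")
    # Restore the protected periods, then split and drop empty pieces.
    text = text.replace("<prd>", ".")
    return [s.strip() for s in text.split("<stop>") if s.strip()]

print(split_with_abbreviations("Dr. Rao arrived late. Was Mr. Das there? Yes!"))
```

This is plain substring replacement, so an abbreviation that also happens to end a sentence would not be split there; it only illustrates what the placeholder is for.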

5. Write a function to get n-grams from text. Print the output with the highest frequency
first. If two or more n-grams have equal occurrences, print them in alphabetical order. Note:
the input could be either a sentence or a paragraph. (An N-gram is a sequence of N
words. For example, in the statement "Apple and peach are fruits.", a 2-gram gives
"Apple and", "and peach", "peach are" and "are fruits."; a 3-gram gives
"Apple and peach", "and peach are" and "peach are fruits.".)

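text = data  # assumes `data` holds the sentence or paragraph as a string
n = 2  # n-gram size (assumed here; the question leaves N open)
words = text.split()
n_grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
print(n_grams)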
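
Question 5 also asks for the n-grams to be printed highest frequency first, with ties in alphabetical order. A minimal sketch of that ordering is given below; the function name `ranked_ngrams` and the sample sentence are illustrative assumptions, and whitespace splitting is assumed to be an acceptable way to find words.

```python
from collections import Counter

def ranked_ngrams(text, n):
    # Split on whitespace; works for a sentence or a paragraph.
    words = text.split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(grams)
    # Sort by descending count, then alphabetically for equal counts.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

for gram, count in ranked_ngrams("the cat sat on the mat the cat ran", 2):
    print(count, gram)
```

The tie-break uses Python's default string ordering, which places uppercase letters before lowercase ones; lowercasing the text first would give a case-insensitive order.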

6. Given N documents and a query, we want to score the documents such that the
document with the maximum match with the query appears on top. Write a function
to implement this and justify your logic.

Sorry to say, the time is not sufficient, sir.
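
No answer was submitted for this question. Purely as an illustration of one possible approach, and not the respondent's method, the sketch below scores each document by summing a length-normalised term frequency times a smoothed inverse document frequency over the query terms. The names `tokenize` and `rank_documents`, the tokenisation rule, and the sample documents are all assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and keep runs of letters and digits as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_documents(documents, query):
    """Return (document_index, score) pairs, best match first."""
    doc_counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    query_terms = set(tokenize(query))
    ranked = []
    for index, counts in enumerate(doc_counts):
        total = sum(counts.values()) or 1  # avoid dividing by zero for empty documents
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for c in doc_counts if term in c)
            if doc_freq == 0:
                continue  # the term appears in no document
            # Smoothed IDF: terms found in fewer documents weigh more.
            idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
            score += (counts[term] / total) * idf
        ranked.append((index, score))
    # Highest score first; ties keep the original document order.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

docs = ["apple banana apple", "banana cherry", "dog cat"]
print(rank_documents(docs, "apple cherry"))
```

The reasoning behind this choice: a document that uses the query words often relative to its length should rank higher, and a query word that appears in few documents says more about relevance than one that appears in all of them.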


7. Write a function to remove duplicates from an array.

def Remove_duplicates(numbers):
    # Keep the first occurrence of each value, preserving the original order.
    rm_duplicates = []
    for num in numbers:
        if num not in rm_duplicates:
            rm_duplicates.append(num)
    return rm_duplicates

# Usage: pass the numbers as a list, for example:
# x = [1, 2, 2, 3]
# Remove_duplicates(x)
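
The loop above checks membership in a list for every element, so its running time grows quadratically with the input size. When the items are hashable, the same order-preserving result can be had in roughly linear time. A small sketch, with the hypothetical name `remove_duplicates_hashable`:

```python
def remove_duplicates_hashable(items):
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so the first occurrence of each item is retained.
    return list(dict.fromkeys(items))

print(remove_duplicates_hashable([3, 1, 3, 2, 1]))
```

Unhashable items such as nested lists would raise a `TypeError` here, in which case the list-based loop remains the safer choice.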

8. Write a function to calculate factorial of a number.

def find_fact():
    # Read the number from the user.
    n = int(input("Enter the number to compute the factorial of: "))
    fact = 1
    if n < 0:
        print("No negative numbers")
    elif n == 0:
        print("The factorial of 0 is 1")
    else:
        # Multiply 1 * 2 * ... * n.
        for i in range(1, n + 1):
            fact = fact * i
        print("The factorial of", n, "is", fact)

find_fact()
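
The answer above reads its input and prints the result inside the function, which makes it hard to reuse or test. A hedged variant that takes the number as a parameter and returns the value is sketched below; the name `factorial` and the choice to raise `ValueError` for negative input are assumptions, not part of the original answer.

```python
def factorial(n):
    # Factorial is not defined for negative integers.
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    result = 1
    # Multiply 2 * 3 * ... * n; the loop is skipped for n = 0 and n = 1.
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial(5))
```

The standard library's `math.factorial` provides the same computation for non-negative integers.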
