
Stride.

AI NLP Assessment
1) What would be a good dataset size for training a model that can multiply two
matrices? Explain your reasoning.

A) 100 B) 1000 C) 5000 D) None of these

2) In a corpus of N documents, one document is picked at random. The document
contains a total of T terms, and the term “data” appears K times. What is the correct value of
the product of TF (term frequency) and IDF (inverse document frequency) if the term “data”
appears in approximately one-third of the documents?
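
For reference, the quantities in this question can be sketched in a few lines of Python. This is a minimal sketch that assumes the common textbook definitions TF = K / T and IDF = log(N / n), where n is the number of documents containing the term. The function and parameter names are illustrative, and the natural logarithm is an assumption, since the base varies between conventions.

```python
import math


def tf_idf(term_count, total_terms, num_docs, docs_with_term):
    """Return TF * IDF under the plain textbook definitions (assumed here)."""
    # TF: share of the document's terms that are the term of interest (K / T).
    tf = term_count / total_terms
    # IDF: log of total documents over documents containing the term (N / n).
    # Natural log is an assumption; some conventions use base 10 or base 2.
    idf = math.log(num_docs / docs_with_term)
    return tf * idf
```

Under these assumptions, a term found in one-third of the documents gives N / n = 3, so the product reduces to (K / T) * log(3) regardless of N.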

3) Write Python code that takes a digital PDF document, returns the total number of
paragraphs in it, and prints each paragraph.
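
One possible shape for such a solution is sketched below. It assumes the third-party pypdf package is installed and that paragraphs are separated by blank lines in the extracted text, which depends on how the PDF was produced and will not hold for every file. The names split_paragraphs and count_pdf_paragraphs are hypothetical.

```python
import re


def split_paragraphs(text):
    """Split text on blank lines and collapse whitespace inside each block."""
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(block.split()) for block in blocks if block.strip()]


def count_pdf_paragraphs(path):
    """Print each paragraph of a text-based PDF and return the paragraph count."""
    # Imported here so the splitting helper works without the dependency.
    from pypdf import PdfReader  # third-party package, assumed installed

    reader = PdfReader(path)
    # Pages are joined with a blank line, so a paragraph that continues
    # across a page break is counted twice (a limitation of this sketch).
    text = "\n\n".join((page.extract_text() or "") for page in reader.pages)
    paragraphs = split_paragraphs(text)
    for number, paragraph in enumerate(paragraphs, start=1):
        print(f"[{number}] {paragraph}")
    return len(paragraphs)
```

The blank-line heuristic is the weakest part: many PDFs yield extracted text with no blank lines between paragraphs, in which case layout information such as line spacing or indentation would be needed instead.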

4) Write Python code that takes a digital PDF document, returns the total number of
tables in it, and prints each table.

5) What are the possible features of a text corpus?

A. Count of a word in a document
B. Boolean feature – presence of a word in a document
C. Vector notation of a word
D. Part of Speech Tag
E. Basic Dependency Grammar
F. Entire document as a feature
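
To make the first two options concrete, the sketch below derives both feature types from a raw string. It assumes simple lowercasing and whitespace tokenization, which is cruder than what a real NLP pipeline would use. The function names are illustrative.

```python
from collections import Counter


def word_counts(document):
    """Option A: count of each word in a document (bag of words)."""
    return Counter(document.lower().split())


def word_presence(document, vocabulary):
    """Option B: Boolean feature marking whether each vocabulary word occurs."""
    tokens = set(document.lower().split())
    return {word: word in tokens for word in vocabulary}
```

For example, word_counts("data about data") gives a count of 2 for "data", while word_presence on the same text with the vocabulary ["data", "model"] marks "data" as present and "model" as absent.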
