import numpy as np
import pandas as pd
df = pd.read_sas('sample.sas7bdat')
df.head()
df.describe()
count 22318.000000 71311.000000 43267.000000 71311.000000 62171.000000 62171.000000 71266.000000 62171.000000 71311.000000
mean 2030.854288 0.606737 0.788014 0.443929 0.152611 0.113719 0.891238 0.022985 0.042798
std 570.785415 0.488478 1.304314 0.496850 0.359615 0.317472 1.378112 0.229908 0.270853
min 1901.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% 1983.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
50% 1991.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
75% 1999.000000 1.000000 3.000000 1.000000 0.000000 0.000000 2.000000 0.000000 0.000000
max 9996.000000 1.000000 3.000000 1.000000 1.000000 1.000000 12.000000 9.000000 6.000000
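The count row above differs from column to column (22318 in one column, 71311 in another) because describe() counts only non-missing values. A minimal sketch of that behaviour on a small in-memory CSV follows; the column names and values are made up for illustration and are not taken from sample.sas7bdat.

```python
import io

import pandas as pd

# Hypothetical data: the last row has no value for "failures".
csv_text = "age,studytime,failures\n15,2,0\n16,1,1\n17,3,0\n18,2,\n"

df_demo = pd.read_csv(io.StringIO(csv_text))
summary = df_demo.describe()

# "count" is the number of non-missing entries per column,
# so it is lower for "failures" than for "age".
print(summary.loc["count"])

# isna().sum() reports the missing entries directly.
missing = df_demo.isna().sum()
print(missing)
```

Comparing the count row with len(df) is a quick way to see which columns contain gaps before relying on the mean or the quartiles.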
df = pd.read_csv('sample.csv')
df.head()
df.describe()
age Medu Fedu traveltime studytime failures famrel freetime goout Dalc
count 395.000000 395.000000 395.000000 395.000000 395.000000 395.000000 395.000000 395.000000 395.000000 395.000000 395.000
mean 16.696203 2.749367 2.521519 1.448101 2.035443 0.334177 3.944304 3.235443 3.108861 1.481013 2.291
std 1.276043 1.094735 1.088201 0.697505 0.839240 0.743651 0.896659 0.998862 1.113278 0.890741 1.287
min 15.000000 0.000000 0.000000 1.000000 1.000000 0.000000 1.000000 1.000000 1.000000 1.000000 1.000
25% 16.000000 2.000000 2.000000 1.000000 1.000000 0.000000 4.000000 3.000000 2.000000 1.000000 1.000
50% 17.000000 3.000000 2.000000 1.000000 2.000000 0.000000 4.000000 3.000000 3.000000 1.000000 2.000
75% 18.000000 4.000000 3.000000 2.000000 2.000000 0.000000 5.000000 4.000000 4.000000 2.000000 3.000
max 22.000000 4.000000 4.000000 4.000000 4.000000 3.000000 5.000000 5.000000 5.000000 5.000000 5.000
df = pd.read_xml('sample.xml')
df.head()
df.describe()
df = pd.read_json('sample.json')
df.head()
df.describe()
sepalLength sepalWidth petalLength petalWidth
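url = 'Financial Sample.xlsx'  # assumption: url is not defined in this excerpt; this is the workbook read later on
df = pd.read_excel(url)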
print(df)
df.describe()
count 700.000000 700.000000 700.000000 7.000000e+02 700.000000 7.000000e+02 700.000000 700.000000 700 700.
mean 1608.294286 96.477143 118.428571 1.827594e+05 13150.354629 1.696091e+05 145475.211429 24133.860371 2014-04-28 21:36:00
min 200.000000 3.000000 7.000000 1.799000e+03 0.000000 1.655080e+03 918.000000 -40617.500000 2013-09-01 00:00:00
25% 905.000000 5.000000 12.000000 1.739175e+04 800.320000 1.592800e+04 7490.000000 2805.960000 2013-12-24 06:00:00
50% 1542.500000 10.000000 20.000000 3.798000e+04 2585.250000 3.554020e+04 22506.250000 9242.200000 2014-05-16 12:00:00
75% 2229.125000 250.000000 300.000000 2.790250e+05 15956.343750 2.610775e+05 245607.500000 22662.000000 2014-09-08 12:00:00
max 4492.500000 260.000000 350.000000 1.207500e+06 149677.500000 1.159200e+06 950625.000000 262200.000000 2014-12-01 00:00:00
std 867.427859 108.602612 136.775515 2.542623e+05 22962.928775 2.367263e+05 203865.506118 42760.626563 NaN
import pickle
df = pd.read_pickle('data.pkl')
df
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\io\pickle.py:206, in read_pickle(filepath_or_buffer, compression, storage_options)
205 warnings.simplefilter("ignore", Warning)
--> 206 return pickle.load(handles.handle)
207 except excs_to_catch:
208 # e.g.
209 # "No module named 'pandas.core.sparse.series'"
210 # "Can't get attribute '__nat_unpickle' on <module 'pandas._libs.tslib"
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\io\pickle.py:211, in read_pickle(filepath_or_buffer, compression, storage_options)
206 return pickle.load(handles.handle)
207 except excs_to_catch:
208 # e.g.
209 # "No module named 'pandas.core.sparse.series'"
210 # "Can't get attribute '__nat_unpickle' on <module 'pandas._libs.tslib"
--> 211 return pc.load(handles.handle, encoding=None)
212 except UnicodeDecodeError:
213 # e.g. can occur for files written in py27; see GH#28645 and GH#31988
214 return pc.load(handles.handle, encoding="latin-1")
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\compat\pickle_compat.py:225, in load(fh, encoding, is_verbose)
222 # "Unpickler" has no attribute "is_verbose" [attr-defined]
223 up.is_verbose = is_verbose # type: ignore[attr-defined]
--> 225 return up.load()
226 except (ValueError, TypeError):
227 raise
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\compat\pickle_compat.py:156, in Unpickler.find_class(self, module, name)
154 key = (module, name)
155 module, name = _class_locations_map.get(key, key)
--> 156 return super().find_class(module, name)
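The traceback above stops inside Unpickler.find_class, the step where pickle imports the module that defined the stored object. The excerpt does not show which module was missing, so the following is only a standard-library sketch of the same failure mode. The module name legacy_pkg_demo, the class Record and the helper safe_load are hypothetical names introduced for illustration.

```python
import pickle
import sys
import types

# Build a throwaway module (hypothetical name) that owns a small class.
mod = types.ModuleType("legacy_pkg_demo")


class Record:
    def __init__(self, value):
        self.value = value


# Make pickle record the class as legacy_pkg_demo.Record.
Record.__module__ = "legacy_pkg_demo"
Record.__qualname__ = "Record"
mod.Record = Record
sys.modules["legacy_pkg_demo"] = mod

# A pickle stores the module and class name, not the class code itself.
payload = pickle.dumps(Record(42))

# Simulate loading the file in an environment without that module.
del sys.modules["legacy_pkg_demo"]


def safe_load(data):
    """Unpickle data, reporting the missing module instead of raising."""
    try:
        return pickle.loads(data)
    except ModuleNotFoundError as exc:
        return f"missing module: {exc.name}"


result = safe_load(payload)
print(result)

# Once the module is importable again, the same bytes load normally.
sys.modules["legacy_pkg_demo"] = mod
restored = pickle.loads(payload)
print(restored.value)
```

For a pickled DataFrame the same reasoning suggests checking that the library versions used to write the file are available where it is read, or exchanging data in a format such as CSV that does not depend on importable classes.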
url = 'sample2.txt'
# Read the whole file; the with block closes it even if reading fails
with open(url, mode="r") as txt:
    text = txt.read()
print(text)
print(len(text))
Aeque enim contingit omnibus fidibus, ut incontentae sint.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quae cum ita sint, effectum est nihil esse malum, quod turpe non sit. Itaque nostrum est-quod nostrum dico, artis est-ad ea principia, quae accepimus. Quod totum contra est. Duo Reges: constructio interrete. Atqui iste locus est, Piso, tibi etiam atque etiam confirmandus, inquam; Quamvis enim depravatae non sint, pravae tamen esse possunt. Duarum enim vitarum nobis erunt instituta capienda.
Non igitur de improbo, sed de callido improbo quaerimus, qualis Q. Audio equidem philosophi vocem, Epicure, sed quid tibi dicendum sit oblitus es. Ex ea difficultate illae fallaciloquae, ut ait Accius, malitiae natae sunt. At multis malis affectus. Nam quibus rebus efficiuntur voluptates, eae non sunt in potestate sapientis. Quis est tam dissimile homini. Ut proverbia non nulla veriora sint quam vestra dogmata. Si quicquam extra virtutem habeatur in bonis. Sed plane dicit quod intellegit. Paulum, cum regem Persem captum adduceret, eodem flumine invectio? Qui ita affectus, beatum esse numquam probabis; Sed nimis multa. Nam prius a se poterit quisque discedere quam appetitum earum rerum, quae sibi conducant, amittere. Familiares nostros, credo, Sironem dicis et Philodemum, cum optimos viros, tum homines doctissimos. Quod iam a me expectare noli. Quid ergo?
Eademne, quae restincta siti? Ita relinquet duas, de quibus etiam atque etiam consideret. Illa videamus, quae a te de amicitia dicta sunt. Eaedem res maneant alio modo. Quid ergo attinet gloriose loqui, nisi constanter loquare? Prioris generis est docilitas, memoria; Portenta haec esse dicit, neque ea ratione ullo modo posse vivi; Beatum, inquit. Bestiarum vero nullum iudicium puto.
Quem Tiberina descensio festo illo die tanto gaudio affecit, quanto L. Quorum sine causa fieri nihil putandum est. Tria genera bonorum; Nunc dicam de voluptate, nihil scilicet novi, ea tamen, quae te ipsum probaturum esse confidam. Illud dico, ea, quae dicat, praeclare inter se cohaerere. Fortemne possumus dicere eundem illum Torquatum? Hoc tu nunc in illo probas. Cur post Tarentum ad Archytam?
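The text cell above prints the file and its length in characters. As a small sketch of the same read pattern with an explicit encoding, plus line and word counts, the snippet below writes a short stand-in file to a temporary directory first so that it runs on its own. The file name sample2_demo.txt and the sample string are placeholders, not the contents of sample2.txt.

```python
import os
import tempfile

# Placeholder content standing in for the real text file.
sample = "Lorem ipsum dolor sit amet.\nQuid ergo?\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample2_demo.txt")

    # newline="" writes the "\n" characters unchanged on every platform.
    with open(path, mode="w", encoding="utf-8", newline="") as fh:
        fh.write(sample)

    # An explicit encoding avoids depending on the platform default.
    with open(path, mode="r", encoding="utf-8") as fh:
        text_demo = fh.read()

n_chars = len(text_demo)               # characters, including newlines
n_lines = len(text_demo.splitlines())  # lines
n_words = len(text_demo.split())       # whitespace-separated words
print(n_chars, n_lines, n_words)
```

len() on a str counts characters rather than bytes, so the result can differ from the file size on disk when the text contains non-ASCII characters.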
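import PyPDF2
# Open the PDF in binary mode; the with block closes the file afterwards
with open('file-sample_150kB.pdf', 'rb') as pdf_file:
    pdf = PyPDF2.PdfReader(pdf_file)
    page = pdf.pages[0]  # first page only
    print(page.extract_text())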
Lorem ipsum
Lorem ipsum dolor sit amet, consectetur adipiscing
elit. Nunc ac faucibus odio.
Vestibulum neque massa, scelerisque sit amet ligula eu, congue molestie mi. Praesent ut
varius sem. Nullam at porttitor arcu, nec lacinia nisi. Ut ac dolor vitae odio interdum
condimentum. Vivamus dapibus sodales ex, vitae malesuada ipsum cursus
convallis. Maecenas sed egestas nulla, ac condimentum orci. Mauris diam felis,
vulputate ac suscipit et, iaculis non est. Curabitur semper arcu ac ligula semper, nec luctus
nisl blandit. Integer lacinia ante ac libero lobortis imperdiet. Nullam mollis convallis ipsum,
ac accumsan nunc vehicula vitae. Nulla eget justo in felis tristique fringilla. Morbi sit amet
tortor quis risus auctor condimentum. Morbi in ullamcorper elit. Nulla iaculis tellus sit amet
mauris tempus fringilla.
Maecenas mauris lectus, lobortis et purus mattis, blandit dictum tellus.
·Maecenas non lorem quis tellus placerat varius.
·Nulla facilisi.
·Aenean congue fringilla justo ut aliquam.
·Mauris id ex erat. Nunc vulputate neque vitae justo facilisis, non condimentum ante
sagittis.
·Morbi viverra semper lorem nec molestie.
·Maecenas tincidunt est efficitur ligula euismod, sit amet ornare est vulputate.
Row 1Row 2Row 3Row 4024681012
Column 1
Column 2
Column 3
df = pd.read_excel('Financial Sample.xlsx')
af = df[['Segment', 'Product']]
print(af)
              Segment    Product
0          Government  Carretera
1          Government  Carretera
2           Midmarket  Carretera
3           Midmarket  Carretera
4           Midmarket  Carretera
..                ...        ...
695    Small Business   Amarilla
696    Small Business   Amarilla
697        Government    Montana
698        Government      Paseo
699  Channel Partners        VTT
Text
0 Lorem ipsum \nLorem ipsum dolor sit amet, cons...
1 In non mauris justo. Duis vehicula mi vel mi p...
2 Lorem ipsum dolor sit amet, consectetur adipis...
3