If you want to write your own UDF (User Defined Function) and inject it at one
specific point in the pipeline,
it is straightforward.
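As a rough sketch of what injecting a UDF at one step can look like, the script below registers a jar, gives the function a short alias, and calls it inside a single FOREACH. The jar name, the class com.example.pig.NormalizeId, the input path, and the field names are all hypothetical and only there for illustration.

```pig
-- Hypothetical jar containing the compiled UDF class.
REGISTER myudfs.jar;

-- Hypothetical class name; DEFINE only gives it a shorter alias.
DEFINE NormalizeId com.example.pig.NormalizeId();

-- Assumed input file and fields, declared with explicit types.
events = LOAD 'truck_events.csv' USING PigStorage(',')
         AS (driverId:chararray, eventType:chararray);

-- The UDF is applied at exactly one step; the rest of the pipeline is unchanged.
cleaned = FOREACH events GENERATE NormalizeId(driverId) AS driverId, eventType;

DUMP cleaned;
```

Only the FOREACH line knows about the UDF, so it can be swapped out or removed without touching the LOAD or any later step.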
The data schema is not enforced explicitly but implicitly. I think this is a big
one, too.
In my experience, debugging Pig scripts is about the schema 90% of the time.
Because Pig does not enforce an explicit schema, a data structure sometimes
becomes bytearray, which is a "raw" data type, and unless you coerce the fields,
even the strings, they turn into bytearray without notice.
This may propagate to later steps of the data processing.
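One way to keep fields from silently becoming bytearray is to declare types when loading, or to cast explicitly afterwards. The sketch below shows both; the file path and the field names are assumptions made up for the example.

```pig
-- No AS clause: Pig has no schema for this relation,
-- so its fields are treated as bytearray.
raw = LOAD 'truck_events.csv' USING PigStorage(',');
DESCRIBE raw;

-- Declaring types at load time (assumed field names).
typed = LOAD 'truck_events.csv' USING PigStorage(',')
        AS (driverId:chararray, eventType:chararray);
DESCRIBE typed;

-- Alternatively, cast positional fields explicitly, strings included.
casted = FOREACH raw GENERATE
             (chararray)$0 AS driverId,
             (chararray)$1 AS eventType;
DESCRIBE casted;
```

Running DESCRIBE after each step is a cheap way to spot where a field has lost its type before the problem propagates further down the script.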
PigServer is a class for Java programs to connect to Pig. Typically a program
will create a PigServer instance.
pig
Basic commands
sh ls
clear
help
DUMP truck_events_subset;