Tag
hadoop
4 datasets
PigTutorial - Pig Wiki
Offsite: Apache Pig is a platform for analyzing large data sets. Pig’s language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions, but you can also create your own user-defined functions to do special-purpose processing. Pig Latin ...
Amazon Web Services Public Datasets » Data Wrangling Blog
Offsite
The Cornell Web Lab - The Cornell Web Lab
Offsite
Wikipedia Datasets for the Hadoop Hack | Cloudera
Offsite