I have written a Pig script that generates tuples for a Hive table. I am trying to dump the results to a specific partition in HDFS, where Hive stores the table data. Right now the partition value I am using is a timestamp string that is generated inside the Pig script. I need to use this timestamp string to store my Pig script results, but I have no idea how to do that. Any help would be greatly appreciated.
If I understand it right, you read some data from a partition of a Hive table and want to store it into a partition of another Hive table, right?

A Hive partition (from the HDFS perspective) is just a subfolder whose name is constructed like this: fieldname_the_partitioning_is_based_on=value

For example, if you have a date partition, it looks like this: hdfs_to_your_hive_table/date=20160607/

So all you need is to specify this output location in the STORE statement:

    STORE mydata INTO '$HIVE_DB.$TABLE' USING org.apache.hive.hcatalog.pig.HCatStorer('date=$today');
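Since the timestamp in the question is generated inside the script rather than passed in as a parameter, another option is HCatalog's dynamic partitioning: if the partition column is present as a field in the relation, HCatStorer can be called with no argument and it writes each row to the partition named by that field. The following is only a sketch under assumptions: the input path, the columns `id` and `name`, and the `yyyyMMdd` format are hypothetical, and it assumes the target table is partitioned by a string column called `date`, as in the example above.

```pig
-- Sketch only: input path and column names are hypothetical.
-- Run with HCatalog on the classpath, e.g. pig -useHCatalog, and pass
-- HIVE_DB and TABLE with -param as in the STORE statement above.
mydata = LOAD 'input/events.csv' USING PigStorage(',')
         AS (id:int, name:chararray);

-- Add the partition value as a regular field, computed inside the script.
with_part = FOREACH mydata GENERATE
                id,
                name,
                ToString(CurrentTime(), 'yyyyMMdd') AS date;

-- No partition spec here: HCatStorer takes the partition from the date field.
STORE with_part INTO '$HIVE_DB.$TABLE'
      USING org.apache.hive.hcatalog.pig.HCatStorer();
```

The field names and types in the relation have to match the table's columns, including the partition column. If the value is known before the script starts, passing it with -param and using the static form HCatStorer('date=$today') is simpler.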