Random Selection
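-- Keep roughly 1% of rows, shuffle them across and within reducers,
-- then cap the sample at 100,000 rows.
select * from data_base.table_name
where rand() <= 0.01
distribute by rand()
sort by rand()
limit 100000;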
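To reuse the query above with a different table, sampling rate, or row cap, it can be generated from the shell. The following is only a minimal sketch: write_sample_query, its arguments, and the file name random_sample.hql are hypothetical names chosen for illustration, not part of Hive.

```shell
#!/usr/bin/env bash
# Sketch: write the random-sampling query to a .hql file.
# write_sample_query and its arguments are hypothetical names, not part of Hive.
write_sample_query() {
  local table="$1"      # e.g. data_base.table_name
  local fraction="$2"   # share of rows to keep, e.g. 0.01 for about 1%
  local max_rows="$3"   # upper bound on the sample size
  local out_file="$4"   # .hql file to create or overwrite

  # Unquoted heredoc delimiter so the shell variables are expanded.
  cat > "$out_file" <<EOF
select * from ${table}
where rand() <= ${fraction}
distribute by rand()
sort by rand()
limit ${max_rows};
EOF
}

# Reproduce the query shown above (assumed output file name).
write_sample_query data_base.table_name 0.01 100000 random_sample.hql
```

The arguments are pasted into the query text as-is, so this is only suitable for values you control yourself.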
Download Manually
Run the Hive query.
When it finishes, scroll down to the results and click the download icon (fourth from the top).
Download from Hive Programmatically
Use MobaXterm to connect to the server.
Use vi/vim to put the query in a .hql file. Press i to enter insert mode; when done, press Esc and type :wq to save and exit.
Use nohup to run the .hql file and redirect its output to a log file.
[ajayuser@server ~]$ mkdir ajay
[ajayuser@server ~]$ cd ajay
[ajayuser@server ajay]$ ls
[ajayuser@server ajay]$ vi agesex.hql
[ajayuser@server ajay]$ mv agesex.hql customer_demo.hql
[ajayuser@server ajay]$ ls
customer_demo.hql
[ajayuser@server ajay]$ date=$(date +%Y%m%d)
[ajayuser@server ajay]$ nohup hive -f customer_demo.hql >> log_cust.${date}.log &
[ajayuser@server ajay]$ nohup: ignoring input and redirecting stderr to stdout
To check progress
[ajayuser@server ajay]$ tail -f log_cust.${date}.log
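
The run-and-log steps above can also be wrapped in a small shell function. This is a sketch under assumptions rather than a fixed recipe: run_hql_in_background, HQL_LOG, and HQL_PID are hypothetical names, and the date format %Y%m%d is one possible choice for the log suffix.

```shell
#!/usr/bin/env bash
# Sketch: run a .hql file in the background with a dated log file.
# run_hql_in_background, HQL_LOG and HQL_PID are hypothetical names.
run_hql_in_background() {
  local hql_file="$1"                 # e.g. customer_demo.hql
  local log_prefix="${2:-log_cust}"   # log file prefix, defaults to log_cust
  local run_date
  run_date=$(date +%Y%m%d)            # assumed date format, e.g. 20240131

  # Global on purpose, so the caller can tail the log afterwards.
  HQL_LOG="${log_prefix}.${run_date}.log"

  # nohup keeps the job alive after logout; & returns the prompt immediately.
  # Query results (stdout) and Hive messages (stderr) both go to the log.
  nohup hive -f "$hql_file" >> "$HQL_LOG" 2>&1 &
  HQL_PID=$!

  echo "started hive (pid ${HQL_PID}), logging to ${HQL_LOG}"
}
```

After calling run_hql_in_background customer_demo.hql, progress can be followed with tail -f "$HQL_LOG", exactly as in the session above.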