River IQ


Spark UDF with withColumn

  Ashish Kumar Spark February 14, 2020

import org.apache.spark.sql.functions._

val events = Seq(
  (1,1,2,3,4),
  (2,1,2,3,4),
  (3,1,2,3,4),
  (4,1,2,3,4),
  (5,1,2,3,4)).toDF("ID","amt1","amt2","amt3","amt4")

var prev_amt5 = 0
var i = 1

def getamt5value(ID: Int, amt1: Int, amt2: Int, amt3: Int, amt4: Int): Int = {
  if (i == 1) {
    i = i + 1
    prev_amt5 = 0
  } else {
    i = i + 1
  }
  if (ID == 0) {
    if (amt1 == 0) {
      val cur_amt5 = 1
      prev_amt5 = cur_amt5
      cur_amt5
    } else {
      val cur_amt5 = 1 * (amt2 + amt3)
      prev_amt5 = cur_amt5
      cur_amt5
    }
  } el...


Transfer files from Windows to Linux

  Ashish Kumar other February 14, 2020

riveriq_copytoedge.bat

@echo off
set timestamp=%DATE:/=-%_%TIME::=-%
C:\"Program Files (x86)"\WinSCP\WinSCP /script="E:\riveriq\artifacts\copytoedge\config\riveriq_copytoedge.txt" /log="E:\riveriq\logging\copytoedge\config\%timestamp%.log"

riveriq_copytoedge.txt

# Being intended for interactive session, we are not enabling batch mode
# Connect
open sftp://testuser:password123@testserver.riveriq.com/
# Synchronize paths provided via environment variables
synchronize remote E:\riveriq\testsource\data\cleansedData\config /data/home/riveriq/testsource/data...
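The batch wrapper does two things: it builds a timestamp with `/` and `:` replaced by `-`, and it launches WinSCP with a `/script` argument and a timestamped `/log` argument. The same wrapper can be sketched in Python (the function names and paths here are illustrative, not from the article):

```python
import datetime
import subprocess

def timestamped_log(log_dir):
    # mirror %DATE:/=-%_%TIME::=-%: date and time joined by '_',
    # with '/' and ':' replaced by '-' so the stamp is a valid file name
    stamp = datetime.datetime.now().strftime("%d-%m-%Y_%H-%M-%S")
    return f"{log_dir}/{stamp}.log"

def run_winscp(winscp_exe, script_path, log_dir):
    # equivalent of the single WinSCP invocation in the batch file
    return subprocess.run(
        [winscp_exe, f"/script={script_path}", f"/log={timestamped_log(log_dir)}"],
        check=True,
    )
```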


Read JCEKS Containing Secret Keys Using Java

  Ashish Kumar java February 14, 2020

package com.riveriq.db2con.driver;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.riveriq.exception.CustomException;
import com.riveriq.util.ReadJceks;

public class DB2Connection {

  private Connection conn;
  private static DB2Connection db2connection;
  private static Logger LOGGER = LoggerFactory.getLogger(DB2Connection.class);

  private DB2Connection() {
  }

  public Connection getConnection(Properties prop) throws SQLExcept...


How to remove new lines within double quotes

  Ashish Kumar other February 14, 2020

#!/usr/bin/perl

use warnings;
use strict;
use Path::Tiny;
use Text::CSV;
use Time::Piece;
use File::Path qw( make_path );
use diagnostics;
use Try::Tiny;
#use File::NCopy;
use File::Copy::Recursive qw(fcopy rcopy dircopy fmove rmove dirmove);
use Time::HiRes qw( time );

my $start = time();
my $date = localtime->strftime('%Y%m%d');
my $feed_date = $date;
if (exists($ARGV[3])) {
  $feed_date = $ARGV[3];
}

# build source directory path ==>
my $source_feed_dir = $ARGV[0];
my $source_feed_dir_path = path($source_feed_dir);

# process i.e. current date ...
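The Perl script above pulls in Text::CSV for quote-aware parsing. The same idea — strip newlines that sit inside quoted fields without touching real record boundaries — can be sketched with Python's standard csv module (the function name is hypothetical):

```python
import csv
import io

def strip_newlines_in_quotes(raw_csv):
    """Re-emit CSV with newlines inside quoted fields replaced by spaces.

    csv.reader understands quoting, so an embedded "\n" inside a quoted
    field arrives as part of the field value, not as a new record.
    """
    reader = csv.reader(io.StringIO(raw_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(
            [field.replace("\r", " ").replace("\n", " ") for field in row]
        )
    return out.getvalue()
```

For example, `strip_newlines_in_quotes('a,"line1\nline2",c\n')` yields `'a,line1 line2,c\n'` — one record instead of a field broken across two physical lines.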


Databricks Log4j Configuration

  Ashish Kumar Databricks January 15, 2020

System.out.println("Caught Error")

This writes output to the console window, which is fine while you are developing or running the application manually. But what if you are scheduling it as a job or automating it? In that case the output, or logs, should go to some persistent location. You can persist them anywhere: a database table, an email server, or a log file. So here we discuss how to write our logs into a log file, and one solution is Log4j. I won't be explaining much about Log4j here; I'm sure you already know it, or you can ...
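The core idea — route messages to a persistent file instead of the console — is language-neutral. As a sketch, here is the same move in Python's standard logging module (used as an analogue of Log4j, since the article's actual Log4j setup continues past the excerpt; the logger name and file path are illustrative):

```python
import logging
import os
import tempfile

# Route output to a persistent file instead of the console.
# (Python's standard logging stands in for Log4j here; the file
# name is illustrative.)
log_path = os.path.join(tempfile.gettempdir(), "riveriq_app.log")

logger = logging.getLogger("riveriq.example")
handler = logging.FileHandler(log_path, mode="w")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s - %(message)s")
)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Caught Error")  # lands in the log file, not on stdout
```

Unlike a bare print, the handler gives you a timestamped, leveled record in a file a scheduler or operator can inspect later.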


Log4j Configuration with spark-submit

  Ashish Kumar java January 15, 2020

This is the second part of the Log4j configuration series, covering Spark applications. For more background on Log4j you can follow the link below:

https://www.linkedin.com/pulse/databricks-log4j-configuration-ashish-kumar/

Spark-submit:

/usr/hdp/3.0.1.0-187/spark2/bin/spark-submit \
  --master yarn \
  --queue dev \
  --deploy-mode client \
  --class com.riveriq.log4jExample \
  --driver-java-options "-Dlog4j.configuration=file:/home/riveriq/log4j/conf/log4j.xml" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/home/riveriq/log4j/conf/log4j.xml" \
  --num-exe...
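Both -Dlog4j.configuration flags above point the driver and the executors at /home/riveriq/log4j/conf/log4j.xml. For orientation, a minimal Log4j 1.x XML file of that general shape might look like the following (the appender name, log path, sizes, and pattern are illustrative, not taken from the article):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <appender name="file" class="org.apache.log4j.RollingFileAppender">
    <param name="File" value="/home/riveriq/log4j/logs/app.log"/>
    <param name="MaxFileSize" value="10MB"/>
    <param name="MaxBackupIndex" value="5"/>
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n"/>
    </layout>
  </appender>
  <root>
    <priority value="INFO"/>
    <appender-ref ref="file"/>
  </root>
</log4j:configuration>
```

Because the same file is handed to driver and executors, driver-side and executor-side logs end up formatted consistently.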


Azure Databricks Notebook - How to get current workspace name

  Ashish Kumar Databricks January 15, 2020

Sometimes you find yourself in a situation where you feel something should be very easy, but once you start looking into it, you find it's not. Something of that sort happened to me as well, and I'm sharing what I learned with you all. I was looking to get the current workspace name from a notebook, as I wanted to derive the environment type (dev/test/stg/prod) from the workspace name and use it in the notebook configuration. I did some research but couldn't succeed, or I would say it isn't possible to get workspace details from a notebook, and the reason be...


Sqoop import to Text, Avro, Parquet, Sequence

  Ashish Kumar sqoop January 27, 2019

In my previous article I explained how we can sqoop data into an Avro file, what kinds of errors that can throw, and how we can resolve them... Now I am going to show you how we can sqoop import into multiple file formats and build tables on top of them. As we know, we can sqoop data into many file formats, but Sqoop supports direct import for four of them:

File Format        Argument             Description
Avro Data Files    --as-avrodatafile    Imports data to Avro Data Files
...


Hive Integration with Spark

  Ashish Kumar Spark January 22, 2019

Are you struggling to access Hive using Spark? Is your Hive table not showing up in Spark? No worries: here I am going to show you the key changes made in HDP 3.0 for Hive and how we can access Hive using Spark. In HDP 3.0, Spark and Hive each have their own metastore. Hive uses the "hive" catalog, and Spark uses the "spark" catalog. With HDP 3.0, in Ambari you can find the configuration below for Spark. As we know, we could previously access Hive tables in Spark using HiveContext/SparkSession, but in HDP 3.0 we access Hive using the Hive ...


Sqoop Import in Avro Files

  Ashish Kumar sqoop January 22, 2019

Today I will show you how we can sqoop data into the Avro file format. Yeah, we know it's very simple: put --as-avrodatafile in your sqoop import command, as per all the Apache documentation. But in real life, do all documented commands work as simply as written??? Definitely not... And the same happened here as it does for others... so no worries: here I'm going to show you all the probable issues you can face, how you need to debug them, and the resolutions, and if you have a different issue please comment. We will try to solve it together. But before talking to th...
