
Sr. Big Data/Hadoop Developer Resume


Jacksonville, FL

SUMMARY:

  • 7+ years of extensive product development experience in data analytics, data modeling, and software development using Java and Big Data technologies (Hadoop, Apache Spark).
  • Real-time experience with major Hadoop ecosystem components such as MapReduce, HDFS, YARN, Sqoop, Hive, Pig, HBase, and Spark.
  • Experience as a Hadoop and Java developer.
  • Experience with NoSQL databases, including HBase.
  • Extensive knowledge of creating and monitoring Hadoop clusters on VMs, the Hortonworks Sandbox, and Cloudera distributions on Linux.
  • Loaded data into Spark and performed in-memory computation to generate output responses.
  • Experience using Spark SQL to load tables into HDFS and run select queries on top of them (a minimal sketch follows this list).
  • Experience using Pig scripts to extract data from data files and load it into HDFS.
  • Good knowledge of importing and exporting data between RDBMS and HDFS using Sqoop.
  • Good knowledge of cluster coordination and job monitoring tools such as ZooKeeper.
  • Knowledge of designing and developing mobile applications using Java technologies such as JDBC and IDE tools such as Eclipse.
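
A minimal sketch of the Spark SQL usage described above, written in Java against the Spark DataFrame API; the HDFS path, table name, and query are hypothetical placeholders, not from an actual engagement:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlExample {
    public static void main(String[] args) {
        // Hive-enabled session so tables are backed by the HDFS warehouse
        SparkSession spark = SparkSession.builder()
                .appName("SparkSqlExample")
                .enableHiveSupport()
                .getOrCreate();

        // Register a table from files already on HDFS (path is a placeholder)
        Dataset<Row> orders = spark.read().parquet("hdfs:///data/orders");
        orders.createOrReplaceTempView("orders");

        // Run a select query on top of the loaded table
        Dataset<Row> totals = spark.sql(
                "SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id");
        totals.show();

        spark.stop();
    }
}
```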

TECHNICAL SKILLS:

Technologies: Core Java, Python, Apache Spark, Hadoop, Hive, Impala, Sqoop, Kafka, Spark Streaming, AWS S3 storage, NoSQL (Cassandra and HBase), RDBMS, REST APIs, and shell scripting.

Environment: Linux (Ubuntu), Cloudera distribution, Eclipse.

Process/Methodologies: Agile, Scrum; Tools: Git, Jira, Confluence, and Bitbucket.

PROFESSIONAL EXPERIENCE:

Sr. Big Data/Hadoop Developer

Confidential, Jacksonville, FL

Responsibilities:

  • Extracted data from databases and migrated it to HDFS using Spark processes.
  • Designed the architecture of new modules to handle different types of data, relationships between data, and data dependencies.
  • Enabled compression at various phases, such as on intermediate data and final output, to improve the performance of Hive queries.
  • Used the ORC (Optimized Row Columnar) file format to improve the performance of Hive queries.
  • Extracted data from RDBMS into HDFS using Sqoop.
  • Analyzed and integrated raw data from various sources.
  • Extracted data for transformation, calculation, and aggregation.
  • Collected all logs from source systems into HDFS using Kafka and performed analytics on them.
  • Loaded all datasets into Hive from source CSV files using Spark-Java jobs (see the sketch after this list).
  • Migrated computational code from HQL to Spark SQL.
  • Completed data extraction, aggregation, and analysis in HDFS using core Spark and Spark SQL, and stored the needed data in Hive.
  • Wrote Spark jobs in Java to perform transformation and aggregation on the prepared data.
  • Wrote JUnit test cases for code validation; used Mockito and Lombok to improve the JUnit tests.
  • Refactored and improved code quality.
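
A minimal Spark-Java sketch of the CSV-to-Hive load referenced above; the input path, schema options, and database/table name are hypothetical placeholders, and the ORC output format mirrors the ORC usage mentioned in the list:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class CsvToHiveJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("CsvToHiveJob")
                .enableHiveSupport() // required to write managed Hive tables
                .getOrCreate();

        // Read the source CSV files (path is a placeholder)
        Dataset<Row> csv = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs:///landing/source_data/*.csv");

        // Persist as an ORC-backed Hive table
        csv.write()
                .mode(SaveMode.Overwrite)
                .format("orc")
                .saveAsTable("analytics.source_data");

        spark.stop();
    }
}
```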

Environment: Core Java, JUnit, Mockito, Lombok, Apache Spark, Spark SQL, Hive, Sqoop, HBase, and RDBMS.

Big Data Developer

Confidential, Dorchester, MA

Responsibilities:

  • Designed the database architecture for data migration from MySQL to Cassandra.
  • Used Spark Streaming to process data from MySQL to Cassandra using Spark jobs written in Java (see the sketch after this list).
  • Designed the architecture of new modules to handle different types of data, relationships between data, and data dependencies.
  • Enabled compression at various phases, such as on intermediate data and final output, to improve the performance of Hive queries.
  • Used the ORC (Optimized Row Columnar) file format to improve the performance of Hive queries.
  • Created reports on aggregated data.
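
A batch-style Spark-Java sketch of the MySQL-to-Cassandra movement referenced above, using a JDBC read and the spark-cassandra-connector data source; connection details, credentials, keyspace, and table names are hypothetical placeholders (the actual pipeline used Spark Streaming):

```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class MySqlToCassandraJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("MySqlToCassandraJob")
                // Cassandra contact point is a placeholder
                .config("spark.cassandra.connection.host", "127.0.0.1")
                .getOrCreate();

        // Pull the source rows from MySQL over JDBC (URL/credentials are placeholders)
        Properties props = new Properties();
        props.setProperty("user", "etl_user");
        props.setProperty("password", "secret");
        Dataset<Row> users = spark.read()
                .jdbc("jdbc:mysql://mysql-host:3306/appdb", "users", props);

        // Write into Cassandra via the spark-cassandra-connector data source
        users.write()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "app")
                .option("table", "users")
                .mode("append")
                .save();

        spark.stop();
    }
}
```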

Environment: Core Java, Apache Spark, Kafka, Spark Streaming, Cassandra, and MySQL.

Sr. Hadoop Developer

Confidential

Responsibilities:

  • Involved in ingesting data received from various relational database providers onto HDFS for analysis and other big data operations.
  • Created Hive tables to import large data sets from various relational databases using Sqoop, and exported the analyzed data back for visualization and report generation by the BI team.
  • Used the default MapReduce InputFormats and OutputFormats.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Hive and Sqoop.
  • Developed and configured Kafka brokers to pipeline server log data into Spark Streaming (see the sketch after this list).
  • Loaded and transformed large sets of structured and semi-structured data.
  • Involved in developing shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move data files within and outside of HDFS.
  • Managed and reviewed Hadoop log files; deployed and maintained the Hadoop cluster.
  • R&D on "Sharing the state of an Apache Spark RDD between different Spark applications/jobs using Apache Ignite" and prepared a POC on the same.
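
A minimal sketch of a Kafka-to-Spark-Streaming pipeline like the one referenced above, using the spark-streaming-kafka-0-10 integration in Java; the broker address, topic, group id, and the ERROR-count analysis are hypothetical placeholders:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class LogStreamJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("LogStreamJob");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Kafka consumer config (broker, group id, and topic are placeholders)
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "server-logs");
        kafkaParams.put("auto.offset.reset", "latest");

        Collection<String> topics = Arrays.asList("server-logs");
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // Count ERROR lines per micro-batch as a trivial analysis step
        stream.map(ConsumerRecord::value)
              .filter(line -> line.contains("ERROR"))
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```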

Environment: Core Java, Apache Spark, Hadoop, HDFS, MapReduce, Kafka, Sqoop, Hive, ZooKeeper, and Linux (Ubuntu).

Jr. Hadoop Developer

Confidential

Responsibilities:

  • Designed the database architecture.
  • Involved in gathering and analyzing user requirements.
  • Responsible for installation and configuration of Hive, Sqoop, and shell scripts on the Hadoop cluster.
  • Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of customer and transaction data by date.
  • Developed simple and complex programs in Java for data analysis on different data formats.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Java cron jobs (see the sketch after this list).
  • Imported data from different sources such as HDFS into Spark RDDs.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Sqoop scripts on the data.
  • Installed, upgraded, and managed Hadoop clusters.
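
A minimal sketch of the kind of Java job, runnable from cron, that copies local log files into HDFS via the Hadoop FileSystem API; the local and HDFS paths are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogUploader {
    public static void main(String[] args) throws Exception {
        // Default configuration picks up core-site.xml / hdfs-site.xml on the classpath
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);

        // Local log file and HDFS target directory are placeholders
        Path localLogs = new Path("/var/log/app/app.log");
        Path hdfsTarget = new Path("/data/raw/logs/");

        // Copy the local file into HDFS; delSrc=false keeps the local copy,
        // overwrite=true replaces any existing file at the destination
        hdfs.copyFromLocalFile(false, true, localLogs, hdfsTarget);
        hdfs.close();
    }
}
```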

Environment: Hadoop, MapReduce, HDFS, Hive, MySQL, Sqoop, shell scripting, cron jobs, Apache Kafka, and Core Java.

Java-Hadoop Developer

Confidential

Responsibilities:

  • Designed the database architecture; involved in gathering and analyzing user requirements.
  • Developed the database architecture design for the data store on Impala.
  • Performed data preparation and transformation on Kafka data stored in an AWS S3 bucket (see the sketch after this list).
  • Performed calculation and aggregation on prepared data in Impala for reporting purposes.
  • Created APIs to retrieve the aggregated data.
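
A minimal Spark-Java sketch of the S3 data preparation referenced above; the bucket paths, column names, and aggregation are hypothetical placeholders, and writing Parquet back to S3 is one common way to back an external Impala table:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class S3PrepJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("S3PrepJob")
                .getOrCreate();

        // Kafka-sourced JSON landed on S3 (bucket/path are placeholders; s3a needs hadoop-aws)
        Dataset<Row> events = spark.read().json("s3a://example-bucket/kafka-events/");

        // Basic cleanup and aggregation before Impala-backed reporting
        Dataset<Row> daily = events
                .filter(col("event_type").isNotNull())
                .groupBy(col("event_date"), col("event_type"))
                .agg(sum(col("amount")).alias("total_amount"));

        // Write back to S3 as Parquet so an external Impala table can point at it
        daily.write().mode("overwrite").parquet("s3a://example-bucket/prepared/daily/");

        spark.stop();
    }
}
```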

Environment: Core Java, Hadoop, Apache Kafka, ZooKeeper, HDFS, Hive, MySQL, AWS S3 storage, and REST APIs.
