Hadoop Admin/Developer Resume
Woburn, MA
SUMMARY
- 6+ years of total IT experience with certified training by MapR.
- In-depth knowledge as a software developer, with rigorous training across different platforms.
- Worked on data warehousing with the financial firm JPMorgan Chase as the client, focusing mainly on data architecture; data warehouse design and implementation; table structures, aggregations, and function mappings; fact and dimension tables; logical and physical database design; reconciliation data modeling; reporting-process metadata; and ETL processes using Informatica, TLM, and a business reporting tool.
- Experience with SQL, Oracle, RDBMS, and UNIX, plus Java and Object-Oriented Design training at Confidential.
- Working knowledge of account reconciliations for Investment Banking and Capital Markets data.
- Mentored the client team for more than six months and helped resolve production issues throughout.
PROFESSIONAL EXPERIENCE
Confidential, Woburn, MA
Hadoop Admin/Developer
Responsibilities:
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Cassandra, Sqoop, and Pig.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS.
- Experience with backend integration using Spring and Python for processing data.
- Developed MapReduce programs in Java and Python to parse raw data, populate staging tables, and store the refined data in partitioned tables.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX systems, NoSQL stores, and a variety of portfolios.
- Experience using HBase as a NoSQL database within the MapReduce framework.
- Imported and exported data into HDFS and Hive using Sqoop (see the Sqoop import sketch after this section's Environment line).
- Experience writing Pig Latin scripts and Hive queries to transform semi-structured and structured data sets.
- Experience with the Cassandra client program for fetching compressed data from the NoSQL database.
- Created Hive queries for predictive analysis on patient records (see the Hive query sketch below).
- Worked with Sqoop to export data into HDFS and Hive.
- Experience with the Hortonworks framework, which allows distributed processing of large data sets across clusters of computers.
- Experience with cluster computing frameworks such as Apache Spark and Storm.
- Experience with message transfer engines such as Kafka and other messaging technologies.
- Experience with data flows, ETL, and processing of semi-structured and unstructured data using Pig.
- Experience with backend ETL tools such as Informatica.
- Used Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Automated all jobs that pull data from an FTP server into Hive tables using Oozie workflows (see the Oozie CLI sketch below).
- Provided cluster coordination services through ZooKeeper; involved in loading data from log files into HDFS.
- Experience with RESTful web services (JAX-RS) using Jersey and Restlet.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Tested raw data and executed performance scripts.
- Worked in a multi-clustered environment and set up the Cloudera Hadoop ecosystem.
- Experience with BI tools such as Pentaho BI and Hyperion System.
- Experience with RDBMSs such as Oracle and MySQL.
- Worked with UNIX shell scripting and other scripting languages; good understanding of Groovy scripts.
- Worked with Linux shell scripting to move files to the Hadoop cluster (see the log-loading sketch below).
- Excellent communication, interpersonal, and problem-solving skills; a very good team player.
- Good understanding of Big Data and cloud-related tools and technologies such as Flume, Avro, Chukwa, Whirr, Pentaho, and Hortonworks.
- Helped design and stand up four Agile Scrum teams for product development.
- Experience with JIRA as an issue-tracking system and with working in a deadline-driven environment.
Environment: Big Data, Hadoop, MapReduce, HDFS, HBase, Oozie, Hive, Cassandra, Sqoop, Pig, Chukwa, Pentaho, Agile, Scrum, JIRA, ZooKeeper, Cloudera, Pig Latin, ETL, NoSQL, Java, Python, UNIX.
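The Sqoop work above followed the standard import pattern. A minimal sketch, assuming a MySQL source; the host, database, table, user, and path names are hypothetical placeholders:

```sh
# Import a relational table into HDFS (host/db/table names are hypothetical)
sqoop import \
  --connect jdbc:mysql://dbhost:3306/warehouse \
  --username etl_user --password-file /user/etl/.dbpass \
  --table customer_accounts \
  --target-dir /data/staging/customer_accounts \
  --num-mappers 4

# Or land the same table directly in a Hive table
sqoop import \
  --connect jdbc:mysql://dbhost:3306/warehouse \
  --username etl_user --password-file /user/etl/.dbpass \
  --table customer_accounts \
  --hive-import --hive-table staging.customer_accounts
```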
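The patient-record work was plain HiveQL run against partitioned tables. A sketch of the style only; the clinical.patient_records schema and its columns are invented for illustration, not taken from the original project:

```sh
# Aggregate patient records from the shell (schema is hypothetical)
hive -e "
  SELECT diagnosis_code,
         COUNT(*)            AS visit_count,
         AVG(length_of_stay) AS avg_stay_days
  FROM   clinical.patient_records
  WHERE  admit_year = 2013
  GROUP BY diagnosis_code
  ORDER BY visit_count DESC
  LIMIT 20;
"
```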
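The FTP-to-Hive loads were scheduled through Oozie. A sketch of the standard CLI driving such a workflow; the server URL, property file, and job id are placeholders:

```sh
# Submit and run a workflow defined by a job.properties file
oozie job -oozie http://oozie-host:11000/oozie \
          -config ftp_to_hive/job.properties -run

# Check the status of a running workflow (the id shown is a placeholder)
oozie job -oozie http://oozie-host:11000/oozie \
          -info 0000012-130606115200591-oozie-W
```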
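Moving log files into HDFS was done with plain Linux shell scripting. A minimal sketch, assuming gzipped, rotated logs on an edge node; all paths are hypothetical:

```sh
# Push rotated application logs into a dated HDFS directory
DEST=/data/raw/logs/$(date +%Y/%m/%d)
hadoop fs -mkdir -p "$DEST"
for f in /var/log/app/*.log.gz; do
  # Archive the local copy only if the HDFS put succeeded
  hadoop fs -put "$f" "$DEST"/ && mv "$f" /var/log/app/archived/
done
```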
Confidential, St. Louis, MO
Hadoop Admin/Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Loaded customer profile, customer spending, and credit data from legacy warehouses onto HDFS using Sqoop.
- Built a data pipeline using Pig and Java MapReduce to store data on HDFS (see the Pig sketch after this section's Environment line).
- Applied transformations to and filtered the traffic data using Pig.
- Used pattern-matching algorithms to recognize customers across different sources, built risk profiles for each customer using Hive, and stored the results in HBase.
- Performed unit testing using MRUnit.
- Responsible for building scalable, distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (see the job-submission sketch below).
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation for the BI team (see the Sqoop export sketch below).
- Responsible for writing queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
- Supported data analysts in running Pig and Hive queries.
- Involved in writing HiveQL and Pig Latin scripts.
- Imported and exported data between MySQL/Oracle and Hive/HDFS using Sqoop.
Environment: Hadoop, MapReduce, HDFS, HQL, MRUnit, Hive, Pig, Sqoop, Flume, Oozie, HiveQL, Pig Latin, MySQL.
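The Pig side of the pipeline above amounted to load/filter/reshape/store steps. A sketch under invented field names and paths; the traffic schema shown is illustrative, not the project's actual layout:

```sh
# Write the Pig script, then execute it on the cluster
cat > clean_traffic.pig <<'EOF'
-- Load raw tab-separated traffic records (schema is hypothetical)
raw    = LOAD '/data/raw/traffic' USING PigStorage('\t')
         AS (ts:chararray, customer_id:chararray, channel:chararray, amount:double);
-- Drop malformed or empty records
valid  = FILTER raw BY customer_id IS NOT NULL AND amount > 0.0;
-- Keep only the columns downstream jobs need
shaped = FOREACH valid GENERATE customer_id, channel, amount;
STORE shaped INTO '/data/clean/traffic' USING PigStorage('\t');
EOF
pig -f clean_traffic.pig
```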
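Map-output compression was one mechanism for cutting HDFS and shuffle traffic. A sketch of a job submission with MRv1-era compression properties; the jar, driver class, and paths are hypothetical, and passing -D options this way assumes the driver is written with ToolRunner:

```sh
# Submit a Java MapReduce cleaning job with Snappy-compressed map output
hadoop jar data-cleaning.jar com.example.CleanDriver \
  -D mapred.compress.map.output=true \
  -D mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec \
  /data/raw/customers /data/clean/customers
```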
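Exports back to the relational side for BI reporting used sqoop export. A minimal sketch; the connection string, table, and export directory are placeholders:

```sh
# Push aggregated results from HDFS into a MySQL reporting table
sqoop export \
  --connect jdbc:mysql://dbhost:3306/reports \
  --username etl_user --password-file /user/etl/.dbpass \
  --table customer_behavior_summary \
  --export-dir /data/output/customer_behavior \
  --input-fields-terminated-by '\t'
```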