Big Data Consultant Resume

New Jersey

SUMMARY:

  • 19+ years of experience in software design, client delivery, project management, implementation, testing & maintenance.
  • Experience installing, configuring, and administering Hadoop clusters on major Hadoop distributions such as Cloudera and Hortonworks.
  • Hands-on experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Hive, Impala, HBase, Pig, Sqoop, Oozie, and Flume, and with big data analytics.
  • Developed and migrated numerous applications to Spark using the Spark Core and Spark Streaming APIs in Java; optimized MapReduce jobs and SQL queries into Spark transformations using Spark RDDs (see the sketch after this list).
  • Implemented ETL solutions for data extraction, transformation, and load using Sqoop, Hive, Pig, and Spark; worked with NoSQL databases such as HBase.
  • Good experience working with cloud environments such as Amazon Web Services (AWS).
  • Experience developing and integrating Spark with Kafka topics for real-time streaming applications.
  • Committed to excellence; a self-motivated, fast-learning team player and a prudent developer with strong problem-solving and communication skills.
  • Expertise in implementing Haven as a Service (HaaS) big data solutions at multiple client locations.
  • Expertise in the design and development of various web and enterprise applications using Java/J2EE and big data technologies in cloud environments such as HP Helion and AWS.
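
A minimal sketch of the MapReduce-to-RDD pattern referenced above (a word-count style aggregation), assuming a local SparkSession and a hypothetical HDFS input path:

```scala
// Word count expressed as Spark RDD transformations instead of a MapReduce job.
// Assumptions: local master and a hypothetical HDFS input path.
import org.apache.spark.sql.SparkSession

object WordCountRdd {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mapreduce-to-rdd-sketch")
      .master("local[*]")                     // assumption: local run for illustration
      .getOrCreate()

    val lines = spark.sparkContext.textFile("hdfs:///data/sample.txt") // hypothetical path

    // Mapper phase -> flatMap/map; Reducer phase -> reduceByKey
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.take(10).foreach(println)
    spark.stop()
  }
}
```

The same logic that needs a Mapper and a Reducer class in MapReduce becomes three chained RDD transformations here.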

TECHNICAL SKILLS:

Programming Languages: Java, SQL, PL/SQL, Python, Linux shell scripting

Big Data Components: HDFS, Hive, Impala, Sqoop, Pig, Spark, Scala, Oozie, Flume, Kafka, HBase, MapReduce

Web Technologies: HTML, DHTML, XML, XSLT, CSS, DOM, SAX, AJAX

J2EE Technologies: Servlets, JSP, Maven, JDBC, JMS

Servers & Tools: Tomcat 5.0, iPlanet, CA Workload Automation

Databases: Oracle, SQL Server, Sybase, Vertica, Hive

Frameworks: Struts, Spring, Hibernate, JUnit, Log4j, Apache Camel

ETL: Talend

Operating Systems & Cloud Platforms: Windows, Red Hat Linux, Solaris, HP Helion Cloud, Amazon Web Services (AWS)

PROFESSIONAL EXPERIENCE:

Confidential, New Jersey

Big Data Consultant

Responsibilities:

  • Responsible for timely delivery and overall quality of the application.
  • Worked on Store Innovations projects
  • Planned and defined scope for the project
  • Planned and sequenced tasks and allocated resources
  • Monitored and reported progress to management
  • Worked with different vendors involved in the delivery of this project
  • Documented all release timelines and planned release activities as release manager
  • Planned, scheduled and controlled the build in addition to testing and deployment of the release
  • Followed all documentation processes for the release (promoting the application through the DEV / SIT / UAT / PROD environments)

Environment: .NET, IIS, JavaScript, HTML

Confidential

Big Data Consultant

Responsibilities:

  • Involved in Release Management of the project
  • Worked on APD Acquirer Card Processing
  • Led the offshore development team
  • Planned, scheduled, and controlled builds, testing, and deployment of releases
  • Worked with multi-disciplinary team to understand the impact analysis of upstream and downstream data.
  • Scheduled the jobs using the Linux software utility Cron.
  • Migrated applications from WebSphere 5.1 to 7.0
  • Generated Use case diagrams, Class diagrams, and Sequence diagrams using Microsoft Visio
  • Documented all release procedures for the CAB meetings
  • Used LDAP for authentication

Environment: IBM RAD 7.5, WebSphere Application server 7, LDAP, Sybase, Linux Shell scripting, Cron

Confidential

Lead Big Data Consultant

Responsibilities:

  • Being new to TIBCO, ramped up quickly on the new technology
  • Worked on NPI (Number Portability Improvements)
  • Involved in requirements gathering with the Business analysts from Netherlands
  • Followed Agile methodology to capture the correct requirements
  • Mentored a team of six people in Chennai
  • Designed and developed the various TIBCO BW components needed for the application
  • Maintained documentation per the defined process throughout the SDLC.
  • Deployed the application to the Test / UAT / PROD environments
  • Created a support team to support the application once it was migrated to PROD

Environment: TIBCO BW, iProcess

Confidential

Big Data Consultant

Responsibilities:

  • Managed the fully distributed Hadoop cluster as an additional responsibility; trained to take over the duties of a Hadoop administrator, including managing the cluster, performing upgrades, and installing tools that use the Hadoop ecosystem.
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment, working with both traditional and non-traditional source systems as well as RDBMS and NoSQL data stores for data access and analysis.
  • Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume on the Cloudera distribution.
  • Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications, leading to Impala's adoption in the project. Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Created Hive tables and loaded and analyzed data using Hive queries; implemented partitioning, dynamic partitions, and bucketing in Hive.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Worked on POCs with Apache Spark in Scala to bring Spark into the project; consumed data from Kafka using Spark (see the sketch after this list).
  • Actively involved in code reviews and bug fixing to improve performance.
  • Involved in various phases of the Software Development Life Cycle (SDLC), including requirements gathering, data modeling, analysis, architecture design, and development for the project
  • Read streaming social media data from DataSift / GNIP.
  • The streamed data arrives as nested JSON objects, which are split into multiple records, converted into canonical XML, and ingested into topics for further processing; DataSift provides social media data from sources such as Facebook, Twitter, and YouTube.
  • RSS feeds are pulled in a separate flow; the raw RSS feed is converted into canonical form for further processing and stored in HDFS.
  • Used regular expressions to split XML data stored in a file for reprocessing of records.
  • Created Java components to connect to Autonomy IDOL, validate each individual tweet, list all hashtags, and identify the sentiment score for the given value.
  • Updated a Vertica table with all hashtags and sentiment scores for live report generation.
  • The entire flow from social media post to report takes just 90 seconds (TIBCO Spotfire is used to create reports such as Trending Drivers and race sentiment analysis, broadcast live during the race).
  • Used Maven to build the applications and created build scripts, including the dependent JARs needed for the build.
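
A minimal sketch of the Kafka consumption step mentioned above, using Spark Structured Streaming in Scala; the broker address, topic name, output paths, and the nested `interactions` payload schema are illustrative assumptions rather than the actual feed contract:

```scala
// Consume a Kafka topic and split each nested JSON payload into individual records.
// Broker, topic, paths, and schema fields below are assumptions for illustration.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types._

object SocialFeedStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("social-feed-sketch").getOrCreate()

    // Hypothetical schema for the nested payload delivered by the feed provider
    val payloadSchema = new StructType()
      .add("interactions", ArrayType(new StructType()
        .add("id", StringType)
        .add("source", StringType)      // e.g. facebook / twitter / youtube
        .add("content", StringType)))

    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")   // assumption
      .option("subscribe", "social-media-raw")            // assumption
      .load()

    // One nested JSON object becomes one row per interaction
    val records = raw
      .select(from_json(col("value").cast("string"), payloadSchema).as("payload"))
      .select(explode(col("payload.interactions")).as("interaction"))
      .select("interaction.id", "interaction.source", "interaction.content")

    records.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/social/records")        // assumption
      .option("checkpointLocation", "hdfs:///chk/social")    // assumption
      .start()
      .awaitTermination()
  }
}
```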

Confidential

Big Data Consultant

Environment: Hadoop, Cloudera, Hive, Impala, Spark, Scala, Sqoop, Parquet, HDFS, Eclipse, Teradata, Java, Shell Script, Python, TPT (Teradata Parallel Transporter)

Responsibilities:

  • Developed an ingestion framework to pull data from Teradata, Oracle, and .dat file sources and load it into Hive/Impala using Spark with Scala (see the sketch after this list)
  • Improved the performance of existing Spark jobs using different Spark and Teradata options to handle ~10 TB of data per run
  • Wrote Hive queries to prepare data for modeling jobs
  • Wrote shell scripts to automate Spark and Sqoop jobs for ingestion and validation
  • Managed the onsite and offshore teams and provided guidance to complete tasks within defined SLAs
  • Created plans for the data load and scramble process based on new business requirements
  • Developed a validation framework for data quality checks
  • Developed a TPT framework to load data from Teradata to Hive using the TPT export option
  • Worked on the SDL framework to generate models and analyses from the data
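
A minimal sketch of the Teradata-to-Hive ingestion pattern referenced in the first bullet above, assuming Hive support is enabled and the Teradata JDBC driver is on the classpath; the connection URL, table names, credentials, and partition columns are placeholders rather than the framework's real configuration:

```scala
// Pull a Teradata table over JDBC in parallel and land it as a partitioned
// Parquet table readable by Hive and Impala. All names below are placeholders.
import org.apache.spark.sql.{SaveMode, SparkSession}

object TeradataToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("teradata-ingestion-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Parallel extract: numPartitions JDBC connections split on a numeric key
    val source = spark.read
      .format("jdbc")
      .option("url", "jdbc:teradata://td-host/DATABASE=SALES")     // assumption
      .option("dbtable", "SALES.TRANSACTIONS")                      // assumption
      .option("user", sys.env.getOrElse("TD_USER", "user"))
      .option("password", sys.env.getOrElse("TD_PASSWORD", "password"))
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("numPartitions", "16")
      .option("partitionColumn", "txn_id")                          // assumption: numeric key
      .option("lowerBound", "1")
      .option("upperBound", "100000000")
      .load()

    // Write to a Hive-managed Parquet table, partitioned by a date column
    source.write
      .mode(SaveMode.Overwrite)
      .format("parquet")
      .partitionBy("txn_date")                                      // assumption
      .saveAsTable("staging.transactions")
  }
}
```

The JDBC partitioning options (partitionColumn / lowerBound / upperBound / numPartitions) are what let Spark open parallel extract connections instead of pulling the whole table through a single reader.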

Confidential

Big Data Consultant

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsibilities included analysis of the various applications, design of the enterprise applications, coordination with the client, meetings with business users, functional and technical guidance to the team, and project management.
  • Involved in Hadoop cluster setup, including installation, configuration, and monitoring of the Hadoop cluster and its components.
  • Developed Sqoop jobs and MapReduce programs in Java to extract data from various RDBMS systems such as Oracle and SQL Server and load it into HDFS.
  • Designed and implemented Hive/Impala/Pig/Spark queries/scripts and functions for evaluation, filtering, loading and storing of data.
  • Wrote Scala code to create schema RDDs and accessed them using Spark SQL (see the sketch after this list).
  • Wrote MapReduce and Spark code for extraction, transformation, and aggregation of data across multiple file formats, including XML, CSV, and other compressed formats.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data from various sources.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregating Functions (UDAFs).
  • Extensively used the Oozie workflow scheduler to manage Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
  • Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Handled large datasets using partitioning, Spark in-memory capabilities, broadcast variables, and effective, efficient joins and transformations during the ingestion process itself.
  • Developed Python and Linux shell scripts for SFTP file transfer between the ODC and Tulsa networks and for EBCDIC-to-ASCII data conversion.
  • Developed monitoring scripts to test VPN connectivity between the ODC and Tulsa networks and automated them in AutoSys.
  • Used Active Directory for Directory Services.
  • Also set up a similar HaaS environment for the State of Wisconsin in the AWS cloud.
  • Created all DDLs needed for the application in Vertica and granted access to Vertica based on roles in Active Directory (AD)
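
A minimal sketch of the schema RDD / Spark SQL pattern referenced in the list above; the Customer case class, its fields, and the input path are illustrative assumptions:

```scala
// Give an RDD a schema via a case class, then query it through Spark SQL.
// The Customer fields and the input path are assumptions for illustration.
import org.apache.spark.sql.SparkSession

case class Customer(id: Long, state: String, balance: Double)

object SchemaRddExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("schema-rdd-sketch").getOrCreate()
    import spark.implicits._

    // Parse raw CSV lines into typed records, then expose them to SQL
    val customers = spark.sparkContext
      .textFile("hdfs:///data/customers.csv")        // hypothetical path
      .map(_.split(","))
      .map(f => Customer(f(0).toLong, f(1), f(2).toDouble))
      .toDF()

    customers.createOrReplaceTempView("customers")

    spark.sql(
      """SELECT state, COUNT(*) AS cnt, AVG(balance) AS avg_balance
        |FROM customers
        |GROUP BY state""".stripMargin)
      .show()
  }
}
```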

Confidential

Lead Big Data Consultant

Responsibilities:

  • Managed client relationships and the project, including regular meetings with the customer to ensure scope and quality management.
  • Worked on Optima (Manual Adjustments)
  • Responsible for timely, high-quality delivery of the application
  • Managed a team of 5 offshore in Chennai and 3 offshore in China
  • Responsible for onsite-offshore coordination with both the Chennai and China teams.
  • Developed web application using Flex, Java, JSP and Hibernate
  • Created stored procedures in Oracle for business rules
  • Managed installation of the application to different environments and related support activities

Environment: J2EE, Flex, Hibernate, Oracle 10g, Unix

Confidential

Project Manager

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), including requirements gathering, data modeling, analysis, architecture design, and development for the project
  • Worked on GSTS, VPPSTcBOM integration, VMAS, STRIM
  • Developed the User interface using JSP, JavaScript, CSS, HTML
  • Developed the Java Code using Eclipse as IDE
  • Developed JSPs and Servlets to dynamically generate HTML and display the data to the client side. Extensively used JSP tag libraries.
  • Worked with Struts as a unified MVC framework and developed Tag Libraries.
  • Used Struts framework in UI designing and validations
  • Developed Action Classes, which acts as the controller in Struts framework
  • Used PL/SQL to write complex queries and retrieve data from the Oracle database
  • Used ANT scripts to build the application and deploy it on WebLogic Application Server
  • Designed, wrote, and maintained complex reusable methods that invoke stored procedures to fetch data from the database
  • Prepared the unit test case document / user handbook for test cases.

Environment: WebLogic, Actano (COTS product), Oracle, Web Services, SOAP, XSD, J2EE, Struts, Servlets, PL/SQL stored procedures, Unix

Confidential

Team Lead

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), including requirements gathering, data modeling, analysis, architecture design, and development for the project
  • Mentored an offshore team of 3
  • Created requirement and design documents for the offshore team.
  • Alchemy is a third-party tool used by the Clorox legal department to store all legal documents, which are scanned through an MFP (multi-function printer)
  • Integration of the MFP with the tool was very challenging; also ported old documents stored in a different system, developing Java routines to read the PDF files.
  • Managed client expectations, both technical and communication-related, and explained them to the offshore team.
  • Tracked all task assignments with the offshore team and reported to the client manager.

Environment: J2EE, Oracle, MFP (multi-function printer)

Confidential

Team Member

Responsibilities:

  • Developed a proof of concept to compare two messaging products (SonicMQ and IBM WebSphere MQ)
  • Developed high-quality J2EE code using JSP / Servlets
  • Performed unit testing and bug fixes
  • Created documentation for all QA Process and maintained the same in SharePoint
  • Managed and controlled access to the code repositories
  • Resolved and troubleshot problems and complex issues

Environment: J2EE, JMS, IBM WebSphere MQ

Confidential

Team Member

Responsibilities:

  • Involved in support of KRS System in Production and enhancement releases
  • Migrated the application from iPlanet server to WebLogic.
  • Created multilingual support for this application
  • Developed high-quality J2EE code using JSP / Servlets
  • Performed unit testing and bug fixes
  • Created documentation for all QA Process and maintained the same in SharePoint
  • Managed and controlled access to the code repositories
  • Resolved and troubleshot problems and complex issues

Environment: WebLogic, J2EE, JSP, Servlets, Oracle, iPlanet, Unix, IDOL (indexing server)
